Every performance decision that shipped, in one place: what it is,
where it lives, why it’s safe, and the invariant that keeps it valid.
If you change code near one of these, the invariant column is the
contract you must re-verify. Measured numbers come from
benchmarks.md and the CLAUDE.md decision log.
The headline result: on the 100-project synthetic workspace, vx’s
all-hits path is ~3.9× faster than Turbo and ~5.4× faster than Nx,
and the no-cache run lands within 4% of the bare-shell floor.

| # | What | Where | Why / effect | Invariant to preserve |
|---|---|---|---|---|
| 1 | xxHash3 for every cache-key site (was SHA-256) | util/hash.ts, cache/cache.ts:key, execute-task.ts, fingerprint.ts, project-loader.ts | ~5× faster derivation on the warm path; 16-hex keys match Turbo’s xxh64 width | Not cryptographic — fine for content addressing, never use for auth/integrity vs. an adversary |
| 2 | Seed-chained key folding (xxh3(part, prevDigest)) instead of a streaming hasher | cache/cache.ts:key | Bun has no streaming xxh3; chaining avoids concatenating a big key buffer | Every variable-length part must be length-prefixed or \0-delimited so part boundaries stay unambiguous |
| 3 | Workspace fingerprint computed once per run, folded into every task key | workspace/fingerprint.ts | One lockfile read/hash instead of N | Must cover every supported lockfile + pnpm-workspace.yaml; fixed file order |
| 4 | Config module-cache busting by content hash, not mtime | workspace/project-loader.ts | Same content → Bun module-cache hit; changed content → fresh eval. Hashing a <10 KB config is ~50 µs | The hash must cover the full file bytes; mtime is not reliable (Bun mtimeNs undefined; ms granularity misses rapid edits) |
| 5 | Per-run hashCache for repeated file/config hashes | orchestrator/prepare.ts → execute-task.ts | Shared files (presets) hash once per run | Cache is per-run only; nothing may persist across runs without entering the key itself |
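
The boundary invariant in row 2 is easier to see in code. The sketch below is illustrative only: `seededHash` is a stand-in (a seeded 32-bit FNV-1a), not xxh3, and `lengthPrefixed` and `foldKey` are hypothetical names rather than the functions in cache/cache.ts. It shows the chaining shape (each digest becomes the seed for the next part) and one way to length-prefix variable-length parts.

```typescript
// Stand-in seeded hash (32-bit FNV-1a over UTF-16 code units). This is NOT
// xxh3; it only plays the role of xxh3(part, prevDigest) for illustration.
function seededHash(data: string, seed: number): number {
  let h = (0x811c9dc5 ^ seed) >>> 0;
  for (let i = 0; i < data.length; i++) {
    h ^= data.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h;
}

// Hypothetical helper: prefix each part with its length so the same
// characters split at a different boundary produce a different hash input.
function lengthPrefixed(part: string): string {
  return `${part.length}:${part}`;
}

// Fold the parts left to right, feeding each digest in as the next seed.
// No concatenated key buffer is ever built.
function foldKey(parts: string[]): string {
  let digest = 0;
  for (const part of parts) {
    digest = seededHash(lengthPrefixed(part), digest);
  }
  return digest.toString(16).padStart(8, "0");
}
```

With the prefix in place, `foldKey(["ab", "c"])` and `foldKey(["a", "bc"])` hash different inputs at every step, which is the property the invariant asks for. The 8-hex output width is a consequence of the 32-bit stand-in, not of the shipped key format.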

| # | What | Where | Why / effect | Invariant to preserve |
|---|---|---|---|---|
| 6 | git ls-files --cached --others --exclude-standard instead of an FS walk + ignore parsing | cache/inputs.ts | Turbo/Nx parity; git’s C-speed ignore handling; correct nested-.gitignore anchoring (fixed a real v13 bug) | vx hard-requires git — no fallback walker. Absent git → UserError, never silent degradation |
| 7 | One workspace-root git snapshot per run, partitioned per project (gitFilesCache) | cache/inputs.ts, invalidated in execute-task.ts | One git ls-files spawn instead of P | Staleness rule: after a SAVE the project’s entry is dropped (executed tasks may write undeclared files only git can see). After a RESTORE the exact changed paths are recorded (GitFilesCache.markOutputsChanged); downstream tasks re-spawn git only when their input globs match a changed path — zero spawns in src→dist layouts (was: one spawn per restored project) |
| 8 | O(P log P) nested-project-boundary computation (sort + contiguous prefix scan; was O(P²)) | workspace/nested-dirs.ts | Negligible at 10 projects, real at 1000 | Prefix match must include the trailing path.sep so pkg/a is not treated as parent of pkg/ab |
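
Row 8's trailing-separator rule can be sketched as follows. This is a minimal illustration, assuming POSIX-style relative paths; `nestedProjectDirs` is a hypothetical name and not the function in workspace/nested-dirs.ts. Sorting the directories with the separator already appended makes each directory the smallest member of its own prefix range, so its nested projects sit contiguously after it.

```typescript
const SEP = "/"; // assumption: POSIX-style relative paths

// For each project dir, list the project dirs nested anywhere beneath it.
function nestedProjectDirs(projectDirs: string[]): Map<string, string[]> {
  // Sort keys that already end in the separator. Without it, "pkg/a-b"
  // would sort between "pkg/a" and "pkg/a/sub" and break contiguity.
  const keyed = projectDirs.map((dir) => dir + SEP).sort();
  const result = new Map<string, string[]>();
  for (let i = 0; i < keyed.length; i++) {
    const parent = keyed[i];
    const nested: string[] = [];
    // Scan forward only while the prefix (including the separator) matches.
    for (let j = i + 1; j < keyed.length && keyed[j].startsWith(parent); j++) {
      nested.push(keyed[j].slice(0, -SEP.length));
    }
    result.set(parent.slice(0, -SEP.length), nested);
  }
  return result;
}
```

For `["pkg/ab", "pkg/a/sub", "pkg/a", "pkg/a-b"]`, only `pkg/a/sub` is reported under `pkg/a`; `pkg/ab` and `pkg/a-b` share the string prefix `pkg/a` but not the path prefix `pkg/a/`.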

| # | What | Where | Why / effect | Invariant to preserve |
|---|---|---|---|---|
| 9 | O(N + E) scheduler tick: per-node dep counters + ready priority queue (was O(N²) rescan per completion) | graph/scheduler.ts | Tick cost independent of graph size | Priority contract: higher transitive-reverse-dep count first; ties break in graph-insertion order (binary-search insert respects existing equals) |
| 9b | Bitset transitive-dependent closure in reverse-topo order (was memoized DFS over string Sets) | graph/scheduler.ts:computeReverseDepCount | Set closures were O(N²) entries: 8.5 s of a 10 s warm run on a 1090-package / 100-layer repo; bitsets = O(E·N/32), single-digit ms | Counts must stay EXACT (popcount of the closure), not a summed approximation — diamonds double-count under naive summing |
| 10 | Iterative cycle detection with Uint8Array color array (was recursion + Map) | graph/task-graph.ts:detectCycle | No V8 stack ceiling on deep dependsOn chains; no per-node Map cost | Must still report the cycle path in the error |
| 11 | Group tasks execute with zero I/O — hash rolled up from upstream outcomes | execute-task.ts:computeGroupHash | Umbrella tasks (install, ci) cost microseconds | Group hash must fold every upstream id:hash pair, sorted, so it stays order-independent |
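
To make row 9b's "exact, not summed" invariant concrete, here is a small sketch of a bitset closure with a popcount. It is not the code in graph/scheduler.ts: `transitiveDependentCounts` and `popcount32` are hypothetical names, nodes are plain indices, and the ordering is a simple Kahn pass that finishes a node only after all of its dependents.

```typescript
// Count set bits in a 32-bit word.
function popcount32(x: number): number {
  x = x - ((x >>> 1) & 0x55555555);
  x = (x & 0x33333333) + ((x >>> 2) & 0x33333333);
  x = (x + (x >>> 4)) & 0x0f0f0f0f;
  return Math.imul(x, 0x01010101) >>> 24;
}

// dependsOn[i] lists the nodes that node i depends on.
// Returns, per node, the exact number of distinct transitive dependents.
function transitiveDependentCounts(dependsOn: number[][]): number[] {
  const n = dependsOn.length;
  const words = Math.ceil(n / 32);
  const closure = Array.from({ length: n }, () => new Uint32Array(words));

  // pending[i] = direct dependents of i that are not finished yet.
  const pending = new Array<number>(n).fill(0);
  for (const deps of dependsOn) for (const dep of deps) pending[dep]++;

  const queue: number[] = [];
  pending.forEach((count, node) => { if (count === 0) queue.push(node); });

  const counts = new Array<number>(n).fill(0);
  for (let head = 0; head < queue.length; head++) {
    const node = queue[head];
    const own = closure[node]; // final: every dependent has been merged in
    for (let w = 0; w < words; w++) counts[node] += popcount32(own[w]);
    for (const dep of dependsOn[node]) {
      const target = closure[dep];
      for (let w = 0; w < words; w++) target[w] |= own[w];
      target[node >>> 5] |= 1 << (node & 31); // add the dependent itself
      if (--pending[dep] === 0) queue.push(dep);
    }
  }
  if (queue.length !== n) throw new Error("dependency cycle");
  return counts;
}
```

On a diamond (`B` and `C` depend on `A`, `D` depends on both), `transitiveDependentCounts([[], [0], [0], [1, 2]])` gives `A` a count of 3, because `D` occupies a single bit however many paths reach it. Summing child counts instead would report 4 for `A`.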

| # | What | Where | Why / effect | Invariant to preserve |
|---|---|---|---|---|
| 12 | Hand-rolled in-process tar parse/extract (kept over Bun.Archive) | cache/tar.ts | Benchmarked: Bun.Archive is 15–400× slower for our artifact shape (KB–MB, flat trees) — fixed JS-bridge overhead dominates small archives. Also: no tar subprocess fork per restore | Security checks are part of the parser: reject absolute paths, .., hardlinks; unlink symlinks before write; re-verify resolved target stays under dest |
| 13 | Restore skip via output_files rows + stat check (isOutputsCurrent) | cache/cache.ts | Warm-warm hit = N stats, zero writes, zero decompress | Stored size/mode/mtime fingerprint must match what extractOutputs produces (mtime compared at floor-to-second — tar headers carry seconds) |
| 14 | SQLite metadata index, WAL, busy_timeout = 5000, one handle per run | cache/cache.ts | Indexed lookups; concurrent vx run invocations don’t crash | Entry metadata lives in SQL only (v17 artifacts carry just stdout + outputs) — never reintroduce a meta.json that can drift from the rows |
| 15 | Atomic artifact publish: unique tmp name → rename (no pre-rm) | cache/cache.ts:writeArtifactAndIndex | Concurrent saves of the same hash resolve to one complete artifact or the other; readers never see partial bytes | POSIX rename replaces atomically; the pre-rm variant reintroduces a delete-after-rename race |
| 16 | Single-transaction batch writes: recordRuns, prune deletes (+ parallel artifact rm) | cache/cache.ts | One fsync instead of N | Prune’s IN-list binding must stay under SQLite’s 999-placeholder limit per statement |
| 17 | v17 artifact = exactly stdout + outputs/<rel>; identical bytes local and remote | cache/cache.ts, layered-cache.ts | No stage-dir repack for upload — save re-reads the just-written artifact and PUTs it verbatim; remote hit writes the body straight to disk | Local and remote layers must keep transporting the same byte format; metadata travels out-of-band (SQL row / HTTP headers) |
| 17b | Async remote prefetch: derive stable keys up front, fire remote GETs in the background, overlapping network with execution | orchestrator/remote-prefetch.ts, cache/layered-cache.ts (prefetch + inflight map) | A remote-served warm run no longer pays remote-GET latency on each task’s critical path; the GETs race alongside execution and land in local before execute-task needs them | Remote-only (gated on LayeredCache; local-only runs do NO upfront key pass and NO local probing — byte-identical). Stable-key-only (a task whose inputs could match an upstream output is skipped → lazy read-through). At-most-once per key (prefetch + get share the inflight map; a settled-false miss blocks a second probe). Provenance stays remote so the outcome is cache-hit-remote. Caller awaits the prefetch pool before cache.close() so no ingest hits a closed DB |
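
The path checks listed in row 12 might look roughly like the sketch below. It covers only the lexical part (absolute paths, `..`, hardlink entries, and the final containment check) and assumes POSIX `/` separators; `safeTarget` is a hypothetical name, and the symlink-unlink step before writing is not shown because it needs filesystem access. In the ustar header format, typeflag `"1"` marks a hardlink entry.

```typescript
// Hypothetical validator for one tar entry. Returns the path to write to,
// or throws if the entry must be rejected.
function safeTarget(dest: string, entryName: string, typeflag: string): string {
  if (typeflag === "1") {
    throw new Error(`hardlink entry rejected: ${entryName}`);
  }
  if (entryName.startsWith("/")) {
    throw new Error(`absolute path rejected: ${entryName}`);
  }
  // Drop empty and "." segments; any ".." is an immediate reject.
  const segments = entryName.split("/").filter((s) => s !== "" && s !== ".");
  if (segments.includes("..")) {
    throw new Error(`parent traversal rejected: ${entryName}`);
  }
  if (segments.length === 0) {
    throw new Error("empty entry name rejected");
  }
  const root = dest.endsWith("/") ? dest.slice(0, -1) : dest;
  const target = `${root}/${segments.join("/")}`;
  // Defensive re-check: the resolved target must stay under dest.
  if (!target.startsWith(root + "/")) {
    throw new Error(`target escapes destination: ${entryName}`);
  }
  return target;
}
```

Keeping these checks inside the parser, rather than in a caller, means no code path can extract an entry without passing through them, which is the point of the invariant.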

| # | What | Where | Why / effect | Invariant to preserve |
|---|---|---|---|---|
| 18 | Logger buffers chunks as string[], joins on flush (was += accumulation) | orchestrator/logger.ts | += was O(N²) over total bytes for chatty tasks | Per-task ordering within a stream must be append-only |
| 19 | Memoized Bun.color ANSI lookups | orchestrator/colors.ts | Called thousands of times with one of four hex strings | Cache key is the color string; gating (NO_COLOR etc.) happens before lookup |
| 20 | Bun.Glob for filter matching + recursive listing (was hand-rolled regex / readdir recursion) | workspace/filter.ts, cache/layered-cache.ts | Native glob engine | Glob semantics are now Bun’s — brace/bracket behavior changes with Bun upgrades |
| 21 | Concurrent project discovery (Promise.all over package globs) | workspace/workspace.ts | Was serialized | Dedupe pass after must keep deterministic order |
| 22 | AbortSignal.timeout for remote-cache fetches | cache/remote-cache.ts | Drops the manual controller + setTimeout ceremony | Catch both AbortError and TimeoutError |
| 23 | toPosix fast path when path.sep === '/' | util/paths.ts | Skips split/join on the dominant platform | Windows is unsupported anyway; revisit if that changes |
| 24 | Hoisted dynamic imports out of per-task paths | execute-task.ts, layered-cache.ts | await import() per task was measurable | — |
| 25 | Bun.spawn everywhere (with resourceUsage()) | exec/runner.ts | Native spawn + free cpu_ms / peak-RSS capture per child | — |
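
Row 18's buffering change is small enough to show whole. The class below is a sketch with hypothetical names (`ChunkBuffer`, `append`, `flush`), not the logger in orchestrator/logger.ts: chunks are pushed onto an array in arrival order and joined once on flush, so ordering within a stream stays append-only.

```typescript
// Append-only chunk buffer: one join per flush instead of one
// string concatenation per chunk.
class ChunkBuffer {
  private chunks: string[] = [];

  append(chunk: string): void {
    this.chunks.push(chunk);
  }

  // Returns everything appended since the last flush, in arrival order.
  flush(): string {
    const out = this.chunks.join("");
    this.chunks = [];
    return out;
  }
}
```

A task's stdout and stderr would each get their own buffer under this shape, so the per-stream ordering invariant holds without any cross-stream coordination.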

| # | What | Where | Why / effect | Invariant to preserve |
|---|---|---|---|---|
| 26 | Bitset transitive-dependent closure for scheduler priority (was Set-DFS) | graph/scheduler.ts:computeReverseDepCount | 8.5 s → ms on dense 100-layer graphs; O(E·N/32) | Counts stay EXACT (popcount); own Kahn pass — never trust Map insertion order |
| 27 | Bitset package-graph closures, sorted-name indexing (sort-free materialization) | workspace/package-graph.ts | 68 ms → ~ms at 1090 projects | Cyclic dep graphs fall back wholesale to legacy DFS |
| 28 | Binary-search git-file partitioning (was O(P·F) startsWith) | cache/inputs.ts:populateGitFilesCache | 54 ms → ~5 ms at 1090×9k | Prefix ranges need the array sorted with the SAME comparator as the search |
| 29 | Parallel project discovery; one readdir replaces 4 config exists-probes | workspace/workspace.ts:listProjects | serial I/O → concurrent | Duplicate-name check stays a deterministic sequential pass |
| 30 | Frontier ^task expansion (nearest holder per path, sparse bridging kept) — v19 | graph/task-graph.ts | 8.5× fewer edges on dense graphs; shrinks group-hash sorts, dep sorts, closures, upstream folds at once | Holder’s own dependsOn owns deeper ordering (Turbo parity, documented stop) |
| 31 | Git blob OIDs as input-file hashes; dirty files get in-process blob OIDs; both git spawns concurrent — v20 | cache/inputs.ts, cache/cache.ts:hashFile | Clean-tree hashing: zero reads/stats/SQLite. 3.2× warm run-phase at 15k files (245→76 ms) | A file’s key contribution must never flip across dirty↔clean (uniform blob-OID domain); symlinks/conflict stages never trusted |
| 32 | Early cutoff: downstream keys fold upstream OUTPUT content identity — v21 REVERTED in v22 | orchestrator/upstream.ts (now folds the upstream INPUT key) | Removed: pure-input transitive hashing replaced it (owner: “rely only on task input hashes”). Identical-output rebuilds re-run dependents again — rare in practice; see caching.md. | n/a — no output content participates in any cache key in v22 |
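
Row 28's "same comparator" invariant can be illustrated with a prefix-range lookup over a sorted file list. This is a sketch under assumptions: paths are POSIX-style, ordering is by UTF-16 code unit, and `compareCodeUnits`, `lowerBound`, and `filesUnder` are hypothetical names rather than the contents of populateGitFilesCache. Nested-project carve-outs are not handled here.

```typescript
// The single comparator used for BOTH sorting and searching.
function compareCodeUnits(a: string, b: string): number {
  return a < b ? -1 : a > b ? 1 : 0;
}

// First index whose element is >= target.
function lowerBound(sorted: string[], target: string): number {
  let lo = 0;
  let hi = sorted.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (compareCodeUnits(sorted[mid], target) < 0) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// All files under projectDir, found with two binary searches.
function filesUnder(sortedFiles: string[], projectDir: string): string[] {
  const start = lowerBound(sortedFiles, projectDir + "/");
  // "0" is the code unit immediately after "/", so projectDir + "0" is the
  // smallest string that sorts after every path beginning with projectDir + "/".
  const end = lowerBound(sortedFiles, projectDir + "0");
  return sortedFiles.slice(start, end);
}
```

If the array were sorted with a locale-aware comparator while the search compared code units, the prefix range would no longer be contiguous from the search's point of view and files could be silently dropped from a project's inputs.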
From benchmarks.md and the perf-pass backlog — candidates with a
measured or suspected win, parked until profiled:
- Batched cache-entry lookup for the all-hits path (Nx does one
getBatch query; we probe per task). A previous getMetaBatch
attempt was removed in the 2026-05 dead-code pass because it never
wired into execute-task.ts — re-attempt only with the wiring.
- Group-task reverse-index for ^build fan-out re-resolution.
- Memoized taskConfigHash for projects sharing a preset config
object (hash by object identity per run).
- relPosix fast path for ASCII workspace-root prefixes in the
enumeration loop.
Stale claims, for the record: benchmarks.md’s “hardlink restore”
bullet describes a pre-v17 design that never shipped in this form —
restores are tar extracts with a stat-check fast path (see #12/#13);
tar hardlink entries are rejected as a security measure.