agenticow — Git for Agent Memory

01 Q1 · What is this?

Git, but for an AI agent's memory

In one plain sentence, then grounded all the way down to earth.

agenticow is "Git for Agent Memory": instead of copying an AI agent's entire vector memory every time you want a private version of it, agenticow branches it — copy-on-write, like Git — so a fork, checkpoint, or rollback costs about half a millisecond and 162 bytes no matter how big the memory is.

Here's the everyday version. Modern AI agents remember things by turning text into long lists of numbers — embeddings — and storing them in a vector database so they can later find "the most similar thing I've seen before." That store is the agent's memory.

The expensive habit: the moment you want a separate version of that memory — one per customer, a throwaway sandbox to test a risky import, a safety checkpoint before a tool call — every normal vector database makes you copy the whole thing. agenticow branches it instead. A branch records only its own edits plus a pointer to its parent — 162 bytes, constant time, independent of base size.

The repo's own analogy: "Git doesn't make developers write better code — it lets thousands of them work concurrently, isolate mistakes, roll back, and merge through CI. agenticow is the same shape for cheap-model fleets." ("COW" = Copy-On-Write — and yes, the name is a pun on the cow.)

One glowing violet base page with thin transparent mint-green overlay sheets floating above it; each overlay records only a few new marks while the base shows through untouched. — Friendly view

Technical view — full copy vs. branch

A full copy duplicates the whole index — 496 MB · 67 ms at 1M vectors, every time. A branch stores only its edits plus a parent pointer — 162 B · 0.47 ms, flat. (README "Why".)

Honest framing the repo insists on: agenticow is "Git for agent memory, not a way to make agents smarter… The honest claim is leverage, not intelligence." It's plumbing that makes running many isolated agent-memories cheap and safe — it does not improve the model's reasoning.

02 Q2 · What problem does it solve?

Every AI agent needs a memory. The trouble starts when you need more than one version of it.

The natural fix — copy the memory — is so expensive that most people quietly stop doing the safe, smart things. That avoidance is the real cost.

Start from the very beginning. An AI agent's memory is just everything it has been given and has learned — your documents, your corrections, what's happened so far. One memory, shared by everyone using it, is fine… until the day you need a second version of it: a private memory for one customer, a safe place to test a risky new batch of documents, or a "save point" you can return to if something goes wrong.

The obvious way to make a second version is to copy the memory. And that's the trap. An agent's memory is big — so copying it is like photocopying an entire filing cabinet just to jot one note on one page, or duplicating a 50 GB video game every single time you want a save point. It's so slow and so wasteful that, in practice, people just don't — and then they live with the consequences of not having separate versions.

The core problem in one line: It's not that copying memory is impossible — it's that it's expensive enough that you avoid the safe, smart things (per-user privacy, save-points, sandboxing, experiments). agenticow makes a separate version cost 162 bytes instead of hundreds of megabytes — so you simply stop having to choose.

Four ways this quietly bites you — and how agenticow fixes each

Everyone's data ends up in one shared brain

Without

You'd love to give each customer their own private memory, but a full copy per customer costs hundreds of MB each — unaffordable. So you cram everyone into one shared memory, and one customer's private notes start surfacing in another customer's answers.

With

Each customer gets their own branch — 162 bytes, instant. Private and isolated by construction. The repo's multi-tenant run reports 0/200 leaks across 1,000 tenant branches.

You're afraid to teach it anything new

Without

You want to feed the agent a big new batch of documents to make it smarter — but if the batch is messy and makes it worse, there's no clean undo. You'd be rebuilding from a backup and losing work. So you play it safe and the agent stays frozen.

With

Drop a checkpoint, feed in the new docs, and if it's bad, roll back in about half a millisecond. Experimenting is finally safe, so the agent keeps getting better.

One bad document poisons the whole memory

Without

The agent ingests a document from somewhere you don't fully trust — a web page, a user upload. If it's malicious or garbage, it's now blended into the real memory, quietly corrupting answers, and pulling it back out cleanly is a nightmare.

With

Ingest it into a throwaway branch first, inspect it, and if it's bad, discard the branch — the repo's red-team run reports 0 vectors ever reached the real memory (rolled back in 1.1 ms).

You can only afford to try once

Without

You'd like to try 100 different ways of organizing the memory and keep whichever works best. But each attempt is a full copy — 100× the storage and time — so you try one or two and settle for "good enough."

With

100 branches cost about 16 KB total. Try them all in parallel, keep the winner, and throw the rest away for free.

Wait — how do you even end up here?

It's almost a rite of passage. You ship an agent with one shared memory and it works beautifully. Then real life shows up — usually within the first week or two of anything real:

A second customer signs up, and their data can't be allowed to mingle with the first one's.
You want to try a big new import without risking the memory that already works.
A tool call goes sideways, and you wish you'd kept a restore point from five minutes ago.
You'd like to run a few experiments at once and keep whichever turns out best.

None of these are exotic — they show up the moment more than one person, or more than one experiment, touches the same agent. For most teams that's not "someday," it's week two. And every single one of them needs the exact same thing underneath: a separate version of the memory. That's the fork in the road — and where the trouble starts.

What people normally do about it — and why it quietly hurts

The usual workaround	What it quietly costs you
Cram everyone into one shared memory	No real isolation — one user's data bleeds into another's answers.
Copy the whole index for each version	Storage and time explode — about 496 MB and 67 ms each, every time.
Snapshot to disk, reload on failure	Slow and coarse; you lose everything since the last snapshot.
Play it safe and just don't experiment	The agent freezes in place and quietly stops getting better.

Every one of those is the same root cause — copying a memory is expensive — and agenticow's answer is the same elegant move every time: don't copy it, branch it. One shared base, a weightless 162-byte overlay per version — and isolation, instant rollback, diff, and merge all come along for free. The thing that was "496 MB and 67 ms — every time" becomes "162 bytes and 0.47 ms, flat."

…and it's deliberately easy to adopt: There's no new database to run and no migration. It's one npm install agenticow, then a single base.branch('name') call — or the agenticow branch CLI with no code at all (see §10).

Left: a towering stack of identical bulky filing-cabinet copies being photocopied over and over, tinted coral to feel wasteful. Right: a single violet base with many thin weightless mint overlay sheets — copy-on-write branches. — Friendly view

Technical view — the cost of N versions

Full copies grow with every version — into gigabytes. Branches stay flat in kilobytes. The repo's measured multi-tenant run: 1,000 tenants at 2.4 KB each, 530× less disk than full copies.
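The gap between the two curves is plain multiplication. A minimal sketch, using only the per-version figures quoted above (496 MB per full copy, 162 B per empty branch) and assuming decimal units; `costOfVersions` is a hypothetical helper, not part of agenticow:

```javascript
// Per-version costs quoted in the text (assumption: decimal units, 1 MB = 1e6 bytes).
const FULL_COPY_BYTES = 496e6; // one full copy of a 1M-vector index
const BRANCH_BYTES = 162;      // one empty copy-on-write branch

// Hypothetical helper: total bytes for n versions under each strategy.
function costOfVersions(n) {
  return {
    fullCopies: n * FULL_COPY_BYTES,
    branches: n * BRANCH_BYTES,
  };
}

for (const n of [1, 100, 1000]) {
  const { fullCopies, branches } = costOfVersions(n);
  console.log(
    `${n} versions: ${(fullCopies / 1e9).toFixed(1)} GB copied vs ` +
    `${(branches / 1e3).toFixed(1)} KB branched`
  );
}
// 100 versions come to 16.2 KB of branches, matching the "about 16 KB" figure earlier.
```

These totals count empty branches only. The measured 2.4 KB per tenant is larger, presumably because each tenant branch also holds that tenant's own edits.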

03 Q3 · Why is that a problem now?

Cheap-model fleets win on orchestration, not on smarter execution

The lever is running many cheap isolated attempts and throwing failures away — which only works if failure is free.

The repo's thesis: "smarter orchestration, not smarter execution." With cheap models, the win comes from running many isolated attempts and discarding the bad ones — so "'throw it away and try a fresh independent attempt' beats 'make it reflect on its mistake.'"

Reasoning scaffolds can backfire on cheap models. The README cites a scaffolding ablation (FRAMES benchmark, cheap models deepseek-v4-pro + glm-5.2) where clever reasoning prompts make cheap models do worse, not better.
So you parallelize instead. Run a thousand cheap, isolated attempts off one shared memory, keep the winners, drop the rest. That only works if spinning up (and throwing away) an isolated memory is nearly free.
That's exactly what cheap branching unlocks. agenticow is "infrastructure that turns 'run 1,000 cheap agents safely' into a tractable, near-free operation" — it "makes failure free."

A glowing timeline with mint checkpoint flags and a bright rewind arrow snapping back to a clean checkpoint, while a small cluster of coral poison specks is discarded and fades away. — Friendly view

Technical view — reflect-and-retry vs. fan-out-and-discard

The shift: not "make one agent reflect harder," but "run many cheap isolated attempts and keep the winners." agenticow is the plumbing that makes the throwaway branches free.
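The fan-out-and-discard pattern can be sketched without any real model or vector store. In the toy below, plain Maps stand in for memory, `runAttempt` is a hypothetical stand-in for a cheap-model attempt (made deterministic so the example is reproducible), and `passes` is an illustrative checker. None of these names come from agenticow:

```javascript
// Toy model only: plain Maps stand in for a vector memory; no agenticow calls.
const base = new Map([["fact:1", "paris"], ["fact:2", "berlin"]]);

// A "fork" is an empty overlay plus a reference to the shared base.
function fork(parent, label) {
  return { label, parent, edits: new Map() };
}

// Hypothetical cheap-model attempt; every third seed produces a usable answer.
function runAttempt(branch, seed) {
  branch.edits.set("answer", seed % 3 === 0 ? "42" : "unknown");
}

// Deterministic gate: a plain checker, not a model grading itself.
function passes(branch) {
  return /^\d+$/.test(branch.edits.get("answer"));
}

const attempts = Array.from({ length: 9 }, (_, i) => fork(base, `attempt-${i}`));
attempts.forEach((branch, i) => runAttempt(branch, i));

// Keep the winners; the losers are simply dropped.
const winners = attempts.filter(passes);

console.log(winners.map((b) => b.label)); // [ 'attempt-0', 'attempt-3', 'attempt-6' ]
console.log(base.has("answer"));          // false: no attempt ever wrote to the base
```

The point of the shape is that a failed attempt leaves nothing to clean up: its overlay is the only thing it touched.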

04 Q4 · How does it solve it?

Copy-on-write branches + a read-through query that merges the lineage

A branch stores only its edits; a query walks the chain child → … → base and merges the results.

Copy-on-write, in plain terms: a branch "records only its own edits plus a pointer to its parent." Nothing is copied up front — the base is shared and read straight through. You only ever store what a branch actually changes.
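To make "only its own edits plus a pointer to its parent" concrete, here is a minimal in-memory sketch. `ToyStore` is a hypothetical class written for this page; it illustrates the idea and says nothing about agenticow's on-disk format or internals:

```javascript
// Toy copy-on-write store: an in-memory illustration only.
class ToyStore {
  constructor(parent = null) {
    this.parent = parent;   // pointer to the parent store (null for the base)
    this.edits = new Map(); // only what THIS store wrote
  }
  ingest(id, vector) {
    this.edits.set(id, vector);
  }
  branch() {
    return new ToyStore(this); // copies nothing, whatever the parent's size
  }
  // Read-through lookup: the nearest store in the chain that has the id wins.
  get(id) {
    for (let s = this; s !== null; s = s.parent) {
      if (s.edits.has(id)) return s.edits.get(id);
    }
    return undefined;
  }
}

const base = new ToyStore();
for (let i = 0; i < 10000; i++) base.ingest(i, [i, i + 1]);

const child = base.branch();
child.ingest(7, [0, 0]);     // override one id
child.ingest(10000, [1, 1]); // add one id

console.log(child.edits.size); // 2: the branch holds two edits, not 10,002 vectors
console.log(child.get(7));     // [ 0, 0 ]  (the child's override)
console.log(base.get(7));      // [ 7, 8 ]  (the base is unchanged)
console.log(child.get(42));    // [ 42, 43 ] (read straight through to the base)
```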

A base page with transparent overlay sheets; one overlay carries a small coral X (a tombstone hiding something on the base) while the base shows through untouched beneath. — Friendly view

Technical view — the read-through merge

A query "walks the lineage chain (child → … → base), merges each store's results, lets the child win on any id collision, masks anything the branch tombstoned, and re-ranks by exact distance."
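The quoted steps can be traced in a small sketch. This is a toy under stated assumptions: brute-force L2 search stands in for the real ANN engine, and `makeStore` and `query` are hypothetical helpers, not agenticow's API. The hit shape `{ id, distance, branch }` follows the comment in the README snippet:

```javascript
// Toy read-through query. Brute-force search stands in for the real engine.
function l2(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += (a[i] - b[i]) ** 2;
  return Math.sqrt(sum);
}

function makeStore(name, parent = null) {
  return { name, parent, vectors: new Map(), tombstones: new Set() };
}

function query(store, q, k) {
  const seen = new Set();   // ids already supplied by a closer descendant
  const hidden = new Set(); // tombstones collected while walking toward the base
  const hits = [];
  for (let s = store; s !== null; s = s.parent) {
    for (const [id, vector] of s.vectors) {
      if (seen.has(id) || hidden.has(id)) continue; // child wins; tombstones mask
      seen.add(id);
      hits.push({ id, distance: l2(q, vector), branch: s.name });
    }
    // A store's tombstones apply to its ancestors, so add them after its own vectors.
    for (const id of s.tombstones) hidden.add(id);
  }
  // Re-rank the merged candidates by exact distance.
  return hits.sort((a, b) => a.distance - b.distance).slice(0, k);
}

const base = makeStore("base");
base.vectors.set(1, [0, 0]);
base.vectors.set(2, [1, 0]);
base.vectors.set(3, [5, 5]);

const child = makeStore("agent-a", base);
child.vectors.set(2, [0, 3]); // overrides base id 2
child.vectors.set(9, [0, 1]); // new id
child.tombstones.add(1);      // hides base id 1

const results = query(child, [0, 0], 3);
console.log(results.map((h) => `${h.id}@${h.branch}`));
// [ '9@agent-a', '2@agent-a', '3@base' ]
```

Querying the base directly still returns id 1 first, since the tombstone lives only in the child.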

The Git-shaped operations you get

branch() / fork() — make an isolated child (162 B, ~0.5 ms).
delete() — a copy-on-write tombstone that hides ancestor ids without touching the base.
checkpoint() / rollback() — freeze a restore point, then discard everything since (rollback p50 0.571 ms).
diff() — returns { added, overridden, deleted }, just like a Git diff.
promote(target) — "replay this branch's edits into target" (a Git-style merge).
lineage() / status() — an auditable parent/label/timestamp trail.
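One way to picture how these operations fit together is an append-only edit log per branch. The sketch below is a toy built on that assumption: method names mirror the list above, but the log-based behaviour and the `ToyBranch` class are illustrative guesses, not agenticow's implementation:

```javascript
// Toy branch backed by an edit log. Behaviour here is an assumption for illustration.
class ToyBranch {
  constructor(parent = null) {
    this.parent = parent;
    this.log = []; // ordered edits: { op: "put" | "del", id, vector }
  }
  ingest(id, vector) { this.log.push({ op: "put", id, vector }); }
  delete(id) { this.log.push({ op: "del", id }); } // tombstone, parent untouched
  branch() { return new ToyBranch(this); }

  checkpoint() { return { id: this.log.length }; }           // remember a log position
  rollback(checkpointId) { this.log.length = checkpointId; } // drop every later edit

  visibleInParent(id) {
    return this.parent !== null && this.parent.has(id);
  }
  has(id) {
    for (let i = this.log.length - 1; i >= 0; i--) {
      if (this.log[i].id === id) return this.log[i].op === "put";
    }
    return this.visibleInParent(id);
  }

  diff() {
    const last = new Map(); // final edit per id
    for (const e of this.log) last.set(e.id, e);
    const out = { added: [], overridden: [], deleted: [] };
    for (const [id, e] of last) {
      const inParent = this.visibleInParent(id);
      if (e.op === "put") (inParent ? out.overridden : out.added).push(id);
      else if (inParent) out.deleted.push(id);
    }
    return out;
  }

  promote(target) { // replay this branch's edits into target
    for (const e of this.log) target.log.push({ ...e });
  }
}

const base = new ToyBranch();
base.ingest(1, [1]);
base.ingest(2, [2]);

const work = base.branch();
work.ingest(2, [20]);   // override
work.ingest(3, [3]);    // add
work.delete(1);         // tombstone
const ckpt = work.checkpoint();
work.ingest(666, [6]);  // bad ingest
work.rollback(ckpt.id); // the bad ingest is gone

const changes = work.diff();
console.log(changes);   // { added: [ 3 ], overridden: [ 2 ], deleted: [ 1 ] }

work.promote(base);
console.log(base.has(3), base.has(1), base.has(666)); // true false false
```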

Built on ruvector: The vector file format (RVF) and the native search live in the upstream ruvector Rust engine; agenticow is the JavaScript copy-on-write / lineage layer on top, with one runtime dependency — @ruvector/rvf-node.

05 Q5 · What does a solved state look like?

From 496 MB / 67 ms per copy to 162 B / 0.47 ms per branch

The before → after, stated as concrete, reproducible numbers from the repo.

"Solved" isn't abstract here — it's a set of measured numbers you can reproduce with npx agenticow bench.

Before · normal vector DB

To snapshot / fork / checkpoint, full-copy the index: at 1M vectors that's 496 MB and 67 ms, every time. 1,000 versions = gigabytes of near-identical copies — so in practice you don't give everyone their own memory.

→

After · agenticow

Branch it copy-on-write: 162 bytes and 0.47 ms, flat, any base size. Rollback p50 0.571 ms. 1,000 tenants at 2.4 KB each, 530× less disk, 0/200 leaks.

Operation at 1M-vector scale	Normal vector DB	agenticow	Win
Make a private version (fork/snapshot)	496 MB	162 B	~3000× smaller
Time to create it	67 ms	0.47 ms	~83× faster
Roll back to a checkpoint	re-copy	0.571 ms	p50
1,000 per-tenant memories on disk	full copies	2.4 KB ea.	530× less

The before/after again: a heavy stack of full filing-cabinet copies versus one base with weightless overlay branches. — Friendly view

Technical view — the headline gaps

The two headline gaps, drawn to scale: a branch is a sliver next to a full copy — ~3000× smaller and ~83× faster to create.

Read these numbers honestly: This is a win on branching cost, not raw search speed. agenticow "concedes raw single-index ANN throughput" and runs "~6.3× behind hnswlib at 1M-vector scale" — a deliberate trade (see §11). The leverage is cheap isolation, instant rollback, and auditable lineage.

06 The "oh — that's what it's for" moment

Sofia runs a small SaaS writing-assistant

One named, ordinary person. A real before → after. No database degree required.

Sofia runs a small SaaS writing-assistant. Every customer's assistant should remember their documents, their style, their corrections — privately. She's not a database engineer; she just needs each customer to have their own memory without it costing a fortune or leaking across accounts.

One shared violet base memory at the center, with thin mint branches fanning out to a ring of separate customer workspaces, each isolated from the others by an invisible boundary. — Friendly view — Sofia's one base, a branch per customer

Before

To give each customer a private memory, Sofia's only option is to copy the shared base index for every one of them — hundreds of MB and tens of ms per customer, every time. A thousand customers is hundreds of GB of near-identical copies. So in practice she can't afford per-customer memory, and bolts everyone into one shared store — where one customer's data can surface in another's results.

→

After agenticow

Sofia keeps one base and gives each customer a branch — 162 bytes, ~0.5 ms, any base size. A query merges the customer's branch over the shared base, child-wins, tombstones masked — so each customer sees their own memory and nothing else ("0/200 leaks, 2.4 KB/tenant, 530× less disk"). If a nightly import goes wrong, she rollback()s that one customer in ~0.5 ms without touching anyone else.

The "oh, that's what it's for" line: It's the difference between photocopying the entire filing cabinet for every customer, and giving each one a transparent overlay sheet that only records what they changed.

The honest hedge the repo demands: The per-tenant isolation is real and measured, but Sofia's win is cheap isolation, instant rollback, and auditable lineage — not "the fastest possible search." If she needed maximum raw search speed on one static index, the repo itself says to use a dedicated ANN library.

07 Q6 · Where else does this apply?

A gallery of real, runnable uses

Eight scenarios — each its own card. Every command maps to a real script in the repo's examples/.

This is the fun part: eight real examples of how you actually use this — to make your agent (sorry in advance) udderly more efficient. Open any card for the situation, the exact command, what it does, and a diagram of what happens. (Every one maps to a script in examples/, per examples/README.md.)

A detailed Git-style branch graph: a violet trunk of base commits with mint feature branches splitting off, a few merging back through a glowing gate, and one or two ending in faint coral dead-ends. — Friendly view — every demo is the same move

Technical view — how the 8 demos relate

The eight demos in four groups: fan out, isolate & rewind, merge, and verify — every one built on the same cheap copy-on-write branch.

1. Run a thousand cheap agents in parallel — safely (parallel-agents.mjs)

Situation: You want to fan a task out to many cheap-model attempts and keep only the winners.

What it does: "Fork N branches from a base, ingest + tombstone per branch, query each, roll one back." Each agent gets an isolated memory off one shared base.

What you get: N independent agent-memories for ~162 bytes each, so "try 1,000 attempts and throw the bad ones away" is near-free.

$ node examples/parallel-agents.mjs

Visual — one base, many forks, a few discarded

2. Per-customer / multi-tenant memory (one base, a branch per tenant) (multi-tenant-saas.mjs)

Situation: A SaaS product where every customer needs private memory that can't leak.

What it does: Spins up "1,000 isolated tenant branches" over a shared base and probes for cross-tenant leaks.

What you get: Measured 0/200 leaks, 2.4 KB/tenant, 530× less disk than full copies.

$ node examples/multi-tenant-saas.mjs

Visual — a shared base, isolated tenant overlays
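Why a branch-per-tenant layout is isolated "by construction" can be seen in a toy probe. The sketch assumes each tenant overlay links only to the shared base; `makeStore` and `visible` are hypothetical helpers, and the choice of 200 probes merely echoes the 0/200 figure rather than reproducing how the repo's script probes:

```javascript
// Toy isolation probe: each tenant is an overlay whose only link is to the shared base.
function makeStore(name, parent = null) {
  return { name, parent, vectors: new Map() };
}

// Collect every id visible from a store, tagged with the store that supplied it.
function visible(store) {
  const out = new Map();
  for (let s = store; s !== null; s = s.parent) {
    for (const id of s.vectors.keys()) {
      if (!out.has(id)) out.set(id, s.name);
    }
  }
  return out;
}

const base = makeStore("base");
base.vectors.set("shared-1", [0.1, 0.2]);

const tenants = [];
for (let t = 0; t < 1000; t++) {
  const branch = makeStore(`tenant-${t}`, base);
  branch.vectors.set(`private-${t}`, [t, t]);
  tenants.push(branch);
}

// A leak is any visible id supplied by a store other than the tenant itself or the base.
let leaks = 0;
for (const tenant of tenants.slice(0, 200)) {
  for (const owner of visible(tenant).values()) {
    if (owner !== tenant.name && owner !== "base") leaks++;
  }
}
console.log(leaks); // 0
```

No sibling branch sits on a tenant's lineage chain, so a read-through walk has no path to another tenant's data.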

3. Sandbox an untrusted document (red-team / prompt-injection safety) (red-team-sandbox.mjs)

Situation: You must ingest a document you don't trust without letting it poison the real memory.

What it does: The untrusted doc lands in an isolated fork; a deterministic injection-distance probe gates it; if it's an exploit you rollback().

What you get: "1.1 ms, 0 vectors reached base" — the attack is contained and erased without ever touching production memory.

$ node examples/red-team-sandbox.mjs

Visual — poison quarantined, base untouched
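The quarantine flow can be sketched as "write to an overlay, gate, then either drop or promote". Everything specific below is an assumption made for illustration: the `KNOWN_INJECTIONS` vectors, the `THRESHOLD` value, and the idea that the probe compares L2 distance to known attack embeddings. The repo's actual probe may work differently:

```javascript
// Toy quarantine flow. Probe vectors and threshold are invented for illustration.
function l2(a, b) {
  return Math.hypot(...a.map((x, i) => x - b[i]));
}

const KNOWN_INJECTIONS = [[1, 0, 0]]; // hypothetical embeddings of known attacks
const THRESHOLD = 0.25;               // hypothetical: anything closer is rejected

// Deterministic gate: same input, same verdict, no model in the loop.
function looksLikeInjection(vector) {
  return KNOWN_INJECTIONS.some((bad) => l2(vector, bad) < THRESHOLD);
}

const base = new Map([[1, [0, 1, 0]]]); // stands in for production memory

function ingestUntrusted(docs) {
  const sandbox = new Map(); // isolated overlay; the base is never written here
  for (const d of docs) sandbox.set(d.id, d.vector);

  if ([...sandbox.values()].some(looksLikeInjection)) {
    return { accepted: false, reachedBase: 0 }; // drop the overlay
  }
  for (const [id, v] of sandbox) base.set(id, v); // promote only after the gate
  return { accepted: true, reachedBase: sandbox.size };
}

const rejected = ingestUntrusted([{ id: 2, vector: [0.9, 0.1, 0] }]);
const accepted = ingestUntrusted([{ id: 3, vector: [0, 0, 1] }]);

console.log(rejected.accepted, base.has(2)); // false false
console.log(accepted.accepted, base.has(3)); // true true
```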

4. Time-travel debugging & crash-recovery checkpoints (time-travel-debug.mjs)

Situation: A latent bug corrupted the agent's memory and you need to rewind past it.

What it does: checkpoint() freezes restore points; rollback(checkpointId) discards everything since — no replaying the agent's steps.

What you get: The corrupted state rewound to a known-good point in well under a millisecond (rollback p50 0.571 ms).

$ node examples/time-travel-debug.mjs

Visual — rewind to a clean checkpoint

5. A/B test at scale, then promote the winner (ab-at-scale.mjs · promotion-pipeline.mjs)

Situation: You want to try many memory variants and merge the best one into production.

What it does: Branches 128 variants off one base; the pipeline runs agent → sandbox → review → prod, using diff() and promote(target) to "replay this branch's edits into target."

What you get: A scaled experiment plus a gated path to ship the winner — a Git-style merge.

$ node examples/ab-at-scale.mjs

Visual — 128 variants → gate → one promoted
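The gate in that pipeline is where a winner gets chosen, and the repo asks for that choice to be deterministic (tests, regex, checkers, or a plain majority vote) rather than a model judging its own output. A minimal sketch of both selectors follows; the check rules and sample answers are hypothetical, chosen only to show the shape:

```javascript
// Two deterministic selectors; neither asks a model to grade itself.

// Plain majority vote over candidate answers (ties go to the answer seen first).
function majorityVote(answers) {
  const counts = new Map();
  for (const a of answers) counts.set(a, (counts.get(a) || 0) + 1);
  let best = null;
  for (const [answer, n] of counts) {
    if (best === null || n > best.n) best = { answer, n };
  }
  return best ? best.answer : null;
}

// Checker-style gate: a candidate is promotable only if every check passes.
const checks = [
  (out) => /^\d{4}-\d{2}-\d{2}$/.test(out), // hypothetical rule: looks like an ISO date
  (out) => !Number.isNaN(Date.parse(out)),  // hypothetical rule: parses as a date
];
function gate(output) {
  return checks.every((check) => check(output));
}

const answers = ["not-a-date", "2024-03-01", "2024-03-01", "tomorrow"];

console.log(majorityVote(answers)); // 2024-03-01
console.log(answers.filter(gate));  // [ '2024-03-01', '2024-03-01' ]
```

Either selector gives the same verdict on every run, which is what makes the promotion step auditable.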

6. Compliance, lineage & GDPR right-to-erasure (compliance-lineage.mjs)

Situation: A user invokes "delete my data" and you must prove it's gone.

What it does: lineage() / status() give an auditable parent/label/timestamp trail; because each user's data lives in their own branch layer, you drop that layer to surgically erase them.

What you get: Provable, scoped deletion plus an audit trail — instead of hunting one user's vectors out of a shared index.

$ node examples/compliance-lineage.mjs

Visual — detach one user's layer, the rest intact
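Erasure-by-layer is easy to picture once each user's data sits in its own overlay. The sketch below is a toy: `layers`, `audit`, `branchFor`, and `erase` are hypothetical names, and the audit record shape (action, label, timestamp) only loosely mirrors the trail described above:

```javascript
// Toy layered memory: one overlay per user plus an append-only audit trail.
const layers = new Map(); // userId -> Map(id -> vector)
const audit = [];         // trail entries: { action, label, at }

function record(action, label) {
  audit.push({ action, label, at: new Date().toISOString() });
}

function branchFor(userId) {
  layers.set(userId, new Map());
  record("branch", userId);
  return layers.get(userId);
}

// Erasing a user means dropping their whole layer; other layers are untouched.
function erase(userId) {
  const existed = layers.delete(userId);
  record("erase", userId);
  return existed;
}

branchFor("user-1").set("note-a", [0.1]);
branchFor("user-2").set("note-b", [0.2]);

erase("user-1");

console.log(layers.has("user-1"));                        // false
console.log(layers.get("user-2").size);                   // 1
console.log(audit.map((e) => `${e.action}:${e.label}`));
// [ 'branch:user-1', 'branch:user-2', 'erase:user-1' ]
```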

7. Verify the headline numbers yourself (bench)

Situation: You don't believe "162 bytes / 0.47 ms / 83× / 3000×."

What it does: Runs the benchmark and the acceptance suite that produce bench/acceptance-results.json.

What you get: The branch-cost, rollback-latency, and disk-savings figures reproduced on your own machine — "tests 8/8 passing."

$ npx agenticow bench

Visual — the bench readout

8. See it run end-to-end with no code (demo)

Situation: You just want to watch the whole branch → checkpoint → rollback → diff story once.

What it does: Runs the scripted end-to-end walkthrough built into the CLI.

What you get: A guided tour of every core operation against a real store — the fastest way to "get it."

$ agenticow demo

Visual — the scripted walkthrough

08 "I already have a vector DB — why this too?"

Why this vs. the vector database you already use

Answered head-on — including the trade agenticow names against itself.

You might already use…	What agenticow changes
Pinecone / Chroma / pgvector / hnswlib	Those are built to search one index fast. To get a second isolated version you full-copy the index. agenticow keeps search "good enough" and makes branching the cheap primitive — 162 B / 0.47 ms vs 496 MB / 67 ms. It's often used on top of an engine, not instead of one.
"I'll just snapshot it myself"	A snapshot is still a whole copy. agenticow's branches are copy-on-write and queryable live (read-through merge), diff-able (`{ added, overridden, deleted }`), and merge-able (`promote`). It's version control, not backups.
Any of the above	The one thing nothing else here gives you: constant-cost, constant-size branching of a vector memory with Git-shaped semantics — branch / checkpoint / rollback / diff / promote / lineage. That's the moat.

The before/after split once more: heavy full-copy filing cabinets versus one base with weightless branch overlays. — Friendly view

Technical view — the deliberate trade

agenticow's wedge: it gives up the raw-search-speed race ("~6.3× behind hnswlib at 1M", a deliberate trade) to own the corner nothing else here occupies — constant-cost branching.

09 Q · How would you implement it?

What's actually inside — one small npm package

agenticow is a JavaScript library + CLI, not a model or a service. Here's the real layout.

One npm package, ESM, Node ≥ 18, one runtime dependency. This is the layout (reconstructed from package.json files + README references) — annotated so you know what each part is for.

agenticow/                  # one npm package · ESM · Node ≥ 18 · MIT © ruvnet
├ src/
│ ├ index.js                # the library: open / branch / fork / query / delete /
│ │                         # checkpoint / rollback / diff / promote / lineage ← the heart
│ └ index.d.ts              # TypeScript types
├ bin/agenticow.js          # the CLI: init · ingest · branch · checkpoint · rollback ·
│                           # diff · promote · query · lineage · demo · bench · acceptance ← front door
├ examples/                 # 16 runnable .mjs (parallel-agents, multi-tenant-saas,
│                           # red-team-sandbox, time-travel-debug, ab-at-scale …) ← the proof
├ bench/                    # bench.js · acceptance.js · claim-ladder.js · acceptance-results.json
├ test/                     # *.test.js (README badge: tests 8/8 passing)
└ # depends on → @ruvector/rvf-node ^0.2.0   # prebuilt native (Rust/NAPI) RVF engine

Use it in your own JavaScript

Open a base, branch it, and the branch is isolated and instantly queryable. Roll back to wipe a bad ingest without touching the clean memory.

$ npm install agenticow

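import { open } from 'agenticow';

// embedding, newMemory, queryVector, and poison are placeholders for your own
// vectors; each should match the store's dimension (1536 here).
const base  = open('memory.rvf', { dimension: 1536 });
base.ingest([{ id: 1, vector: embedding }, /* ... */]);

// Branch: an isolated child that reads through to the base.
const agent = base.branch('agent-a');        // ~0.5 ms / 162 B, any base size
agent.ingest([{ id: 9001, vector: newMemory }]);

const hits  = agent.query(queryVector, 10);  // -> [{ id, distance, branch }, ...]

// Checkpoint before a risky ingest, then roll back to discard it.
const ckpt  = agent.checkpoint('clean');
agent.ingest([{ id: 666, vector: poison }]);
agent.rollback(ckpt.id);                      // poison gone, clean memory intact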

Technical view — where agenticow sits in your stack

agenticow is a thin JavaScript copy-on-write/lineage layer; the vector format (RVF) and native search live below it in the ruvector engine.

10 Q7 · How do you start?

Two ways to start — both real, both in the README

One npm install, then either the CLI (no code) or the API.

Install it. Requirements: Node ≥ 18, ESM.
$ npm install agenticow
CLI, no code. Init a store, ingest, branch per user, query the read-through, see the diff:
$ agenticow init mem.rvf --dim 128
$ agenticow ingest mem.rvf --n 5000
$ agenticow branch mem.rvf --as user-42
$ agenticow query mem.rvf.user-42.rvf --k 10
$ agenticow diff mem.rvf.user-42.rvf
Or use the API (import { open } from 'agenticow') — see §09 for the worked snippet. To just watch it run, agenticow demo.

A timeline with checkpoint flags and a rewind arrow — the checkpoint/rollback you get for free the moment you branch. — Friendly view

Technical view — two on-ramps

Pick the CLI to kick the tires with no code, or the API to wire branching into your own agent.

Native fast-path note: The native cross-branch ANN merge (fork(..., { nativeAnn: true }), recall@10 ≈ 1.0) ships for linux-x64-gnu today; on macOS / Windows / linux-arm64 it "degrade[s] gracefully to the exact read-through path" — same answers, just not the native speed-up.


11 In the repo's own words — don't soften them

Honest limits

agenticow is unusually candid about what it is and isn't. So is this page.

Not a faster search engine: It "concedes raw single-index ANN throughput" and runs "~6.3× behind hnswlib at 1M-vector scale" (SIFT-1M, recall@10 ≈ 0.97) — a deliberate trade. "If you need maximum raw similarity-search speed on a static index, use a dedicated ANN library."

Native cross-branch ANN is Linux-only today: The fast native path ships for linux-x64-gnu only; "darwin / win / linux-arm64 are pending a CI cross-compile and degrade gracefully to the exact read-through path."

It does not make models smarter: "The honest claim is leverage, not intelligence." The cognitive quality of a branch is explicitly out of scope — agenticow moves and isolates memory; it doesn't improve reasoning.

Selection must be external and deterministic: The repo warns against a cheap LM-judge, noting that "a verifier-gated LM-judge picks worse than a plain majority vote." The promotion gate must be tests / regex / checkers — "a scoring function, not validated AI cognition."

"Exotic" applications are vision / PoC: Parallel "selves," Darwin-style memory evolution, simulated orgs — research demos, not validated capabilities. The distribution / marketplace / merge-policy layer is "roadmap, not shipped."

A cosine reopen quirk: "rvf-node does not persist the cosine metric across a file reopen"; agenticow drives the engine with "L2 over L2-normalized vectors" for cosine. Reopen with { metric: 'cosine' } or use save() / load().
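The "L2 over L2-normalized vectors" trick rests on a standard identity: for unit vectors, squared L2 distance equals 2 − 2·cos, so ranking by L2 gives the same order as ranking by cosine similarity. A small numeric check, using helper functions written for this page (not agenticow or rvf-node code):

```javascript
// Check: for unit-length vectors, ||a - b||^2 = 2 - 2 * cos(a, b).
function normalize(v) {
  const norm = Math.hypot(...v);
  return v.map((x) => x / norm);
}
function dot(a, b) {
  return a.reduce((sum, x, i) => sum + x * b[i], 0);
}
function l2Squared(a, b) {
  return a.reduce((sum, x, i) => sum + (x - b[i]) ** 2, 0);
}

const a = [3, 4];
const b = [4, 3];

// Cosine similarity on the raw vectors: 24 / (5 * 5) = 0.96.
const cosine = dot(a, b) / (Math.hypot(...a) * Math.hypot(...b));

// Squared L2 distance after normalizing both vectors to unit length.
const d2 = l2Squared(normalize(a), normalize(b));

// The two sides agree up to floating-point rounding (both are about 0.08).
console.log(Math.abs(d2 - (2 - 2 * cosine)) < 1e-12); // true
```

Because the mapping from cosine to distance is monotonic, a smaller L2 distance always means a higher cosine similarity, which is why the engine can answer cosine queries with an L2 index.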

A version string lags, and it's very new: The README body still says agenticow@0.2.1 while package.json and npm are at 0.2.3 — read the version that matches what you install. The project was created 2026-06-28 and is early-stage infrastructure.

Stop copying your agent's memory. Branch it.
