01
Q1 · What is this?
Git, but for an AI agent's memory
In one plain sentence, then grounded all the way down to earth.
agenticow is "Git for Agent Memory": instead of copying an AI agent's entire vector memory every time you want a private version of it, agenticow branches it — copy-on-write, like Git — so a fork, checkpoint, or rollback costs about half a millisecond and 162 bytes no matter how big the memory is.
Here's the everyday version. Modern AI agents remember things by turning text into long lists of numbers — embeddings — and storing them in a vector database so they can later find "the most similar thing I've seen before." That store is the agent's memory.
The expensive habit: the moment you want a separate version of that memory — one per customer, a throwaway sandbox to test a risky import, a safety checkpoint before a tool call — every normal vector database makes you copy the whole thing. agenticow branches it instead. A branch records only its own edits plus a pointer to its parent — 162 bytes, constant time, independent of base size.
The repo's own analogy: "Git doesn't make developers write better code — it lets thousands of them work concurrently, isolate mistakes, roll back, and merge through CI. agenticow is the same shape for cheap-model fleets." ("COW" = Copy-On-Write — and yes, the name is a pun on the cow.)

A full copy costs 496 MB · 67 ms at 1M vectors, every time. A branch stores only its edits plus a parent pointer: 162 B · 0.47 ms, flat. (README "Why".)
02
Q2 · What problem does it solve?
Every AI agent needs a memory. The trouble starts when you need more than one version of it.
The natural fix — copy the memory — is so expensive that most people quietly stop doing the safe, smart things. That avoidance is the real cost.
Start from the very beginning. An AI agent's memory is just everything it has been given and has learned — your documents, your corrections, what's happened so far. One memory, shared by everyone using it, is fine… until the day you need a second version of it: a private memory for one customer, a safe place to test a risky new batch of documents, or a "save point" you can return to if something goes wrong.
The obvious way to make a second version is to copy the memory. And that's the trap. An agent's memory is big — so copying it is like photocopying an entire filing cabinet just to jot one note on one page, or duplicating a 50 GB video game every single time you want a save point. It's so slow and so wasteful that, in practice, people just don't — and then they live with the consequences of not having separate versions.
Four ways this quietly bites you — and how agenticow fixes each
Everyone's data ends up in one shared brain
You'd love to give each customer their own private memory, but a full copy per customer costs hundreds of MB each — unaffordable. So you cram everyone into one shared memory, and one customer's private notes start surfacing in another customer's answers.
Each customer gets their own branch — 162 bytes, instant. Private and isolated by construction. The repo measured 0 leaks across 1,000 customers.
You're afraid to teach it anything new
You want to feed the agent a big new batch of documents to make it smarter — but if the batch is messy and makes it worse, there's no clean undo. You'd be rebuilding from a backup and losing work. So you play it safe and the agent stays frozen.
Drop a checkpoint, feed in the new docs, and if it's bad, roll back in about half a millisecond. Experimenting is finally safe, so the agent keeps getting better.
One bad document poisons the whole memory
The agent ingests a document from somewhere you don't fully trust — a web page, a user upload. If it's malicious or garbage, it's now blended into the real memory, quietly corrupting answers, and pulling it back out cleanly is a nightmare.
Ingest it into a throwaway branch first, inspect it, and if it's bad, discard the branch — the repo's red-team run reports 0 vectors ever reached the real memory (rolled back in 1.1 ms).
You can only afford to try once
You'd like to try 100 different ways of organizing the memory and keep whichever works best. But each attempt is a full copy — 100× the storage and time — so you try one or two and settle for "good enough."
100 branches cost about 16 KB total. Try them all in parallel, keep the winner, and throw the rest away for free.
Wait — how do you even end up here?
It's almost a rite of passage. You ship an agent with one shared memory and it works beautifully. Then real life shows up — usually within the first week or two of anything real:
- A second customer signs up, and their data can't be allowed to mingle with the first one's.
- You want to try a big new import without risking the memory that already works.
- A tool call goes sideways, and you wish you'd kept a restore point from five minutes ago.
- You'd like to run a few experiments at once and keep whichever turns out best.
None of these are exotic — they show up the moment more than one person, or more than one experiment, touches the same agent. For most teams that's not "someday," it's week two. And every single one of them needs the exact same thing underneath: a separate version of the memory. That's the fork in the road — and where the trouble starts.
What people normally do about it — and why it quietly hurts
| The usual workaround | What it quietly costs you |
|---|---|
| Cram everyone into one shared memory | No real isolation — one user's data bleeds into another's answers. |
| Copy the whole index for each version | Storage and time explode — about 496 MB and 67 ms each, every time. |
| Snapshot to disk, reload on failure | Slow and coarse; you lose everything since the last snapshot. |
| Play it safe and just don't experiment | The agent freezes in place and quietly stops getting better. |
Every one of those is the same root cause — copying a memory is expensive — and agenticow's answer is the same elegant move every time: don't copy it, branch it. One shared base, a weightless 162-byte overlay per version — and isolation, instant rollback, diff, and merge all come along for free. The thing that was "496 MB and 67 ms — every time" becomes "162 bytes and 0.47 ms, flat."
To try it: npm install agenticow, then a single base.branch('name') call, or the agenticow branch CLI with no code at all. See section 10 for how to start.
03
Q3 · Why is that a problem now?
Cheap-model fleets win on orchestration, not on smarter execution
The lever is running many cheap isolated attempts and throwing failures away — which only works if failure is free.
The repo's thesis: "smarter orchestration, not smarter execution." With cheap models, the win comes from running many isolated attempts and discarding the bad ones — so "'throw it away and try a fresh independent attempt' beats 'make it reflect on its mistake.'"
- Reasoning scaffolds can backfire on cheap models. The README cites a scaffolding ablation (FRAMES benchmark, cheap models deepseek-v4-pro+glm-5.2) where clever reasoning prompts make cheap models do worse, not better.
- So you parallelize instead. Run a thousand cheap, isolated attempts off one shared memory, keep the winners, drop the rest. That only works if spinning up (and throwing away) an isolated memory is nearly free.
- That's exactly what cheap branching unlocks. agenticow is "infrastructure that turns 'run 1,000 cheap agents safely' into a tractable, near-free operation" — it "makes failure free."
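The "many cheap attempts, keep the winner" pattern described above can be sketched with a toy model. This is a minimal illustration, not agenticow's implementation: `makeBranch`, `runAttempt`, and `bestOfN` are hypothetical names, the "branch" is just a small overlay `Map` with a reference to its parent, and the score is a deterministic stand-in for whatever quality measure a real fleet would use.

```javascript
// Toy model only: a "branch" is an overlay Map plus a pointer to the shared base.
const base = new Map([["doc-1", "refund policy"], ["doc-2", "shipping times"]]);

function makeBranch(parent) {
  return { parent, edits: new Map() }; // nothing is copied from the parent
}

// Hypothetical stand-in for one cheap agent attempt: it writes into its own
// branch and reports a score. The formula is deterministic so runs are repeatable.
function runAttempt(branch, attemptId) {
  branch.edits.set("note", "attempt " + attemptId);
  const score = (attemptId * 37) % 101; // pretend quality score, 0..100
  return { attemptId, branch, score };
}

// Run n isolated attempts off one shared base and keep only the best one.
function bestOfN(n) {
  let best = null;
  for (let i = 1; i <= n; i++) {
    const result = runAttempt(makeBranch(base), i);
    if (best === null || result.score > best.score) best = result;
    // A losing attempt is simply dropped; the base was never written to.
  }
  return best;
}

const winner = bestOfN(1000);
console.log(winner.attemptId, winner.score, base.size);
```

The point of the sketch is the cost model: discarding a failed attempt means dropping a small overlay, so the base stays untouched no matter how many attempts fail.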

04
Q4 · How does it solve it?
Copy-on-write branches + a read-through query that merges the lineage
A branch stores only its edits; a query walks the chain child → … → base and merges the results.
Copy-on-write, in plain terms: a branch "records only its own edits plus a pointer to its parent." Nothing is copied up front — the base is shared and read straight through. You only ever store what a branch actually changes.
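A rough sketch of that idea, assuming a plain in-memory `Map` per store and brute-force distance search. `ToyStore` and its methods are hypothetical names chosen for illustration; the real library's storage format and search path are not shown on this page.

```javascript
// Toy copy-on-write store. Hypothetical names; not agenticow's internals.
class ToyStore {
  constructor(parent = null) {
    this.parent = parent;    // pointer to the parent (null for the base)
    this.edits = new Map();  // id -> vector, only what THIS store wrote
  }
  ingest(items) {
    for (const { id, vector } of items) this.edits.set(id, vector);
  }
  branch() {
    return new ToyStore(this); // constant time: no vectors are copied
  }
  // Walk child -> ... -> base; the first store that knows an id wins.
  visible() {
    const merged = new Map();
    for (let node = this; node !== null; node = node.parent) {
      for (const [id, vector] of node.edits) {
        if (!merged.has(id)) merged.set(id, vector);
      }
    }
    return merged;
  }
  // Exact k-nearest search by squared Euclidean distance over the merged view.
  query(target, k) {
    const hits = [];
    for (const [id, vector] of this.visible()) {
      let distance = 0;
      for (let i = 0; i < target.length; i++) distance += (vector[i] - target[i]) ** 2;
      hits.push({ id, distance });
    }
    return hits.sort((a, b) => a.distance - b.distance).slice(0, k);
  }
}

const base = new ToyStore();
base.ingest([{ id: 1, vector: [0, 0] }, { id: 2, vector: [5, 5] }]);

const child = base.branch();
child.ingest([{ id: 2, vector: [1, 1] }]); // override an inherited id
child.ingest([{ id: 3, vector: [9, 9] }]); // add a new id

const fromChild = child.query([1, 1], 2);
const fromBase = base.query([1, 1], 2);
console.log(fromChild, fromBase);
```

The child sees its own version of id 2, while the base still answers with the original vector: the override lives only in the child's overlay.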

The Git-shaped operations you get
- branch() / fork(): make an isolated child (162 B, ~0.5 ms).
- delete(): a copy-on-write tombstone that hides ancestor ids without touching the base.
- checkpoint() / rollback(): freeze a restore point, then discard everything since (rollback p50 0.571 ms).
- diff(): returns { added, overridden, deleted }, just like a Git diff.
- promote(target): "replay this branch's edits into target" (a Git-style merge).
- lineage() / status(): an auditable parent/label/timestamp trail.
@ruvector/rvf-node.
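To make the semantics of those operations concrete, here is a toy model built on an append-only edit log. It is a sketch under stated assumptions, not the library's code: `ToyBranch`, `set`, `get`, and the log-position checkpoint are hypothetical choices, and only the operation names `delete`, `checkpoint`, `rollback`, `diff`, and `promote` are borrowed from the list above.

```javascript
// Toy branch with Git-shaped operations. Hypothetical; not agenticow's code.
const TOMBSTONE = Symbol("deleted");

class ToyBranch {
  constructor(parent = null) {
    this.parent = parent;
    this.log = [];                                 // append-only [id, valueOrTombstone] pairs
  }
  set(id, value) { this.log.push([id, value]); }
  delete(id) { this.log.push([id, TOMBSTONE]); }   // hides an ancestor id, parent untouched
  checkpoint() { return this.log.length; }         // a restore point is just a log position
  rollback(mark) { this.log.length = mark; }       // discard everything after the mark
  edits() { return new Map(this.log); }            // later entries win for the same id
  get(id) {
    for (let node = this; node !== null; node = node.parent) {
      const own = node.edits();
      if (own.has(id)) {
        const value = own.get(id);
        return value === TOMBSTONE ? undefined : value;
      }
    }
    return undefined;
  }
  diff() {
    const out = { added: [], overridden: [], deleted: [] };
    for (const [id, value] of this.edits()) {
      const inherited = this.parent !== null && this.parent.get(id) !== undefined;
      if (value === TOMBSTONE) { if (inherited) out.deleted.push(id); }
      else if (inherited) out.overridden.push(id);
      else out.added.push(id);
    }
    return out;
  }
  promote(target) {                                // replay this branch's edits into target
    for (const [id, value] of this.edits()) {
      if (value === TOMBSTONE) target.delete(id); else target.set(id, value);
    }
  }
}

const base = new ToyBranch();
base.set("a", 1);
base.set("b", 2);

const work = new ToyBranch(base);
work.set("b", 20);                 // override an inherited id
work.set("c", 3);                  // add a new id
work.delete("a");                  // tombstone an inherited id
const mark = work.checkpoint();
work.set("poison", 666);
work.rollback(mark);               // the poison edit is gone

const changes = work.diff();       // computed before anything is merged
work.promote(base);
console.log(changes, base.get("a"), base.get("b"), base.get("c"));
```

In this model a rollback is a log truncation and a promote is a replay, which is one simple way to get the behavior the list describes; the real library may store and merge edits differently.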
05
Q5 · What does a solved state look like?
From 496 MB / 67 ms per copy to 162 B / 0.47 ms per branch
The before → after, stated as concrete, reproducible numbers from the repo.
"Solved" isn't abstract here — it's a set of measured numbers you can reproduce with npx agenticow bench.
Before · normal vector DB
To snapshot / fork / checkpoint, full-copy the index: at 1M vectors that's 496 MB and 67 ms, every time. 1,000 versions = gigabytes of near-identical copies — so in practice you don't give everyone their own memory.
After · agenticow
Branch it copy-on-write: 162 bytes and 0.47 ms, flat, any base size. Rollback p50 0.571 ms. 1,000 tenants at 2.4 KB each, 530× less disk, 0/200 leaks.
| Operation at 1M-vector scale | Normal vector DB | agenticow | Win |
|---|---|---|---|
| Make a private version (fork/snapshot) | 496 MB | 162 B | ~3,000,000× smaller |
| Time to create it | 67 ms | 0.47 ms | ~140× faster |
| Roll back to a checkpoint | re-copy | 0.571 ms | p50 |
| 1,000 per-tenant memories on disk | full copies | 2.4 KB ea. | 530× less |
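The ratios implied by the per-operation figures can be checked with a few lines of arithmetic. This sketch uses only the numbers quoted on this page and assumes decimal units (1 MB = 1,000,000 bytes); ratios published by the repo may come from other runs or measurements.

```javascript
// Figures quoted on this page; decimal units are an assumption.
const fullCopyBytes = 496 * 1000 * 1000; // 496 MB per full copy at 1M vectors
const branchBytes = 162;                 // bytes per branch
const fullCopyMs = 67;                   // time per full copy
const branchMs = 0.47;                   // time per branch

const sizeRatio = fullCopyBytes / branchBytes;          // times smaller
const timeRatio = fullCopyMs / branchMs;                // times faster
const hundredBranchesBytes = 100 * branchBytes;         // the "try 100 variants" case
const thousandCopiesGB = (1000 * fullCopyBytes) / 1e9;  // 1,000 full copies, in GB

console.log(Math.round(sizeRatio), Math.round(timeRatio), hundredBranchesBytes, thousandCopiesGB);
// prints: 3061728 143 16200 496
```

So 100 branches come to about 16 KB, while 1,000 full copies come to roughly 496 GB, which is where the "hundreds of GB" figure in the story that follows comes from.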

06
The "oh — that's what it's for" moment
Sofia runs a small SaaS writing-assistant
One named, ordinary person. A real before → after. No database degree required.
Sofia runs a small SaaS writing-assistant. Every customer's assistant should remember their documents, their style, their corrections — privately. She's not a database engineer; she just needs each customer to have their own memory without it costing a fortune or leaking across accounts.

Before
To give each customer a private memory, Sofia's only option is to copy the shared base index for every one of them — hundreds of MB and tens of ms per customer, every time. A thousand customers is hundreds of GB of near-identical copies. So in practice she can't afford per-customer memory, and bolts everyone into one shared store — where one customer's data can surface in another's results.
After agenticow
Sofia keeps one base and gives each customer a branch — 162 bytes, ~0.5 ms, any base size. A query merges the customer's branch over the shared base, child-wins, tombstones masked — so each customer sees their own memory and nothing else ("0/200 leaks, 2.4 KB/tenant, 530× less disk"). If a nightly import goes wrong, she rollback()s that one customer in ~0.5 ms without touching anyone else.
The "oh, that's what it's for" line: It's the difference between photocopying the entire filing cabinet for every customer, and giving each one a transparent overlay sheet that only records what they changed.
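Sofia's setup can be modeled in a few lines to show why isolation falls out of the structure. This is a toy sketch with hypothetical names (`remember`, `recall`, `tenantOverlays`) and plain key lookups instead of vector search; tombstones are left out for brevity, and none of it is agenticow's API.

```javascript
// Toy multi-tenant memory: one shared base, one small overlay per tenant.
const sharedBase = new Map([["style-guide", "Use plain English."]]);
const tenantOverlays = new Map(); // tenantId -> Map of that tenant's own edits

function remember(tenantId, key, text) {
  if (!tenantOverlays.has(tenantId)) tenantOverlays.set(tenantId, new Map());
  tenantOverlays.get(tenantId).set(key, text);
}

// A tenant's view is its own overlay first (child wins), then the shared base.
// No other tenant's overlay is ever consulted.
function recall(tenantId, key) {
  const own = tenantOverlays.get(tenantId);
  if (own && own.has(key)) return own.get(key);
  return sharedBase.get(key);
}

// Give 1,000 tenants a private note, then probe each note from a neighbour.
for (let t = 0; t < 1000; t++) remember("tenant-" + t, "secret-" + t, "private to " + t);

let leaks = 0;
for (let t = 0; t < 1000; t++) {
  const neighbour = "tenant-" + ((t + 1) % 1000);
  if (recall(neighbour, "secret-" + t) !== undefined) leaks++;
}

console.log(leaks, recall("tenant-7", "secret-7"), recall("tenant-7", "style-guide"));
```

Because a lookup only ever walks one tenant's overlay and then the shared base, there is no code path by which another tenant's edits could appear in the result.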
07
Q6 · Where else does this apply?
A gallery of real, runnable uses
Eight scenarios — each its own card. Every command maps to a real script in the repo's examples/.
This is the fun part: half a dozen or so real examples of how you actually use this, to make your agent (sorry in advance) udderly more efficient. Open any card for the situation, the exact command, what it does, and a diagram of what happens. (Every one maps to a script in examples/, per examples/README.md.)

1. Run a thousand cheap agents in parallel, safely (parallel-agents.mjs)
2. Per-customer / multi-tenant memory: one base, a branch per tenant (multi-tenant-saas.mjs)
3. Sandbox an untrusted document: red-team / prompt-injection safety (red-team-sandbox.mjs)
4. Time-travel debugging & crash-recovery checkpoints (time-travel-debug.mjs)
5. A/B test at scale, then promote the winner (ab-at-scale.mjs · promotion-pipeline.mjs)
6. Compliance, lineage & GDPR right-to-erasure (compliance-lineage.mjs)
7. Verify the headline numbers yourself (bench)
8. See it run end-to-end with no code (demo)
08
"I already have a vector DB — why this too?"
Why this vs. the vector database you already use
Answered head-on — including the trade agenticow names against itself.
| You might already use… | What agenticow changes |
|---|---|
| Pinecone / Chroma / pgvector / hnswlib | Those are built to search one index fast. To get a second isolated version you full-copy the index. agenticow keeps search "good enough" and makes branching the cheap primitive — 162 B / 0.47 ms vs 496 MB / 67 ms. It's often used on top of an engine, not instead of one. |
| "I'll just snapshot it myself" | A snapshot is still a whole copy. agenticow's branches are copy-on-write and queryable live (read-through merge), diff-able ({ added, overridden, deleted }), and merge-able (promote). It's version control, not backups. |
| Any of the above | The one thing nothing else here gives you: constant-cost, constant-size branching of a vector memory with Git-shaped semantics — branch / checkpoint / rollback / diff / promote / lineage. That's the moat. |

09
Q · How would you implement it?
What's actually inside — one small npm package
agenticow is a JavaScript library + CLI, not a model or a service. Here's the real layout.
One npm package, ESM, Node ≥ 18, one runtime dependency. This is the layout (reconstructed from package.json files + README references) — annotated so you know what each part is for.
Use it in your own JavaScript
Open a base, branch it, and the branch is isolated and instantly queryable. Roll back to wipe a bad ingest without touching the clean memory.
    import { open } from 'agenticow';

    const base = open('memory.rvf', { dimension: 1536 });
    base.ingest([{ id: 1, vector: embedding }, /* ... */]);

    const agent = base.branch('agent-a'); // ~0.5 ms / 162 B, any base size
    agent.ingest([{ id: 9001, vector: newMemory }]);
    const hits = agent.query(queryVector, 10); // -> [{ id, distance, branch }, ...]

    const ckpt = agent.checkpoint('clean');
    agent.ingest([{ id: 666, vector: poison }]);
    agent.rollback(ckpt.id); // poison gone, clean memory intact
10
Q7 · How do you start?
Two ways to start — both real, both in the README
One npm install, then either the CLI (no code) or the API.
- Install it. Requirements: Node ≥ 18, ESM.
$ npm install agenticow
- CLI, no code. Init a store, ingest, branch per user, query the read-through, see the diff:
$ agenticow init mem.rvf --dim 128
$ agenticow ingest mem.rvf --n 5000
$ agenticow branch mem.rvf --as user-42
$ agenticow query mem.rvf.user-42.rvf --k 10
$ agenticow diff mem.rvf.user-42.rvf
- Or use the API (import { open } from 'agenticow'); see §09 for the worked snippet. To just watch it run, try agenticow demo.

The native ANN path (fork(..., { nativeAnn: true }), recall@10 ≈ 1.0) ships for linux-x64-gnu today; on macOS / Windows / linux-arm64 it "degrade[s] gracefully to the exact read-through path": same answers, just not the native speed-up.
11
In the repo's own words — don't soften them
Honest limits
agenticow is unusually candid about what it is and isn't. So is this page.
{ metric: 'cosine' } or use save() / load().
agenticow@0.2.1 while package.json and npm are at 0.2.3; read the version that matches what you install. The project was created 2026-06-28 and is early-stage infrastructure.