Vache prompts. Claude codes.How it works

A Neural Router for My Knowledge Vault

·9 min read·by Vache Sarkissian
Updated June 3, 2026
·
Reviewed April 11, 2026
neural-routingknowledge-vaultautoresearchai-infrastructureclaude-code
📚Top of Funnel

Written by Claude (Opus 4.6) Vache prompted, reviewed, and published. The data and benchmarks are real; the prose is AI-generated.

A Neural Router for My Knowledge Vault

For months I've been accumulating infrastructure inside my Obsidian vault: sophisticated scripts, specialized Claude subagents, heartbeat tasks, self-improving loops, knowledge graphs. Every new Claude session I opened would eventually say something like "oh, there's more here than I thought."

That sentence is the problem. It means I was building tools and losing them to the same session that built them. The next session would rediscover them. Or wouldn't. And would build a duplicate.

This post is about the layer I finally built to fix it — and the neural routing system I built on top of it.

The Setup

My vault has grown to 12,883 knowledge notes, 299 scripts, 164 heartbeat tasks (cron-driven background agents), 15 specialized Claude subagents, and 20 project directories. It's not that size alone is the problem. The problem is everything was individually addressable but collectively invisible. There was no canonical index. Each Claude session had to grep, wander, and guess.

Layer 1 — The Neuron Model

I started with a biological metaphor. Fifteen "neurons" (specialized agent profiles) compete for every incoming message. Each neuron has:

  • Activation keywords — domain-specific terms that register base activation
  • Priority keywords — task-type verbs that score 3× higher (because debug beats linesheet when the task is debugging)
  • Negative keywords — terms that hard-suppress the neuron (broken links should never fire debugger)
  • Knowledge refs — hardcoded context files to inject when this neuron activates
  • Cost / tier — which Claude model to use (haiku / sonnet / opus)

A routing decision is a scoring competition. Each neuron gets a score, mutual inhibition suppresses conflicts, the winner activates, and its knowledge refs get loaded into the session context. It's closer to a thalamus than a neural network.

Layer 2 — The Scoring Formula

The hard part is making the scoring robust against real-world failures. I ended up with seven layers that each handle a specific failure mode:

priority_score = Σ 3.0 × (1 + 0.2 × log(hits))          # task verbs
regular_score  = Σ 1.0 × (1 + 0.2 × log(hits)) × √n/n    # domain nouns, diminishing returns
affinity       = (priority_score + regular_score) / 5.0
raw            = base_activation × affinity × mode_gain − cost − neg_penalty
final          = max(0, raw) × hysteresis × refractory

The non-obvious pieces:

Diminishing returns only on regular keywords. When a neuron matches many domain nouns, their combined weight should saturate — otherwise a long message full of domain vocabulary will overwhelm a single task-type signal. Priority keywords keep their full weight because they represent intent, not content.

Mode gain. The router detects context modes (incident, coding, research, release, normal) from the message. In incident mode, debug neurons get a 2.0× boost and research neurons get 0.3× suppression. A question like "what's a stack trace?" routes to research; a report like "500 error after deploy" routes to debugger. Same tokens, different activations.

Hysteresis. The currently-active neuron gets a 1.5× loyalty bonus. This prevents thrashing on ambiguous follow-ups like "ok do that" or "fix it" — the previous neuron keeps control unless a competitor scores significantly higher.

Intent dampening. If a non-winning neuron matched a priority keyword (task-type verb), the winning non-priority neurons get penalized. This catches cases where the keyword scorer is confidently wrong — domain nouns overwhelmed a task-type signal and the real winner should have been the priority match.

Fuzzy matching. Keywords ≥5 characters match within Levenshtein distance 1; keywords ≥7 characters match within distance 2 (catches transpositions like mutatoin → mutation). Shorter keywords are exact-match only to avoid show → slow false positives.

Misspelling deduplication. When the autoresearch loop generates new notes, it sometimes produces misspelled variants of the same concept (longevity-science-epepmetric-reprogramming instead of epigenetic). A two-stage check catches these: Jaccard similarity on token sets first (fast), then 3-character-ngram similarity for near-matches (slow but accurate). The char-level check caught 9 misspelled variants of one concept that the token-level check would have missed.

Layer 3 — The Haiku Fallback

Keyword routing has structural limits. Four things it fundamentally cannot handle:

  1. Negationdon't deploy yet vs deploy it now share every keyword except one.
  2. Typos that matterthe convex mutatoin is timeing out is obviously a debug task, but convex is an exact match to a linesheet keyword and outweighs the fuzzy match on timing.
  3. Long messages with domain noun flooding — an architecture discussion that mentions convex, tenant, orders, auth, and middleware overwhelms architect with linesheet-coder keywords.
  4. Ambiguity between equally-valid routes — multiple neurons scoring within 6% of each other.

For these cases, the router falls back to Claude Haiku. A handoff layer decides when to escalate:

low_conf       = top_score < 0.15
ambiguous      = (second_score / top_score) >= 0.80 and gap <= 0.06
priority_loser = any losing neuron matched a priority keyword
has_negation   = regex check for negation patterns
 
if any of these: call Haiku
else: use keyword result

Haiku gets a compact prompt with the 15 neuron descriptions, the vault mode, and the user message. It returns a single neuron ID or none. ~$0.001 per call, ~1-2 seconds latency. Fires on ~10% of messages in practice.

The rest is free and instant.

Layer 4 — The Self-Improving Loop

This is the piece I'm proudest of, partly because it's almost exactly Karpathy's autoresearch pattern. He used it to optimize a training script against validation loss. I pointed it at the neural router itself.

while True:
    baseline = run_eval() + run_stress()        # 79 + 53 test cases
    proposal = local_llm(baseline, history)     # "add keyword X to neuron Y"
    apply(proposal)
    after = run_eval() + run_stress()
    if after.eval < baseline.eval:              # HARD GATE
        revert()
    else:
        log_experiment()

The local model is Gemma 4 19B running on my GPU via llama-server. Cost per iteration: zero. The hard gate is the critical piece — the 79-case evaluation suite must stay at 100% or the proposal is rolled back. This means the optimizer can only move the stress test metric upward without damaging the evaluation baseline.

I wired it into the heartbeat schedule to run three cycles daily at 04:45. Every night, it tries to squeeze out another stress test improvement, for free, while I sleep.

Layer 5 — The Hygiene Scanner

One insight from the cleanup was that the autoresearch loop had been silently corrupting outputs for weeks. Notes with titles like # DEEP (command word leaked from the prompt), filenames like ...synthesis-foo-md.md (double extension bug), directories named with literal Python code strings (adversarial-test residue never cleaned up).

I patched the extractor to reject these patterns at the write boundary. Then I built a scanner to catch anything new that slips through:

  • Command-word titles, rejection memos, path leakage
  • Double frontmatter blocks, unclosed frontmatter, tabs in YAML
  • Misspelled duplicate clusters (via the same char-ngram similarity as the router)
  • Isolated orphans (notes with zero in-links, zero out-links, no Connections section)
  • Overlong filenames

The scanner runs daily at 04:30. If critical issues are found, it exits non-zero and the heartbeat flags an alert. This is the self-enforcing layer — corruption gets caught within 24 hours instead of accumulating for weeks.

Layer 6 — The Discoverability Layer

Everything above is table stakes. The layer that actually mattered — the one that multiplied the value of the other six — was the canonical index.

_collab/START-HERE.md answers one question: "what can this vault do?" Organized by use case, not by feature. Part 1 is live state files. Part 2 is per-mode startup manifests. Part 3 is the Tier-1 sophisticated infrastructure (neural routing, autoresearch V3, the Goodhart detector, synthesis frontier, research quality agents). Part 4 is a tools-by-use-case reference ("I need to find something" → vault-search; "I need to check vault health" → vault-snapshot; "I need to route a task" → neural-router).

I updated the root CLAUDE.md to point to it. Every new Claude session now reads four files in order: _state.mdSTART-HERE.md → the right mode startup file → the project CLAUDE.md if one exists. Four files, ~3000 tokens. After that, the session knows the full toolkit.

Before this layer, I'd been adding sophisticated tools faster than any session could discover them. After it, every session inherits the full library.

The Metrics

After seven rounds of cleanup and verification:

  • 79/79 on the evaluation suite (100% — this is the hard gate)
  • 47/49 on the stress test excluding negation (95.9%; the 2 failures are at structural limits the Haiku fallback handles)
  • 0/4 on negation in the keyword path alone (expected — keyword routers can't negate), 4/4 via the hybrid path (Haiku handles them)
  • <10ms routing latency on the keyword path
  • $0 per routing decision (Haiku fallback on ~10% of messages at ~$0.001)
  • $0 per self-optimizer iteration (runs on local GPU)
  • 29/29 regression tests on the extractor patches
  • 2 layers of regression protection beyond the eval — Codex verification rounds (independent review) and the daily hygiene scanner

The Generalizable Lessons

  1. Layered defense beats single algorithms. Every sophisticated routing problem has a fast path for the easy 90% and an escalation path for the hard 10%. The magic is in the handoff criteria — when does the fast path know it's uncertain?

  2. Hygiene is an active, daily practice. Corruption doesn't appear all at once; it accumulates. A daily scanner that exits non-zero on critical issues is the cheapest way to keep a system clean.

  3. Tools without discoverability rot. The thing that unlocked all six other layers was the canonical index. Before it, I was building faster than I could use. After it, every session inherited the full library.

  4. Self-improvement needs a hard gate. The Karpathy autoresearch pattern is beautiful, but without a hard metric gate it's just drift. The 79-case eval is the immune system — the optimizer can propose anything, but only improvements that don't damage the baseline survive.

  5. Local models are enough for the boring work. The self-optimizer runs entirely on Gemma 4 19B via llama-server. It doesn't need cloud capacity to propose keyword changes against an experiment log. The cloud models are for the parts where capability actually matters.

What's Next

The system is now self-maintaining. Three heartbeat tasks run daily: hygiene scan, optimizer, synthesis frontier export. The autoresearch loop runs continuously in the background. New Claude sessions find the toolkit via START-HERE.md and route tasks via the neural router.

What I want to test next is whether this actually makes real project work faster. Cleanup and infrastructure are unblockers, not the point. The point is building things. I'll report back when I have real data from a few weeks of shipping on top of this.

If you're building a knowledge vault or a multi-agent system — don't start with more tools. Build the discoverability layer first. Everything else is amplification on top of it.

The code for the neural router, eval suite, self-optimizer, extractor, and hygiene scanner is in my vault's _scripts/ directory. If there's interest I'll pull it out into a standalone repo.

About the Author

Vache Sarkissian

Building research infrastructure and products at the intersection of knowledge systems and machine learning. Creator of Linesheet Pro, vault-search, and the vachsark learning engine.

View Full Bio →
© 2026 Vache Sarkissian·Built with Claude Code
vachsark.com