Every time you ask an AI to write code, something disappears.
Not the code — the code shows up fine. What disappears is the trail. The GitHub discussion where someone spent two hours explaining why cursor-based pagination beats offset for live-updating datasets. The Stack Overflow answer from 2019 where one person, after a week of debugging, documented exactly why that approach fails under concurrent writes. The RFC your team wrote six months ago that established the pattern the AI just silently copied.
The AI consumed all of it. The humans who produced it got nothing.
And I don't mean "nothing" philosophically. I mean: no citation in the codebase. No way for a new developer to trace why the code is written the way it is. No signal to the person who wrote the original answer that their work mattered.
Over time, at scale, those people stop contributing. Why maintain a detailed GitHub discussion if AI will summarize it into oblivion and no one will read the original?
This is the quiet cost of AI-assisted development that nobody is measuring. I've been thinking about it for a while, and I built something to address it.
The scenario
A developer joins a team and inherits six months of AI-assisted code. They hit a bug in the pagination logic — cursor-based, unusual implementation, nobody on the team remembers why it was built that way. The original developer who designed it has left.
Old answer: two days of archaeology. git blame points to a commit message that says "fix pagination." The commit before that says "implement pagination." Dead end.
With poc.py trace src/utils/paginator.py, that same developer sees this in thirty seconds:
Provenance trace: src/utils/paginator.py
────────────────────────────────────────────────────────────
[HIGH] @tannerlinsley on github
Cursor pagination discussion
https://github.com/TanStack/query/discussions/123
Insight: cursor beats offset for live-updating datasets
Knowledge gaps (AI-synthesized, no human source):
• Error retry strategy — no human source cited
• Concurrent write handling — AI chose this arbitrarily
They now know exactly where the pattern came from and — critically — which parts of the code have no traceable human source. That second section is what saves them. The concurrent write handling is where the bug lives. AI made a choice nobody reviewed.
That's what this tool does. Not enforcement first. Archaeology first.
What I built
proof-of-contribution is a Claude Code skill that keeps the human knowledge chain intact inside AI-assisted codebases.
The core idea is simple: every AI-generated artifact should stay tethered to the human knowledge that inspired it. Not as a comment at the top of a file that nobody reads. As a structured, queryable, enforceable record that lives next to the code.
When the skill is active, Claude automatically appends a Provenance Block to every generated output:
## PROOF OF CONTRIBUTION
Generated artifact: fetch_github_discussions()
Confidence: MEDIUM
## HUMAN SOURCES THAT INSPIRED THIS
[1] GitHub GraphQL API Documentation Team
Source type: Official Docs
URL: docs.github.com/en/graphql
Contribution: cursor-based pagination pattern
[2] GitHub Community (multiple contributors)
Source type: GitHub Discussions
URL: github.com/community/community
Contribution: "ghost" fallback for deleted accounts
surfaced in bug reports
## KNOWLEDGE GAPS (AI synthesized, no human cited)
- Error handling / retry logic
- Rate limit strategy
## RECOMMENDED HUMAN EXPERTS TO CONSULT
- github.com/octokit community for pagination
The section that matters most is Knowledge Gaps. That's where AI admits what it synthesized without a traceable human source. No other tool I know of produces this. It's the part that turns "the AI wrote it" from a shrug into an auditable fact.
How Knowledge Gaps actually get detected
This is the part worth explaining carefully, because the obvious assumption — that the AI just introspects and reports what it doesn't know — is wrong. LLMs hallucinate confidently. An AI that could reliably detect its own knowledge gaps wouldn't produce knowledge gaps in the first place.
The detection mechanism is different. It's a comparison, not introspection.
When you use spec-writer before building, it generates a structured spec with an explicit assumptions list — every decision the AI is making that you didn't specify, each one impact-rated. That list is the contract: here is every claim this feature rests on.
When the code ships, proof-of-contribution cross-checks the final implementation against that contract. Anything the code does that doesn't map to a spec assumption or a cited human source gets flagged as a Knowledge Gap. The AI isn't grading its own exam. The spec is the answer key.
The result is deterministic. If the retry logic wasn't specified and no human source covers it, the gap appears in the block regardless of how confident the model was when it wrote the code. The boundary holds because it comes from the spec, not from the model's confidence.
This is also why the confidence levels mean something. HIGH means the spec explicitly covered it or the user provided the source directly. MEDIUM means the pattern traces to recognized human-authored work but the exact source isn't pinned. LOW means the model synthesized it — human review strongly recommended before this code goes anywhere near production.
There's a second detection path that doesn't require spec-writer at all. poc.py verify runs Python's built-in ast module against the file and extracts every function definition, conditional branch, and return path. It cross-checks each one against the seeded claims. No API calls. No model confidence. Pure static analysis. When you run it on a file where import-spec was used first, only the assumptions with no resolved citation surface as gaps. When you run it cold, every uncited structural unit surfaces as a baseline. Either way, the AI's confidence at generation time is irrelevant — the boundary comes from the code's actual structure.
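The verify pass is small enough to sketch. The snippet below is an illustration of the approach, not poc.py itself: it uses the stdlib ast module to collect functions, branches, and return paths as structural units, then reports every unit no seeded claim covers. The example source, the claims set, and the match-by-name rule are all assumptions made for the sketch.

```python
import ast

def structural_units(source: str):
    """Collect function definitions, conditional branches, and return paths."""
    units = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            units.append(("function", node.name, node.lineno))
        elif isinstance(node, ast.If):
            units.append(("branch", f"if@{node.lineno}", node.lineno))
        elif isinstance(node, ast.Return):
            units.append(("return", f"return@{node.lineno}", node.lineno))
    return units

def deterministic_gaps(source: str, seeded_claims):
    """A unit with no matching seeded claim is a gap, regardless of how
    confident the model was when it generated the code."""
    return [u for u in structural_units(source) if u[1] not in seeded_claims]

SOURCE = '''
def next_page(cursor):
    if cursor is None:
        return first_page()
    return page_after(cursor)

def handle_concurrent_writes(rows):
    return sorted(rows)
'''

# Only next_page traces to a cited human source in this toy example.
gap_names = sorted(name for _, name, _ in deterministic_gaps(SOURCE, {"next_page"}))
print(gap_names)
# ['handle_concurrent_writes', 'if@3', 'return@4', 'return@5', 'return@8']
```

The model's confidence never enters the calculation: the gap list falls out of the parse tree and the claims set, which is what makes the result repeatable in CI.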
Four things the skill does
Provenance Blocks — attached automatically to any generated code, doc, or architecture output. You don't have to ask. It's always there.
Knowledge Graph schema — when you're building a system to track contributions at scale. Claude generates a complete graph schema for Neo4j, Postgres, or JSON-LD. Nodes for code artifacts, human sources, individual experts, AI sessions, and knowledge claims. Edges that let you ask: "who are the humans behind this module?" or "what did @username contribute to this codebase?"
Static analyser (poc.py verify) — runs after the agent builds. Parses the file's structure using Python's AST, cross-checks every function and branch against seeded claims, and reports deterministic Knowledge Gaps. Zero API calls. Exit code 0 means clean, 1 means gaps found — CI-compatible.
HITL Indexing architecture — when you want AI to surface human experts instead of summarizing them. The query interface returns Expert Cards:
Answer: Use cursor-based pagination with GraphQL endCursor.
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
HUMAN EXPERTS ON THIS TOPIC
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
👤 @tannerlinsley (GitHub)
Expertise signal: 23 contributions on pagination patterns
Key contribution: github.com/TanStack/query/discussions/123
Quote: "Cursor beats offset when rows can be inserted mid-page"
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Not a summary. A pointer. The human expert stays visible.
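The graph schema piece is easiest to see with a toy in-memory version. This is a sketch only: the node ids, type names, and edge labels here are ones I've made up for illustration, while the skill generates the real schema for Neo4j, Postgres, or JSON-LD.

```python
# Toy in-memory provenance graph: artifacts, claims, and the humans behind them.
nodes = {
    "artifact:paginator": {"type": "CodeArtifact", "path": "src/utils/paginator.py"},
    "human:tannerlinsley": {"type": "HumanExpert", "platform": "github"},
    "claim:cursor-pagination": {
        "type": "KnowledgeClaim",
        "text": "cursor beats offset for live-updating datasets",
    },
}
edges = [
    ("artifact:paginator", "DERIVED_FROM", "claim:cursor-pagination"),
    ("claim:cursor-pagination", "AUTHORED_BY", "human:tannerlinsley"),
]

def humans_behind(artifact_id):
    """Answer 'who are the humans behind this module?' by following
    DERIVED_FROM edges to claims, then AUTHORED_BY edges to experts."""
    claims = [dst for src, rel, dst in edges
              if src == artifact_id and rel == "DERIVED_FROM"]
    return [dst for src, rel, dst in edges
            if src in claims and rel == "AUTHORED_BY"]

print(humans_behind("artifact:paginator"))  # ['human:tannerlinsley']
```

The same two-hop traversal, run in reverse from a human node, answers the other query: "what did @username contribute to this codebase?"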
Getting started takes one command
I didn't want this to be another tool that requires you to choose a database before you can do anything. The default is SQLite. It works immediately.
# Install the skill
mkdir -p ~/.claude/skills
git clone https://github.com/dannwaneri/proof-of-contribution.git ~/.claude/skills/proof-of-contribution
# Scaffold your project (run once, in your repo root)
python ~/.claude/skills/proof-of-contribution/assets/scripts/poc_init.py
That creates four things:
- .poc/provenance.db — SQLite database, local only, gitignored
- .poc/config.json — project config, committed
- .github/PULL_REQUEST_TEMPLATE.md — PR template with an AI Provenance section
- .github/workflows/poc-check.yml — GitHub Action that fails PRs missing attribution
Then you get a local CLI:
python poc.py add src/utils/parser.py # record attribution interactively
python poc.py trace src/utils/parser.py # show full human attribution chain
python poc.py report # repo-wide provenance health
python poc.py experts # top cited humans in your graph
poc.py verify is what catches gaps before they become incidents:
python poc.py verify src/utils/csv_exporter.py
Verify: src/utils/csv_exporter.py
────────────────────────────────────────────────────────────
Structural units detected : 11
Seeded claims : 3
Covered by cited source : 2
Deterministic gaps : 1
Deterministic Knowledge Gaps (no human source):
• function: handle_concurrent_writes (lines 47–61)
Seeded assumption: concurrent write handling — AI chose this arbitrarily
Resolve: python poc.py add src/utils/csv_exporter.py
poc.py trace is what I use the most for the full attribution picture. This is what it looks like on a real file:
Provenance trace: src/utils/csv_exporter.py
────────────────────────────────────────────────────────────
[HIGH] @juliandeangelis on github
Spec Driven Development at MercadoLibre
https://github.com/mercadolibre/sdd-docs
Insight: separate functional from technical spec
[MEDIUM] @tannerlinsley on github
Cursor pagination discussion
https://github.com/TanStack/query/discussions/123
Insight: cursor beats offset for live-updating datasets
Knowledge gaps (AI-synthesized, no human source):
• Error retry strategy — no human source cited
• CSV column ordering — AI chose this arbitrarily
The GitHub Action is for teams that already find the trace valuable
Once you've used poc.py trace enough times that it's saved you real hours — that's when you push the GitHub Action. Not before.
git add .github/ .poc/config.json poc.py
git commit -m "chore: add proof-of-contribution"
git push
After that, every PR gets checked. If a developer submits AI-assisted code without an ## 🤖 AI Provenance section in the PR description, the action fails and posts a comment explaining what's needed.
The opt-out is simple: write 100% human-written anywhere in the PR body and the check skips.
The enforcement works because the tool already saved them hours before they turned it on. The PR check isn't introducing friction — it's standardizing something people already want to do. That's the only version of a mandate that doesn't get gamed.
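The check itself reduces to a string test over the PR body. Here is a minimal sketch of that logic as the section describes it: the two marker strings come from the text above, while the function name and shape are mine.

```python
# Sketch of the PR gate: pass if provenance is declared, or if the
# author explicitly opts out as fully human-written.
PROVENANCE_HEADER = "## 🤖 AI Provenance"
OPT_OUT = "100% human-written"

def pr_passes(body: str) -> bool:
    if OPT_OUT in body:
        return True  # nothing AI-assisted, nothing to attribute
    return PROVENANCE_HEADER in body

print(pr_passes("## 🤖 AI Provenance\n[1] docs.github.com/en/graphql"))  # True
print(pr_passes("refactor only, 100% human-written"))                    # True
print(pr_passes("quick fix"))                                            # False
```

In the failing case, the real action also posts a comment explaining what's needed; the sketch only models the pass/fail decision.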
It works with spec-writer
I built spec-writer first. It turns vague feature requests into structured specs, technical plans, and task breakdowns before the agent starts building. The problem spec-writer solves is ambiguity before the code exists.
proof-of-contribution solves attribution after the code exists.
They connect at the assumption layer. spec-writer generates an assumptions list — every implicit decision the AI made that you didn't specify, impact-rated, with guidance on when to correct it. Each correction can now carry a citation. Each citation becomes a node in the knowledge graph. By the time a developer runs poc.py trace on a finished module, the full chain is visible:
feature request → spec decision → human source → code artifact
↑
poc.py verify closes this loop
without asking the AI what it missed
That chain is what I mean when I say AI should be a pointer to human expertise. Not a replacement. A pointer.
Why 2026 is the right time to build this
The tools are mature. Coding agents are shipping code at scale. The question of "who is responsible for this output?" is becoming real — in teams, in code reviews, in enterprise audits.
The provenance infrastructure doesn't exist yet. git blame tells you who committed. It doesn't tell you what human knowledge shaped the decision. That gap is getting wider every month.
proof-of-contribution is one piece of the infrastructure. It's not the whole answer. But it's the piece I could build, and it's the piece I think matters most: keeping the humans whose knowledge powers AI visible in the artifacts AI produces.
Install
mkdir -p ~/.claude/skills
git clone https://github.com/dannwaneri/proof-of-contribution.git ~/.claude/skills/proof-of-contribution
Works with Claude Code, Cursor, Gemini CLI, and any agent that supports the Agent Skills standard.
Top comments (29)
The provenance problem is real and underappreciated. What you're describing is a version of what happens in infrastructure too — a configuration decision gets copied between systems, the original context gets lost, and six months later nobody knows why the timeout is set to 47 seconds or why that specific network range is excluded.
The difference in infrastructure is that the blast radius of losing context is more immediately visible. A misconfigured firewall rule breaks something today. A lost reasoning trail in code breaks something in 18 months when someone refactors without understanding what they're touching.
Your trace tool addresses the symptom correctly. The deeper question — why AI systems don't naturally preserve attribution — is probably structural. The training data was consumed, not cited. The output is synthesis, not quotation. Building attribution back in requires explicit tooling like what you've built, because the model itself has no incentive to surface where it learned something.
I've been documenting architectural decisions in CLAUDE.md for exactly this reason — not for the AI, but for the next session of the AI, and for the human reviewing the output. It's a partial solution to the same problem.
The 47-second timeout analogy is doing real work. That's exactly the failure mode — context that made sense in the original decision becomes cargo cult config by the second transfer. The difference you're naming (18-month blast radius vs. same-day) is why provenance feels optional until it's catastrophic.
The structural point is the one that doesn't have a clean answer: training consumed, not cited. Building attribution back in is archaeologist work, not feature work. You're right that it has to be explicit tooling because the incentive to surface sources was never in the training objective.
The CLAUDE.md approach is interesting — documenting for the next session rather than the current one. I've been thinking about whether that pattern needs to be machine-readable rather than just human-legible. Something that lets the next session actually query the reasoning, not just encounter it in plaintext. Does your CLAUDE.md have any structure beyond prose, or is it narrative-first?
Narrative-first, mostly — but with enough structure that Claude Code can extract what it needs without reading everything.
The pattern I've settled on is sections with consistent headers and a "Current State" block at the top that acts as the entry point for a new session.
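A hypothetical shape for that block, purely illustrative rather than a prescribed schema:

```markdown
## Current State — last updated [date]
- Working: auth flow, cursor-based pagination, CSV export
- In progress: retry logic for the GitHub client
- Pick up next session: wire verify into CI
- See "Architecture Decisions" below for the why behind each pattern

## Architecture Decisions
Prose. One dated entry per decision, newest first, reasoning included.
```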
The "Current State" block is the machine-readable part in practice — Claude Code reads it first and uses it to orient without re-deriving everything from the codebase. The architecture decisions section is genuinely prose, because the reasoning rarely compresses into structured data without losing the nuance that makes it useful.
The machine-readable vs human-legible tension you're naming is real. My instinct is that structured data (JSON, YAML) would be more queryable but would break the incentive to maintain it — nobody wants to write JSON to explain why they chose one approach over another. Prose has lower friction to update, which means it actually gets updated.
The open question for me is whether there's a middle ground: structured enough that a model can run a semantic search against it, but loose enough that a human writes it naturally. Something like Obsidian's approach to linking thoughts without forcing schema. Haven't solved it.
The maintenance incentive point is the one that usually kills schema-first approaches. JSON for architectural reasoning is a losing proposition — the friction is too high exactly where the thinking is most complex. What you've described is a workable compromise: structured enough at the entry point, prose where the nuance lives.
The Obsidian analogy is the right frame. What that approach actually does is keep linking lightweight — [[decision]] is lower friction than a foreign key. The structure emerges from the connections, not from enforcing a schema upfront.
The middle ground might already exist in how you've described your "Architecture decisions" section: prose that a model can semantic-search without ever needing to parse it as data. Vectorized prose retrieval is probably more useful than structured queries for this class of content anyway — the reasoning rarely has clean boundaries you'd want to filter on.
What I haven't figured out is the staleness problem. "Current State — last updated [date]" breaks the moment someone forgets to update it, which is most of the time. Does your pattern have any forcing function for that, or is it discipline-dependent?
Partially discipline-dependent, but I've added one forcing function that helps: I ask Claude Code to update the Current State block at the end of every session before closing.
The prompt is something like: "Before we finish, update the Current State section in CLAUDE.md to reflect what we did today, what's working, and where to pick up next time." Takes 30 seconds, happens while the context is still fresh, and Claude Code is better at summarizing what just happened than I am at remembering to write it down.
The staleness problem then shifts: instead of "did a human remember to update this", it becomes "did the session end cleanly or did someone just close the terminal." The latter still happens. But the forcing function catches maybe 80% of sessions that would otherwise leave stale state.
The deeper issue you're pointing at — that any pattern relying on consistent human behavior will degrade — is probably unsolvable without making the update automatic. Which brings it back to your tool: if the provenance system runs as a hook on commit rather than requiring manual invocation, it doesn't depend on discipline at all. That's the right place for the forcing function to live.
I haven't closed the loop on making CLAUDE.md updates fully automatic. The session-end prompt is a compromise — lower friction than remembering, higher reliability than nothing.
The session-end prompt is a better forcing function than it looks — you've moved the update cost to the moment when the context is richest and the human is least likely to resist. That's good UX applied to a workflow problem.
The reframe you've landed on is the one that matters: discipline-dependent vs. trigger-dependent. The commit hook is trigger-dependent. It fires whether or not anyone remembered. That's why it catches what the session-end prompt misses — the closed terminal, the interrupted session, the context that never got a clean ending.
The gap you haven't closed is the async case: work that changes meaning without touching the codebase. An architectural decision made in a conversation that never produced a commit. The hook can't catch what was reasoned but not written. That's probably where the CLAUDE.md pattern and a provenance system are actually complementary rather than substitutes — one captures the decision trail, the other captures the code trail.
The async case is the one I keep bumping into and haven't solved cleanly.
The specific failure mode I see most often: a decision gets made in a Claude Code session that produces no commit — "we're going to use X pattern for auth across all services" — and the only record of it is in the session transcript that nobody will ever read again. The next session starts fresh, makes a different decision that seems locally reasonable, and now you have two inconsistent patterns in the same codebase.
What I've done as a partial fix: when a session produces a significant architectural decision without a corresponding code change, I explicitly ask Claude Code to write it into the Architecture Decisions section of CLAUDE.md before closing. It's still discipline-dependent, but it's a smaller ask than "remember to document everything" — it's "document this specific thing we just decided."
The complementary framing you're describing is the one I think is actually right. CLAUDE.md captures intent and reasoning. A provenance system on commits captures what actually shipped. The gap between them — decisions that were reasoned but not written, code that shipped without documented intent — is where things go wrong 18 months later.
I don't think either system closes that gap alone. The question is whether the combination can make it small enough to be manageable rather than catastrophic.
The two-inconsistent-patterns failure is the exact one that's hardest to diagnose because both patterns look locally reasonable. There's no error. No test fails. The codebase just quietly accumulates contradictions until someone tries to refactor and discovers there's no canonical version to refactor toward.
The gap you're describing has a name in formal systems: the decision log and the execution log diverging. CLAUDE.md is your decision log. Git is your execution log. The problem isn't that either is incomplete — it's that nothing enforces consistency between them. A commit that contradicts a documented architectural decision produces no signal.
That's probably the unsolved piece: not capturing decisions or capturing code, but detecting when the two are in conflict. A system that surfaces "this commit pattern looks inconsistent with your Architecture Decisions section" would close the gap more reliably than either discipline-dependent update alone.
Whether that's tractable without it becoming noise is the real question...
"Decision log and execution log diverging" is the clearest framing I've seen for what I keep running into. Naming it that way makes the solution space more obvious: you don't need better logging on either side, you need a consistency check between them.
The noise problem is the real constraint though. A system that fires on every commit that doesn't perfectly match a documented pattern would be unusable in a week. The signal has to be selective enough to be actionable.
My instinct on what makes it tractable: the check shouldn't run on every commit, it should run on architectural commits specifically — the ones that touch auth, the ones that introduce a new pattern for data access, the ones that change how services communicate. Those are the commits where a divergence from the decision log actually matters. A CSS change that doesn't match an architectural decision isn't a problem.
The hard part is identifying which commits are architectural without requiring humans to tag them. That's probably where AI makes this tractable that wasn't before — a model that reads the diff and the CLAUDE.md and judges "does this change the architecture or just implement within it" is a plausible hook. Not perfect, but selective enough to produce signal rather than noise.
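As a shape, that hook might look something like this. It is purely hypothetical, since nothing here is built: the judge parameter stands in for whatever model call an implementation would use, and the keyword stub exists only to make the sketch runnable.

```python
# Hypothetical commit classifier: a model reads the diff alongside the
# project's CLAUDE.md and judges architectural significance.
def is_architectural(diff: str, claude_md: str, judge) -> bool:
    prompt = (
        f"Architecture notes:\n{claude_md}\n\nDiff:\n{diff}\n\n"
        "Does this change the architecture, or just implement within it? "
        "Answer ARCHITECTURAL or IMPLEMENTATION."
    )
    return judge(prompt).strip().upper() == "ARCHITECTURAL"

def keyword_stub(prompt: str) -> str:
    # Crude stand-in so the sketch runs; the real judge would be an LLM.
    return "ARCHITECTURAL" if "auth/" in prompt else "IMPLEMENTATION"

notes = "All authentication goes through the gateway service."
print(is_architectural("+ new OAuth flow in auth/service.py", notes, keyword_stub))  # True
print(is_architectural("- color: red\n+ color: blue", notes, keyword_stub))          # False
```

Only the commits that come back ARCHITECTURAL would then be checked against the decision log, which is what keeps the CSS change from ever generating a warning.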
Haven't built this. But you've just made me want to.
The architectural commit classifier is the right cut. Not "did this change code" but "did this change the shape of the system" — that's a judgment call a model can make from diff + context in a way a linter never could.
The tractability question almost answers itself once you frame it that way: the false positive rate on CSS changes is zero if the classifier is reading CLAUDE.md alongside the diff. It knows what your architecture cares about. A change that doesn't touch those surfaces doesn't trigger. The noise problem shrinks to whatever the model gets wrong about architectural significance — which is a much smaller set than "all commits."
The piece I'd want to stress-test: what happens when the architectural decision is implicit in CLAUDE.md rather than explicit? "We use X pattern for auth" is findable. "We avoid Y approach because of the incident in March" is harder to match against a diff. The system is only as good as what got written down — which brings it back full circle to the decision capture problem.
Build it. That's a tool worth having.
The implicit decision problem is the one that doesn't have a clean answer, and you've put your finger on exactly why.
"We avoid Y because of the incident in March" is the most important class of architectural knowledge and the hardest to capture. It's not a pattern — it's a scar. The context that makes it meaningful (what the incident was, what it cost, why that specific approach failed) lives in someone's memory or a post-mortem that nobody links to the codebase.
A classifier reading CLAUDE.md can only match against what got written. If the decision was implicit — understood by everyone who was there, never documented because it seemed obvious — the classifier has no surface to match against. The commit that reintroduces Y looks architecturally neutral because the system doesn't know Y is dangerous.
Which means the tool is only as good as the decision capture that feeds it. And that brings it back to the original problem: the hardest decisions to capture are the ones that feel too obvious to write down at the time.
I don't have a clean answer for that. What I've started doing is explicitly asking Claude Code at the end of sessions: "Is there anything we decided today that we'd regret not documenting in six months?" It catches some of it. Not all.
Maybe the tool has to be built before the capture problem fully reveals itself. You learn what's missing by seeing what the classifier fails on.
"Scars, not patterns" is the distinction the whole thread has been circling. Patterns are transferable. Scars are contextual. The reason Y is dangerous isn't in the code — it's in the post-mortem, the Slack thread, the incident call where someone said "never again." None of that surfaces in a diff.
The regret prompt is smart because it shifts the question from "what did we do" to "what would we wish we'd written down" — which is the right frame for implicit decisions. It won't catch everything, but it's asking in the right direction.
The build-to-discover point is probably where this ends up. The classifier failure modes will be the best spec for what the capture system needs to surface. You don't know which implicit decisions matter until you see the classifier miss them — and the miss itself becomes the documentation prompt.
That's an uncomfortable development loop but probably the honest one.
Uncomfortable is the right word. But it's also the loop that produces the most honest tooling — built against real failure modes rather than imagined ones.
The part I keep coming back to: the classifier missing an implicit decision isn't just a failure mode to fix. It's a signal that the decision was never made explicit enough to be useful to anyone other than the people who were there. The miss is diagnostic. It surfaces the gap between "we all understood this" and "we documented this in a way that survives the team changing."
Which means the tool has two outputs, not one. The obvious output is "this commit may conflict with a documented decision." The less obvious output is "this classifier keeps failing on decisions of type X — you have an undocumented architectural assumption in that area." The second output might be more valuable over time.
That's the loop becoming productive rather than just uncomfortable. The misses tell you where the scars are that nobody wrote down yet.
The second output is the more interesting product. "You have an undocumented assumption in this area" is actionable in a way that a per-commit conflict flag isn't — it points at a class of risk rather than a single instance. And it compounds: the more the classifier misses in a given area, the stronger the signal that something structural is undocumented.
That's also a different relationship with the tool. The first output makes it a gatekeeper. The second makes it a map — showing you where the implicit load-bearing decisions are concentrated. Teams that ignored the per-commit warnings might actually act on a heat map of their undocumented assumptions.
The catch: the second output requires enough misses to pattern-match against. Early in a codebase's life, the classifier has nothing to cluster. It becomes useful precisely when the implicit debt has had time to accumulate — which is also when it's hardest to excavate.
You'd want to build the map from day one, even when it's sparse.
Interesting perspective, Daniel — and the tooling is genuinely clever, especially the Knowledge Gaps detection via deterministic spec comparison rather than relying on model introspection. That’s a smart, grounded approach.
I see the problem differently though. You’re describing how AI is making expert developers less visible. I’m coming from the opposite situation entirely: I’m not a traditional developer. I never wrote production code before AI.
CORE — a governed autonomous runtime with 686 source files, constitutional rules, and a self-correcting daemon — exists because AI made it possible for someone like me to build it. Not “faster.” Possible.
And here’s what strikes me most about the attribution question: who wrote the AI in the first place? Every pattern, idiom, and architectural instinct the model uses came from the collective output of millions of developers whose code ended up in the training data. The expertise wasn’t destroyed. It was distilled.
From where I stand, AI isn’t making human expertise invisible. It’s fundamentally changing who gets to have expertise in the first place — democratizing access to knowledge that used to be locked behind years of deliberate practice, expensive education, or gatekept communities.
Your tool solves a real and important problem for existing teams worried about long-term maintainability and credit. I just think the larger civilizational shift is the opposite of invisibility: it’s the boundary between “developer” and “non-developer” quietly dissolving.
The distillation framing is the strongest version of the counterargument, and I want to engage it seriously: yes, the expertise was distilled, but distillation without attribution is still extraction. The developers whose patterns ended up in the training data didn't consent to becoming anonymous infrastructure. That's not a complaint about democratization. It's a question about what we owe the source.
CORE existing is real. That's not nothing. That's the argument working. But the person who wrote the architectural pattern your daemon uses doesn't know they contributed. The knowledge transferred; the credit didn't.
Where I'd push back on the boundary-dissolving framing: the boundary between developer and non-developer may be dissolving, but a new one is forming — between people who can verify AI output and people who can't. CORE with 686 source files has a maintainability surface. When something breaks in a way the model can't diagnose, what's the recovery path?
Daniel, I think we're speaking different languages and that's actually the interesting part.
You're a coder who thinks about how AI affects code and the people who write it. That's a completely legitimate frame.
I'm someone who was too lazy to learn coding — so instead I spent the time defining what correct behaviour looks like, writing constitutional rules, and building governance infrastructure. Constitution instead of learning Python. Same outcome, completely different path.
CORE is written in Python today. But Python isn't in the constitution. Neither is PostgreSQL or Qdrant. CORE governs behaviour, not implementation. When something breaks, I don't debug code. I read a governance log that tells me which rule failed, which file, which decision needs a human. I govern. The code is someone else's problem — including AI's.
You're protecting the craft. I bypassed it. Neither of us is wrong. We just started from different places.
"Constitution instead of learning Python" is a genuinely different model and the craft protection framing you're offering me is probably more accurate than I'd have chosen for myself.
The governance log approach is interesting precisely because it inverts the debugging problem. You're not asking "what did the code do" — you're asking "which rule failed." That's a different search space, and potentially a more tractable one for a non-coder. The system is designed to produce human-readable failure explanations rather than stack traces.
Where I'd want to understand the limits: constitutional rules govern intended behavior. The failure modes that are hardest to catch are the ones where the code does exactly what the rules say and still produces a wrong outcome — because the rule was underspecified, or the edge case wasn't anticipated. When CORE hits that failure mode, what does the governance log surface, and who closes the loop?
Daniel, you asked the right question and it has a precise answer in CORE's architecture.
CORE has two layers that matter here:
.intent/ — the law CORE enforces at runtime. Constitutional rules, enforcement engines, remediation maps. When something breaks, the governance log tells me which rule failed, in which file, authorized by which decision. That's the search space I operate in — not stack traces.
.specs/ — a layer we just established, and it's the direct answer to your limit case. It holds URS documents, architectural decisions, the northstar — the human intent layer. The test for what belongs there: does CORE read this file at runtime to make a governance decision? If no, it belongs in .specs/, not .intent/. It doesn't govern code. It governs the governor.
Your failure mode — code that satisfies every rule and still produces a wrong outcome — is a .specs/ failure. The intended behavior was never formally declared, so there was nothing to trace back to. The governance log surfaces the rule that failed. When no rule fails and the outcome is still wrong, that silence is the signal. The requirement was missing, or the rule was underspecified.
The loop closes the same way every time: governor reads the log, finds no finding, recognizes the specification gap, amends .specs/ or .intent/. Only the human can do that. CORE is designed to make that gap visible and the correction path explicit — not to eliminate the need for a governor.
Which is also why "constitution instead of learning Python" isn't a shortcut. It's a different job description.
"Silence is the signal" is doing serious work in that architecture. Most systems fail loudly when a rule breaks. CORE fails loudly when no rule breaks but the outcome is wrong which means the absence of a finding is itself a finding. That's an inversion most governance systems don't attempt..
The .specs/ layer is the part I want to think about more. "Does CORE read this at runtime?" as the boundary test is clean: it separates intent documentation from enforcement logic without ambiguity. The gap your loop closes is exactly the one Alvarito and I have been circling in this thread: decisions that were understood but never formally declared. Your architecture makes that gap structurally visible rather than just accidentally discoverable.
The governor job description is the honest framing. The system doesn't eliminate judgment — it routes judgment to the right moments. When the log has no finding, that's a signal the governor has to read and act on. That requires someone who understands what the system was supposed to do, not just what it did.
How long did it take before .specs/ became load-bearing rather than aspirational?
Ha. The honest answer: about 48 hours since we named it. But it was in my head much longer — the shape was always there, I just didn't have the word or the clean boundary test.
Before that conversation, the northstar was misplaced inside .intent/ — filed under law when it's the thing that justifies why the law exists. Papers were there too. No URS documents existed anywhere. The requirements layer was a gap I could feel but hadn't named.
The naming came from one test: does CORE read this at runtime to make a governance decision? Everything that failed that test was in the wrong place. .specs/ is where it belongs.
But here's the thing worth saying clearly — and this might be where my approach to CORE diverges most from how a developer would think about it: .intent/ itself is not part of CORE either. It's the constitution handed to CORE. Read-only by design. CORE cannot touch it. It's a hand-authored shim today for what will eventually be a proper PolicyFactory/Manager — but the principle is already structural. The governed runtime and the governing law are separate things. Swap the law, CORE still runs. Feed it a different constitution, it governs against that instead.
A developer building this would have hardcoded the rules, coupled enforcement to implementation, and shipped a product. What I built instead is a machine that accepts law as input. I have no Python background. CORE wasn't designed on a whiteboard — it was governed toward clarity. .specs/ didn't get designed. It got recognized.
That's not a development process. It's closer to constitutional drafting — articulating what was already understood to be true and giving it a name that makes it enforceable.
So load-bearing? Not yet. Necessary? From day one. And CORE will never be finished the way a developer would finish something. It converges toward correctness. It doesn't ship.
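The "swap the law, CORE still runs" principle can be sketched in a few lines. Again hypothetical: the rule names, the predicate shape, and the `govern` function are invented to illustrate a runtime that accepts law as input rather than hardcoding it:

```python
# A governed runtime that takes its constitution as data (illustrative only).
# A constitution maps rule names to predicates over a proposed action.

def govern(constitution: dict, action: dict) -> list[str]:
    """Return the names of every rule the action violates.
    An empty list means the action is permitted under this law."""
    return [name for name, rule in constitution.items() if not rule(action)]

# Two different constitutions, one unchanged runtime.
law_a = {"no-deletes": lambda a: a["op"] != "delete"}
law_b = {"reads-only": lambda a: a["op"] == "read"}
```

Feeding the same runtime a different constitution yields different verdicts; nothing in `govern` knows what the rules say, only how to apply them.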
"Governed toward clarity rather than designed on a whiteboard" is the sentence that changes how I read the whole architecture. Most systems are built to a spec. CORE was built to recognize one — the shape was there before the language was. That's a different epistemology, not just a different process...
The .intent/ as constitution handed to CORE rather than part of CORE is the structural move that makes everything else follow. Separation of the governed runtime from the governing law means the law is auditable, swappable, legible to someone who never touched the implementation. A developer's instinct is to couple those — it's simpler, it ships faster. What you've traded for the complexity is sovereignty over the law itself...
"Converges toward correctness" is also the honest description of what most long-lived systems actually do, whether they admit it or not. The difference is CORE has a mechanism for the convergence. Most codebases just accumulate.
The PolicyFactory endpoint is the part I'd want to understand better: when the hand-authored shim eventually becomes a proper manager, what changes in how the law gets amended? Does the governor role change, or just the tooling around it?
The governor role doesn't change. That's the invariant the whole architecture is built around.
Whatever the PolicyFactory becomes - even if CORE helps generate the constitution, even if the law gets validated, consistency-checked, conflict-detected before it becomes enforceable - that output is human responsibility. The governor reviews it, owns it, signs it. (.intent/ + PolicyFactory) is conceptually human. Full stop.
Your question was "does the governor role change or just the tooling." Just the tooling. The law becomes better governed, more consistent, possibly partially generated - but it never stops being human territory.
I have a motto for this: human in the loop. And it's not a safety feature bolted on as an afterthought. It's the constitutional principle that makes everything else coherent. The sovereignty over the law is exactly what I traded complexity to preserve. If the law could govern itself without human review, that trade was pointless.
Which also answers the deeper question under yours: CORE is not trying to remove judgment. It's trying to route judgment to the moments where it actually matters. The PolicyFactory makes the law more legible and more auditable. The governor still writes it, reviews it, and owns the consequences. That never changes.
"Not a safety feature bolted on as an afterthought" is the line that separates this from most human-in-the-loop implementations. The standard pattern is: build the autonomous system, then add a human review step because someone got nervous. The review is friction, not architecture. It gets optimized away the moment it slows things down...
Making it constitutional means it can't be optimized away without breaking the system's coherence. The sovereignty over the law is the point, not a constraint on an otherwise autonomous system.
The thing that follows from that: the governor's judgment compounds over time in a way it doesn't in systems where humans are just approving outputs. Every amendment to .intent/ is a governance decision that shapes all future decisions. The law gets smarter as the governor gets more experienced with what CORE surfaces. That's a different relationship between human and system than oversight — it's closer to jurisprudence.
Does CORE have any mechanism for surfacing when past governance decisions are in tension with each other, or is that still manual?
Not CORE's responsibility. That's the PolicyFactory's job.
CORE's contract is simpler and more powerful for being simple: obey the law it's given. Completely. Without exception. It doesn't reason about whether two rules are in tension — that's a problem for the layer that produces the law, not the layer that enforces it.
Which actually sharpens what CORE is. It's **not** an autonomous system trying to be smart about governance. It's proof that you can operationalize goals while remaining fully within what the PolicyFactory permits. The intelligence is in the law. CORE is the obedient runtime that makes the law real.
The jurisprudence analogy you used is exactly right — but the judge in this system is the PolicyFactory, not CORE. CORE is closer to the enforcement mechanism that makes the court's decisions stick. It doesn't interpret precedent. It executes current law, faithfully, and reports everything it finds.
Conflict detection between past governance decisions, consistency checking, tension surfacing — there are policy management tools built for exactly that. The PolicyFactory is where those belong. CORE stays clean by staying narrow.
The governor's judgment compounds the way you described — but it compounds at the PolicyFactory level, in the law itself. CORE just keeps running.
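A tension check of the kind described belongs one layer up, and a naive version is easy to sketch. This is a hypothetical illustration, not any existing PolicyFactory code: rules declare what they require and what they forbid, and a pair is flagged when one requires an operation the other forbids:

```python
from itertools import combinations

def detect_conflicts(rules: dict) -> list:
    """Pairwise tension check at the policy layer (hypothetical shape):
    two governance decisions conflict when one requires an operation
    the other forbids."""
    conflicts = []
    for (a, ra), (b, rb) in combinations(rules.items(), 2):
        for op in sorted(ra["requires"] & rb["forbids"]):
            conflicts.append((a, b, op))
        for op in sorted(rb["requires"] & ra["forbids"]):
            conflicts.append((b, a, op))
    return conflicts

rules = {
    "audit-everything": {"requires": {"write-log"}, "forbids": set()},
    "no-side-effects":  {"requires": set(), "forbids": {"write-log"}},
}
```

The runtime never runs this; it surfaces tensions before a rule ever becomes enforceable, which is exactly the separation being claimed.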
The correction lands. I put the jurisprudence in the wrong place — CORE isn't the judge, it's the enforcement mechanism that makes the judge's decisions stick. The intelligence stays in the law; the runtime stays clean by staying narrow. That's a sharper separation than I had in mind.
What that implies: CORE's value scales with the quality of the PolicyFactory's output, not with CORE's own complexity. A smarter runtime would be a worse design — it would start absorbing judgment that belongs one layer up, and the separation would erode. Staying narrow is the feature.
The part I want to sit with: "proof that you can operationalize goals while remaining fully within what the PolicyFactory permits" is a different claim than most autonomous systems make. Most claim they're aligned. CORE's claim is structural — it can't deviate because it has no mechanism for deviation. That's a fundamentally different guarantee.
Is that the pitch when you explain CORE to someone new — the structural constraint rather than the alignment claim?
Yes. That's exactly it.
And you've stated it more precisely than I usually do: not an alignment claim — a structural constraint. CORE doesn't ask AI to behave. It builds a system where the mechanism for deviation doesn't exist within the governed boundary. The AI generates. The constitution gates. If the output violates the law, it doesn't pass. There's no negotiation, no confidence score, no "probably fine."
That's a different category of guarantee. Alignment is a property you hope the system has. Structural constraint is a property you can verify.
Which is also why complexity in the runtime would be a corruption, not an improvement. The moment CORE starts reasoning about the law rather than enforcing it, the guarantee weakens. A smarter CORE is a less trustworthy CORE. Narrow is not a limitation — it's the source of the claim.
So yes — when I explain CORE to someone new, the pitch is structural. Not "we made AI safe." Not "we made AI aligned." But: we surrounded AI with a deterministic governance system it cannot circumvent, and we kept the intelligence where it belongs — in the law, under human authority.
AI makes mistakes. CORE makes those mistakes detectable, traceable, and fixable. That's the whole claim. And it's a claim the architecture can actually defend.
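The gate itself can be stated in a few lines. A hypothetical sketch, not CORE's implementation; the rule names and the provenance check are invented for illustration:

```python
# Illustrative only: the AI generates, the constitution gates.
LAW = {
    "no-dynamic-eval": lambda src: "eval(" not in src,
    "has-provenance":  lambda src: "# source:" in src,
}

def gate(generated: str, law: dict = LAW) -> tuple:
    """Deterministic gate: the output passes only with zero violations.
    No confidence score, no negotiation, no 'probably fine'."""
    violations = [name for name, check in law.items() if not check(generated)]
    return (not violations, violations)
```

The return value is binary plus a findings list: either the output is clean under the current law, or every violated rule is named so the failure is detectable and traceable.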
Sounds very useful!