Every time you ask an AI to write code, something disappears.
Not the code — the code shows up fine. What disappears is the trail. The GitHub disc...
Interesting perspective, Daniel — and the tooling is genuinely clever, especially the Knowledge Gaps detection via deterministic spec comparison rather than relying on model introspection. That’s a smart, grounded approach.
I see the problem differently though. You’re describing how AI is making expert developers less visible. I’m coming from the opposite situation entirely: I’m not a traditional developer. I never wrote production code before AI.
CORE — a governed autonomous runtime with 686 source files, constitutional rules, and a self-correcting daemon — exists because AI made it possible for someone like me to build it. Not “faster.” Possible.
And here’s what strikes me most about the attribution question: who wrote the AI in the first place? Every pattern, idiom, and architectural instinct the model uses came from the collective output of millions of developers whose code ended up in the training data. The expertise wasn’t destroyed. It was distilled.
From where I stand, AI isn’t making human expertise invisible. It’s fundamentally changing who gets to have expertise in the first place — democratizing access to knowledge that used to be locked behind years of deliberate practice, expensive education, or gatekept communities.
Your tool solves a real and important problem for existing teams worried about long-term maintainability and credit. I just think the larger civilizational shift is the opposite of invisibility: it’s the boundary between “developer” and “non-developer” quietly dissolving.
The distillation framing is the strongest version of the counterargument, and I want to engage it seriously: yes, the expertise was distilled, but distillation without attribution is still extraction. The developers whose patterns ended up in the training data didn't consent to becoming anonymous infrastructure. That's not a complaint about democratization. It's a question about what we owe the source.
CORE existing is real. That's not nothing. That's the argument working. But the person who wrote the architectural pattern your daemon uses doesn't know they contributed. The knowledge transferred; the credit didn't.
Where I'd push back on the boundary-dissolving framing: the boundary between developer and non-developer may be dissolving, but a new one is forming — between people who can verify AI output and people who can't. CORE with 686 source files has a maintainability surface. When something breaks in a way the model can't diagnose, what's the recovery path?
Daniel, I think we're speaking different languages and that's actually the interesting part.
You're a coder who thinks about how AI affects code and the people who write it. That's a completely legitimate frame.
I'm someone who was too lazy to learn coding — so instead I spent the time defining what correct behaviour looks like, writing constitutional rules, and building governance infrastructure. Constitution instead of learning Python. Same outcome, completely different path.
CORE is written in Python today. But Python isn't in the constitution. Neither is PostgreSQL or Qdrant. CORE governs behaviour, not implementation. When something breaks, I don't debug code. I read a governance log that tells me which rule failed, which file, which decision needs a human. I govern. The code is someone else's problem — including AI's.
You're protecting the craft. I bypassed it. Neither of us is wrong. We just started from different places.
"Constitution instead of learning Python" is a genuinely different model and the craft protection framing you're offering me is probably more accurate than I'd have chosen for myself.
The governance log approach is interesting precisely because it inverts the debugging problem. You're not asking "what did the code do" — you're asking "which rule failed." That's a different search space, and potentially a more tractable one for a non-coder. The system is designed to produce human-readable failure explanations rather than stack traces.
Where I'd want to understand the limits: constitutional rules govern intended behavior. The failure modes that are hardest to catch are the ones where the code does exactly what the rules say and still produces a wrong outcome — because the rule was underspecified, or the edge case wasn't anticipated. When CORE hits that failure mode, what does the governance log surface, and who closes the loop?
Daniel, you asked the right question and it has a precise answer in CORE's architecture.
CORE has two layers that matter here:
.intent/ — the law CORE enforces at runtime. Constitutional rules, enforcement engines, remediation maps. When something breaks, the governance log tells me which rule failed, in which file, authorized by which decision. That's the search space I operate in — not stack traces.
.specs/ — a layer we just established, and it's the direct answer to your limit case. It holds URS documents, architectural decisions, the northstar — the human intent layer. The test for what belongs there: does CORE read this file at runtime to make a governance decision? If no, it belongs in .specs/, not .intent/. It doesn't govern code. It governs the governor.
Your failure mode — code that satisfies every rule and still produces a wrong outcome — is a .specs/ failure. The intended behavior was never formally declared, so there was nothing to trace back to. The governance log surfaces the rule that failed. When no rule fails and the outcome is still wrong, that silence is the signal. The requirement was missing, or the rule was underspecified.
The loop closes the same way every time: governor reads the log, finds no finding, recognizes the specification gap, amends .specs/ or .intent/. Only the human can do that. CORE is designed to make that gap visible and the correction path explicit — not to eliminate the need for a governor.
Which is also why "constitution instead of learning Python" isn't a shortcut. It's a different job description.
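A minimal sketch of that routing logic, with names and shapes of my own invention rather than CORE's actual internals:

```python
from dataclasses import dataclass

@dataclass
class Finding:
    rule: str      # which constitutional rule failed
    file: str      # where it failed
    decision: str  # which decision authorized the change

def route(findings: list[Finding], outcome_ok: bool) -> str:
    """Decide what the governance log surfaces to the governor."""
    if findings:
        # Loud failure: a rule fired, so the remediation path is explicit.
        return f"remediate: {findings[0].rule} in {findings[0].file}"
    if outcome_ok:
        return "pass"
    # No finding, wrong outcome: the silence is the signal.
    # Only the governor can close this by amending .specs/ or .intent/.
    return "escalate: specification gap, amend .specs/"
```

The third branch is the inversion: an empty findings list plus a bad outcome is itself a result, not an absence of one.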
"Silence is the signal" is doing serious work in that architecture. Most systems fail loudly when a rule breaks. CORE fails loudly when no rule breaks but the outcome is wrong, which means the absence of a finding is itself a finding. That's an inversion most governance systems don't attempt.
The .specs/ layer is the part I want to think about more. "Does CORE read this at runtime?" as the boundary test is clean: it separates intent documentation from enforcement logic without ambiguity. The gap your loop closes is exactly the one Alvarito and I have been circling in this thread: decisions that were understood but never formally declared. Your architecture makes that gap structurally visible rather than just accidentally discoverable.
The governor job description is the honest framing. The system doesn't eliminate judgment — it routes judgment to the right moments. When the log has no finding, that's a signal the governor has to read and act on. That requires someone who understands what the system was supposed to do, not just what it did.
How long did it take before .specs/ became load-bearing rather than aspirational?
The provenance problem is real and underappreciated. What you're describing is a version of what happens in infrastructure too — a configuration decision gets copied between systems, the original context gets lost, and six months later nobody knows why the timeout is set to 47 seconds or why that specific network range is excluded.
The difference in infrastructure is that the blast radius of losing context is more immediately visible. A misconfigured firewall rule breaks something today. A lost reasoning trail in code breaks something in 18 months when someone refactors without understanding what they're touching.
Your trace tool addresses the symptom correctly. The deeper question — why AI systems don't naturally preserve attribution — is probably structural. The training data was consumed, not cited. The output is synthesis, not quotation. Building attribution back in requires explicit tooling like what you've built, because the model itself has no incentive to surface where it learned something.
I've been documenting architectural decisions in CLAUDE.md for exactly this reason — not for the AI, but for the next session of the AI, and for the human reviewing the output. It's a partial solution to the same problem.
The 47-second timeout analogy is doing real work. That's exactly the failure mode — context that made sense in the original decision becomes cargo cult config by the second transfer. The difference you're naming (18-month blast radius vs. same-day) is why provenance feels optional until it's catastrophic.
The structural point is the one that doesn't have a clean answer: training consumed, not cited. Building attribution back in is archaeologist work, not feature work. You're right that it has to be explicit tooling because the incentive to surface sources was never in the training objective.
The CLAUDE.md approach is interesting — documenting for the next session rather than the current one. I've been thinking about whether that pattern needs to be machine-readable rather than just human-legible. Something that lets the next session actually query the reasoning, not just encounter it in plaintext. Does your CLAUDE.md have any structure beyond prose, or is it narrative-first?
Narrative-first, mostly — but with enough structure that Claude Code can extract what it needs without reading everything.
The pattern I've settled on is sections with consistent headers and a "Current State" block at the top that acts as the entry point for a new session. Something like:
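A sketch of that layout — the section names are illustrative, not the exact file:

```markdown
# CLAUDE.md

## Current State
<!-- updated at the end of every session -->
- Last session: refactored the governance log parser
- Working: rule enforcement, remediation maps
- Next: wire .specs/ URS docs into the review flow

## Architecture Decisions
Prose, one decision per paragraph, carrying the reasoning
that doesn't compress into structured data.

## Conventions
Consistent headers so a new session can jump straight to a
section instead of reading everything.
```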
The "Current State" block is the machine-readable part in practice — Claude Code reads it first and uses it to orient without re-deriving everything from the codebase. The architecture decisions section is genuinely prose, because the reasoning rarely compresses into structured data without losing the nuance that makes it useful.
The machine-readable vs human-legible tension you're naming is real. My instinct is that structured data (JSON, YAML) would be more queryable but would break the incentive to maintain it — nobody wants to write JSON to explain why they chose one approach over another. Prose has lower friction to update, which means it actually gets updated.
The open question for me is whether there's a middle ground: structured enough that a model can run a semantic search against it, but loose enough that a human writes it naturally. Something like Obsidian's approach to linking thoughts without forcing schema. Haven't solved it.
The maintenance incentive point is the one that usually kills schema-first approaches. JSON for architectural reasoning is a losing proposition — the friction is too high exactly where the thinking is most complex. What you've described is a workable compromise: structured enough at the entry point, prose where the nuance lives.
The Obsidian analogy is the right frame. What that approach actually does is keep linking lightweight — [[decision]] is lower friction than a foreign key. The structure emerges from the connections, not from enforcing a schema upfront.
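That lightweight linking is also easy to mine after the fact. A minimal sketch (the link names here are invented):

```python
import re

# [[wiki-style]] links: structure emerges from the connections
# without forcing a schema on the prose around them.
WIKILINK = re.compile(r"\[\[([^\]]+)\]\]")

def extract_links(prose: str) -> list[str]:
    """Return every [[link]] target mentioned in a prose passage."""
    return WIKILINK.findall(prose)
```

Run over a whole decisions file, that gives you the connection graph for free, which is most of what a schema would have bought.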
The middle ground might already exist in how you've described your "Architecture decisions" section: prose that a model can semantic-search without ever needing to parse it as data. Vectorized prose retrieval is probably more useful than structured queries for this class of content anyway — the reasoning rarely has clean boundaries you'd want to filter on.
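As a toy stand-in for that retrieval step — a real system would use embeddings rather than token overlap, but the interface is the point:

```python
def tokenize(text: str) -> set[str]:
    # Crude normalization; an embedding model replaces all of this.
    return {w.lower().strip(".,?!") for w in text.split()}

def best_section(query: str, sections: dict[str, str]) -> str:
    """Return the name of the prose section most relevant to a query."""
    q = tokenize(query)
    # Highest overlap wins: no schema, no structured query, just prose.
    return max(sections, key=lambda name: len(q & tokenize(sections[name])))
```

The design choice this illustrates: the prose stays prose, and queryability is layered on at read time instead of being demanded at write time.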
What I haven't figured out is the staleness problem. "Current State — last updated [date]" breaks the moment someone forgets to update it, which is most of the time. Does your pattern have any forcing function for that, or is it discipline-dependent?
Partially discipline-dependent, but I've added one forcing function that helps: I ask Claude Code to update the Current State block at the end of every session before closing.
The prompt is something like: "Before we finish, update the Current State section in CLAUDE.md to reflect what we did today, what's working, and where to pick up next time." Takes 30 seconds, happens while the context is still fresh, and Claude Code is better at summarizing what just happened than I am at remembering to write it down.
The staleness problem then shifts: instead of "did a human remember to update this", it becomes "did the session end cleanly or did someone just close the terminal." The latter still happens. But the forcing function catches maybe 80% of sessions that would otherwise leave stale state.
The deeper issue you're pointing at — that any pattern relying on consistent human behavior will degrade — is probably unsolvable without making the update automatic. Which brings it back to your tool: if the provenance system runs as a hook on commit rather than requiring manual invocation, it doesn't depend on discipline at all. That's the right place for the forcing function to live.
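A commit-time version could be as small as this — a hypothetical `.git/hooks/post-commit` script, with the file name and log format illustrative rather than an existing tool:

```python
"""Hypothetical post-commit hook: append the latest commit to a
provenance log in CLAUDE.md, so the update fires on every commit
instead of depending on anyone's discipline."""
import subprocess
from pathlib import Path

def latest_commit_line() -> str:
    # Ask git for the newest commit: short hash, subject, date.
    out = subprocess.run(
        ["git", "log", "-1", "--pretty=%h %s (%ad)", "--date=short"],
        capture_output=True, text=True, check=True,
    )
    return out.stdout.strip()

def append_provenance(log_path: Path, line: str) -> None:
    # Trigger-dependent, not discipline-dependent.
    with log_path.open("a") as f:
        f.write(f"- {line}\n")

# Installed as a hook, the body would be:
#     append_provenance(Path("CLAUDE.md"), latest_commit_line())
```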
I haven't closed the loop on making CLAUDE.md updates fully automatic. The session-end prompt is a compromise — lower friction than remembering, higher reliability than nothing.
The session-end prompt is a better forcing function than it looks — you've moved the update cost to the moment when the context is richest and the human is least likely to resist. That's good UX applied to a workflow problem.
The reframe you've landed on is the one that matters: discipline-dependent vs. trigger-dependent. The commit hook is trigger-dependent. It fires whether or not anyone remembered. That's why it catches what the session-end prompt misses — the closed terminal, the interrupted session, the context that never got a clean ending.
The gap you haven't closed is the async case: work that changes meaning without touching the codebase. An architectural decision made in a conversation that never produced a commit. The hook can't catch what was reasoned but not written. That's probably where the CLAUDE.md pattern and a provenance system are actually complementary rather than substitutes — one captures the decision trail, the other captures the code trail.
Sounds very useful!