
DEV Community

Pavel Gajvoronski


I Built 23 Pages in One Day With AI. Then One API Key Almost Killed Everything

This is a build-in-public update on Kepion, an AI platform that deploys companies from a text description. First post here.


Two days ago I shared the architecture. Today I want to share what actually happened when I started building — the wins, the disasters, and the numbers.


The disaster: 3 hours lost to a phantom API key

I sat down at 8am ready to build. Opened my terminal. Ran GSD-2 (my build orchestrator). Got this:

Error: All credentials for "anthropic" are in a cooldown window.

My Max plan showed 3% usage. The tool said I was rate-limited. For three hours I debugged, restarted, cleared caches, filed a support ticket. The fix?

unset ANTHROPIC_API_KEY

An old API key from a previous tool installation was silently overriding my subscription. One environment variable. Three hours gone.

The lesson: invisible defaults are the most dangerous bugs in AI tooling.

I'm sharing this because every developer building with AI agents will hit this. Your LLM provider's auth layer has more failure modes than your application code.
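A startup check that reports which provider credentials are present in the environment would have surfaced this in seconds. The sketch below is one way such a check could look, not something GSD-2 is known to do. The function names and the list of variables are illustrative assumptions; only ANTHROPIC_API_KEY comes from the incident above.

```python
import os

# Variables that could plausibly shadow a subscription login.
# Only ANTHROPIC_API_KEY is from the incident; the others are assumed examples.
SUSPECT_VARS = ("ANTHROPIC_API_KEY", "OPENAI_API_KEY", "OPENROUTER_API_KEY")


def find_overriding_keys(env=None, suspects=SUSPECT_VARS):
    """Return the suspect variables that are set to a non-empty value."""
    env = os.environ if env is None else env
    return [name for name in suspects if env.get(name, "").strip()]


def report(env=None):
    """Build a human-readable warning, or an empty string if nothing is set."""
    found = find_overriding_keys(env)
    lines = [f"warning: {name} is set and may override your login" for name in found]
    return "\n".join(lines)


if __name__ == "__main__":
    message = report()
    if message:
        print(message)
```

Running this before the build tool starts turns a silent default into a visible one.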


What GSD-2 actually built in one day

Once the auth was fixed, I pointed GSD-2 at Kepion and let it work. Here's the raw output from a single day:

Security hardening (10 items)

  • Deny-by-default auth middleware — every new route is blocked unless explicitly whitelisted
  • Path traversal fix in vault manager
  • WebSocket authentication (was anonymous before)
  • CORS whitelist replacing wildcard *
  • Password policy: 12+ chars, uppercase, digit, special char
  • Rate limiting by user email instead of IP
  • Upload validation: file extension whitelist, 5MB limit
  • Business ownership verification on all endpoints
  • Session scoping by user_id
  • Login attempt tracking with 30-minute lockout after 10 failures
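To make the path traversal item concrete, here is a minimal sketch of the usual guard: resolve the requested path and refuse anything that lands outside the vault root. The function name `resolve_inside` and the root directory are hypothetical; this is not Kepion's vault manager code. It needs Python 3.9 or later for `Path.is_relative_to`.

```python
from pathlib import Path


def resolve_inside(root, user_path):
    """Resolve user_path under root and reject anything that escapes it."""
    root_resolved = Path(root).resolve()
    # Joining with an absolute user_path discards root, so the check below
    # also rejects inputs such as "/etc/passwd".
    candidate = (root_resolved / user_path).resolve()
    if not candidate.is_relative_to(root_resolved):
        raise ValueError(f"path escapes vault root: {user_path!r}")
    return candidate
```

The key design choice is to compare resolved paths rather than scan the input for "..", since symlinks and absolute paths can escape the root without containing that string.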

Observability (shipped)

  • Every HTTP request gets a trace_id
  • Every agent call becomes a span linked to the trace
  • Slow trace detection (>5s)
  • Error trace listing
  • All persisted in SQLite
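The trace and span model above can be sketched with two SQLite tables. This is a simplified illustration under stated assumptions, not Kepion's schema: the table and function names are made up, and a trace's duration is taken as the sum of its span durations, which ignores any time spent outside agents.

```python
import sqlite3
import uuid

SLOW_TRACE_MS = 5000  # the ">5s" threshold from the list above

SCHEMA = """
CREATE TABLE traces (
    trace_id TEXT PRIMARY KEY, path TEXT, duration_ms REAL, error TEXT
);
CREATE TABLE spans (
    span_id TEXT PRIMARY KEY,
    trace_id TEXT REFERENCES traces(trace_id),
    agent TEXT,
    duration_ms REAL
);
"""


def open_store(db_path=":memory:"):
    """Open a database and create the two tables."""
    conn = sqlite3.connect(db_path)
    conn.executescript(SCHEMA)
    return conn


def record_trace(conn, path, spans, error=None):
    """Store one request and its agent spans; spans is [(agent, duration_ms)]."""
    trace_id = uuid.uuid4().hex
    total_ms = sum(ms for _, ms in spans)
    conn.execute(
        "INSERT INTO traces VALUES (?, ?, ?, ?)", (trace_id, path, total_ms, error)
    )
    conn.executemany(
        "INSERT INTO spans VALUES (?, ?, ?, ?)",
        [(uuid.uuid4().hex, trace_id, agent, ms) for agent, ms in spans],
    )
    conn.commit()
    return trace_id


def slow_traces(conn, threshold_ms=SLOW_TRACE_MS):
    """Return trace ids slower than the threshold, slowest first."""
    rows = conn.execute(
        "SELECT trace_id FROM traces WHERE duration_ms > ? ORDER BY duration_ms DESC",
        (threshold_ms,),
    )
    return [row[0] for row in rows]


def slowest_agent(conn, trace_id):
    """Return the agent with the longest span in a trace, or None."""
    row = conn.execute(
        "SELECT agent FROM spans WHERE trace_id = ? ORDER BY duration_ms DESC LIMIT 1",
        (trace_id,),
    ).fetchone()
    return row[0] if row else None
```

With spans linked by trace_id, finding the bottleneck in a chain becomes a single query instead of log reading.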

Cost intelligence (shipped)

  • Per-agent, per-model, per-business cost breakdown
  • Anomaly detection: flags agents whose cost z-score exceeds 2 (more than 2σ above the mean)
  • Cost circuit breaker: blocks requests at configurable limits
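The anomaly rule is a plain z-score test. A minimal sketch, assuming costs are compared across agents over the same time window and using the population standard deviation (both are assumptions; the function name is illustrative):

```python
from statistics import mean, pstdev


def cost_anomalies(costs_by_agent, threshold=2.0):
    """Return {agent: z_score} for agents whose cost z-score exceeds threshold."""
    values = list(costs_by_agent.values())
    if len(values) < 2:
        return {}
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return {}  # all agents cost the same, nothing stands out
    flagged = {}
    for agent, cost in costs_by_agent.items():
        z = (cost - mu) / sigma
        if z > threshold:
            flagged[agent] = z
    return flagged
```

One caveat with this formulation: among n agents a single outlier can reach a z-score of at most sqrt(n - 1), so with five or fewer agents a threshold of 2 never fires.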

Team Memory (shipped)

  • Agents save learnings across sessions
  • Effectiveness scoring (0.0–1.0)
  • Auto context injection — relevant memories prepended to prompts
  • Categories: solution, pattern, mistake, optimization
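Here is a rough sketch of how effectiveness scoring and context injection could fit together. Everything beyond the four category names and the 0.0 to 1.0 score range is an assumption: the update rate, the word-overlap relevance measure, and the cutoff are placeholders, not Kepion's actual logic.

```python
from dataclasses import dataclass

CATEGORIES = {"solution", "pattern", "mistake", "optimization"}


@dataclass
class Memory:
    text: str
    category: str
    score: float = 0.5  # effectiveness in [0.0, 1.0]

    def __post_init__(self):
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category}")

    def record_outcome(self, helped, rate=0.3):
        """Move the score toward 1.0 on success and toward 0.0 on failure."""
        target = 1.0 if helped else 0.0
        self.score += rate * (target - self.score)


def relevant(memories, prompt, limit=3, min_score=0.2):
    """Rank memories by word overlap with the prompt, weighted by score."""
    words = set(prompt.lower().split())
    ranked = []
    for memory in memories:
        overlap = len(words & set(memory.text.lower().split()))
        if overlap and memory.score >= min_score:
            ranked.append((overlap * memory.score, memory))
    ranked.sort(key=lambda pair: pair[0], reverse=True)
    return [memory for _, memory in ranked[:limit]]


def inject(memories, prompt):
    """Prepend relevant memories to the prompt as a short context block."""
    picked = relevant(memories, prompt)
    if not picked:
        return prompt
    context = "\n".join(f"[{m.category}] {m.text}" for m in picked)
    return f"Relevant memories:\n{context}\n\n{prompt}"
```

Because every failure pulls the score toward zero, a memory that keeps misleading agents eventually drops below the cutoff and stops being injected.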

Checkpoint & Replay (shipped)

  • Checkpoint after every chain step
  • Resume on failure with can_resume: true
  • Dead letter queue for chains that fail after all retries
  • Configurable retry policies: default, critical, fast_fail
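The checkpoint and resume flow can be sketched in a few lines. The three policy names and the can_resume flag come from the list above; the attempt counts, the shape of the state dict, and the function name are assumptions made for illustration.

```python
# Attempts per step for each policy; the numbers are assumed, not Kepion's.
RETRY_POLICIES = {"default": 3, "critical": 5, "fast_fail": 1}


def run_chain(steps, state=None, policy="default", dead_letters=None):
    """Run steps in order, checkpointing after each successful step.

    steps is a list of (name, callable); each callable takes the running
    value and returns the new one. state is a checkpoint from an earlier run.
    """
    if state:
        state = dict(state)  # do not mutate the caller's checkpoint
    else:
        state = {"next_step": 0, "value": None, "can_resume": False}
    attempts = RETRY_POLICIES[policy]
    for index in range(state["next_step"], len(steps)):
        name, step = steps[index]
        for _ in range(attempts):
            try:
                state["value"] = step(state["value"])
                break
            except Exception as exc:
                last_error = exc
        else:
            # Every attempt failed: mark as resumable and park the chain.
            state["can_resume"] = True
            state["error"] = f"{name}: {last_error}"
            if dead_letters is not None:
                dead_letters.append(dict(state))
            return state
        state["next_step"] = index + 1  # checkpoint after the step
    state["can_resume"] = False
    state.pop("error", None)
    return state
```

Passing the returned state back into `run_chain` resumes from the failed step, so completed steps are not paid for twice.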

Event-driven triggers (shipped)

  • 5 trigger types: schedule, webhook, event_pattern, vault_change, threshold
  • 4 action types: run_agent, run_chain, webhook_out, notify

Web UI: 23 pages (shipped)

  • Full Next.js 16 dashboard with collapsible sidebar
  • Dashboard, Chat, Agents, Pipelines, Businesses, Integrations
  • Vault, Research, Patterns, YouTube, Workflows, Gate
  • Costs, Traces, Triggers, Admin, Pricing, Account
  • Live support chat widget with typing indicators
  • Pricing page with 5 tiers and competitive comparison table

Telegram bot: fully functional (shipped)

  • /start with auto-registration and JWT token storage
  • /agents, /agent, /business, /status, /costs, /help
  • Free text → auto-routing to the right agent
  • Typing indicators while agents think
  • Auth headers on every API call

The numbers

| Metric | Value |
| --- | --- |
| Services | 30+ |
| API endpoints | 40+ |
| Agent prompts (v3) | 31 × 17 sections each |
| Tests | 180+ |
| Web UI pages | 23 |
| Telegram commands | 15 |
| Lines changed in one day | ~3,000+ |

One person. One AI build tool. One day.


What I learned

1. Security is invisible until it isn't. Nobody sees path traversal protection. But without it, the first user with ../../etc/passwd in a vault search owns your server. I'm glad GSD-2 caught every item from the CONCERNS.md audit.

2. Observability changes everything. Before traces, debugging a 5-agent chain was guesswork. Now I can see: request → router (2ms) → researcher (4.3s) → sentinel (1.1s) → warden (0.8s) → response. The bottleneck is always the researcher.

3. Cost circuit breakers are non-negotiable. Without them, one hallucinating agent in a loop burns through your OpenRouter budget in minutes. Our circuit breaker has 4 levels: per-request ($2), per-agent-hourly ($10), per-business-daily ($50), platform-hourly ($100).

4. Team Memory is the moat. Every business Kepion creates makes the next one better. Agents save what worked and what failed. Business #5 benefits from patterns discovered in businesses #1-4. This compounds. Competitors can copy the code — they can't copy the accumulated knowledge.
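The four circuit-breaker levels from lesson 3 can be checked with a small amount of bookkeeping. This is a sketch under assumptions, not Kepion's implementation: it uses fixed hourly and daily windows rather than sliding ones, takes the timestamp as an explicit argument, and the class and method names are made up. The dollar limits are the ones quoted above.

```python
from collections import defaultdict

LIMITS_USD = {
    "request": 2.0,
    "agent_hourly": 10.0,
    "business_daily": 50.0,
    "platform_hourly": 100.0,
}


class CostCircuitBreaker:
    def __init__(self, limits=LIMITS_USD):
        self.limits = dict(limits)
        self.spent = defaultdict(float)  # keyed by (level, scope, window)

    def _keys(self, agent, business, now):
        """Map each windowed level to its counter key for this timestamp."""
        hour = int(now // 3600)
        day = int(now // 86400)
        return {
            "agent_hourly": ("agent_hourly", agent, hour),
            "business_daily": ("business_daily", business, day),
            "platform_hourly": ("platform_hourly", "*", hour),
        }

    def check(self, agent, business, estimated_cost, now):
        """Return the first limit this request would break, or None."""
        if estimated_cost > self.limits["request"]:
            return "request"
        for level, key in self._keys(agent, business, now).items():
            if self.spent[key] + estimated_cost > self.limits[level]:
                return level
        return None

    def record(self, agent, business, cost, now):
        """Add an actual cost to every window it counts against."""
        for key in self._keys(agent, business, now).values():
            self.spent[key] += cost
```

Calling `check` before the model call and `record` after it means a looping agent trips its hourly limit while other agents in the same business keep working.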


What's next

  • Autonomous Operations — agents posting to Twitter, sending emails, running outreach. Every output goes through Sentinel (fact-check) and Warden (quality gate) before publishing. Quality over spam.

  • Full Deploy Pipeline: /deploy chess-school → buy domain → deploy frontend (Vercel) → deploy backend (Railway) → configure Paddle payments → live URL. One command.

  • Code Ownership — all generated code pushes to the user's GitHub. You own everything. Kepion is the builder, not the landlord.


Questions for you

I'm genuinely curious:

  1. How do you handle AI agent costs in production? We built a 4-tier model routing system (Free → Budget → Performance → Premium) with auto-escalation on failure. Is anyone doing this differently?

  2. Team Memory vs RAG — what's your experience? We went with vault-based memory with effectiveness scoring instead of pure vector search. The scoring means bad memories decay. Has anyone combined both approaches?

  3. What's your threshold for "good enough" security in an MVP? We went aggressive (deny-by-default, path traversal, rate limiting) before launch. Some say ship fast, secure later. Curious where others draw the line.
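For context on question 1, the tier escalation idea reduces to a loop like the one below. It is a simplified sketch, not Kepion's router: the tier names come from the question, while `call_model` is a hypothetical caller-supplied function that returns text or raises on failure.

```python
TIERS = ["free", "budget", "performance", "premium"]


def route_with_escalation(prompt, call_model, start_tier="free", tiers=TIERS):
    """Try each tier from start_tier upward until one call succeeds.

    call_model(tier, prompt) is supplied by the caller; it returns the
    model's text or raises an exception on failure.
    """
    errors = []
    for tier in tiers[tiers.index(start_tier):]:
        try:
            return tier, call_model(tier, prompt)
        except Exception as exc:
            errors.append(f"{tier}: {exc}")
    raise RuntimeError("all tiers failed: " + "; ".join(errors))
```

Returning the tier alongside the answer lets the cost tracker attribute spend to the tier that actually served the request.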


Follow the build: GitHub | kepion.app
