DEV Community: Alexander Tyutin

Architecture Documentation as a First-Class Engineering Asset

Alexander Tyutin — Thu, 16 Apr 2026 09:49:24 +0000

How autonomous AI agents can generate a complete architecture snapshot of your microservices platform - while you do push-ups - and why that documentation becomes the most powerful input for your AI-driven quality pipeline.

TL;DR

Architectural documentation is not a chore. When colocated with your source code and fed into an AI-powered quality pipeline, it transforms static analysis from "catching typos" into "catching systemic security failures and costly infrastructure leaks." This article documents a real experiment where an autonomous AI agent generated architecture files across a multi-service Google Cloud platform - with the human engineer largely off-screen - and what happened when that documentation gave our AI Quality Gate an entirely new perspective.

1. The "Self-Documenting Code" Problem

There is a persistent assumption in software engineering that well-structured code is self-explanatory. Clean functions, good variable names, and a Pylint score of 10.0/10 - surely that's enough?

It is not.

Code describes how a system executes. Architecture documentation describes why a system exists and how it interacts with everything around it. Without this context layer, every automated analysis tool is operating in the dark. It sees a function, but not its role in the broader service mesh. It sees an API call, but not the security boundary it is expected to enforce.

This distinction matters enormously when you introduce AI-powered tools into your engineering workflow. An LLM analyzing raw code without architectural context is like asking a senior engineer to perform a security review without access to the system design.

2. Generating Architecture While Doing Push-ups

My platform runs on Google Cloud. It consists of dozens of microservices deployed on Cloud Run, interacting via REST APIs, persisting assets to Google Cloud Storage, and routing all AI operations through a centralized Vertex AI gateway. A rich, well-connected system - but one where the only documentation was spread across scattered README files.

I set out to change that. The goal: a standardized, machine-readable architectural snapshot for every service, committed directly to the repository.

The method: guided autonomous agent execution.

The engineer set a direction, established the documentation standard, and then stepped back. The AI agent - powered by Gemini 3 Flash and Claude Sonnet 4.6 running inside Antigravity, an agentic AI coding assistant - took over. It autonomously inspected each service, read the source code, traced inter-service dependencies, cross-referenced existing implementations against the documentation standard, and iteratively generated structured ARCHITECTURE.md files. The engineer's main activity during most of this process was physical exercise.

The output was not informal notes. It was a disciplined, multi-level documentation hierarchy:

📦 platform-root
 ┣ 📜 ARCHITECTURE.md           ← Level 0: Global service mesh, topology, lifecycle status
 ┗ 📂 services
    ┣ 📂 core-ai-gateway
    ┃  ┗ 📜 ARCHITECTURE.md     ← Level 1: Security policy engine, FinOps guardrails
    ┣ 📂 orchestration-bot
    ┃  ┗ 📜 ARCHITECTURE.md     ← Level 1: Async task flow, Telegram webhook handling
    ┣ 📂 media-transcriber
    ┃  ┗ 📜 ARCHITECTURE.md     ← Level 1: Speech-to-Text pipeline, GCS asset management
    ┗ 📂 translation-engine
       ┗ 📜 ARCHITECTURE.md     ← Level 1: Structured output, multilingual routing

Each document followed a strict template:

Intent: The concrete business and technical reason this service exists.
Design Principles: Key trade-offs - statelessness, latency targets, fallback strategies.
Interaction Diagram: A Mermaid graph of service-to-service flows, security boundaries, and AI provider integrations. The agent can generate it, and GitLab renders it automatically.
LLM Context Block: A precise summary optimized for consumption by automated agents and AI reviewers.
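
A template like this is easy to enforce mechanically. The following is a minimal sketch of such a check, not part of the original setup: the function name `missing_sections` is hypothetical, and it assumes each template section appears as a Markdown heading with exactly the title listed above.

```python
import re

# Section titles taken from the template above. That each one appears as a
# Markdown heading with this exact text is an assumption of this sketch.
REQUIRED_SECTIONS = (
    "Intent",
    "Design Principles",
    "Interaction Diagram",
    "LLM Context Block",
)


def missing_sections(markdown_text: str) -> list[str]:
    """Return the required section titles that have no matching heading."""
    headings = {
        match.group(1).strip().lower()
        for match in re.finditer(
            r"^#{1,6}[ \t]+(.+?)[ \t]*$", markdown_text, re.MULTILINE
        )
    }
    return [name for name in REQUIRED_SECTIONS if name.lower() not in headings]


if __name__ == "__main__":
    sample = "# media-transcriber\n\n## Intent\n...\n\n## Design Principles\n...\n"
    print(missing_sections(sample))
```

A check like this could run in CI so that an incomplete ARCHITECTURE.md is caught before any AI reviewer reads it.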

The entire operation resulted in a navigable, cross-linked architecture map, visualizations included, built with minimal human cognitive effort.

3. The Quality Gate Awakening

Once the documentation was committed alongside the source code, I ran a standard CI quality review using our AI-powered Quality Gate - a service built on top of Gemini via Vertex AI, designed to perform automated architectural and security reviews on every merge request.

💡 What is the Quality Gate, exactly?
It is not a $100,000 enterprise SaaS platform. It is a lightweight, purpose-built microservice - part of the same platform it reviews - deployed on Google Cloud Run. It exposes a single endpoint, receives the merge request diff from the CI pipeline, constructs an LLM prompt enriched with the repository's architectural documentation, calls Vertex AI (Gemini), and returns a structured JSON review report.

Because it runs on Cloud Run, it starts only when a review is triggered and shuts down immediately after. The total monthly cost for me is a few dollars - a fraction of a single human code review hour. This is a practical demonstration of the Google Cloud serverless model: pay only for the compute you actually use, and use high-intelligence AI only when it adds value.
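
The prompt-enrichment step can be sketched in a few lines. This is an illustration under stated assumptions, not the actual Quality Gate code: the function names and the prompt wording are hypothetical, and the Vertex AI call itself is deliberately left out.

```python
from pathlib import Path


def collect_architecture_docs(repo_root: Path) -> dict[str, str]:
    """Map each ARCHITECTURE.md path (relative to the repo root) to its text."""
    docs = {}
    for path in sorted(repo_root.rglob("ARCHITECTURE.md")):
        docs[path.relative_to(repo_root).as_posix()] = path.read_text(encoding="utf-8")
    return docs


def build_review_prompt(diff: str, docs: dict[str, str]) -> str:
    """Assemble a review prompt: architecture context first, then the diff."""
    # The instruction wording below is a placeholder, not the real prompt.
    parts = ["You are reviewing a merge request. Architecture context follows."]
    for rel_path, text in docs.items():
        parts.append(f"--- {rel_path} ---\n{text.strip()}")
    parts.append("--- MERGE REQUEST DIFF ---\n" + diff.strip())
    parts.append("Return findings as a JSON list of objects.")
    return "\n\n".join(parts)
```

Putting the architecture files before the diff means the model reads the intent before it reads the change.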

The difference was immediately visible.

Previously, without architectural context, the Quality Gate was limited to code-level analysis: style consistency, common security anti-patterns, dependency versions. Useful, but shallow.

With the ARCHITECTURE.md files available as context, the model could see the architecture and the code simultaneously. The result was a qualitative leap: the Quality Gate shifted from a static analysis tool into a reasoning system operating at the level of system design.

It identified two critical issues within minutes - issues that had existed undetected in the codebase for months.

Finding 1: The Distributed Tracing Blackout

One of our routing services included middleware that explicitly stripped incoming trace headers. On the surface, this looked like a reasonable security measure to prevent external clients from injecting trace identifiers into internal systems.

The Quality Gate identified it as a critical observability violation.

Because the architectural documentation described the distributed tracing standard across the mesh - including the requirement for end-to-end X-Trace-ID propagation compatible with Google Cloud Trace - the model understood that stripping these headers at the boundary did not isolate a threat. It severed the trace chain entirely. In any production incident, engineers would be unable to correlate logs across services in Cloud Logging, turning a routine debugging session into a multi-hour forensic investigation with no Cloud Audit Logs correlation to lean on.

Security intention ✓. Systemic consequence ✗. The documentation made this contradiction visible.
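
One way to reconcile the two goals is to validate the incoming identifier instead of stripping it. The sketch below is a possible approach, not the fix that was actually applied: the function name is hypothetical, and the 32-hex-character ID format is an assumption that should be replaced by whatever the mesh standard specifies.

```python
import re
import uuid

TRACE_HEADER = "X-Trace-ID"
# Assumed ID format: 32 lowercase hex characters. Adjust to the mesh standard.
_TRACE_ID_PATTERN = re.compile(r"[0-9a-f]{32}")


def outbound_trace_headers(incoming_headers: dict[str, str]) -> dict[str, str]:
    """Return the trace header to forward downstream, always carrying an ID.

    A well-formed incoming ID is propagated unchanged. A missing or malformed
    one is replaced with a fresh ID, so the chain is never silently cut.
    """
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {key.lower(): value for key, value in incoming_headers.items()}
    candidate = normalized.get(TRACE_HEADER.lower(), "").strip().lower()
    if _TRACE_ID_PATTERN.fullmatch(candidate):
        trace_id = candidate
    else:
        trace_id = uuid.uuid4().hex
    return {TRACE_HEADER: trace_id}
```

Untrusted input never reaches internal systems in raw form, yet every request still leaves the boundary with a usable trace identifier.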

Finding 2: The Silent Storage Leak

A media processing service was documented as intentionally skipping cleanup of temporary assets in Google Cloud Storage after each processing job. The rationale was implicit - simplicity, no failure modes from deletion errors.

The Quality Gate cross-referenced this against the documented architectural principle of data minimization and least-privilege access, and flagged it as both a security and FinOps violation.

The impact: user audio files - potentially containing sensitive personal information - accumulating indefinitely in cloud storage. No lifecycle policy. No deletion trigger. Silent, compounding cost growth. An expanding attack surface with each new processing request.
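
A missing lifecycle policy is usually a small configuration gap. As a rough sketch, the helper below builds a Cloud Storage lifecycle configuration that deletes old objects; the function name, the 7-day retention, and the `tmp/` prefix are illustrative assumptions, not values from the platform described here.

```python
import json


def delete_after_days_policy(days: int, prefix: str = "") -> dict:
    """Build a GCS lifecycle configuration deleting objects older than `days`."""
    if days < 0:
        raise ValueError("days must not be negative")
    condition = {"age": days}
    if prefix:
        # Restrict the rule to objects under one prefix, e.g. "tmp/".
        condition["matchesPrefix"] = [prefix]
    return {"rule": [{"action": {"type": "Delete"}, "condition": condition}]}


if __name__ == "__main__":
    # The 7-day retention and the "tmp/" prefix are illustrative values only.
    print(json.dumps(delete_after_days_policy(7, prefix="tmp/"), indent=2))
```

Saved to a file, the output could be applied with a command such as `gcloud storage buckets update gs://BUCKET_NAME --lifecycle-file=lifecycle.json`, where the bucket and file names are placeholders.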

Neither a linter nor a code reviewer scanning functions in isolation would have flagged either of these. Both findings emerged from the intersection of code behavior and architectural intent - visible only because the documentation existed.

4. The ROI Case

This experiment produced a measurable return on investment across three dimensions:

Dimension	Without Documentation	With Documentation + AI Agent
Architecture Capture	Senior Architect hours	Agent cycle, near-zero human effort
Review Quality	Code-level findings	System-level and policy findings
Issue Discovery Cost	Post-incident or audit	CI/CD pipeline (minutes, pennies)
Quality Gate	Generic, rigid enterprise tool	Custom microservice, tunable per team or developer

Three additional factors are worth noting specifically in the context of Google Cloud platforms:

Vertex AI Token Efficiency: When the Quality Gate is backed by a Gemini model, providing a structured ARCHITECTURE.md reduces the tokens the model spends reconstructing system intent from raw code. Better context means cheaper, faster, and more accurate generation - directly impacting your AI compute costs.
Cloud Run Observability: The distributed tracing finding described above is particularly relevant for Cloud Run-based architectures, where services are stateless and ephemeral. Without continuous trace propagation, debugging inter-service failures on Cloud Run becomes significantly harder. The documentation made this risk explicit and catchable.
Serverless Cost Model: Because the Quality Gate is a Cloud Run service invoked only during CI/CD runs, there is zero idle cost. On a typical team with several merge requests per day, the entire AI-powered review pipeline costs a few dollars per month - less than a single engineering hour. This is the Google Cloud serverless model working exactly as intended: high-intelligence compute, on-demand, at minimal cost.

5. Lessons for Platform Engineers

The key insight from this experiment is not that AI agents write documentation faster than humans. That is expected. The key insight is that architecture documentation living inside the repository is a force multiplier for every automated tool that reads it.

This applies whether your automated tools are AI-powered code reviewers, compliance scanners, onboarding assistants, or infrastructure planning agents. The better the documentation, the higher the signal quality of every tool operating on top of it.

Practical recommendations:

Colocate documentation with code. A separate wiki that drifts out of sync is noise. An ARCHITECTURE.md in the service directory, updated in the same commit as the code, is signal.
Establish a documentation standard. A consistent template (Intent, Principles, Interaction Diagram) makes documentation machine-readable, not just human-readable.
Define a lifecycle status. Clearly mark deprecated or inactive services. Automated agents should not use legacy code as a reference for current standards.
Use agents to generate the initial draft. The cognitive overhead of starting from a blank page is real. Agents are excellent at producing a structured first pass that engineers then validate and refine.
Feed documentation to your CI pipeline. An AI quality reviewer with architectural context is a different class of tool than one without it.
Build your own Quality Gate - and make it yours. This is the key advantage that enterprise SaaS cannot match: flexibility. A custom Cloud Run service backed by Gemini and driven by your compliance rules, your architectural standards, and your team conventions means every developer can have a personal reviewer that understands the exact context of the project - not a generic ruleset designed for the average of all possible codebases.

6. Conclusion

Architecture documentation has historically been treated as optional overhead - valuable in theory, deprioritized in practice. This experiment demonstrates that when documentation is colocated with source code, follows a consistent machine-readable standard, and is kept current with the help of autonomous agents, it becomes a critical infrastructure component.

It enables automated systems to reason at the level of platform design, not just code syntax. It transforms AI-powered quality gates from expensive linters into genuine architectural advisors. And it can be generated - for an entire platform - while you are doing something else entirely.

The $10,000 ARCHITECTURE.md is not a metaphor. It is the estimated cost differential between finding a critical architectural flaw in a 5-minute CI review versus discovering it during a production incident, a compliance audit, or a cloud storage invoice that nobody expected.

Keep your architecture documented. Keep it in the repository. Let agents maintain it.

Stay standardized. Stay secure.

Using OpenCode as a fallback agent for Antigravity

Alexander Tyutin — Wed, 15 Apr 2026 10:45:53 +0000

Today Antigravity kept returning errors about high load on its services. That made my work impossible, even with the cheapest model, Gemini 3 Flash.

Some time ago I had heard about OpenCode, and this was the time to try it!

I installed opencode on my system with brew install anomalyco/tap/opencode and added the respective extension from the marketplace.

I keep good documentation inside the repo, as described in the article Specification-First Agentic Development: A Methodology for Structured, Traceable AI-Assisted Development. So Big Pickle, the default free OpenCode model, handled the planning, reviewing, and coding stages well.

But then I realized that it was working without taking into account the system instructions and rules I had for Antigravity.

So I called my Antigravity assurance workflows right from the OpenCode chat, and it performed them perfectly.

Since I have many workflows for linting and for security checks of both the diff and the whole repo, and especially my external self-made security gateway, I was confident that the code produced by OpenCode was good enough and aligned with my codebase.

The only issue worth mentioning is that a redundant file was left behind after some iterations of testing. A good review right after MR creation can catch that.

So OpenCode seems to be a good fallback for when Google's servers are experiencing problems. It can also be used to save tokens on some kinds of tasks.

Gemini Thinking: How "Brainy" Models Unexpectedly Blew My Budget

Alexander Tyutin — Mon, 13 Apr 2026 07:29:03 +0000

Recently, Google notified me that the Gemini 2.0 models I was using are retiring. This was disappointing because my charity project for Technovation Girls worked perfectly and very cheaply on those models.

I had to find a replacement. While Google recommended Gemini 3.0, those models are still in "preview". Since my project needs high stability, I chose the Gemini 2.5 family, which is already in "General Availability".

The Surprise: Why is it so Slow and Expensive?

Switching was easy because I built my platform to handle model changes and fallbacks automatically. I simply updated my allowed models list and set gemini-2.5-flash-lite as the primary choice.

However, I was shocked by the results:

Requests took much longer to finish.
The quality was barely better.
Token usage exploded.
I saw a massive "system overhead" in my logs.

[Screenshots of the request logs before and after the switch]

The Cause: "Thinking" by Default

After digging into the documentation, I found the reason: all Gemini 2.5 models are "thinking" models. By default, they use as many tokens as possible to "reason" before answering.

My project worked great without this extra thinking. The slight quality boost was not worth the massive increase in latency and cost. I had to find a way to stop the model from thinking "on my dime".

The Technical Hurdle

I discovered that different models have different minimum "thinking budgets". Surprisingly, gemini-2.5-flash-lite has a higher minimum budget (512 tokens) than the more powerful gemini-2.5-flash (only 1 token!).

Model	Min Thinking Budget
Gemini 2.5 Flash Lite	512 tokens
Gemini 2.5 Flash	1 token
Gemini 2.5 Pro	128 tokens

To fix this, I had to expand my code to calculate and limit these budgets during fallbacks. I also had to handle the new text constants (MINIMAL, MEDIUM, HIGH) used by the Gemini 3.x models.

"gemini-2.5-flash-lite": {
    "model_page": f"{_MODEL_GEMINI_DOCS_BASE}/gemini/2-5-flash-lite",
    "is_thinking": True,
    "grounding_rag": True,
    "grounding_google_search": True,
    "count_tokens": True,
    "supports_thinking_level": False,
    "supports_thinking_budget": True,
    "min_thinking_budget": 512,
    "outputs": ["text"],
},
"gemini-2.5-flash": {
    "model_page": f"{_MODEL_GEMINI_DOCS_BASE}/gemini/2-5-flash",
    "is_thinking": True,
    "grounding_rag": True,
    "grounding_google_search": True,
    "count_tokens": True,
    "supports_thinking_level": False,
    "supports_thinking_budget": True,
    "min_thinking_budget": 1,
    "outputs": ["text"],
},
"gemini-2.5-pro": {
    "model_page": f"{_MODEL_GEMINI_DOCS_BASE}/gemini/2-5-pro",
    "is_thinking": True,
    "grounding_rag": True,
    "grounding_google_search": True,
    "count_tokens": True,
    "supports_thinking_level": False,
    "supports_thinking_budget": True,
    "min_thinking_budget": 128,
    "outputs": ["text"],
},
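
The budget calculation during fallbacks can be sketched as a simple clamp. This is a minimal illustration rather than the platform's real code: `MODEL_CAPS` is a trimmed, hypothetical copy of the entries above, and `effective_thinking_budget` is an invented name.

```python
from typing import Optional

# A trimmed copy of the capability entries above, keeping only the two
# fields this sketch needs. The dictionary name is hypothetical.
MODEL_CAPS = {
    "gemini-2.5-flash-lite": {"supports_thinking_budget": True, "min_thinking_budget": 512},
    "gemini-2.5-flash": {"supports_thinking_budget": True, "min_thinking_budget": 1},
    "gemini-2.5-pro": {"supports_thinking_budget": True, "min_thinking_budget": 128},
}


def effective_thinking_budget(model: str, requested: int) -> Optional[int]:
    """Return the budget to send for `model`, or None if budgets do not apply."""
    caps = MODEL_CAPS[model]
    if not caps["supports_thinking_budget"]:
        return None
    # Never go below the minimum recorded for the model.
    return max(requested, caps["min_thinking_budget"])


if __name__ == "__main__":
    for name in MODEL_CAPS:
        print(name, effective_thinking_budget(name, 50))
```

With a requested budget of 50 tokens, only gemini-2.5-flash can honor the request as-is. A fallback to either of the other two models silently raises the budget to that model's minimum, which is worth knowing when estimating cost.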

The Result

I finally switched to gemini-2.5-flash with a strict limit of 50 thinking tokens.

Now, response speeds are back up and costs are back down. It was a lot of unexpected work for a "simple" upgrade, but everything is running smoothly again!

AI-Powered Repository Security Check with Antigravity Workflow

Alexander Tyutin — Mon, 06 Apr 2026 09:46:09 +0000

When teams want to "move fast and break things," security is often the first thing they forget. I've seen a lot over 15 years in the industry. My approach is simple: follow the Pareto Principle (80/20). You want 80% of the security results with just 20% of the work.

In the AI era, that 20% of work can look like a single command.

Here is how we built the Antigravity workflow that checks the whole repository for security issues in several minutes. It does not cost much and does not use up all the AI's context window.

The Initial Stack

To get a clear picture of a repository's health, one tool is not enough. We start with a combination of proven, open-source scanners:

Gitleaks: To find secrets like API keys and tokens.
Semgrep: For SCA and SAST to find bad code patterns and supply chain issues.
Checkov: To check IaC security (Docker, Terraform, Kubernetes).
OSV-Scanner: For SCA, scanning dependencies against known vulnerabilities.

Inspecting their results manually takes a lot of time. And if you just send all their raw output directly to an AI, it becomes very expensive and confusing.

Token Economy

For a security review, the AI doesn't need to see every test that passed. It doesn't need to see the full abstract syntax tree. It only needs to know what is broken, where it is, and why it matters.

We use jq to remove the extra noise. This minifying step is very important for Token Economy.

To save even more tokens, the command (workflow) can be run with the cheapest model, Gemini 3 Flash. It is more than enough to produce a high-quality base report. The report can then be reviewed with a more powerful model such as Gemini 3.1 Pro.

Example: Minifying Results

Instead of a huge JSON file per tool, we make it small and simple. For example, here are the exact commands we use to make the results smaller:

  1. `jq '[.[] | {rule: .RuleID, file: .File, line: .StartLine}]' gitleaks-raw.json > gitleaks-min.json`
  2. `jq '[.results[] | {rule: .check_id, file: .path, line: .start.line, severity: .extra.severity}]' semgrep-raw.json > semgrep-min.json`
  3. `jq 'if type=="array" then map(.results.failed_checks[]) else .results.failed_checks end | [.[]? | {rule: .check_id, file: .file_path, line: .file_line_range}]' checkov-raw.json > checkov-min.json`
  4. `jq '[.results[]?.packages[]?.vulnerabilities[]? | {rule: .id, file: .package.name, line: "N/A", severity: ((.database_specific.severity) // "N/A")}]' osv-raw.json > osv-min.json`
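
For readers less familiar with jq, the Semgrep filter (command 2) can be mirrored in Python. This is only an equivalent sketch for illustration; the workflow itself uses jq, and the function name `minify_semgrep` is hypothetical. The field names come straight from the jq expression.

```python
import json


def minify_semgrep(raw: dict) -> list[dict]:
    """Keep only rule, file, line and severity for each Semgrep result."""
    return [
        {
            "rule": result.get("check_id"),
            "file": result.get("path"),
            "line": result.get("start", {}).get("line"),
            "severity": result.get("extra", {}).get("severity"),
        }
        for result in raw.get("results", [])
    ]


if __name__ == "__main__":
    # Made-up input in the shape the jq filter expects.
    raw_report = {
        "results": [
            {
                "check_id": "example.rule-id",
                "path": "app/main.py",
                "start": {"line": 12, "col": 5},
                "extra": {"severity": "ERROR", "message": "long text the AI does not need"},
            }
        ]
    }
    print(json.dumps(minify_semgrep(raw_report)))
```

Everything outside the four kept fields, such as messages, metadata, and column offsets, is dropped before the AI sees the report.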

By shrinking the data by roughly 90% or more, we keep the AI focused on real problems and make the check much cheaper.

% ls -lh .security-artifacts | awk '{print $5, $9}'

6.3K checkov-min.json
2.6M checkov-rev-004-security-20260401-201110.json
2.4K gitleaks-min.json
19K gitleaks-rev-004-security-20260401-161824.json
19K gitleaks-rev-004-security-20260401-201110.json
107B osv-min-rev-001-security-20260406-110805.json
11K osv-raw-rev-001-security-20260406-110805.json
4.3K semgrep-min.json
61K semgrep-rev-004-security-20260401-161824.json
38K semgrep-rev-004-security-20260401-201110.json
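
The savings can be worked out directly from the listing above. The sketch below uses the rounded sizes as printed, so the percentages are approximate. Semgrep is left out because the listing contains two raw files of different sizes. The helper names are hypothetical.

```python
# Byte multipliers for the size suffixes that appear in the listing.
UNITS = {"B": 1, "K": 1024, "M": 1024 ** 2}


def to_bytes(size: str) -> float:
    """Convert an `ls -lh` style size such as '6.3K' or '107B' to bytes."""
    return float(size[:-1]) * UNITS[size[-1]]


def reduction_percent(raw: str, minified: str) -> float:
    """Percentage by which the minified file is smaller than the raw one."""
    return 100 * (1 - to_bytes(minified) / to_bytes(raw))


# (raw, minified) sizes copied from the listing above.
PAIRS = {
    "checkov": ("2.6M", "6.3K"),
    "gitleaks": ("19K", "2.4K"),
    "osv": ("11K", "107B"),
}

if __name__ == "__main__":
    for tool, (raw, minified) in PAIRS.items():
        print(f"{tool}: {reduction_percent(raw, minified):.1f}% smaller")
```

On these numbers the reduction ranges from about 87% for Gitleaks to over 99% for Checkov and OSV-Scanner.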

The "One Command" Workflow

We put all these steps into one Antigravity slash command: /review-security-repo.

When we run it, the agent does exactly this:

Identifies the environment: Checks for tools like semgrep, gitleaks, checkov, osv-scanner and jq.
Executes Raw Scans: Runs the scanners to get raw logs.
Applies Minification: Uses jq to strip massive metadata.
Synthesizes Findings: Only reads the small files (gitleaks-min.json, semgrep-min.json, checkov-min.json, osv-min.json).
Performs Review: Checks high-risk files to find complex problems that static tools miss.
Generates an Actionable Report: Uses a strict Markdown structure instead of a generic summary.
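
The first step, identifying the environment, is simple to reproduce outside the agent. The workflow itself is a set of instructions for the agent, so the following is only a sketch of what that check amounts to; the function name `missing_tools` is hypothetical.

```python
import shutil

# Tool names as listed in the first workflow step above.
REQUIRED_TOOLS = ("semgrep", "gitleaks", "checkov", "osv-scanner", "jq")


def missing_tools(tools=REQUIRED_TOOLS) -> list[str]:
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Install before scanning:", ", ".join(missing))
    else:
        print("All scanners are available.")
```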

Report Structure Snippet

The workflow forces the AI to output exactly what we need, like this:

### [Severity] - [Vulnerability Name/Rule ID]
- **Tool Source:** [Semgrep / Gitleaks / Checkov / OSV-Scanner / Manual Architectural Review]
- **Location:** `[File Name]:[Line Number]`
- **Business Impact:** [Why this matters]
- **Remediation:** 
  [Actionable, copy-paste code or config fix]

Why This Works

Repeatability: Anyone on the team can check security without being an expert.
Audit Trail: Every raw and minified report is moved to .security-artifacts/ so we can track the history.
Reduced Hallucinations: Because we give AI only the exact scanner results and small code pieces, it gives real fixes without making things up.
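
The audit-trail naming scheme can be expressed as one small function. This is a sketch following the `[tool]-[raw|min]-[branch-name]-[timestamp].json` pattern used by the workflow; the function name is hypothetical, and replacing `/` in branch names is an added assumption, not something the workflow specifies.

```python
from datetime import datetime
from pathlib import Path

ARTIFACT_DIR = Path(".security-artifacts")


def artifact_path(tool: str, kind: str, branch: str, when: datetime) -> Path:
    """Build e.g. .security-artifacts/gitleaks-raw-<branch>-<timestamp>.json."""
    if kind not in ("raw", "min"):
        raise ValueError("kind must be 'raw' or 'min'")
    # Assumption: branch names may contain '/', which would otherwise
    # create unwanted subdirectories.
    safe_branch = branch.replace("/", "-")
    stamp = when.strftime("%Y%m%d-%H%M%S")
    return ARTIFACT_DIR / f"{tool}-{kind}-{safe_branch}-{stamp}.json"


if __name__ == "__main__":
    print(artifact_path("semgrep", "min", "feature/login", datetime.now()))
```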

Full Workflow Code

If you want to try this yourself, here is the complete code for the /review-security-repo workflow:

---
description: "Security review of the repo"
---

- Get the current branch name and current timestamp (format: YYYYMMDD-HHMMSS). Define output file as `security-review-[branch-name]-[timestamp].md`.
- Check for a `venv` (or `.venv`) directory in the repository root. If found, use its binaries.
- Verify if `semgrep`, `gitleaks`, `checkov`, `osv-scanner`, and `jq` are installed. If missing, prompt for installation and pause until confirmed.
- Execute local security scanners to capture raw audit trails:
  1. `gitleaks detect --source . -v --report-format json --report-path gitleaks-raw.json`
  2. `semgrep scan --config auto --json --output semgrep-raw.json`
  3. `checkov -d . --quiet --skip-path venv -o json > checkov-raw.json`
  4. `osv-scanner -r . --format json > osv-raw.json || true`
- Execute `jq` to strip massive metadata, passed checks, and AST dumps, keeping only critical fields to save context tokens:
  1. `jq '[.[] | {rule: .RuleID, file: .File, line: .StartLine}]' gitleaks-raw.json > gitleaks-min.json`
  2. `jq '[.results[] | {rule: .check_id, file: .path, line: .start.line, severity: .extra.severity}]' semgrep-raw.json > semgrep-min.json`
  3. `jq 'if type=="array" then map(.results.failed_checks[]) else .results.failed_checks end | [.[]? | {rule: .check_id, file: .file_path, line: .file_line_range}]' checkov-raw.json > checkov-min.json`
  4. `jq '[.results[]?.packages[]?.vulnerabilities[]? | {rule: .id, file: .package.name, line: "N/A", severity: ((.database_specific.severity) // "N/A")}]' osv-raw.json > osv-min.json`
- Read ONLY `gitleaks-min.json`, `semgrep-min.json`, `checkov-min.json`, `osv-min.json`. Filter out false positives based on repository context.
- Analyze high-risk architectural files strictly for logical flaws and cross-service least-privilege violations that static tools cannot understand.
- Generate the report in `security-review-[branch-name]-[timestamp].md`. DO NOT output generic summary tables. You MUST output an exhaustive, itemized list.
- Use the following strict Markdown structure for the report:
  ## Executive Summary
  [Brief overview of the branch's security posture]
  ## Detailed Findings
  [Iterate through EVERY validated finding. For each finding, output:]
  ### [Severity] - [Vulnerability Name/Rule ID]
  - **Tool Source:** [Semgrep / Gitleaks / Checkov / OSV-Scanner / Manual Architectural Review]
  - **Location:** `[File Name]:[Line Number]`
  - **Business Impact:** [Why this matters]
  - **Remediation:** 
    ```
    [Actionable, copy-paste code or config fix]
    ```
- Create a `.security-artifacts/` directory if it does not exist. Ensure `.security-artifacts/` is appended to `.gitignore`.
- Move and rename both raw and minified reports to `.security-artifacts/` to preserve the complete historical audit trail:
  - `gitleaks-raw.json` -> `.security-artifacts/gitleaks-raw-[branch-name]-[timestamp].json`
  - `semgrep-raw.json` -> `.security-artifacts/semgrep-raw-[branch-name]-[timestamp].json`
  - `checkov-raw.json` -> `.security-artifacts/checkov-raw-[branch-name]-[timestamp].json`
  - `osv-raw.json` -> `.security-artifacts/osv-raw-[branch-name]-[timestamp].json`
  - `gitleaks-min.json` -> `.security-artifacts/gitleaks-min-[branch-name]-[timestamp].json`
  - `semgrep-min.json` -> `.security-artifacts/semgrep-min-[branch-name]-[timestamp].json`
  - `checkov-min.json` -> `.security-artifacts/checkov-min-[branch-name]-[timestamp].json`
  - `osv-min.json` -> `.security-artifacts/osv-min-[branch-name]-[timestamp].json`
- Exit execution.

What’s Next?

What tools are missing from your perfect "One Command" security check? I would be happy to hear opinions on how to further optimize the token economy while expanding security coverage.

Antigravity: My Approach to Deliver the Most Assured Value for the Least Money

Alexander Tyutin — Wed, 01 Apr 2026 05:49:34 +0000

As I'm not a professional developer but a guy who needs to use automation to get things done, I follow one main rule: keep it simple. Overengineering hurts. I use the Pareto rule—spend 20% of the effort to get 80% of the result.

When I use AI agents like Antigravity, my goal is not to let the AI write complex code that no one can read. My goal is to build simple, secure features fast. At the same time, I control costs by saving tokens. Here is the exact workflow I use.

The Token Economy Strategy

LLM tokens cost money. Using a smart, expensive model just to fix whitespace in code is not worth the cost. I change models based on how hard the task is.

High-Tier Models: They are for the big tasks: planning architecture, writing complex business logic, checking security, and estimating cloud costs.
Low-Tier Models: These are for simple tasks: fixing syntax errors, aligning code to Pylint, and writing standard code pieces.
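
The split can be written down as a small lookup so the choice is explicit rather than made ad hoc. This is a sketch only: the task labels and the function name are hypothetical, chosen to mirror the two lists above.

```python
# Task labels are hypothetical names for the categories listed above.
HIGH_TIER_TASKS = {
    "architecture_planning",
    "business_logic",
    "security_review",
    "cost_estimation",
}
LOW_TIER_TASKS = {"syntax_fix", "pylint_alignment", "boilerplate"}


def pick_tier(task_type: str) -> str:
    """Return 'high' or 'low' for a known task type."""
    if task_type in HIGH_TIER_TASKS:
        return "high"
    if task_type in LOW_TIER_TASKS:
        return "low"
    # Refuse to guess: an unknown task should be classified by a human first.
    raise ValueError(f"unclassified task type: {task_type}")


if __name__ == "__main__":
    print(pick_tier("security_review"), pick_tier("syntax_fix"))
```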

Task Decomposition & In-Repo Architecture

Large prompts can break LLMs. If a prompt has too much text, the AI gets confused and wastes tokens. To stop this, I break every task into small, separate pieces so the AI only sees what it needs.

I store all architecture plans and tasks inside the code repository (for example, ./docs). This keeps the instructions very close to the code for the AI.

Every task I write uses this strict four-part structure:

Idea: The main business or tech goal. Why it matters: It proves the task is useful before I spend tokens on producing code for review.
Plan: The technical blueprint. Why it matters: It locks down the plan, keeps security high, and stops the AI from inventing bad solutions.
What Was Done: A short log of the work. Why it matters: It gives future AI tasks a quick summary, so the AI does not have to read every code file again.
Debt: A list of any technical shortcuts or "crutches" used to save time. Why it matters: Hidden debt ruins the project. Important: My custom Quality Gate checks this section. If it finds unapproved shortcuts in the code, it blocks the release completely.
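
A fixed structure like this is easy to generate from a small data class, so every task file starts from the same skeleton. The sketch below is one possible shape, not the author's tooling: the class name, field names, and Markdown layout are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class TaskDoc:
    """One task file with the four sections described above."""

    title: str
    idea: str
    plan: str
    what_was_done: str = ""
    debt: list[str] = field(default_factory=list)

    def to_markdown(self) -> str:
        # An empty debt list is stated explicitly so a reviewer can tell
        # "no debt" apart from "section forgotten".
        debt_lines = "\n".join(f"- {item}" for item in self.debt) or "- None"
        return (
            f"# {self.title}\n\n"
            f"## Idea\n{self.idea}\n\n"
            f"## Plan\n{self.plan}\n\n"
            f"## What Was Done\n{self.what_was_done or 'Not started.'}\n\n"
            f"## Debt\n{debt_lines}\n"
        )


if __name__ == "__main__":
    print(TaskDoc("Add retry to uploader", "Fewer failed uploads.", "Wrap the call in a retry loop.").to_markdown())
```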

System Instructions for the AI

To keep the AI agent aligned with the goals, I pass strict system instructions on every run. This way the model never has to guess my coding standards. Here are the core rules enforced:

No Crutches: Any "crutch" or technical shortcut must be approved by me. Then, the AI must document it as technical debt in the project files.
No Inventing Wheels: I try hard to avoid this. If a working approach already exists in another project, the AI reuses it.
Learn from the Past: When building a new service, the AI must check the old tech debt to avoid repeating past mistakes.
Simple Code Only: The code structure should just use standard classes. I avoid "genius-level" extreme one-line code tricks or overwhelming structures.
Maintainability First: A middle-level, part-time developer must be able to read and maintain the code.

The Core Workflow

Every feature goes through a step-by-step process. I try to keep security and simplicity as the main focus at each step.

1. Plan & The Plan Review

Using a High-Tier Model.

Plan: Define the code structure, the security rules, the cost limits, and so on. I make sure not to add to old technical debt.
Review: I look at the plan with a "fresh eye." I do not start coding until the plan is clear and the main code snippets are sketched out.

2. Code & Code Review

Using a Low- or Mid-Tier Model for code and a Mid- or High-Tier Model for review.

Code: Implement the code exactly as planned. Use clear classes and avoid complex, one-line code tricks. A middle-level developer must be able to maintain it easily.
Review: Make sure the code matches the rest of the project. I prefer another "person" to check it before I call it done.

3. Lint & Quality Gate

Using Free External Tools & A Custom Nanoservice.

Lint: I do not pay LLMs to fix missing spaces. I use free tools like autopep8, ruff, and pylint to save tokens.
Quality Gate: I built a simple nanoservice using the Vertex AI API. It checks the code changes against the main branch. It works like an automatic review from the CTO, CISO, and CFO. It checks every line for good architecture, proper security access, and cost impact before the code goes to production. Why is it so important? The Quality Gate is not overwhelmed by the full chat history inside the IDE. Its "fresh eye" often finds architectural and coding flaws that were missed by the IDE models, even after 6 to 9 rounds of review.

The Bottom Line

AI coding is not magic. In my experience, it requires a strict testing gate, smart model swapping, and simple design. By owning the process and letting the AI act as a typist, it is possible to ship secure code fast. I share this approach for an open discussion on how we can build better automation.