I have been thinking about this a lot lately.
As AI tools become part of daily development, generating code keeps getting easier, but reviewing that code properly matters as much as ever.
I want to learn how other developers handle this.
When AI generates code for you, what review process do you follow before you keep it?
I am interested in questions like:
- Do you review everything line by line?
- Do you trust AI for boilerplate only, or also for business logic?
- What do you check first: correctness, security, performance, readability, or architecture?
- Do you use a checklist?
- How do you catch subtle bugs or bad assumptions?
My rough thinking is something like this:
- understand the code fully before keeping it
- verify logic against requirements
- test happy path and edge cases
- check security and performance concerns
- refactor to match project standards
- never merge code only because “it works”
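Parts of the checklist above can be turned into executable checks. A minimal sketch in Python, where `apply_discount` is a hypothetical stand-in for an AI-written function under review:

```python
# Minimal sketch: verifying AI-generated code against requirements
# before keeping it. `apply_discount` is a hypothetical example of
# an AI-written function, not code from any real project.

def apply_discount(price: float, percent: float) -> float:
    """AI-generated: apply a percentage discount to a price."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)

# Happy path: matches the stated requirement.
assert apply_discount(100.0, 20) == 80.0

# Edge cases: boundaries the AI may not have considered.
assert apply_discount(100.0, 0) == 100.0     # no discount
assert apply_discount(100.0, 100) == 0.0     # full discount

# Bad assumptions: invalid input should fail loudly, not silently.
try:
    apply_discount(100.0, 150)
    raise AssertionError("expected ValueError")
except ValueError:
    pass
```

Checks like these cover "verify logic against requirements" and "test happy path and edge cases"; the other items (security, architecture, readability) still need a human read.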
I would really like to hear practical workflows from real developers and teams.
What is your process for reviewing AI-generated code?
Top comments (13)
I find a multi model review that runs in a cycle provides meaningfully better results. E.g. let's say claude wrote the code. I have gemini/gpt/claude all review it independently and then synthesize the results. I keep reviewing until the review output is no longer helpful / fixing issues. My human judgement is only needed to determine at what point the review output is counterproductive.
That is a smart approach. Cross-reviewing AI-generated code with multiple models seems like a practical way to reduce blind spots and catch different kinds of issues.
I also liked your point that human judgment is still essential, especially in deciding when more review stops adding value.
Would you be open to sharing your actual workflow or template for this? For example, how you structure the review cycle, what you ask each model to check, and what signals tell you it is time to stop iterating.
I think that would be really useful for people trying to turn this into a repeatable process instead of doing it ad hoc.
Pretty straightforward; here is one cycle:
GitHub Repo
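One cycle of the multi-model review described above might look roughly like this. `ask_model` is a hypothetical stand-in for whatever API clients you actually use (Claude, Gemini, GPT), not code from the linked repo:

```python
# Sketch of one multi-model review cycle: independent reviews,
# then a synthesis pass. `ask_model` is a hypothetical placeholder
# for a real model API call.

def ask_model(model: str, prompt: str) -> str:
    # Placeholder: in practice this would call the model's API.
    return f"[{model} review of: {prompt[:30]}...]"

def review_cycle(code: str, reviewers=("claude", "gemini", "gpt")) -> str:
    # 1. Each model reviews the code independently.
    reviews = [ask_model(m, f"Review this code:\n{code}") for m in reviewers]
    # 2. One model synthesizes the independent reviews into a report.
    synthesis_prompt = "Synthesize these reviews:\n" + "\n".join(reviews)
    return ask_model(reviewers[0], synthesis_prompt)

report = review_cycle("def add(a, b): return a + b")
```

You would repeat the cycle until the synthesized report stops surfacing real issues; as noted above, that stopping decision stays with the human.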
Thanks for sharing this.
I will check it out. This seems interesting
A surprising insight we've observed is that AI-generated code often fails in code reviews not because of logic errors, but due to inadequate variable naming and documentation. In my experience with enterprise teams, a simple framework that helps is the "3C" approach: Clarity, Consistency, and Context. Make sure your AI-generated code adheres to these principles to ensure maintainability and ease of understanding for your team. - Ali Muwwakkil (ali-muwwakkil on LinkedIn)
That is a really strong point. I think this gets missed a lot because people usually focus first on whether the code “works,” but in real teams clarity and maintainability matter just as much.
I like your 3C approach: Clarity, Consistency, and Context. AI-generated code can often look correct at first glance, but weak naming and missing context make it much harder to review, debug, and extend later.
Have you found any practical way to enforce the 3Cs during review? For example, do you use a checklist, PR template, or internal standards for naming and documentation?
I paste the AI code into the IDE and run it. If it doesn't work, I regenerate it. If it works but is missing a feature, I ask for it to be completed in another chat. Once it works and is complete, I ask for explanations of the code, for errors to be identified, and for improvements to be suggested, again in a new chat.
I use a model-driven kanban board with human verification (think QA). It already de-slops as a step and then refactors based on embeddings that codify ideas from Robert Martin, the Unix philosophy, and functional programming paradigms, and it does a damn fine job. I review code by hand once in a while to ensure my systems work.
State machines powered by CLI calls instead of letting the model do things. I turn the model into just a "voice in the head" and let my agent do the rest (a deterministic gate).
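A deterministic gate like the one described above might look like this: the model only proposes, while a simple state machine driven by CLI exit codes decides what happens next. The commands below are illustrative stand-ins, not any real project's tooling:

```python
# Sketch: a state machine where transitions are decided by CLI exit
# codes, not by the model. Commands here are illustrative; plug in
# your own lint/test commands.
import subprocess
import sys

def run_gates(states: dict, start: str) -> str:
    """Walk the state machine; each state runs a command and only
    advances on exit code 0."""
    state = start
    while state != "done":
        cmd, next_state = states[state]
        if subprocess.run(cmd, capture_output=True).returncode != 0:
            return f"blocked at {state}"  # gate failed; model gets the logs
        state = next_state
    return "done"

# Example wiring (stand-in commands that always pass):
gates = {
    "lint": ([sys.executable, "-c", "pass"], "test"),  # stand-in for a linter
    "test": ([sys.executable, "-c", "pass"], "done"),  # stand-in for a test run
}
result = run_gates(gates, "lint")
```

The model never gets to decide whether a gate passed; it only sees the output of a failed gate and proposes a fix.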
I have a simple workflow. After every epoch with the AI, I ask it: "What shortcuts did you take? What quickfixes did you do? What did you defer? What decisions did you make without checking in with me first?"
So far, Claude has been responding really well with "Here's an honest accounting" reply outlining all those things. After it addresses those, then I ask that question again. And so on, until things are acceptable to me.
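The ask-and-re-ask loop above could be scripted. In this sketch, `ask_ai` is a hypothetical stand-in for the chat interface and `is_acceptable` stands in for the human judgment call:

```python
# Sketch of the "honest accounting" audit loop described above.
# `ask_ai` and `is_acceptable` are hypothetical stand-ins.

AUDIT_PROMPT = (
    "What shortcuts did you take? What quickfixes did you do? "
    "What did you defer? What decisions did you make without "
    "checking in with me first?"
)

def audit_loop(ask_ai, is_acceptable, max_rounds=5):
    """Ask the accounting question, have the AI address the items,
    and re-ask until the accounting is acceptable."""
    for round_no in range(1, max_rounds + 1):
        accounting = ask_ai(AUDIT_PROMPT)
        if is_acceptable(accounting):
            return round_no           # rounds it took to converge
        ask_ai("Please address those items, then we will re-audit.")
    return max_rounds                 # cap so the loop always ends
```

The `max_rounds` cap is a safeguard the original comment leaves implicit: in practice "until things are acceptable to me" is the real stopping condition.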
I’m currently working on an early-stage project involving autonomous transaction flows between systems (Mindchain).
One thing that became clear is that AI-generated code needs to be treated as “untrusted by default”.
Even in a simple MVP, I found it useful to:
Curious to see how others are approaching this, especially as systems become more autonomous.
It looks like this; verification can be automated or require a human: