"Will AI coding assistants replace AI engineers in 5 years?" ⬇️

My friend Drazen Zaric asked me this question over coffee, and it got me thinking about the future of AI engineering—and every other job. Here's what I learned from 10+ years in AI/ML:

> 𝗟𝗟𝗠𝘀 𝗮𝗹𝗼𝗻𝗲 𝗰𝗮𝗻'𝘁 𝘀𝗼𝗹𝘃𝗲 𝗰𝗼𝗺𝗽𝗹𝗲𝘅 𝗽𝗿𝗼𝗯𝗹𝗲𝗺𝘀.

They need the right context and expert human guidance. When I use Cursor for Python (my expertise), I code 10x faster. But with Rust (where I'm less expert)? It actually slows me down.

> 𝗧𝗵𝗲 𝗿𝗲𝗮𝗹 𝗴𝗮𝗺𝗲 𝗶𝘀𝗻'𝘁 (𝗮𝗻𝗱 𝗻𝗲𝘃𝗲𝗿 𝘄𝗮𝘀) 𝗮𝘁 𝘁𝗵𝗲 𝗰𝗼𝗱𝗶𝗻𝗴 𝗹𝗲𝘃𝗲𝗹 𝗮𝗻𝘆𝗺𝗼𝗿𝗲

It's about knowing WHAT to build and HOW systems work end-to-end. Companies need people who can:

• Design the right solution architecture
• Provide high-quality context to AI tools
• Filter and refine AI outputs effectively
• Understand the full stack from infrastructure to business logic

> 𝗧𝗵𝗲 𝘄𝗶𝗻𝗻𝗲𝗿𝘀 𝘄𝗼𝗻'𝘁 𝗯𝗲 𝘁𝗵𝗼𝘀𝗲 𝘄𝗮𝗶𝘁𝗶𝗻𝗴 𝗳𝗼𝗿 𝗔𝗜 𝘁𝗼 𝗱𝗼 𝗲𝘃𝗲𝗿𝘆𝘁𝗵𝗶𝗻𝗴.

They'll be the experts who can accelerate their work 10x by combining deep system understanding with AI assistance.

10 years ago, knowing Python was enough for a data science job. Today, that's just the entry ticket. The value is in understanding how to orchestrate complex systems—from Kubernetes clusters to agentic workflows.

> 𝗕𝗼𝘁𝘁𝗼𝗺 𝗹𝗶𝗻𝗲

Human expertise + LLMs = acceleration. Human expertise alone = slow progress. LLMs alone = endless loops and compounding errors.

What's your experience using AI tools in your domain of expertise vs. areas where you're still learning?

---

Follow Pau Labarta Bajo for more thoughtful posts
Why Use Expert-in-the-Loop for LLM Coding
Summary
Expert-in-the-loop for LLM (large language model) coding refers to having experienced programmers review, guide, and collaborate with AI coding assistants to produce reliable, high-quality code. While LLMs can generate and automate code quickly, they often need human expertise to provide context, catch errors, and refine outputs.
- Guide the AI: Bring your programming knowledge to steer LLMs by asking precise questions and giving clear instructions, which leads to better and more accurate code.
- Review and refine: Always check, adjust, and improve code generated by LLMs, as experts can spot subtle mistakes or misunderstandings that an AI might miss.
- Balance automation and judgment: Use your experience to decide when to rely on AI for repetitive tasks and when complex decisions require deeper human insight and intervention.
-
It's disappointing to me how many people are downplaying programming expertise and education with the rise of LLMs.

A simple thought experiment: we've had the ability to outsource software engineering since the early 2000s by offshoring the work. Ask yourself this simple question. With offshoring, would the project be more successful if the onsite lead knew how to code well? Or would it make no difference?

Sure, large sections of my codebase are written by Claude Code, but the reason I'm able to use it well is that I can still code relatively well without it. I still end up refactoring large chunks of what comes out of the LLMs, sometimes to fix functionality, other times to condense the codebase so that when the LLMs iterate or add new features, they are more effective.

My expertise as a programmer comes into play in the precision of how I prompt the LLM, in how I steer it as the codebase becomes more complex (by prompting or by manual refactoring), and in how I make suggestions on how to make the code faster. There is a huge difference between asking the LLM to "make your code faster" and telling it to cache large model objects with an LRU cache, both in the number of tokens spent and in the code's performance afterwards.

And for the students out there: please don't use LLMs for all your coursework. Use them for some, but please don't use them in your data structures and algorithms course. The point of the coursework isn't to get an A, but to train yourself to see complex problems and break them down into their constituent pieces. In much the same way, I make my kids do math by hand even though graphing calculators with computer algebra solvers exist. (Only after they reach a point of expertise do the advanced calculators come out.)

At the end of the day, education is about learning the fundamentals so you can use the tools better.
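The LRU-cache point above can be made concrete. A minimal Python sketch using the standard library's `functools.lru_cache`; `load_model` here is a hypothetical stand-in for whatever expensive model construction the real codebase repeats:

```python
from functools import lru_cache

# Hypothetical expensive loader; in practice this might deserialize
# a large model from disk or download weights.
@lru_cache(maxsize=4)
def load_model(name: str) -> dict:
    # Stand-in for the expensive work.
    return {"name": name, "weights": [0.0] * 1_000}

# The first call pays the construction cost; repeats with the same
# argument return the exact same cached object.
m1 = load_model("encoder")
m2 = load_model("encoder")
assert m1 is m2  # served from cache, no rebuild
```

Saying "wrap `load_model` in `lru_cache`" is exactly the kind of precise instruction the post contrasts with a vague "make it faster" prompt: it costs a handful of tokens and names the mechanism directly.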
-
I use Claude Code on production codebases enough that I hit the Max limits. "Vibe coding" does not describe my work. A deep understanding of software engineering and computer systems is required to make the calls that keep a complex codebase healthy and keep my company’s engineering org able to maintain our production apps and services.

LLMs get many details right, but it is also the norm for a few things to be wrong or not aligned with how we think about software engineering. It takes an expert eye to spot which 1 out of 10 outputs needs rework, or is simply wrong. A novice who trusts the LLM’s capabilities more than their own judgment will believe all 10.

This is an excerpt from a memo on AI agents I shared with our CRO Joe Ryan:

> LLMs accept imprecision. You can leave out details of your problem and solution and LLMs will fill in the blanks. They’ll often be wrong, but you will get something working end to end, which is valuable to iterate on. But you need to be able to spot gaps and mistakes in your prompts because the LLM will not reliably identify them.

> LLMs create imprecision. You need to be able to spot mistakes in the LLM’s outputs, and the LLM cannot always check its own work. You need to already have a vision for the end state and the direction it lies in, and use the LLM to automate getting there faster.

> Experts who understand a problem and are looking to accelerate solving it will be amplified in positive directions, scaling themselves. Novices who trust LLMs will be amplified in negative directions, becoming confident in wrong solutions.

The frontier of what it means to be an expert will change. Experts will need to know how to apply AI and the boundaries of its capabilities. An expert software engineer will need the dexterity to wield a coding agent well. That dexterity will come from experience, intuition, and talent. A senior skill will be getting codebases, teams, and companies to work productively with agents.
It has always been a senior skill to set organizations up for success and then achieve it.

Typing source code is mostly dead. We’ll still edit a few lines here and there. Reading, and more importantly understanding, source code is very much alive. We’ll do more of this as code is written faster.

The art and science of software engineering are blossoming again. This is not a renaissance; software engineering was never dead and is not being reborn. “Vibe coding” is different: it is something new being born. The dominant change, though, is that the industry and discipline of software engineering are evolving more than they have since the internet, if not since the beginning.
-
LLMs make a TON of mistakes, but with 1. good documentation, 2. good code review, and 3. the best models available, you can flawlessly accomplish very large changes, FASTER and BETTER than a human. Here’s a real example.

At Formation, we have Session Studio: our live session environment. It’s a real-time system with video, audio, chat, reactions, slides, hand-raising, polls, collaborative coding pads… the works. We recently changed the definitions of participant roles. It was a deep permission and behavior refactor across a complex, real-time surface area with dozens of flags and conditional checks. The kind of change that’s easy to partially ship and quietly break production.

Here’s how I used AI to pull it off:

1. Full System Audit: Codex generated a ~1,300-line audit of the entire current state, covering every permission path, flag, edge case, and role interaction.
2. Proposed Redesign: Codex then wrote a second document detailing every change required to support the new role definitions.
3. Engineering Plan: Using "plan mode" first, Claude merged both documents into a structured engineering spec with clear implementation phases.
4. "Adversarial" Iteration: Claude and Codex iterated on the docs, flagging inconsistencies, ambiguities, and decisions that required human judgment. I acted as editor-in-chief, resolving tradeoffs and clarifying intent.
5. Phased Execution (8 phases): for each phase, Claude implemented, Codex reviewed, Claude fixed… repeat until clean, then a final Claude review.

Total time: ~24 hours of async back-and-forth.

The key insight: LLMs are unreliable in isolation. They’re extremely powerful inside a system of documentation, review, and phased execution.
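The implement/review/fix loop in step 5 can be sketched as a small control structure. This is a schematic only: `implement`, `review`, and `fix` are hypothetical stand-ins for the real Claude/Codex calls (here they just track a list of open issues), not the author's actual tooling.

```python
# Schematic of the phase loop: implement, review, fix, repeat until clean.
# The three helpers below are toy stand-ins for agent calls.

def implement(spec: dict) -> dict:
    # "Implementation" starts with whatever gaps the spec is known to have.
    return {"spec": spec, "issues": list(spec.get("known_gaps", []))}

def review(change: dict) -> list:
    # "Reviewer" surfaces the issues still present in the change.
    return list(change["issues"])

def fix(change: dict, issues: list) -> dict:
    # "Fixer" resolves the issues the reviewer flagged.
    remaining = [i for i in change["issues"] if i not in issues]
    return {**change, "issues": remaining}

def run_phase(spec: dict, max_rounds: int = 5) -> dict:
    change = implement(spec)
    for _ in range(max_rounds):
        issues = review(change)
        if not issues:       # clean: reviewer found nothing, phase is done
            return change
        change = fix(change, issues)
    # Mirrors the post's point: loops that don't converge need a human.
    raise RuntimeError("phase did not converge; needs human judgment")

result = run_phase({"known_gaps": ["missing permission check"]})
```

The `max_rounds` cap is the important design choice: without a bound and a human escalation path, two agents reviewing each other can loop indefinitely, which is the "unreliable in isolation" failure mode the post describes.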
-
I started by asking AI to do everything. Six months later, 65% of my agent’s workflow nodes run as non-AI code.

The first version was fully agentic: every task went to an LLM. LLMs would confidently progress through tasks, though not always accurately. So I added tools to constrain what the LLM could call and limit its ability to deviate. I added a Discovery tool to help the AI find those tools. Better, but not enough.

Then I found Stripe’s minion architecture. Their insight: deterministic code handles the predictable; LLMs tackle the ambiguous.

I implemented blueprints: workflow charts written in code. Each blueprint specifies nodes, transitions between them, trigger conditions for matching tasks, & explicit error handling. This differs from skills or prompts. A skill tells the LLM what to do. A blueprint tells the system when to involve the LLM at all. Each blueprint is a directed graph of nodes. Nodes come in two types: deterministic (code) & agentic (LLM). Transitions between nodes can branch based on conditions.

Deal pipeline updates, chat messages, & email routing account for 29% of workflows, all without a single LLM call. Company research, newsletter processing, & person research need the LLM for extraction & synthesis only: another 36%. These workflows run 67-91% as code. The LLM sees only what it needs: a chunk of text to summarize, a list to categorize, processed in one to three turns with constrained tools. Blog posts, document analysis, & bug fixes are genuinely hybrid: 21% of workflows, where multiple LLM calls iterate toward quality. Only 14% remain fully agentic: data transforms & error investigations. These tend to be coding tasks rather than evaluating a decision point in a workflow; the LLM needs freedom to explore.

AI started by doing everything. Now it handles routing, exceptions, research, planning, & coding. The rest runs without it. Is AI doing less? Yes. Is the system doing more? Also yes.
The blueprints, the tools, the skills might be temporary scaffolding. With each new model release, capabilities expand. Tasks that required deterministic code six months ago might not tomorrow.
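The blueprint idea above (a directed graph mixing deterministic and agentic nodes, with condition-based transitions) can be sketched in a few lines of Python. All names and structure here are assumptions for illustration, not Stripe's or the author's actual implementation, and the "agent" node just returns a placeholder instead of calling an LLM.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Node:
    kind: str                       # "code" (deterministic) or "agent" (LLM)
    run: Callable[[dict], dict]     # transforms the task state
    # (condition, next-node name) pairs; first matching condition wins,
    # no match ends the workflow. This is the explicit branching.
    transitions: list = field(default_factory=list)

def execute(blueprint: dict, start: str, state: dict) -> dict:
    name = start
    while name is not None:
        node = blueprint[name]
        state = node.run(state)
        name = next((nxt for cond, nxt in node.transitions if cond(state)), None)
    return state

# Toy blueprint: deterministic routing decides whether the LLM is
# involved at all; only ambiguous tasks reach the "agent" node.
blueprint = {
    "route": Node("code", lambda s: {**s, "ambiguous": "?" in s["text"]},
                  [(lambda s: s["ambiguous"], "summarize"),
                   (lambda s: True, "done")]),
    "summarize": Node("agent", lambda s: {**s, "summary": "LLM output here"},
                      [(lambda s: True, "done")]),
    "done": Node("code", lambda s: {**s, "done": True}),
}

out = execute(blueprint, "route", {"text": "update deal pipeline"})
```

In this run the unambiguous task flows route → done without ever touching the agent node, which is the post's core distinction: a skill shapes what the LLM does, a blueprint decides whether it runs.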