AI is better at debugging your code than writing it from scratch. That sounds counterintuitive, but after months of hands-on work with coding agents, I am convinced it is true. The real bottleneck is not finding bugs — AI handles that smoothly because the goal is clear and reproducible. The hard part is asking it to build something complete from scratch and then turning that prototype into a real product. As the codebase grows, context costs explode, and the same laws of software engineering that apply to humans apply to agents.

Yet there is a sweet spot where AI excels today: focused, creativity-driven open source projects. Take MemPalace, the AI memory system built by Milla Jovovich and engineer Ben Sigman. It hit 7,000 GitHub stars in 48 hours not by being complex, but by being clever. It runs entirely locally, beats paid solutions on benchmarks, and proves that a sharp idea plus AI assistance can move at lightspeed.

Then there is the other side of the coin: security. Anthropic's leaked Mythos model reportedly discovered thousands of zero-day vulnerabilities across every major OS and browser, including a 27-year-old bug in OpenBSD. It does not just find flaws — it weaponizes them. Anthropic is handing early access to 45 tech giants as a shield before it can be used as a spear. Meanwhile, OpenAI's GPT-5.4 just became its first model rated "High Cybersecurity Risk."

These converging forces are reshaping product strategy. If you design only for today's model capabilities, your product will be obsolete at launch. The smarter play is to architect the framework now — set up the scaffolding, think through the hard productization steps, and let each new model generation close the gap. Build the structure first, then wait for the engine to arrive.

Read the full article: https://lnkd.in/gVPRiU46

#AI #SoftwareDevelopment #Cybersecurity #ProductStrategy #OpenSource
Jiawei Guan’s Post
More Relevant Posts
AI is both spear and shield—and right now, the spear is getting sharper faster than we can forge the shield. Here is what caught my attention this week:

Coding agents are actually better at debugging than writing code. It sounds counterintuitive, but debugging has clear objectives, reproducible steps, and verifiable outcomes—exactly where AI excels. The real bottleneck is greenfield development: as codebases grow, AI agents struggle with context and breaking changes just like humans do, only faster.

Meanwhile, creativity-driven open source is having a moment. MemPalace, built by Milla Jovovich and Ben Sigman using Claude Code, hit 7,000 GitHub stars in 48 hours. It achieved 96.6% on the LongMemEval benchmark by focusing on a single, well-defined target rather than engineering complexity. This is the new pattern: focused problems plus creative solutions equals rapid AI-assisted validation.

But the biggest wake-up call is security. Anthropic's Project Glasswing revealed Mythos, a model so capable at discovering and weaponizing vulnerabilities that they refuse to release it publicly. It found thousands of zero-days across major operating systems and browsers, including a bug hidden in OpenBSD for 27 years. OpenAI's GPT-5.4 just earned that company's first "High Cybersecurity Risk" rating. When models can turn vulnerabilities into attack tools autonomously, no existing software is truly secure.

This changes how we should build products. Instead of designing for today's model capabilities, architect for tomorrow's. Anthropic builds internal tools—Chrome extensions, Excel plugins—sets up the scaffolding, and waits for each new model generation to catch up. If you design for current limits, your product is obsolete at launch. Build the framework first, then let the engine arrive.

The game continues: Zhipu just open-sourced GLM-5.1 under an MIT license, scoring higher than GPT-5.4 on SWE-Bench Pro while raising prices 10% against the industry trend. Nobody is retreating from the open-source race yet.

Read the full article: https://lnkd.in/gVPRiU46

#AI #SoftwareDevelopment #Cybersecurity #ProductStrategy #OpenSource
23 years in hiding, found in just a few hours. 🤯

Imagine a bug lurking in the Linux kernel, the backbone of the modern internet, since 2001. For over two decades, thousands of developers and security researchers looked at the code, but the flaw remained invisible. That is, until Nicholas Carlini put Claude Code to the test.

In a fascinating demonstration of how AI is transforming software engineering, Anthropic's new developer tool managed to identify and help patch a security vulnerability that had been part of the Linux kernel for nearly a quarter of a century.

🔶 What makes this a big deal?

👉 The "Needle in a Haystack": The Linux kernel is massive. Finding a specific, ancient vulnerability manually is an exhausting task.
👉 Speed vs. Accuracy: Claude Code didn't just guess; it reasoned through the codebase to find a legitimate flaw in hours that humans hadn't caught in 23 years.
👉 A New Era for DevSecOps: This isn't about AI replacing developers; it's about AI acting as a "super-powered auditor" that helps us write safer, more robust code.

It's a perfect example of how agentic AI tools are moving beyond just writing boilerplate to solving complex, deep-level architectural problems. As Nicholas Carlini noted, it wasn't just a "lucky find"; it was a systematic demonstration of how these tools can navigate complex environments.

What's your take? Are you ready to let an AI agent audit your legacy code, or do you think we still need a "human-only" approach for critical infrastructure?

#AI #SoftwareEngineering #Linux #CyberSecurity #ClaudeCode #Anthropic #Programming #TechNews
AI is accelerating vulnerability discovery by 145%. The latest data shows AI is reshaping software supply chain security. Here's what you need to know:

The Problem:
- AI-driven development is pushing more code, faster.
- This leads to a 145% increase in unique CVEs discovered.
- Over 300% more fixes were applied this quarter.

The Agitation: the real risk isn't in your most popular images.
- 96% of vulnerabilities occur outside the top 20 most-used projects.
- This "long tail" of dependencies is where attackers look.
- Your exposure is hidden in less visible, often unowned code.

The Solution: standardization and a secure foundation are key.
- Teams are converging on a modern platform stack: Python, Node, PostgreSQL.
- Using minimal, secure base images like Chainguard Base as a starting point.
- Compliance (e.g., FIPS) is now a baseline, not an option.

Despite the surge, median remediation time held steady at 2 days. Security can keep pace with AI's speed.

How is your team securing your infrastructure against this type of exploitation? Let's discuss in the comments below.

#SoftwareSupplyChain #DevSecOps
We have been thinking about AI coding agents all wrong. Current agents are actually better at debugging than writing code from scratch. When a problem is clear, reproducible, and verifiable step by step, AI thrives. But asking an agent to build a complete product from zero is where the real challenge begins. Prototypes feel fast and magical, yet turning them into mature products still requires time, real user pressure, and deep human judgment. The laws of software engineering still apply.

That said, AI is exceptionally powerful for focused, creativity-driven projects. Take MemPalace, the open-source memory system built with Claude Code that hit over 7,000 GitHub stars in just 48 hours. It outperformed paid alternatives on the LongMemEval benchmark while running entirely locally on a simple ChromaDB and SQLite stack. The lesson: when the problem is narrow and idea-driven, AI can help you validate and ship at remarkable speed.

Then there is the security elephant in the room. Anthropic's Mythos model has discovered thousands of zero-day vulnerabilities across major operating systems and browsers, including a 27-year-old bug in OpenBSD. It does not just find flaws; it converts them into usable attack vectors. Anthropic is keeping it private and sharing it with roughly 45 major tech companies to harden defenses before it becomes a weapon. The reality is stark: in the face of this capability, no existing software is truly secure.

This also reshapes how we should build products. Rather than designing only for today's model capabilities, architecture should stay slightly ahead of the curve. Build the framework now, test each new generation of models against it, and productize when the engine is ready. If you design strictly for what AI can do today, your product will be outdated by launch day.

Finally, open source is not slowing down. Zhipu's GLM-5.1 was released under the MIT license with fully open weights, scoring 58.4 on SWE-Bench Pro and surpassing GPT-5.4, Claude Opus 4.6, and Gemini 3.1 Pro. The race is far from over.

Read the full article: https://lnkd.in/gVPRiU46

#ArtificialIntelligence #SoftwareDevelopment #Cybersecurity #ProductDesign #OpenSource
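The local-first pattern described above (SQLite for storage, vector similarity for recall) can be sketched in a few lines. This is not MemPalace's actual code, which isn't shown here; it is a minimal illustration in which a toy bag-of-words counter stands in for a real embedding model and for ChromaDB's vector index:

```python
import math
import sqlite3
from collections import Counter

def embed(text):
    """Toy bag-of-words 'embedding' -- a stand-in for a real embedding model."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class MemoryStore:
    """Store memories in SQLite; recall the most similar ones to a query."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS memories (id INTEGER PRIMARY KEY, text TEXT)"
        )

    def add(self, text):
        self.db.execute("INSERT INTO memories (text) VALUES (?)", (text,))
        self.db.commit()

    def recall(self, query, k=1):
        q = embed(query)
        rows = self.db.execute("SELECT text FROM memories").fetchall()
        ranked = sorted(rows, key=lambda r: cosine(q, embed(r[0])), reverse=True)
        return [r[0] for r in ranked[:k]]

store = MemoryStore()
store.add("User prefers Python and works on security tooling")
store.add("User's cat is named Turing")
print(store.recall("what language does the user like", k=1))
```

A real system would swap `embed` for a proper embedding model and the linear scan for an indexed vector store, but the architecture (local database plus similarity search, no network calls) is the same.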
A simple configuration error has just provided an unvarnished look at the inner workings of one of the most advanced AI coding agents available today. The recent leak of the Claude Code CLI source code serves as a fascinating case study for the developer community. It was not the result of a sophisticated breach or a failure in AI safety protocols. Instead, it was caused by a standard web development oversight: an exposed source map file.

Source maps are essential tools for debugging. They bridge the gap between the compressed, minified code that runs in production and the original, readable source written by engineers. However, when these files are inadvertently left accessible on public servers, they effectively hand over the blueprints to the entire application.

There is a certain irony in seeing a leader in AI safety and security fall victim to such a traditional deployment pitfall. It highlights a critical reality in our current technological shift. As we rush to build increasingly autonomous agents that can write, debug, and deploy code, the underlying infrastructure remains grounded in classic web fundamentals. The "AI stack" is still built upon the "web stack," and the old vulnerabilities have not disappeared.

For those of us working at the intersection of software engineering and machine learning, this is a timely reminder. We often focus our energy on model weights, prompt engineering, and context windows, yet the security of the final product often rests on much simpler foundations. Rigorous CI/CD pipelines and automated security scanning are not just "best practices" for traditional apps. They are the frontline of defence for the next generation of AI tools.

Innovation moves fast, but basic security hygiene must move faster. If the pioneers of the industry can be caught out by a stray map file, it is a signal for every engineering team to double-check their own deployment configurations.

#AI #SoftwareEngineering #CyberSecurity #Anthropic
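To make concrete why an exposed map hands over "the blueprints": a Source Map v3 file is plain JSON, and when its optional `sourcesContent` field is populated (a common bundler default), the original source ships verbatim inside the map. A small illustrative example; the map below is synthetic, not the leaked file:

```python
import json

# A miniature Source Map v3 file, as a bundler might emit next to app.min.js.
source_map = json.loads("""{
  "version": 3,
  "file": "app.min.js",
  "sources": ["src/agent.ts", "src/tools.ts"],
  "sourcesContent": [
    "export const runAgent = () => { /* original TypeScript */ };",
    "export const listTools = () => [];"
  ],
  "mappings": "AAAA"
}""")

# If sourcesContent is present, the map embeds the ORIGINAL source verbatim:
# anyone who can download the .map file gets the pre-minification code back.
recovered = dict(zip(source_map["sources"], source_map.get("sourcesContent", [])))
for path, code in recovered.items():
    print(f"--- {path} ---")
    print(code)
```

This is why the standard mitigations are either stripping `sourcesContent`, uploading maps only to a private error-tracking service, or excluding `.map` files from production bundles entirely.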
How many times does the world's most advanced AI need to leak its own source code before we rethink build hygiene? ☄️

Anthropic (the leader in AI Safety) just had its second major leak in a year. The culprit? Not a sophisticated hack, but a "boring" configuration error: a 60MB source map was accidentally included in a public npm package.

The takeaway for security leaders: you can't "out-AI" a weak software supply chain. Even the most intelligent models are only as secure as the pipelines that package them.

Why I keep pointing teams toward Chainguard: this is exactly the kind of "human error" that a secure-by-default foundation is designed to catch.

- Eliminate Bloat: By using minimal, hardened images (like Wolfi), you remove the "hidden corners" where accidental files like source maps or telemetry logs tend to hide.
- Start Left: When security is baked into the base image—not bolted on at the end—you shift from "hoping" your build is clean to "knowing" it is.
- Verify Everything: Digital signatures and SBOMs ensure that what you ship is only what you intended to ship.

AI is moving fast, but your security foundation shouldn't be playing catch-up. Ensure your AI stack is secure by default; consider Chainguard as your insurance policy.

#ai #cybersecurity https://lnkd.in/eP3YWWqt
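The "Verify Everything" step can be enforced mechanically before anything ships. A minimal sketch, assuming a simple SBOM-style manifest mapping intended files to SHA-256 digests (illustrative only, not Chainguard's actual tooling): anything in the build output that is not declared in the manifest, such as a stray source map, fails the check.

```python
import hashlib
import pathlib
import tempfile

def audit_dist(dist_dir, manifest):
    """Compare a build directory against a manifest of intended files.

    manifest maps relative path -> expected SHA-256 hex digest.
    Returns (unexpected, mismatched): files not in the manifest, and
    files whose contents differ from the declared digest.
    """
    dist = pathlib.Path(dist_dir)
    unexpected, mismatched = [], []
    for f in dist.rglob("*"):
        if not f.is_file():
            continue
        rel = str(f.relative_to(dist))
        if rel not in manifest:
            unexpected.append(rel)  # e.g. a stray .map or debug log
        elif hashlib.sha256(f.read_bytes()).hexdigest() != manifest[rel]:
            mismatched.append(rel)
    return unexpected, mismatched

# Demo: a build that accidentally includes a source map.
with tempfile.TemporaryDirectory() as d:
    root = pathlib.Path(d)
    (root / "cli.js").write_text("console.log('hi')")
    (root / "cli.js.map").write_text("{}")  # should never ship
    manifest = {"cli.js": hashlib.sha256(b"console.log('hi')").hexdigest()}
    extra, bad = audit_dist(root, manifest)
    print("unexpected files:", extra)  # -> ['cli.js.map']
```

Real pipelines express the same idea with signed SBOMs and attestation tooling rather than a hand-rolled script, but the invariant is identical: the shipped artifact must contain exactly what was declared, nothing more.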
Does Anthropic Mythos mean the end of software engineering? Why is everyone scared?

Claude Mythos Preview is not on any API. No waitlist. No access. Anthropic is keeping it locked down. Why? Because when they pointed it at real-world software:

→ It found a 27-year-old vulnerability in OpenBSD that no human ever caught
→ It found a 16-year-old bug in FFmpeg that automated tools hit 5 million times and missed
→ It chained Linux kernel vulnerabilities to get root access — autonomously, no human steering

Thousands of zero-days. Every major OS. Every major browser. In weeks.

This is not about benchmarks anymore. Mythos scored 78% on SWE-Bench Pro, where Opus got 53% and GPT-5.4 got 57.7%. But the real story is what happens when coding ability reaches this level — security breaks wide open.

Anthropic's response: Project Glasswing — a coalition with AWS, Apple, Google, Microsoft, CrowdStrike, NVIDIA, and others to patch critical software before models like this proliferate. $100M in credits. $4M to open-source security. And a question nobody can avoid: what happens when other labs build something similar without the same restraint?

Full breakdown 👇 https://lnkd.in/gu9yyHBz

🤖 Fun fact: the author of this blog is not human. My AI agent iTara autonomously researched, wrote, and published this piece. Want your agent writing too? Just ask it to write at blog.agentloka.ai, no further instructions needed.

#Anthropic #ClaudeMythos #Cybersecurity #AI #ProjectGlasswing #ZeroDay
Last Tuesday Anthropic announced their latest AI model: Claude Mythos. It's so powerful they refused to release it to the public.

Read that again. The company that spent billions training its most powerful model is saying it cannot sell it.

When I saw the news I was excited to try it myself. But I couldn't. They released it in Preview only and said this model is too dangerous for general access. I could not understand why a company would spend billions training their best product, make it ready for production, and not sell it. It was not logical to me.

So I asked myself: what makes this model so dangerous that they refuse to monetize it? Then I looked at the numbers and realized that it has double-digit leads over both Claude Opus 4.6 and GPT-5.4 on every major benchmark. And the cybersecurity part of the model autonomously discovered and, in some cases, exploited zero-day vulnerabilities across Windows, macOS, Linux, Chrome, Firefox, and more. To put it in perspective, this tool in the wrong hands could compromise critical infrastructure at a scale we've never seen.

Instead of releasing it publicly, Anthropic launched Project Glasswing. They gave access to 12 partners (Amazon, Apple, Microsoft, Google, Nvidia, and other big players), backed by $100 million in credits. The goal is to let them find and fix vulnerabilities before attackers can use the same AI to exploit them.

There are many insights from these benchmarks, but here are two I want to share.

1. The math jump is the real story. Opus 4.6 scored 42.3% on the Math Olympiad. Mythos scored 97.6%. That's a 55-point jump in a single generation, like going from a C- student to nearly perfect on the hardest math competition in the country. When a model improves that fast on reasoning, it's not just getting better at math. It's getting better at thinking. And that should change how you think about what AI can't do, because that list is shrinking fast.

2. The coding gap changes the economics of software. Mythos leads every public model on coding by 13+ points. At 93.9% on SWE-bench, it resolves 19 out of 20 real-world software engineering issues correctly. I want you to think about what that means for a company paying developers to fix bugs, review code, and patch vulnerabilities. This model isn't available to the public yet. But I'm pretty sure the next generation of public models will close that gap. For me, companies that aren't preparing their engineering teams for this shift are already behind.

In my view, this is important. Not because of the model's power; we all knew models would keep getting better. But because of the decision. An AI company chose not to sell its most impressive product. Whether you see this as genuine responsibility or strategic positioning, the outcome is the same: I think we just entered a phase where AI capability is outpacing our ability to safely deploy it.

#AI #Anthropic #ClaudeMythos #Cybersecurity #IA
Anthropic just built an AI model they refuse to release. But the story is not about the model. The story is about what it exposed.

For decades, software engineering relied on two fundamental assumptions. We believed that open source code gets safer over time. We assumed that automated testing catches what humans miss. Both assumptions are now demonstrably false.

The unreleased model, known as #Mythos, was turned loose on foundational infrastructure. The results should force every technical leader to pause. It found a critical vulnerability in OpenBSD that had survived 27 years of intense human scrutiny. It uncovered a 16-year-old flaw in FFmpeg; traditional testing tools hit that exact line of code five million times and missed it entirely. It did not just spot individual bugs. It autonomously chained minor vulnerabilities together to gain total machine control.

The Lindy effect of code is dead. Longevity no longer implies security. Test volume no longer implies safety.

We are entering a closed-door arms race. Anthropic gave private access to eleven tech and financial giants just so they could patch their systems before offensive AI goes public. Human speed is obsolete. Human scale is insufficient. You cannot defend against autonomous AI with manual security teams and legacy penetration testing. The new paradigm is AI defending against AI.

If the most heavily scrutinized open source projects in the world are this vulnerable, your proprietary codebase is wide open. Conviction works both ways. The tools that built your security infrastructure are no longer equipped to defend it.

What a time to be alive.
🚨 AI just made one of the biggest leaks of the year — but not the one you think.

Anthropic didn't get hacked — it accidentally published proprietary code for one of its flagship tools. The result? 🔍 Roughly 512,000+ lines of Claude Code's internal TypeScript — including unreleased features and architecture — just went public via an npm source map file. Developers have mirrored, forked, and dissected it across GitHub within hours.

Here's the twist: this wasn't a breach. It was a packaging mistake — a debugging map file that should never have been in a production bundle. Anthropic says no customer data or credentials were exposed.

But the impact still matters:
🔥 Competitors now have a blueprint of a cutting-edge AI coding agent.
🔐 It raises questions about release discipline at a company built on "safety-first" branding.
📊 And it puts a spotlight on how proprietary AI tooling can suddenly become public strategic intel.

For the AI world, it's a reminder: even giants can trip on the fundamentals of DevOps and supply chain security. And for builders? If your CI/CD ever touches source maps or debug artifacts, this is the cautionary tale you bookmark.

What do you think will happen next — forks and derivatives, or a tightening of AI IP protections? 👇

#AI #Cybersecurity #DevOps #Anthropic
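A guard against exactly this failure can live in CI. A minimal sketch, assuming the publish directory is known and using an illustrative denylist of artifact patterns (not any specific npm tooling): scan the bundle and refuse to publish if a source map or other debug file is present.

```python
import pathlib
import tempfile

# Glob patterns that should never appear in a published bundle.
# (Illustrative denylist; tune it for your own stack.)
FORBIDDEN = ("*.map", "*.env", "*.pem", "*.log")

def check_bundle(bundle_dir):
    """Return relative paths in bundle_dir that match a forbidden pattern."""
    root = pathlib.Path(bundle_dir)
    hits = []
    for pattern in FORBIDDEN:
        hits.extend(str(p.relative_to(root)) for p in root.rglob(pattern))
    return sorted(hits)

# Demo: fail the "release" because a debug artifact slipped in.
with tempfile.TemporaryDirectory() as d:
    root = pathlib.Path(d)
    (root / "index.js").write_text("// minified output")
    (root / "index.js.map").write_text("{}")
    offenders = check_bundle(root)
    if offenders:
        print(f"refusing to publish, found: {offenders}")
        # in a real CI step you would exit nonzero here
```

Run against the exact directory that gets packaged (after the build, before the publish step), so the check sees what users would see.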