AI-generated code security fails as 45% includes known flaws

August 1, 2025

A new study on AI-generated code security has revealed a serious problem: nearly half of all code generated by large language models (LLMs) contains known security vulnerabilities. Researchers at Veracode warn that while the code often works, it is frequently not safe.

The team tested 100 popular LLMs by assigning 80 programming tasks across four languages: Java, JavaScript, C#, and Python. The goal was to check for security flaws related to SQL injection, cross-site scripting, log injection, and insecure cryptographic algorithms.

The results? Only 55% of generated code was secure. That means 45% of AI-generated solutions introduced vulnerabilities, often serious ones.

Bigger models didn’t mean safer code

Surprisingly, larger and more advanced models didn’t perform any better than smaller ones. Models with over 100 billion parameters had a 50.87% pass rate, while smaller models under 20 billion landed at 50.65%, a gap of just 0.22 percentage points.

Veracode’s researchers say that while newer models are better at writing functional and syntactically correct code, they’re no better at writing secure code.

“Security performance has hardly improved in the last two years,” the researchers conclude.

Java is the riskiest, Python the least

Java performed the worst overall, with a 28.5% security pass rate. Researchers believe this is because many old Java code examples, including vulnerable ones, were used to train the models. In contrast, Python achieved a 61.69% pass rate, followed by JavaScript (57%) and C# (55%).

The worst performance came from two categories: cross-site scripting vulnerabilities and log injection issues. Both saw only 12–13% pass rates across models.
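
To make those two categories concrete, here is a minimal Python sketch of the usual defenses: escaping user input before it reaches HTML, and neutralizing line breaks before user input reaches a log. It is an illustration only, not code from the Veracode study, and the function names `render_greeting`, `sanitize_for_log`, and `log_login_failure` are hypothetical.

```python
import html
import logging


def render_greeting(username: str) -> str:
    """Build an HTML fragment, escaping the user-supplied name first.

    Without html.escape, a name like "<script>...</script>" would be
    emitted as live markup (cross-site scripting).
    """
    return f"<p>Hello, {html.escape(username)}!</p>"


def sanitize_for_log(value: str) -> str:
    """Replace CR/LF with visible escapes so one value stays on one log line.

    Without this, a value containing a newline could forge what looks
    like a separate, legitimate log entry (log injection).
    """
    return value.replace("\r", "\\r").replace("\n", "\\n")


def log_login_failure(logger: logging.Logger, username: str) -> None:
    """Log a failed login using the sanitized username."""
    logger.warning("Login failed for user: %s", sanitize_for_log(username))
```

With these helpers, a username such as `<script>alert(1)</script>` is rendered as inert text, and a value containing a newline stays on a single log line instead of starting a new one.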

Meanwhile, the models did better at avoiding outdated cryptographic practices and SQL injection, with pass rates around 80–85%.
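
For comparison, the two categories where models did well have simple, widely taught fixes. The sketch below shows them in Python using only the standard library: a parameterized query instead of string concatenation, and SHA-256 instead of an outdated hash such as MD5. The `users` table and the function names `find_user` and `fingerprint` are assumptions made for illustration, not details from the study.

```python
import hashlib
import sqlite3


def find_user(conn: sqlite3.Connection, username: str) -> list:
    """Look up a user with a parameterized query.

    The "?" placeholder makes the driver treat the username as data,
    so input like "' OR '1'='1" cannot change the SQL statement.
    """
    cursor = conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (username,)
    )
    return cursor.fetchall()


def fingerprint(data: bytes) -> str:
    """Return a SHA-256 hex digest rather than an outdated one like MD5.

    Note: this is for integrity checks. Password storage calls for a
    dedicated password-hashing scheme, which is out of scope here.
    """
    return hashlib.sha256(data).hexdigest()
```

In a quick check against an in-memory SQLite database holding a single user named `alice`, `find_user(conn, "alice")` returns that row, while the classic injection string `' OR '1'='1` matches nothing.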

AI code can quietly raise breach risks

Companies that use AI tools for coding may unknowingly be introducing vulnerabilities into their systems. These issues may come from internal developers, open-source libraries, or third-party vendors using AI.

As businesses increasingly adopt AI for code generation, the risk of data breaches, reputation loss, and regulatory penalties grows.

“When you vibe code, you are incurring tech debt as fast as the LLM can spit it out,” warns Val Town CEO Steve Krouse.

While LLMs may be great for building prototypes fast, relying on them for production-level code without strong security review could lead to disaster.

Conclusion

The AI-generated code security crisis is growing. Despite advancements in AI capabilities, the models continue to generate flawed code at alarming rates. With 45% of all generated solutions containing known vulnerabilities, companies must remain vigilant. Relying on AI without proper security checks could turn into a ticking time bomb for businesses and developers alike.

Siyana Georgieva
