Study finds AI coding tools unreliable

Unreliable AI

Joe Smith, a software developer at a mid-sized tech company, has been using GitHub Copilot, an AI coding assistant, for the past three months. He was excited about the tool’s potential to boost his productivity and make his job easier, but Copilot has not lived up to his expectations.

He noticed that the code generated by the AI often contained bugs and errors that he had to spend time fixing. “I thought Copilot would save me time, but I ended up spending more time debugging the code it generated,” Joe said. “It was frustrating because I had to double-check everything to make sure it was correct.”
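The bugs Joe describes are often subtle rather than obvious. As a hypothetical illustration (not taken from the study or from actual Copilot output), an assistant might generate a binary search whose loop bounds look plausible but silently miss the last element:

```python
def binary_search(items, target):
    """Return the index of target in sorted items, or -1 if absent.

    A plausible AI-generated version used `hi = len(items) - 1`
    together with `while lo < hi`, which never examines the final
    element. The corrected half-open bounds below handle every case.
    """
    lo, hi = 0, len(items)
    while lo < hi:
        mid = (lo + hi) // 2
        if items[mid] == target:
            return mid
        if items[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    return -1

# The buggy variant returns -1 here; the corrected version finds index 3.
print(binary_search([1, 3, 5, 7], 7))
```

Code like this passes a casual read and even many happy-path tests, which is why developers end up double-checking everything the assistant produces.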

Joe’s experience is not unique.

A recent study conducted by Uplevel, a code analysis firm, found that developers using GitHub Copilot saw no significant improvement in productivity. The study tracked key metrics such as pull request cycle time and throughput for about 800 developers over a six-month period, and found that using GitHub Copilot actually introduced 41% more bugs into the code.

This finding suggests that while AI coding assistants can generate code quickly, the quality of the code may not be up to par.

AI coding tools introduce frequent bugs

Ivan Gekht, CEO of Gehtsoft USA, a custom software development firm, expressed skepticism about the current state of AI coding assistants.

His company has experimented with these tools but has not implemented them in client projects due to the difficulty of understanding and debugging AI-generated code. “Software development is more about understanding requirements, designing systems, and considering limitations, which AI cannot fully handle,” Gekht said.

Despite the challenges, some developers have reported positive experiences with AI coding assistants.

Travis Rehl, CTO of Innovative Solutions, said his team saw a two- to threefold increase in developer productivity using tools like Claude Dev and GitHub Copilot. However, Rehl cautioned against the unrealistic expectation that coding assistants can replace human developers entirely. He emphasized that the tools are still evolving and that their success may depend on improved accuracy and integration methods that reduce debugging and error rates.

As AI coding assistants continue to develop, organizations should monitor their progress and evaluate their potential benefits. While these tools may not be a silver bullet for all development inefficiencies, they could still prove valuable in augmenting developer efforts in specific contexts. For now, developers like Joe Smith will have to weigh the pros and cons of using AI coding assistants in their daily work.

As the technology advances, it remains to be seen whether these tools will truly revolutionize software development or remain a promising but flawed solution.