Our story begins, as many stories do, with a man and his AI. The man, like many men, is a bit of a geek and a bit of a programmer. He also needs a haircut.
The AI is the culmination of thousands of years of human advancement, all put to the service of making the man's life a little easier. The man, of course, is me. I'm that guy.
Unfortunately, while AI can be incredibly brilliant, it also has a propensity to lie, mislead, and make shockingly stupid mistakes. It is the stupid part that we will be discussing in this article.
Anecdotal evidence does have value. My reports on how I've solved some problems quickly with AI are real. The programs I wrote with AI are still in use. I have used AI to help speed up aspects of my programming flow, especially when I focus on the sweet spots where I'm less productive and the AI is quite knowledgeable, like writing functions that call publicly published APIs.
You know how we got here. Generative AI burst onto the scene at the cusp of 2023 and has been blasting its way into knowledge work ever since.
One area, as the narrative goes, where AI truly shines is its ability to write code and help manage IT systems. Those claims are not untrue. I have shown, several times, how AI has solved coding and systems engineering problems I have personally experienced.
AI coding in the real world: What science reveals
New tools always come with big promises. But do they deliver in real-world settings?
Most of my reporting on programming effectiveness has been based on personal anecdotal evidence: my own programming experiences using AI. But I'm one guy. I have limited time to devote to programming and, like every programmer, I have certain areas where I spend most of my coding time.
Recently, though, a nonprofit research organization called METR (Model Evaluation & Threat Research) did a more thorough analysis of AI coding productivity.
Their methodology seems sound. They worked with 16 experienced open-source developers who have actively contributed to large, popular repositories. The METR analysts provided those developers with 246 issues from the repositories that needed fixing. For about half of the issues, the coders had to work on their own; for the other half, they could use an AI for help.
The results were striking and unexpected. While the developers themselves estimated that AI assistance increased their productivity by an average of 24%, METR's analytics showed instead that AI assistance slowed them down by an average of 19%.
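To make the size of that perception gap concrete, here is a rough back-of-the-envelope sketch. It rests on assumptions that are mine, not METR's: a hypothetical 60-minute baseline task, the 24% figure read as "24% less time," and the 19% figure read as "19% more time."

```python
# Hypothetical baseline: a task that takes 60 minutes without AI help.
BASELINE_MINUTES = 60.0


def perceived_minutes(baseline, perceived_speedup=0.24):
    """Time developers believed the task took with AI (assumed: 24% less time)."""
    return baseline * (1 - perceived_speedup)


def measured_minutes(baseline, measured_slowdown=0.19):
    """Time the task took with AI per the measurements (assumed: 19% more time)."""
    return baseline * (1 + measured_slowdown)


perceived = perceived_minutes(BASELINE_MINUTES)
measured = measured_minutes(BASELINE_MINUTES)
gap = measured - perceived

print(f"believed: {perceived:.1f} min, measured: {measured:.1f} min, gap: {gap:.1f} min")
```

Under those assumptions, an hour-long task feels like it took about 46 minutes while actually taking about 71, a gap of nearly 26 minutes between belief and measurement.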
That's a bit of a head-scratcher. METR put together a list of factors that might explain the slowdown, including over-optimism about AI usefulness, high developer familiarity with their repositories (and less AI knowledge), the complexity of large repositories, lack of AI reliability, and an ongoing problem where the AI refuses to use "important tacit knowledge or context."
I would suggest that two other factors might have limited effectiveness:
Choice of problem: The developers were told which issues they had to use AI help on and which issues they couldn't. My experience suggests knowledgeable developers must choose where to use AI based on the problem that needs to be solved. In my case, for example, getting the AI to write a regular expression (something I don't like doing and I'm fairly crappy at) would save me a lot more time than getting the AI to modify unique code I've already written, work on regularly, and know inside and out.
Choice of AI: According to the report, the developers used Cursor, an AI-centric fork of VS Code, which used Claude 3.5/3.7 Sonnet at the time. When I tested 3.5 Sonnet, the results were terrible, with Sonnet failing three out of four of my tests. Subsequently, my tests of Claude 4 Sonnet were considerably better. METR reported that developers rejected more than 65% of the code the AI generated. Reviewing and throwing away that much code is going to take time.
That time when ChatGPT suggested nuking my system
METR's results are interesting. AI is clearly a double-edged sword when it comes to coding help. But there's also no doubt that AI can provide considerable value to coders. If anything, I think this test once again proves the contention that AI is a great tool for experienced programmers, but a potential high-risk resource for newbies.
Let's look at a concrete example, one that could have cost me a lot of time and trouble if I had followed ChatGPT's advice.
I was setting up a Docker container on my home lab using Portainer (a tool that helps manage Docker containers). For some reason, Portainer would not enable the Deploy button to create the container.
It had been a long day, so I didn't see the obvious problem. Instead, I asked ChatGPT. I fed ChatGPT screenshots of the configuration, as well as my Docker configuration file.
ChatGPT recommended that I uninstall and reinstall Portainer. It also suggested I remove Docker from the Linux distro and use the package manager to reinstall it. These actions would have had the effect of killing all my containers.
Of note, ChatGPT didn't recommend making backups of the containers or ask whether I had any. It just gave me the command line sequences it recommended I cut and paste to delete and rebuild Portainer and Docker. It was a wildly destructive and irresponsible recommendation.
The irony is that ChatGPT never figured out why Portainer wouldn't let me deploy the new container, but I did. It turns out I never filled out the container's name field. That's it.
Because I'm fairly experienced, I hesitated when ChatGPT told me to nuke my installation. However, someone relying on the AI for advice could have potentially brought down an entire server for want of typing in a container name.
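One way to build in that moment of hesitation is to screen a suggested command sequence before pasting it into a terminal. The sketch below is a minimal illustration, not a safety tool: the pattern list is hypothetical and deliberately incomplete, and the sample commands are stand-ins, not the ones ChatGPT actually gave me.

```python
import re

# Hypothetical, deliberately incomplete list of patterns that hint at data loss.
DESTRUCTIVE_PATTERNS = [
    (r"\bapt(-get)?\s+(remove|purge)\b", "removes installed packages"),
    (r"\bdocker\s+(rm|rmi)\b", "deletes containers or images"),
    (r"\bdocker\s+(system|volume|container|image)\s+prune\b", "deletes unused Docker data"),
    (r"\brm\s+-(?=\w*r)(?=\w*f)\w+", "recursive forced delete"),
]


def flag_destructive(script):
    """Return (command, reason) pairs for lines that match a risky pattern."""
    findings = []
    for line in script.splitlines():
        command = line.strip()
        if not command or command.startswith("#"):
            continue  # skip blank lines and comments
        for pattern, reason in DESTRUCTIVE_PATTERNS:
            if re.search(pattern, command):
                findings.append((command, reason))
                break  # one reason per command is enough to stop and think
    return findings


# Illustrative commands only (assumed for this example).
suggested = """
docker stop portainer
docker rm portainer
sudo apt-get purge docker-ce
sudo apt-get install docker-ce
"""

for command, reason in flag_destructive(suggested):
    print(f"CHECK BACKUPS FIRST: {command}  ({reason})")
```

A clean result from a check like this proves nothing, since no pattern list can catch every dangerous command. Its only job is to force a pause on the obvious ones.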
Overconfident and underinformed AIs: A dangerous combo
I've also experienced the AI going completely off the rails. I've experienced it giving advice that was not only completely useless, but also presented with the apparent confidence of an expert.
If you're going to use AI tools to support your development or IT work, these tips might keep you out of trouble:
- If there's not much publicly available information, the AI can't help. But the AI will make stuff up based on what little it knows, without admitting that it is lacking experience.
- Like my dog, once the AI gets fixated on one thing, it often refuses to look at alternatives. If the AI is stuck on one approach, don't make the mistake of believing that its polite recommendations about a new approach are real. It's still going down the same rabbit hole. Start a new session.
- If you don't know a lot, don't rely on the AI. Keep up your learning. Experienced devs can tell the difference between what will work and what won't. But if you're trying to put all the coding on the back of the AI, you won't know when or where it goes wrong or how to fix it.
- Coders often use specific tools for specific tasks. A website might be built using Python, CSS, HTML, JavaScript, Flask, and Jinja. You choose each tool because you know what it does well. Choose your AI tools the same way. For example, I don't use AI for business logic, but I gain productivity using AI to write API calls and code that draws on public knowledge, where it can save me a lot of time.
- Test everything an AI produces. Everything. Line by individual line. The AI can save a ton of time, but it can also make enormous mistakes. Yes, taking the time and energy to test by hand can help prevent errors. If the AI offers to write unit tests, let it. But test the tests.
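As a small illustration of the "test everything" tip, here is a sketch of how I might check an AI-written regular expression. Both patterns are hypothetical: the first stands in for something an AI might suggest for matching YYYY-MM-DD dates, and the second is a tightened version written after the checks exposed the gaps. Note that the test cases include inputs that should be rejected, which is where a too-loose pattern shows itself.

```python
import re

# Hypothetical pattern, standing in for something an AI assistant might suggest.
AI_SUGGESTED = re.compile(r"\d{4}-\d{2}-\d{2}")

# A tightened version written after the checks below exposed the gaps.
REVISED = re.compile(r"\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])")

# Each case pairs an input with whether it should be accepted.
CASES = [
    ("2025-07-14", True),
    ("1999-12-31", True),
    ("2025-13-01", False),   # month out of range
    ("2025-00-10", False),   # month zero
    ("2025-07-32", False),   # day out of range
    ("2025-7-14", False),    # missing zero padding
    ("x2025-07-14", False),  # leading junk
]


def failures(pattern, cases):
    """Return the cases where the pattern's verdict differs from the expected one."""
    return [
        (text, expected)
        for text, expected in cases
        if (pattern.fullmatch(text) is not None) != expected
    ]


print("AI-suggested failures:", failures(AI_SUGGESTED, CASES))
print("Revised failures:", failures(REVISED, CASES))
```

Even the revised pattern still accepts impossible dates such as 2025-02-31, because a regex checks shape and not the calendar. That is the "test the tests" part: a passing suite only means the cases you thought of pass.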
Based on your experience level, here's how I recommend you think about AI assistance:
- If you know nothing about a subject or skill: AI can help you pass as if you do, but it could be amazingly wrong, and you might not know.
- If you're an expert in a subject or skill: AI can help, but it will piss you off. Your expertise gets used not only to separate the AI-stupid from the AI-useful, but to carefully craft a path where AI can actually help.
- If you're in between: AI is a mixed bag. It could help you or get you in trouble. Don't delegate your skill-building to the AI because it could leave you behind.
Generative AI can be an excellent helper for experienced developers and IT pros, especially when used for targeted, well-understood tasks. But its confidence can be deceptive and dangerous.
AI can be useful, but always double-check its work.
Have you used AI tools like ChatGPT or Claude to help with your development or IT work? Did they speed things up, or nearly blow things up? Are you more confident or more cautious when using AI on critical systems? Have you found specific use cases where AI really shines, or where it fails hilariously? Let us know in the comments below.
You can follow my day-to-day project updates on social media. Be sure to subscribe to my weekly update newsletter, and follow me on Twitter/X at @DavidGewirtz, on Facebook at Facebook.com/DavidGewirtz, on Instagram at Instagram.com/DavidGewirtz, on Bluesky at @DavidGewirtz.com, and on YouTube at YouTube.com/DavidGewirtzTV.

