AI-Generated Code is Good, Actually

OK, I admit, I’m clickbaiting you a little with the title. Plenty of code written by large language models (LLMs) is complete crap. That being said, I’ve found a way to make them work for me.

Disclaimer: I only use AI-generated code (or AI-generated anything, for that matter) in my personal projects or as authorized by my employer. Only use LLMs for work in a way that’s authorized by your employer/organization.

I’m gonna try to convince you AI is bad, and then go back and try to convince you AI is good. I’m sorry, but that’s just how it has to be.

AI Writes Bad Code

So why is generative AI bad at writing code? Right now, LLMs generally handle straightforward tasks well but fall apart on complex, nuanced ones. If you ask an LLM to create a simple Python function to write a file to disk, you’re likely to get a pretty good result. If you ask it to generate a full-stack web app, you’re likely to get a buggy, insecure mess, if it even runs.
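
For concreteness, here’s the kind of straightforward task an LLM will usually nail. This is a minimal sketch of my own, not any particular model’s output, and the function name and paths are made up for illustration:

    from pathlib import Path

    def write_text_file(path: str, contents: str) -> None:
        # Write a string to a file, creating parent directories as needed.
        target = Path(path)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(contents, encoding="utf-8")

    # Example usage:
    write_text_file("output/notes.txt", "hello, disk")

There’s nothing clever here, and that’s the point: small, self-contained, well-trodden tasks are exactly where these models shine.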

If you spend a lot of time on X or LinkedIn (I don’t recommend it), you’re sure to see plenty of people saying that ChatGPT can make you an app in a couple hours that will make you millions. I was gonna feature a YouTube video here and highlight how they’re feeding people a line, but I watched a bunch of them and it just made me sad, so I won’t. Long story (and an hour of my life I’ll never get back) short, if someone is telling you this, they’re selling you something. Probably a course subscription.

But just because grifters gonna grift doesn’t mean LLMs write bad code, right? There are actually a few reasons that LLMs write low-quality code for problems more complex than your standard LeetCode:

  • Hallucination: LLMs make things up. I don’t really know why, but they do. They can make up libraries, syntax, function names, and more, all of which lead to bugs. (There’s a sketch of this failure mode after this list.)
  • Context limits: LLMs can only store so much history in their context window. As your conversation goes on, the earliest messages fall out of the window first, so the model “forgets” what you discussed and context is lost.
  • Inconsistency: Outputs can vary wildly for the same or similar prompts.
  • Bias toward simplicity: LLMs often reach for a “simple” or proof-of-concept answer that ignores concerns like security or scalability.

All these factors and more combine to make writing large, complex codebases with AI a bad idea.
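
Here’s what the hallucination failure mode tends to look like in practice. This is an illustrative sketch of my own, not real model output; json.load_from_url is deliberately not a real function, and the URL is a placeholder:

    import json
    from urllib.request import urlopen

    # Plausible-looking "LLM output": it reads cleanly, but the standard
    # library's json module has no load_from_url function, so this line
    # would raise AttributeError at runtime.
    # config = json.load_from_url("https://example.com/config.json")

    # A working version of the same idea:
    with urlopen("https://example.com/config.json") as response:
        config = json.load(response)

The made-up call looks exactly like something that should exist, which is what makes hallucinations so easy to miss in review.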

AI Writes Bad Code, Good

AI doesn’t know how to build software in the same way a hammer doesn’t know how to build a house. You’d be a fool to ask the hammer to build the house, but the hammer makes the hammering part way easier. The trick with writing software is that people tend to think the hammering part is all there is to building the house.

Writing code is, at best, 50% of building software systems that people actually use. (I think 50% is high, but it’s good enough.) The other half is design, testing, getting coffee, etc. Building software is all about making decisions. It’s like the famous 1979 IBM slide said:

A computer can never be held accountable

Therefore a computer must never make a management decision

Replace “management” with “engineering,” and we’re there.

LLMs can help you make decisions, but the ultimate responsibility lies with you, the engineer. So if AI writes bad code, how can it help you? Because it writes bad code very fast. But I’ve been doing this for a while, so I know bad code when I see it.

I don’t blindly copy and paste what I get in the ChatGPT window. I read it, take the parts that I don’t want to bother typing myself, and ignore the rest. Or sometimes, I’ll take the code, fix the obvious bugs, test it, and then fix the non-obvious bugs. As long as that ends up faster than remembering how to do the thing (or, ugh, reading the docs) and typing it myself, I’ve won!

And I have a hard and fast rule: never, ever use code from an LLM that I don’t 100% understand. (I have this rule for code copied from docs/blogs/etc. too.)

So what does that mean? Unfortunately, you have to be this tall to ride this ride. If you’re going to be a competent LLM user, you have to be experienced enough to supervise its work. You have to be able to look at what it’s telling you and know when it’s right and when it’s wrong. You can’t (safely) have AI do something for you that you don’t know how to do yourself. The AI can be faster than you, but not smarter than you. If it is, how do you know when it’s made a mistake?

In addition to fast code generation, I’ve found LLMs extremely helpful for debugging. For example, I once spent a couple hours looking for an extremely sneaky illegal memory access in about 800 lines of C++. I was at my wits’ end, so I asked GPT-3.5 (this was basically the Bronze Age) to find the bug. I didn’t give it any context about what the code did or what I had already tested. I just said: “Find the bug in this code” and pasted the code. It immediately (and correctly) found the offending line and offered a correction. (I think I had forgotten to allocate an array pointer, how embarrassing.)

LLMs also regularly come up with approaches I wouldn’t think of. There are at least dozens of ways to solve every software problem of decent complexity, and I can’t think of them all. I’m just not that guy. Sometimes, I’ll tell an LLM “I want to do X. Come up with 3 approaches to do X.” It will, and at least one will be an approach I hadn’t considered. I typically won’t use any of them wholesale, but my actual solution will often contain pieces of at least 2 of them.
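
If you’d rather script that prompt than paste it into a chat window, here’s a minimal sketch using the OpenAI Python client. Everything here is an assumption for illustration: the model name is a placeholder, the task is generic, and it presumes an OPENAI_API_KEY environment variable is set:

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    task = "cache expensive database queries in a web app"
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder; use whatever model you have access to
        messages=[{
            "role": "user",
            "content": f"I want to {task}. Come up with 3 approaches to do it, "
                       "with tradeoffs for each.",
        }],
    )
    print(response.choices[0].message.content)

The tradeoffs ask is my own addition to the prompt; the goal is options to mine for ideas, not a finished answer.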

I won’t go as far as saying that LLMs writing bad code is actually a strength. It’s not; that would be stupid. People will still use them to write crappy software, and we’ll still have to deal with it. Grifters will still try to sell you their magic AI get-rich-quick course with rented Lamborghinis in their garage. (Why do you have bookshelves in your garage, Tai?)

Even if it isn’t a strength, I’m not convinced it’s an interesting weakness. Just like the hammer isn’t a licensed contractor, AI is a tool to help you code more productively. And if I’m wrong and our AI overlords take over, I hope they remember I wrote nice things about them in my blog.

If you have thoughts, or you think I’m wrong and just have to tell me, email me!