Why is it so bad to optimize too early?

Solutions:


Preamble:

A few objections have been raised in the comments, and I think they largely stem from a misunderstanding of what we mean when we say “premature optimization” – so I wanted to add a little clarification on that.

“Don’t optimize prematurely” does not mean “write code you know is bad, because Knuth says you’re not allowed to clean it up until the end.”

It means “don’t sacrifice time & legibility for optimization until you know what parts of your program actually need help being faster.” Since a typical program spends most of its time in a few bottlenecks, investing in optimizing “everything” might not get you the same speed boost as focusing that same investment on just the bottlenecked code.

This means, when in doubt, we should:

  • Prefer code that’s simple to write, clear to understand, and easy to modify for starters

  • Check whether further optimization is needed (usually by profiling the running program, though one comment below notes doing mathematical analysis – the only risk there is you also need to check that your math is right)
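As a rough illustration of the "check first" step, here is a minimal profiling sketch using Python's standard `cProfile` and `pstats` modules. The game functions (`update_ai`, `find_nearest`, `game_frame`) are hypothetical stand-ins invented for this example, not part of any real engine; the point is only that the report tells you where the time goes instead of you guessing.

```python
import cProfile
import io
import math
import pstats


def update_ai(agents):
    # Hypothetical "suspected" hot spot: cheap per-agent bookkeeping.
    return [a + 1 for a in agents]


def find_nearest(points):
    # Hypothetical naive O(n^2) nearest-neighbour search.
    nearest_indices = []
    for i, (x1, y1) in enumerate(points):
        _, j = min(
            (math.hypot(x1 - x2, y1 - y2), j)
            for j, (x2, y2) in enumerate(points)
            if j != i
        )
        nearest_indices.append(j)
    return nearest_indices


def game_frame():
    # Stand-in for one frame of game logic.
    agents = list(range(200))
    points = [(i % 17, i // 17) for i in range(200)]
    update_ai(agents)
    find_nearest(points)


def profile_frames(frames=5):
    # Profile a few frames and return the top entries by cumulative time.
    profiler = cProfile.Profile()
    profiler.enable()
    for _ in range(frames):
        game_frame()
    profiler.disable()
    out = io.StringIO()
    pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(5)
    return out.getvalue()


if __name__ == "__main__":
    print(profile_frames())
```

In this toy setup the quadratic search should dominate the report while the AI bookkeeping barely registers, which is the kind of evidence worth having before rewriting anything.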

A premature optimization is not:

  • Architectural decisions to structure your code in a way that will scale to your needs – choosing appropriate modules / responsibilities / interfaces / communication systems in a considered way.

  • Simple efficiencies that don’t take extra time or make your code harder to read. Things like using strong typing can be both efficient and make your intent more clear. Caching a reference instead of searching for it repeatedly is another example (as long as your case doesn’t demand complex cache-invalidation logic – maybe hold off on writing that until you’ve profiled the simple way first).

  • Using the right algorithm for the job. A* is more efficient and more complex than exhaustively searching a pathfinding graph. It’s also an industry standard. Repeating the theme, sticking to tried-and-true methods like this can actually make your code easier to understand than if you do something simple but counter to known best practices. If you have experience running into bottlenecks implementing game feature X one way on a previous project, you don’t need to hit the same bottleneck again on this project to know it’s real – you can and should re-use solutions that have worked for past games.

All those types of optimizations are well-justified and would generally not be labelled “premature” (unless you’re going down a rabbit hole implementing cutting-edge pathfinding for your 8×8 chessboard map…)
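To make the "cache a reference instead of searching repeatedly" point concrete, here is a small sketch. The `Scene` class and its `find` method are hypothetical, made up for this example rather than taken from any engine; the `searches` counter is only there so the difference is visible.

```python
class Scene:
    """Hypothetical scene that finds objects by name with a linear search."""

    def __init__(self, names):
        self.objects = [{"name": n, "hp": 100} for n in names]
        self.searches = 0  # counts lookups so the cost is visible

    def find(self, name):
        self.searches += 1
        for obj in self.objects:
            if obj["name"] == name:
                return obj
        return None


def damage_player_naive(scene, frames):
    # Searches the whole scene again on every frame.
    for _ in range(frames):
        scene.find("player")["hp"] -= 1


def damage_player_cached(scene, frames):
    # Looks the player up once and reuses the reference.
    player = scene.find("player")
    for _ in range(frames):
        player["hp"] -= 1
```

Both versions are about equally easy to read, which is why this kind of efficiency does not count as premature. It stays that simple only as long as the cached object cannot be destroyed or replaced behind your back.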

So now with that cleared up, on to why we might find this policy useful in games specifically:


In gamedev especially, iteration speed is the most precious thing. We’ll often implement and re-implement far more ideas than will ultimately ship with the finished game, trying to “find the fun.”

If you can prototype a mechanic in a straightforward & maybe a bit naive way and be playtesting it the next day, you’re in a much better position than if you spent a week making the most optimal version of it first. Especially if it turns out to suck and you end up throwing out that feature. Doing it the simple way so you can test early can save a ton of wasted work optimizing code you don’t keep.

Non-optimized code is also generally easier to modify and try different variants on than code that’s finely-tuned to do one precise thing optimally, which tends to be brittle and harder to modify without breaking it, introducing bugs, or slowing it way down. So keeping the code simple and easy to change is often worth a little runtime inefficiency throughout most of development (we’re usually developing on machines above the target spec, so we can absorb the overhead and focus on getting the target experience first) until we’ve locked down what we need from the feature and can optimize the parts we now know are slow.

Yes, refactoring parts of the project late in development to optimize the slow spots can be hard. But so is refactoring repeatedly throughout development because the optimizations you made last month aren’t compatible with the direction the game has evolved since then, or were fixing something that turned out not to be the real bottleneck once you got more of the features & content in.

Games are weird and experimental – it’s hard to predict how a game project and its tech needs will evolve and where the performance will be tightest. In practice, we often end up worrying about the wrong things – search through the performance questions on here and you’ll see a common theme emerge of devs getting distracted by stuff on paper that likely is not a problem at all.

To take a dramatic example: if your game is GPU-bound (not uncommon) then all that time spent hyper-optimizing and threading the CPU work might yield no tangible benefit at all. All those dev hours could have been spent implementing & polishing gameplay features instead, for a better player experience.
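A back-of-the-envelope version of the GPU-bound case, under a deliberately simplified assumption: CPU and GPU work overlap perfectly, so the slower of the two sets the frame time. Real pipelines are messier, and the millisecond figures below are made up purely for illustration.

```python
def frame_time_ms(cpu_ms, gpu_ms):
    # Simplifying assumption: CPU and GPU work fully overlap,
    # so whichever takes longer determines the frame time.
    return max(cpu_ms, gpu_ms)


# Illustrative numbers only: a frame that is GPU-bound.
before = frame_time_ms(cpu_ms=8.0, gpu_ms=20.0)

# Halving the CPU cost changes nothing, because the GPU is still the limit.
after_cpu_opt = frame_time_ms(cpu_ms=4.0, gpu_ms=20.0)

# A smaller win on the GPU side actually shortens the frame.
after_gpu_opt = frame_time_ms(cpu_ms=8.0, gpu_ms=15.0)
```

Under this model, halving the CPU cost leaves the frame at 20 ms, while trimming 5 ms from the GPU side brings it down to 15 ms.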

Overall, most of the time you spend working on a game will not be spent on the code that ends up being the bottleneck. Especially when you’re working on an existing engine, the super expensive inner loop stuff in the rendering and physics systems is largely out of your hands. At that point, your job in the gameplay scripts is to basically stay out of the engine’s way – as long as you don’t throw a wrench in there then you’ll probably come out pretty OK for a first build.

So, apart from a bit of code hygiene and budgeting (eg. don’t repeatedly search for/construct stuff if you can easily reuse it, keep your pathfinding/physics queries or GPU readbacks modest, etc), making a habit of not over-optimizing before we know where the real problems are turns out to be good for productivity – saving us from wasting time optimizing the wrong things, and keeping our code simpler and easier to tweak overall.

note: this answer began as a comment on DMGregory’s answer, and so doesn’t duplicate the very good points he makes.

“Would it not be incredibly difficult to change some of the core structures of the game at the end, rather than developing them the first time with performance in mind?”

This, to me, is the crux of the question.

When creating your original design, you should try to design for efficiency – at the top level. This is less optimisation, and is more about structure.

Example:
You need to create a system to cross a river. The obvious designs are a bridge or a ferry, so which do you choose?
The answer of course depends on the size of the crossing and the volume of traffic.
This isn’t an optimisation; it is instead starting out with a design suited to your problem.

When presented with design choices, you pick the one best suited to what you want to do.

So, let’s say that our volume of traffic is fairly low, so we decide to build two terminals and buy in a ferry to handle the traffic. A nice simple implementation.
Unfortunately, once we have it up and running, we find that it is seeing more traffic than expected. We need to optimise the ferry! (because it works, and building a bridge now isn’t a good plan)

Options:

  • Buy a second ferry (parallel processing)
  • Add another car deck to ferry (compression of traffic)
  • Upgrade the ferry’s engine to make it faster (re-written processing algorithms)

This is where you should attempt to make your original design as modular as possible.
All of the above are possible optimisations, and you could even do all three.
But how do you make these changes without large structural changes?

If you have a modular design with clearly defined interfaces, then it should be simple to implement these changes.
If your code is not tightly coupled, then changes to modules don’t affect the surrounding structure.

Let’s take a look at adding an extra ferry.
A ‘bad’ program might be built around the idea of a single ferry, and have the dock states and ferry state and position all bundled together and sharing state. This will be hard to modify to allow for an extra ferry being added to the system.
A better design would be to have the docks and ferry as separate entities. There isn’t any tight coupling between them, but they have an interface, where a ferry can arrive, unload passengers, take on new ones, and leave. The dock and ferry share only this interface, and this makes it easy to make changes to the system, in this case by adding a second ferry. The dock doesn’t care about what ferries there actually are; all it is concerned with is that something (anything) is using its interface.
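One way the decoupled version could look, as a minimal sketch. The class and method names (`Dock`, `Ferry`, `exchange`, `run`, `make_docks`) are my own invention for the analogy, and the "simulation" is a toy: the only thing it is meant to show is that the dock exposes one small interface and never needs to know how many ferries exist.

```python
from collections import deque


class Dock:
    """Knows only that *something* arrives and calls exchange()."""

    def __init__(self, name):
        self.name = name
        self.waiting = deque()

    def exchange(self, arriving, capacity):
        """Unload the arriving passengers; hand over up to `capacity` waiting ones."""
        delivered = len(arriving)
        boarding_count = min(capacity, len(self.waiting))
        boarding = [self.waiting.popleft() for _ in range(boarding_count)]
        return delivered, boarding


class Ferry:
    """Shuttles between docks, talking to them only through exchange()."""

    def __init__(self, capacity, docks, start=0):
        self.capacity = capacity
        self.docks = docks
        self.position = start
        self.passengers = []
        self.delivered = 0

    def step(self):
        # Arrive at the current dock, swap passengers, move to the next dock.
        dock = self.docks[self.position]
        delivered, self.passengers = dock.exchange(self.passengers, self.capacity)
        self.delivered += delivered
        self.position = (self.position + 1) % len(self.docks)


def make_docks():
    # Two docks with 12 passengers waiting on the west side.
    west, east = Dock("west"), Dock("east")
    west.waiting.extend(range(12))
    return [west, east]


def run(ferries, steps):
    # Advance every ferry one dock per step; return total passengers delivered.
    for _ in range(steps):
        for ferry in ferries:
            ferry.step()
    return sum(f.delivered for f in ferries)
```

Adding the second ferry is then a change at the call site only, for example `run([Ferry(4, docks), Ferry(4, docks)], 4)` instead of `run([Ferry(4, docks)], 4)`; neither `Dock` nor `Ferry` has to be touched.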

tl;dr:

  • Try to design for efficiency in the first place.
  • When presented with two choices, you pick the more efficient one.
  • Write your code as modularly as possible with well defined interfaces.

You can then change the mechanisms within each module without restructuring the entire codebase when you need to optimise.

“Do not optimise early” doesn’t mean “pick the worst possible way to do things”. You still need to consider performance implications (unless you’re just prototyping). The point is not to cripple other, more important things at that point in development – like flexibility, reliability etc. Pick simple, safe optimisations – choose the things you limit, and the things you keep free; keep track of the costs. Should you use strong typing? Most games do and work fine; how much would it cost you to remove that if you found interesting uses of the flexibility for gameplay?

It’s much harder to modify optimised code, especially “smart” code. It’s always a choice that makes some things better, and others worse (for example, you might be trading CPU time for memory usage). When making that choice, you need to be aware of all the implications – they might be disastrous, but they can also be helpful.

For example, Commander Keen, Wolfenstein and Doom were each built on top of an optimized rendering engine. Each had their “trick” that enabled the game to exist in the first place (each also had further optimizations developed over time, but that’s not important here). That’s fine. It’s okay to heavily optimize the very core of the game, the thing that makes the game possible; especially if you’re exploring new territory where this particular optimized feature allows you to consider game designs that weren’t much explored. The limitations the optimization introduces may give you interesting gameplay as well (e.g. unit count limits in RTS games may have started as a way to improve performance, but they have a gameplay effect as well).

But do note that in each of these examples, the game couldn’t exist without the optimization. They didn’t start with a “fully optimized” engine – they started with the bare necessity, and worked their way up. They were developing new technologies, and using them to make fun games. And the engine tricks were limited to as small a part of the codebase as possible – the heavier optimizations were only introduced when the gameplay was mostly done, or where it allowed an interesting new feature to emerge.

Now consider a game you might want to make. Is there really some technological miracle that makes or breaks that game? Maybe you’re envisioning an open-world game on an infinite world. Is that really the central piece of the game? Would the game simply not work without it? Maybe you’re thinking about a game where the terrain is deformable without limit, with realistic geology and such; can you make it work with a smaller scope? Would it work in 2D instead of 3D? Get something fun as soon as possible – even if optimizations might require you to rework a huge chunk of your existing code, it may be worth it; and you might even realize that making things bigger doesn’t really make the game better.

As an example of a recent game with lots of optimisations, I’d point to Factorio. One critical part of the game is the belts – there are many thousands of them, and they carry many individual bits of materials all around your factory. Did the game start with a heavily-optimised belt engine? No! In fact, the original belt design was almost impossible to optimise – it kind of did a physical simulation of the items on the belt, which created some interesting things you could do (this is the way you get “emergent” gameplay – gameplay that surprises the designer), but meant you had to simulate every single item on the belt. With thousands of belts, you get tens of thousands of physically-simulated items – even just removing that and letting the belts do the work allows you to cut the associated CPU time by 95-99%, even without considering things like memory locality. But it’s only useful to do that when you actually reach those limits.

Pretty much everything that had anything to do with belts had to be remade to allow the belts to be optimised. And the belts needed to be optimised, because you needed a lot of belts for a large factory, and large factories are one attraction of the game. After all, if you can’t have large factories, why have an infinite world? Funny you should ask – early versions didn’t 🙂 The game was reworked and reshaped all over many times to get where it is now – including a 100% ground-up remake when they realized Java isn’t the way to go for a game like this and switched to C++. And it worked great for Factorio (though it was still a good thing it wasn’t optimised from the get-go – especially since this was a hobby project, which might have simply failed otherwise for lack of interest).
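To give a feel for the "simulate every item" versus "let the belt do the work" contrast, here is a toy sketch. It is emphatically not Factorio's implementation: both classes are invented for illustration, and the sketch ignores everything that makes real belts hard (items bunching up, belt ends, inserters). It only shows how moving the bookkeeping from the items to the belt changes the per-tick cost.

```python
class PerItemBelt:
    """Toy model: every item is moved individually on every tick."""

    def __init__(self, positions):
        self.positions = list(positions)
        self.updates = 0  # number of per-tick writes, to make the cost visible

    def tick(self, speed):
        for i in range(len(self.positions)):
            self.positions[i] += speed
            self.updates += 1

    def position_of(self, index):
        return self.positions[index]


class SharedOffsetBelt:
    """Toy model: items keep fixed offsets; the belt tracks one travel value."""

    def __init__(self, positions):
        self.offsets = list(positions)
        self.travel = 0
        self.updates = 0

    def tick(self, speed):
        # One write per belt per tick, regardless of how many items it carries.
        self.travel += speed
        self.updates += 1

    def position_of(self, index):
        # An item's position is derived on demand from the belt's travel.
        return self.offsets[index] + self.travel
```

With 1,000 items over 60 ticks, the first version performs 60,000 updates and the second performs 60, while both report the same item positions. The price is that anything relying on items being independent objects no longer fits, which is exactly the design-space trade-off discussed here.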

But the thing is, there are lots of things you can do with a limited-scope factory – and many games have shown just that. Limits can be even more empowering for fun than freedoms; would Spacechem be more fun if the “maps” were infinite? If you started with heavily optimised “belts”, you would pretty much be forced to go that way; and you couldn’t explore other design directions (like seeing what interesting things you can do with physics-simulated conveyor belts). You’re limiting your potential design-space. It may not seem like that because you don’t see a lot of unfinished games, but the hard part is getting the fun right – for every fun game you see, there are probably hundreds that just couldn’t get there and were scrapped (or worse, released as horrible messes). If optimisation helps you do that – go ahead. If it doesn’t… it’s likely premature. If you think some gameplay mechanic works great, but needs optimisations to truly shine – go ahead. If you don’t have interesting mechanics, don’t optimise them. Find the fun first – you will find most optimisations don’t help with that, and are often detrimental.

Finally, you have a great, fun game. Does it make sense to optimise now? Ha! It’s still not as clear as you might think. Is there something fun you can do instead? Don’t forget your time is still limited. Everything takes an effort, and you want to focus that effort on where it matters most. Yes, even if you’re making a “free game”, or an “open source” game. Watch how the game is played; notice where the performance becomes a bottleneck. Does optimising those places make for more fun (like building ever bigger, ever more tangled factories)? Does it allow you to attract more players (e.g. with weaker computers, or on different platforms)? You always need to prioritise – look for the effort to yield ratio. You’ll likely find plenty of low-hanging fruit just from playing your game and watching others play the game. But note the important part – to get there, you need a game. Focus on that.

As a cherry on top, consider that optimisation never ends. It’s not a task with a little check mark that you finish and move on to other tasks. There’s always “one more optimisation” you can do, and a big part of any development is understanding the priorities. You don’t do optimisation for optimisation’s sake – you do it to achieve a particular goal (e.g. “200 units on the screen at once on a 333 MHz Pentium” is a great goal). Don’t lose track of the terminal goal just because you focus too much on the intermediate goals that might not even be pre-requisites for the terminal goal anymore.
