Premature Optimization

May 6, 2009 at 8:43 pm

"Premature Optimization is the root of all evil" says Donald Knuth (no relation). After having worked on the Sunset engine for the past year (most engine work is done now, but I sometimes revisit it), I've found that I should probably follow that bit of advice more than I do. A couple of examples follow for those who are into this kind of thing.

One of the first things I did was create a utility that could "compile" a level, which was the process of generating a quadtree from the level geometry, and then determining visibility between leaves. Also, collision detection was computed. The screenshot up above (click for a fullsize image) is from the level editor I wrote. Light green squares are solid, white squares are passable. The red lines show the quadtree, and the entire map image is tinted based on visibility (tinted areas are invisible from the position of the camera (represented by the blue arrow/carrot thing)).

As an added bonus, the screenshot also shows a few of the new textures, and the preview window is fairly representative of what the actual game will look like, so this constitutes the first media release of Sunset :D.

Anyway, the original game renderer was heavily focused on rendering as little geometry as possible, and it did this by rendering only the walls that were visible according to the compiled level data set. This was a fairly complicated bit of code, and it proved to optimize the wrong thing when the goal was performance. With geometry as simple as the levels in Sunset, it was actually faster to just render the entire level in one shot, going from one draw call per texture per visible leaf, to just one draw call per texture for the entire level. I would've saved myself a lot of work if I just tried that simpler road from the beginning.

Another example involved drawing bad guys and other sprite-based entities. Because of the way that OpenGL works, geometry for sprites must be drawn back to front (the furthest away bad guys first). For every single frame I build a list of visible entities (which, incidentally, I determine via the dataset I computed earlier for level geometry visibility, I guess that work wasn't a total wash) and then sort the list based off the distance from the entities to the camera.

Because of how frequently the distance between entities and the camera is calculated, I figured the method that computed it had to be really fast, so I employed a common trick when only an approximate distance is needed: I omitted the square root. This resulted in the ordering of entities being correct about 98% of the time. Sometimes, if entities were at just the right spot and moving in just the right direction, you could see them ordered incorrectly for about a quarter of a second. I figured that slight inaccuracy was worth the speed boost.

Well, just recently I decided to do things "the right way" just for giggles, and began computing distances complete with square roots. The result? Entities are always ordered correctly, and there was no qualitative drop in performance. Premature optimization, how foul.

I think Donald Knuth should insist on being called "Don."