Spellcaster Studios

Make it happen…

Frustrating optimizations

Today I added one optimization that was on the list for ages now: camera culling…

Before you start shouting at me, there was already some camera culling being performed, mainly on the voxels (which is the biggest hit on the rendering), and I don’t believe in premature optimization…

Anyway, since this was on the list, I decided to tackle it… So, take a typical game scene in a cave…

I had 63 objects being rendered, taking 12 ms for the full frame (including the ambient occlusion, effects, UI, etc.).

So I added culling to the props and enemies… That took me down from 63 objects to 52… Render time stayed at 12 ms, so there was no improvement… Bah… Well, maybe there are too few objects, so let me try it on a jungle scene… There are a lot of trees there, each of them a prop!

First, I found a bug:

screen637

No trees being rendered at the bottom of the screen! But the problem wasn’t specific to the bottom or the top of the screen: if I put the game in first-person view, the same happens all over the place…

After some 10 minutes of debugging, I found the problem:

AABB    aabb(-ml+_pos.x,_pos.y,-ml+_pos.z,ml+_pos.x,_pos.y+(_ground)?(1.0f):(_size.y),ml+_pos.z);

This computes the AABB of the sprite… See the _pos.y+(_ground)?(1.0f):(_size.y) part?

What I wanted was to add the Y position to the result of the conditional expression that follows it, which evaluates to “1” if the ground variable is true, and to the vertical size otherwise… But what happened instead, due to operator precedence (“+” binds tighter than “?:”), was that the vertical position was being added to the ground variable (which is normally false, so it evaluates to 0), and if that sum was non-zero, the result was “1”, which was of course wrong in 99% of the cases!

The correct expression would be:

AABB    aabb(-ml+_pos.x,_pos.y,-ml+_pos.z,ml+_pos.x,_pos.y+((_ground)?(1.0f):(_size.y)),ml+_pos.z);

Which results in:

screen638
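The precedence slip above is easy to reproduce in isolation. Here is a minimal sketch, assuming a hypothetical Sprite struct whose fields stand in for _pos.y, _size.y and _ground (the real class is not shown in this post). The "buggy" function spells out the grouping the compiler actually applies to the original expression:

```cpp
// Hypothetical stand-in for the sprite state; field names only mirror the post.
struct Sprite {
    float pos_y;   // stands in for _pos.y
    float size_y;  // stands in for _size.y
    bool  ground;  // stands in for _ground
};

// How the compiler groups `_pos.y+(_ground)?(1.0f):(_size.y)`:
// the sum is evaluated first and then used as the condition.
inline float top_y_buggy(const Sprite& s) {
    return (s.pos_y + s.ground) ? 1.0f : s.size_y;
}

// What was intended: pick the height first, then add it to the Y position.
inline float top_y_fixed(const Sprite& s) {
    return s.pos_y + (s.ground ? 1.0f : s.size_y);
}
```

For a sprite at Y = 5 with a vertical size of 3 that is not on the ground, the buggy version yields 1 while the fixed one yields 8. The two only agree when the sum happens to be zero, for instance a non-ground sprite sitting exactly at Y = 0.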

But I digress… So originally, without culling, I was at 16 ms with 855 objects!

With the culling, the object count was now 252 (much better), but render time was still 16 ms…

So I thought the gain from not rendering was being negated by the cost of checking whether each object needed to be rendered… So I plunged into the camera frustum clipping code and optimized a lot of stuff… End result: same thing…
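For reference, here is a minimal sketch of the kind of per-object check involved. This is not the game's actual frustum code: Vec3, Plane, Box and both functions are hypothetical names, and it assumes the frustum is given as six planes whose normals point inward. It uses the common trick of testing only the box corner farthest along each plane normal, which keeps the cost at one dot product per plane:

```cpp
#include <array>

// Hypothetical minimal types, not the engine's own classes.
struct Vec3  { float x, y, z; };
struct Plane { Vec3 n; float d; };   // a point p is inside when dot(n, p) + d >= 0
struct Box   { Vec3 min, max; };     // axis-aligned bounding box

// True when the whole box lies on the negative (outside) side of the plane.
inline bool box_outside_plane(const Box& b, const Plane& p) {
    // Pick the corner that reaches farthest in the direction of the normal.
    const Vec3 v{ p.n.x >= 0.0f ? b.max.x : b.min.x,
                  p.n.y >= 0.0f ? b.max.y : b.min.y,
                  p.n.z >= 0.0f ? b.max.z : b.min.z };
    return p.n.x * v.x + p.n.y * v.y + p.n.z * v.z + p.d < 0.0f;
}

// Conservative test: a box is culled only if it is fully outside some plane.
inline bool box_visible(const Box& b, const std::array<Plane, 6>& frustum) {
    for (const Plane& p : frustum) {
        if (box_outside_plane(b, p)) return false;
    }
    return true;
}
```

The test is conservative: a box near a frustum corner can pass even though it is not actually visible, which costs a wasted draw call but never makes an object disappear.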

I think that the setup part of the game is already very lean, so the actual render time is the part spent on the GPU (considering that the swap buffers call, which waits for VSync, takes about 9 ms, that means the game is waiting for stuff to finish rendering)…

So, the good news is that the system is more efficient, but the bad news is that it doesn’t make a difference to the bottleneck!

Oh well…

Now listening to “Host” by “Paradise Lost”

Link of the Day: I’m a recent convert to the franchise… I never cared much for the character or the games, but that started changing with “Tomb Raider: Legend”, and then with the “Tomb Raider” reboot… So this one caught my eye:
