Getting the best performance out of a GPU is a tricky business. In the fixed-function pipeline days (before programmable shaders) most attention was focused on reducing the number of triangles rendered, but now you can get massive boosts in performance by batching your render calls so that geometry sharing the same resources is drawn together in a single set of calls.
This batching approach is illustrated well in the now old Nvidia paper “Batch, Batch, Batch”, which tells you to organise your render loop around the assets to be drawn rather than around the objects in the scene. For instance, when drawing twenty identical houses, each made up of several planes of triangles with their own textures, organise the rendering around the textures, not the houses: render all the roofs first, then all the walls, then all the doors, and so on. By organising around the resources and not the real-world objects (i.e. render house number 1, then house number 2), all the roofs can be rendered by simply changing the Transform matrix for each roof call, and no other resources need to be swapped in and out of the GPU (a state change) until you move on to the next texture.
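The effect of ordering by resource can be shown with a toy model. This is only a sketch in Python, not real rendering code: a draw call is reduced to a hypothetical (house, texture) pair, and the only thing counted is how often the bound texture changes.

```python
def count_texture_switches(draw_calls):
    """Count how many times the bound texture changes over a call list."""
    switches = 0
    current = None  # nothing is bound before the first call
    for _house, texture in draw_calls:
        if texture != current:
            switches += 1
            current = texture
    return switches


def batch_by_texture(draw_calls):
    """Reorder calls so that every call using the same texture is adjacent."""
    # sorted() is stable, so houses keep their relative order per texture.
    return sorted(draw_calls, key=lambda call: call[1])


# Twenty houses, each drawn as roof, wall, door (names are illustrative).
PARTS = ["roof", "wall", "door"]
per_house = [(house, part) for house in range(20) for part in PARTS]
batched = batch_by_texture(per_house)

print(count_texture_switches(per_house))  # one switch per call
print(count_texture_switches(batched))    # one switch per texture
```

Drawn house by house, every call forces a texture change; drawn texture by texture, the number of changes drops to the number of distinct textures, while the number of draw calls stays the same.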
This particular approach works really well with Hardware Instancing, a technique available in DirectX which allows the programmer to set up an array of Transform matrixes and have a single render call repeated for every item in that array. It moves the loop that would otherwise run on your CPU onto the GPU, speeding things up.
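As a rough picture of what that array of Transform matrixes looks like as data, here is a small Python sketch that packs one 4x4 translation matrix per instance into a single byte buffer. The function names are hypothetical, and the layout (row-major, translation in the fourth row, 32-bit floats) is an assumption chosen to match the convention XNA's Matrix type uses. Uploading the buffer and issuing the instanced draw call are not shown.

```python
import struct


def translation_matrix(x, y, z):
    """Row-major 4x4 translation matrix as a flat list of 16 floats."""
    return [
        1.0, 0.0, 0.0, 0.0,
        0.0, 1.0, 0.0, 0.0,
        0.0, 0.0, 1.0, 0.0,
        x,   y,   z,   1.0,  # translation lives in the fourth row
    ]


def pack_instance_buffer(positions):
    """Pack one transform per instance into a little-endian float buffer."""
    floats = []
    for x, y, z in positions:
        floats.extend(translation_matrix(x, y, z))
    return struct.pack("<%df" % len(floats), *floats)


# Twenty roofs spaced ten units apart along the X axis (illustrative values).
roof_positions = [(i * 10.0, 0.0, 0.0) for i in range(20)]
instance_buffer = pack_instance_buffer(roof_positions)
print(len(instance_buffer))  # 20 instances * 16 floats * 4 bytes
```

The point of the layout is that the per-instance data is prepared once on the CPU and handed over in one go, instead of setting a new matrix before each of twenty separate calls.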
Batching will take you only so far. In previous posts I have used Imposters and Billboards to reduce the quantity of geometry I’m drawing for distant objects. This works very well for objects which are symmetrical around their vertical axis, like trees, pylons, and even some buildings. The fact remains that “the triangle you don’t render is the fastest triangle you render”, so there is still a place for simplifying your geometry to draw fewer triangles, especially at a distance.
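Choosing between the full mesh, a simplified mesh and a billboard usually comes down to a distance test. The following Python sketch shows the idea; the function name and the two thresholds are made-up values for illustration and would need tuning per object and per scene.

```python
import math

FULL_MESH = "full mesh"
SIMPLIFIED_MESH = "simplified mesh"
BILLBOARD = "billboard"


def choose_representation(camera_pos, object_pos, near=50.0, far=200.0):
    """Pick what to draw for an object based on its distance from the camera.

    near and far are hypothetical cut-off distances in world units.
    """
    distance = math.dist(camera_pos, object_pos)
    if distance < near:
        return FULL_MESH        # close up: draw every triangle
    if distance < far:
        return SIMPLIFIED_MESH  # mid range: fewer triangles are enough
    return BILLBOARD            # far away: a textured quad will do


camera = (0.0, 0.0, 0.0)
for tree in [(10.0, 0.0, 0.0), (100.0, 0.0, 0.0), (0.0, 0.0, 300.0)]:
    print(tree, choose_representation(camera, tree))
```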
Here is an illustration: the first mesh is a simple planar surface with a regular underlying geometry; the second has been simplified from 512 vertexes to 222.
(Images: a nice planar textured surface; the underlying mesh.)
Simplification can be carried out using a number of algorithms. Managed DirectX (MDX) used to contain simple methods for simplifying meshes, but these were dropped in XNA, so we have to hand-roll our own. There are not many code examples on the web, but I found this C example easy to translate into C#. (C# version here)
It’s obvious that a planar mesh like the one above can be readily simplified to just the four corner vertexes, but how does simplification perform on more complex multi-plane meshes, bearing in mind that the simplified version will only be seen at a distance and the most complex one close up?
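To see why a flat mesh is such an easy case, here is a minimal Python sketch of one common edge-collapse cost heuristic: the length of the edge multiplied by a curvature term taken from the normals of the surrounding faces. It is not the C or C# code referred to above, and the function names are hypothetical. A mesh is assumed to be a list of (x, y, z) vertexes plus a list of triangles given as index triples.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def face_normal(verts, tri):
    """Unit normal of a triangle given as three vertex indices."""
    a, b, c = (verts[i] for i in tri)
    n = _cross(_sub(b, a), _sub(c, a))
    length = math.sqrt(_dot(n, n))
    return (n[0] / length, n[1] / length, n[2] / length)


def collapse_cost(verts, tris, u, v):
    """Cost of collapsing vertex u onto its neighbour v (0 means free)."""
    faces_u = [t for t in tris if u in t]
    faces_uv = [t for t in faces_u if v in t]
    if not faces_uv:
        raise ValueError("u and v do not share an edge")
    curvature = 0.0
    for f in faces_u:
        nf = face_normal(verts, f)
        # How far f tilts away from the best-matching face on the edge u-v.
        tilt = min((1.0 - _dot(nf, face_normal(verts, n))) / 2.0
                   for n in faces_uv)
        curvature = max(curvature, tilt)
    return math.dist(verts[u], verts[v]) * curvature


# A flat quad: every normal is identical, so the collapse costs nothing.
flat_verts = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
flat_tris = [(0, 1, 2), (0, 2, 3)]
print(collapse_cost(flat_verts, flat_tris, 0, 2))

# Two triangles meeting at a right angle: moving vertex 0 changes the shape.
bent_verts = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
bent_tris = [(0, 1, 2), (0, 2, 3)]
print(collapse_cost(bent_verts, bent_tris, 0, 1))
```

On a planar mesh every face has the same normal, so the curvature term is zero and collapses are free as far as this cost is concerned. As soon as faces meet at an angle the cost rises, and the simplifier has less room to work with. A real simplifier would also need to protect the outline of the mesh, which this sketch does not attempt.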
At this point I thought I’d post a series of gradually simplified meshes to demonstrate the efficacy of the concept, but the results were always appalling. It took me a while to realise that the problem wasn’t the mesh simplification, which always gave predictable results, but that most of what we see in a complex object is not geometry but texture. In the case of the house above, the mesh is only a series of really simple boxes and can’t reasonably be simplified without removing whole sets of faces, and that completely ruins the look of the object.
So currently I’m saving this technique for where I have objects with the same texture on all faces (rocks would be a good example).