Better Models and Z-Fighting

I’ve added some better models from TurboSquid (they do a nice range of sub-£5.00 medieval buildings) and placed them in the landscape. Still not entirely convincing, but much easier on the eye than the previous ones from the SketchUp store.


One consequence of using professionally constructed models was immediate and extensive Z-fighting. This is the well-known problem where co-planar surfaces interfere with each other, flickering back and forth between frames, because there is not enough world-space separation between them.

The problem is that model developers tend to overlay textured quads on top of the basic model shape to create details such as windows, doors and roof textures. It takes fewer vertexes to add a slim box to the side of a building and paste the window texture onto it than to cut a window frame into the house mesh. Unfortunately the underlying house box still exists beneath the window box. The diagram shows how a thin window has been added to the house box (seen from the side):

BetterModels2.jpg

This looks great in a modelling application, but in a real-world rendering system there just isn’t enough separation between the plane of the window and the plane of the wall. The consequence is that the surface keeps flicking between window and wall. This happens because the depth buffer, which tracks how far each pixel is from the camera, is typically stored as a 24-bit number, and that number represents the proportional distance between the near and far clipping planes at which the pixel lies. In a modelling application the near and far planes encompass a very short distance; in a real-world application it could be up to 20,000 meters.
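To see why a huge near-to-far range starves distant surfaces of precision, the standard perspective depth mapping can be sketched numerically. This is an illustrative Python sketch, not engine code; the 0.1 m near plane and 20,000 m far plane are example values.

```python
def depth_24bit(z, near=0.1, far=20000.0):
    """Standard (non-reversed) perspective depth for a point at distance z,
    quantised to a 24-bit integer. Maps z=near -> 0 and z=far -> 2**24 - 1."""
    d = far * (z - near) / (z * (far - near))
    return round(d * (2**24 - 1))

# Two overlaid surfaces 1 cm apart, 1 km from the camera, land in the
# same 24-bit bucket - the recipe for Z-fighting.
print(depth_24bit(1000.00) == depth_24bit(1000.01))  # True
```

The same 1 cm gap one metre from the camera gets thousands of depth buckets to itself, which is why the problem only shows up at distance.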

This proportional distance is stored non-linearly (roughly as 1/z rather than z), on the reasonable basis that the further something is from the camera, the less likely any depth differences are to be visible to the end user. There is a huge amount of literature explaining Z-fighting on the web. The fixes for it are either:

  1. Avoid using the hardware depth buffer entirely: write your own depth calculation into a texture buffer, and feed that texture back in on each render so the pixel shader can check the current pixel depth by sampling it.
  2. Do something else.

(1) is not recommended because hardware depth buffering is extremely fast and often takes place before your pixel shader is run, eliminating whole triangles where all three vertexes are further away than the recorded pixel depth.

So that leaves (2).

The simplest way I found to achieve a massive reduction in Z-fighting was the technique of reversing the depth buffer. There are three stages to this:

  1. When calculating the projection matrix, simply pass in the Far and Near clip values in reverse order.
  2. When clearing the depth buffer, clear the depth value to 0 rather than 1.
  3. When setting the depth comparison function, use GREATER_THAN rather than LESS_THAN.

Step 1

float nearClip = ReverseDepthBuffer ? this.FarClippingDistance : this.NearClippingDistance;

float farClip = ReverseDepthBuffer ? this.NearClippingDistance : this.FarClippingDistance;

if (this.CameraType == enumCameraType.Perspective)
{
    // Field-of-view and aspect-ratio member names are assumed; the original
    // listing was truncated at this point.
    return Matrix.PerspectiveFovLH(this.FieldOfView, this.AspectRatio, nearClip, farClip);
}

Step 2

if (reverseDepthBuffer)
{
    // Reversed: clear depth to 0, the "farthest" value under a GREATER comparison
    this.Context.ClearDepthStencilView(dsv, SharpDX.Direct3D11.DepthStencilClearFlags.Depth, 0, 1);
}
else
{
    this.Context.ClearDepthStencilView(dsv, SharpDX.Direct3D11.DepthStencilClearFlags.Depth, 1, 0);
}


I found this entirely eliminated my Z-fighting by distributing the available depth precision more evenly over the field of view. This is very nicely illustrated in this NVIDIA article.


Settlement Topology with Grass and Trees

I’ve now combined the settlement generation with the grass generation I outlined way back. Although the buildings are still cartoonish, and the grass too uniform, the general outline is OK. A problem persists with the blending of the four different densities of grass (I use the Outerra method of doubling the grass blade width whilst halving the number of blades) – it’s easy to see in the YouTube video. By adjusting the terrain base texture with the same mottling interval as the grass planting, distant grass sections don’t need to be painted – they just (almost) seamlessly grow out of ground of the same colour as the grass blades.
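The Outerra-style density trade is easy to state in code: at each coarser distance ring the blade width doubles and the blade count halves, so total coverage stays roughly constant. A Python sketch with made-up numbers:

```python
def grass_lod(base_width, base_count, ring):
    """Blade width doubles and blade count halves per distance ring,
    keeping width * count (a rough proxy for covered area) constant."""
    return base_width * 2**ring, base_count // 2**ring

for ring in range(4):
    width, count = grass_lod(0.02, 4096, ring)
    print(ring, width, count)  # width * count stays constant across rings
```

The blending problem mentioned above lives at the boundaries between rings, where the width jump is visible unless the transition is dithered or faded.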

One thing to work on is shadowing in grass – there are too many polygons being generated to depth-test every one of them – so I think I need to add a ‘shadow pass’ to generate a shadow map of the current scene, not just a depth map. At the moment a tree’s shadow doesn’t affect the underlying grass, but does affect the underlying landscape.


Settlement Topology with Trees

Following the previous post I could easily add a rotation element to the buildings so they always face the road. Adding trees is relatively easy, and I de-conflicted them from the buildings by creating a set of bounding spheres and checking for their intersection.
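The de-confliction test itself is just a sphere-overlap check. A minimal Python sketch (names are mine, not the engine’s):

```python
import math

def spheres_intersect(centre_a, radius_a, centre_b, radius_b):
    """Two bounding spheres overlap when the distance between their centres
    is less than the sum of their radii - used here to reject tree positions
    that would land inside a building's bounding sphere."""
    return math.dist(centre_a, centre_b) < radius_a + radius_b

# A tree 3 units from a building, radii 1.0 and 1.5: no overlap, keep the tree.
print(spheres_intersect((0, 0, 0), 1.0, (3, 0, 0), 1.5))  # False
```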


By creating a set of parallel vertexes to describe the path line I am able to apply a texture blended across the path to prevent a hard edge.

ParallelPathLines.jpg


I make the settlement base texture slightly different from the surround so I can spot it easily in the landscape – I need to do a bit more work to make it segue into the standard textures.




Settlement Topology

To generate settlements I overlay my landscape with a Voronoi graph and select cells based on the following criteria:

  • All the Voronoi points are within a given deviation in height – my settlements are on relatively flat land
  • The Voronoi sides contain either a river or a road – my settlements are not in the middle of nowhere
  • The Voronoi point heights are in the lower third of the landscape range – my settlements are lowland settlements.
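The three criteria reduce to a small predicate per cell. This is only a sketch – the cell layout (a height list plus per-edge river/road flags) and the threshold value are my assumptions, not the engine’s actual data structures:

```python
def is_settlement_site(cell, land_min, land_max, max_height_dev=2.0):
    """cell = {"heights": [h, ...], "edges": [{"river": bool, "road": bool}, ...]}
    (hypothetical shape). Returns True when all three site criteria hold."""
    heights = cell["heights"]
    flat = max(heights) - min(heights) <= max_height_dev             # relatively flat
    connected = any(e["river"] or e["road"] for e in cell["edges"])  # not nowhere
    lowland = max(heights) <= land_min + (land_max - land_min) / 3.0 # lower third
    return flat and connected and lowland

cell = {"heights": [10.0, 11.0, 10.5],
        "edges": [{"river": False, "road": True}]}
print(is_settlement_site(cell, land_min=0.0, land_max=100.0))  # True
```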

Once I have a target site I generate a random set of points within a bounding circle inside the Voronoi cell. I then triangulate these, remove the underlying landscape vertexes and insert the new settlement vertexes. This overrides the natural landscape with the settlement and allows me to vary the settlement footprint heights as needed and change the texture it gets painted with. For now I am painting the settlement with a uniform mud texture to make it stand out, but will get more subtle later.

To create a settlement topology I take the settlement Voronoi graph and describe some paths using the following rules:

  1. Take a hull Voronoi point.
  2. Get a random adjacent Voronoi point, rejecting any other hull points. This leads the path inward.
  3. Walk the adjacent Voronoi points iteratively until either (a) I hit another hull point, or (b) I hit an existing path point.
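The three rules above can be sketched as a short random walk over the cell’s adjacency graph. The data shapes (an adjacency dict, sets of hull and used points) are my assumptions for illustration:

```python
import random

def walk_path(start, adjacency, hull, existing_path_points, rng):
    """Start at a hull point, step inward to a random non-hull neighbour,
    then keep walking until reaching another hull point or a point already
    used by a path. `adjacency` maps each Voronoi point to its neighbours."""
    current = rng.choice([p for p in adjacency[start] if p not in hull])
    path = [start, current]
    while current not in hull and current not in existing_path_points:
        nxt = rng.choice([p for p in adjacency[current] if p not in path])
        path.append(nxt)
        current = nxt
    return path

# A toy chain of Voronoi points where 0 and 3 are hull points.
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(walk_path(0, adjacency, hull={0, 3},
                existing_path_points=set(), rng=random.Random(1)))  # [0, 1, 2, 3]
```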

I use these path points as control points on a spline to generate a set of interconnected, smoothly curved paths. Again I texture these distinctly for debugging but will make them stand out less later. An example of the Voronoi graph with paths shown in red is below. The yellow points are building locations, which are simply placed along the tangents from the centre point of each path line.



The simplistic algorithm is a good start to generating topologically pleasing villages.


Taking control of Triangulation

Up until now I have been using an excellent third-party library to carry out triangulation of the models I generate myself. Models I buy from TurboSquid are already triangulated so don’t require any attention.

I’ve been running into problems trying to constrain the triangulation algorithm to force certain triangle edges to be used; for instance, in the case of roads I definitely want the road edge line to be used irrespective of whether the triangulation algorithm could make a better (more equilateral) triangle.

The library does support a level of constraints, but it’s quite poorly documented, and I wanted to understand the whole concept of Delaunay Triangulation much better and write a post on its practical implementation. There are quite a few C and C++ libraries out there, but little explanation of how they work or why they work the way they do – most of them concentrate on performance optimisation tricks, but I wanted to understand the basic process before getting into performance.

Step 1 – Definitions

A Delaunay Triangulation defines a set of triangles joining a set of points such that every point belongs to a triangle and no triangle edges cross. A Delaunay Triangulation also fulfils the criterion that the triangles are as equilateral as possible. This is important in graphics programming because “long thin” triangles create visual artefacts and suffer from mathematical inaccuracies when carrying out hit tests and the like. So the more equilateral the triangle, the better it looks, the more evenly it tessellates and the easier it is to decimate (make more or less dense as needed).

Here’s a Delaunay triangulation of a mesh of 8 points.


Step 2 – The Method

Starting with an arbitrary set of 2D Points, collect them into a list (or other array type) structure.


Calculate an additional 3 points (we’ll name them “Pf”) which, when made into a triangle, enclose all the original points by a comfortable margin. This is called the Super Triangle – you could theoretically place its corners at infinity, but you’d get floating-point inaccuracies from that, so just calculate the bounding box of the original point set and double it. This forms the first Triangle.

DelaunyTriangulator_FirstTriangle - Infinity.jpg

For the first Point (arbitrarily point 1), insert it into the Super Triangle and split that into three resultant triangles. You now have three triangles and have used up one of your Points.

DelaunyTriangulator_FirstTriangle - Infinity - Point 1.jpg

Each of your Triangles must be stored in some kind of array, along with its Circumcircle: the centre and radius of the circle on whose circumference all three points of the triangle fall. It’s clear that the more equilateral the Triangle, the tighter the Circumcircle will fit the Triangle – if the triangle is very far from equilateral, the Circumcircle will need to be much larger.

For every subsequent Point

  1. Gather all the Triangles where the Point falls within their Circumcircle – i.e. the distance from the point to the Circumcircle centre is less than the Circumcircle radius. (This is where a lot of optimisation effort is concentrated – in my case I just iterate over all the Triangles, testing each in turn.) We’ll call this set of Triangles T0.
  2. For all the triangles in T0 work out which of their edges are internal – i.e. shared with other members of the set T0. These edges we’ll call E0.
  3. Delete all T0 triangles from the set of all Triangles.
  4. Join each remaining T0 edge (those not in E0) to the Point we are inserting, forming a set of new Triangles T1.
  5. Add all the new Triangles T1 to the set of all remaining triangles.
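The five steps above are the core of the Bowyer–Watson algorithm. A minimal, deliberately unoptimised Python sketch of the whole method (super triangle, per-point cavity carving, final clean-up) might look like this:

```python
import math

def circumcircle(a, b, c):
    """Centre and radius of the circle through 2D points a, b, c."""
    ax, ay = a; bx, by = b; cx, cy = c
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    ux = ((ax*ax + ay*ay) * (by - cy) + (bx*bx + by*by) * (cy - ay)
          + (cx*cx + cy*cy) * (ay - by)) / d
    uy = ((ax*ax + ay*ay) * (cx - bx) + (bx*bx + by*by) * (ax - cx)
          + (cx*cx + cy*cy) * (bx - ax)) / d
    return (ux, uy), math.hypot(ax - ux, ay - uy)

def triangulate(points):
    """Bowyer-Watson: returns triangles as index triples into `points`."""
    pts = list(points)
    n = len(pts)
    xs = [p[0] for p in pts]; ys = [p[1] for p in pts]
    cx, cy = (min(xs) + max(xs)) / 2, (min(ys) + max(ys)) / 2
    span = max(max(xs) - min(xs), max(ys) - min(ys)) or 1.0
    # The super triangle "Pf": far enough out to enclose every point.
    pts += [(cx - 20 * span, cy - 10 * span),
            (cx + 20 * span, cy - 10 * span),
            (cx, cy + 20 * span)]
    tris = [(n, n + 1, n + 2)]
    for i in range(n):
        p = pts[i]
        # T0: triangles whose circumcircle contains the new point.
        bad = []
        for t in tris:
            centre, radius = circumcircle(pts[t[0]], pts[t[1]], pts[t[2]])
            if math.hypot(p[0] - centre[0], p[1] - centre[1]) < radius:
                bad.append(t)
        # Internal edges (E0) appear in two T0 triangles; cavity-boundary
        # edges appear exactly once.
        counts = {}
        for t in bad:
            for e in ((t[0], t[1]), (t[1], t[2]), (t[2], t[0])):
                key = tuple(sorted(e))
                counts[key] = counts.get(key, 0) + 1
        tris = [t for t in tris if t not in bad]                        # delete T0
        tris += [(a, b, i) for (a, b), c in counts.items() if c == 1]   # add T1
    # Finally drop every triangle touching a super-triangle vertex.
    return [t for t in tris if all(v < n for v in t)]

# A convex quad triangulates into exactly two triangles.
square = [(0.0, 0.0), (2.0, 0.0), (1.0, 2.0), (0.0, 1.0)]
print(len(triangulate(square)))  # 2
```

This sketch omits the edge-flip cleanup and constraint handling described later, and recomputes circumcircles per query rather than caching them as the post recommends.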

This process is illustrated having selected Point 0 and found that two of the existing triangle circumcircles overlap it; the green lines show the internal shared edges of those triangles (these get deleted). Consequently we create new triangles 9-0-10, 9-1-0, 8-1-0 and 8-0-10.

DelaunyTriangulator_FirstTriangle - Infinity - E0.jpg

DelaunyTriangulator_FirstTriangle - Infinity - E0 - Copy.jpg

The key to the algorithm is that you just keep repeating the same basic move until you run out of points. You will end up with this:


Once you’ve processed every Point:

  1. Delete all Triangles that have as one of their points a point from the set “Pf” (8,9 and 10 in this example)
  2. What you have left is the Delaunay triangulation of your points. Almost….

DelaunyTriangulator_UnConstrained - Copy.jpg

Whereas the mesh is nice and is quite likely to be a Delaunay mesh, there are circumstances where the algorithm will join two triangles along an edge where a shorter alternative edge existed (in this case the quad 0,1,7,6 has a rather acute angle at the corner 0,1,7, and would be better expressed as triangles 0,6,7 and 0,7,1). For a high-quality “true” Delaunay mesh we should:

  • Look at each pair of triangles that share a side.
  • If the resultant polygon is convex (you can test this by checking all the internal angles – they should all be less than 180 degrees; if any is not, a concavity exists), calculate the length of both possible configurations of the shared edge.
  • If the alternative shared edge is shorter than the existing one, flip the triangle definitions around so they share the shorter edge, delete the existing triangles, and insert the new pair.
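The flip test for a pair of triangles sharing an edge can be sketched as follows. Indices a and b are the shared edge, c and d the opposite corners; all names are mine, and this uses the post’s shorter-edge heuristic rather than the textbook in-circle test:

```python
import math

def flip_if_shorter(a, b, c, d, pts):
    """Triangles (a, b, c) and (a, b, d) share edge a-b. If the quad
    a-c-b-d is convex and the other diagonal c-d is shorter, return the
    flipped pair sharing c-d; otherwise return the originals unchanged."""
    def cross(p, q, r):
        return (q[0] - p[0]) * (r[1] - q[1]) - (q[1] - p[1]) * (r[0] - q[0])
    quad = [a, c, b, d]
    turns = [cross(pts[quad[k]], pts[quad[(k + 1) % 4]], pts[quad[(k + 2) % 4]])
             for k in range(4)]
    # Convex when every consecutive turn bends the same way.
    convex = all(t > 0 for t in turns) or all(t < 0 for t in turns)
    if convex and math.dist(pts[c], pts[d]) < math.dist(pts[a], pts[b]):
        return (c, d, a), (c, d, b)
    return (a, b, c), (a, b, d)

# A long thin pair whose shared edge 0-1 should flip to the short diagonal 2-3.
pts = {0: (0.0, 0.0), 1: (3.0, 0.0), 2: (1.5, 0.5), 3: (1.5, -0.5)}
print(flip_if_shorter(0, 1, 2, 3, pts))  # ((2, 3, 0), (2, 3, 1))
```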

Step 3 – Constraints

The above method will generate a convex polygon – but you have no control over the triangles generated. In many cases (not least of which is defining a concave hull) there are certain edges you want to pre-define which must be present in the eventual mesh, even though the result is then a sub-optimal Delaunay mesh. For example, there may be some reason why we need triangle 3,4,6 to be represented in the eventual mesh, even though it would never get created using Delaunay alone.


  1. Once triangulation has finished, process a list of Constrained Edges that you want to appear in your eventual mesh.
  2. For each Constrained Edge (CE1) generate a circle around it and identify all the Triangles whose circumcircles intersect it (this is an optimisation step – you could simply iterate all triangles). This is set T2.
  3. For each Triangle in T2 test each edge; if it crosses CE1, add the Triangle to set T3 and delete it from the main mesh. In this case triangles 3,7,4 and 7,4,6 both intersect the edge 3,6 and get deleted from the mesh.

DelaunyTriangulator_Constrained1.jpg

  4. Use a divide-and-conquer algorithm to generate a mesh joining all the points of the T3 triangles with the Constrained Edge CE1.

Ok – that last step was a bit of a mouthful. What’s a divide-and-conquer algorithm? All the points of the triangles in T3 must lie on one side or the other of the CE1 line (line 3,6 in this example). So divide the points into their two constituent piles (Pile1 and Pile2) and add the two points of the constraint line CE1 to each. Since both sets came from a Delaunay triangulation they will form two separate convex polygons with edge CE1 forming one of the outside edges – just Delaunay-triangulate both polygons and add the resultant triangles back into the main mesh.
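Sorting the orphaned points into their two piles is a signed-area test against the constraint line. A small sketch (names are mine):

```python
def split_by_side(e0, e1, points):
    """Partition `points` into the two piles either side of the constrained
    edge e0-e1, using the sign of the 2D cross product against the edge."""
    def side(p):
        return (e1[0] - e0[0]) * (p[1] - e0[1]) - (e1[1] - e0[1]) * (p[0] - e0[0])
    pile1 = [p for p in points if side(p) > 0]
    pile2 = [p for p in points if side(p) < 0]
    return pile1, pile2

# Points either side of a constraint running from (0,0) to (1,0):
print(split_by_side((0, 0), (1, 0), [(0.5, 1), (0.5, -1), (2, 3)]))
# ([(0.5, 1), (2, 3)], [(0.5, -1)])
```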

This is hard to picture with just one triangle, so here’s a better illustration: the blue line was a Constraint on a Delaunay mesh – all the red points belonged to triangles that intersected the blue line. The points are sorted into Pile1 (red) and Pile2 (yellow). Each set is triangulated using standard Delaunay to produce two convex polygons, but separately. The subsequent mesh of both halves is no longer Delaunay, but honours the Constraint line.


Step 4 – Data and Optimisations

You will need to work with double-precision data for all the mathematical elements of this algorithm – floats are just too imprecise. Many examples are quite prescriptive about the kinds of list structures to be used to efficiently parse the many triangles and points you will generate – you can experiment with these, but from the point of view of learning the method I just stuck with a simple List<> and a couple of helpers: a Triangle class and a Vector2Double struct. I didn’t bother storing the triangle adjacencies, but did store the Triangle circumcircles in the Triangle class.

In terms of optimisations, a very good way of accelerating the initial Delaunay triangulation is to process all the points in spatial order – either top to bottom or left to right. When each triangle is generated and stored you can check the list of triangles to see whether their circumcircles need be considered any longer – all future points will be further away in either the X or Y coordinate (depending on your choice of spatial order), so triangles whose circumcircles didn’t reach the current point and lie behind it cannot match any future point, and you no longer need to worry about them. This means the set of triangle/circle tests gets smaller and smaller for each subsequent point.





Putting it Together

Having re-instated the ground coverage, rivers, paths, trees etc. I get quite a good result.

I still don’t know why some of my rivers run uphill. I feel another debugging session coming on.

Performance is acceptable, settling in at around 60fps at a CPU load of around 10% on a 4-core i5 box. My i7 laptop struggles to keep 55fps due to a poorer graphics card. I can’t say my shader coding is very efficient, and I suspect there is a whole lot of optimisation to be had there, but unfortunately shader optimisation is really quite difficult. There isn’t the equivalent of profilers or inline debuggers – it all has to be done by intuition and dark arts.

A typical scene is 150,000 triangles in around 20 render calls.

See here for the video. It was recorded using the new magic feature of Windows 10 – holding down the Windows key and pressing ‘G’ opens the built-in video recorder.



A New DB ?

As part of this work I have converted my data store from SQL Server Compact Edition to RaptorDB, a key-value pair database available from NuGet and documented here. Although it’s not noticeably faster (data load times were never a limiting factor in my engine), it can store more than SQL CE’s 500MB limit – which I was bumping into when generating a full data set.

It’s pretty easy to use and works just like a persistent dictionary, so it would be very easy to swap out for another NoSQL database in the future. Happily I took the precaution of hiding my SQL CE implementation behind an IDictionary interface, so there was zero impact in swapping it over.

Combining this with a nice fast generic QuadTree implementation found here, I can now generate, store and carry out two-dimensional indexing of a large object graph for landscape decoration (shrubs, rocks etc.). Rather than use a database index to carry out 2D rectangle searches I just use a QuadTree index in memory and save it to the database as an object in its own right. When it’s deserialized into memory it’s as fast as it can be and doesn’t rely on the database provider to implement spatial indexing.

During profiling, the method most called, and one which I can’t really improve on, is the rectangle-intersect function. This checks whether one rectangle (typically derived from the user’s view frustum) intersects a rectangle of interest (typically a planted region, or an individual plant). This is at the heart of the 2D quadtree index and is called many, many times. To be sure I had the best implementation I just used the System.Drawing.RectangleF implementation, hoping that the boffins at Microsoft had got it right. It only really shows up in the profile because it is called so many times – not because it is inherently inefficient. Sixty times a second I parse the entire object QuadTree against the current view frustum and reject all those rectangles which don’t intersect the view frustum’s bounding box. I recurse into those that can be seen until I have a list of visible objects.
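The overlap test itself is only four comparisons. A Python sketch intended to mirror the semantics of RectangleF.IntersectsWith (edge-touching rectangles do not count as intersecting):

```python
def rects_intersect(ax, ay, aw, ah, bx, by, bw, bh):
    """True when rectangle A (origin ax,ay, size aw,ah) overlaps rectangle B.
    This is the hot function at the heart of the quadtree visibility query."""
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah

print(rects_intersect(0, 0, 10, 10, 5, 5, 10, 10))   # True - overlapping
print(rects_intersect(0, 0, 10, 10, 20, 20, 5, 5))   # False - disjoint
```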

Something tells me that I can actually cache the results of my visibility check and only refresh it when the user’s viewpoint changes significantly, but I’ll wait to do more profiling before I dive into that solution. The fact that Visual Studio 2015 Community Edition includes the excellent Microsoft profiler for free really helps the process of understanding the code’s performance.

After all the work to convert from SQL Server Compact Edition to RaptorDB I got a nasty shock when I examined the file size of my stored data – 15GB. It took a huge amount of time to open (I now understand it reads all its indexes into RAM when the db is opened) and once loaded it was indeed really fast; but the slow open time and massive disk size just made it a poor choice.

Scores : RaptorDB 0 : SQL Server Compact Edition 1

This made me search for a more reliable in-process database, just to use as a key-value store. I stumbled across the storage engine built into every Windows OS – “Jet”, or more precisely ESENT, the “Jet Blue” engine used by Windows components such as Windows Update, the Windows Search index and Active Directory – so it is pretty robust and performant. The API is a pig, but luckily a .NET wrapper and associated IDictionary implementation is available on NuGet. It was a simple drop-in for RaptorDB and, once adjusted to take complex types (explained here) rather than just simple types, could easily replace it.

Scores : RaptorDB 0 : ESENT 1

The only other thing to look out for is its built-in use of BinaryFormatter to serialize all the data being stored. Since I already use my own custom serializer and store/retrieve only byte arrays, I wanted to bypass that – and luckily it exposes a plug-in field save/retrieve delegate that allows me to do just that.