The basic problem of displaying large-scale 3D landscapes is that the area of landscape visible to the camera grows with the square of the distance that the visual field extends along the Z axis. This is why even “open” games like Oblivion and Skyrim, which seem to give vast views, need to use a complex set of backdrops and “blue screen” effects to give an impression of distance.
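
To put rough numbers on that growth, here is a back-of-envelope sketch. It assumes flat ground, a field of view shaped like a circular sector, and a uniform grid with two triangles per cell; the function names and default values are made up for illustration.

```python
import math

def visible_area(view_distance, fov_degrees=90.0):
    """Area of the flat wedge of ground seen out to view_distance."""
    return 0.5 * math.radians(fov_degrees) * view_distance ** 2

def triangles_needed(view_distance, cell_size=1.0, fov_degrees=90.0):
    """Rough triangle count at uniform resolution: two triangles per grid cell."""
    return int(2 * visible_area(view_distance, fov_degrees) / cell_size ** 2)

if __name__ == "__main__":
    for d in (1000, 2000, 4000):
        print(d, triangles_needed(d))
```

Doubling the view distance quadruples the area, and so quadruples the triangle count if the resolution is kept uniform.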

There are three principal techniques used to produce a reasonable scale of landscape. Others do exist, but they are mainly used for producing static renders of photorealistic landscapes (*vterrain*). The three main concepts are:

- Dynamic Level of Detail (DLOD) and Continuous Level of Detail (CLOD)
- Landscape Tiling
- Geo-Clipmapping

Ultimately all of these are techniques for reducing the number of triangles visible to the camera while maintaining a tolerable level of detail close to the viewer. The problem can be sidestepped by adding fog to your view and arbitrarily reducing the viewer’s view distance, but this is very noticeable. A very clever alternative to fogging is to create such a crowded near-field view (forests etc.) that the viewer doesn’t realise they have a restricted field of view.

## Dynamic LOD

This technique progressively merges triangles into larger and larger triangles based on some measure of utility, typically a measurement of how similar two adjacent triangles are in gradient or slope.

A really good example is Dustin Horne’s C#-based tutorial, which illustrates how you can start with a dense grid of regular triangles and merge them progressively using some deterministic method until you have a mesh of variously larger and smaller triangles, optimised to the degree you are after. Dustin achieves this by simply making triangles further away bigger than nearby triangles.
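
The “further away means bigger” rule can be sketched in a few lines. This is not Dustin’s code, just a minimal illustration that assumes one coarser level per doubling of distance; `base_distance`, `max_lod` and `finest_edge` are hypothetical tuning values.

```python
import math

def lod_for_distance(distance, base_distance=64.0, max_lod=6):
    """Return 0 (full detail) inside base_distance, then one coarser
    level for each doubling of distance, capped at max_lod."""
    if distance <= base_distance:
        return 0
    level = int(math.floor(math.log2(distance / base_distance))) + 1
    return min(level, max_lod)

def triangle_edge_length(lod, finest_edge=1.0):
    """Each coarser level doubles the triangle edge length."""
    return finest_edge * (2 ** lod)

if __name__ == "__main__":
    for d in (10, 100, 200, 5000):
        lod = lod_for_distance(d)
        print(d, lod, triangle_edge_length(lod))
```

A real implementation would usually combine distance with a measure of how bumpy the terrain is, so that flat areas can be merged sooner.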

A problem with this approach is that all this work is done on the CPU pretty much every frame. Although you could do the culling work every n’th frame, it’s still pretty hard work for the CPU and leaves your GPU sitting idle.

A second issue is that the CPU must have access to the original highly detailed mesh, wipe it clean, and re-decompose it every time – this means there is a definite and quite small limit on the size of the terrain that can be processed this way.

An alternative approach is to pre-simplify the terrain mesh using an algorithm that identifies adjacent triangles which are similar in slope and merges them. This approach is termed “mesh simplification using quadric error metrics”; many implementations exist, along with many scholarly articles. It is limited in the end, though: it does not take into account the viewer’s need to perceive a high level of detail, and is really just a pre-processing step before you get to the real meat of rendering your landscape. You might reduce your original triangle mesh by 50% using this, but your non-linear problem still exists.

The above triangle mesh has been simplified to eliminate all adjacent similar triangles.

In all cases of dynamic LOD the mesh describes the landscape in its three dimensional form – each point on the mesh indicates a real point on the landscape.

## Landscape Tiling

If you break up the original vast mesh of triangles into discrete portions you can apply mesh simplification and/or triangle reduction techniques to each tile in turn, either at runtime or at preparation time, and render that specific tile at the given resolution as required.

This is a very common technique and has advantages in terms of terrain surface texturing, because each tile can have its own distinct set of textures to be applied. One disadvantage is the need to manage the LOD of each tile while moving through the landscape, and this must be done on the CPU, but this is a relatively small load.

Two common problems which occur are:

- The need to stitch tiles of dissimilar detail together to prevent gaps between the tiles appearing
- Each tile needs a separate render call and ultimately this is the limiting factor

Tile stitching problems can be overcome with three techniques:

- Drop a vertical “skirt” around each tile so the gaps are not visible (they always occur in the vertical plane). Nasty, but effective. This covers up the problem rather than solving it.
- Make sure that adjacent tiles can only be one LOD resolution different from each other and calculate and draw a series of “bridging” triangles that match up tiles from the various LOD levels. This requires that the terrain tiles have been simplified in a regular form, not using a CLOD algorithm.
- Make sure that each tile edge always forms a tessellating edge to the next lower resolution, and accept that high LOD tiles adjacent to each other have a slightly lower resolution join between them for the sake of being able to join seamlessly with a low resolution tile further away.
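
The “only one LOD level apart” rule from the second technique can be enforced with a simple relaxation pass over the tile grid. The sketch below is one possible way to do it, assuming tiles sit on a regular 2D grid, that LOD 0 is the finest level, and that it is acceptable to make a tile finer than originally requested; `limit_lod_steps` is a hypothetical helper name.

```python
def limit_lod_steps(lods):
    """Return a copy of the LOD grid in which no tile is more than one
    level coarser than any of its four edge neighbours.
    Tiles are only ever made finer (lower number), never coarser."""
    rows, cols = len(lods), len(lods[0])
    result = [row[:] for row in lods]
    changed = True
    while changed:
        changed = False
        for y in range(rows):
            for x in range(cols):
                for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < rows and 0 <= nx < cols:
                        allowed = result[ny][nx] + 1
                        if result[y][x] > allowed:
                            result[y][x] = allowed
                            changed = True
    return result

if __name__ == "__main__":
    requested = [[0, 3, 3],
                 [3, 3, 3]]
    for row in limit_lod_steps(requested):
        print(row)
```

Once this constraint holds, only a small fixed set of “bridging” edge patterns is needed, because every tile edge meets a neighbour that is the same level or exactly one level different.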

As with dynamic LOD, the points on the resultant meshes indicate real points in space, with the X,Y,Z coordinates providing a real-world sample of that point’s height.

## Geo-Clipmapping

(I use this term to describe my technique, although it’s not quite an accurate term, but it does share some characteristics with the GPU Gems geoclipmapping reference below.)

A version of this technique is described in the excellent free online resource GPU Gems and a variant by SkyTiger in his blog.

I use a slightly different approach, the key to which is a doughnut or annulus of triangles.

*(the heavier weighted lines are an artefact of my picture and are not significant).*

If you ignore the centre square you can see that this shape can be scaled by a factor of 2 and the larger mesh will fit neatly over the smaller mesh – like this;

and this transform can be repeated until you get the size you are comfortable with and that your GPU can accommodate.

Each time we add a new annulus we are not creating any new geometry; we are just redrawing the same geometry at a different scale. Taking this one step further, it’s clear that there is no need to store the entire annulus and draw it. It is symmetrical in both axes, so we only need one quadrant; we can then scale and rotate this to make the entire mesh.

Because each annulus exactly fits the previous smaller one, they don’t need edge stitching like the landscape tile technique, and there is only one render call per annulus, each of which doubles the view distance covered.
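
The arithmetic of the nested rings is worth spelling out. The sketch below assumes each annulus has an outer width exactly twice the width of its hole, so that scaling by 2 makes the hole of the next ring match the outer edge of the previous one; the function name and the numbers are purely illustrative.

```python
def annulus_extents(inner_half_width, ring_count):
    """Return (inner, outer) half-widths for each ring.
    Every ring is the previous ring scaled by 2, so the hole of ring k
    is exactly the outer edge of ring k-1."""
    rings = []
    inner = inner_half_width
    for _ in range(ring_count):
        outer = inner * 2
        rings.append((inner, outer))
        inner = outer
    return rings

if __name__ == "__main__":
    for inner, outer in annulus_extents(16, 8):
        print(inner, outer)
```

Since every ring reuses the same geometry, the number of triangles drawn grows linearly with the number of rings while the distance covered doubles with each one, which is what tames the growth in visible area.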

Unlike the previous techniques, the mesh itself is just a structure for displaying the landscape on top of it; the X,Z points do not represent any specific point in the landscape model. In fact, because the viewer never moves in relation to the mesh (they are always camera-locked to the centre of the progressively larger annulus meshes), it is likely that in each frame the X,Z coordinates of any one vertex represent a slightly different point than before. The mesh is not a model of the terrain; it is a structure on which the terrain is rendered.

A good way to visualise what is going on here is to think that the viewer is surrounded by some massive skirt of triangles, spread out into the infinite distance, with each triangle getting steadily more coarse as it gets further away from the centre. As the viewer moves, their “skirt of triangles” moves with them, flowing over the lumps and bumps of the underlying landscape, getting higher or lower as the underlying terrain forces the skirt up and down.

So what makes the individual triangles go up and down?

Unlike other techniques this absolutely relies on being able to access the Vertex Texture Fetch feature of HLSL Shader Model 3.

The height coordinate is supplied using a heightmap texture (i.e. a greyscale texture where the grey level of each texel encodes how high that point is). The texture is normally 4096×4096. As the viewpoint moves, for each triangle vertex, a calculation is made as follows:

- Where is this point in world space? This is found by offsetting the vertex X,Z coordinates by the viewer’s X,Z coordinates.
- Where is this point on the height map? This is simply scaling the world coordinate obtained above to the heightmap resolution.

The resulting pixel location on the heightmap is sampled and the height is scaled from the 0->1 value held in the texel to the appropriate real-world height scale. This is then used to provide the Y coordinate for the vertex.
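
The steps above can be sketched on the CPU as follows. This is a simplified illustration rather than the shader itself: it assumes the mesh vertices are stored as offsets from the viewer (so world position is viewer position plus offset; your sign convention may differ), a square terrain of side `terrain_size` starting at the world origin, and plain point sampling. All names are hypothetical.

```python
import math

def world_to_uv(local_x, local_z, viewer_x, viewer_z, terrain_size):
    """Convert a camera-locked mesh vertex to 0..1 heightmap coordinates."""
    world_x = viewer_x + local_x   # assumption: vertex is an offset from the viewer
    world_z = viewer_z + local_z
    return world_x / terrain_size, world_z / terrain_size

def sample_height(heightmap, u, v, height_scale):
    """Point-sample a square heightmap of 0..1 values and scale to world height."""
    size = len(heightmap)
    px = min(max(int(math.floor(u * size)), 0), size - 1)
    py = min(max(int(math.floor(v * size)), 0), size - 1)
    return heightmap[py][px] * height_scale

if __name__ == "__main__":
    heightmap = [[0.0, 0.5],
                 [1.0, 0.25]]
    u, v = world_to_uv(10.0, 5.0, 60.0, 10.0, terrain_size=100.0)
    print(sample_height(heightmap, u, v, height_scale=200.0))
```

Three coordinate systems appear here (mesh-local, world, and heightmap texels), which is where most of the juggling comes from.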

This is more tricky than it sounds because we are juggling multiple different coordinate systems, but the outcome is a fully scaled terrain system.

A real catch here is that many triangles will map to a single height map pixel, and this will naturally lead to a blocky, Minecraft-style landscape. In reality the vertex shader must sample the four nearest points to the calculated pixel and blend the heights between them. Normally HLSL would do this for us through the magic of Linear texture sampling, but Vertex Texture Fetch does not support this and you have to do it manually. The manual LERP is shown on Catalin’s blog, but that implementation has a flaw that is only evident when doing geo-clipmapping: the example assumes that the texture POINT sampler is based on a round() (nearest whole integer) when in fact it is a floor() calculation (i.e. the lowest whole integer). It took me a long while to work out why my landscape was doing some peculiar gyrations. Here’s my long-winded solution:

```hlsl
//
// Bilinear LERP of a map.
//
float4 tex2Dlod_bilinear(sampler heightMapSampler, float4 uv, float texelSize)
{
    // Must round down to the nearest whole texel because that's what Point Sampling does.
    float4 truncUv = float4(
        floor(uv.x / texelSize) * texelSize,
        floor(uv.y / texelSize) * texelSize,
        uv.z,
        uv.w);

    // Sample the four texels surrounding the requested position.
    float4 height00 = tex2Dlod(heightMapSampler, truncUv);

    float4 offsetHeight = truncUv;
    offsetHeight.x += texelSize;
    float4 height10 = tex2Dlod(heightMapSampler, offsetHeight);

    offsetHeight = truncUv;
    offsetHeight.y += texelSize;
    float4 height01 = tex2Dlod(heightMapSampler, offsetHeight);

    offsetHeight = truncUv;
    offsetHeight.y += texelSize;
    offsetHeight.x += texelSize;
    float4 height11 = tex2Dlod(heightMapSampler, offsetHeight);

    // Fractional position inside the texel, in the range 0..1.
    float2 f = float2(
        (uv.x - truncUv.x) / texelSize,
        (uv.y - truncUv.y) / texelSize);

    // Blend along X on both rows, then blend the two rows along Y.
    float4 tA = lerp(height00, height10, f.x);
    float4 tB = lerp(height01, height11, f.x);
    return lerp(tA, tB, f.y);
}
```
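
For checking the arithmetic outside the shader, here is a small CPU-side sketch of the same floor-based bilinear blend. It is not a port of the shader: it works directly in integer texel coordinates, clamps at the texture edges, and assumes a square, single-channel heightmap stored as a list of rows. The function names are made up for this example.

```python
import math

def fetch(texture, x, y):
    """Point fetch with clamping at the texture edges."""
    size = len(texture)
    x = min(max(x, 0), size - 1)
    y = min(max(y, 0), size - 1)
    return texture[y][x]

def bilinear_sample(texture, u, v):
    """Blend the four texels around (u, v), flooring to find the base texel."""
    size = len(texture)
    x = u * size
    y = v * size
    x0 = math.floor(x)          # floor, not round: matches point sampling
    y0 = math.floor(y)
    fx = x - x0                 # fractional position inside the texel, 0..1
    fy = y - y0
    h00 = fetch(texture, x0, y0)
    h10 = fetch(texture, x0 + 1, y0)
    h01 = fetch(texture, x0, y0 + 1)
    h11 = fetch(texture, x0 + 1, y0 + 1)
    top = h00 + (h10 - h00) * fx
    bottom = h01 + (h11 - h01) * fx
    return top + (bottom - top) * fy

if __name__ == "__main__":
    ramp = [[0.0, 1.0, 2.0, 3.0] for _ in range(4)]
    print(bilinear_sample(ramp, 0.375, 0.5))
```

On a simple ramp texture the result should change smoothly as `u` crosses a texel boundary; a sudden jump there is the symptom of the base texel and the fraction being computed with mismatched rounding.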

In the above you can see the annulus working in wireframe, and the watertight mesh that results once textured.

The really nice thing about this technique is that it’s all carried out on the GPU. The same mesh is used, unaltered, in every frame, and the GPU just distorts it up and down based on the arithmetic of where the viewer is in relation to the heightmap texture.

Because the geometry does not represent any real world locations it cannot be used to store Normals, Tangents or Bitangents or any of the other useful information that terrain meshes normally have. In order to use that information in the shader each must be computed into a texture and loaded into the shader, and sampled using the same arithmetic as the heightmap.
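
As an illustration of precomputing one of those textures, the sketch below derives a normal for each heightmap texel using central differences. This is just one common way to build a normal map offline, not necessarily how mine is produced; `cell_size` (world distance between texels) and `height_scale` are assumed parameters, and edge texels are handled by clamping, which flattens the slope slightly at the borders.

```python
import math

def normal_at(heightmap, x, y, cell_size, height_scale):
    """Unit normal (nx, ny, nz) at texel (x, y), with Y as the up axis."""
    size = len(heightmap)

    def h(px, py):
        px = min(max(px, 0), size - 1)   # clamp at the borders
        py = min(max(py, 0), size - 1)
        return heightmap[py][px] * height_scale

    # Central differences give the slope along X and Z.
    slope_x = (h(x + 1, y) - h(x - 1, y)) / (2.0 * cell_size)
    slope_z = (h(x, y + 1) - h(x, y - 1)) / (2.0 * cell_size)
    nx, ny, nz = -slope_x, 1.0, -slope_z
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    return (nx / length, ny / length, nz / length)

def build_normal_map(heightmap, cell_size, height_scale):
    """Normal for every texel, ready to be packed into a texture."""
    size = len(heightmap)
    return [[normal_at(heightmap, x, y, cell_size, height_scale)
             for x in range(size)] for y in range(size)]

if __name__ == "__main__":
    flat = [[0.5] * 4 for _ in range(4)]
    print(build_normal_map(flat, 1.0, 100.0)[1][1])
```

The resulting normals can be stored at the same resolution as the heightmap so the shader can look them up with the same texture coordinates.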

Good idea and well explained.

A picture of a single annulus would make things clearer.

Clipmapping is primarily concerned with loading multiple levels of resolution from disk.

In my technique the entire terrain (heightmap and colour) is stored on the GPU with normals calculated as a by-product of the bicubic interpolation.

This has the advantage of creating perfectly smooth (C2 continuous) normals which allows me to massively “magnify” the terrain with no unwanted shading problems.

(linear interpolation looks VERY ugly past a certain level of magnification …)

Skytiger,

Thanks for taking the time to read the post 🙂

Agreed that Clipmapping isn’t quite the right title for the concept, perhaps “draping” would be better.

I’ll update with some more pictures – a good suggestion.

I currently provide a Normal map and a Texture Blend Map at the same resolution as my heightmap into the shader. What advantage do you gain in calculating your normal map dynamically? I can switch to that technique but am put off by the cost of having to calculate all the contributing vertices (via bicubic sampling) inside the shader when I can do this outside, and just load the results as a normal map.

C2 normals look fantastic under high magnification

and the memory and bandwidth I save is huge

bicubic sampling is cache friendly (same samples getting hit 1000x for 1 triangle)

and there is plenty of PS ALU to spare …

even the X360 GPU breezes through bicubics and I combine normal mapping + texture splatting + water reflections + wave animations all in a single pixel shader

So you sample the surrounding pixels and calc the normal from there ? I see your point about caching. I’ll give it a go.