Landscape Mesh Smoothing using Bicubic Splines

Prompted by a comment from SkyTiger on my previous post “Large Scale Terrain” that I could use bicubic splines to dispense with a large amount of my geometry, I cooked up a version of the landscape generation code which used this technique instead of the linear interpolation I was previously using.

The requirement to have interpolation of any sort is governed by the large size of the terrain we are generating in proportion to the resolution of the heightmap passed to the shader. Typically out heightmap may be a 512×512 texture but we will have vastly more unique terrain vertexes on screen that that. So somehow we need to find the height of terrain vertexes that don’t map exactly to a pixel in the heightmap.

A nasty problem is that Vertex Texture Fetch (i.e. sampling a texture from within a vertex shader) is restricted to point sampling only – you have to do your own calculations as to the relative position within the pixel (typically done for you by a linear sampler) – so beware; if your landscape suddenly goes all “minecraft” its probably because you missed this subtlety.

Linear Interpolation of Heights

Interpolation is a method of taking into account neighbouring pixels to weight the current pixel data. Given two neighbouring pixels of value 10 and 15, the value 10 should only be obtained if the sample was exactly in the centre of the 10 value pixel – any variance in the X and Y coordinate should also take into account the value of the next pixel. In linear interpolation this is a simple distance-proportion; if we are halfway between the centre of the 10 pixel and the 15 pixel, the value should be about 12.5.

It follows that linear interpolation requires that you know the value of the pixels surrounding the one you are positioned over. However because your sample is always in one of the four quadrants of the central pixel you need only consider the pixels adjacent to that quadrant; therefore linear interpolation requires four texture lookups to obtain your height.

Linear interpolation gives an effect like this;

image

(awesome picture Smile)

The linear interpolation prevents the blue (actual sampled value) from being evident, and shows a line between the centre of each pixel sampled, gradually altering to conform to the centre of the next pixel in the sample. The result is a series of flat gradients that do not vary in slope until they hit the next pixel centre.

Heres a picture of landscape using Linear Interpolation for its heights.

image

No matter how dense my geometry is, its clear that all the triangles lie on the same plane if they lie within a single height map pixel.

Bicubic Interpolation of Heights

A quick guide to “what is a bicubic spline” is covered in SkyTiger here. Using the awesome power of Paint.net here is an illustration of the difference

image

Notice that the individual height samples, even when falling between the same two pixels, don’t fall onto the same plane ? This is because (and I haven’t really illustrated it very well here) each height sample takes into account the pixels that neighbour its neighbours; requiring 16 samples for each height to be calculated. Ouch. This gives the sampler a concept of the “trajectory” of the line that it forms part of – it needs to take into account its previous “slope” when deciding how much influence the next sample will have on it – its kind of like an “inertia” applied to the slope transition.

There is some payoff here though – in the example on linear interpolation its clear that most of those triangles are completely wasted – the viewer gets no benefit from their being a high density of triangles; in fact one triangle quad per heightmap texel would be an efficient quantity. Essentially the landscape is low-resolution and that resolution is determined by the heightmap texture size.

When using bicubic interpolation the viewer gets the benefit of all that lovely extra geometry and there is no angular transition betraying the underlying low-res heightmap. Of course, its as smooth as a cueball, so introducing a level of randomness onto the bicubic calculation from another texture (such as a perlin noise generated texture) might break up the monotony a bit.

image

However, 16 texture lookups per vertex painted does sound a bit profligate, especially when you need the same calculation to operate for placing terrain objects like trees and grass billboards

Mixed Linear and Bicubic

If the 16 texture lookups per vertex is a scary thought, you can always blend the two concepts based on distance from camera. Following the technique described in “Large Scale Terrain” each of my “rings” of terrain is rendered as a single pass each and can swap shader techniques for each “ring” – in the following I render the nearest rings as bicubic and the more distant ones as linear, saving a lot of texture lookups and hopefully retaining the high detail and high cost bicubic lookup for the terrain that really matters – the one close to the viewer.

image

For reference; a good source of HLSL interpolation code is here http://code.google.com/p/mcore3d/source/browse/trunk/MCD/Render/DX9/interpolation.hlsl?spec=svn1050&r=1050

Large Scale Terrain

The basic problem of displaying large scale 3D landscapes is that the size of landscape visible to the camera is exponentially larger, in terms of area, the further the visual field goes back into the Z axis. This is why even “open” games like Oblivion and Skyrim which seems to give vast views need to use a complex set of backdrops and “blue screen” effects to give an impression of distance.

There are three principal techniques used to produce a reasonable scale of landscape. Others do exist, but they are mainly used for producing static renders of photorealistic landscapes (vterrain). The three main concepts are

  • Dynamic Level of Detail (DLOD) and Continuous Level of Detail (CLOD)
  • Landscape Tiling
  • Geo-Clipmapping

Ultimately all these are techniques for reducing the number of triangles visible to the camera while maintaining a tolerable level of detail close to the viewer for fine detail. This can be avoided by adding fogging to your view, and arbitrarily reducing the viewers depth of field, but this is very noticeable. A very clever fogging technique is to create such a crowded near field view (forests etc) that the viewer doesn’t realise they have a restricted field of view.

Dynamic LOD

This technique makes use of the gradual decomposition of triangles into larger and larger triangles based on some measure of utility – typically a measurement of how near two adjacent triangles are in terms of their gradient or slope.

A really good example is Dustin Horne’s C# based tutorial here which illustrates how you can start with a dense grid of regular triangles and merge the triangles progressively using some deterministic method until you have a mesh of variously larger and smaller triangles which is optimised to the degree you are after. Dustin achieves this by simply determining that triangles further away are bigger than nearby triangles.

A problem with this approach is that all this work is done on the CPU pretty much every frame render – although you could do the culling work every n’th frame its still pretty hard work for the CPU and leaves your GPU sitting idle.

A second issue is that the CPU must have access to the original highly detailed mesh, wipe it clean, and re-decompose it every time – this means there is a definite and quite small limit on the size of the terrain that can be processed this way.

An alternative approach is to pre-simplify the terrain mesh using some algorithm that identifies adjacent triangles which are similar in slope, and merge these. This approach is termed “mesh simplification using quadric error metrics” and many implementations of this exist, along with many, many, scholarly articles. This does get limited in the end though; it does not take into account the need for the viewer to perceive a high level of detail and in fact is just a pre-processing step before you get to the real meat of rendering your landscape. You might reduce your original triangle mesh by 50% using this; but your non-linear problem still exists.

The above triangle mesh has been simplified to eliminate all adjacent similar triangles.

In all cases of dynamic LOD the mesh describes the landscape in its three dimensional form – each point on the mesh indicates a real point on the landscape.

Landscape Tiling

If you break up the original vast mesh of triangles into discreet portions you can apply either mesh simplification techniques and/or triangle reduction techniques to each tile in turn, either at runtime or at preparation time, and render that specific tile at the given resolution as required.

This is a very common technique and has advantages in terms of terrain surface texturing, because each tile can have its own distinct set of textures to be applied. One disadvantage is the need to manage the LOD of each tile while moving through the landscape, and this must be done on the CPU, but this is a relatively small load.

Two common problems which occur are

  • The need to stitch tiles of dissimilar detail together to prevent gaps between the tiles appearing
  • Each tile needs a separate render call and ultimately this is the limiting factor

Tile stitching problems can be overcome with three techniques

  • Drop a vertical “skirt” around each tile so the gaps are not visible (they always occur in the vertical plane). Nasty, but effective. This covers up the problem rather than solve it.
  • Make sure that adjacent tiles can only be one LOD resolution different from each other and calculate and draw a series of “bridging” triangles that match up tiles from the various LOD levels. This requires that the terrain tiles have been simplified in a regular form, not using a CLOD algorithm.
  • Make sure that each tile edge always forms a tessellating edge to the next lower resolution and accept that high LOD tiles adjacent to each other have a slightly lower resolution join between them for the sake of being able to join seamlessly with a low resolution tile further away,

As with dynamic LOD the points on the resultant meshes indicate real points in space with the X,Y,Z coordinates providing a real-world sample of that points height.

Geo-Clipmapping

(I use this term to describe my technique, although its not quite an accurate term – but it does share some characteristics with the GPU Gems geoclipmapping reference below).

A version of this technique is described in the excellent free online resource GPU Gems and a variant by SkyTiger in his blog.

I use a slightly different approach, the key of which is a doughnut or annulus of triangles

image

(the heavier weighted lines are an artefact of my picture and are not significant).

If you ignore the centre square you can see that this shape can be scaled by a factor of 2 and the larger mesh will fit neatly over the smaller mesh – like this;

image

and this transform can be repeated until you get the size you are comfortable with and that your GPU can accommodate.

image

Each time we add a new annulus we are not creating any new geometry – we are just redrawing the same geometry at a different scale. Taking this one step further, its clear that there is no need to store the entire annulus and draw it – it is symmetrical in both axis, and so we only need one quadrant; we can then scale and rotate this to make the entire mesh.

image

Because each annulus exactly fits the previous smaller scale they don’t need edge stitching like the landscape tile technique and there is only one render call per annulus, each of which doubles the depth of field viewed.

Unlike the previous techniques the mesh itself is just a structure for displaying the landscape on top of it – the X,Z points do not represent any specific point in the landscape model. In fact, because the viewer never moves in relation to the mesh (they are always camera locked to the centre of the progressively larger annulus meshes) each frame it is likely that the X,Z coordinates of any one vertex represent a slightly different point than before – the mesh is not a model of the terrain – it is a structure on which the terrain is rendered.

A good way to visualise what is going on here is to think that the viewer is surrounded by some massive skirt of triangles, spread out into the infinite distance, with each triangle getting steadily more coarse as it gets further away from the centre. As the viewer moves, their “skirt of triangles” moves with them, flowing over the lumps and bumps of the underlying landscape, getting higher or lower as the underlying terrain forces the skirt up and down.

image

So what makes the individual triangles go up and down ?

Unlike other techniques this absolutely relies on being able to access the Vertex Texture Fetch feature of HLSL Shader Model 3.

The height coordinate is supplied using a heightmap texture (i.e. a grayscale texture where the blackness of each point provides the measurement of how high it is). The texture is normally 4096×4096. As the viewpoint moves, for each triangle vertex, a calculation is made as follows

  • Where is this point in world space ? This is the difference between the vertex X,Z coordinates and the viewers X,Z coordinates.
  • Where is this point on the height map ? This is simply scaling the world coordinate obtained above to the heightmap resolution.

The resulting pixel location on the heightmap is sampled and the height is scaled from the 0->1 value held in the texture coordinate to the appropriate real-world height scale. This is then used to provide the Y coordinate for the vertex.

This is more tricky than it sounds because we are juggling multiple different coordinate systems, but outcome is a fully scaled terrain system.

A real catch here is that many triangles will map to a single height map pixel, and this will naturally lead to a Minecraft blocky landscape. In reality the vertex shader must sample the four nearest points to the calculated pixel and blend the heights between them. Normally HLSL would do this for us through the magic of Linear texture sampling but Vertex Texture Fetch does not support this and you have to do it manually. The manual LERP is shown here on Catalins blog but this implementation has a flaw that is only evident when doing geo-clipmapping; the example assumes that the texture POINT sampler is based on a round() (nearest whole integer) when in fact it is a floor() calculation (i.e. lowest whole intenger) . This took me a long while to work out why my landscape was doing some peculiar gyrations. Heres my long winded solution;

//
// Bilinear LERP of a map. 
// 
float4 tex2Dlod_bilinear( sampler heightMapSampler, float4 uv, float texelSize)
{   
    
    // Must round down to the nearest whole texel because thats what Point Sampling does.
    float4 truncUv = float4(
        trunc(uv.x / texelSize) * texelSize,
        trunc(uv.y / texelSize) * texelSize,
        uv.z,
        uv.w);
                
    float4 height00 = tex2Dlod(heightMapSampler, truncUv);         
    
    float4 offsetHeight = truncUv;
    offsetHeight.x += texelSize;
    float height10 = tex2Dlod(heightMapSampler, offsetHeight);         
    
    offsetHeight = truncUv;
    offsetHeight.y += texelSize;
    float height01 = tex2Dlod(heightMapSampler, offsetHeight);         
    
    offsetHeight = truncUv;
    offsetHeight.y += texelSize;
    offsetHeight.x += texelSize;
    float height11 = tex2Dlod(heightMapSampler, offsetHeight);
    
    float2 f = float2(
        (uv.x - truncUv.x) / texelSize,
        (uv.y - truncUv.y) / texelSize);

    float4 tA = lerp( height00, height10, f.x );        
    float4 tB = lerp( height01, height11, f.x );        
    
    return lerp( tA, tB, f.y );
    

}

In the above you can see the annulus working in wireframe, and the watertight mesh that results once textured.

image

image

The really nice thing about this technique is that its all carried out on the GPU – the same mesh is used, unaltered in every frame, and the GPU just distorts it up and down based on the arithmetic of where the viewer is in relation to the heightmap texture.

Because the geometry does not represent any real world locations it cannot be used to store Normals, Tangents or Bitangents or any of the other useful information that terrain meshes normally have. In order to use that information in the shader each must be computed into a texture and loaded into the shader, and sampled using the same arithmetic as the heightmap.

Billboards, Imposters and Meshes

A problem in landscaping display is the requirement to populate the landscape with content, not just the terrain surface. This a problem of managing exponential visibility (without recourse to fogging) and can only be properly realised with a combination of

  • Meshes
  • Imposters
  • Billboards

Billboards

Billboards are the easiest solution to projecting the image of thousands of similarly objects in 3D space, so long as you don’t get close enough to interact with the supposedly 3D object, and if that 3D object is symmetrical around its vertical axis.

In terms of tutorial Riemers is very accessible and presents a good visualisation of the technique.

However Riemer’s example soaks CPU as it requires a separate render call for each billboard, passing a CPU calculated dot product for the rotation angle between the camera and each billboard and making a separate call for each billboard.

A better option is to pre-calculate the four vectors of the billboard quad as a single point in space, and for a geographical area, compound all the billboard centre point vectors into a single VertexBuffer; and within the Vertex Shader you will transform each vertex to orient it to the camera.

You can then group all the billboards in a single geographical area into a single VertexBuffer which can then be rendered in a single call, trading vertex count for call count (a modern GPU can handle a lot of vertexes in a single call).

A bad billboard

image

Simple billboard of a cottage (converted from a Sketchup 8 Collada Export file). This looks OK from a distance, but suffers from

  1. Non-symmetrical vertical axis, meaning that it will really obviously change shape when it flips into 3D mesh mode.
  2. The viewer sees only one face of a complex object – which again will have a horrible transition when it moves to become a 3D mesh.

A good billboard

image

This tree is a much better example being almost symmetrical around the vertical axis, and its unlikely an observer would be able to tell the facing from afar.

Imposters

Imposters are a midway house between Meshes and Billboards. These use the concept of a texture atlas of rotated views of a Mesh, rendered at compile time, and sampled at render time to provide a camera view specific Billboard based on the dot product of the camera to model world location.

I use an 8×8 texture up to 2048×2048 pixels to provide the widest range of possible angles.

An acceptable imposter

image

Although these trees look very similar when rotated and sampled onto a texture atlas, the benefits are more obvious with this cottage.

A nice imposter

image

Which not only has a rich variety of faces, but strikingly different colouration which would be very obvious if treated as a billboard.

Producing the texture atlas is a matter of introducing a step within your asset pipeline which renders your Mesh model to a Texture and rotates the mesh the required rotation, taking another snapshot at appropriate angles, merging them all onto a single large texture sheet.

Meshes

These are the relatively easy part – a .X or other file containing a full 3D description of the triangle mesh required to render your object in 3D, including textures, bump maps etc. The problem with these is that they are large (typically 2k->50k triangles per object) and even if you share the same geometry, and therefore the same assets, for multiple instances of the same object, you still have to render it multiple times, and those draw calls soon add up to an unsustainable amount.

Close-in however, meshes are vital to providing a realistic impression.

image

The one important capability that a mesh gives you; is the ability to look down on your model – neither billboards or imposters are particularly good at that trick, even if they only ever orient towards the camera in the X,Z axis.

Transitioning

For all the models in my 3D landscape I manage the display of them according to distance from camera, transitioning between Billboard (where appropriate to the model) to Imposter and then to Mesh. I blend the transition so it overlaps, and hopefully hides any obvious transition effects.

image

This view shows all three techniques in effect; the grass are billboards, the distant tress are imposters and the close trees are meshes. The colour difference between the imposters and the trees is something I’m working on.

image

Transition between a distant cottage imposterimage

… to a full mesh

image

… though I’m not sure I’d want to stay the night…

image