Relief Profiling

Detail in a complex landscape is normally generated by using a progressively more detailed mesh of triangles, but this has various limiting factors, chief among them the need for a massively detailed mesh. An alternative way to make landscapes (or anything else) appear to have a wealth of surface detail is to manipulate the texture coordinates dynamically to simulate heights along the surface.

This is covered in detail, with a working example, under Relief Profiling at http://developer.download.nvidia.com/shaderlibrary/webpages/shader_library.html (along with a ton of other useful shader examples).

The technique uses the viewer's position above the surface being painted and calculates the angle of viewing. It translates the 3D viewing angle into a 2D vector: essentially the track the viewer's line of sight takes across the surface. A second, underlying relief texture (essentially a height map) records a simulated height above the surface for each pixel. The calculation steps along the 2D track, sampling the relief map at each step and taking into account the angle of viewing, until it finds the first pixel whose stored height rises above the viewer's "ray". That pixel is the one painted. This may well be different from the pixel indicated by the texture coordinate originally passed into the pixel shader, which creates the effect of a mass of surface detail.
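To make the mechanics concrete, below is a minimal sketch of the linear-search step at the heart of the technique, in HLSL for Shader Model 3. Everything here is an illustrative assumption rather than the NVIDIA sample itself: reliefSampler is presumed to hold the relief values as depth below the reference surface in its red channel, viewDirTS is the normalised tangent-space view direction, and depthScale controls the apparent depth. A production version refines the hit point further (the NVIDIA sample adds a binary search):

//
// Linear-search core of a relief-map ray march (sketch).
// reliefSampler, viewDirTS and depthScale are illustrative names.
//
float2 reliefIntersect(sampler2D reliefSampler, float2 texCoord, float3 viewDirTS, float depthScale)
{
    const int numSteps = 15;   // matches the 15 samples mentioned below

    // Project the 3D view direction into a 2D track across the texture.
    float2 delta = (viewDirTS.xy / viewDirTS.z) * depthScale / numSteps;
    float stepDepth = 1.0 / numSteps;

    float2 uv = texCoord;
    float rayDepth = 0.0;
    float surfaceDepth = tex2Dlod(reliefSampler, float4(uv, 0, 0)).r;

    // Step along the track until the ray passes below the stored surface.
    for (int i = 0; i < numSteps; i++)
    {
        if (rayDepth >= surfaceDepth)
            break;
        uv += delta;
        rayDepth += stepDepth;
        surfaceDepth = tex2Dlod(reliefSampler, float4(uv, 0, 0)).r;
    }

    return uv;   // the shifted coordinate used to sample the colour texture
}

The returned coordinate then replaces the original texture coordinate when sampling the colour texture, producing the shifted, parallax-correct detail described above.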

The effect is dramatically good, but the calculation needs 15 separate texture samples for each pixel drawn on screen, so it should only be used for objects near the camera. Where the near field is covered by a texture using Relief Profiling there will be a heavy GPU penalty, so a good idea is to disable the technique when the camera is moving above a certain velocity, where the benefit won't be visible anyway.

The bumpy landscape below is made up of only 2 triangles – all other apparent surface detail is generated in the pixel shader using relief mapping. Because the technique takes into account the viewer's position relative to each pixel, parallax is correctly calculated and gives a stunning visual effect (especially when moving).

[Image: ReliefMap1 – relief-mapped landscape rendered from two triangles]


Limitations of Runtime Heightmap Terrain

In previous posts I have discussed techniques for runtime generation of landscape form from fixed meshes, using heightmap textures to generate Y coordinate offsets. This technique is very fast in execution and very frugal with GPU bandwidth, but it does have some limitations:

  • The heightmap is limited to a single texture, so it is not infinitely extensible.
  • Ultimately each heightmap pixel maps to a world voxel coordinate, and can only be interpolated down to smaller world coordinates by using terrain-type-specific noise (e.g. bumpy ground height noise).
  • Continual sampling of the heightmap at different resolutions using floating point can lead to sampling errors, especially near the edge of the heightmap texture (DirectX no longer supports the margin property of the sampler, so we have to include a gutter on the heightmap, further degrading its usable size – see the clamping sketch after this list).
  • Specific landform types cannot be usefully described – river channels, moraines, geological rock outcrops, and more generally any overhang, tunnel or cave.
  • The deformation of the heightmap by features such as rivers, roads and building platforms proved insurmountable – the heightmap resolution required in the near field of view was excessive, and the progressive halving of resolution in the medium and long field of view rendered these features visibly incorrect (rivers got wider and less distinct, roads became impossible to depict).
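As a small illustration of the gutter workaround mentioned in the third bullet, keeping lookups away from the border is just a clamp in texture space. This is a hedged sketch; gutterTexels and textureSize are hypothetical parameters describing the margin baked into the heightmap:

//
// Clamp heightmap lookups away from the texture border (sketch).
// gutterTexels and textureSize are hypothetical parameters.
//
float2 clampToGutter(float2 uv, float gutterTexels, float textureSize)
{
    // Width of the baked-in safety margin, in texture space.
    float margin = gutterTexels / textureSize;
    return clamp(uv, margin, 1.0 - margin);
}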

These limitations can sometimes be overcome.

Heightmap Maximum Size

I generated a set of tessellating heightmaps to create an infinitely extensible heightfield. This approach caused the following issues:

  • To keep the number of heightfield textures submitted to the shader within an acceptable limit, a progression of re-sampled heightmaps was needed, each the aggregation of four tessellating heightmaps, together with a selection algorithm to send the appropriately detailed map to the shader.
  • Each call to the shader required a minimum of four heightmaps, as it was unlikely that the geometry mesh being drawn would match any given heightmap footprint.
  • Sampling errors at the junctions between tessellations of different heightmap and geometry resolutions became quite a problem.

Heightmap Minimum Resolution

As the viewer came closer to the landscape surface, the resolution of the heightmap no longer gave a progressively more detailed landscape. Only the introduction of landscape-specific noise, using Perlin textures or other procedural height generation, solved this problem.
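A hedged sketch of what that can look like in the height lookup: blend a tiling noise texture into the sampled height and fade it with distance so only the near field gains the extra detail. The sampler names, the 64× noise tiling and the 200-unit fade range are all illustrative assumptions, not values from my engine:

//
// Near-field detail noise (sketch). Names and constants are illustrative.
//
float sampleHeightWithDetail(sampler2D heightSampler, sampler2D noiseSampler,
                             float2 uv, float distanceToCamera, float noiseAmplitude)
{
    // Base height from the heightmap, as before.
    float baseHeight = tex2Dlod(heightSampler, float4(uv, 0, 0)).r;

    // Perlin-style noise, tiled far more densely than the heightmap,
    // re-centred around zero so it raises and lowers the surface.
    float detailNoise = tex2Dlod(noiseSampler, float4(uv * 64.0, 0, 0)).r - 0.5;

    // Fade the detail out so distant terrain is driven by the heightmap alone.
    float detailWeight = saturate(1.0 - distanceToCamera / 200.0);

    return baseHeight + detailNoise * noiseAmplitude * detailWeight;
}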

Next Steps

To overcome these problems my next step will be to “go back to the beginning” and examine the techniques needed to render a large scale landscape without runtime height generation, using tiled meshes unique to each landscape section. This will require many more meshes, but its strength is that each landscape tile is a self-contained mesh, and meshes are generally cheap. With a mesh you can describe any shape at a known level of resolution (which can vary over the mesh) and accurately describe linear features within the design pipeline rather than at runtime.

Post Processing Distance Blurring

Distance blurring is an effective technique for covering up the inconsistencies of distant objects and giving them a “real world” feel.

To achieve this you need to render your scene to a texture instead of the screen (deferred rendering) and pass that texture through an XNA SpriteBatch (i.e. render it in 2D) with an attached pixel shader that blurs the colours.

To blur the colours based on relative depth from the current perspective, the pixel shader should sample the depth buffer (recorded in another texture generated during the deferred render and passed into the blur shader).

This technique is similar to attenuation, where the colours are allowed to fade out, but it affects the sharpness of edges as well: distant objects blur together, slightly out of focus.

The pixel shader pseudo-code can be found at http://xboxforums.create.msdn.com/forums/t/7015.aspx.
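For orientation, a minimal depth-weighted blur might look like the sketch below. This is an assumption-laden illustration rather than the forum's exact shader: sceneSampler, depthSampler and texelSize are hypothetical names, depth is assumed to be stored as a linear 0..1 value, and a fixed 3×3 kernel is widened by depth so near pixels stay sharp:

sampler2D sceneSampler : register(s0);   // the scene rendered to a texture
sampler2D depthSampler : register(s1);   // linear depth from the deferred pass
float2 texelSize;                        // 1 / backbuffer resolution

float4 PS_DepthBlur(float2 uv : TEXCOORD0) : COLOR0
{
    float depth = tex2D(depthSampler, uv).r;

    // Nearby pixels stay sharp; far pixels get the full blur radius.
    float radius = depth * 2.0;   // hypothetical tuning constant

    // Simple 3x3 box blur, widened by the depth-scaled radius.
    float4 sum = 0;
    for (int y = -1; y <= 1; y++)
    {
        for (int x = -1; x <= 1; x++)
        {
            float2 offset = float2(x, y) * texelSize * radius;
            sum += tex2D(sceneSampler, uv + offset);
        }
    }
    return sum / 9.0;
}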

[Image: trees blurring softly into the distance]

Trees softening into the distance. To get double the blur effect, just run the shader twice.

Attenuation

A simple trick to make the landscape look more realistic: attenuation. This is the process of slowly blending distant pixels into a general background colour, with a tendency to make distant features silhouette.

This example below (Snowdonia, Wales) shows the effect. Notice how the distant peaks lose definition and become two dimensional.

[Image: 20080526 - Phillip - Snowdonia 006 – distant peaks losing definition and becoming two dimensional]

This is a relatively simple calculation to be carried out in the pixel shader.

//
// Attenuation with distance
//
float attenuate(float4 vectorPosition, float3 cameraPosition, float lightRadius)
{
   // distance() needs matching dimensions, so take the xyz of the world position.
   float d = distance( cameraPosition, vectorPosition.xyz );

   // 1.0 at the camera, falling away to 0.0 at lightRadius with a d-squared curve.
   float attenuation = saturate( 1 - saturate( (d * d) / (lightRadius * lightRadius) ) );
   return attenuation * attenuation;
}

Taking the result of this function and blending like this

float attenuation = attenuate(ps_input.WorldPosition, Param_CameraPosition, Param_WorldSize);
float4 attenuationColor = float4(0.34f, 0.40f, 0.52f, 1.0f);
output.PixelColor = (output.PixelColor * attenuation) + (attenuationColor * (1 - attenuation));

will blend your pixel into a blue-ish tint (hard coded in this example, but should be a parameter). In my engine it gives a nice result.

[Images: attenuated landscape renders from my engine]

Large Scale Terrain

The basic problem of displaying large scale 3D landscapes is that the area of landscape visible to the camera grows with the square of the distance the view extends along the Z axis. This is why even “open” games like Oblivion and Skyrim, which seem to give vast views, need to use a complex set of backdrops and “blue screen” effects to give an impression of distance.

There are three principal techniques used to produce a reasonable scale of landscape. Others do exist, but they are mainly used for producing static renders of photorealistic landscapes (vterrain). The three main concepts are

  • Dynamic Level of Detail (DLOD) and Continuous Level of Detail (CLOD)
  • Landscape Tiling
  • Geo-Clipmapping

Ultimately these are all techniques for reducing the number of triangles visible to the camera while maintaining a tolerable level of fine detail close to the viewer. The work can be avoided by adding fogging to your view and arbitrarily reducing the viewer's depth of field, but this is very noticeable. A cleverer variant on fogging is to create such a crowded near-field view (forests etc.) that the viewer doesn't realise they have a restricted field of view.

Dynamic LOD

This technique makes use of the gradual merging of triangles into larger and larger triangles based on some measure of utility – typically a measurement of how near two adjacent triangles are in terms of their gradient or slope.

A really good example is Dustin Horne's C#-based tutorial, which illustrates how you can start with a dense grid of regular triangles and merge them progressively using some deterministic method until you have a mesh of variously larger and smaller triangles, optimised to the degree you are after. Dustin achieves this by simply making triangles further from the camera bigger than nearby ones.

A problem with this approach is that all the work is done on the CPU for pretty much every frame rendered – although you could do the culling work every nth frame, it is still hard work for the CPU and leaves your GPU sitting idle.

A second issue is that the CPU must keep access to the original highly detailed mesh, wipe it clean, and re-decompose it every time – this means there is a definite, and quite small, limit on the size of terrain that can be processed this way.

An alternative approach is to pre-simplify the terrain mesh using an algorithm that identifies adjacent triangles of similar slope and merges them. This approach is termed “mesh simplification using quadric error metrics” and many implementations exist, along with many, many scholarly articles. It is limited in the end though: it does not take into account the viewer's need to perceive a high level of detail, and is really just a pre-processing step before you get to the real meat of rendering your landscape. You might reduce your original triangle mesh by 50% this way, but your non-linear problem still exists.

[Image: a triangle mesh simplified to eliminate all adjacent similar triangles]

In all cases of dynamic LOD the mesh describes the landscape in its three dimensional form – each point on the mesh indicates a real point on the landscape.

Landscape Tiling

If you break up the original vast mesh of triangles into discrete portions you can apply mesh simplification and/or triangle reduction techniques to each tile in turn, either at runtime or at preparation time, and render each tile at the required resolution.

This is a very common technique and has advantages for terrain surface texturing, because each tile can have its own distinct set of textures. One disadvantage is the need to manage the LOD of each tile while moving through the landscape; this must be done on the CPU, but it is a relatively small load.

Two common problems which occur are

  • The need to stitch tiles of dissimilar detail together to prevent gaps between the tiles appearing
  • Each tile needs a separate render call and ultimately this is the limiting factor

Tile stitching problems can be overcome with three techniques

  • Drop a vertical “skirt” around each tile so the gaps are not visible (they always occur in the vertical plane). Nasty, but effective – it covers up the problem rather than solving it.
  • Make sure that adjacent tiles can only be one LOD resolution different from each other and calculate and draw a series of “bridging” triangles that match up tiles from the various LOD levels. This requires that the terrain tiles have been simplified in a regular form, not using a CLOD algorithm.
  • Make sure that each tile edge always forms a tessellating edge to the next lower resolution, and accept that adjacent high-LOD tiles have a slightly lower resolution join between them for the sake of joining seamlessly with a lower resolution tile further away.

As with dynamic LOD, the points on the resulting meshes indicate real points in space, with the X,Y,Z coordinates providing a real-world sample of that point's height.

Geo-Clipmapping

(I use this term to describe my technique although it is not quite accurate – but it does share some characteristics with the GPU Gems geo-clipmapping reference below.)

A version of this technique is described in the excellent free online resource GPU Gems and a variant by SkyTiger in his blog.

I use a slightly different approach, the key to which is a doughnut or annulus of triangles:

[Image: an annulus of triangles]

(the heavier weighted lines are an artefact of my picture and are not significant).

If you ignore the centre square, you can see that this shape can be scaled by a factor of 2 and the larger mesh will fit neatly around the smaller mesh – like this:

[Image: two nested annuli, the outer at twice the scale of the inner]

and this transform can be repeated until you reach a size you are comfortable with and that your GPU can accommodate.

[Image: a succession of nested annuli]

Each time we add a new annulus we are not creating any new geometry – we are just redrawing the same geometry at a different scale. Taking this one step further, it is clear that there is no need to store and draw the entire annulus – it is symmetrical in both axes, so we only need one quadrant, which we can scale and rotate to make the entire mesh.

[Image: a single quadrant of the annulus]

Because each annulus exactly fits the previous smaller scale, they don't need edge stitching like the landscape tile technique, and there is only one render call per annulus, each of which doubles the depth of field viewed.

Unlike the previous techniques, the mesh itself is just a structure on which the landscape is displayed – the X,Z points do not represent any specific point in the landscape model. In fact, because the viewer never moves in relation to the mesh (the camera is always locked to the centre of the progressively larger annulus meshes), each frame it is likely that the X,Z coordinates of any one vertex represent a slightly different point than before – the mesh is not a model of the terrain, it is a structure on which the terrain is rendered.

A good way to visualise what is going on here is to think that the viewer is surrounded by some massive skirt of triangles, spread out into the infinite distance, with each triangle getting steadily more coarse as it gets further away from the centre. As the viewer moves, their “skirt of triangles” moves with them, flowing over the lumps and bumps of the underlying landscape, getting higher or lower as the underlying terrain forces the skirt up and down.

[Image: the “skirt of triangles” flowing over the underlying terrain]

So what makes the individual triangles go up and down?

Unlike other techniques this absolutely relies on being able to access the Vertex Texture Fetch feature of HLSL Shader Model 3.

The height coordinate is supplied using a heightmap texture (i.e. a grayscale texture where the blackness of each point provides the measurement of how high it is). The texture is normally 4096×4096. As the viewpoint moves, for each triangle vertex, a calculation is made as follows

  • Where is this point in world space? This is the difference between the vertex X,Z coordinates and the viewer's X,Z coordinates.
  • Where is this point on the height map? This is simply a matter of scaling the world coordinate obtained above to the heightmap resolution.

The resulting pixel location on the heightmap is sampled, and the height is scaled from the 0→1 value held in the texel to the appropriate real-world height scale. This is then used as the Y coordinate for the vertex.

This is trickier than it sounds because we are juggling multiple different coordinate systems, but the outcome is a fully scaled terrain system.
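Putting the two questions above together, the vertex shader height lookup might look like the sketch below. This is a minimal illustration, not my engine's shader: the parameter names are assumptions, and the sign convention for combining vertex and camera coordinates depends on how the mesh is centred:

//
// Vertex Texture Fetch height lookup (sketch). Param_WorldSize is the
// world-unit span covered by the heightmap; Param_HeightScale is the
// real-world height of a sample value of 1.0. Names are illustrative.
//
float4x4  Param_ViewProjection;
float3    Param_CameraPosition;
float     Param_WorldSize;
float     Param_HeightScale;
sampler2D heightMapSampler;

float4 VS_Clipmap(float4 position : POSITION0) : POSITION
{
    // Step 1: where is this vertex in world space? The mesh is camera
    // locked, so combine the vertex X,Z offset with the viewer's X,Z.
    float2 worldXZ = position.xz + Param_CameraPosition.xz;

    // Step 2: where is this point on the heightmap? Scale the world
    // coordinate into 0..1 texture space.
    float2 uv = worldXZ / Param_WorldSize;

    // Sample the heightmap (VTF requires tex2Dlod) and scale the 0..1
    // value up to a real-world height for the Y coordinate.
    position.y = tex2Dlod(heightMapSampler, float4(uv, 0, 0)).r * Param_HeightScale;

    return mul(position, Param_ViewProjection);
}

In practice the plain tex2Dlod call is wrapped by the manual bilinear fetch described next.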

A real catch here is that many triangles will map to a single heightmap pixel, and this naturally leads to a Minecraft-style blocky landscape. In reality the vertex shader must sample the four nearest points to the calculated pixel and blend the heights between them. Normally HLSL would do this for us through the magic of linear texture sampling, but Vertex Texture Fetch does not support it, so you have to do it manually. A manual LERP is shown on Catalin's blog, but that implementation has a flaw that only becomes evident when geo-clipmapping: the example assumes that the texture POINT sampler is based on a round() (nearest whole integer) when in fact it is a floor() calculation (i.e. lowest whole integer). It took me a long while to work out why my landscape was doing some peculiar gyrations. Here's my long-winded solution:

//
// Bilinear LERP of a map.
//
float4 tex2Dlod_bilinear( sampler heightMapSampler, float4 uv, float texelSize)
{
    // Must round down to the nearest whole texel, because that is what
    // POINT sampling does (floor, not round).
    float4 truncUv = float4(
        trunc(uv.x / texelSize) * texelSize,
        trunc(uv.y / texelSize) * texelSize,
        uv.z,
        uv.w);

    // Sample the four texels surrounding the requested coordinate.
    float4 height00 = tex2Dlod(heightMapSampler, truncUv);

    float4 offsetUv = truncUv;
    offsetUv.x += texelSize;
    float4 height10 = tex2Dlod(heightMapSampler, offsetUv);

    offsetUv = truncUv;
    offsetUv.y += texelSize;
    float4 height01 = tex2Dlod(heightMapSampler, offsetUv);

    offsetUv = truncUv;
    offsetUv.x += texelSize;
    offsetUv.y += texelSize;
    float4 height11 = tex2Dlod(heightMapSampler, offsetUv);

    // Fractional position of the requested coordinate within the texel.
    float2 f = float2(
        (uv.x - truncUv.x) / texelSize,
        (uv.y - truncUv.y) / texelSize);

    // Blend along X, then along Y.
    float4 tA = lerp( height00, height10, f.x );
    float4 tB = lerp( height01, height11, f.x );

    return lerp( tA, tB, f.y );
}

Below you can see the annulus working in wireframe, and the watertight mesh that results once textured.

[Images: the annulus in wireframe, and the textured watertight mesh]

The really nice thing about this technique is that it is all carried out on the GPU – the same mesh is used, unaltered, in every frame, and the GPU just distorts it up and down based on the arithmetic of where the viewer is in relation to the heightmap texture.

Because the geometry does not represent any real-world locations it cannot be used to store normals, tangents, bitangents or any of the other useful information that terrain meshes normally carry. To use that information in the shader, each must be computed into a texture, loaded into the shader, and sampled using the same arithmetic as the heightmap.
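For example, a precomputed normal texture might be fetched like the sketch below, reusing the same world-to-texture mapping as the height lookup. The names are illustrative, and the normals are assumed to be packed into the texture's 0..1 colour range:

//
// Fetch a precomputed normal with the same arithmetic as the height
// lookup (sketch; names are illustrative assumptions).
//
float3 fetchNormal(sampler2D normalMapSampler, float2 worldXZ, float worldSize)
{
    // Identical world-to-texture mapping as used for the heightmap.
    float2 uv = worldXZ / worldSize;

    float3 encoded = tex2Dlod(normalMapSampler, float4(uv, 0, 0)).rgb;

    // Expand from the 0..1 colour range back to a -1..1 direction vector.
    return normalize(encoded * 2.0 - 1.0);
}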

Near Field Landscape Decoration

A challenge for large scale terrain engines is providing enough detail in the near field of view: grass, vegetation, rocks and other ground cover. In a brute-force implementation these would all be represented by individual meshes (perhaps using billboards or impostors for distant objects) in a series of 2D rectangular areas stored in a quadtree, rendered when inside the view frustum and within the feature-specific far clipping plane.

[Image: individually placed trees in a landscape]

A problem with this approach is the vast amount of storage required to hold every bush, rock and grass clump; even though retrieval via the quadtree is fast, it is still a huge data set.

A more scalable, fast method is to store the same quadtree entries but with only a small spatial sample of the decoration required. The sample can be randomly scattered in a 1.0 × 1.0 square at a density appropriate to the decoration type. It is then stored in the quadtree with its area of coverage but, crucially, the sample vegetation covers only a fraction of that area – the perception of continuous coverage is generated within the vertex shader.

In the above example individual trees do require constant geolocation, but smaller features, visible only at shorter ranges, can get away with a repeating wrap of a small set of meshes.

Assuming a maximum visible range of 100 world units for a feature, and a scattered sample of meshes inside a 1×1 world unit square, the engine can issue the DrawIndexedPrimitives() call when the camera position enters the larger area of coverage.

Inside the vertex shader the position passed in can be scaled and wrapped on a 100×100 scale to produce a repeating field of vegetation that seems fixed in space to the viewer, but in reality is being wrapped in the same way that repeating textures are wrapped by a texture sampler.

//
// Wrap a value into the range [lower, upper).
//
float wrap(float value, float lower, float upper)
{
  float dist = upper - lower;
  float times = floor((value - lower) / dist);

  return value - (times * dist);
}


VertexShaderOutput VertexShaderFunction_Decoration(VertexShaderInput input)
{
    VertexShaderOutput output;

    // Billboard rotation calculation from http://books.google.co.uk/books?id=08fx86eFQikC&pg=PA240&lpg=PA240&dq=billboard+rotation+inside+shader&source=bl&ots=0ApjfGYTyu&sig=wIGHzbjmn_B2S4koEc5nRgZIkVQ&hl=en&sa=X&ei=BtTmUPLSMK6k0AWln4HoCw&ved=0CHUQ6AEwCQ#v=onepage&q=billboard%20rotation%20inside%20shader&f=false

    // Wrap the coordinates based on the camera position, keeping the
    // vertex within the -0.5..0.5 unit square as the camera moves.
    input.Position.x = wrap(input.Position.x - frac(Param_CameraPosition.x / 100), -0.5, 0.5);
    input.Position.z = wrap(input.Position.z - frac(Param_CameraPosition.z / 100), -0.5, 0.5);

    // Scale X,Z up to the 100x100 repeat area.
    input.Position.x *= 100;
    input.Position.z *= 100;

    float4 worldPosition = mul(input.Position, Param_WorldMatrix);

    // ... (the remainder of the shader is elided in the original: it goes
    // on to apply the view and projection transforms and fill in the rest
    // of the output structure before returning it)

Example vertex shader fragment for wrapped surface features

[Image: a field of wrapped grass clumps]

The view above of grass clumps continues for 1000 world space units, wrapping the visible 100 world units continuously as the camera moves and giving the impression of endless grass.

[Image: specifically placed trees with wrapped grass billboards]

In the above, the trees are placed specifically in the landscape, but the grass is a wrapped, randomised set of billboards.

[Image: grass coverage thickening nearer the camera]

This shot shows more clearly that the grass coverage is thicker nearer the camera. This is done by rendering the same vertex buffer of grass, camera centred, at two different X,Z scales – the first at a world scale of 150 and a second pass at 30, giving a 5:1 density ratio nearer the camera. This technique reuses the existing vertex buffer and effect, changing only the World matrix.