Tessellation Shader with Noise

In an earlier post I used DirectX 11 tessellation shaders to generate a much higher density of landscape triangles within the shader, allowing my landscape tiles to show much higher detail when close to the camera. I also mentioned that, other than river surfaces, I hadn't actually used the extra geometry density for anything yet.

Now I have.

Simply by sampling a Perlin noise texture at two frequencies to add extra height within the tessellation shader, I can generate a much more interesting landscape close up.
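As a rough illustration, here is a minimal domain shader fragment in the same spirit; the texture, sampler, frequencies and amplitudes are illustrative, not the exact values I use.

Texture2D    noiseTexture;   // Perlin noise texture
SamplerState noiseSampler;

// Fragment from the domain shader, after the world position has been
// interpolated from the three patch control points. A domain shader has to
// use SampleLevel because no derivatives are available.
float2 lowFreqUV  = worldPosition.xz * 0.01f;   // broad undulations
float2 highFreqUV = worldPosition.xz * 0.15f;   // small lumps and bumps

float lowSample  = noiseTexture.SampleLevel(noiseSampler, lowFreqUV, 0).r;
float highSample = noiseTexture.SampleLevel(noiseSampler, highFreqUV, 0).r;

// Combine the two octaves and displace the vertex along the world up axis.
worldPosition.y += (lowSample * 2.0f) + (highSample * 0.25f);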

tess_noise

The various lumps and bumps in the foreground here are entirely generated using the tessellation shader and Perlin noise.

tess_noise2

In order to prevent obvious visual popping I generate a bump map from the Perlin noise at design time, and sample it at the same frequency as the height undulation. I then combine that bump sample with the landscape's more basic normal map to produce a combined normal for lighting.

Combining normals is not simply a matter of adding both together and renormalizing – this would give an average normal, not a combined normal. Luckily someone has already solved this for me. See here for the details: http://blog.selfshadow.com/publications/blending-in-detail/
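For reference, here is a sketch of the "Reoriented Normal Mapping" blend from that article, written as an HLSL helper; the function name and the assumption that both inputs are raw [0,1] texture samples are mine.

// Blend a detail (bump) normal sample over a base normal sample, following
// the Reoriented Normal Mapping formulation in the linked article.
float3 BlendDetailNormal(float3 baseSample, float3 detailSample)
{
    float3 t = baseSample   * float3( 2,  2, 2) + float3(-1, -1,  0);
    float3 u = detailSample * float3(-2, -2, 2) + float3( 1,  1, -1);
    return normalize(t * dot(t, u) - u * t.z);
}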

In order to maintain a good correspondence between close and distant lighting I make sure that I use a high tessellation factor when rendering my design-time “drapes” and normal maps for the distant landscape tiles. When I render them as a simple texture-with-normal-map in the distance it looks like a high detail image – the darkening effect of the distant normals gives an illusion of a lot of height variance that doesn't actually exist in the geometry.

Even with the varying height generated from the noise samplers I still place the vegetation using a pre-calculated Y coordinate rather than relying on height map sampling in the shader. I just repeat the height undulation code written in HLSL within my C# design pipeline (a sketch follows the list below) to get an accurate measurement of how high each tree will sit at runtime. There are a couple of reasons for this.

  • Trees placed on a slope that use a heightmap to determine their distance from the ground will tend to “skew” in the horizontal plane – the front of the tree is further downhill than the back of the tree, and every vertex is offset in the vertical plane based on how high off the ground it is. In the real world trees don't behave like that – they grow vertically without reference to the slope of the ground on either side.
  • Fewer texture lookups at runtime, traded off against an extra float passed in the Vertex instance stream. Given that the Vertex instance stream is a Matrix, this actually doesn't cost me anything.
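Here is a minimal C# sketch of the design-time height replication; the NoiseMap type, the method names and the frequency/amplitude constants are illustrative and must be kept in step with whatever the domain shader actually does.

// Hypothetical design-time helper. NoiseMap wraps the same Perlin noise
// bitmap the domain shader samples; the frequencies and amplitudes below
// must match the HLSL height undulation code or the trees will float or sink.
public static class TreePlacement
{
    public static float GroundHeightAt(float worldX, float worldZ,
                                       float baseHeight, NoiseMap noise)
    {
        // Mirror of the two-octave sample performed in the domain shader.
        float low  = noise.Sample(worldX * 0.01f, worldZ * 0.01f);
        float high = noise.Sample(worldX * 0.15f, worldZ * 0.15f);
        return baseHeight + (low * 2.0f) + (high * 0.25f);
    }
}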

Video on YouTube here.


Trees, sparse foliage, rivers and paths

A quick video of all of the pieces put together. The landscape now uses a tessellation shader to generate higher levels of detail, but I don't use those extra triangles at the moment other than on the river surface, which is animated from a height map. The sea doesn't yet use a tessellation shader but a fixed concentric-circle mesh, which achieves the same result with hard coding.

The trees and shrubbery are generated using the same techniques, planted using a Voronoi cell map. The textures over the landscape have been colour matched so they are not so obvious in transition – but I may have over-done this as it all looks the same basic colour now. Back to the drawing board on that.

Youtube Video


Tessellation Shaders and Rivers

Having made the move to DX11 and SharpDX I can now use tessellation shaders to make my landscape more interesting without having enormous Vertex Buffers. The principles of tessellation shaders are;

  1. Your vertexes are passed into a traditional Vertex Shader. This VS simply passes the vertexes through to the next stage, and typically does no transformations. This is because the vertexes you submit might not be rendered.
  2. The vertexes are passed through to a Hull Shader, where you can decide to discard the set of vertexes that are passed in. This stage is the first time you see the new features of tessellation – instead of operating on a single vertex at a time, you are passed an array of vertexes which form a triangle. Unless you want to do some culling here, you typically return everything you are passed, unaltered.
  3. The fun starts when your vertex patches (arrays of 3 vertexes) are passed into the Domain Shader. Here the DS is also passed a set of barycentric coordinates. The job of the domain shader is to output a new vertex based on the three passed in, using the barycentric coordinates supplied. This is where the majority of the work is done.

In the Domain Shader you end up doing all the work typically done in the Vertex Shader: matrix calculations and so on. Traditionally, once a vertex leaves the VS its values are interpolated across the triangle and the resultant data structure is passed into the Pixel Shader. Using the DS you are in charge of interpolating the data from the three patch points in any way you see fit. The patch points themselves are never actually included as vertexes for rendering – they are simply used as control points.
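A minimal skeleton of that three-stage arrangement for a triangle patch looks something like the sketch below; the structure names, the constant buffer and the fixed tessellation factor are illustrative (a real shader would derive the factor from distance to the camera).

cbuffer PerFrame
{
    float4x4 Param_ViewMatrix;
    float4x4 Param_ProjectionMatrix;
};

struct ControlPoint { float3 worldPos : POSITION; float2 uv : TEXCOORD0; };
struct DomainOutput { float4 position : SV_POSITION; float2 uv : TEXCOORD0; };

struct PatchConstants
{
    float edges[3] : SV_TessFactor;
    float inside   : SV_InsideTessFactor;
};

PatchConstants PatchConstantFunction(InputPatch<ControlPoint, 3> patch)
{
    PatchConstants pc;
    pc.edges[0] = pc.edges[1] = pc.edges[2] = 8.0f;   // subdivide each edge
    pc.inside = 8.0f;
    return pc;
}

[domain("tri")]
[partitioning("fractional_odd")]
[outputtopology("triangle_cw")]
[outputcontrolpoints(3)]
[patchconstantfunc("PatchConstantFunction")]
ControlPoint HS(InputPatch<ControlPoint, 3> patch, uint i : SV_OutputControlPointID)
{
    // Pass each control point through untouched.
    return patch[i];
}

[domain("tri")]
DomainOutput DS(PatchConstants pc, float3 bary : SV_DomainLocation,
                const OutputPatch<ControlPoint, 3> patch)
{
    DomainOutput o;

    // Interpolate the new vertex from the three control points using the
    // barycentric coordinates supplied by the tessellator.
    float3 worldPos = patch[0].worldPos * bary.x
                    + patch[1].worldPos * bary.y
                    + patch[2].worldPos * bary.z;
    o.uv = patch[0].uv * bary.x + patch[1].uv * bary.y + patch[2].uv * bary.z;

    // The view/projection work normally done in the vertex shader happens here.
    o.position = mul(float4(worldPos, 1.0f), Param_ViewMatrix);
    o.position = mul(o.position, Param_ProjectionMatrix);
    return o;
}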

A landscape made of large uniform triangles can be tessellated within the DS to include lots of new sub-triangles, and using noise sampling or other techniques you can give the new vertexes different heights to those that would have been interpolated across the larger triangle. In fact you can change everything – the texture coordinates and so on.

The key to tessellation shaders is to use the extra triangles to generate some useful content, not just to subdivide triangles for the sake of it. The problem is that the introduction of new detail needs to be done in a way which won't affect other rendered objects which might not use tessellation. For example, a tree will be placed on the landscape based on a sample of a height map. If the landscape introduces tessellation and generates new bumps and other interesting features, that tree placement will be wrong and the tree may sink into the ground or hover above it.

This is nice on a landscape but really comes into its own when painting water surfaces. These have a large appetite for vertexes, and need to vary those vertexes over time to provide animation. The DS can generate a dense mat of sub-triangles and use a height map for the water surface to generate ripples and normals.

Careful selection of the kinds of deformation introduced in the DS is important to getting visually good results. Luckily this doesn't apply to water surfaces, since nothing is dependent on their height. The image below uses a linear tessellation algorithm, so closer triangles are subdivided more heavily than distant ones. The bump, height and texture samplers are all linked to the game time and so change their sample coordinates to give the illusion of a flowing river.
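The time link is just an offset applied to the sample coordinate inside the domain shader, roughly as sketched here; the resource names, the Param_GameTime constant and the scale factors are illustrative.

// Illustrative fragment from the water domain shader. Param_GameTime is an
// elapsed-time constant; the scroll speed and displacement scale are made up.
float2 flowUV = worldPosition.xz * 0.05f + float2(Param_GameTime * 0.03f, 0.0f);
float ripple  = waterHeightMap.SampleLevel(linearSampler, flowUV, 0).r;
worldPosition.y += ripple * 0.2f;   // small vertical displacement for the ripples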

river

SharpDX Resources

As a long-term .net programmer I am used to a managed memory runtime environment and know all about the Dispose pattern used to release unmanaged resources. However, for some time I've been concerned by the memory leaks I was seeing in my SharpDX based 3D landscape.

This post is a reminder to all SharpDX .net programmers that Dispose() implemented in SharpDX objects does not work like the standard Dispose pattern for .net.

All SharpDX objects are wrapped COM objects and must be disposed of properly via reference counting. The .net runtime tracks all references to .net classes and collects their resources once there are no references to them from the managed call stack (or thread local storage). COM places the requirement to manage the lifetime of objects firmly on the programmer, who must add and remove reference counts as new references to an object are stored.

Using the standard .net Dispose pattern allows me to handle the COM de-reference without any problems; the SharpDX documentation tells us that Dispose() is used to decrement the reference counter.

What I forgot is that COM references aren't just created when you instantiate a new instance, but also when you receive an instance reference from the SharpDX factory. So the general pattern;

SharpDX.Direct3D11.Texture2D tex =
   new SharpDX.Direct3D11.Texture2D(device, textureDescription); // device and description created elsewhere
// .. do some work
tex.Dispose();

works perfectly well. What isn't so obvious is that;

SharpDX.Direct3D11.DepthStencilView dsv;
SharpDX.Direct3D11.RenderTargetView[] rtvs = 
   this.Context.OutputMerger.GetRenderTargets(8, out dsv);

to query a list of the existing render targets also increments the reference count of those RenderTargetView instances, and of the DepthStencilView. So you need to remember to dispose of those too.
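Continuing that fragment, the matching clean-up looks something like this (unused slots in the returned array can be null, hence the checks);

// Each view handed back by GetRenderTargets carries an extra COM reference,
// so each must be "disposed" once we have finished with it.
foreach (SharpDX.Direct3D11.RenderTargetView rtv in rtvs)
{
    if (rtv != null)
    {
        rtv.Dispose();   // decrements the COM reference count on the view
    }
}
if (dsv != null)
{
    dsv.Dispose();
}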

As a .net programmer this is odd, but manageable. What feels particularly weird is that Dispose() is designed in .net to be used when the programmer knows they hold the last reference to a managed class. If you have four references to a single instance, you clearly should not call Dispose() until you know you are holding the very last one, otherwise the other references will suddenly find themselves pointing at a disposed object.

When using Dispose() with the SharpDX classes you call it as soon as you know your particular reference is going to go out of scope – irrespective of how many other references to the same instance exist. Dispose() doesn't actually release any resources – it simply decrements the COM reference count. The SharpDX class factory does the resource release and COM deallocation.

Because it is not obvious when looking at code whether you are using a .net managed class or a wrapped SharpDX COM class, and because Dispose() means something entirely different in each case, I decided to create extension methods for all the SharpDX classes;

/// <summary>
/// Hides the confusion between Dispose (a .net concept) and handle counting (a com concept).
/// </summary>
/// <param name="leasedObject"></param>
public static void ReleaseReference(this SharpDX.Direct3D11.DeviceContext leasedObject)
{
    leasedObject.Dispose();
}

This means my code looks like this;

SharpDX.Direct3D11.Texture2D tex =
   new SharpDX.Direct3D11.Texture2D(device, textureDescription); // device and description created elsewhere
// .. do some work
tex.ReleaseReference();

Which is a lot more self-descriptive. I like making code more maintainable: this makes me stop and think when I look at it, and I instantly know what's going on. I wish the SharpDX bods had not reused the existing Dispose() method to do their reference counting – but I can see why they did.

More Grass, Denser Grass

The CodeMasters blog entry http://blog.codemasters.com/grid/10/rendering-fields-of-grass-in-grid-autosport/ made me think again about my Grass rendering using a Geometry Shader. I had followed the suggestions from Outerra http://outerra.blogspot.co.uk/2012/05/procedural-grass-rendering.html to generate my grass but CodeMasters suggested combining this approach with simple billboards.

Instead of each geometry shader triangle strip representing a single blade of grass, why not just output a quad with a nicely detailed, colourful texture? The textured quad might represent 5 or 10 blades of grass, rotated and scaled. This gives a massive increase in grass density, with better art, compared to the Outerra model.

With a bit of texture atlasing of various textures I can generate a very varied meadow with only some basic changes to my shader – and use fewer vertexes per location as well. Although the end result is clearly more “billboard” than “geometry”, it still achieves a much higher density of foliage.
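Here is a sketch of the sort of geometry shader involved; the camera-basis constants, quad size and four-column atlas layout are illustrative, and the real shader also applies rotation, scaling and wind animation.

cbuffer PerFrame
{
    float4x4 ViewProjection;
    float3   CameraRight;   // camera basis vectors used to billboard the quad
    float3   CameraUp;
};

struct GrassPoint      { float3 worldPos : POSITION; float atlasIndex : TEXCOORD0; };
struct GrassQuadVertex { float4 position : SV_POSITION; float2 uv : TEXCOORD0; };

[maxvertexcount(4)]
void GS(point GrassPoint input[1], inout TriangleStream<GrassQuadVertex> stream)
{
    const float halfWidth = 0.5f;
    const float height    = 1.0f;

    // Four corners of a camera-facing quad rooted at the ground point,
    // emitted as a two-triangle strip.
    float3 corners[4] =
    {
        input[0].worldPos - CameraRight * halfWidth,                      // bottom left
        input[0].worldPos - CameraRight * halfWidth + CameraUp * height,  // top left
        input[0].worldPos + CameraRight * halfWidth,                      // bottom right
        input[0].worldPos + CameraRight * halfWidth + CameraUp * height   // top right
    };

    // Pick one of four columns in the texture atlas.
    float u0 = input[0].atlasIndex * 0.25f;
    float2 uvs[4] = { float2(u0, 1), float2(u0, 0),
                      float2(u0 + 0.25f, 1), float2(u0 + 0.25f, 0) };

    for (int i = 0; i < 4; i++)
    {
        GrassQuadVertex v;
        v.position = mul(float4(corners[i], 1.0f), ViewProjection);
        v.uv = uvs[i];
        stream.Append(v);
    }
}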

Here is the outcome with a four-texture atlas;

meadowvertical01


densemeadow

This is animated in the normal way, using some Perlin noise textures to generate movement. The density of the grass is overwhelming here – it looks like a forest. Changing the texture atlas to something more “grassy”;

meadowvertical02

densemeadow2

Mmm. That's nice.

densemeadow3

Meadows underplanting shadowed trees, distant ocean and mountains. 80 fps.

Shadows

This is quite a difficult issue to deal with for a large landscape. The basics of shadow mapping are well documented at https://msdn.microsoft.com/en-gb/library/windows/desktop/ee416324(v=vs.85).aspx and briefly;

  1. Draw your scene in two passes. The first pass is drawn from the location of the light source, and the second from the location of the viewer.
  2. On the first (light) pass, you render to an offscreen texture and only draw the depth of each pixel, not the colour of the scene. The depth is calculated as output.Depth = (ps_input.Position.z / ps_input.Position.w); (a sketch of this pass follows the list).
  3. On the second (color) pass, you render your scene as normal to the viewport. However, you pass into your shader the texture you drew in Step 2, along with the View matrix you used when you drew Step 2.
  4. In the pixel shader, read the correct depth pixel from the depth texture you generated in Step 2 and compare it with the depth you calculate for your current pixel – if the depth stored in the texture is less than the depth you have calculated, then the pixel is in shadow and should be shaded darker.
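As a sketch, the interesting part of the light pass is its pixel shader, which writes depth rather than colour. One common arrangement is to pass a copy of the clip-space position through alongside SV_POSITION, since SV_POSITION has been converted to viewport coordinates by the time the pixel shader runs; the names here are illustrative.

struct DepthPixelInput
{
    float4 Position      : SV_POSITION;
    float4 DepthPosition : TEXCOORD0;   // copy of the clip-space position
};

float4 DepthPS(DepthPixelInput ps_input) : SV_TARGET
{
    // Normalised depth of this pixel, written to the offscreen depth texture.
    float depth = ps_input.DepthPosition.z / ps_input.DepthPosition.w;
    return float4(depth, depth, depth, 1.0f);
}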

This is fairly straightforward, but how does it work? How do you actually use the depth texture you drew in Step 2? The first step is to work out which pixel in your standard rendering pass (Step 3) is equivalent to the pixel you drew in Step 2. Since both were rendered from different view points (and typically using different projection matrixes), the actual pixel being drawn in your Pixel Shader has no obvious relationship to the one you drew in the earlier light pass.

The key is to pass into the vertex shader the View and Projection matrix values you used to generate your light pass in Step 2. You then calculate the vertex's position to generate a value which would have been the same for that vertex in the Step 2 vertex shader.

Vertex Shader Fragment

In Step 2 (depth pass) you would have calculated;

output.Position = mul(vertexPosition, Param_WorldMatrix);
output.Position = mul(output.Position, Param_ViewMatrix);
output.Position = mul(output.Position, Param_ProjectionMatrix);

so in Step 3 (color pass) you need to calculate the same value, passing in the matrixes you used in Step 2 as a new set of parameters, “Param_LightXXXXMatrix”;

output.LightViewPosition = mul(vertexPosition, Param_WorldMatrix);
output.LightViewPosition = mul(output.LightViewPosition, Param_LightViewMatrix);
output.LightViewPosition = mul(output.LightViewPosition, Param_LightProjectionMatrix);

So your pixel shader will now receive the LightViewPosition parameter as well as the Position you would normally calculate in your vertex shader for this pass. The clever part comes in the pixel shader, where you use the passed-in LightViewPosition to generate a texture coordinate that can be used to read the correct pixel from the depth map texture;

Pixel Shader Fragment

This calculation uses the LightViewPosition you calculated in the vertex shader and generates a coordinate correct for sampling the depth map texture.

float2 projectTexCoord;
projectTexCoord.x = ((LightViewPosition.x / LightViewPosition.w) / 2.0f) + 0.5f;
projectTexCoord.y = ((-LightViewPosition.y / LightViewPosition.w) / 2.0f) + 0.5f;

This is called Texture Projection and this trick can be used anywhere you have a texture that is generated via a different View and Projection matrix.

Once you’ve got the texture coordinate for your depth map, you just read out the depth you recorded in Step 2 and compare it to the value you are currently about to write to your color pass.

Pixel Shader Fragment

So now we can sample the depth texture and read back the depth we calculated when we generated the same pixel from the light's position.

float realDistance = (LightViewPosition.z / LightViewPosition.w);
// Note the use of the free interpolated comparison method.
return depthMap.SampleCmp(DepthMapSampler, projectTexCoord, realDistance - depthBias);

So what's with the special SampleCmp? We expect, because we used a different location and projection matrix when drawing the depth map, that the pixel we sample from the depth map won't be an exact 1:1 match for the pixel we are drawing to the scene. It may be skewed or scaled such that it represents a slightly different world position. Typically you would do a PCF set of four samples around the point you have calculated and take the average – this gives nice anti-aliasing. However, we need to remember that the depth map does not contain color – it contains depths. Applying anti-aliasing concepts directly to a depth map would generate nonsense: two pixels lying next to each other in the depth map might represent depth calculations for two objects very far apart in world space – a nearby object and a really distant object might record wildly different depth values only one pixel apart.

Luckily, in Shader Model 5 the designers gave us SampleCmp, which performs a four-tap PCF sample in hardware, but instead of returning a weighted average of the values it samples, it returns a weighted average of the per-tap comparisons against the depth value we pass in (the third parameter). This is much more useful and gives our shadows a nice soft edge.
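One small wrinkle: SampleCmp only works against a comparison sampler, so the depth map sampler is declared as a SamplerComparisonState on the HLSL side, with the matching sampler object created on the CPU using a comparison filter and a LESS_EQUAL comparison function. The register assignments below are illustrative.

Texture2D              depthMap        : register(t0);
SamplerComparisonState DepthMapSampler : register(s0);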

Shaking Shadows

aka Shadow Trembling, Shadow Shaking etc.

This is visible when you swivel or move the camera. It's generated by the same problem which caused us to use the SampleCmp function described previously: the shadow map does not have a 1:1 mapping between its pixels and the pixels being rendered in the color pass. Slight variations in the floating point calculations between the light projection and the camera projection matrixes lead to pixels moving in and out of shadow seemingly at random around the edges of a shaded area.

This has a relatively simple workaround – don't change the light position or orientation other than in whole-pixel steps. This is fully documented in the link referenced earlier. Implementing the “stable light frustum” calculations has an awesome side benefit – because the light matrixes don't change every time the camera matrixes change, you can afford to redraw your shadow map only once every 10 or 20 frames (or when the camera substantially changes orientation or location). This means you can go to town on the GPU cost of calculating the shadows, bringing multiple cascading shadow maps into play, but recalculating only the very nearest ones, and then only quite infrequently.
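Here is a rough C# sketch of the texel-snapping part of that (the MSDN article linked above covers it in full). It assumes a square orthographic light frustum rendered into a square shadow map; all the variable names are illustrative.

// Build the light's view and projection as usual (lightPosition, lightTarget,
// frustumSize, nearZ, farZ and shadowMapSize are assumed to exist elsewhere).
SharpDX.Matrix lightView = SharpDX.Matrix.LookAtLH(lightPosition, lightTarget, SharpDX.Vector3.UnitY);
SharpDX.Matrix lightProj = SharpDX.Matrix.OrthoLH(frustumSize, frustumSize, nearZ, farZ);
SharpDX.Matrix lightViewProj = lightView * lightProj;

// Project the world origin into shadow-map texel space and measure how far it
// sits from a whole texel boundary.
SharpDX.Vector3 origin = SharpDX.Vector3.TransformCoordinate(SharpDX.Vector3.Zero, lightViewProj);
float texelsPerUnit = shadowMapSize / 2.0f;   // clip space is two units wide
float dx = (float)System.Math.Round(origin.X * texelsPerUnit) / texelsPerUnit - origin.X;
float dy = (float)System.Math.Round(origin.Y * texelsPerUnit) / texelsPerUnit - origin.Y;

// Nudge the projection by that sub-texel amount so the shadow map only ever
// moves in whole-texel steps as the camera moves.
lightViewProj *= SharpDX.Matrix.Translation(dx, dy, 0.0f);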

Examples

These examples use false color to indicate which of the three shadow maps is being used to calculate the shadows;

csm1

Here with more natural colours

csm2


Naturalistic World Partitioning using Pseudo-Voronoi

When considering vegetation coverage, woodland coverage, lakes, river courses and all other visual features, you need to partition your landscape into regions. The regions need to look naturalistic (i.e. not hexagons, grids or other repeating patterns) but also need to be evenly sized.

A nice trick is to use a random scatter of points and then triangulate them

Graph_GenerateFactory

Now carry out a “pseudo-Voronoi” algorithm to generate a pleasing set of regular partitions;

PseudoVoroni

Each of these cells can be used as a landscape partition.

What is a pseudo-Voronoi algorithm? In this case I don't want a mathematically accurate Voronoi graph, so I just take the centroid of each triangle and join the centroids of adjacent triangles together. In the diagram above you can see the Voronoi cell walls in white and the underlying original triangulation in dark blue.
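A minimal C# sketch of that construction, assuming a list of triangles from the earlier random-scatter triangulation; the Triangle type and its helper methods are illustrative.

using System;
using System.Collections.Generic;
using System.Linq;
using SharpDX;

// Illustrative triangle type: three corner points from the scattered point set.
public class Triangle
{
    public Vector2 A, B, C;

    public Vector2 Centroid() { return (A + B + C) / 3.0f; }

    // Two triangles are adjacent when they have exactly two corners in common.
    public bool SharesEdgeWith(Triangle other)
    {
        var theirs = new[] { other.A, other.B, other.C };
        return new[] { A, B, C }.Count(p => theirs.Contains(p)) == 2;
    }
}

public static class PseudoVoronoi
{
    // Join the centroids of every pair of adjacent triangles: each joining
    // segment becomes one white "cell wall" in the diagram above.
    public static List<Tuple<Vector2, Vector2>> CellWalls(IList<Triangle> triangles)
    {
        var walls = new List<Tuple<Vector2, Vector2>>();
        for (int i = 0; i < triangles.Count; i++)
        {
            for (int j = i + 1; j < triangles.Count; j++)
            {
                if (triangles[i].SharesEdgeWith(triangles[j]))
                {
                    walls.Add(Tuple.Create(triangles[i].Centroid(), triangles[j].Centroid()));
                }
            }
        }
        return walls;
    }
}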

You can then attach a set of properties to each cell based on its height and the attributes of the adjacent Voronoi cells.

For river courses, the Voronoi cell borders form a network of routes which an A* routing algorithm can use to describe all the possible courses of a river. Each Voronoi cell vertex has only three possible adjacent points, which makes route finding much faster and helps provide a nice river pattern when the result is used as a set of control points for a spline.

Credit to http://www-cs-students.stanford.edu/~amitp/game-programming/polygon-map-generation/ which inspired this approach.