Rendering Fields of Grass using DirectX 11 in GRID Autosport
Richard Kettlewell
Codemasters
MOTIVATION
Current implementation engineered for PS3/Xbox 360
High-end PC can do much better
DirectX 11
Compute Shaders
Lots of interesting techniques online
Outerra (http://goo.gl/tYlcjN)
Nvidia (http://goo.gl/F43iTY)
GOALS
High density
Keep all data on GPU, for efficiency
Get rid of polygonal look of terrain
Flat polys with grass textures are unconvincing
Interaction
Wind
Deformation
OUR APPROACH
Generate
Populate Append Buffer with blades of grass
Render
Read Append Buffer
Construct geometry in Vertex Shader
Rasterise using Alpha-To-Coverage
No sorting required
ART PROCESS
Simple world space map
RGB defines grass colour
Alpha defines grass height
2K x 2K
Wastes resolution
Simplest approach given time constraints
UV mapping onto terrain would be better
Doesn’t scale well for large point-to-point tracks
GENERATING GRASS
Render Terrain using custom shader
Orthographic top-down render, centred around viewer
Output to Append Buffer, not Render Target
Every pixel could be a blade of grass
Debug mode outputs to render target, for visualisation
Control density using viewport size
Spreads the pixels over more/less distance
Need to cull unimportant blades
Set Scissor Rectangle around view segment
Frustum cull against main scene camera
Read world space map (discard if alpha < threshold)
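The link between viewport size and density is simple arithmetic. The following is a minimal CPU-side sketch in C++, not the shader itself; the helper names and the assumption of a square viewport with one candidate blade per pixel are illustrative only.

```cpp
#include <cstdint>

// Hypothetical helper: world-space distance between neighbouring candidate
// blades when an orthographic region `coveredMetres` wide is rendered into a
// viewport `viewportPixels` wide (one candidate blade per pixel).
inline float bladeSpacing(float coveredMetres, std::uint32_t viewportPixels)
{
    return coveredMetres / static_cast<float>(viewportPixels);
}

// Hypothetical helper: candidate blades per square metre for a square viewport,
// before any culling or alpha-threshold discard.
inline float bladesPerSquareMetre(float coveredMetres, std::uint32_t viewportPixels)
{
    const float spacing = bladeSpacing(coveredMetres, viewportPixels);
    return 1.0f / (spacing * spacing);
}
```

For example, covering 128 m with a 512-pixel viewport gives one candidate every 0.25 m; halving the covered distance doubles the linear density.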
Scissor Rectangle
Create bounding box from circle segment
View position
2 extents
Any axis intersection
Extra points around viewer
Fixes problem when looking down
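The bounding box of a circle segment can be sketched on the CPU as follows. This is an assumed reconstruction from the bullet points above, in world units; the function names, the padding parameter and the angle convention (radians, measured from +X) are hypothetical, and the conversion to scissor pixels is left out.

```cpp
#include <algorithm>
#include <cmath>

struct Rect { float minX, minY, maxX, maxY; };

// Expand the rectangle so that it contains the point (x, y).
inline void grow(Rect& r, float x, float y)
{
    r.minX = std::min(r.minX, x); r.maxX = std::max(r.maxX, x);
    r.minY = std::min(r.minY, y); r.maxY = std::max(r.maxY, y);
}

// True if `angle` lies within `halfAngle` of `heading` (all in radians).
inline bool angleInSector(float angle, float heading, float halfAngle)
{
    const float pi = 3.14159265f;
    float d = std::fmod(angle - heading, 2.0f * pi);
    if (d > pi)  d -= 2.0f * pi;
    if (d < -pi) d += 2.0f * pi;
    return std::fabs(d) <= halfAngle;
}

inline Rect sectorBounds(float viewX, float viewY, float heading,
                         float halfAngle, float radius, float viewerPadding)
{
    const float pi = 3.14159265f;
    Rect r{ viewX, viewY, viewX, viewY };                   // view position
    const float ends[2] = { heading - halfAngle, heading + halfAngle };
    for (float a : ends)                                    // 2 extents
        grow(r, viewX + radius * std::cos(a), viewY + radius * std::sin(a));
    for (int k = 0; k < 4; ++k) {                           // any axis intersection
        const float a = 0.5f * pi * static_cast<float>(k);
        if (angleInSector(a, heading, halfAngle))
            grow(r, viewX + radius * std::cos(a), viewY + radius * std::sin(a));
    }
    // Extra points around the viewer (assumed here to be a simple square pad).
    grow(r, viewX - viewerPadding, viewY - viewerPadding);
    grow(r, viewX + viewerPadding, viewY + viewerPadding);
    return r;
}
```

The axis test matters when the arc bulges past both end points: a segment facing straight along +Y reaches its maximum Y at the axis crossing, not at either extent.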
GENERATING GRASS
LODs
Vital for performance
Distance based
Each LOD discards increasing amounts of grass
Remaining blades are scaled up to fill gaps
Feather distances randomly, to break up transitions
Randomise distance calculation
Fade grass height towards zero over last 15%
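One possible shape for the per-blade LOD decision is sketched below in C++ as a CPU-side model. The talk does not give the discard ratios or scale factors, so the halving per LOD, the square-root scale-up, the fixed band width and all names here are assumptions; only the random feathering and the 15% height fade come from the bullets above.

```cpp
#include <algorithm>
#include <cmath>

struct LodResult {
    bool  keep;        // false: blade is discarded
    float scale;       // scale-up applied to surviving blades
    float heightFade;  // 1 near the viewer, 0 at maxDistance
};

// rndFeather and rndSelect are per-blade random numbers in [0, 1).
inline LodResult evaluateLod(float distance, float maxDistance, float lodBandWidth,
                             float rndFeather, float rndSelect, float featherRange)
{
    // Feather: jitter the distance so LOD boundaries are not hard rings.
    const float d = distance + (rndFeather - 0.5f) * featherRange;
    if (d >= maxDistance)
        return { false, 0.0f, 0.0f };

    // Assumption: LOD n keeps one blade in 2^n (clamped to avoid huge shifts).
    const int lod = std::min(std::max(static_cast<int>(d / lodBandWidth), 0), 8);
    const float keepFraction = 1.0f / static_cast<float>(1 << lod);
    if (rndSelect >= keepFraction)
        return { false, 0.0f, 0.0f };

    // Assumption: survivors grow by sqrt(1 / keepFraction) to fill the gaps.
    const float scale = std::sqrt(1.0f / keepFraction);

    // Fade grass height towards zero over the last 15% of the range.
    const float fadeStart = 0.85f * maxDistance;
    const float heightFade =
        d <= fadeStart ? 1.0f : (maxDistance - d) / (maxDistance - fadeStart);
    return { true, scale, heightFade };
}
```

With a 100 m range and 25 m bands, a blade at 60 m falls in LOD 2, so it survives only if its selection value is below 0.25 and is then drawn at twice the size.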
RANDOMISATION
Generate texture at load time
Fill a 64x64 RGBA texture with rand()
Provides 4 random numbers per grass blade
Align texture to orthographic projection
Used for
Rotation
Position
Scale
Varying Albedo
Etc
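A load-time noise texture of this kind could be built and sampled roughly as follows. This is a CPU-side C++ sketch; the 8-bit channel format, the seed parameter and the modulo tiling over the Generate pixel grid are assumptions, since the talk only says the texture is filled with rand() and aligned to the orthographic projection.

```cpp
#include <array>
#include <cstdint>
#include <cstdlib>

constexpr std::uint32_t kNoiseSize = 64;  // 64x64 texels, as in the talk

struct Rgba8 { std::uint8_t r, g, b, a; };  // 4 random numbers per blade
using NoiseTexture = std::array<Rgba8, kNoiseSize * kNoiseSize>;

// Fill the texture once at load time using rand().
inline NoiseTexture makeNoiseTexture(unsigned seed)
{
    std::srand(seed);
    NoiseTexture tex{};
    for (Rgba8& t : tex) {
        t.r = static_cast<std::uint8_t>(std::rand() & 0xFF);
        t.g = static_cast<std::uint8_t>(std::rand() & 0xFF);
        t.b = static_cast<std::uint8_t>(std::rand() & 0xFF);
        t.a = static_cast<std::uint8_t>(std::rand() & 0xFF);
    }
    return tex;
}

// Assumed lookup: the texture tiles across the Generate viewport, so pixel
// (px, py) reads texel (px mod 64, py mod 64).
inline const Rgba8& noiseAt(const NoiseTexture& tex, std::uint32_t px, std::uint32_t py)
{
    return tex[(py % kNoiseSize) * kNoiseSize + (px % kNoiseSize)];
}
```

The four channels can then drive rotation, position offset, scale and albedo variation for the blade generated at that pixel.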
APPEND BUFFER
Represents every valid pixel from Generate stage
DirectX 11 Structured Buffer
Each element represents one grass blade
Output to this instead of Render Target
16 byte aligned
Pack 16bit values where possible
f32tof16
f16tof32
struct Instance
{
    float3 position;
    float specular;
    float3 albedo;
    uint vertexOffsetAndSkew;
    float2 rotation;
    float2 scale;
};
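To illustrate the layout and the 16-bit packing, here is a CPU-side C++ mirror of the structure with a simplified half-float conversion. It is a sketch under assumptions: which half of vertexOffsetAndSkew holds which value is not stated in the talk, the helper names are hypothetical, and the conversion truncates the mantissa and flushes denormals to zero, which may differ from how the GPU's f32tof16 rounds.

```cpp
#include <cstdint>
#include <cstring>

// CPU mirror of the Instance element: 12 x 4 bytes = 48 bytes.
struct InstanceCpu {
    float         position[3];
    float         specular;
    float         albedo[3];
    std::uint32_t vertexOffsetAndSkew;  // two 16-bit floats packed together
    float         rotation[2];          // sin, cos
    float         scale[2];
};
static_assert(sizeof(InstanceCpu) == 48, "expected three 16-byte rows");

// Simplified float -> half: truncates, flushes tiny values to zero,
// maps overflow (and NaN) to infinity.
inline std::uint16_t floatToHalf(float f)
{
    std::uint32_t x;
    std::memcpy(&x, &f, sizeof x);
    const std::uint32_t sign = (x >> 16) & 0x8000u;
    const std::int32_t  exp  = static_cast<std::int32_t>((x >> 23) & 0xFFu) - 127 + 15;
    const std::uint32_t mant = x & 0x7FFFFFu;
    if (exp <= 0)  return static_cast<std::uint16_t>(sign);
    if (exp >= 31) return static_cast<std::uint16_t>(sign | 0x7C00u);
    return static_cast<std::uint16_t>(sign | (static_cast<std::uint32_t>(exp) << 10) | (mant >> 13));
}

// Half -> float for the values produced above (denormals read as zero).
inline float halfToFloat(std::uint16_t h)
{
    const std::uint32_t sign = (h & 0x8000u) << 16;
    const std::uint32_t exp  = (h >> 10) & 0x1Fu;
    const std::uint32_t mant = h & 0x3FFu;
    std::uint32_t x = sign;
    if (exp == 31)      x |= 0x7F800000u | (mant << 13);
    else if (exp != 0)  x |= ((exp - 15 + 127) << 23) | (mant << 13);
    float f;
    std::memcpy(&f, &x, sizeof f);
    return f;
}

// Assumed layout: first value in the low 16 bits, second in the high 16 bits.
inline std::uint32_t packHalf2(float lo, float hi)
{
    return static_cast<std::uint32_t>(floatToHalf(lo)) |
           (static_cast<std::uint32_t>(floatToHalf(hi)) << 16);
}
```

In the shader, the equivalent would be f32tof16 when writing in the Generate stage and f16tof32 when reading in the Vertex Shader.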
DrawInstancedIndirect
Allows the GPU to control Draw arguments
Because we don’t know how many grass instances the GPU generated
Avoids copying the AppendBuffer structure count back to CPU
Same as DrawInstanced, except arguments come from GPU buffer
VertexCountPerInstance
InstanceCount
StartVertexLocation
StartInstanceLocation
Create ID3D11Buffer with D3D11_RESOURCE_MISC_DRAWINDIRECT_ARGS
DrawInstancedIndirect
Use CopyStructureCount to copy size of Append Buffer into Constant Buffer
Populate buffer using Compute Shader
Dispatch a single thread
Is there a better way?
// buffer
RWBuffer<uint> g_drawInstancedBuffer : register( u0 );
// vertex buffer counter
cbuffer BufferCounter : register( b12 )
{
    uint numInstances;
}
[numthreads( 1, 1, 1 )]
void cp()
{
    g_drawInstancedBuffer[ 0 ] = 6u;            // vertexCountPerInstance
    g_drawInstancedBuffer[ 1 ] = numInstances;  // instanceCount
    g_drawInstancedBuffer[ 2 ] = 0u;            // startVertexLocation
    g_drawInstancedBuffer[ 3 ] = 0u;            // startInstanceLocation
}
DrawInstancedIndirect
Avoid dispatching high instance counts with low vertex counts
http://www.slideshare.net/DevCentralAMD/vertex-shader-tricks-bill-bilodeau
Prefer dispatching a single large instance
Reconstruct vertex/instance ID in Vertex Shader
Use SV_VertexID
// buffer
RWBuffer<uint> g_drawInstancedBuffer : register( u0 );
// vertex buffer counter
cbuffer BufferCounter : register( b12 )
{
    uint numInstances;
}
[numthreads( 1, 1, 1 )]
void cp()
{
    g_drawInstancedBuffer[ 0 ] = 6u * numInstances;  // vertexCountPerInstance
    g_drawInstancedBuffer[ 1 ] = 1u;                 // instanceCount
    g_drawInstancedBuffer[ 2 ] = 0u;                 // startVertexLocation
    g_drawInstancedBuffer[ 3 ] = 0u;                 // startInstanceLocation
}
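The single-large-instance arguments and the matching vertex ID reconstruction can be modelled on the CPU like this. It is a C++ sketch with hypothetical names; the 6 vertices per blade come from the shader above, while the division and modulo split of SV_VertexID is an assumed (though natural) way to recover the blade and corner indices.

```cpp
#include <cstdint>

// Mirrors the four UINT arguments consumed by DrawInstancedIndirect.
struct DrawInstancedIndirectArgs {
    std::uint32_t vertexCountPerInstance;
    std::uint32_t instanceCount;
    std::uint32_t startVertexLocation;
    std::uint32_t startInstanceLocation;
};

constexpr std::uint32_t kVertsPerBlade = 6;  // value used in the shader above

// One big instance holding every blade, instead of numBlades small instances.
inline DrawInstancedIndirectArgs singleLargeInstance(std::uint32_t numBlades)
{
    return { kVertsPerBlade * numBlades, 1u, 0u, 0u };
}

struct BladeVertex {
    std::uint32_t bladeIndex;   // which Append Buffer element to read
    std::uint32_t cornerIndex;  // which of the blade's 6 vertices this is
};

// What the Vertex Shader would do with SV_VertexID.
inline BladeVertex decodeVertexId(std::uint32_t vertexId)
{
    return { vertexId / kVertsPerBlade, vertexId % kVertsPerBlade };
}
```

For 1000 blades this draws 6000 vertices in one instance, and vertex 5999 resolves to blade 999, corner 5.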
INITIAL RESULTS
GEOMETRY OR FINS?
Initial implementation used geometry
Inspired by Outerra tech
Heavily vertex bound
Difficult to make grass look soft with few verts per blade
Difficult to achieve desired grass density
We were using only 5 verts per blade
Contributing to spiky results
Could tessellate close grass?
Outerra grass geometry
GEOMETRY OR FINS?
Use fins instead?
More traditional approach to rendering grass
Alpha Testing/ATOC
Each ‘blade’ now represents one billboard
Easier to add variety via UV shifting
Softer grass can be painted into texture
Most of existing Generation tech still valid
RENDERING
Vertex data hardcoded in shader
Use SV_VertexID to generate it
Construct matrix from position and rotation
Apply scale to all verts
Apply skew to top verts
Texture format (DXT1)
Red: Diffuse tint
Green: Specular map
Blue: Alpha
Negative LOD Bias
3/4 mip
// sin/cos for rotation matrix
float s = instance.rotation.x;
float c = instance.rotation.y;
// world matrix
float3 worldPosition = instance.position;
float4x4 m = float4x4(
    float4(  c, 0, s, worldPosition.x ),
    float4(  0, 1, 0, worldPosition.y ),
    float4( -s, 0, c, worldPosition.z ),
    float4(  0, 0, 0, 1 )
);
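The per-vertex work described above can be sketched on the CPU as follows. This C++ model makes several assumptions that are not in the talk: the hardcoded corner table (a unit quad as two triangles), skew as a sideways offset in local X, and the matrix being applied to a column vector, which is what the translation in the fourth column suggests.

```cpp
#include <cstdint>

struct Float2 { float x, y; };
struct Float3 { float x, y, z; };

// Hypothetical hardcoded vertex data: two triangles forming a billboard,
// local x in [-0.5, 0.5], local y in [0, 1] (y == 1 marks the top verts).
constexpr Float2 kCorners[6] = {
    { -0.5f, 0.0f }, { 0.5f, 0.0f }, { -0.5f, 1.0f },
    { -0.5f, 1.0f }, { 0.5f, 0.0f }, {  0.5f, 1.0f },
};

// s and c are the sin/cos stored in instance.rotation.
inline Float3 bladeVertexWorld(std::uint32_t cornerIndex, Float3 position,
                               float s, float c, Float2 scale, float skew)
{
    const Float2 corner = kCorners[cornerIndex % 6u];
    float lx = corner.x * scale.x;        // scale applied to all verts
    const float ly = corner.y * scale.y;
    lx += skew * corner.y;                // skew applied to top verts only
    // Rows of the world matrix applied to the column vector (lx, ly, 0, 1).
    return { c * lx + position.x,
             ly + position.y,
             -s * lx + position.z };
}
```

With no rotation (s = 0, c = 1), a blade at (10, 2, 5) with scale (2, 3) and skew 0.5 places its top-right corner at (11.5, 5, 5).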
LIGHTING
Calculated per instance
More efficient than per vertex/per pixel
Inaccurate for large billboards
Normals
Combine terrain and billboard normal
Randomise albedo
Small amount of noise makes big difference
Darken terrain under grass
Terrain shader reads grass map for height
Specular
Use terrain normal and apply random reduction factor
Fade effects in distance for smooth transition to terrain
SHADOWS
Game creates a screen-space mask from depth pre-pass
Pixel Shaders read mask instead of cascades
One sample per grass instance
What if grass instance is partially occluded?
Solution
Read shadow cascades directly
SSAO
Same problem as shadows (screen-space mask)
Expensive to add grass to depth pre-pass
Must cope with screen-space problem (no shadow cascades!)
SSAO also includes undercar shadow
Leaks around car edges
Solution
Use depth buffer to compare 2 sample points
Read SSAO from sample with furthest depth value
Solves car occluding grass
SELF OCCLUSION
Tall grass should occlude neighbours
Treat height map like normal map
Sample neighbours to estimate slope
Use normal and sun direction to estimate occlusion
Artist controlled strength
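One way to turn these bullets into numbers is sketched below as a CPU-side C++ function. The exact formula is not given in the talk, so the central-difference slope, the comparison against flat ground and every name here are assumptions; sunDir is taken to be a unit vector pointing towards the sun, with Y up.

```cpp
#include <algorithm>
#include <cmath>

struct Vec3 { float x, y, z; };

// heights: row-major grid of grass heights, width * height samples,
// spaced texelSize apart. Returns a light multiplier in [0, 1].
inline float selfOcclusion(const float* heights, int width, int height,
                           int x, int y, float texelSize,
                           Vec3 sunDir, float strength)
{
    auto h = [&](int ix, int iy) {
        ix = std::min(std::max(ix, 0), width - 1);
        iy = std::min(std::max(iy, 0), height - 1);
        return heights[iy * width + ix];
    };
    // Sample neighbours to estimate slope (central differences).
    const float dhdx = (h(x + 1, y) - h(x - 1, y)) / (2.0f * texelSize);
    const float dhdz = (h(x, y + 1) - h(x, y - 1)) / (2.0f * texelSize);

    // Treat the height map like a normal map: normal of the surface y = h(x, z).
    const float len = std::sqrt(dhdx * dhdx + 1.0f + dhdz * dhdz);
    const Vec3 n{ -dhdx / len, 1.0f / len, -dhdz / len };

    // Compare light on the sloped surface with light on flat ground.
    const float lit  = n.x * sunDir.x + n.y * sunDir.y + n.z * sunDir.z;
    const float flat = sunDir.y;
    const float occlusion =
        std::min(std::max((flat - lit) * strength, 0.0f), 1.0f);  // artist-controlled strength
    return 1.0f - occlusion;
}
```

Grass that rises towards the sun tilts the estimated normal away from it, so blades on the shaded side of a tall patch are darkened while blades on the lit side are left unchanged.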
DEFORMATION
Cars/dynamic objects should flatten grass
Render objects into F32 height texture
Pass 1: Render centred around viewer
Pass 2: Update texture into world space tiled texture
Prevents texel swimming
Fade edges of texture as it wraps around
Use skidmarks not wheels
Read height value in Generate stage
If height intersects grass, modify the albedo, scale and skew to appear squashed/flattened
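The world-space tiling and the squash test could look roughly like this on the CPU. This C++ sketch is built on assumptions: the tile size, the wrap-around addressing and the linear squash factor are illustrative, the names are hypothetical, and only the scale change is shown (the albedo and skew changes would follow the same test).

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

struct Texel { std::uint32_t x, y; };

// Map a world position to a texel of a texture that repeats every
// tileMetres. A given world point always lands on the same texel,
// which is what prevents texel swimming as the viewer moves.
inline Texel worldToTiledTexel(float worldX, float worldZ,
                               float tileMetres, std::uint32_t texSize)
{
    auto wrap = [&](float w) {
        float u = w / tileMetres;
        u -= std::floor(u);  // wrap into [0, 1)
        const std::uint32_t t = static_cast<std::uint32_t>(u * static_cast<float>(texSize));
        return std::min(t, texSize - 1u);
    };
    return { wrap(worldX), wrap(worldZ) };
}

// Height scale for a blade of bladeHeight (> 0) under an object whose
// underside is `clearance` above the ground, as read from the height texture.
inline float squashedHeightScale(float bladeHeight, float clearance)
{
    if (clearance >= bladeHeight)
        return 1.0f;  // object does not intersect the grass
    return std::max(clearance, 0.0f) / bladeHeight;
}
```

With a 64 m tile and a 256-texel texture, world X = 16 m and X = 80 m map to the same texel column, so the texture can wrap around the viewer as long as stale edges are faded out.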
PERFORMANCE
Worst Case (ms), 1920 x 1200, 4xMSAA
GPU                 Generate   Render   Total
AMD R9 290X         1.3        1.5      2.8
Nvidia GTX 780Ti    1.4        1.8      3.2
Nvidia GTX 560Ti    3.9        3.6      7.5
Intel HD5200        5.1        9.4*     14.5
*MSAA Disabled
Average Case (ms), 1920 x 1200, 4xMSAA
GPU                 Generate   Render   Total
AMD R9 290X         1.5        0.2      1.7
Nvidia GTX 780Ti    1.6        0.3      1.9
Nvidia GTX 560Ti    4.1        0.8      4.9
Intel HD5200        5.2        2.0*     7.2
*MSAA Disabled
FUTURE IMPROVEMENTS
One Generate per LOD
Wind
Prototyped, but too subtle on short grass
Similar to deformation
Render car speed instead of height
Bleed speed values out over texture
Read in Generate stage, to increase existing sine wave sway
Flowers
Meshes
Gravel / small rocks
Improve art authoring pipeline
World space map is naïve, wastes texture space
Translucency
WE ARE HIRING!
http://www.codemasters.com/uk/working-for-us/southam/
THANKS FOR LISTENING!
Questions?