EDN Admin
Well-known member
Introduction to Shaders
FallFury not only relies on standard C++/C# and XAML code, but also on shaders. This article is intended for developers who are not aware of what shaders are and want to know how to use them in their projects. I will talk about creating shaders, as well as the shaders used in my project.
Check out the video for this article at http://channel9.msdn.com/Series/FallFury/Part-2-Shaders. For a complete, offline version of this series, you may download a nicely formatted PDF of all the articles.
Code on GPU
In simple terms, shaders are small programs that are executed on the Graphics Processing Unit (GPU) instead of the Central Processing Unit (CPU). In recent years we’ve seen a major leap in the capabilities of graphics hardware, allowing manufacturers to expose a programmable execution layer on the GPU, so that rendering work can be offloaded to a highly optimized, massively parallel unit. Shaders are typically used not for general-purpose calculations but for graphics processing. For example, a shader can be used to adjust the lighting or colors of a rendered image.
Modern GPUs expose a programmable rendering pipeline that allows developers to execute arbitrary image-processing code. This is a step forward from the fixed-function pipeline found in older GPUs, where image-processing tasks were baked into the hardware and could only perform a limited set of operations, such as fixed transforms. Because the fixed-function pipeline was not programmable, developers were often tied to the capabilities of specific hardware when implementing game effects, in some cases having to fall back on software-based adjustments.
Types of Shaders
There are different types of image manipulations that can be performed on a given input, and there are different types of shaders designed to handle them. Currently, we can highlight three main shader types:
- Vertex shaders – because what the user sees on the screen is not really three-dimensional, but rather a three-dimensional scene simulated in a 2D space, vertex shaders translate the coordinates of vertices in 3D space into the 2D frame. A vertex shader is executed once per vertex passed to the GPU. Typically, vertices carry data about their position, the coordinates of the bound texture, and color. A vertex shader can manipulate all of these properties, but it never creates new vertices as a result of its execution.
- Pixel shaders – these programs are executed on the GPU for every pixel that is rasterized, working at a much lower level. For example, if you want specific pixels adjusted for lighting or bump mapping, a pixel shader can provide the desired effect for a surface. It is rare that only a handful of pixels need adjusting; pixel shaders often run over millions of pixels per frame, which is what makes complex full-screen effects possible.
- Geometry shaders – these shaders are the next progression from vertex shaders, introduced with DirectX 10. The developer can pass specific primitives as input and either output a modified version of what was passed in or generate new primitives, such as triangles, as a result. Geometry shaders always run after vertex processing in the rendering pipeline: when vertex shader execution is complete, the geometry shader steps in, if present. Geometry shaders can be used to refine the level of detail of an object. For example, when an object moves closer to or farther from the camera, its mesh can be refined or simplified to reduce the rendering load. (A minimal sketch of a geometry shader follows this list.)
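To make the shape of a geometry shader concrete, here is a minimal, illustrative sketch—it is not part of the Visual Studio template or of FallFury, and the struct name is hypothetical. It receives one triangle and re-emits its three vertices unchanged; a real shader could transform, duplicate, or discard primitives at this point.
struct GeometryShaderInput
{
    float4 pos : SV_POSITION;
    float3 color : COLOR0;
};
[maxvertexcount(3)]
void main(triangle GeometryShaderInput input[3], inout TriangleStream<GeometryShaderInput> outputStream)
{
    // Pass the incoming triangle through untouched.
    for (uint i = 0; i < 3; i++)
    {
        outputStream.Append(input[i]);
    }
    outputStream.RestartStrip();
}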
Enough Talk, Let’s Code
Now that you are aware of what shaders are, let’s write some sample code and test it. It is worth mentioning that shaders are not written in a standard high-level language, but rather in a language defined by the environment the shader is used in. For Direct3D, this is the High-Level Shader Language, or HLSL. It is somewhat similar to C, but with some specific nuances.
Start by creating a test Direct3D Windows Store application:
If you build and run this sample application, you will see that its output is a simple 3D spinning cube:
You probably also noticed that two shaders are included in the newly created solution: SimplePixelShader.hlsl and SimpleVertexShader.hlsl. Take a look inside the pixel shader:
struct PixelShaderInput
{
float4 pos : SV_POSITION;
float3 color : COLOR0;
};
float4 main(PixelShaderInput input) : SV_TARGET
{
return float4(input.color,1.0f);
}
First of all, there is a PixelShaderInput struct. It represents the interpolated input for a single pixel – a float4 field that carries the pixel’s position and a float3 field that carries its RGB color. Each field is also marked with a semantic, such as SV_POSITION or COLOR0 – a string that tells the pipeline how the field is used. You can read more about shader semantics at http://msdn.microsoft.com/en-us/library/windows/desktop/bb509647(v=vs.85).aspx.
Semantics prefixed with SV_ are system-value semantics. They have special meaning to the pipeline during the processing stages; in the pixel shader above, SV_POSITION always carries the pixel position.
Look at what is being returned from the pixel shader—instead of the float3 color that came in, you are returning a float4, which couples the existing value, input.color, with a 1.0f float that represents the alpha channel. Remember that inside shaders each color component is expressed in the 0-to-1 range rather than the usual 0-to-255 range.
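As a quick illustration of working in that 0-to-1 range, a hypothetical variation of the return statement could halve each color component while keeping the pixel fully opaque:
return float4(input.color * 0.5f, 1.0f); // half brightness, alpha stays at 1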
Since the cube rendering mechanism is already in place, let’s experiment with the pixel shader a bit. You already have the float4 color representation—red, green, blue, and alpha components, each in the 0-to-1 range.
You can set any of these components to 1.0 to produce a solid color, so let’s do that and render the cube green. Modify the return statement to be this:
return float4(0.0, 1.0, 0.0, 1.0);
At this point, if you run the program you will get a rendering very similar to this:
This is a good start. Now let’s take a look at the vertex shader in the project. By default, you get this:
cbuffer ModelViewProjectionConstantBuffer : register(b0)
{
matrix model;
matrix view;
matrix projection;
};
struct VertexShaderInput
{
float3 pos : POSITION;
float3 color : COLOR0;
};
struct VertexShaderOutput
{
float4 pos : SV_POSITION;
float3 color : COLOR0;
};
VertexShaderOutput main(VertexShaderInput input)
{
VertexShaderOutput output;
float4 pos = float4(input.pos, 1.0f);
pos = mul(pos, model);
pos = mul(pos, view);
pos = mul(pos, projection);
output.pos = pos;
// Pass through the color without modification.
output.color = input.color;
return output;
}
There are a couple of differences here compared to the pixel shader above:
- cbuffer ModelViewProjectionConstantBuffer – represents a constant buffer containing the three matrices that vertices are transformed by—the model, view, and projection matrices. Constant buffers are interesting structures optimized for block-wise updates: multiple constants are grouped into one unit and can be updated simultaneously from the application instead of one by one. (A sketch of the application-side update appears right after this list.)
- struct VertexShaderInput – vertices are passed one by one to the shader, and this structure represents that input. Notice that it carries the 3D position of the vertex as well as its RGB color.
- struct VertexShaderOutput – the processed output, which carries the position transformed by all three matrices mentioned above as well as the pass-through color. Notice that the position is now a float4; the fourth (w) component is what makes it a homogeneous coordinate suitable for those matrix multiplications.
- VertexShaderOutput main(VertexShaderInput input) – the entry-point function that performs the vertex processing. It first expands the incoming position into a float4 and then uses mul, a built-in function that multiplies matrices and vectors, to place the vertex in projected space.
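For context, this is roughly how the application side fills that constant buffer each frame in the Direct3D template—a sketch from memory, so member names such as m_constantBuffer and m_constantBufferData may differ slightly in your generated project:
// Copy the current model/view/projection matrices into the GPU-side buffer
// that the vertex shader reads through register b0.
m_d3dContext->UpdateSubresource(
    m_constantBuffer.Get(),   // destination ID3D11Buffer
    0,                        // destination subresource
    nullptr,                  // update the whole buffer
    &m_constantBufferData,    // CPU-side copy of ModelViewProjectionConstantBuffer
    0,
    0
    );
// Bind the buffer to slot b0 for the vertex shader stage.
m_d3dContext->VSSetConstantBuffers(0, 1, m_constantBuffer.GetAddressOf());
Now back to the body of main. Its first line is: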
float4 pos = float4(input.pos, 1.0f);
This expands the three-component position into a homogeneous float4 coordinate (with w set to 1.0f) so it can be multiplied by the 4x4 matrices. Afterwards, the vertex is transformed into the current 3D space by the three matrices – model, view, and projection:
pos = mul(pos, model);
pos = mul(pos, view);
pos = mul(pos, projection);
output.pos = pos;
As with the pixel shader, these are some very basic manipulations. Let’s rotate the cube around one of its axes. To do this, we need to apply a rotation transformation. One piece of advice—when working with transformations in shaders, make sure you know basic matrix operations.
For example, to perform a simple 2D rotation, you use the standard rotation matrix:
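The matrix image is not reproduced here; for a rotation by an angle θ it is the familiar:
| cos θ   -sin θ |
| sin θ    cos θ |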
But since we’re in 3D space, we not only have the X and Y coordinates, which the matrix above relates, but also a Z coordinate. Therefore, a different approach is needed. There are three fundamental matrices that can be used to rotate an object around the three possible axes:
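Those matrices are also not reproduced here as images; for an angle θ the standard axis rotations are:
Rotation around X:
| 1      0        0     |
| 0    cos θ   -sin θ   |
| 0    sin θ    cos θ   |
Rotation around Y:
|  cos θ   0   sin θ |
|    0     1     0   |
| -sin θ   0   cos θ |
Rotation around Z:
| cos θ   -sin θ   0 |
| sin θ    cos θ   0 |
|   0        0     1 |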
For now, we want to rotate the cube along the X-axis. To do this, inside the main function, we need to declare a constant that represents the rotation angle, in radians:
const float angle = 1.3962634;
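If you are wondering where that constant comes from, it is simply 80 degrees converted to radians:
radians = degrees × (π / 180), so 80 × (3.14159265 / 180) ≈ 1.3962634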
Looking at the matrix above, we need the sine and cosine of that angle. HLSL offers a built-in intrinsic called sincos—its signature is sincos(x, out s, out c)—that computes both values from a single input:
float cosLength, sinLength;
sincos(angle, sinLength, cosLength);
Now you need to declare the rotation matrix. Again, HLSL comes with some built-in capabilities to declare matrices and set their values:
float3x3 xAxisRotation = {
1.0, 0.0, 0.0, // Row 1
0, cosLength, -sinLength, // Row 2
0, sinLength, cosLength}; // Row 3
The general pattern for declaring a matrix in HLSL is either the shorthand form TypeRowsxColumns (as in the float3x3 above) or the equivalent template form:
matrix <Type, Rows, Columns> localName;
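For example, these two declarations introduce the same kind of 3×3 float matrix (a hypothetical snippet, not part of the template):
float3x3 rotationA;               // shorthand: type + rows x columns
matrix <float, 3, 3> rotationB;   // template form: matrix<Type, Rows, Columns>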
For matrix multiplication, we can once again leverage the mul function:
float3 temporaryPosition;
temporaryPosition = mul(input.pos,xAxisRotation);
Now we can expand the rotated position into a float4 homogeneous coordinate (w = 1.0f), just as the original shader did:
float4 pos = float4(temporaryPosition, 1.0f);
The rest of the transformation procedures can be taken directly from the original shader, multiplying the float4 position by the model, view, and projection matrices. Your entire main function should now look like this:
VertexShaderOutput main(VertexShaderInput input)
{
VertexShaderOutput output;
const float angle = 1.3962634;
float cosLength, sinLength;
sincos(angle, sinLength, cosLength);
float3x3 xAxisRotation = {
1.0, 0.0, 0.0, // Row 1
0, cosLength, -sinLength, // Row 2
0, sinLength, cosLength}; // Row 3
float3 temporaryPosition;
temporaryPosition = mul(input.pos,xAxisRotation);
float4 pos = float4(temporaryPosition, 1.0f);
// Transform the vertex position into projected space.
pos = mul(pos, model);
pos = mul(pos, view);
pos = mul(pos, projection);
output.pos = pos;
// Pass through the color without modification.
output.color = input.color;
return output;
}
The angle here is set to 80 degrees, or 1.3962634 radians. If you run the rendering test application, the result will be similar to this:
Now, let’s look at how shaders are handled on the DirectX side.
Shaders in DirectX
Open CubeRenderer.cpp. This is the source file where the project’s shaders are loaded and passed to the device for execution. Notice one interesting aspect of the process – the shader file contents are read first and then passed to m_d3dDevice->CreateVertexShader. m_d3dDevice is a pointer to a virtual adapter that is used to create device-specific resources. CreateVertexShader creates a vertex shader from already compiled shader bytecode. You can tell because the data is read not from the original .hlsl file, but from the .cso (Compiled Shader Object) file:
auto loadVSTask = DX::ReadDataAsync("SimpleVertexShader.cso");
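The continuation of that task looks roughly like the following—this is a sketch of the template code from memory, so member names such as m_vertexShader may differ slightly in your generated project:
auto createVSTask = loadVSTask.then([this](Platform::Array<byte>^ fileData) {
    // Create the vertex shader object from the compiled bytecode.
    DX::ThrowIfFailed(
        m_d3dDevice->CreateVertexShader(
            fileData->Data,
            fileData->Length,
            nullptr,          // no class linkage
            &m_vertexShader
            )
        );
    // The input layout is created from the same bytecode below.
});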
As with any program, before a shader is handed to the execution layer it has to be compiled. Visual Studio 2012 comes with a bundled HLSL compiler, fxc—the Effect Compiler Tool. By default, each shader is compiled to a .cso file in the project’s output directory, but this can be changed by setting the Object File Name in the shader properties:
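If you are curious what happens under the hood, the build step is roughly equivalent to invoking fxc from the command line yourself; the exact target profile, entry point, and output name below are illustrative:
fxc /T vs_4_0_level_9_1 /E main /Fo SimpleVertexShader.cso SimpleVertexShader.hlsl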
You can read more about the shader compilation process here.
Looking back at the sample vertex shader that was created as part of the project, you will notice that it defines a specific input layout for the incoming data:
struct VertexShaderInput
{
float3 pos : POSITION;
float3 color : COLOR0;
};
The virtual adapter is not aware of this structure’s layout. Therefore, the developer needs to explicitly create a D3D11_INPUT_ELEMENT_DESC array that describes the data being passed to the shader:
const D3D11_INPUT_ELEMENT_DESC vertexDesc[] =
{
{ "POSITION", 0, DXGI_FORMAT_R32G32B32_FLOAT, 0, 0, D3D11_INPUT_PER_VERTEX_DATA, 0 },
{ "COLOR", 0, DXGI_FORMAT_R32G32B32_FLOAT, 0, 12, D3D11_INPUT_PER_VERTEX_DATA, 0 },
};
Once the description is complete, the input needs to be assembled for processing. That’s where CreateInputLayout comes into play:
DX::ThrowIfFailed(
m_d3dDevice->CreateInputLayout(
vertexDesc,
ARRAYSIZE(vertexDesc),
fileData->Data,
fileData->Length,
&m_inputLayout
)
);
You are basically passing the shader signature to the input assembler, which will perform all subsequent processing. A similar process is applied for the pixel shader, the main difference being that CreatePixelShader is called instead of CreateVertexShader:
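That portion of the template looks roughly like this—again a sketch from memory, and member names such as m_pixelShader may vary:
auto createPSTask = loadPSTask.then([this](Platform::Array<byte>^ fileData) {
    // Create the pixel shader object from the compiled bytecode.
    DX::ThrowIfFailed(
        m_d3dDevice->CreatePixelShader(
            fileData->Data,
            fileData->Length,
            nullptr,          // no class linkage
            &m_pixelShader
            )
        );
});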
Passing Custom Parameters to Shaders
Going back to the vertex shader where we performed the rotation around the X-axis, you probably noticed that the angle is hard-coded and applied to every vertex in the same way. This is rarely what you want—the angle would normally come from the game itself, often in response to internal behavior such as character movement or an action.
To pass a parameter to the shader, you first need to redefine the input structure. Let’s add an angle field to the VertexShaderInput struct in SimpleVertexShader.hlsl:
struct VertexShaderInput
{
float3 pos : POSITION;
float3 color : COLOR0;
float angle : TRANSFORM0;
};
This alone won’t do anything. Go to the entry-point function (main) and make sure that the angle constant is read from input.angle:
const float angle = input.angle;
Now the shader is ready, but you also need to let your game know that the vertex shader has a modified input layout. To do this, go to CubeRenderer.cpp and find the D3D11_INPUT_ELEMENT_DESC array that defines the input layout for the vertex shader. Add an entry for the new field to the existing array:
const D3D11_INPUT_ELEMENT_DESC vertexDesc[] =
{
{ "POSITION", 0, DXGI_FORMAT_R32G32B32_FLOAT, 0, 0, D3D11_INPUT_PER_VERTEX_DATA, 0 },
{ "COLOR", 0, DXGI_FORMAT_R32G32B32_FLOAT, 0, 12, D3D11_INPUT_PER_VERTEX_DATA, 0 },
{ "TRANSFORM", 0, DXGI_FORMAT_R32_FLOAT, 0, 24, D3D11_INPUT_PER_VERTEX_DATA, 0 }
};
The first parameter is the semantic used in the shader. The second parameter is the semantic index. After that, DXGI_FORMAT_R32_FLOAT indicates that the value passed will be a single standard float. Pay close attention when you specify the input offset: the two FLOAT3 values, POSITION and COLOR, take 12 bytes each—a float is 32 bits (4 bytes) and each value is a triplet—so the offset for TRANSFORM should be 24 bytes (two times 12 bytes from the preceding elements).
Now you need to modify the vertex descriptors that are passed to the rendering pipeline. In the sample project these are created with the help of the VertexPositionColor struct, located in CubeRenderer.h. Add a simple float field to the existing struct, so it looks like this:
struct VertexPositionColor
{
DirectX::XMFLOAT3 pos;
DirectX::XMFLOAT3 color;
float angle;
};
In CubeRenderer.cpp, find the cubeVertices array. For each VertexPositionColor entry you can now add a third value representing the rotation angle:
auto createCubeTask = (createPSTask && createVSTask).then([this] () {
VertexPositionColor cubeVertices[] =
{
{XMFLOAT3(-0.5f, -0.5f, -0.5f), XMFLOAT3(0.0f, 0.0f, 0.0f), 1.38f},
{XMFLOAT3(-0.5f, -0.5f, 0.5f), XMFLOAT3(0.0f, 0.0f, 1.0f), 1.38f},
{XMFLOAT3(-0.5f, 0.5f, -0.5f), XMFLOAT3(0.0f, 1.0f, 0.0f), 1.38f},
{XMFLOAT3(-0.5f, 0.5f, 0.5f), XMFLOAT3(0.0f, 1.0f, 1.0f), 1.38f},
{XMFLOAT3( 0.5f, -0.5f, -0.5f), XMFLOAT3(1.0f, 0.0f, 0.0f), 1.38f},
{XMFLOAT3( 0.5f, -0.5f, 0.5f), XMFLOAT3(1.0f, 0.0f, 1.0f), 1.38f},
{XMFLOAT3( 0.5f, 0.5f, -0.5f), XMFLOAT3(1.0f, 1.0f, 0.0f), 1.38f},
{XMFLOAT3( 0.5f, 0.5f, 0.5f), XMFLOAT3(1.0f, 1.0f, 1.0f), 1.38f},
};
Now you will be able to pass parameters to the vertex shader without having to manually modify the shader itself.
Shaders in FallFury
Several shaders are used in FallFury, and the right ones must be selected depending on the Direct3D feature level available on the machine. A feature level determines what a video adapter can do in terms of rendering—even though Direct3D is a unified framework, it still relies heavily on what the individual graphics card can do.
Five shaders are used internally for texture rendering: a pixel shader, a replication vertex shader, an instancing vertex shader, and geometry shaders. This will look familiar if you have worked with Microsoft DirectX sprite samples before:
Given the platform limitations, geometry shaders can only be used on devices supporting Direct3D feature level 10 or higher. When FallFury detects that the feature level is lower than that, it has to fall back on either the instancing or the replication shaders. Some devices, such as the Microsoft Surface, are at feature level 9.1, so the replication render technique must be used there.
Following the Microsoft DirectX sprite sample, it is fairly easy to select the correct render technique once the feature level is obtained with GetFeatureLevel:
auto featureLevel = m_d3dDevice->GetFeatureLevel();
if (featureLevel >= D3D_FEATURE_LEVEL_10_0)
{
m_technique = RenderTechnique::GeometryShader;
}
else if (featureLevel >= D3D_FEATURE_LEVEL_9_3)
{
m_technique = RenderTechnique::Instancing;
}
else
{
m_technique = RenderTechnique::Replication;
if (capacity > static_cast<int>(Parameters::MaximumCapacityCompatible))
{
// The index buffer format for feature-level 9.1 devices may only be 16 bits.
// With 4 vertices per sprite, this allows a maximum of (1 << 16) / 4 sprites.
throw ref new Platform::InvalidArgumentException();
}
}
Depending on the selected render technique, which is determined at application startup, the proper input layout is selected for each shader and used to feed the rendering pipeline.
Different graphics adapters also support different shader models. You need to account for this: at build time, the HLSL compiler checks whether the shader can be compiled for the given shader model. In the shader properties, make sure that you specify the target shader model as well as the type of the shader:
Without this, you are bound to hit compile-time errors that will prevent you from launching the application or deploying it to different devices.
Conclusion
Shaders are not an easy subject. I highly recommend reading The Cg Tutorial, published online for free by NVIDIA. Even though it describes the Cg shading language, Cg is virtually identical to HLSL, and you will have no problem applying the practices you learn there to building HLSL shaders.
View the full article