Discovering 3D rendering with C# and Direct3D 11

To develop 3D applications, you first have to understand the underlying concepts. The aim of this article is to introduce these concepts and show how to use them with Direct3D 11.

The final project is available here: https://www.catuhe.com/msdn/DiscoverD3D11.zip

From vertex to pixel

The first element we have to know about is the vertex (plural: vertices). A vertex is a point in 3D space. In its simplest form, it is a vector of 3 values: x, y and z.
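
For illustration, here is how a single vertex could be held with the SlimDX library we will use later in this article (a minimal sketch; the coordinate values are arbitrary):

// A single vertex: a point in 3D space defined by its x, y and z coordinates.
var vertex = new SlimDX.Vector3(1.0f, 0.5f, -2.0f);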

All we can see in a 3D application is built upon a set of vertices which define the backbone of our objects.



Figure 1. Vertices of a sphere

However, vertices are not the only important actors. To obtain a well-shaped object, you also need to define faces.

Faces are triangles composed of 3 vertices. We use triangles because they are the simplest 2D geometric shape we can use to define a volume (a mesh).

A face is a vector of 3 values (i1, i2, i3) where each entry is an index into the vertices list.



Figure 2. Sphere faces

So a plane can be defined by the following code:






float[] vertices = new[]
{
    -1.0f, -1.0f, 0f,
    1.0f, -1.0f, 0f,
    1.0f, 1.0f, 0f,
    -1.0f, 1.0f, 0f,
};

short[] faces = new[]
{
    (short)0, (short)1, (short)2,
    (short)0, (short)2, (short)3
};




A plane is composed of 4 vertices and 2 faces which connect the vertices (for a total of 6 indices).

The main goal of a 3D application is to use these data to produce pixels. We must produce a 2D array where each cell contains a color. The size of the array is [Screen Width x Screen Height].

So we must split our work into 2 steps. First, we will see how to transform a list of vertices and faces into a list of pixels (which are vectors of 4 values). Then we will see how to assign a color to each pixel.

Making movies!

To understand the transition from an R3 space (the 3D world) to an R2 space (the screen), we will make an analogy with cinema.

We will become a movie director who wants to make a commercial with a tennis ball.

The global world

To start with, the ball is stored in a room. For the film, we have to bring it to the stage.

In 3D, we call that operation the world transformation: we take the coordinates of an object and we move them into the scene world. Indeed, to construct a scene (or to make a movie), we must use a lot of objects which are all defined (or stored) with coordinates relative to the center of their own world ([0, 0, 0]). If we do nothing, they will all be rendered at the same place (much easier to do with a virtual scene than in the real world!).

So we have to move them (and perhaps rotate and scale) to their final position.

To do so, we have to use a mathematical tool: the matrix! 
A matrix is the representation of a geometric transformation. The product of a vector and a matrix gives the vector modified by that transformation.

By multiplying two matrices, we obtain a new matrix whose transformation is the combination of the transformations of each matrix.

For instance, let's say we have two matrices, M1 and M2. M1 is a translation matrix and M2 is a rotation matrix. The result of M1 x M2 is a matrix which applies a translation followed by a rotation.
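
With SlimDX (the managed wrapper we will use later in this article), this combination is a simple multiplication. A minimal sketch of the M1 x M2 example (the 2-unit translation and the 0.5 radian angle are arbitrary values):

// M1: translation of 2 units along X; M2: rotation of 0.5 radian around the Y axis.
Matrix m1 = Matrix.Translation(2.0f, 0.0f, 0.0f);
Matrix m2 = Matrix.RotationY(0.5f);

// The combined matrix applies the translation first, then the rotation.
Matrix combined = m1 * m2;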

Finally, using a matrix (called the world matrix), we are able to define all the transformations required to move/rotate/scale an object from its original position to its final position.

The point of view of the “camera”

When all objects are correctly rotated, scaled and moved, we have to apply a new matrix to compute their position from the point of view of the camera (the eye of the observer).

This new matrix is called the view matrix because it defines the point of view. It is essentially defined by a position and a target (where is the camera? what is the target of the camera?)
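
With SlimDX, such a matrix is typically built from the camera position and its target with the Matrix.LookAtLH helper (a sketch; the sample code later in this article simply uses a translation instead):

// Camera placed at (0, 0, -5), looking at the origin, with Y as the up direction.
Vector3 cameraPosition = new Vector3(0, 0, -5.0f);
Vector3 cameraTarget = Vector3.Zero;
Matrix viewMatrix = Matrix.LookAtLH(cameraPosition, cameraTarget, Vector3.UnitY);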

The projection

Finally, a last matrix is required: the projection matrix. This matrix is responsible for the conversion from the 3D world to the screen space (2D).

For example, starting from (x1, y1, z1) the projection matrix will produce (x2, y2) using the size of the screen, the field of view of the camera and the aspect ratio.
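
With SlimDX, such a matrix can be built in a single call (the same helper is used in the sample code later in this article; the 800x600 screen size here is just a placeholder):

// 0.8 radian field of view, screen aspect ratio, near plane at 0.1, far plane at 1000.
float fov = 0.8f;
float aspectRatio = 800.0f / 600.0f;
Matrix projectionMatrix = Matrix.PerspectiveFovLH(fov, aspectRatio, 0.1f, 1000.0f);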

Geometric pipeline

Finally, every vertex will be modified by the following transformation:

MatrixFinal = MatrixWorld * MatrixView * MatrixProjection

Pixel = Vertex * MatrixFinal
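
In C# terms (with SlimDX), this boils down to one matrix product and one vector transformation. A minimal sketch, assuming the view and projection matrices built in the previous sketches and SlimDX's Vector3.Transform helper (in practice this multiplication is done on the GPU by the vertex shader, as we will see below):

// Combine the three transformations into a single matrix...
Matrix worldMatrix = Matrix.RotationY(0.5f);
Matrix finalMatrix = worldMatrix * viewMatrix * projectionMatrix;

// ...and apply it to a vertex. The result is a 4-component vector (homogeneous
// coordinates) which, after the perspective divide, gives a position on the screen.
Vector4 projected = Vector3.Transform(new Vector3(-1.0f, -1.0f, 0f), finalMatrix);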

Shaders, or how to develop with your GPU

We are now done with the theory. Let's see how to use it with our GPU (Graphics Processing Unit, the brain of your graphics card) to unleash the power of hardware-accelerated rendering!

To develop on the GPU, we will use a specific language called HLSL (High Level Shader Language). This language is similar to C and allows us to build shaders, which are the basic programs of the GPU.

As we will see below, there are several categories of shaders.

Vertex shader

Vertex shaders are the first shaders called in the graphics pipeline. They are responsible for transforming vertices into pixels; to do so, they use MatrixFinal.






cbuffer globals
{
    matrix finalMatrix;
}

struct VS_IN
{
    float3 pos : POSITION;
};

struct PS_IN
{
    float4 pos : SV_POSITION;
};

// Vertex Shader
PS_IN VS( VS_IN input )
{
    PS_IN output = (PS_IN)0;

    output.pos = mul(float4(input.pos, 1), finalMatrix);

    return output;
}




The vertex shader is a function which takes a vertex as input parameter (we have to define its structure) and returns a pixel (we also need to define its structure).

For now, the structures are really simple: a float3 as input and a float4 as output. Of course, in the next steps we will add more information to our structures.

The work of the vertex shader is only to apply the final matrix (which is defined as a global variable) to every vertex. Of course, vertex shaders can be more complex if required.

Pixel shader

The aim of the pixel shader is to produce a color for each pixel. So after processing vertices with the vertex shader, the pixel shader will work on the produced list of pixels.

It is important to note that there is an additional stage between the vertex shader and the pixel shader: rasterization. This step clips the pixels (i.e. keeps only the visible ones) and does the interpolation required to generate all the pixels needed to fill the triangles.

Indeed, the vertex shader will only produce pixels for the three points of a face. The rasterizer will interpolate all missing pixels to fill the gap.


Finally for every pixel, the following pixel shader will be applied:





// Pixel Shader
float4 PS( PS_IN input ) : SV_Target
{
    return float4(1, 1, 1, 1);
}






Our pixel shader takes a pixel as input and returns a color. For now, it is the same color for every pixel. So we will add some code to use a texture in order to produce better looking results.

To do so, we have to update our vertices to add texture coordinates alongside the spatial coordinates of each vertex.

The vertex shader will take the texture coordinates and pass them unmodified to the pixel shader.

The pixel shader will use the texture coordinates to read a color from a texture and return it as the color of the pixel. To do so, we will add a sampler (which is a tool used to read a texture) and a texture variable:






cbuffer globals
{
    matrix finalMatrix;
}

struct VS_IN
{
    float3 pos : POSITION;
    float2 uv : TEXCOORD0;
};

struct PS_IN
{
    float4 pos : SV_POSITION;
    float2 uv : TEXCOORD0;
};

// Vertex Shader
PS_IN VS( VS_IN input )
{
    PS_IN output = (PS_IN)0;

    output.pos = mul(float4(input.pos, 1), finalMatrix);
    output.uv = input.uv;

    return output;
}

Texture2D yodaTexture;
SamplerState currentSampler
{
    Filter = MIN_MAG_MIP_LINEAR;
    AddressU = Wrap;
    AddressV = Wrap;
};

// Pixel Shader
float4 PS( PS_IN input ) : SV_Target
{
    return yodaTexture.Sample(currentSampler, input.uv);
}




.FX files

The .FX files (or effect files) allow us to gather all the shader code in a single file. They also allow the declaration of variables and constants.

Finally they include one or more techniques. A technique is composed of one or more passes. A pass is a declaration of a complete pipeline with at least a vertex and a pixel shader.

In our case, the technique declaration can be:






// Technique
technique10 Render
{
    pass P0
    {
        SetGeometryShader( 0 );
        SetVertexShader( CompileShader( vs_4_0, VS() ) );
        SetPixelShader( CompileShader( ps_4_0, PS() ) );
    }
}




This file will be used by Direct3D to configure the graphics pipeline.

Using Direct3D 11

To use DirectX 11 (and its 3D part: Direct3D 11), we will use a managed wrapper called SlimDX (https://slimdx.org/download.php).

Indeed, DirectX is a COM API and SlimDX allows us to use it efficiently (by reducing the overhead of marshalling between .NET and COM).

Initialization

To initialize Direct3D 11, we need to define 4 required variables:

  • The device, which will be our broker to the driver of the graphics card
  • The swap chain, which defines how the rendered image will be copied from the graphics card to the display window
  • The back-buffer, which is the graphics card memory dedicated to producing the rendered image
  • The render view, which is the view on the back-buffer. With Direct3D 11, the buffers are not used directly but through views. This is a really interesting concept as it allows us to have only one memory resource with many views on it (each with its own associated semantics)

The initialization code will be the following:






// Creating device (we accept dx10 cards or greater)
FeatureLevel[] levels = {
    FeatureLevel.Level_11_0,
    FeatureLevel.Level_10_1,
    FeatureLevel.Level_10_0
};

// Defining our swap chain
SwapChainDescription desc = new SwapChainDescription();
desc.BufferCount = 1;
desc.Usage = Usage.BackBuffer | Usage.RenderTargetOutput;
desc.ModeDescription = new ModeDescription(0, 0, new Rational(0, 0), Format.R8G8B8A8_UNorm);
desc.SampleDescription = new SampleDescription(1, 0);
desc.OutputHandle = Handle;
desc.IsWindowed = true;
desc.SwapEffect = SwapEffect.Discard;

Device.CreateWithSwapChain(DriverType.Hardware, DeviceCreationFlags.None, levels, desc, out device11, out swapChain);

// Getting back buffer
backBuffer = Resource.FromSwapChain<Texture2D>(swapChain, 0);

// Defining render view
renderTargetView = new RenderTargetView(device11, backBuffer);
device11.ImmediateContext.OutputMerger.SetTargets(renderTargetView);
device11.ImmediateContext.Rasterizer.SetViewports(new Viewport(0, 0, ClientSize.Width, ClientSize.Height, 0.0f, 1.0f));




The main part is the description of the swap chain. We indicate here that we want a swap chain bound to the back-buffer, without anti-aliasing (a SampleDescription with only one sample), using a windowed display.

We can also note the use of feature levels, which specify that we only want to work with graphics cards supporting Direct3D 10, 10.1 or 11.

Using our shaders

To use our shaders, we need to compile the .FX file:






using (ShaderBytecode byteCode = ShaderBytecode.CompileFromFile("Effet.fx", "bidon", "fx_5_0", ShaderFlags.OptimizationLevel3, EffectFlags.None))
{
    effect = new Effect(device11, byteCode);
}

var technique = effect.GetTechniqueByIndex(0);
var pass = technique.GetPassByIndex(0);
layout = new InputLayout(device11, pass.Description.Signature, new[] {
    new InputElement("POSITION", 0, Format.R32G32B32_Float, 0, 0),
    new InputElement("TEXCOORD", 0, Format.R32G32_Float, 12, 0)
});




Compilation of an effect is done using the Effect constructor, which takes the byte code produced by ShaderBytecode.CompileFromFile.

Then we need to describe how our vertices are structured. To do so we have to produce an InputLayout (it is worth noting that we use the signature of the first pass of the effect to ensure the compatibility of our layout with the input vertex structure of the effect).

Preparing geometric data

We will create buffers to store and use our geometric data. These buffers will be created in the graphics card memory:

  • A vertex buffer for the vertices
  • An index buffer for the faces (the indices)
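
The code below also uses a vertexSize value which is not shown in the snippet: each vertex now holds 5 floats (3 for the position and 2 for the texture coordinates), i.e. 20 bytes. A plausible definition (the exact constant lives in the sample project):

// Size of one vertex in bytes: 3 position floats + 2 texture coordinate floats.
const int vertexSize = 5 * sizeof(float); // 20 bytes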





float[] vertices = new[]
{
    -1.0f, -1.0f, 0f, 0f, 1.0f,
    1.0f, -1.0f, 0f, 1.0f, 1.0f,
    1.0f, 1.0f, 0f, 1.0f, 0.0f,
    -1.0f, 1.0f, 0f, 0.0f, 0.0f,
};

short[] faces = new[]
{
    (short)0, (short)1, (short)2,
    (short)0, (short)2, (short)3
};

// Creating vertex buffer
var stream = new DataStream(4 * vertexSize, true, true);
stream.WriteRange(vertices);
stream.Position = 0;

var vertexBuffer = new Buffer(device11, stream, new BufferDescription
{
    BindFlags = BindFlags.VertexBuffer,
    CpuAccessFlags = CpuAccessFlags.None,
    OptionFlags = ResourceOptionFlags.None,
    SizeInBytes = (int)stream.Length,
    Usage = ResourceUsage.Default
});
stream.Dispose();

// Index buffer
stream = new DataStream(6 * 2, true, true);
stream.WriteRange(faces);
stream.Position = 0;

var indices = new Buffer(device11, stream, new BufferDescription
{
    BindFlags = BindFlags.IndexBuffer,
    CpuAccessFlags = CpuAccessFlags.None,
    OptionFlags = ResourceOptionFlags.None,
    SizeInBytes = (int)stream.Length,
    Usage = ResourceUsage.Default
});
stream.Dispose();




The two buffers are created in such a way that the CPU cannot access them. Thus, Direct3D can allocate them directly in the graphics card memory (the most efficient option for the GPU).
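
As a side note, if the CPU needed to rewrite the geometry every frame (for an animated mesh, for instance), we would ask for a dynamic buffer instead. A sketch of the alternative description (not used in this sample):

// Alternative (not used here): a vertex buffer the CPU is allowed to rewrite.
var dynamicDescription = new BufferDescription
{
    BindFlags = BindFlags.VertexBuffer,
    CpuAccessFlags = CpuAccessFlags.Write,   // the CPU can write into the buffer
    OptionFlags = ResourceOptionFlags.None,
    SizeInBytes = 4 * vertexSize,
    Usage = ResourceUsage.Dynamic            // tells the driver the content will change often
};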

Then we just have to transfer them to the device:






// Uploading to the device
device11.ImmediateContext.InputAssembler.InputLayout = layout;
device11.ImmediateContext.InputAssembler.PrimitiveTopology = PrimitiveTopology.TriangleList;
device11.ImmediateContext.InputAssembler.SetVertexBuffers(0, new VertexBufferBinding(vertexBuffer, vertexSize, 0));
device11.ImmediateContext.InputAssembler.SetIndexBuffer(indices, Format.R16_UInt, 0);




We also define the current input layout and the topology to use (triangle lists).

Assigning constants to shaders

To be ready, our shaders need us to define some constants, especially MatrixFinal and the texture to use.

So for our effect, we can do something like this:






// Texture
Texture2D texture2D = Texture2D.FromFile(device11, "yoda.jpg");
ShaderResourceView view = new ShaderResourceView(device11, texture2D);

effect.GetVariableByName("yodaTexture").AsResource().SetResource(view);

RasterizerStateDescription rasterizerStateDescription = new RasterizerStateDescription { CullMode = CullMode.None, FillMode = FillMode.Solid };

device11.ImmediateContext.Rasterizer.State = RasterizerState.FromDescription(device11, rasterizerStateDescription);

// Matrices
Matrix worldMatrix = Matrix.RotationY(0.5f);
Matrix viewMatrix = Matrix.Translation(0, 0, 5.0f);
const float fov = 0.8f;
Matrix projectionMatrix = Matrix.PerspectiveFovLH(fov, ClientSize.Width / (float)ClientSize.Height, 0.1f, 1000.0f);

effect.GetVariableByName("finalMatrix").AsMatrix().SetMatrix(worldMatrix * viewMatrix * projectionMatrix);




As seen before, we compute our MatrixFinal by multiplying the three base matrices (built using static methods of the Matrix class).

And using the GetVariableByName method, we can set the constant values.

Final render

So we have our geometry (vertex and index buffers) ready to use and our shaders are compiled and defined.


We now just have to launch the rendering process!





// Render
device11.ImmediateContext.ClearRenderTargetView(renderTargetView, new Color4(1.0f, 0, 0, 1.0f));
effect.GetTechniqueByIndex(0).GetPassByIndex(0).Apply(device11.ImmediateContext);
device11.ImmediateContext.DrawIndexed(6, 0, 0);
swapChain.Present(0, PresentFlags.None);




The process is the following:

  • We clear the back-buffer
  • Using the desired technique, we get its first pass and apply it (which means that we assign shaders and constants to the graphics card)
  • Then, using the immediate context of the device, we launch the rendering process on 6 indices (2 faces)
  • Finally, we present the result to the main window


Figure 3. The marvelous final render with Direct3D 11!
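
Note that this render code must run every frame. With SlimDX, the classic way to drive it from a Windows Forms application is the MessagePump helper from the SlimDX.Windows namespace (a minimal sketch, assuming the code above lives in a Render method of our form):

// Run the message loop and call Render() whenever the application is idle.
MessagePump.Run(this, () =>
{
    Render(); // clear, apply the pass, draw the 6 indices and present
});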

Conclusion

So we are now ready to produce high quality rendering. By using shaders and .fx files, we can render all kinds of advanced materials. The only limit is our imagination (and our mastery of optical effects!).

Our system renders a list of vertices and faces, so rendering a plane or a complete city is nearly the same thing (obviously, rendering a city will require some additional optimizations!)

Feel free to play with the associated code and unleash the power of 3D!