3D Primer

Introduction

Vertices

Primitives

Coordinate System

Transformed and Untransformed Vertices

Culling

Perspective

Back-Buffers and Depth Buffers

Wrap Up

Introduction

In this lesson I'm going to give you some theory and background. We'll discuss vertices, coordinate systems, culling, perspective and back-buffers. This should give you a good enough grounding for the following lessons.

Vertices

The DirectX documentation defines a vertex as "A point in 3-D space". And it is, but it's also more than that. As we will be using them, a vertex is a point in 3D space plus the attributes describing that point. A vertex can have a colour, it can have a normal (its "facing", used to calculate lighting) and many other attributes. If you apply a texture to a mesh you need a way to determine how a 2D image is fitted to a 3D shape. Each vertex can have texture coordinates (tu/tv, analogous to x/y) to accomplish that.

To define a vertex you create a structure to hold the required information and then create a descriptor to pass to Direct3D so it knows the layout of your vertex. The position of the vertex is typically given by 3 floats: x, y and z. This can also be represented as a 3-element vector, since a vector is just an array of floats. A vertex often has a colour as well. The colour is a 32-bit value with 8 bits each for the alpha (transparency), red, green and blue channels. In hex this value is formed like so: 0xAARRGGBB.
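You rarely need to build that value by hand. As a minimal sketch, using the standard System.Drawing.Color type that .NET provides:

// Opaque red: alpha = 255, red = 255, green = 0, blue = 0.
int red = System.Drawing.Color.FromArgb(255, 255, 0, 0).ToArgb();   // 0xFFFF0000
// The same value written directly as a hex literal:
int alsoRed = unchecked((int)0xFFFF0000);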

To illustrate, let's design a structure to hold a vertex that has a 3-float position and a colour. Additionally, there is a static member called Format. This member allows Direct3D to tell what members the structure defines. The order of the declarations is important and must match what the Format describes: position first, then colour. Here is our structure.

struct my_vertex{
   float x, y, z;   // position in 3D space
   int colour;      // diffuse colour, packed as 0xAARRGGBB
   static readonly VertexFormats Format = VertexFormats.Position | VertexFormats.Diffuse;
};
Public Structure my_vertex
   Public X As Single        ' position in 3D space
   Public Y As Single
   Public Z As Single
   Public Colour As Integer  ' diffuse colour, packed as &HAARRGGBB
   Public Const Format As D3D.VertexFormats = _
      D3D.VertexFormats.Position Or D3D.VertexFormats.Diffuse
End Structure

The position must come before the colour. Technically it is the Diffuse Colour since there can be other colours as well, though we won't cover them in this lesson. You may see this colour just listed as 'diffuse' in a structure, so now you know why.

As mentioned above, the Format is a descriptor of what the structure contains. This is done with FVF (Flexible Vertex Format) flags. This is a 32-bit value that you create by ORing together flags describing the structure. Since we have a position we use the VertexFormats.Position flag, and since we provide a diffuse colour we also use the VertexFormats.Diffuse flag. Though the order of the data members within the structure is important, the order in which the flags are given is not. This is simply a bit-wise OR combining the flags.

Additionally, there are a number of predefined vertex structures that cover the most common cases. CustomVertex.PositionColored is the predefined version of the structure we created above. We'll see others as we progress through the tutorials.
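To show how little code the predefined types need, here's a minimal sketch (assuming the Microsoft.DirectX.Direct3D namespace is imported):

// One opaque green vertex at the origin, using the predefined type.
CustomVertex.PositionColored v = new CustomVertex.PositionColored(
   0.0f, 0.0f, 0.0f,                        // x, y, z
   System.Drawing.Color.Green.ToArgb());    // diffuse colour
// Its format descriptor is exposed just like our hand-made struct's:
VertexFormats format = CustomVertex.PositionColored.Format;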

Now that you know what a vertex is, let's look at what you can build with them.

Primitives

Primitives are the building blocks of your 3D scene, and they are made up of vertices.

Points are the simplest primitive. Only a single vertex is required to define a point.

Lines are the next step up. A line is made up of 2 vertices, which make up its end points (which mathematically makes it a line segment rather than a line, but that's just being picky). There are 2 types of line primitives in Direct3D: Line Lists and Line Strips. In a Line List each pair of vertices is a line defined on its own. In a Line Strip, the first 2 vertices are a line and each vertex after that extends the line. If you need to create a connected sequence of lines, a Line Strip is an efficient way to do it.

Finally, there is the triangle. A triangle is the most complex primitive that current 3D hardware can render. A complex mesh is made up of triangles. Even complex things like NPatches are tessellated into triangles before being fed to the rendering hardware. Triangles come in 3 flavours: Lists, Strips and Fans. Triangle Lists are similar to Line Lists in that each primitive (in this case, a set of 3 vertices) is a stand-alone primitive unrelated to the ones before or after it. Strips continue on from the previous triangle, again similar to how Line Strips work: each additional vertex creates a triangle from it and the 2 previous vertices. Fans are a specialized form of strips. The first vertex given sets the central point, and each subsequent vertex creates a triangle from it, the previous vertex and that central point. Because they are so specialized, Fans are rarely used. The vertex sharing behind all three is sketched below.
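To make that sharing concrete, here's a small sketch in plain C# (no Direct3D involved, and the method name is just for illustration) of which array entries each flavour reads for triangle i:

// For n triangles: a List needs 3n vertices, a Strip or a Fan needs n + 2.
static void PrintTriangle(int i)
{
   // List: every triangle stands alone.
   System.Console.WriteLine("List:  {0} {1} {2}", 3 * i, 3 * i + 1, 3 * i + 2);
   // Strip: each new triangle shares an edge with the previous one.
   System.Console.WriteLine("Strip: {0} {1} {2}", i, i + 1, i + 2);
   // Fan: every triangle pivots around the first vertex.
   System.Console.WriteLine("Fan:   0 {0} {1}", i + 1, i + 2);
}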

In the next few lessons I'll show you how to actually render the various primitives. But before we do that we have some more theory to cover.

Coordinate System

Vertices have a position. For a position to be meaningful it must be relative to something, otherwise we have no context. Coordinate systems tell you what the positions are relative to and also how they are interpreted. The coordinate systems we're going to cover here are Model Space, World Space and Screen Space.

Coordinates in Model Space are given relative to the model origin, which is at position (0,0,0). Typically the model is centered on the origin, though in some cases the origin may be placed at one extreme of the model, at a model's feet for example. You can't render something in Model Space; first it must be transformed to World Space.

World Space is where you put your scene together. The center of your world is at (0,0,0). Objects (in Model Space) are moved into World Space by transforming them by a matrix. This matrix contains information on how the object must be translated (moved), rotated and scaled to fit into World Space. For non-moving objects you may choose to have them pre-translated into World Space; in this case their Model Space coordinates are equal to their World Space coordinates. Once your scene is put together, it can be transformed into Screen Space.
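As a minimal sketch of what that transform looks like in Managed Direct3D (assuming device is an initialized Device and the Microsoft.DirectX namespaces are imported):

// Scale, then rotate, then move the model into place in the world.
// Direct3D uses row vectors, so the matrices are multiplied left to right.
device.Transform.World =
   Matrix.Scaling(2.0f, 2.0f, 2.0f) *          // twice as large
   Matrix.RotationY((float)Math.PI / 4.0f) *   // 45 degrees around the Y axis
   Matrix.Translation(10.0f, 0.0f, 5.0f);      // final position in the world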

Screen Space coordinates are the final result used for rendering. I would like to note that there are a few other steps along the way from Model to Screen, but for now these 3 are enough. This coordinate system matches the screen dimensions, with the X coordinate being 0 on the left and increasing to 1 less than the screen width. Similarly, the Y coordinate is 0 at the top, increasing to 1 less than the screen height at the bottom. In a 640x480 display, our coordinates would run from 0-639 on the X-axis and 0-479 on the Y-axis. You can also give all of your coordinates in Screen Space, which means the above 2 steps are skipped. When your coordinates are given in Screen Space, they are called Transformed Vertices.

Transformed and Untransformed Vertices

When you feed Transformed vertices to the renderer it does no processing on them: no translation, scaling, rotation or lighting will be applied. Transformed vertices are usually used for 2D elements of 3D games. A Heads Up Display (HUD) or other status information is typically a 2D element overlaid on a 3D scene. Transformed vertices could also be used for every visual element in a 2D game. In a 3D game virtually every other visual element will be a model defined with Untransformed vertices.

Transformed vertices are different in other ways too, in their structure definition and their FVF. Transformed vertices use the VertexFormats.Transformed flag for their position rather than the VertexFormats.Position flag we showed above. And as the flag implies, there is another element to be added to the vertex structure.

The predefined version of a vertex with Transformed position and a diffuse colour is CustomVertex.TransformedColored. Here's the structure for our Transformed vertices.

struct transformed_vertex{
   float x, y, z, rhw;   // screen-space position plus rhw
   int colour;           // diffuse colour, packed as 0xAARRGGBB
   static readonly VertexFormats Format = VertexFormats.Transformed | VertexFormats.Diffuse;
};
Public Structure transformed_vertex
   Public X As Single      ' screen-space position
   Public Y As Single
   Public Z As Single
   Public Rhw As Single    ' reciprocal of homogeneous W
   Public Colour As Integer
   Public Const Format As D3D.VertexFormats = _
      D3D.VertexFormats.Transformed Or D3D.VertexFormats.Diffuse
End Structure

So you may be asking, what does RHW mean? I'll try to give you a good answer. The official documentation is a bit skimpy, and you rarely need it so it's often ignored. W is the eye-relative distance to the vertex. RHW stands for Reciprocal of Homogeneous W and so is equal to 1/W. This value is used in fog and clipping calculations. The good news is that unless you're writing a software engine, you don't need to worry about it. Just set it to 1.0 and all should be fine.
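As a minimal sketch using the predefined CustomVertex.TransformedColored type (whose constructor takes x, y, z, rhw and a colour):

// A vertex at screen position (100, 50), at depth 0.5, with rhw = 1.0.
CustomVertex.TransformedColored v = new CustomVertex.TransformedColored(
   100.0f, 50.0f,                          // x, y in screen pixels
   0.5f,                                   // z: a depth between 0.0 and 1.0
   1.0f,                                   // rhw: just set it to 1.0
   System.Drawing.Color.White.ToArgb());   // diffuse colour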

Culling

A triangle is made up of 3 vertices which form a face. It is further categorized as being a front face or a back face. Back faces are culled, which means they are not drawn. Facing is determined by the winding order of the vertices: if they are defined in a clockwise manner they are front facing, otherwise they are back facing.

You can modify this behaviour to reverse the culling order or to disable culling entirely. This is done by setting the device's RenderStates. The state you want to modify is RenderStates.CullMode.
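A minimal sketch of the three options, assuming device is an initialized Device:

// The default: cull triangles whose vertices wind counter-clockwise.
device.RenderState.CullMode = Cull.CounterClockwise;
// Reverse the convention, culling clockwise triangles instead:
device.RenderState.CullMode = Cull.Clockwise;
// Or disable culling and draw both front and back faces:
device.RenderState.CullMode = Cull.None;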

Perspective

When things are close to you they appear large, and when they are far away they seem small. This is perspective. In Direct3D your view can be a perspective view or an orthogonal (affine) view. An orthogonal view does not scale an object based on distance.

When rendering with Screen Space coordinates there is no perspective. While the Z coordinate gives you "depth", no scaling will occur based on its value. It merely determines which objects are in front of which.

Most games use a perspective view for obvious reasons. 2D games would generally choose an affine view though. In a later lesson we will look at an Orthogonal Off-center Projection, which gives you an environment suitable for a 2D view that can trivially scale to other resolutions while allowing the flexibility of Untransformed vertices. It's pretty cool.
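As a taste of that later lesson, here's a minimal sketch (assuming device is an initialized Device) that maps a fixed 640x480 coordinate system onto whatever the real resolution is:

// Left = 0, right = 640, bottom = 480, top = 0 puts the origin at the
// top-left, like Screen Space, but the vertices stay Untransformed.
device.Transform.Projection =
   Matrix.OrthoOffCenterLH(0.0f, 640.0f, 480.0f, 0.0f, 0.0f, 1.0f);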

Back-Buffers and Depth Buffers

The standard setup in Direct3D gives you a Double-Buffered display. This means that in addition to the scene which is visible to you, there is another buffer which the application is drawing on in the background. This way a scene can be fully rendered and then presented to the user when it is complete.

In Direct3D all rendering is done to a back-buffer; direct access to the front buffer is not allowed. It's possible to have more than 1 back-buffer as well. Using 2 back-buffers is called Triple Buffering, and it is sometimes used by games to help smooth out their frame rate.

An optional addition to the back-buffer is a Depth Buffer (commonly known as a Z buffer). A Depth Buffer tracks the depth of the scene to prevent rendering objects which are not visible. Without a depth buffer, models in your scene will be rendered in the order you draw them, which is likely not the order in which they should appear. In addition to making the scene draw correctly, a depth buffer can also improve performance considerably. 2D games typically do not use a Depth Buffer: their draw order is well-defined and they would gain little by its use. There are of course always exceptions.
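Both features are requested up front, through the PresentParameters you pass when creating the device. A minimal sketch, assuming presentParams is that object:

// Two back-buffers gives you triple buffering.
presentParams.BackBufferCount = 2;
// Ask Direct3D to create and manage a depth buffer for us.
presentParams.EnableAutoDepthStencil = true;
presentParams.AutoDepthStencilFormat = DepthFormat.D16;   // 16-bit depth values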

Wrap Up

Hopefully you now have some understanding of these concepts. Don't worry if it seems a little vague. Things will make much more sense when you get to see and use these things in practice.
