Cogs and Levers A blog full of technical stuff

Basic 3D

Introduction

In this post, we’ll explore the foundations of 3D graphics, focusing on vector math, matrices, and transformations. By the end, you’ll understand how objects are transformed in 3D space and projected onto the screen. We’ll use Haskell for the code examples, as it closely resembles the mathematical operations involved.

Vectors

A 4D vector has four components: \(x\), \(y\), \(z\), and \(w\).

data Vec4 = Vec4 { x :: Double, y :: Double, z :: Double, w :: Double }
    deriving (Show, Eq)

In 3D graphics, we often work with 4D vectors (also called homogeneous coordinates) rather than 3D vectors. The extra dimension allows us to represent translations (which are not linear transformations) as matrix operations, keeping the math uniform.

A 4D vector is written as:

\[\boldsymbol{v} = \begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix}\]

Where:

  • \(x, y, z\) represent the position in 3D space
  • \(w\) is a homogeneous coordinate that allows us to apply translations and perspective transformations.

The extra \(w\)-component is crucial for distinguishing between points and directions (i.e., vectors). When \(w = 1\), the vector represents a point. When \(w = 0\), it represents a direction or vector.

Operations

We need to perform various operations on vectors in 3D space (or 4D homogeneous space), including addition, subtraction, multiplication, dot products, and normalization.

Addition

Given two vectors \(\boldsymbol{a}\) and \(\boldsymbol{b}\):

\[\boldsymbol{a} + \boldsymbol{b} = \begin{bmatrix} a_x \\ a_y \\ a_z \\ a_w \end{bmatrix} + \begin{bmatrix} b_x \\ b_y \\ b_z \\ b_w \end{bmatrix} = \begin{bmatrix} a_x + b_x \\ a_y + b_y \\ a_z + b_z \\ a_w + b_w \end{bmatrix}\]
add :: Vec4 -> Vec4 -> Vec4
add (Vec4 ax ay az aw) (Vec4 bx by bz bw) = Vec4 (ax + bx) (ay + by) (az + bz) (aw + bw)

Subtraction

\[\boldsymbol{a} - \boldsymbol{b} = \begin{bmatrix} a_x \\ a_y \\ a_z \\ a_w \end{bmatrix} - \begin{bmatrix} b_x \\ b_y \\ b_z \\ b_w \end{bmatrix} = \begin{bmatrix} a_x - b_x \\ a_y - b_y \\ a_z - b_z \\ a_w - b_w \end{bmatrix}\]
sub :: Vec4 -> Vec4 -> Vec4
sub (Vec4 ax ay az aw) (Vec4 bx by bz bw) = Vec4 (ax - bx) (ay - by) (az - bz) (aw - bw)

Dot Product

The dot product of two 3D vectors \(\boldsymbol{a} \cdot \boldsymbol{b}\) gives a scalar value:

\[\boldsymbol{a} \cdot \boldsymbol{b} = a_x \cdot b_x + a_y \cdot b_y + a_z \cdot b_z\]
dot :: Vec4 -> Vec4 -> Double
dot (Vec4 ax ay az _) (Vec4 bx by bz _) = ax * bx + ay * by + az * bz

Cross Product

The cross product is a vector operation that takes two 3D vectors and returns a third vector that is orthogonal (perpendicular) to both of the input vectors. The cross product is commonly used in 3D graphics to calculate surface normals, among other things.

For two 3D vectors \(\boldsymbol{a}\) and \(\boldsymbol{b}\), the cross product \(\boldsymbol{a} \times \boldsymbol{b}\) is defined as:

\[\boldsymbol{a} \times \boldsymbol{b} = \begin{bmatrix} a_y \cdot b_z - a_z \cdot b_y \\ a_z \cdot b_x - a_x \cdot b_z \\ a_x \cdot b_y - a_y \cdot b_x \end{bmatrix}\]

This resulting vector is perpendicular to both \(\boldsymbol{a}\) and \(\boldsymbol{b}\).

To implement the cross product in Haskell, we will only operate on the \(x\), \(y\), and \(z\) components of a Vec4 (ignoring \(w\)) since the cross product is defined for 3D vectors.

-- Compute the cross product of two 3D vectors
cross :: Vec4 -> Vec4 -> Vec4
cross (Vec4 ax ay az _) (Vec4 bx by bz _) =
    Vec4 ((ay * bz) - (az * by))  -- x component
         ((az * bx) - (ax * bz))  -- y component
         ((ax * by) - (ay * bx))  -- z component
         0                        -- w is zero for a direction vector

Length

The length or magnitude of a vector \(\boldsymbol{v}\) is:

\[\lVert \boldsymbol{v} \rVert = \sqrt{x^2 + y^2 + z^2}\]
len :: Vec4 -> Double
len (Vec4 x y z _) = sqrt (x * x + y * y + z * z)

Normalization

To normalize a vector is to scale it so that its length is 1:

\[\boldsymbol{v}_{\text{norm}} = \frac{\boldsymbol{v}}{\lVert \boldsymbol{v} \rVert}\]
normalize :: Vec4 -> Vec4
normalize v = let l = len v in scale (1 / l) v

Matrices

A 4x4 matrix consists of 16 elements. We’ll represent it as a flat structure with 16 values:

\[M = \begin{bmatrix} m_{00} & m_{01} & m_{02} & m_{03} \\ m_{10} & m_{11} & m_{12} & m_{13} \\ m_{20} & m_{21} & m_{22} & m_{23} \\ m_{30} & m_{31} & m_{32} & m_{33} \end{bmatrix}\]

In Haskell, we define it as:

data Mat4 = Mat4 { m00 :: Double, m01 :: Double, m02 :: Double, m03 :: Double
                 , m10 :: Double, m11 :: Double, m12 :: Double, m13 :: Double
                 , m20 :: Double, m21 :: Double, m22 :: Double, m23 :: Double
                 , m30 :: Double, m31 :: Double, m32 :: Double, m33 :: Double
}
deriving (Show, Eq)

In 3D graphics, transformations are applied to objects using 4x4 matrices. These matrices allow us to perform operations like translation, scaling, and rotation.

Operations

Addition

Adding two matrices \(A\) and \(B\) is done element-wise:

\[A + B = \begin{bmatrix} a_{00} + b_{00} & a_{01} + b_{01} & \dots \\ a_{10} + b_{10} & \dots & \dots \end{bmatrix}\]
addM :: Mat4 -> Mat4 -> Mat4
addM (Mat4 a00 a01 a02 a03 a10 a11 a12 a13 a20 a21 a22 a23 a30 a31 a32 a33)
(Mat4 b00 b01 b02 b03 b10 b11 b12 b13 b20 b21 b22 b23 b30 b31 b32 b33) =
    Mat4 (a00 + b00) (a01 + b01) (a02 + b02) (a03 + b03)
         (a10 + b10) (a11 + b11) (a12 + b12) (a13 + b13)
         (a20 + b20) (a21 + b21) (a22 + b22) (a23 + b23)
         (a30 + b30) (a31 + b31) (a32 + b32) (a33 + b33)

Multiplication

Multiplying two matrices \(A\) and \(B\):

\[C = A \cdot B\]

Where each element \(c_{ij}\) of the resulting matrix is calculated as:

\[c_{ij} = a_{i0} \cdot b_{0j} + a_{i1} \cdot b_{1j} + a_{i2} \cdot b_{2j} + a_{i3} \cdot b_{3j}\]
mulM :: Mat4 -> Mat4 -> Mat4
mulM (Mat4 a00 a01 a02 a03 a10 a11 a12 a13 a20 a21 a22 a23 a30 a31 a32 a33)
(Mat4 b00 b01 b02 b03 b10 b11 b12 b13 b20 b21 b22 b23 b30 b31 b32 b33) =
    Mat4 (a00 * b00 + a01 * b10 + a02 * b20 + a03 * b30)
         (a00 * b01 + a01 * b11 + a02 * b21 + a03 * b31)
         (a00 * b02 + a01 * b12 + a02 * b22 + a03 * b32)
         (a00 * b03 + a01 * b13 + a02 * b23 + a03 * b33)
         (a10 * b00 + a11 * b10 + a12 * b20 + a13 * b30)
         (a10 * b01 + a11 * b11 + a12 * b21 + a13 * b31)
         (a10 * b02 + a11 * b12 + a12 * b22 + a13 * b32)
         (a10 * b03 + a11 * b13 + a12 * b23 + a13 * b33)
         (a20 * b00 + a21 * b10 + a22 * b20 + a23 * b30)
         (a20 * b01 + a21 * b11 + a22 * b21 + a23 * b31)
         (a20 * b02 + a21 * b12 + a22 * b22 + a23 * b32)
         (a20 * b03 + a21 * b13 + a22 * b23 + a23 * b33)
         (a30 * b00 + a31 * b10 + a32 * b20 + a33 * b30)
         (a30 * b01 + a31 * b11 + a32 * b21 + a33 * b31)
         (a30 * b02 + a31 * b12 + a32 * b22 + a33 * b32)
         (a30 * b03 + a31 * b13 + a32 * b23 + a33 * b33)

Vector Multiply

We transform a vector by a matrix using a multiply operation.

\[\boldsymbol{v'} = M \cdot \boldsymbol{v}\]
-- Multiplying a 4D vector by a 4x4 matrix
mulMV :: Mat4 -> Vec4 -> Vec4
mulMV (Mat4 m00 m01 m02 m03 m10 m11 m12 m13 m20 m21 m22 m23 m30 m31 m32 m33)
    (Vec4 x y z w) =
        Vec4 (m00 * x + m01 * y + m02 * z + m03 * w)
             (m10 * x + m11 * y + m12 * z + m13 * w)
             (m20 * x + m21 * y + m22 * z + m23 * w)
             (m30 * x + m31 * y + m32 * z + m33 * w)

3D Transformations

In 3D graphics, we apply transformations like translation, scaling, and rotation using matrices. These transformations are applied to 4D vectors, and the operations are represented as matrix multiplications.

Identity Matrix

The identity matrix is a 4x4 matrix that leaves a vector unchanged when multiplied:

\[I = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}\]
identity :: Mat4
identity = 
    Mat4 1 0 0 0
         0 1 0 0
         0 0 1 0
         0 0 0 1

Translation Matrix

To translate a point by \((t_x, t_y, t_z)\), we use the translation matrix:

\[T = \begin{bmatrix} 1 & 0 & 0 & t_x \\ 0 & 1 & 0 & t_y \\ 0 & 0 & 1 & t_z \\ 0 & 0 & 0 & 1 \end{bmatrix}\]
translation :: Double -> Double -> Double -> Mat4
translation tx ty tz = 
    Mat4 1 0 0 tx
         0 1 0 ty
         0 0 1 tz
         0 0 0 1

Scale Matrix

Scaling a vector by \(s_x, s_y, s_z\) is done using the following matrix:

\[S = \begin{bmatrix} s_x & 0 & 0 & 0 \\ 0 & s_y & 0 & 0 \\ 0 & 0 & s_z & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}\]
scale :: Double -> Double -> Double -> Mat4
scale sx sy sz = 
    Mat4 sx 0  0  0
         0  sy 0  0
         0  0  sz 0
         0  0  0  1

Rotation Matrix

In 3D graphics, we frequently need to rotate objects around the X, Y, and Z axes. Each axis has its own corresponding rotation matrix, which we use to apply the rotation transformation to points in 3D space.

A rotation around the X-axis by an angle \(\theta\) is represented by the following matrix:

\[R_x(\theta) = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos \theta & -\sin \theta & 0 \\ 0 & \sin \theta & \cos \theta & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}\]

A rotation around the Y-axis by an angle \(\theta\) is represented by the following matrix:

\[R_y(\theta) = \begin{bmatrix} \cos \theta & 0 & \sin \theta & 0 \\ 0 & 1 & 0 & 0 \\ -\sin \theta & 0 & \cos \theta & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}\]

A rotation around the Z-axis by an angle \(\theta\) is represented by the following matrix:

\[R_z(\theta) = \begin{bmatrix} \cos \theta & -\sin \theta & 0 & 0 \\ \sin \theta & \cos \theta & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}\]

Combining Rotation Matrices

To rotate an object in 3D space about multiple axes, we can multiply the individual rotation matrices. The order of multiplication is crucial since matrix multiplication is not commutative. Typically, we perform rotations in the order of Z, then Y, then X (if required).

\[R = R_x(\theta_x) \cdot R_y(\theta_y) \cdot R_z(\theta_z)\]

Rotation Matrices in Haskell

Let’s implement the rotation matrices for the X, Y, and Z axes in Haskell:

rotationX :: Double -> Mat4
rotationX theta = 
    Mat4 1 0           0            0
         0 (cos theta) (-sin theta) 0
         0 (sin theta) (cos theta)  0
         0 0           0            1

rotationY :: Double -> Mat4
rotationY theta = 
    Mat4 (cos theta)  0 (sin theta) 0
         0            1 0           0
         (-sin theta) 0 (cos theta) 0
         0            0 0           1

rotationZ :: Double -> Mat4
rotationZ theta = 
    Mat4 (cos theta) (-sin theta) 0 0
         (sin theta) (cos theta)  0 0
         0           0            1 0
         0           0            0 1

Example: Rotating an Object

To apply a rotation to an object, you can combine the rotation matrices and multiply them by the object’s position vector. For instance, to rotate a point by \(\theta_x\), \(\theta_y\), and \(\theta_z\), you can multiply the corresponding matrices:

-- Rotate a point by theta_x, theta_y, and theta_z
let rotationMatrix = rotationX thetaX `mulM` rotationY thetaY `mulM` rotationZ thetaZ
let rotatedPoint = mulMV rotationMatrix pointVec

3D Transformations and Projection

Local vs World Coordinates

When dealing with 3D objects, we distinguish between local coordinates (relative to an object) and world coordinates (relative to the entire scene). Vectors are transformed from local to world coordinates by multiplying them by transformation matrices.

Projection Calculation

To project a 3D point onto a 2D screen, we use a projection matrix. The projection matrix transforms 3D coordinates into 2D coordinates by applying a perspective transformation.

A simple perspective projection matrix looks like this:

\[P = \begin{bmatrix} \frac{1}{\text{aspect}} & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & \frac{z_f + z_n}{z_n - z_f} & \frac{2z_f z_n}{z_n - z_f} \\ 0 & 0 & -1 & 0 \end{bmatrix}\]

Where:

  • \(z_f\) is the far clipping plane
  • \(z_n\) is the near clipping plane
  • \(\text{aspect}\) is the aspect ratio of the screen
projection :: Double -> Double -> Double -> Double -> Mat4
projection fov aspect near far =
    let scale = 1 / tan (fov / 2)
        in Mat4 (scale / aspect) 0     0                            0
                0                scale 0                            0
                0                0     (-far - near) / (far - near) (-2 * far * near) / (far - near)
                0                0     -1                           0

Reducing a 4D Vector to 2D Screen Coordinates

In 3D graphics, we often work with 4D vectors in homogeneous coordinates. To display a 3D point on a 2D screen, we need to project that point using a projection matrix and then convert the resulting 4D vector into 2D coordinates that we can draw on the screen.

Here’s how this process works:

Step 1: Apply the Projection Matrix

We start with a 4D vector \(\boldsymbol{v}\) in homogeneous coordinates:

\[\boldsymbol{v} = \begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix}\]

We apply the projection matrix \(P\), which transforms the 4D point into clip space (a space where coordinates can be projected to the screen).

The projection matrix looks something like this for perspective projection:

\[P = \begin{bmatrix} \frac{1}{\text{aspect}} & 0 & 0 & 0 \\ 0 & \frac{1}{\tan(\frac{fov}{2})} & 0 & 0 \\ 0 & 0 & \frac{z_f + z_n}{z_n - z_f} & \frac{2z_f z_n}{z_n - z_f} \\ 0 & 0 & -1 & 0 \end{bmatrix}\]

Multiplying \(\boldsymbol{v}\) by \(P\) gives us:

\[\boldsymbol{v'} = P \cdot \boldsymbol{v} = \begin{bmatrix} x' \\ y' \\ z' \\ w' \end{bmatrix}\]

Where:

\[\boldsymbol{v'} = \begin{bmatrix} x' \\ y' \\ z' \\ w' \end{bmatrix} = P \cdot \begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix}\]

Step 2: Perspective Divide

To convert the 4D vector \(\boldsymbol{v'}\) to 3D space, we perform the perspective divide. This means dividing the \(x'\), \(y'\), and \(z'\) components by the \(w'\) component.

The resulting 3D point \(\boldsymbol{v_{3D}}\) is:

\[\boldsymbol{v_{3D}} = \begin{bmatrix} \frac{x'}{w'} \\ \frac{y'}{w'} \\ \frac{z'}{w'} \end{bmatrix}\]

Step 3: Convert to Screen Coordinates

To get the final 2D screen coordinates, we need to convert the 3D point into normalized device coordinates (NDC), which range from -1 to 1. The screen coordinates ( (x_{\text{screen}}, y_{\text{screen}}) ) are then obtained by scaling these values to the screen dimensions:

\[x_{\text{screen}} = \left( \frac{x_{3D} + 1}{2} \right) \cdot \text{width}\] \[y_{\text{screen}} = \left( \frac{1 - y_{3D}}{2} \right) \cdot \text{height}\]

The factor \(\frac{x_{3D} + 1}{2}\) maps the normalized \(x\)-coordinate from the range [-1, 1] to [0, 1], and multiplying by the screen width gives us the pixel position. The same applies for \(y_{\text{screen}}\), but we invert the \(y_{3D}\) coordinate to account for the fact that screen coordinates typically have the origin at the top-left corner, whereas the NDC system has the origin at the center.

Putting it All Together in Haskell

Here’s how you can perform this transformation in Haskell:

-- Given a projection matrix and a 4D vector, project the vector to screen coordinates
projectToScreen :: Mat4 -> Vec4 -> Double -> Double -> (Double, Double)
projectToScreen projectionMatrix vec width height =
    let Vec4 x' y' z' w' = mulMV projectionMatrix vec  -- Apply projection matrix
        x3D = x' / w'                                  -- Perspective divide
        y3D = y' / w'
        -- Convert from NDC to screen coordinates
        xScreen = (x3D + 1) / 2 * width
        yScreen = (1 - y3D) / 2 * height
    in (xScreen, yScreen)

Example

Suppose we have the following vector and projection matrix:

let vec = Vec4 1 1 1 1  -- 3D point (1, 1, 1)
let projectionMatrix = projection 90 (16/9) 0.1 1000  -- Field of view, aspect ratio, near/far planes
let (xScreen, yScreen) = projectToScreen projectionMatrix vec 1920 1080  -- Screen resolution

This will give you the screen coordinates \(x_{\text{screen}}\) and \(y_{\text{screen}}\), where the 3D point \((1, 1, 1)\) will be projected on a 1920x1080 display.

Conclusion

This has been some of the basic 3D concepts presented through Haskell. In future posts, we’ll use this code to create some basic animations on screen.