Basic 3D
18 Oct 2024Introduction
In this post, we’ll explore the foundations of 3D graphics, focusing on vector math, matrices, and transformations. By the end, you’ll understand how objects are transformed in 3D space and projected onto the screen. We’ll use Haskell for the code examples, as it closely resembles the mathematical operations involved.
Vectors
A 4D vector has four components: \(x\), \(y\), \(z\), and \(w\).
data Vec4 = Vec4 { x :: Double, y :: Double, z :: Double, w :: Double }
deriving (Show, Eq)
In 3D graphics, we often work with 4D vectors (also called homogeneous coordinates) rather than 3D vectors. The extra dimension allows us to represent translations (which are not linear transformations) as matrix operations, keeping the math uniform.
A 4D vector is written as:
\[\boldsymbol{v} = \begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix}\]Where:
- \(x, y, z\) represent the position in 3D space
- \(w\) is a homogeneous coordinate that allows us to apply translations and perspective transformations.
The extra \(w\)-component is crucial for distinguishing between points and directions (i.e., vectors). When \(w = 1\), the vector represents a point. When \(w = 0\), it represents a direction or vector.
Operations
We need to perform various operations on vectors in 3D space (or 4D homogeneous space), including addition, subtraction, multiplication, dot products, and normalization.
Addition
Given two vectors \(\boldsymbol{a}\) and \(\boldsymbol{b}\):
\[\boldsymbol{a} + \boldsymbol{b} = \begin{bmatrix} a_x \\ a_y \\ a_z \\ a_w \end{bmatrix} + \begin{bmatrix} b_x \\ b_y \\ b_z \\ b_w \end{bmatrix} = \begin{bmatrix} a_x + b_x \\ a_y + b_y \\ a_z + b_z \\ a_w + b_w \end{bmatrix}\]add :: Vec4 -> Vec4 -> Vec4
add (Vec4 ax ay az aw) (Vec4 bx by bz bw) = Vec4 (ax + bx) (ay + by) (az + bz) (aw + bw)
Subtraction
\[\boldsymbol{a} - \boldsymbol{b} = \begin{bmatrix} a_x \\ a_y \\ a_z \\ a_w \end{bmatrix} - \begin{bmatrix} b_x \\ b_y \\ b_z \\ b_w \end{bmatrix} = \begin{bmatrix} a_x - b_x \\ a_y - b_y \\ a_z - b_z \\ a_w - b_w \end{bmatrix}\]sub :: Vec4 -> Vec4 -> Vec4
sub (Vec4 ax ay az aw) (Vec4 bx by bz bw) = Vec4 (ax - bx) (ay - by) (az - bz) (aw - bw)
Dot Product
The dot product of two 3D vectors \(\boldsymbol{a} \cdot \boldsymbol{b}\) gives a scalar value:
\[\boldsymbol{a} \cdot \boldsymbol{b} = a_x \cdot b_x + a_y \cdot b_y + a_z \cdot b_z\]dot :: Vec4 -> Vec4 -> Double
dot (Vec4 ax ay az _) (Vec4 bx by bz _) = ax * bx + ay * by + az * bz
Cross Product
The cross product is a vector operation that takes two 3D vectors and returns a third vector that is orthogonal (perpendicular) to both of the input vectors. The cross product is commonly used in 3D graphics to calculate surface normals, among other things.
For two 3D vectors \(\boldsymbol{a}\) and \(\boldsymbol{b}\), the cross product \(\boldsymbol{a} \times \boldsymbol{b}\) is defined as:
\[\boldsymbol{a} \times \boldsymbol{b} = \begin{bmatrix} a_y \cdot b_z - a_z \cdot b_y \\ a_z \cdot b_x - a_x \cdot b_z \\ a_x \cdot b_y - a_y \cdot b_x \end{bmatrix}\]This resulting vector is perpendicular to both \(\boldsymbol{a}\) and \(\boldsymbol{b}\).
To implement the cross product in Haskell, we will only operate on the \(x\), \(y\), and \(z\) components of a
Vec4
(ignoring \(w\)) since the cross product is defined for 3D vectors.
-- Compute the cross product of two 3D vectors
cross :: Vec4 -> Vec4 -> Vec4
cross (Vec4 ax ay az _) (Vec4 bx by bz _) =
Vec4 ((ay * bz) - (az * by)) -- x component
((az * bx) - (ax * bz)) -- y component
((ax * by) - (ay * bx)) -- z component
0 -- w is zero for a direction vector
Length
The length or magnitude of a vector \(\boldsymbol{v}\) is:
\[\lVert \boldsymbol{v} \rVert = \sqrt{x^2 + y^2 + z^2}\]len :: Vec4 -> Double
len (Vec4 x y z _) = sqrt (x * x + y * y + z * z)
Normalization
To normalize a vector is to scale it so that its length is 1:
\[\boldsymbol{v}_{\text{norm}} = \frac{\boldsymbol{v}}{\lVert \boldsymbol{v} \rVert}\]normalize :: Vec4 -> Vec4
normalize v = let l = len v in scale (1 / l) v
Matrices
A 4x4 matrix consists of 16 elements. We’ll represent it as a flat structure with 16 values:
\[M = \begin{bmatrix} m_{00} & m_{01} & m_{02} & m_{03} \\ m_{10} & m_{11} & m_{12} & m_{13} \\ m_{20} & m_{21} & m_{22} & m_{23} \\ m_{30} & m_{31} & m_{32} & m_{33} \end{bmatrix}\]In Haskell, we define it as:
data Mat4 = Mat4 { m00 :: Double, m01 :: Double, m02 :: Double, m03 :: Double
, m10 :: Double, m11 :: Double, m12 :: Double, m13 :: Double
, m20 :: Double, m21 :: Double, m22 :: Double, m23 :: Double
, m30 :: Double, m31 :: Double, m32 :: Double, m33 :: Double
}
deriving (Show, Eq)
In 3D graphics, transformations are applied to objects using 4x4 matrices. These matrices allow us to perform operations like translation, scaling, and rotation.
Operations
Addition
Adding two matrices \(A\) and \(B\) is done element-wise:
\[A + B = \begin{bmatrix} a_{00} + b_{00} & a_{01} + b_{01} & \dots \\ a_{10} + b_{10} & \dots & \dots \end{bmatrix}\]addM :: Mat4 -> Mat4 -> Mat4
addM (Mat4 a00 a01 a02 a03 a10 a11 a12 a13 a20 a21 a22 a23 a30 a31 a32 a33)
(Mat4 b00 b01 b02 b03 b10 b11 b12 b13 b20 b21 b22 b23 b30 b31 b32 b33) =
Mat4 (a00 + b00) (a01 + b01) (a02 + b02) (a03 + b03)
(a10 + b10) (a11 + b11) (a12 + b12) (a13 + b13)
(a20 + b20) (a21 + b21) (a22 + b22) (a23 + b23)
(a30 + b30) (a31 + b31) (a32 + b32) (a33 + b33)
Multiplication
Multiplying two matrices \(A\) and \(B\):
\[C = A \cdot B\]Where each element \(c_{ij}\) of the resulting matrix is calculated as:
\[c_{ij} = a_{i0} \cdot b_{0j} + a_{i1} \cdot b_{1j} + a_{i2} \cdot b_{2j} + a_{i3} \cdot b_{3j}\]mulM :: Mat4 -> Mat4 -> Mat4
mulM (Mat4 a00 a01 a02 a03 a10 a11 a12 a13 a20 a21 a22 a23 a30 a31 a32 a33)
(Mat4 b00 b01 b02 b03 b10 b11 b12 b13 b20 b21 b22 b23 b30 b31 b32 b33) =
Mat4 (a00 * b00 + a01 * b10 + a02 * b20 + a03 * b30)
(a00 * b01 + a01 * b11 + a02 * b21 + a03 * b31)
(a00 * b02 + a01 * b12 + a02 * b22 + a03 * b32)
(a00 * b03 + a01 * b13 + a02 * b23 + a03 * b33)
(a10 * b00 + a11 * b10 + a12 * b20 + a13 * b30)
(a10 * b01 + a11 * b11 + a12 * b21 + a13 * b31)
(a10 * b02 + a11 * b12 + a12 * b22 + a13 * b32)
(a10 * b03 + a11 * b13 + a12 * b23 + a13 * b33)
(a20 * b00 + a21 * b10 + a22 * b20 + a23 * b30)
(a20 * b01 + a21 * b11 + a22 * b21 + a23 * b31)
(a20 * b02 + a21 * b12 + a22 * b22 + a23 * b32)
(a20 * b03 + a21 * b13 + a22 * b23 + a23 * b33)
(a30 * b00 + a31 * b10 + a32 * b20 + a33 * b30)
(a30 * b01 + a31 * b11 + a32 * b21 + a33 * b31)
(a30 * b02 + a31 * b12 + a32 * b22 + a33 * b32)
(a30 * b03 + a31 * b13 + a32 * b23 + a33 * b33)
Vector Multiply
We transform a vector by a matrix using a multiply operation.
\[\boldsymbol{v'} = M \cdot \boldsymbol{v}\]-- Multiplying a 4D vector by a 4x4 matrix
mulMV :: Mat4 -> Vec4 -> Vec4
mulMV (Mat4 m00 m01 m02 m03 m10 m11 m12 m13 m20 m21 m22 m23 m30 m31 m32 m33)
(Vec4 x y z w) =
Vec4 (m00 * x + m01 * y + m02 * z + m03 * w)
(m10 * x + m11 * y + m12 * z + m13 * w)
(m20 * x + m21 * y + m22 * z + m23 * w)
(m30 * x + m31 * y + m32 * z + m33 * w)
3D Transformations
In 3D graphics, we apply transformations like translation, scaling, and rotation using matrices. These transformations are applied to 4D vectors, and the operations are represented as matrix multiplications.
Identity Matrix
The identity matrix is a 4x4 matrix that leaves a vector unchanged when multiplied:
\[I = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}\]identity :: Mat4
identity =
Mat4 1 0 0 0
0 1 0 0
0 0 1 0
0 0 0 1
Translation Matrix
To translate a point by \((t_x, t_y, t_z)\), we use the translation matrix:
\[T = \begin{bmatrix} 1 & 0 & 0 & t_x \\ 0 & 1 & 0 & t_y \\ 0 & 0 & 1 & t_z \\ 0 & 0 & 0 & 1 \end{bmatrix}\]translation :: Double -> Double -> Double -> Mat4
translation tx ty tz =
Mat4 1 0 0 tx
0 1 0 ty
0 0 1 tz
0 0 0 1
Scale Matrix
Scaling a vector by \(s_x, s_y, s_z\) is done using the following matrix:
\[S = \begin{bmatrix} s_x & 0 & 0 & 0 \\ 0 & s_y & 0 & 0 \\ 0 & 0 & s_z & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}\]scale :: Double -> Double -> Double -> Mat4
scale sx sy sz =
Mat4 sx 0 0 0
0 sy 0 0
0 0 sz 0
0 0 0 1
Rotation Matrix
In 3D graphics, we frequently need to rotate objects around the X, Y, and Z axes. Each axis has its own corresponding rotation matrix, which we use to apply the rotation transformation to points in 3D space.
A rotation around the X-axis by an angle \(\theta\) is represented by the following matrix:
\[R_x(\theta) = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos \theta & -\sin \theta & 0 \\ 0 & \sin \theta & \cos \theta & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}\]A rotation around the Y-axis by an angle \(\theta\) is represented by the following matrix:
\[R_y(\theta) = \begin{bmatrix} \cos \theta & 0 & \sin \theta & 0 \\ 0 & 1 & 0 & 0 \\ -\sin \theta & 0 & \cos \theta & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}\]A rotation around the Z-axis by an angle \(\theta\) is represented by the following matrix:
\[R_z(\theta) = \begin{bmatrix} \cos \theta & -\sin \theta & 0 & 0 \\ \sin \theta & \cos \theta & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}\]Combining Rotation Matrices
To rotate an object in 3D space about multiple axes, we can multiply the individual rotation matrices. The order of multiplication is crucial since matrix multiplication is not commutative. Typically, we perform rotations in the order of Z, then Y, then X (if required).
\[R = R_x(\theta_x) \cdot R_y(\theta_y) \cdot R_z(\theta_z)\]Rotation Matrices in Haskell
Let’s implement the rotation matrices for the X, Y, and Z axes in Haskell:
rotationX :: Double -> Mat4
rotationX theta =
Mat4 1 0 0 0
0 (cos theta) (-sin theta) 0
0 (sin theta) (cos theta) 0
0 0 0 1
rotationY :: Double -> Mat4
rotationY theta =
Mat4 (cos theta) 0 (sin theta) 0
0 1 0 0
(-sin theta) 0 (cos theta) 0
0 0 0 1
rotationZ :: Double -> Mat4
rotationZ theta =
Mat4 (cos theta) (-sin theta) 0 0
(sin theta) (cos theta) 0 0
0 0 1 0
0 0 0 1
Example: Rotating an Object
To apply a rotation to an object, you can combine the rotation matrices and multiply them by the object’s position vector. For instance, to rotate a point by \(\theta_x\), \(\theta_y\), and \(\theta_z\), you can multiply the corresponding matrices:
-- Rotate a point by theta_x, theta_y, and theta_z
let rotationMatrix = rotationX thetaX `mulM` rotationY thetaY `mulM` rotationZ thetaZ
let rotatedPoint = mulMV rotationMatrix pointVec
3D Transformations and Projection
Local vs World Coordinates
When dealing with 3D objects, we distinguish between local coordinates (relative to an object) and world coordinates (relative to the entire scene). Vectors are transformed from local to world coordinates by multiplying them by transformation matrices.
Projection Calculation
To project a 3D point onto a 2D screen, we use a projection matrix. The projection matrix transforms 3D coordinates into 2D coordinates by applying a perspective transformation.
A simple perspective projection matrix looks like this:
\[P = \begin{bmatrix} \frac{1}{\text{aspect}} & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & \frac{z_f + z_n}{z_n - z_f} & \frac{2z_f z_n}{z_n - z_f} \\ 0 & 0 & -1 & 0 \end{bmatrix}\]Where:
- \(z_f\) is the far clipping plane
- \(z_n\) is the near clipping plane
- \(\text{aspect}\) is the aspect ratio of the screen
projection :: Double -> Double -> Double -> Double -> Mat4
projection fov aspect near far =
let scale = 1 / tan (fov / 2)
in Mat4 (scale / aspect) 0 0 0
0 scale 0 0
0 0 (-far - near) / (far - near) (-2 * far * near) / (far - near)
0 0 -1 0
Reducing a 4D Vector to 2D Screen Coordinates
In 3D graphics, we often work with 4D vectors in homogeneous coordinates. To display a 3D point on a 2D screen, we need to project that point using a projection matrix and then convert the resulting 4D vector into 2D coordinates that we can draw on the screen.
Here’s how this process works:
Step 1: Apply the Projection Matrix
We start with a 4D vector \(\boldsymbol{v}\) in homogeneous coordinates:
\[\boldsymbol{v} = \begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix}\]We apply the projection matrix \(P\), which transforms the 4D point into clip space (a space where coordinates can be projected to the screen).
The projection matrix looks something like this for perspective projection:
\[P = \begin{bmatrix} \frac{1}{\text{aspect}} & 0 & 0 & 0 \\ 0 & \frac{1}{\tan(\frac{fov}{2})} & 0 & 0 \\ 0 & 0 & \frac{z_f + z_n}{z_n - z_f} & \frac{2z_f z_n}{z_n - z_f} \\ 0 & 0 & -1 & 0 \end{bmatrix}\]Multiplying \(\boldsymbol{v}\) by \(P\) gives us:
\[\boldsymbol{v'} = P \cdot \boldsymbol{v} = \begin{bmatrix} x' \\ y' \\ z' \\ w' \end{bmatrix}\]Where:
\[\boldsymbol{v'} = \begin{bmatrix} x' \\ y' \\ z' \\ w' \end{bmatrix} = P \cdot \begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix}\]Step 2: Perspective Divide
To convert the 4D vector \(\boldsymbol{v'}\) to 3D space, we perform the perspective divide. This means dividing the \(x'\), \(y'\), and \(z'\) components by the \(w'\) component.
The resulting 3D point \(\boldsymbol{v_{3D}}\) is:
\[\boldsymbol{v_{3D}} = \begin{bmatrix} \frac{x'}{w'} \\ \frac{y'}{w'} \\ \frac{z'}{w'} \end{bmatrix}\]Step 3: Convert to Screen Coordinates
To get the final 2D screen coordinates, we need to convert the 3D point into normalized device coordinates (NDC), which range from -1 to 1. The screen coordinates ( (x_{\text{screen}}, y_{\text{screen}}) ) are then obtained by scaling these values to the screen dimensions:
\[x_{\text{screen}} = \left( \frac{x_{3D} + 1}{2} \right) \cdot \text{width}\] \[y_{\text{screen}} = \left( \frac{1 - y_{3D}}{2} \right) \cdot \text{height}\]The factor \(\frac{x_{3D} + 1}{2}\) maps the normalized \(x\)-coordinate from the range [-1, 1] to [0, 1], and multiplying by the screen width gives us the pixel position. The same applies for \(y_{\text{screen}}\), but we invert the \(y_{3D}\) coordinate to account for the fact that screen coordinates typically have the origin at the top-left corner, whereas the NDC system has the origin at the center.
Putting it All Together in Haskell
Here’s how you can perform this transformation in Haskell:
-- Given a projection matrix and a 4D vector, project the vector to screen coordinates
projectToScreen :: Mat4 -> Vec4 -> Double -> Double -> (Double, Double)
projectToScreen projectionMatrix vec width height =
let Vec4 x' y' z' w' = mulMV projectionMatrix vec -- Apply projection matrix
x3D = x' / w' -- Perspective divide
y3D = y' / w'
-- Convert from NDC to screen coordinates
xScreen = (x3D + 1) / 2 * width
yScreen = (1 - y3D) / 2 * height
in (xScreen, yScreen)
Example
Suppose we have the following vector and projection matrix:
let vec = Vec4 1 1 1 1 -- 3D point (1, 1, 1)
let projectionMatrix = projection 90 (16/9) 0.1 1000 -- Field of view, aspect ratio, near/far planes
let (xScreen, yScreen) = projectToScreen projectionMatrix vec 1920 1080 -- Screen resolution
This will give you the screen coordinates \(x_{\text{screen}}\) and \(y_{\text{screen}}\), where the 3D point \((1, 1, 1)\) will be projected on a 1920x1080 display.
Conclusion
This has been some of the basic 3D concepts presented through Haskell. In future posts, we’ll use this code to create some basic animations on screen.