Cogs and Levers A blog full of technical stuff

Perlin Noise

Introduction

Noise functions in computer applications allow programmers to make the machine act a little more naturally. It’s the randomness introduced with these algorithms that gives the computer what appears to be “free thought” or unexpected decisions.

Today, I’ll walk through the Perlin Noise algorithm which has applications in computer science ranging from player movement, landscape generation, clouds, etc.

Here are some examples of the Perlin Noise function output into two dimensions:

Noise 1 Noise 2

In today’s post, I’ll walk through the Perlin Noise algorithm and what steps you need to take to implement it yourself.

Understanding Noise

The Perlin Noise algorithm can be broken down into a few smaller pieces to make it easier to understand. At its heart, the algorithm needs pseudo-random numbers. These random numbers should be repeatable so that you can re-generate the same noise pattern at will.

A common noise function for two parameters that I have found used over the web is as follows:

float noise(const int x, const int y) {
  int n = x + y * 57;
  n = (n << 13) ^ n;
  return (1.0f - ( ( n * (n * n * 15731 + 789221) + 1376312589) & 0x7FFFFFFF) / 1073741824.0f);
}

There’s a lot of math transformation in this previous function. You can use any function at all to produce your random numbers, just make sure that you can generate them against two parameters (in the case of 2d) and that you’ll get repeatable results.

Next we’ll smooth out the noise between two points. We’ll do this by sampling the corners, sides and centre of the point we’re currently generating for.

float smoothNoise(const float x, const float y) {
  int ix = (int)x;
  int iy = (int)y;

  // sample the corners
  float corners = (noise(ix - 1, iy - 1) +
                   noise(ix + 1, iy - 1) +
                   noise(ix - 1, iy + 1) +
                   noise(ix + 1, iy + 1)) / 16;

  // sample the sides
  float sides = (noise(ix - 1, iy) +
                 noise(ix + 1, iy) +
                 noise(ix, iy - 1) +
                 noise(ix, iy + 1)) / 8;

  // sample the centre
  float centre = noise(ix, iy);

  // send out the accumulated result
  return corners + sides + centre;
}

With the above function, we can now sample a small area for a given point. All based on our random number generator.

For the fractional parts that occur between solid boundaries, we’ll use a specific interpolation method. I’ve defined two below. One that will do linear interpolation and one that will use cosine for a smoother transition between points.

/* Linear interpolation */
float lerp(float a, float b, float x) {
   return a * (1 - x) + b * x;
}

/* Trigonometric interpolation */
float terp(float a, float b, float x) {
  float ft = x * 3.1415927f;
  float f = (1 - cosf(ft)) * 0.5f;

  return a * (1 - f) + b * f;
}

/* Noise interpolation */
float interpolateNoise(const float x, const float y) {
  int   ix = (int)x;
  float fx = x - ix;

  int   iy = (int)y;
  float fy = y - iy;

  float v1 = smoothNoise(ix, iy),
        v2 = smoothNoise(ix + 1, iy),
        v3 = smoothNoise(ix, iy + 1),
        v4 = smoothNoise(ix + 1, iy + 1);

  float i1 = terp(v1, v2, fx),
        i2 = terp(v3, v4, fx);

  return terp(i1, i2, fy);
}

Finally we use this smooth interpolation to perform the perlin noise function. A couple of interesting co-effecients that are provided to the algorithm are “octaves” and “persistence”. “octaves” defines how many iterations that will be performed and “persistence” defines how much of the spectrum we’ll utilise. It’s highly interactive to the main curve co-effecients: frequency and amplitude.

float perlin2d(const float x, const float y,
               const int octaves, const float persistence) {
  float total = 0.0f;

  for (int i = 0; i <= (octaves - 1); i ++) {
    float frequency = powf(2, i);
    float amplitude = powf(persistence, i);

    total = total + interpolateNoise(x * frequency, y * frequency) * amplitude;
  }

  return total;
}

This now provides us with a way to create a map (or 2d array) to produce images much like I’d pasted in above. Here is a typical build loop.

for (int x = 0; x < w; x ++) {
  for (int y = 0; y < h; y ++) {
    float xx = (float)x / (float)this->width;
    float yy = (float)y / (float)this->height;

    map[x + (y * w)] = perlin::perlin2d(
      xx, yy,
      6, 1.02f
    );
  }
}

That should be enough to get you going.

Follow up to a Camera Implementation

Introduction

In a previous post I’d written about a simple camera implementation that you can use in projects of your own. This post I’ll show how I’ve implemented this camera with some mouse handling routines to make it feel like you’re orienting your head with the mouse.

The Idea

We’ll capture all of the mouse movements given in an application and see how far the mouse deviates from a given origin point. I think that the most sane origin point to go from is the center of your window. For each movement that the mouse makes from the center of the window, we need to:

  • Determine how much movement occurred on the x axis
  • Determine how much movement occurred on the y axis
  • Deaden this movement by a co-efficient to simulate mouse “sensitivity”
  • Set the yaw and pitch (head up/down, left/right) of the camera
  • Reset the mouse back to the origin point

Here’s how I’ve done it in code:

void mouseMotion(int x, int y) {
  // calculate the origin point (shifting right divides by two, remember!)
  int halfWidth = windowWidth >> 1;
  int halfHeight = windowHeight >> 1;

  // calculate how far we deviated from the origin point and deaden this
  // by a factor of 20
  float deltaX = (halfWidth - x) / 20.0f;
  float deltaY = (halfHeight - y) / 20.0f;

  // don't do anything if there wasn't any movement to report
  if ((deltaX == 0.0f) && (deltaY == 0.0f)) {
    return ;
  }

  // set the camera's orientation
  cam.yaw(deltaX);
  cam.pitch(deltaY);

  // reset the mouse pointer back to the origin point
  glutWarpPointer(halfWidth, halfHeight);
}

You can see that I’m using GLUT to do these demos. The only GLUT-specific piece of code here is the warp command which puts the mouse back onto the origin point. You should have an equivalent function to do the same in the framework of your choice.

Well, there you have it. You can now orient your camera using your mouse.

A Camera Implementation in C++

Introduction

One of the most important elements of any 3D application is its camera. A camera allows you to move around in your world and orient your view. In today’s post, I’ll put together a code walk through that will take you through a simple camera implementation.

The Concept

There are two major components of any camera. They are position and orientation. Position is fairly easy to grasp, it’s just where the camera is and is identified using a normal 3-space vector.

The more difficult of the two concepts is orientation. The best description for this that I’ve found is on the Flight Dynamics page on wikipedia. The following image has been taken from that article and it outlines the plains that orientation can occur. Of course, the image’s subject is an aircraft but the same concepts apply to a camera’s orientation:

Roll pitch yaw

The pitch describes the orientation around the x-axis, the yaw describes the orientation around the y-axis and the roll describes the orientation around the z-axis.

With all of this information on board, the requirements of our camera should become a little clearer. We need to keep track of the following about the camera:

  • Position
  • Up orientation (yaw axis)
  • Right direction (pitch axis)
  • Forward (or view) direction (roll axis)

We’ll also keep track of how far we’ve gone around the yaw, pitch and roll axis.

Some Maths (and code)

There’s a handful of really useful equations that are going to help us out here. With all of the information that we’ll be managing and how tightly related each axis is considering they’re all relating to the same object - you can see how interactive it can be just by modifying one attribute.

When the pitch changes:

forwardVector = (forwardVector * cos(pitch)) + (upVector * sin(pitch))
upVector      = forwardVector x rightVector
void camera::pitch(const float angle) {
  // keep track of how far we've gone around the axis
  this->rotatedX += angle;

  // calculate the new forward vector
  this->viewDir = glm::normalize(
    this->viewDir * cosf(angle * PION180) +
    this->upVector * sinf(angle * PION180)
  );

  // calculate the new up vector
  this->upVector  = glm::cross(this->viewDir, this->rightVector);
  // invert so that positive goes down
  this->upVector *= -1;
}

When the yaw changes:

forwardVector = (forwardVector * cos(yaw)) - (rightVector * sin(yaw))
rightVector   = forwardVector x upVector
void camera::yaw(const float angle) {
  // keep track of how far we've gone around this axis
  this->rotatedY += angle;

  // re-calculate the new forward vector
  this->viewDir = glm::normalize(
    this->viewDir * cosf(angle * PION180) -
    this->rightVector * sinf(angle * PION180)
  );

  // re-calculate the new right vector
  this->rightVector = glm::cross(this->viewDir, this->upVector);
}

When the roll changes:

rightVector = (rightVector * cos(roll)) + (upVector * sin(roll))
upVector    = forwardVector x rightVector
void camera::roll(const float angle) {
  // keep track of how far we've gone around this axis
  this->rotatedZ += angle;

  // re-calculate the forward vector
  this->rightVector = glm::normalize(
    this->rightVector * cosf(angle * PION180) +
    this->upVector * sinf(angle * PION180)
  );

  // re-calculate the up vector
  this->upVector  = glm::cross(this->viewDir, this->rightVector);
  // invert the up vector so positive points down
  this->upVector *= -1;
}

Ok, so that’s it for orientation. Reading through the equations above, you can see that the calculation of the forward vector comes out of some standard rotations. The x operator that I’ve used above denotes vector cross product.

Now that we’re keeping track of our current viewing direction, up direction and right direction; performing camera movements is really easy.

I’ve called these advance (move along the forward plane), ascend (move along the up plane) and strafe (move along the right plane).

// z movement
void camera::advance(const float distance) {
  this->position += (this->viewDir * -distance);
}

// y movement
void camera::ascend(const float distance) {
  this->position += (this->upVector * distance);
}

// x movement
void camera::strafe(const float distance) {
  this->position += (this->rightVector * distance);
}

All we are doing here is just moving along those planes that have been defined for us via orientation. Movement is simple.

Integrating

All of this code/math is great up until we need to apply it in our environments. Most of the work that I do centralises around OpenGL, so I’ve got a very handy utility function (from GLU) that I use called gluLookAt. Plugging these values in is rather simple:

void camera::place(void) {
  glm::vec3 viewPoint = this->position + this->viewDir;

   // setup opengl with gluLookAt
  gluLookAt(
    position[0], position[1], position[2],
    viewPoint[0], viewPoint[1], viewPoint[2],
    upVector[0], upVector[1], upVector[2]
  );
}

We calculate our viewpoint as (position - forwardVector) and really just plug the literal values into this function. There is a lot of information on the gluLookAt documentation page that you can use if OpenGL isn’t your flavor to simulate what it does.

That’s it for a simple camera!

Getting started with GLUT in Linux

Introduction

GLUT is the OpenGL Utility Toolkit which is a standard set of APIs that you should be able to use on any platform to write OpenGL programs. It takes care of the boilerplate code that your applications would need to integrate with the host windowing system. More can be found about GLUT on its website.

Today’s post, I’ll focus on getting your Linux environment up to speed to start writing programs with this framework.

Installation

In order to write programs using this library, you’ll need to install the development library. Using your favorite package manager, you’ll need to install freeglut.

$ sudo apt-get install freeglut3-dev

After that’s finished, it’s time to write a test application to make sure everything went to plan.

A Simple Example

The following program will just open a window and continually clear the window.

#include <GL/glut.h>

void resize(int width, int height) {

   // avoid div-by-zero
   if (height == 0) {
      height = 1;
   }

   // calculate the aspect ratio
   float ratio = width * 1.0 / height;

   // put opengl into projection matrix mode
   glMatrixMode(GL_PROJECTION);

   // reset the matrix
   glLoadIdentity();
   // set the viewport
   glViewport(0, 0, width, height);
   // set the perspective
   gluPerspective(45.0f, ratio, 0.1f, 100.0f);

   // put opengl back into modelview mode
   glMatrixMode(GL_MODELVIEW);

}

void render(void) {

   // just clear the buffers for now
   glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);

   // flip the buffers
   glutSwapBuffers();

}


int main(int argc, char *argv[]) {

   // initialize glut
   glutInit(&argc, argv);

   // setup a depth buffer, double buffer and rgba mode
   glutInitDisplayMode(GLUT_DEPTH | GLUT_DOUBLE | GLUT_RGBA);

   // set the windows initial position and size
   glutInitWindowPosition(50, 50);
   glutInitWindowSize(320, 240);

   // create the window
   glutCreateWindow("Test Glut Program");

   // register the callbacks to glut
   glutDisplayFunc(render);
   glutReshapeFunc(resize);
   glutIdleFunc(render);

   // run the program
   glutMainLoop();

   return 0;
   
}

Putting this code into “test.c”, we built it into a program with the following command:

$ gcc test.c -lGL -lGLU -lglut -o test

That’s it! Run “test” at the command prompt and if everything has gone to plan, you’ve installed freeglut correctly!

Diving into OpenCL

Introduction

In a previous article I’d put together a walk through on how to get your development environment ready to write some OpenCL code. This article by itself isn’t of much use unless you can write some code already. Today’s post will be a walk through on writing your first OpenCL program. This example, much like a lot of the other entry-level OpenCL development tutorials will focus on performing addition between two lists of floating point numbers.

Lots to learn

Unfortunately, OpenCL is a topic that brings a very steep learning curve. In order to understand even the most simple of programs you need to read a fair bit of code and hopefully be aware of what it’s doing. Before we dive into any implementation, I’ll take you on a brief tour of terms, types and definitions that will help you to understanding the code as it’s presented.

A cl_platform_id is obtained using clGetPlatformIDs. A platform in OpenCL refers to the host execution environment and any attached devices. Platforms are what allow OpenCL to share resources and execute programs.

A cl_device_id is obtained using clGetDeviceIDs. It’s how your program will refer to the devices that your code will run on. A device is how OpenCL refers to “something” that will execute code (CPU, GPU, etc).

A cl_context is obtained using clCreateContext. A context is established across OpenCL devices. It’s what OpenCL will use to manage command-queues, memory, program and kernel objects. It provides the ability to execute a kernel across many devices.

A cl_program is created from actual (string) source-code at runtime using clCreateProgramWithSource. They’re created in conjunction with your context so that program creation is aware of where it’ll be expected to run. After the cl_program reference is successfully established, the host program would typically call clBuildProgram to take the program from its source code (string) state into an executable (binary) state.

A cl_command_queue is established using clCreateCommandQueue. A command queue is how work is scheduled to a device for execution.

A cl_kernel is created using clCreateKernel. A kernel is a function contained within a compiled cl_program object. It’s identified within the source code with a __kernel qualifier. You set the argument list for a cl_kernel object using clSetKernelArg. To glue it all together, you use clEnqueueNDRangeKernel or clEnqueueTask to enqueue a task on the command queue to execute a kernel.

A side note here is that you can use clEnqueueNativeKernel to execute native C/C++ code that isn’t compiled by OpenCL.

At least if you can identify some form of meaning when you come across these type names, you won’t be totally in the dark. Next up, we’ll create a host program and OpenCL routine - compile, build and run!

The Host

The host application is responsible for engaging with the OpenCL api to setup all of the objects described above. It’s also responsible for locating the OpenCL source code and making it available for compilation at run time. In this first snippet of code, we use the OpenCL api to establish the management platform and devices that are available to execute our code. Majority of the OpenCL api standardises itself around returning error codes from all of the functions.

cl_platform_id platform;
cl_device_id device;
int err;

/* find any available platform */
err = clGetPlatformIDs(1, &platform, NULL);

/* determine if we failed */
if (err < 0) {
  perror("Unable to identify a platform");
  exit(1);
}

/* try to acquire the GPU device */
err = clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &device, NULL);

/* if no GPU was found, fail-back to the CPU */
if (err == CL_DEVICE_NOT_FOUND) {
  err = clGetDeviceIDs(platform, CL_DEVICE_TYPE_CPU, 1, &device, NULL);
}

/* check that we did acquire a device */
if (err < 0) {
  perror("Unable to acquire any devices");
  exit(1);
}

At this point, we have “platform” which will (hopefully) contain a platform ID identifying our management platform and “device” should either refer to the GPU or CPU (failing to find a GPU). The next step is to create a context and your OpenCL program from source.

/* we'll get to this soon. for the time being, imagine that your
   OpenCL source code is located by the following variable */
char *program_source = ". . .";
size_t program_size = strlen(program_source);

cl_context context;
cl_program program;

/* try to establish a context with the device(s) that we've found */
context = clCreateContext(NULL, 1, &device, NULL, NULL, &err);

/* check that context creation was successful */
if (err < 0) {
  perror("Unable to create a context");
  exit(1);
}

/* using the context, create the OpenCL program from source */
program = clCreateProgramWithSource(context, 1, (const char **)&program_source, &program_size, &err);

/* chech that we could create the program */
if (err < 0) {
  perror("Unable to create the program");
  exit(1);
}

We’ve done just that here, but the program isn’t quite yet ready for execution. Before we can start using this, we need to build the program. The build process is very much a compilation & linking process that involves its own set of log message outputs, etc. You can make this part of your program as elaborate as you’d like. Here’s an example compilation process.

size_t log_size;
char *program_log;

/* build the program */
err = clBuildProgram(program, 0, NULL, NULL, NULL, NULL);

/* determine if the build failed */
if (err < 0) {

  /* the build has failed here, so we'll now ask OpenCL for the build
     information, so that we can display the errors back to the user */
     
  /* pull back the size of the log */
  clGetProgramBuildInfo(program, device, CL_PROGRAM_BUILD_LOG, 0, NULL, &log_size);

  /* allocate a buffer and draw the log into a string */
  program_log = (char *)malloc(log_size + 1);
  program_log[log_size] = '\0';
  
  /* pull back the actual log text */
  clGetProgramBuildInfo(program, device, CL_PROGRAM_BUILD_LOG, log_size + 1, program_log, NULL);
  
  /* send the log out to the console */
  printf("%s\n", program_log);
  
  free(program_log);
  exit(1);
}

We’ve got a platform, device, context and program that’s been built. We now need to shift contexts from the host program to the actual OpenCL code that we’ll execute for the purposes of this example. We need to understand what the inputs, outputs, used resources, etc. of the OpenCL code is before we can continue to write the rest of our host.

The OpenCL Code

The main purpose of OpenCL code is really to operate arithmetically on arrays (or strings) of data. The example that I’m suggesting for the purposes of this article takes in two source arrays and produces another array which are the sum of each index. i.e.

c[0] = a[0] + b[0]
c[1] = a[1] + b[1]
. . .
. . .

As above, the source arrays are a and b. The result array (holding the sum of each source array at each index) is c. Here’s the (rather simple) OpenCL code to achieve this.

__kernel void add_numbers(__global const int* a, 
                          __global const int* b, 
                          __global int* c) {
                          
  // get the array index we're going to work on
  size_t idx = get_global_id(0);
  
  // sum the values
  c[idx] = a[idx] + b[idx];
  
}

That’s all there is to it. Things to note are, any function that is to be called in the OpenCL context is called a “kernel”. Kernel’s must be decorated with the __kernel modifier. In this example, the parameters that are passed in are decorated with the __global modifier. This tells OpenCL that these are objects allocated from the global memory pool. You can read up more about these modifiers here. The final thing to note is the use of get_global_id. It’s what gives us the particular index to process in our array here. The parameter that’s supplies allows you to work with 1, 2 or 3 dimensional arrays. Anything over this, the arrays need to be broken down to use a smaller dimension count.

Back to the Host

Back in context of the host, we’ll create the command queue and kernel objects. The command queue allows us to send commands to OpenCL like reading & writing to buffers or executing kernel code. The following code shows the creation of the command queue and kernel.

cl_command_queue queue;
cl_kernel kernel;

/* create the command queue */
queue = clCreateCommandQueue(context, device, 0, &err);

/* check that we create the queue */
if (err < 0) {
  perror("Unable to create a command queue");
  exit(1);
}

/* create the kernel */
kernel = clCreateKernel(program, "add_numbers", &err);

/* check that we created the kernel */
if (err < 0) {
  perror("Unable to create kernel");
  exit(1);
}

Notice that we mentioned the kernel by name here. A kernel object refers to the function! Now that we have a function to execute (or kernel) we now need to be able to pass data to the function. We also need to be able to read the result once processing has finished. Our first job is allocating buffers that OpenCL will be aware of to handle these arrays.

/* here are our source data arrays. "ARRAY_SIZE" is defined elsewhere to
   give these arrays common bounds */
int a_data[ARRAY_SIZE], b_data[ARRAY_SIZE], c_data[ARRAY_SIZE];

/* these are the handles that OpenCL will refer to these memory chunks */
cl_mem a_buffer, b_buffer, c_buffer;

/* initialize a_data & b_data to integers that we want to add */
for (i = 0; i < ARRAY_SIZE; i ++) {
  a_data[i] = random() % 1000;
  b_data[i] = random() % 1000;
}

/* create the memory buffer objects */
a_buffer = clCreateBuffer(context, CL_MEM_READ_ONLY, ARRAY_SIZE * sizeof(int), NULL, &err);
b_buffer = clCreateBuffer(context, CL_MEM_READ_ONLY, ARRAY_SIZE * sizeof(int), NULL, &err);
c_buffer = clCreateBuffer(context, CL_MEM_READ_WRITE, ARRAY_SIZE * sizeof(int), NULL, &err);

In the above snippet, we’ve defined the source arrays and we’ve also created buffers that will hold the information (for use in our OpenCL code). Now all we need to do is to feed the source arrays into the buffers and supply all of the buffers as arguments to our kernel.

/* fill the source array buffers */
err = clEnqueueWriteBuffer(queue, a_buffer, CL_TRUE, 0, 
  ARRAY_SIZE * sizeof(int), a_data, 0, NULL, NULL);
  
err = clEnqueueWriteBuffer(queue, b_buffer, CL_TRUE, 0,
  ARRAY_SIZE * sizeof(int), b_data, 0, NULL, NULL);
  
/* supply these arguments to the kernel */
err = clSetKernelArg(kernel, 0, sizeof(cl_mem), &a_buffer);
err = clSetKernelArg(kernel, 1, sizeof(cl_mem), &b_buffer);
err = clSetKernelArg(kernel, 2, sizeof(cl_mem), &c_buffer);

Now we invoke OpenCL to do the work. In doing this, we need to supply the invocation with a global size and local size. Global size is used to specify the total number of work items being processed. In our case, this is ARRAY_SIZE. Local size is used as the number of work items in each local group. Local size needs to be a divisor of global size. For simplicity, I’ve set these both to ARRAY_SIZE.

int global_size = ARRAY_SIZE,
    local_size = ARRAY_SIZE;
    
/* send the work request to the queue */
err = clEnqueueNDRangeKernel(queue, kernel, 1, NULL, 
  &global_size, &local_size, 0, NULL, NULL);
 
/* check that we could queue the work */ 
if (err < 0) {
  perror("Unable to queue work");
  exit(1);
}

/* wait for the work to complete */
clFinish(queue);

After all of the work is completed, we really want to take a look at the result. We’ll send another request to the command queue to read that result array back into local storage. From there, we’ll be able to print the results to screen.

/* read out the result array */
err = clEnqueueReadBuffer(queue, c_buffer, CL_TRUE, 0
  ARRAY_SIZE * sizeof(int), c_data, 0, NULL, NULL);
  
/* check that the read was ok */
if (err < 0) {
  perror("Unable to read result array");
  exit(1);
}

/* print out the result */
for (i = 0; i < ARRAY_SIZE; i ++) {
  printf("%d + %d = %d\n", a_data[i], b_data[i], c_data[i]);
}

Fantastic. Everything’s on screen now, we can see the results. All we’ll do from here is clean up our mess and get out.

clReleaseKernel(kernel);
clReleaseMemObject(a_buffer);
clReleaseMemObject(b_buffer);
clReleaseMemObject(c_buffer);
clReleaseCommandQueue(queue);
clReleaseProgram(program);
clReleaseContext(context);

What a marathon! Hopefully you’ve learnt something from this post. It’s a lot to take in to get a “Hello, World” level application up and running.