Cogs and Levers A blog full of technical stuff

Consistent Hashing

In today’s post, I’m going to run through how consistent hashing differs from standard modulus distribution and how it can help distributed key searches.

From the wikipedia article:

Consistent hashing is a special kind of hashing such that when a hash table is resized, only K/n keys need to be remapped on average, where K is the number of keys, and n is the number of slots. In contrast, in most traditional hash tables, a change in the number of array slots causes nearly all keys to be remapped.

Using a simple modulus

A simple way to balance information across a set of nodes is to take the modulus of a hash. The hash of an object is taken, and a modulus is calculated with respect to the number of nodes. The information is then assigned to that node:

Number of Nodes = 5
"Joe"           = 348561223
Node            = 348561223 mod 5
                = 3

So, the string "Joe" is then destined to live on node 3.

This sort of balancing gets the information distributed, but it really starts to show problems when nodes are added to or removed from the set. Under these operations, large shuffles of information between nodes are required.
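To make that concrete, here’s a quick Python sketch (my own, using MD5 purely for a stable, deterministic hash) counting how many keys change nodes when a sixth node joins a five-node cluster under modulus placement:

```python
import hashlib

def node_for(key: str, num_nodes: int) -> int:
    # hash the key, then take a modulus with respect to the node count
    digest = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return digest % num_nodes

keys = [f"user-{i}" for i in range(1000)]

before = {k: node_for(k, 5) for k in keys}   # 5-node cluster
after = {k: node_for(k, 6) for k in keys}    # a 6th node joins

moved = sum(1 for k in keys if before[k] != after[k])
print(f"{moved} of {len(keys)} keys had to move")
```

Roughly five-sixths of the keys land on a different node after the change, which is exactly the reshuffling that consistent hashing sets out to avoid.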

How it Works

The aim here is to lower the sensitivity of a piece of information’s hash identity to the set of available nodes. This way, we still reap the benefits of being in a distributed system, but we don’t incur such a loss at the time of adding or removing nodes. Minimising this disturbance is what consistent hashing aims to solve.

To achieve consistent hashing, not only is the key by which the information is retrieved hashed; so are the nodes managing the information. Both of these elements are placed onto a ring. When this system gets exercised and a client looks for information, the hash of the key that they’re looking for lands somewhere on the ring. We then move clockwise through the ring until we hit a node; that node is where the information lives.

Adding or removing a node is now a matter of redistributing keys from a neighbour, rather than across the whole set.

One problem is the distribution of nodes along the ring. If the nodes clump together, a large stretch of the hash space is left empty, and every query landing in it will hit the same node. Adding replicas (virtual nodes) of each node to the ring saturates it sufficiently to mitigate this problem.
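A minimal sketch of such a ring in Python (the class, names and replica count are my own, not taken from any particular library) might look like:

```python
import bisect
import hashlib

def ring_hash(value: str) -> int:
    # MD5 used only to get a stable, well-spread hash value
    return int(hashlib.md5(value.encode()).hexdigest(), 16)

class HashRing:
    def __init__(self, nodes=(), replicas=100):
        self.replicas = replicas  # virtual nodes per physical node
        self.ring = {}            # point on the ring -> node name
        self.points = []          # sorted list of occupied ring points
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        # each replica lands at a different point, spreading the node out
        for i in range(self.replicas):
            point = ring_hash(f"{node}:{i}")
            self.ring[point] = node
            bisect.insort(self.points, point)

    def node_for(self, key):
        # find where the key lands, then walk clockwise to the next node
        point = ring_hash(key)
        idx = bisect.bisect(self.points, point) % len(self.points)
        return self.ring[self.points[idx]]
```

Mapping a thousand keys over five nodes and then adding a sixth relocates only a small fraction of the keys, in contrast to the modulus scheme above.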

Implementations

There are many implementations of consistent hashing available. It’s a good exercise to implement one yourself by hand, but these problems have already been solved for you. Some of the better known uses can be found in projects like OpenStack, Amazon’s Dynamo and Apache Cassandra.

There are also much simpler example implementations worth studying.

Paxos consensus algorithm

Paxos is a family of protocols designed to solve consensus problems when implementing fault-tolerant distributed systems. When many actors in a system can propose values, a controlling process (or protocol) needs to do the job of selecting a value and propagating that selection to the other actors in the system.

What are we trying to solve?

As mentioned in the introduction, distributed systems have many participants, or actors, that can propose values. Consensus then needs to be established on which of these values to actually use. This seems quite trivial in a local sense, but once the problem is distributed amongst separate nodes, the system becomes considerably more complex. The complexity arises because messages carrying these values can be lost, delayed or destroyed.

The common issues that you see in distributed systems really boil down to reliability and consistency. Network communications and computers in general fail, making for a less-than-reliable model. When these things fail, vectors open up that allow inconsistencies to creep into our system’s view of the world.

The whole idea of the consensus algorithm is to select a value from those on offer. Paxos will only select from values that have been proposed, it will only select one of those values, and the selection isn’t announced to other actors unless it’s actually been made.

Assumptions

There are a few assumptions about the system that need to be made clear; this is important, as the protocols are designed around abstract concepts that are modelled on reality. The following lists have been reproduced from the Paxos Wikipedia article.

  • Processors
    • Processors operate at arbitrary speed.
    • Processors may experience failures.
    • Processors with stable storage may re-join the protocol after failures.
    • Processors do not collude, lie, or otherwise attempt to subvert the protocol.
  • Network
    • Processors can send messages to any other processor.
    • Messages are sent asynchronously and may take arbitrarily long to deliver.
    • Messages may be lost, reordered, or duplicated.
    • Messages are delivered without corruption.

Roles

With those assumptions in place, we can introduce the main roles in the Paxos system. It’s important to note at this point that physical computing entities can perform one or many of these roles.

Client

The client is the creation and termination point for the system. It’s a request from a client that sets the Paxos system in motion, and the protocol finishes its work with a response being sent back to the client. The client isn’t a part of the system itself; it’s just the protocol’s main protagonist.

Proposer

The proposer’s job is to take on requests from the client. It’s the responsibility of the proposers to get a value agreed upon.

Voter

It’s the job of the voter (or acceptor) to accept values from a proposer. It’s not until a quorum (a majority of voters) agrees on a value that a state change happens.

Learner

The learner propagates its information back to the client once it’s notified by the quorum (through a primary acceptor) that a proposal has been accepted.

Workflow

There’s a particular workflow that the Paxos system follows. There are simple and complex flavors of the workflow, each focusing on different parts of the consensus flow. The following steps outline the most basic flavor of the protocol:

Prepare & Promise

A proposer will send a prepare request, identified by a proposal number, to as many acceptors as it can. If an acceptor has not yet seen a higher-numbered proposal, it sends back a prepare response which promises to ignore any lower-numbered proposal. Any prepare request subsequently received by that acceptor is ignored if its proposal number is lower than the one already promised.

This is key, because if an acceptor now sees a message with a higher number, it’ll process that prepare request.

Accept

When a quorum of acceptors sends prepare responses back to the proposer (and this is a majority, by definition of quorum), the proposer can start to issue accept requests. If an acceptor receives an accept request with a number equal to or above the highest proposal that it has seen, it will accept that request.

This information is now propagated down to all of the learners for them to see that a particular proposal has been accepted.

Accepted

The system will now choose a value once learners discover that a majority of nodes have accepted the same value.
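The prepare/promise and accept steps above can be sketched in Python. This is a toy, single-round model with in-process calls standing in for messages (all names are my own, and a real implementation must also handle lost, delayed and reordered messages):

```python
class Acceptor:
    def __init__(self):
        self.promised = -1    # highest proposal number promised so far
        self.accepted = None  # (number, value) last accepted, if any

    def prepare(self, n):
        # Prepare & Promise: promise to ignore proposals numbered below n
        if n > self.promised:
            self.promised = n
            return ("promise", self.accepted)
        return ("nack", None)

    def accept(self, n, value):
        # Accept: only if no higher-numbered promise has been made
        if n >= self.promised:
            self.promised = n
            self.accepted = (n, value)
            return True
        return False

def propose(acceptors, n, value):
    # Phase 1: gather promises, hoping for a quorum
    promises = [a.prepare(n) for a in acceptors]
    granted = [accepted for verdict, accepted in promises if verdict == "promise"]
    if len(granted) <= len(acceptors) // 2:
        return None  # no quorum of promises; the proposal dies here
    # if any acceptor already accepted a value, we must carry it forward
    prior = [accepted for accepted in granted if accepted is not None]
    if prior:
        value = max(prior)[1]  # value from the highest-numbered acceptance
    # Phase 2: issue accept requests; the value is chosen on a majority
    votes = sum(1 for a in acceptors if a.accept(n, value))
    return value if votes > len(acceptors) // 2 else None
```

Note how a later, higher-numbered proposer is forced to re-propose any value that was already accepted; this is what keeps a chosen value stable across competing proposals.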

Implementation

A simple implementation of this protocol can be found in this GitHub repository. It walks through a few of the different message flows in nicely laid out Python code.

PostgreSQL programming environment

Some databases rely heavily on their programming environment to deliver an imperative setting where a developer can use high-level programming concepts such as selection, iteration and control flow.

Oracle supports PL/SQL, SQL Server supports T-SQL, and PostgreSQL supports PL/pgSQL.


Anonymous Code Blocks

Executing an anonymous code block is achieved with the DO keyword. It behaves like setting up any other function, with no parameters and no return value. Because you have a function body, you are afforded the ability to use a DECLARE section to house variables.

DO $$
BEGIN

END $$;

Adding variables to this block:

DO $$
DECLARE age INT;
BEGIN

  age := 24;

  IF age < 18 THEN
    -- handle the below 18 case
  END IF;
END$$;

Because of the parameter and return value constraints on anonymous code blocks, getting information out can be a little tricky. There are some ways that we can get some information out though.

RAISE NOTICE allows us to present information to the console. This is considerably handy when we’re testing a block of code.

DO $$
DECLARE age INT;
BEGIN

  age := 17;

  IF age < 18 THEN
    RAISE NOTICE 'Underage!';
  END IF;
END$$;

Of course, if your code is a reusable block, you should be creating a full function for it. This takes care of any input/output parameter issues, as you take full control.

Looking elsewhere

If this isn’t to your liking, you can try any of the other procedural languages available, such as PL/Perl, PL/Python and PL/Tcl.

Monitoring PostgreSQL Databases

Understanding what your database is doing and when is essential to runtime administration, maintenance, monitoring and reporting. Gaining insight into how your system responds to different workloads can also tell you how your current deployment is or isn’t serving your purpose.

There are many great articles on this particular topic already. In today’s post, I’m going to walk through a couple of the simple things you can do to check your system’s runtime status.

Unix Tools

When your database is hosted in a unix-like environment, you have great tools at your disposal to understand what’s happening.

ps can show you running processes on your system. Paired with grep, you can focus ps to look at postgres processes only:

ps -ef | grep postgres
postgres     1     0  0 11:05 ?        00:00:00 postgres
postgres    17     1  0 11:05 ?        00:00:00 postgres: checkpointer process  
postgres    18     1  0 11:05 ?        00:00:00 postgres: writer process  
postgres    19     1  0 11:05 ?        00:00:00 postgres: wal writer process  
postgres    20     1  0 11:05 ?        00:00:00 postgres: autovacuum launcher process  
postgres    21     1  0 11:05 ?        00:00:00 postgres: stats collector process  

iostat and vmstat will also give you some operating system level insight to how your database application is performing.

Statistics Collector

An important, integrated piece of the Postgres architecture is the statistics collector. Using it, you can query many low-level pieces of information about your system’s performance.

The following excerpt is just a small sample of all of the views offered by the statistics collector and made available to the developer.

  • pg_stat_activity: One row per server process, showing information related to the current activity of that process, such as state and current query.
  • pg_stat_bgwriter: One row only, showing statistics about the background writer process’s activity.
  • pg_stat_database: One row per database, showing database-wide statistics.
  • pg_stat_all_tables: One row for each table in the current database, showing statistics about accesses to that specific table.
  • pg_stat_sys_tables: Same as pg_stat_all_tables, except that only system tables are shown.
  • pg_stat_user_tables: Same as pg_stat_all_tables, except that only user tables are shown.

The full list can be found in the PostgreSQL documentation.

Scala type construction

Scala gives the developer flexibility when reasoning about and designing type systems for applications. By using classes and traits, a developer can quickly build a complex hierarchy that can assist in describing constraint and relationship information.

In today’s post, I’m going to walk through a useless but demonstrative example of a type hierarchy and some of the constraint features available to the developer.

Vehicles

We’re going to model some different vehicles. Cars, planes, trucks, skateboards, whatever.

abstract class Vehicle

We could start case-classing this base out, or directly adding derivatives that specialise down to the exact vehicle types that we want, but first we’re going to reason about some attributes that these vehicles might have: wheels and jets.

trait HasWheels extends Vehicle {
  def numberOfWheels: Int
}

trait HasJets extends Vehicle {
  def numberOfJets: Int
}

When a vehicle HasWheels, the type is going to require us to specify numberOfWheels. Likewise numberOfJets for HasJets. These traits are extending our abstract Vehicle class.

When we have wheels, we should be able to set how fast they’re spinning.

trait WheelPropulsion {
  this: HasWheels =>

  def setWheelSpin(velocity: Double) {
    println("Wheel spin: " + velocity)
  }
}

Our WheelPropulsion trait’s self-type says that this needs to be HasWheels. Makes sense: we can’t spin wheels if we don’t have wheels.

Likewise, we’d want to set the turbine intensity if we have jets.

trait JetPropulsion {
  this: HasJets =>

  def setTurbine(strength: Double) {
    println("Setting turbine strength to " + strength)
  }
}

Even with this very basic level of type description we can start to make some basic vehicles.

class Motorcycle extends Vehicle with HasWheels
                                 with WheelPropulsion {
  def numberOfWheels = 2
}

class MotorVehicle extends Vehicle with HasWheels
                                   with WheelPropulsion {
  def numberOfWheels = 4
}

class TwinJetPlane extends Vehicle with HasJets
                                   with JetPropulsion {
  def numberOfJets = 2
}

Mixing in

When you’re constructing an instance of your own, there’s no reason you can’t mix traits in on the spot:

val toyota = new Vehicle() with HasWheels 
                           with WheelPropulsion { 
  def numberOfWheels = 4
}

Of course, we could have just constructed toyota as a MotorVehicle for the same effect. This just demonstrates the instance construction flexibility.

Constraints

Finally, when you’re writing functions that work with your types you can specify rich constraint rules so that you can target functionality with as much precision as you require:

// everything can be painted
def paint(v: Vehicle) = { }

// only a vehicle with wheels can burnout
def doBurnout(v: Vehicle with HasWheels) = { }

As you can see, the with keyword is not only used to define your types; it’s also used in variable construction and in function signature definitions.