Cogs and Levers A blog full of technical stuff

Packing data with Python

Defining how a sequence of bytes sits in a memory buffer or on disk can be challenging from time to time. Since everything you'll work with at this level is a byte, it makes sense to have an intuitive way to work with this information, independent of the type restrictions that the language would otherwise enforce on us.

In today’s post, I’m going to run through Python’s byte string packing and unpacking using the struct package.

Basics

From the Python documentation:

This module performs conversions between Python values and C structs represented as Python bytes objects. This can be used in handling binary data stored in files or from network connections, among other sources. It uses Format Strings as compact descriptions of the layout of the C structs and the intended conversion to/from Python values.

When working with a byte string in Python, you prefix your literals with b.

>>> b'Hello'
b'Hello'

The ord function call is used to convert a text character into its character code representation.

>>> ord(b'H')
72
>>> ord(b'e')
101
>>> ord(b'l')
108

We can use list to convert a whole string of byte literals into an array.

>>> list(b'Hello')
[72, 101, 108, 108, 111]

The complement to the ord call is chr, which converts the byte value back into a character.
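
A quick round-trip shows the two functions mirroring each other:

```python
# A byte value to its character, and back again
print(chr(72))       # 'H'
print(ord(b'H'))     # 72

# Rebuild the string from its byte values
print(''.join(chr(n) for n in [72, 101, 108, 108, 111]))  # 'Hello'
```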

Packing

Using the struct module, we're offered the pack function call. This function takes a format string and then the data itself. The first parameter defines how the data supplied in the following parameters should be laid out. Let's get started:

>>> import struct

If we pack the string 'Hello' as single bytes:

>>> list(b'Hello')
[72, 101, 108, 108, 111]
>>> struct.pack(b'BBBBB', 72, 101, 108, 108, 111)
b'Hello'

The format string b'BBBBB' tells pack to pack the supplied values into a string of 5 unsigned bytes. If we were to use a lower-case b in our format string, pack would expect the byte values to be signed.

>>> struct.pack(b'bbbbb', 72, 101, 108, 108, 111)
b'Hello'

This only gets interesting once we supply a value that overflows the type:

>>> struct.pack(b'bbbbb', 72, 101, 108, 129, 111)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
struct.error: byte format requires -128 <= number <= 127
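
It's the same byte either way; the format character only decides which range of Python integers is acceptable. A small sketch of the difference:

```python
import struct

# 200 fits in an unsigned byte (0..255) but not a signed one (-128..127)
print(struct.pack(b'B', 200))   # b'\xc8'

# The signed value occupying that same byte is -56 (200 - 256)
print(struct.pack(b'b', -56))   # b'\xc8'

# Packing 200 with the signed format raises struct.error
try:
    struct.pack(b'b', 200)
except struct.error as e:
    print(e)
```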

The following tables have been reproduced from the Python documentation.

Byte order, size and alignment

Character Byte order Size Alignment
@ native native native
= native standard none
< little-endian standard none
> big-endian standard none
! network (= big-endian) standard none
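
These byte order characters prefix the format string. Packing the same 16-bit value both ways makes the difference visible:

```python
import struct

value = 0x0102

print(struct.pack(b'<H', value))  # b'\x02\x01' - little-endian: low byte first
print(struct.pack(b'>H', value))  # b'\x01\x02' - big-endian: high byte first
print(struct.pack(b'!H', value))  # b'\x01\x02' - network order is big-endian
```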

Types

Format C Type Python type Standard size Notes
x pad byte no value    
c char bytes of length 1 1  
b signed char integer 1 (1),(3)
B unsigned char integer 1 (3)
? _Bool bool 1 (1)
h short integer 2 (3)
H unsigned short integer 2 (3)
i int integer 4 (3)
I unsigned int integer 4 (3)
l long integer 4 (3)
L unsigned long integer 4 (3)
q long long integer 8 (2), (3)
Q unsigned long long integer 8 (2), (3)
n ssize_t integer   (4)
N size_t integer   (4)
f float float 4 (5)
d double float 8 (5)
s char[] bytes    
p char[] bytes    
P void * integer   (6)
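
Formats can mix these characters, and struct.calcsize reports how big the resulting buffer will be. Note that native mode (@, the default) inserts alignment padding between fields, while the standard-size modes do not:

```python
import struct

# Standard sizes, no alignment: 1 (b) + 4 (i) = 5 bytes
print(struct.calcsize(b'<bi'))  # 5

# Native mode pads the int to its natural boundary (commonly 8 bytes total)
print(struct.calcsize(b'bi'))

packed = struct.pack(b'<bhi', 1, 2, 3)
print(len(packed))              # 1 + 2 + 4 = 7
```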

Unpacking

The direct reverse of packing values into a byte string is unpacking them again into usable variables inside of your Python code.

>>> struct.unpack(b'BBBBB', struct.pack(b'BBBBB', 72, 101, 108, 108, 111))
(72, 101, 108, 108, 111)
>>> struct.unpack(b'5s', struct.pack(b'BBBBB', 72, 101, 108, 108, 111))
(b'Hello',)
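
If the same format is used repeatedly, the struct.Struct class compiles it once and exposes pack and unpack as methods:

```python
import struct

greeting = struct.Struct(b'5s')   # five bytes, taken as one bytes object

packed = greeting.pack(b'Hello')
print(packed)                     # b'Hello'
print(greeting.unpack(packed))    # (b'Hello',)
print(greeting.size)              # 5
```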

Consistent Hashing

In today’s post, I’m going to run through how consistent hashing differs from standard modulus distribution and how it can help distributed key searches.

From the wikipedia article:

Consistent hashing is a special kind of hashing such that when a hash table is resized, only K/n keys need to be remapped on average, where K is the number of keys, and n is the number of slots. In contrast, in most traditional hash tables, a change in the number of array slots causes nearly all keys to be remapped.

Using a simple modulus

A simple way to balance information between a set of nodes is to take a simple modulus of a hash. The hash of an object is taken; a modulus is calculated with respect to the number of nodes. The information is then assigned to that node:

Number of Nodes = 5
"Joe"           = 348561223
Node            = 348561223 mod 5
                = 3

So, the string "Joe" is then destined to live on node 3.

This sort of balancing gets the information distributed, but starts to really show problems when nodes are added to or removed from the set. These operations require large shuffles of information between nodes.
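
A quick experiment makes this concrete. Hashing a set of keys against 5 nodes and then again against 6, most keys change owner, since a key keeps its node only when its hash is equal modulo both 5 and 6. The choice of MD5 below is just for a stable, repeatable hash; Python's built-in hash() is randomised per process:

```python
import hashlib

def node_for(key, node_count):
    # Stable hash of the key, reduced modulo the node count
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return h % node_count

keys = ['key-%d' % i for i in range(1000)]

before = {k: node_for(k, 5) for k in keys}
after = {k: node_for(k, 6) for k in keys}

moved = sum(1 for k in keys if before[k] != after[k])
print('%d of %d keys moved' % (moved, len(keys)))  # roughly 5 in every 6
```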

How it Works

The aim here is to lower the sensitivity of a piece of information’s hash identity to the number of nodes. This way, we still reap the benefits of being in a distributed system, but we don’t incur such a loss at the time of adding or removing nodes. Minimising this disturbance is what consistent hashing aims to achieve.

To achieve consistent hashing, not only is the key by which the information is retrieved hashed; so are the nodes managing the information. Both of these elements are placed on a hash ring. When this system is exercised and a client looks for information, the key they’re after will land somewhere on the ring. We then move clockwise around the ring until we hit a node; that node is where the information lives.

Adding and removing nodes from the ring buffer is a matter of distribution from a neighbour now, rather than the whole set.

One problem is the distribution of nodes along the ring buffer. If these nodes clump together, there will be a large hash-space empty that a lot of queries could hit. Adding replicas of each node to the hash seems to saturate the ring sufficiently to mitigate this problem.
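
A minimal sketch of such a ring, assuming MD5 as the hash and using bisect for the clockwise search. The vnodes parameter controls how many replicas of each node are placed on the ring; 50 here is an arbitrary illustrative choice:

```python
import bisect
import hashlib

def _hash(key):
    # Stable integer hash of a string key
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class HashRing:
    def __init__(self, nodes, vnodes=50):
        self._ring = []  # sorted list of (position, node) markers
        for node in nodes:
            self.add_node(node, vnodes)

    def add_node(self, node, vnodes=50):
        # Place several replicas of the node around the ring
        for i in range(vnodes):
            bisect.insort(self._ring, (_hash('%s#%d' % (node, i)), node))

    def get_node(self, key):
        # Walk clockwise from the key's position to the next node marker
        positions = [pos for pos, _ in self._ring]
        index = bisect.bisect(positions, _hash(key)) % len(self._ring)
        return self._ring[index][1]

ring = HashRing(['node-a', 'node-b', 'node-c'])
print(ring.get_node('Joe'))
```

Adding a node to this ring only reassigns the keys that now land on the new node's markers; every other key keeps its old owner.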

Implementations

There are many implementations of consistent hashing available. It’s a good exercise to implement one yourself by hand, but these problems have already been solved for you. Some of the better-known uses can be found in projects like OpenStack, Amazon’s Dynamo and Apache Cassandra.

There are much simpler examples to look at, too.

Paxos consensus algorithm

Paxos is a set of protocols designed to solve consensus problems when implementing fault tolerant distributed systems. When many actors in a system can propose values for a system to use, a controlling process (or protocol) needs to do the job of selecting the value and propagating that value’s selection to other actors in the system.

What are we trying to solve?

As in the introduction, distributed systems have many participants, or actors, that can propose values. Consensus then needs to be established on which of these values to actually use. This seems quite trivial in a local sense, but once this problem is distributed amongst separate nodes, the system becomes far more complex. The complexity arises in a distributed setting because messages carrying these values can be lost, delayed or destroyed.

The common issues that you always see in distributed systems really boil down to reliability and consistency. Network communications and computers in general fail, making for a less-than-reliable model. When these things fail, vectors open up that allow inconsistencies to creep into our system’s view of the world.

The whole idea of the consensus algorithm is to select a value when values are on offer. Paxos will only select from values that have been offered; it will only select one of those values, and the selection isn’t announced to other actors unless it’s actually made.

Assumptions

There are a few assumptions about these systems that need to be made clear; this is important, as the protocols are designed around abstract concepts that are modelled on reality. The following lists have been reproduced from the Paxos Wikipedia article.

  • Processors
    • Processors operate at arbitrary speed.
    • Processors may experience failures.
    • Processors with stable storage may re-join the protocol after failures.
    • Processors do not collude, lie, or otherwise attempt to subvert the protocol.
  • Network
    • Processors can send messages to any other processor.
    • Messages are sent asynchronously and may take arbitrarily long to deliver.
    • Messages may be lost, reordered, or duplicated.
    • Messages are delivered without corruption.

Roles

With those assumptions in place, we can introduce the main roles in the Paxos system. It’s important to note at this point that physical computing entities can perform one or many of these roles.

Client

The client is the creation and termination point for the system. It’s a request from a client that sets the Paxos system in motion, and the protocol finishes its work with a response being sent back to the client. The client isn’t a part of the system itself; it’s just the protocol’s main protagonist.

Proposer

The proposer takes on requests from the client. It’s the responsibility of the proposers to get a value agreed upon.

Voter

It’s the job of the voter (often called an acceptor) to accept or reject values from a proposer. It’s not until a quorum (a majority of voters) agrees on a value that a state change happens.

Learner

The learner propagates its information back to the client once it’s notified by the quorum (through a primary acceptor) that a proposal has been accepted.

Workflow

There’s a particular workflow that the Paxos system can follow. There are simple and complex flavors of the workflow, each focusing on different parts of the consensus flow. The following steps outline the most basic flavor of the protocol:

Prepare & Promise

A proposer will send a prepare request to as many acceptors as it can. If an acceptor has not yet seen a proposal with a higher number, it sends back a prepare response: a promise not to accept any proposal with a lower number. Any prepare request received by an acceptor is ignored if its proposal number is lower than the highest it has already seen.

This is key, because if an acceptor now sees a message with a higher number, it’ll process that prepare request.

Accept

When a quorum of acceptors sends prepare responses back to the proposer (and this ends up as the majority, by definition of quorum), the proposer can now start to issue accept requests. If an acceptor is sent an accept request with a number equal to or higher than the highest proposal that it has seen, it will accept that request.

This information is now propagated down to all of the learners for them to see that a particular proposal has been accepted.

Accepted

The system will now choose a value once learners discover that a majority of nodes have accepted the same value.
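
The acceptor-side rules above can be sketched in a few lines. This is only a model of the promise/accept bookkeeping on a single acceptor, not the messaging, the proposer, or the learner:

```python
class Acceptor:
    """Tracks the highest proposal number promised and the last value accepted."""

    def __init__(self):
        self.promised = -1    # highest proposal number promised so far
        self.accepted = None  # (number, value) of the last accepted proposal

    def prepare(self, number):
        # Promise only if this proposal outranks anything seen so far,
        # reporting back any previously accepted value
        if number > self.promised:
            self.promised = number
            return ('promise', self.accepted)
        return ('nack', None)

    def accept(self, number, value):
        # Accept if the number is at least as high as the promise made
        if number >= self.promised:
            self.promised = number
            self.accepted = (number, value)
            return True
        return False

acceptor = Acceptor()
print(acceptor.prepare(1))      # ('promise', None)
print(acceptor.prepare(0))      # ('nack', None) - a lower number is ignored
print(acceptor.accept(1, 'x'))  # True
print(acceptor.accepted)        # (1, 'x')
```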

Implementation

A simple implementation of this protocol can be found in this GitHub repository. It walks through a few of the different message flows in nicely laid out Python code.

PostgreSQL programming environment

Some databases rely heavily on their programming environment to deliver imperative capabilities, where a developer can use high-level programming concepts such as selection, iteration and control flow.

Oracle supports PL/SQL, SQL Server supports T-SQL, and PostgreSQL supports PL/pgSQL.


Anonymous Code Blocks

Executing an anonymous code block is achieved with the DO keyword. It emulates setting up any other function, with no parameters and no return value. Because you have a function body, you are afforded the ability to use a DECLARE section to house variables.

DO $$
BEGIN

END $$;

Adding variables to this block:

DO $$
DECLARE age INT;
BEGIN

  age := 24;

  IF age < 18 THEN
    -- handle the below 18 case
  END IF;
END$$;

Because of the parameter and return value constraints on anonymous code blocks, getting information out can be a little tricky. There are some ways that we can get some information out though.

RAISE NOTICE allows us to present information to the console. This is considerably handy when we’re testing a block of code.

DO $$
DECLARE age INT;
BEGIN

  age := 17;

  IF age < 18 THEN
    RAISE NOTICE 'Underage!';
  END IF;
END$$;

Of course, if your code is a re-usable block you should be creating a full function for it. This takes care of any input/output parameter issues as you take full control.

Looking elsewhere

If this isn’t to your liking, you can try any of the other procedural languages available, such as PL/Tcl, PL/Perl and PL/Python.

Monitoring PostgreSQL Databases

Understanding what your database is doing and when is essential to runtime administration, maintenance, monitoring and reporting. Gaining insight into how your system responds to different workloads can also tell you how your current deployment is or isn’t serving your purpose.

There are many great articles on this particular topic already. In today’s post, I’m going to walk through a couple of the simple things you can do to check your system’s runtime status.

Unix Tools

When your database is being hosted in a unix-like environment, you’re given the greatest tools at your disposal to understand what’s happening.

ps can show you running processes on your system. Paired with grep, you can focus ps to look at postgres processes only:

ps -ef | grep postgres
postgres     1     0  0 11:05 ?        00:00:00 postgres
postgres    17     1  0 11:05 ?        00:00:00 postgres: checkpointer process  
postgres    18     1  0 11:05 ?        00:00:00 postgres: writer process  
postgres    19     1  0 11:05 ?        00:00:00 postgres: wal writer process  
postgres    20     1  0 11:05 ?        00:00:00 postgres: autovacuum launcher process  
postgres    21     1  0 11:05 ?        00:00:00 postgres: stats collector process  

iostat and vmstat will also give you some operating system level insight to how your database application is performing.

Statistics Collector

An important, integrated piece of the Postgres architecture is the statistics collector. Using this, you can query many pieces of information about your system’s performance, down to a very low level.

The following excerpt is just a small sample of all of the views offered by the statistics collector, which are made available to the developer.

View Name Description
pg_stat_activity One row per server process, showing information related to the current activity of that process, such as state and current query. See pg_stat_activity for details.
pg_stat_bgwriter One row only, showing statistics about the background writer process’s activity. See pg_stat_bgwriter for details.
pg_stat_database One row per database, showing database-wide statistics. See pg_stat_database for details.
pg_stat_all_tables One row for each table in the current database, showing statistics about accesses to that specific table. See pg_stat_all_tables for details.
pg_stat_sys_tables Same as pg_stat_all_tables, except that only system tables are shown.
pg_stat_user_tables Same as pg_stat_all_tables, except that only user tables are shown.

The full list can be found in the PostgreSQL documentation.