Cogs and Levers A blog full of technical stuff

Working with OpenSSL

OpenSSL is the open source project that provides the world with SSL and TLS. In today’s post, I’ll walk through some simple tasks to encrypt and decrypt your data.

Features

OpenSSL is a very feature-rich library. It contains many pieces of functionality that you should study in more detail. The man page for it goes into all of these details in great depth.

Encoding information

Perhaps a slightly edge-case piece of functionality, OpenSSL has the ability to Base64 encode your information. It’s nowhere near actually securing your information, but the facility is there.

You can Base64 encode a string with the following command:

$ echo "Hello, world!" | openssl enc -base64
SGVsbG8sIHdvcmxkIQo=

You can bring it back to plain text with the following:

 
$ echo "SGVsbG8sIHdvcmxkIQo=" | openssl enc -base64 -d
Hello, world!

Encrypt with a password

OpenSSL gives you the ability to encrypt a piece of information using a password. This is a simple way of securing your information without certificates, but isn’t a very strong strategy for information security.

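Take a look under the Encoding and Cipher Commands section of the man page for the full range of options here. Where we used the base64 option above, no password was asked for, because Base64 is just an encoding. If we use the bf option instead (which selects the Blowfish cipher), we’re prompted for a password. On OpenSSL 3.x, Blowfish has been moved to the legacy provider, so you may need to add -provider legacy -provider default to these commands.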

$ echo "Hello, world" | openssl enc -bf > password_enc.dat
enter bf-cbc encryption password:
Verifying - enter bf-cbc encryption password:

password_enc.dat contains what would appear to be garbage, but it is our string; just encrypted. To get our plain text back:

$ openssl enc -bf -d -in password_enc.dat 
enter bf-cbc decryption password:
Hello, world

You need to enter the correct password in order to get your plain text back. Pretty simple. This is the process for any of the ciphers mentioned above.
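
The same idea can be sketched in Ruby using its bundled openssl library. This is a sketch under assumptions, not a reproduction of what the enc command does internally: it swaps Blowfish for AES-256-CBC and derives the key with PBKDF2. The helper names (derive_key, encrypt_with_password, decrypt_with_password), the salt size and the iteration count are all my own choices.

```ruby
require 'openssl'

ITERATIONS = 100_000 # assumed PBKDF2 work factor, not an OpenSSL default

# Stretch the password into a key of the length the cipher expects.
def derive_key(password, salt, length)
  OpenSSL::PKCS5.pbkdf2_hmac(password, salt, ITERATIONS, length,
                             OpenSSL::Digest.new('SHA256'))
end

# Returns the salt, IV and ciphertext; all three are needed to decrypt.
def encrypt_with_password(plain, password)
  cipher = OpenSSL::Cipher.new('aes-256-cbc')
  cipher.encrypt
  salt = OpenSSL::Random.random_bytes(16)
  iv = cipher.random_iv
  cipher.key = derive_key(password, salt, cipher.key_len)
  { salt: salt, iv: iv, data: cipher.update(plain) + cipher.final }
end

# Rebuilds the same key from the password and stored salt, then decrypts.
def decrypt_with_password(box, password)
  cipher = OpenSSL::Cipher.new('aes-256-cbc')
  cipher.decrypt
  cipher.iv = box[:iv]
  cipher.key = derive_key(password, box[:salt], cipher.key_len)
  cipher.update(box[:data]) + cipher.final
end

box = encrypt_with_password('Hello, world', 'secret')
puts decrypt_with_password(box, 'secret')
```

The salt and IV are not secret, so they can be stored alongside the ciphertext; only the password needs protecting.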

Encrypt with a key pair

Stepping up the complexity, you can get OpenSSL to encrypt and decrypt your data using public-key cryptography.

First of all, we need to generate a public/private key pair. The following command generates a 4096-bit RSA private key, from which the public key can later be derived.

$ openssl genrsa -out private_key.pem 4096
Generating RSA private key, 4096 bit long modulus
....++
......................................................................................................................................................................................................................++
e is 65537 (0x10001)

Now that the private key has been generated, we extract the public key from it:

$ openssl rsa -pubout -in private_key.pem -out public_key.pem
writing RSA key

You can view all of the details of your keypair with the following command. It’s a pretty verbose information dump, so brace yourself.

$ openssl rsa -text -in private_key.pem

We encrypt the source information with the public key and perform the decryption using the private key.

To encrypt the information:

$ echo "Hello, world" > encrypt.txt
$ openssl rsautl -encrypt -inkey public_key.pem -pubin -in encrypt.txt -out encrypt.dat

To decrypt the information:

$ openssl rsautl -decrypt -inkey private_key.pem -in encrypt.dat -out decrypt.txt
$ cat decrypt.txt 
Hello, world

Working with certificates

You can use OpenSSL to generate a self-signed certificate.

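Generating a self-signed certificate is a fairly simple process. The following will generate a certificate and private key (both in the one file) that’s valid for 1 year. The key won’t be protected by a passphrase. Note that the 1024-bit key requested here is considered weak by current standards, so treat it as a throwaway for testing only.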

$ openssl req -x509 -nodes -days 365 -newkey rsa:1024 -keyout mycert.pem -out mycert.pem

You can shorten the certificate generation process (make it ask fewer questions) by specifying all of the subject details in the generation command:

$ openssl req -x509 -nodes -days 365 -subj '/C=AU/ST=Queensland/L=Brisbane/CN=localhost' -newkey rsa:4096 -keyout mycert2.pem -out mycert2.pem
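
To make it clearer what the req command is assembling, here is a rough Ruby sketch that builds a comparable self-signed certificate with the bundled openssl library. It is an illustration rather than an exact equivalent of the command: the 2048-bit key size (chosen to keep generation quick), the serial number and the variable names are my own assumptions.

```ruby
require 'openssl'

# Key pair for the certificate (2048 bits here purely to keep the example fast)
key = OpenSSL::PKey::RSA.new(2048)

# Same subject as the -subj switch above
name = OpenSSL::X509::Name.parse('/C=AU/ST=Queensland/L=Brisbane/CN=localhost')

now = Time.now
cert = OpenSSL::X509::Certificate.new
cert.version = 2                          # value 2 means X.509 v3
cert.serial = 1                           # arbitrary serial for a test certificate
cert.subject = name
cert.issuer = name                        # issuer == subject: self-signed
cert.public_key = key.public_key
cert.not_before = now
cert.not_after = now + 365 * 24 * 60 * 60 # valid for one year
cert.sign(key, OpenSSL::Digest.new('SHA256'))

puts cert.to_pem
```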

Other functions

You can use OpenSSL to generate some random data for you as well. This is useful in scenarios where your application requires nonce data. The rand switch does this easily:

$ openssl rand -base64 128
tfINhtHHe5LCek2mV0z6OlCcyGUaHD6xM0jQYAXPNVpy0tjoEB4gy7m6f/0Fb4/K
cKyDfZEmpvoc3aYdQuCnH1kfJk1EQR1Gbb3xyW22KOcfjuEot5I+feinilJcDfWY
aJKDyuNUOn9YuZ8aALhP1zhA0knAT5+tKtNxjjNar04=

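Piping the contents of /dev/urandom through OpenSSL’s base64 encoder will produce similar output. On most Unix-like systems both approaches ultimately draw on the operating system’s entropy source, so neither is clearly stronger than the other.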
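
If you need the same kind of nonce from inside an application, a minimal Ruby sketch might look like the following. It uses OpenSSL::Random.random_bytes from Ruby's bundled openssl library; the variable names are my own.

```ruby
require 'openssl'

# 128 random bytes from OpenSSL's generator, much like "openssl rand 128"
raw = OpenSSL::Random.random_bytes(128)

# Base64 without line wrapping (the command line tool wraps its output instead)
nonce = [raw].pack('m0')
puts nonce
```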

Prime testing is an important cryptographic step and can be achieved with the prime switch:

$ openssl prime 3
3 is prime
$ openssl prime 4
4 is not prime
$ openssl prime 5
5 is prime
$ openssl prime 6
6 is not prime
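
The same check is available from Ruby through OpenSSL::BN, the big-number type in the bundled openssl library. A small sketch (the results hash is just my own way of collecting the answers):

```ruby
require 'openssl'

# prime? runs OpenSSL's probabilistic primality test on the number
results = (3..6).map { |n| [n, OpenSSL::BN.new(n).prime?] }.to_h

results.each do |n, prime|
  puts "#{n} #{prime ? 'is prime' : 'is not prime'}"
end
```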

A really practical utility bundled with OpenSSL is the test server, which you can start up to try out the certificates that you generate.

$ openssl s_server -cert mycert.pem -www

This starts an HTTPS server on your machine, listening on port 4433 by default. You can point your web browser to https://server:4433/ to see how a browser responds to your certificate.

You can also use OpenSSL as a client to pull down remote certificates:

$ openssl s_client -connect server:443

Asymmetric encryption with OpenSSL in Ruby

Asymmetric encryption is a category of cryptographic strategies employed to share information between two parties using two separate keys.

In today’s post, I want to show how the encryption flow actually works using some Ruby code.

Decisions

Before we can get started, we need to make some decisions regarding the encryption that we’re going to use. The two assumptions that I’ve made up front are about the key size and digest function. I’ve stored these assumptions in a hash up front:

require 'openssl'

common = {
	:key_length  => 4096,
	:digest_func => OpenSSL::Digest::SHA256.new
}

We’ll use 4096 bit key lengths and SHA-256 as our digest function.

Parties

The first thing to establish is that we have two parties. They both want to send each other messages that no one else can read. They’re defined in our Ruby code as party_a and party_b.

There’s no network separating these parties, so you’ll have to use your imagination.

To create a party, I’ve used the following:

def make_party(conf, name)

	# create a public/private key pair for this party
	pair = OpenSSL::PKey::RSA.new(conf[:key_length])

	# extract the public key from the pair
	pub  = OpenSSL::PKey::RSA.new(pair.public_key.to_der)

	{ :keypair => pair, :pubkey => pub, :name => name }

end

Using the configuration assumptions that we’d declared above, this function will create a key pair, extract the public key and give the party a name (nice for display purposes).

Processing a message

Next up, we’ll prepare a message to send. There’s a little trickery that you need to remember here:

  • :keypair is private and never seen by the other party
  • :pubkey is distributed between the parties

To prove that the message was sent by the originator, the sender generates a signature for the message. This is done by the sender using the sender’s private key and the pre-defined digest function:

# using the sender's private key, generate a signature for the message
signature = from_party[:keypair].sign(conf[:digest_func], message)

Using the recipient’s public key, the sender will encrypt the plain text:

# messages are encrypted (by the sender) using the recipient's public key
encrypted = to_party[:pubkey].public_encrypt(message)

The recipient can now decrypt the message using their private key:

# messages are decrypted (by the recipient) using their private key
decrypted = to_party[:keypair].private_decrypt(encrypted)

Finally, the recipient can verify that the message is actually from the sender by checking the signature:

if from_party[:pubkey].verify(conf[:digest_func], signature, decrypted)
	puts "Verified!"
end

That’s all there is to it.
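
Pulling the fragments above together, a self-contained run might look like the sketch below. It reuses make_party as shown earlier; the exchange helper, the message text and the 2048-bit key length (chosen only so the example runs quickly) are my own assumptions rather than the article's original code.

```ruby
require 'openssl'

conf = {
  :key_length  => 2048, # smaller than the article's 4096, for speed only
  :digest_func => OpenSSL::Digest::SHA256.new
}

def make_party(conf, name)
  # create a public/private key pair, then extract the public half
  pair = OpenSSL::PKey::RSA.new(conf[:key_length])
  pub  = OpenSSL::PKey::RSA.new(pair.public_key.to_der)
  { :keypair => pair, :pubkey => pub, :name => name }
end

# Hypothetical helper: sign, encrypt, decrypt and verify in one pass.
def exchange(conf, from_party, to_party, message)
  signature = from_party[:keypair].sign(conf[:digest_func], message)
  encrypted = to_party[:pubkey].public_encrypt(message)
  decrypted = to_party[:keypair].private_decrypt(encrypted)
  verified  = from_party[:pubkey].verify(conf[:digest_func], signature, decrypted)
  { :encrypted => encrypted, :decrypted => decrypted, :verified => verified }
end

party_a = make_party(conf, 'A')
party_b = make_party(conf, 'B')
message = 'Meet me at noon'

result = exchange(conf, party_a, party_b, message)
puts "#{party_b[:name]} read: #{result[:decrypted]}"
puts "Verified!" if result[:verified]
```

Keep in mind that RSA can only encrypt a payload smaller than the key itself, so this direct approach only suits short messages.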

A full working gist that this article uses code from can be found here.

Using yield to create generators in python

Generators are Python functions that act like iterators. This abstraction allows you to simplify a lot of your for-loop code, implement lazy evaluation and even create more intelligent value-producing iterators.

In today’s post, I’ll go through a basic usage of the yield keyword, the generator that’s created as a result, and how you can interact with this function type.

Prime number example

To produce the generator, I’ve written a function that will filter through numbers picking out prime numbers. The algorithm isn’t highly optimised. It’s quite crude/brute-force in its approach, but it’ll be enough for us to understand the generator function.

import math

def primes():
	# primes found so far, and the next odd candidate to test
	ps, cur = [2], 3
	yield 2
	while True:
		y = int(math.sqrt(cur))
		# look for a known prime, no larger than the square root, that divides cur
		c = next((x for x in ps if x <= y and (cur % x) == 0), None)

		if c is None:
			yield cur
			ps.append(cur)

		cur += 2

We’re maintaining an internal list of primes that we’ve found. When we come across a potential candidate, we try to divide it by primes that we’ve already found. To cut down on the number of divides, we only try primes up to and including the square root of the candidate. Including the square root itself matters: without it, perfect squares such as 9 and 25 would slip through as primes.

Note the use of yield. Each time execution reaches a yield, another value is made available to the iterator and the function pauses until the next value is requested. You can see that this is an iterator that doesn’t end. Python integers are arbitrary precision, so there’s no integer overflow to stop it; in practice we’re only limited by the amount of memory in the machine.

Iterating

So, we’ve created what appears to be an infinite list. Testing it out in the REPL:

>>> ps = primes()
>>> ps
<generator object primes at 0x7fa1396e8af0>
>>> next(ps)
2
>>> next(ps)
3
>>> next(ps)
5
>>> next(ps)
7
>>> next(ps)
11
>>> next(ps)
13

ps is the generator, and we’re able to call the built-in next function on it. As we do that, we progress through the iterator. We can start to work with ps now as if it were any other iterator.

Using a list comprehension, we can find the first 10 primes:

>>> ps = primes()
>>> [next(ps) for _ in range(10)]
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29]

Using itertools we can get all of the prime numbers under 100:

>>> import itertools
>>> list(itertools.takewhile(lambda x: x < 100, primes()))
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97]

yield allows you to write generators, which represent the potential to create values rather than the values themselves. It’s not until you start to iterate over the generator that the values start to materialise.
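
For readers coming from the Ruby posts on this blog, the closest counterpart to a Python generator is an Enumerator. The following is only a comparison sketch, not part of the Python example: it applies the same trial-division idea, and the names primes, found and out are my own.

```ruby
# An Enumerator produces values on demand, much like a Python generator.
primes = Enumerator.new do |out|
  found = [2]
  out << 2
  cur = 3
  loop do
    limit = Integer.sqrt(cur)
    # cur is prime if no known prime up to its square root divides it
    if found.none? { |p| p <= limit && (cur % p).zero? }
      out << cur
      found << cur
    end
    cur += 2
  end
end

first_ten = primes.take(10)
under_100 = primes.take_while { |p| p < 100 }

p first_ten
p under_100
```

As with the Python version, nothing is computed until a caller asks for values; take and take_while simply stop asking once they have enough.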

Exploring blocks in ruby

Ruby has a very cool feature called code blocks. Sometimes referred to as closures, code blocks are custom pieces (or blocks) of Ruby code that you pass to a function; the function then runs your block each time it reaches the yield keyword.

In today’s post, I’m just going to present a simple example and usage.

Source data

The example we’re going to use will be a Manager to Employee style relationship. A Person class is going to manage an array of Person objects that we’ll classify as staff for that person. Here’s the class definition:

class Person

	attr_reader :name, :position, :staff

	def initialize(name, position, staff=[])
		@name = name
		@position = position
		@staff = staff
	end

	def to_s
		@name + ' (' + @position + ') '
	end
end

So a Person will have a name, position and staff. Some sample data using this structure might look as follows:

bob = Person.new('Bob', 'Support Officer')
joe = Person.new('Joe', 'Support Officer')
mary = Person.new('Mary', 'Technology Director', [bob, joe])

Simple example

mary is the manager for joe and bob. Using the Array function each, we can use a code block to present each person to screen:

mary.staff.each do |p|
	puts "Mary is the manager for #{p}"
end

That’s our first code block. each will run our code block for every Person in mary’s staff array.

Another level

If we introduce ‘bill’ as the manager of the company:

bill = Person.new('Bill', 'Managing Director', [mary])

We can use each to look at Bill’s staff, which is just Mary at this stage. More interestingly, we could implement our own function on the Person class that shows all of that person’s descendants.

def descendants

	@staff.each do |s|
		yield s

		# recurse 
		s.descendants do |q|
			yield q
		end

	end

end

We’re going to call any code block specified to our descendants function for each of the staff that are managed by this Person object, but we’re also going to call each descendant’s descendants function so that we recurse down the tree.

We could augment this call slightly to also include the manager of the descendants:

def descendants

	@staff.each do |s|
		yield s

		# recurse 
		s.descendants do |q|
			yield q, s
		end

	end

end

This will supply a manager to the calling block, 1 level down from where we specify.

bill.descendants do |p, s|
	if s == nil
		puts "#{p} is managed by #{bill}"
	else
		puts "#{p} is managed by #{s}"
	end
end

This code here emits the following:

Mary (Technology Director)  is managed by Bill (Managing Director) 
Bob (Support Officer)  is managed by Mary (Technology Director) 
Joe (Support Officer)  is managed by Mary (Technology Director) 

Mary’s manager variable in the block comes through as nil as she’s a direct descendant of bill, so we handle this case in the block as opposed to in descendants.
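
One limitation worth noting: as far as I can tell, with a deeper tree the version above would report a grandchild against the wrong manager, because the inner block only receives the first yielded value and pairs it with s again. A possible variation, sketched below, yields the manager at every level so the calling block never has to handle nil. The Person class here is a trimmed-down stand-in, and ann is a hypothetical extra employee added only to show the deeper level.

```ruby
class Person
  attr_reader :name, :staff

  def initialize(name, staff = [])
    @name = name
    @staff = staff
  end

  # Yields each descendant together with its direct manager.
  def descendants
    @staff.each do |s|
      yield s, self
      # pass both values through unchanged from the deeper levels
      s.descendants { |q, manager| yield q, manager }
    end
  end
end

ann  = Person.new('Ann')          # hypothetical, not in the original data
bob  = Person.new('Bob', [ann])
joe  = Person.new('Joe')
mary = Person.new('Mary', [bob, joe])
bill = Person.new('Bill', [mary])

report = []
bill.descendants do |p, manager|
  report << "#{p.name} is managed by #{manager.name}"
end
puts report
```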

You can specify as many parameters as you want in your yield. It’s your block’s responsibility to do something useful with them!

Other options

Within your function, you can call block_given?, which returns true if the calling code supplied a block and false otherwise.

You can also declare a &block parameter on your function, which captures the block as an object. This can be handy should you need to pass the block around.
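
Both options can be combined in one method. The sketch below uses a trimmed-down Person class of my own: &block lets the recursion hand the caller's block straight down the tree, and block_given? lets the method fall back to returning an enumerator when no block is supplied.

```ruby
class Person
  attr_reader :name, :staff

  def initialize(name, staff = [])
    @name = name
    @staff = staff
  end

  def descendants(&block)
    # no block supplied: hand back an enumerator over the same values
    return to_enum(:descendants) unless block_given?

    @staff.each do |s|
      block.call(s)           # same effect as yield s
      s.descendants(&block)   # pass the caller's block down the tree
    end
  end
end

bob  = Person.new('Bob')
joe  = Person.new('Joe')
mary = Person.new('Mary', [bob, joe])
bill = Person.new('Bill', [mary])

# with a block
names = []
bill.descendants { |p| names << p.name }

# without a block, chaining onto the returned enumerator
enum_names = bill.descendants.map(&:name)

p names
p enum_names
```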

Dissecting dmesg with awk

AWK is a programming language that deals with processing text in a sequence of pattern matching rules. It’s really handy for reducing massive amounts of text into just the information that you care about. The full user guide for AWK can be found here.

Rather than take you on a tour through the user guide, I thought today’s post might be better as a practical example. I’m going to present some useful functions with AWK using the Linux Kernel’s dmesg output as source data.

As a final note, most if not all of what I’ll present below can be squeezed into a “one liner”, and there are plenty of examples of crafty AWK hackers doing just that. Here, I just want to present some of the language.

Source data

The dmesg data is in an easy-enough format to work with. Taking the first few lines as an example:

[    0.000000] Initializing cgroup subsys cpuset
[    0.000000] Initializing cgroup subsys cpu
[    0.000000] Initializing cgroup subsys cpuacct

We see that there is an elapsed time figure surrounded by square brackets; the rest of the line is the log text. Further on through the text, we start to see log lines prefixed with a driver name as well:

[    4.871693] vboxdrv: Found 8 processor cores.
[    4.872033] vboxdrv: fAsync=0 offMin=0x19e offMax=0xcb6

Basic usage

For the purposes of today’s post, the following usage is going to be most useful to us:

dmesg | awk -f our-awk-script.awk

This supplies the dmesg output to our AWK script.

As a first task, we’re going to use a regular expression to pick out each line that contains the word “failed” (surrounded by spaces).

/ failed / {
	print $0
}

Immediately, you can see that AWK statements take the shape of:

condition { actions }

The action here, print $0, prints the whole matched line to the console. Other variables are available to be printed, such as $1, $2, and so on. These numbered variables hold the fields of the matched line, which by default are split on runs of whitespace.

Exploring the variables

Just to take a look at those variables a little closer, we can augment our initial rule slightly to see what’s contained in those variables:

/ failed / {
	print "$0: ", $0
	print "$1: ", $1
	print "$2: ", $2
	print "$3: ", $3
	print "$4: ", $4
}

Run for one line of text matching the “failed” rule:

$0:  [    1.804314] iwlwifi 0000:03:00.0: Direct firmware load failed with error -2
$1:  [
$2:  1.804314]
$3:  iwlwifi
$4:  0000:03:00.0:

Listing out which drivers mentioned the word “failed”

AWK has a very flexible associative array type as well: an array can be indexed by any string or number we choose. For the next progression of this script, we’ll build an array of driver names with an instance count, so we can give the user a report of which drivers were mentioned and how many times.

/ failed / { 
	drivers[$3] = drivers[$3] + 1
}

END {
	for (driver in drivers) {
		print driver ":", drivers[driver]
	}	
}

$3 gives us the driver name, so we just increment a value in the array for that driver. END is something new: its block is executed once, after all of the input has been processed. There we enumerate the array that we’ve built, printing the name of the driver and the count.

Running this, I get the following result:

nouveau: 1
nouveau:: 1
iwlwifi: 2

That’s annoying. nouveau appears in the report twice because it’s mentioned with and without a colon : character in the source text.

[    1.687503] nouveau E[     DRM] failed to create 0x80000080, -22
[    1.687631] nouveau: probe of 0000:01:00.0 failed with error -22

Adding a call to gsub to perform a simple string replacement does the trick. gsub is a part of AWK’s string functions.

/ failed / {
	# strip the colon from the driver name before counting
	gsub(/:/, "", $3)
	drivers[$3] = drivers[$3] + 1
}

With an output like this:

nouveau: 2
iwlwifi: 2

Much better.
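
If the AWK version feels opaque, the same tally can be sketched in Ruby for comparison. This is only an illustration of the logic, not a replacement for the script: the count_failures helper is my own, and the sample lines are the three log lines quoted earlier in this post.

```ruby
SAMPLE = [
  '[    1.687503] nouveau E[     DRM] failed to create 0x80000080, -22',
  '[    1.687631] nouveau: probe of 0000:01:00.0 failed with error -22',
  '[    1.804314] iwlwifi 0000:03:00.0: Direct firmware load failed with error -2'
]

def count_failures(lines)
  counts = Hash.new(0)                 # missing keys start at zero, like AWK arrays
  lines.each do |line|
    next unless line.include?(' failed ')
    fields = line.split                # whitespace split, like AWK's default
    driver = fields[2].delete(':')     # fields[2] is $3 in AWK; drop the colon
    counts[driver] += 1
  end
  counts
end

counts = count_failures(SAMPLE)
counts.each { |driver, n| puts "#{driver}: #{n}" }
```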

Just as we have an ‘END’ section above, we are also given the ability to write code in a ‘BEGIN’ section that will kick off before any of our pattern rules are executed.

Using boolean logic in conditions

AWK conditions aren’t just regular expressions; they can also incorporate boolean logic that tests fields and variables. You can test any variable as you would in a normal boolean condition. In the following example, I don’t want to count failures that come out of the iwlwifi driver.

/ failed / && $3 != "iwlwifi" {
	gsub(/:/, "", $3)
	drivers[$3] = drivers[$3] + 1
}

Other functions to check out

If at any time your rule needs to bail out of the script entirely, call exit (any END rules will still run). If you just want to stop processing the current line of text and move on to the next, use next.