Cogs and Levers A blog full of technical stuff

JDBC

JDBC (Java Database Connectivity) is a general purpose data access library baked into the Java development kit and runtime. This library’s purpose is to lower the complexity of connecting to different database vendors, providing a consistent interface no matter which database you’re connecting to.

In today’s post, I’ll go through the basics of using this library. This blog post will be in context of connecting to a PostgreSQL database.

Drivers

JDBC is based on the premise of drivers. The driver code itself is what fills in the architecture with an implementation that your applications will use. To enumerate all of the drivers currently in the context of your application, you can use the following:

Enumeration<Driver> drivers = DriverManager.getDrivers();

while (drivers.hasMoreElements()) {
  Driver driver = drivers.nextElement();
  System.out.println(driver.getClass());
}

I use the term “in context” because, whilst you may have the required JAR installed on your system (a particular database vendor’s connection library for JDBC), you’ll also need to make sure that it’s available on your class path.

For my example, I only have Postgres available to me:

class org.postgresql.Driver

The driver string that you saw in the section above plays an important role in establishing a connection to your database. Before you can start to work with Connection, Statement and ResultSet objects you first need to load in your vendor’s library implementation.

Class.forName("org.postgresql.Driver");

This reflectively loads your driver class into your application, registering it with the DriverManager ready for use.

Making a connection

To establish a connection with a database, you’ll need to specify a connection string with all of the attributes required to direct your application to the database.

JDBC has a uniform format for specifying its connections with each vendor. Postgres connections are no different.

A connection is established using the DriverManager class like so.

Connection connection = DriverManager.getConnection("jdbc:postgresql://localhost:5432/crumbs?user=postgres&password=password");
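As a sketch of an alternative (using the same example host, database and credentials as above), DriverManager also accepts a java.util.Properties object, so credentials don’t need to be embedded in the URL:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Properties;

public class ConnectExample {
  public static void main(String[] args) {
    // credentials passed separately rather than in the URL's query string
    Properties props = new Properties();
    props.setProperty("user", "postgres");
    props.setProperty("password", "password");

    try {
      Connection connection =
        DriverManager.getConnection("jdbc:postgresql://localhost:5432/crumbs", props);
      System.out.println("connected: " + !connection.isClosed());
      connection.close();
    } catch (SQLException e) {
      // thrown when no driver or database is reachable
      System.out.println("connection failed: " + e.getMessage());
    }
  }
}
```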

Running queries

Running retrievals against your database normally comprises three steps:

  • Preparing a statement to run
  • Executing the statement
  • Enumerating the results

The preparation of the statement is fairly straightforward. The createStatement method on the Connection object will allow you to create an empty statement, whereas prepareStatement will allow you to provide some SQL directly.

// prepare the statement
Statement retrieveStatement = connection.createStatement();

// execute the statement
ResultSet streetTypes = retrieveStatement.executeQuery("SELECT * FROM \"StreetType\"");

// enumerate the result
while (streetTypes.next()) {
  int id = streetTypes.getInt("ID");
  String name = streetTypes.getString("Name");

  System.out.printf("ID: %d, Name: %s%n", id, name);
}

A slightly more complex example where you’d pass in some parameters into your statement might look like this:

PreparedStatement retrieveStatement = 
  connection.prepareStatement("SELECT * FROM \"StreetType\" WHERE \"ID\" > ?");

retrieveStatement.setInt(1, 10);
ResultSet streetTypes = retrieveStatement.executeQuery();

Enumerating a ResultSet object can be achieved with a simple while loop:

while (streetTypes.next()) {
  int id = streetTypes.getInt("ID");
  String name = streetTypes.getString("Name");

  System.out.printf("ID: %d, Name: %s%n", id, name);
}

Cleaning up

Finally, all objects should be cleaned up afterwards using the close() methods provided.

streetTypes.close();
retrieveStatement.close();
connection.close();
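Since Java 7, a try-with-resources block will handle this clean-up for you; Connection, Statement and ResultSet all implement AutoCloseable. A sketch, reusing the example query and connection string from above:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class CleanupExample {
  public static void main(String[] args) {
    String url = "jdbc:postgresql://localhost:5432/crumbs?user=postgres&password=password";

    // each resource is closed automatically, in reverse order, even on error
    try (Connection connection = DriverManager.getConnection(url);
         Statement statement = connection.createStatement();
         ResultSet streetTypes = statement.executeQuery("SELECT * FROM \"StreetType\"")) {
      while (streetTypes.next()) {
        System.out.printf("ID: %d%n", streetTypes.getInt("ID"));
      }
    } catch (SQLException e) {
      // no database reachable; nothing to clean up by hand either way
      System.out.println("connection failed: " + e.getMessage());
    }
  }
}
```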

Other topics

This blog post is just enough to get up and running; there are plenty of more complex topics inside JDBC still to be learned.

H2

H2 is a relational database written entirely in Java. It has an extremely small footprint and has an in-memory mode making it an excellent choice for embedded applications.

In today’s post, I’ll take you through using the H2 shell.

Shell

Once you’ve downloaded H2 from their site, you can get a database created and running using the shell. You can invoke the shell with the following command:

java -cp h2-1.4.190.jar org.h2.tools.Shell -url jdbc:h2:~/testdb

I’m using version 1.4.190 here. The -url command-line option directs us to the database file that we’ll create or open.

Once the shell is running, you’re presented with a sql> prompt and can start creating your table definitions. The documentation on the website is quite extensive, covering the supported SQL grammar, functions and data types.

Further development

Now that you’ve created a database, you can write Java applications using JDBC to run queries against your H2 database.
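Connecting works just like the Postgres example earlier; only the connection string changes. A minimal sketch, assuming the h2 jar is on your class path and pointing at the same ~/testdb file the shell created:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class H2Example {
  public static void main(String[] args) {
    // the same database file the shell created earlier
    String url = "jdbc:h2:~/testdb";

    try (Connection connection = DriverManager.getConnection(url);
         Statement statement = connection.createStatement();
         ResultSet rs = statement.executeQuery("SELECT 1")) {
      while (rs.next()) {
        System.out.println(rs.getInt(1));
      }
    } catch (SQLException e) {
      // thrown if the h2 jar isn't on the class path or the file can't be opened
      System.out.println("connection failed: " + e.getMessage());
    }
  }
}
```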

External modules in Python

You can extend Python relatively easily with the development libraries. Once installed, you can write a module in C, build it and start using it in your Python code.

In today’s post, I’ll create a hello world module and use it from Python.

Environment

In order to get started, you’ll need to prepare your environment with the right tools. It’s also a good idea to create a bit of a project structure.

Create a directory that your code will go into. My source structure looks like this:

.
├── Dockerfile
└── pymod
    ├── README
    ├── setup.py
    ├── src
    │   ├── hello.c
    │   └── hello.h
    └── test.py

My Dockerfile looks as follows; this should describe how to set up your environment:

FROM python:2

RUN apt-get update && \
    apt-get install -y build-essential python-dev && \
    apt-get clean && \
    rm -Rf /tmp/* /var/tmp/*

The module

The module itself really consists of a C header and source file, and a setup.py file to build/install the module.

The header file looks as follows:

#ifndef __hello_h_
#define __hello_h_

#include <Python.h>

static PyObject *hello_say_hello(PyObject *self, PyObject *args);

#endif

Note the Python.h header as well as the PyObject types being used. These are part of the python-dev library that we installed before. This header file then gets implemented pretty simply. Here I’ve cheated by using printf to do the printing for us:

#include "hello.h"

static char module_doc[] = "This is a simple, useless, hello module";
static char say_hello_doc[] = "This function will say hello";

static PyMethodDef module_methods[] = {
  { "say_hello", hello_say_hello, METH_VARARGS, say_hello_doc },
  { NULL, NULL, 0, NULL }
};

PyMODINIT_FUNC init_hello(void) {
    PyObject *m = Py_InitModule3("_hello", module_methods, module_doc);

    if (m == NULL)
        return;
}

static PyObject *hello_say_hello(PyObject *self, PyObject *args) {
  printf("I'm here");
  Py_RETURN_NONE;
}

A brief analysis of this code sees us building a PyMethodDef array. We expose it using Py_InitModule3 from within the initialization function (typed with PyMODINIT_FUNC).

In our actual function itself, we’re printing “I’m here” to the console and then bailing out with Py_RETURN_NONE, which returns Python’s None.

Building

To build our module, we’ll use setup.py. It’ll read as follows:

from distutils.core import setup, Extension

setup(
  ext_modules=[Extension("_hello", ["src/hello.c"])]
)

To invoke the build, we issue the following:

python setup.py build_ext --inplace

Testing it out

Now that our module is built, we can give it a test. It’s easiest to use the Python console to do this:

Python 2.7.10 (default, Sep  9 2015, 20:21:51) 
[GCC 4.9.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import _hello
>>> _hello.say_hello()
I'm here>>> 

Finishing up

I couldn’t pick a simpler example. More complex examples to come, but this is how we establish the bridge between C and Python.

CQRS

CQRS stands for Command Query Responsibility Segregation and is a software pattern focused on separating the code that reads a data model’s state from the code that updates it. Implemented well, this pattern can lead to gains in performance and scalability, and the headroom to support changes to the system down the line.

Separating out your reads and your writes can also give you an increased level of security.

Models

The query model is in charge of all of your retrievals. The whole premise of having a query model is that a query will only read information, never changing anything on the way through, keeping this part of the process pure.

The command model covers all of the items of work that we perform against our model (our update model) that change state.

Tie it all together

In an event sourcing system, update messages are routed through a command layer. Where they join back to the data store that retrievals come out of is an implementation detail.

Having this separation directly at the data layer may incur eventual consistency scenarios: desirable in some settings, unacceptable in others. An event bus manages the marshaling of commands from the user interface through to the data layer, which is also an opportunity to put these commands into an event stream.
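As a toy illustration of the separation (not a full event-sourcing implementation; all class and function names here are invented for the example), the write side only accepts commands and emits events, while the read side is a projection that readers never mutate:

```python
# A toy CQRS sketch: commands mutate state via events, queries only read.
# All names here are invented for illustration.

class CommandModel:
    """The write side: accepts commands, records state changes as events."""
    def __init__(self):
        self.events = []

    def handle_create_user(self, user_id, name):
        # validate, then record the change as an event
        self.events.append(("user_created", user_id, name))

class QueryModel:
    """The read side: a projection built from events, never mutated by readers."""
    def __init__(self):
        self.users = {}

    def apply(self, event):
        kind, user_id, name = event
        if kind == "user_created":
            self.users[user_id] = name

    def get_user(self, user_id):
        return self.users.get(user_id)

# a trivial "event bus": replay the write side's events into the read model
commands = CommandModel()
queries = QueryModel()

commands.handle_create_user(1, "alice")
for event in commands.events:
    queries.apply(event)

print(queries.get_user(1))  # alice
```

In a real system the replay step would be asynchronous, which is exactly where the eventual consistency mentioned above comes from.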

Final notes

This pattern isn’t for every situation. It should be applied the same way as you’d apply any other pattern: with a great measure of study and common sense. Scenarios where you have a very high contention rate for data writers would be a very good fit.

Logging in python

Adding logging to your Python applications is simple with the standard logging module. The best part about having a standard logging module is that every application can send log information in the same way.

In today’s post, I’ll go through a small howto on setting up logging.

Logging levels

Much like any other logging framework, Python’s logging framework expects that you’ll send messages to the logger at a particular level. The levels are as follows:

Level     Description
DEBUG     Debug output, trace
INFO      Informational output on successful events
WARNING   Pre-emptive notification of failures or unexpected events
ERROR     Processing failed
CRITICAL  Processing failed and the application cannot recover

Getting started

import logging

logging.debug('Debug message')
logging.info('Info message')
logging.warning('Warning message')
logging.error('Error message')
logging.critical('Critical message')

The code above ends up emitting the following text:

WARNING:root:Warning message
ERROR:root:Error message
CRITICAL:root:Critical message

You’ll notice a couple of things here: first, we asked for a debug and an info message but never saw either; secondly, we see the format of the messages being written:

LEVEL:NAME:MESSAGE

We’re using the root logger.
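Rather than logging against the root logger, you can ask for a named logger; the name then appears in the output in place of root. A small sketch (the name “myapp” is just an example; inside a module, __name__ is a common choice):

```python
import logging

logging.basicConfig(level=logging.WARNING)

# "myapp" is an arbitrary example name
log = logging.getLogger("myapp")

log.warning("Warning message")  # emitted as WARNING:myapp:Warning message
```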

The default logging level is WARNING. Anything requested below this (like INFO and DEBUG) is not emitted by the logger. We can change this with basicConfig.

logging.basicConfig(level=logging.DEBUG)

Since DEBUG is the lowest level that you can ask for, we should now see all of the messages.

Further configuration

To give your logger a little more context for your application, you can control the formatting parameters. The full set of attributes that you can include in your log lines is specified here.

For this example, we’ll just have the time and the log line.

logging.basicConfig(format='%(asctime)s %(message)s', level=logging.DEBUG)

We’re now presented with the following:

2016-02-06 22:26:27,494 Debug message
2016-02-06 22:26:27,494 Info message
2016-02-06 22:26:27,494 Warning message
2016-02-06 22:26:27,494 Error message
2016-02-06 22:26:27,494 Critical message
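basicConfig can also direct output to a file instead of the console. A sketch (the file name app.log is just an example):

```python
import logging

# write log lines to a file instead of stderr; "app.log" is an example name
logging.basicConfig(
    filename="app.log",
    format="%(asctime)s %(message)s",
    level=logging.DEBUG,
)

logging.debug("Debug message")
```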

There is so much more to look at in this module and you can find it here.