Cogs and Levers A blog full of technical stuff

Purging all of the data within Solr

During the test phases of getting your software set up, you’ll find it useful to completely toast whatever data you’ve already indexed and start fresh. This is as simple as issuing a delete query with the match-all criteria *:*. The full query should translate to:

<delete><query>*:*</query></delete>

As a URL it’ll look like this:

http://[your solr server]:8080/solr/update?stream.body=%3Cdelete%3E%3Cquery%3E*:*%3C/query%3E%3C/delete%3E&commit=true

Note the commit=true parameter at the end of this URL: it performs the delete and commits the result in a single invocation.
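If you'd rather not hand-encode the XML, the URL can be assembled in code. This is only a sketch: buildPurgeUrl is a hypothetical helper name and the localhost address is an assumed example. Note that encodeURIComponent also escapes the ":" and "/" characters, which is equivalent to the hand-written URL above.

```javascript
// Hypothetical helper (not part of Solr): builds the delete-everything URL
// for a given Solr base URL, e.g. "http://localhost:8080/solr".
function buildPurgeUrl(solrBaseUrl) {
  // the match-all delete query
  var body = "<delete><query>*:*</query></delete>";

  // percent-encode the XML so it can travel in the stream.body parameter
  return solrBaseUrl + "/update?stream.body=" + encodeURIComponent(body) + "&commit=true";
}

// assumed example server address
var url = buildPurgeUrl("http://localhost:8080/solr");
console.log(url);
```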

Pixel Access to the Canvas with Javascript

Introduction

Gaining pixel-level access using the HTML canvas opens up some possibilities for some frame-buffer style rasterisation. Today’s post will focus on the code required to get you access to this array. Here’s the code on how to get started:

// create the canvas object
var canvas = document.createElement("canvas");

// maximise the canvas to stretch over the window
canvas.width = window.innerWidth;
canvas.height = window.innerHeight;

// get the 2d drawing context for the canvas
var cxt = canvas.getContext("2d");
// create the image data (a blank pixel buffer the size of the canvas)
var imageData = cxt.createImageData(canvas.width, canvas.height);

// save off the dimensions for later use
var width = canvas.width;
var height = canvas.height;

// get the canvas on the page
document.body.appendChild(canvas);

First of all, we programmatically create our canvas object using document.createElement. Using the inner dimensions of the window, we can then set the canvas’ size. Of course this can be custom set to the dimensions you require - I just like to take over the whole window! Using the canvas object, we then pull out the drawing context with getContext. Next, createImageData gives us a blank pixel buffer of the same size as the canvas to draw to; its data property is the array that we read from and write to. Finally, we’ll take note of the width and height (this will come in handy later) and then pop the canvas onto the page.

Frame-buffer structure

So, I say “frame-buffer” - but it’s just an array (the data property of the image data we created above). It’s quite nicely laid out such that each pixel takes up 4 consecutive elements within the array. The first element is the red component, the second is green, the third is blue and the fourth is alpha, each in the range 0 to 255. Calculating an offset into the array is a piece of cake. For example, take the following piece of code which will allow you to set a single pixel on the frame-buffer.

var setPixel = function(x, y, r, g, b, a, buffer) {
  // calculate the start of the pixel
  var offset = ((y * width) + x) * 4;
  
  // set the components
  buffer[offset] = r;
  buffer[offset + 1] = g;
  buffer[offset + 2] = b;
  buffer[offset + 3] = a;
};

The main part to focus on here is the calculation of the offset. Above, I said it was important to take note of the dimensions - we’re only using the width here. This is a pretty straightforward calculation of an offset into a linear array from Cartesian co-ordinates: skip y full rows of width pixels, move x pixels along the row, then multiply by 4 because each pixel takes up four elements.
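To make the offset arithmetic concrete, here is a small worked example that reads a pixel back out. It's a sketch under a few assumptions: getPixel is a hypothetical helper, and a tiny 3x2 Uint8ClampedArray stands in for the canvas image data so the snippet runs anywhere.

```javascript
// assumed dimensions for the example: a tiny 3x2 buffer
var width = 3;
var height = 2;
var buffer = new Uint8ClampedArray(width * height * 4);

// hypothetical counterpart to setPixel: read the four components back
var getPixel = function(x, y, buffer) {
  // same offset calculation as used when writing
  var offset = ((y * width) + x) * 4;

  return {
    r: buffer[offset],
    g: buffer[offset + 1],
    b: buffer[offset + 2],
    a: buffer[offset + 3]
  };
};

// write an opaque red pixel at (1, 1) by hand:
// offset = ((1 * 3) + 1) * 4 = 16
buffer[16] = 255; // red
buffer[19] = 255; // alpha

var pixel = getPixel(1, 1, buffer);
console.log(pixel);
```

Every other pixel in the buffer stays at zero, which is transparent black.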

Flip out!

Now that we’ve drawn all of the data to the image buffer (frame-buffer), we need a way to get it back onto our canvas. This is simply done using putImageData.

cxt.putImageData(imageData, 0, 0);
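For something to actually flip, the buffer needs some pixels in it. The following sketch fills a buffer with a left-to-right red gradient. fillGradient is a hypothetical helper of my own naming, and a small Uint8ClampedArray is used in place of the canvas image data so that it runs outside of a browser.

```javascript
// hypothetical helper: fill an RGBA buffer with a left-to-right red gradient
function fillGradient(data, width, height) {
  for (var y = 0; y < height; y++) {
    for (var x = 0; x < width; x++) {
      // start of the pixel at (x, y)
      var offset = ((y * width) + x) * 4;

      data[offset] = Math.floor((x / width) * 256); // red ramps up with x
      data[offset + 1] = 0;                         // no green
      data[offset + 2] = 0;                         // no blue
      data[offset + 3] = 255;                       // fully opaque
    }
  }
}

// stand-in for imageData.data: a 4x1 buffer
var data = new Uint8ClampedArray(4 * 1 * 4);
fillGradient(data, 4, 1);

// in the browser you would instead do:
//   fillGradient(imageData.data, width, height);
//   cxt.putImageData(imageData, 0, 0);
```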

That’s it for now.

mpt-statusd detected non-optimal RAID status

After installing Debian within a few VMware virtual machines, I keep getting a rather annoying and persistent message spamming my Unix mailbox as well as /var/log/messages.

mpt-statusd: detected non-optimal RAID status

Simplest solution that I’ve come across is to just . . .

$ sudo apt-get remove mpt-status

Yup! That’s it.

Complementing MongoDB with Full Text Search from Solr

Introduction

MongoDB is a great database, but one area that I’ve noticed it’s been deficient in is full text search. Thankfully, there are some great tools around that we can employ to complement Mongo and give it this functionality. Today’s post will be a walk through of getting Solr & mongo-connector installed and configured on Debian Wheezy.

Get the software

First up, install Solr on tomcat with the tomcat administration tools

$ sudo apt-get install solr-tomcat tomcat6-admin

Straight after this has installed, you’ll need to configure a user to access these applications. Use your favorite text editor and open /etc/tomcat6/tomcat-users.xml. This file (like all of the configuration files) is really well commented. The steps I took here were:

  • Added a “role” node for “manager-gui”
  • Added “manager-gui” as a role to the “tomcat” user

In the end, you should have something sort-of like this:

<role rolename="tomcat"/>
<role rolename="manager-gui"/>
<user username="tomcat" password="tomcat" roles="tomcat,manager-gui"/>

Now that you’ve finished configuring all of the user access, restart tomcat.

$ sudo service tomcat6 restart

You can now check that tomcat is up and running by pointing your web browser at http://localhost:8080/. When you click on the manager-app link, you’ll be prompted for a username and password. As defined by the user configuration above, the username is “tomcat” and the password is “tomcat”. Have a click around; you should see Solr installed in there as well.

Solr Schema

Now it’s time we tell Solr exactly what we want to index. Remember, it’s going to be a client to our mongo database - so any interesting fields that you want indexed will need to be mentioned here. Solr’s schema file is found at /etc/solr/conf/schema.xml. Everyone’s requirements are way too broad for me to go into depth here on what to do, but it would be a good time to look up the documentation and learn about how you want your data attributed: http://wiki.apache.org/solr/SchemaXml.

Connecting to Mongo

Next, we’re going to connect Solr to mongo using mongo-connector. There’s some more software that needs to be installed here. mongo-connector is a python package that listens to mongo’s oplog for “interesting” things, and then stores them away into Solr for fast searching later. We will need pip and some xml dependencies that mongo-connector relies on - then we can install the connector.

 
$ sudo apt-get install python-dev python-pip
$ sudo apt-get install libxml2 libxml2-dev libxslt-dev
$ sudo pip install lxml cssselect
$ sudo pip install mongo-connector

Running the connector

Now that you’re all installed, it’s time to start indexing some data. Again, everyone’s requirements are going to be quite different - so it’s a good time to go out and take a look at the mongo-connector github page to understand the full usage of the command. A typical execution of the command would look like this:

$ mongo-connector -m localhost:27217 -t http://localhost:8080/solr -o oplog_progress.txt -n alpha.foo,test.test -u _id -k auth.txt -a admin -d ./doc_managers/solr_doc_manager.py

From here, mongo-connector listens to changes and stores them away in Solr so that your full text search facility has them available.
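Once documents are flowing into Solr, a search is just an HTTP request against its select handler. Here is a rough sketch of building such a URL. buildSearchUrl is a hypothetical helper, and the “title” field and “canvas” search term are assumptions for illustration - use whatever fields you declared in your own schema.

```javascript
// hypothetical helper: builds a Solr select URL for a simple field:term query
// note: the term is only URL-encoded here; Solr query-syntax characters
// (quotes, colons, brackets and so on) are not escaped
function buildSearchUrl(solrBaseUrl, field, term) {
  var query = field + ":" + term;

  // wt=json asks Solr to respond with JSON rather than XML
  return solrBaseUrl + "/select?q=" + encodeURIComponent(query) + "&wt=json";
}

// assumed field name and search term
var searchUrl = buildSearchUrl("http://localhost:8080/solr", "title", "canvas");
console.log(searchUrl);
```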

That’s it for Solr & MongoDB.

Developer Documentation on Linux

Here’s a quick shortcut to getting all of the developer documentation you’ll need installed on a Linux (Debian) machine.

$ sudo apt-get install manpages manpages-dev manpages-posix manpages-posix-dev gcc-doc