
Getting istream to work off a byte array

Introduction

The C++ Standard Library provides extensive support for working with streams. These are abstract classes designed to work with data in a stream format. There are comprehensive concrete implementations for working with files and strings; however, I have yet to find one that will take a plain old C array and allow you to treat it as a stream.

In today’s post, I’ll present a small std::istream implementation that consumes these plain old C arrays, so the rest of your APIs can stay uniform and work against stream objects.

A brief explanation

We’ll actually be developing two classes here. We’ll need a class derived from std::istream, which is what we’ll pass around to other parts of our program; internally, this std::istream-derived object will manage a std::basic_streambuf<char> derivative.

Looking at the definition of a std::basic_streambuf we can see the following:

The class basic_streambuf controls input and output to a character sequence.

It would appear that most of the work here has been done for us. basic_streambuf will take care of the I/O from our character sequence, we just need to supply it (the character sequence, that is). I did say byte array in the title of this post, so the actual data type will be uint8_t* as opposed to char*.

Implementation

#include <cstddef>
#include <cstdint>
#include <istream>
#include <streambuf>

class membuf : public std::basic_streambuf<char> {
public:
  membuf(const uint8_t *p, size_t l) {
    // setg(eback, gptr, egptr): expose the whole buffer as the get area
    setg((char*)p, (char*)p, (char*)p + l);
  }
};

Our basic_streambuf implementation works in terms of char, which is as close as the standard character traits let us get to our byte type. You can see that the constructor does a little bit of cast work so that setg operates correctly on the uint8_t pointers.
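
If the C-style casts feel a little loose for your taste, the constructor body can be spelled with explicit casts instead; this is purely a stylistic choice and behaves identically:

char *base = const_cast<char*>(reinterpret_cast<const char*>(p));
setg(base, base, base + l);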

Finally, we just create an istream derivative that uses this membuf object under the covers:

class memstream : public std::istream {
public:
  memstream(const uint8_t *p, size_t l) :
    std::istream(&_buffer),
    _buffer(p, l) {
    // re-seat the stream on our buffer (rdbuf also clears the stream state)
    rdbuf(&_buffer);
  }

private:
  membuf _buffer;
};

We set the internal buffer that memstream will use by making a call to rdbuf. The constructor initialises the stream itself to use our membuf implementation.

In Use

You can now treat your plain old C arrays just like any other input stream. Something simple:

uint8_t buf[] = { 0x00, 0x01, 0x02, 0x03 };
memstream s(buf, sizeof(buf));

char b;

// read() returns the stream itself, so the loop stops cleanly at end-of-stream
while (s.read(&b, 1)) {
  std::cout << "read: " << (int)b << std::endl;
}

That’s all there is to it. From the snippet above, you can pass s around just like any other input stream, because, well, it is just any other input stream.
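
To show that uniformity paying off, here’s a quick (made-up) helper written purely against std::istream, building on the classes defined above; it neither knows nor cares that the bytes are coming from an in-memory array:

#include <cstdint>
#include <iostream>
#include <istream>

// a consumer that only knows about std::istream
static void dump_stream(std::istream &in) {
  char b;
  while (in.read(&b, 1)) {
    std::cout << "read: " << (int)b << std::endl;
  }
}

int main() {
  uint8_t buf[] = { 0x00, 0x01, 0x02, 0x03 };
  memstream s(buf, sizeof(buf));

  dump_stream(s);  // behaves like any other istream

  return 0;
}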

Basic Aeson Usage

Introduction

JSON is a common data interchange format used across the web these days. It’s so popular because it’s easy to work with, it maps directly onto JavaScript (really helping out the web folks), and its structure allows you to specify complex information in a simple, readable format.

In today’s post I’m going to walk through some basic usage of the Haskell library aeson which provides some tools for working with JSON.

Getting started

First of all, you’ll need the aeson library installed locally to work with it. You can install this with cabal:

$ cabal install aeson

While that’s installing, take a look at the examples up in the repo for aeson.

Defining your data structure

The first example, Simple.hs, starts by defining the data structure that we’re expecting to work with:

data Coord = Coord { x :: Double, y :: Double }
             deriving (Show)

This is pretty simple. Just a 2d co-ordinate. The example goes on to define instances of ToJSON and FromJSON to facilitate the serialization and deserialization of this data (respectively).

instance ToJSON Coord where
  toJSON (Coord xV yV) = object [ "x" .= xV,
                                  "y" .= yV ]

instance FromJSON Coord where
  parseJSON (Object v) = Coord <$>
                         v .: "x" <*>
                         v .: "y"
  parseJSON _ = empty

The only really curly bit about this is the use of the (.=) and (.:) operators. (.=) pairs a key with a value when building a JSON object, while (.:) pulls the value at a key back out when parsing one.
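
As a quick sanity check in GHCi (with OverloadedStrings enabled so the string literal can stand in for a ByteString), the two instances drive both directions; note that the key order in the encoded output can vary between aeson versions:

λ> decode "{\"x\":1.5,\"y\":2.5}" :: Maybe Coord
Just (Coord {x = 1.5, y = 2.5})
λ> encode (Coord 1.5 2.5)
"{\"x\":1.5,\"y\":2.5}"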

Simplify

With all of this said, now take a look at Generic.hs. This file makes use of the DeriveGeneric language extension so that GHC can write the ToJSON and FromJSON implementations above for us. Those type class instances now read as follows:

instance FromJSON Coord
instance ToJSON Coord

The Coord type only needs to be augmented slightly to derive Generic.

data Coord = Coord { x :: Double, y :: Double }
             deriving (Show, Generic)

Pretty easy.
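
Putting those pieces together, the generic route only needs the DeriveGeneric pragma and the GHC.Generics import; a minimal module (loosely mirroring Generic.hs) looks something like this:

{-# LANGUAGE DeriveGeneric #-}

import Data.Aeson
import GHC.Generics (Generic)

data Coord = Coord { x :: Double, y :: Double }
             deriving (Show, Generic)

-- both instances fall back to their Generic-derived defaults
instance FromJSON Coord
instance ToJSON Coord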

Reading and writing

Finally, we need to actually poke a string into this thing and pull built objects back out of it. The main function covers this (BL being a qualified import of the lazy ByteString module):

main :: IO ()
main = do
  let req = decode "{\"x\":3.0,\"y\":-1.0}" :: Maybe Coord
  print req
  let reply = Coord 123.4 20
  BL.putStrLn (encode reply)

You can see that we can turn a string into a Coord object with little effort now.

Extending this a little further to read values off disk, we can lean on readFile from Data.ByteString.Lazy (imported qualified as B here):

λ> x <- (eitherDecode <$> B.readFile "coord.json") :: IO (Either String Coord)
λ> x
Right (Coord {x = 3.5, y = -2.2})

eitherDecode gives us either an error message in the Left or the built object in the Right.

That’s it for today.

What is Weak Head Normal Form?

Introduction

Lazy languages provide a way for developers to define expressions without necessarily forcing them to evaluate immediately. Rather than provide an immediate value, these languages will generate a thunk instead.

Show me

If you load up GHCi and bind an expression to a name:

λ> let five = 2 + 3 :: Int

We can check if this expression has been evaluated or not by using sprint. If the expression hasn’t been evaluated, sprint will show us an underscore _. This is how GHCi tells us that an expression is unevaluated.

λ> :sprint five
five = _

five is currently a thunk.

If we do force the expression to evaluate and then re-run this test, sprint tells us a different story:

λ> five
5
λ> :sprint five
five = 5

five has now been evaluated and as such, sprint is telling us the value.

Weak Head Normal Form

With some knowledge of thunks under our belt, we can move onto Weak Head Normal Form or WHNF. If we take our five example back to being unevaluated, and use a mixture of take and cycle to generate a list of five copies of five, we’ll end up with another thunk:

λ> let five = 2 + 3 :: Int
λ> let fiveFives = take 5 $ cycle [five]
λ> :sprint fiveFives
fiveFives = _

If we use seq on this list, fiveFives, it gets evaluated just far enough to expose its first cons cell; the head and the tail of that cell remain thunks.

λ> seq fiveFives []
[]
λ> :sprint fiveFives
fiveFives = _ : _

seq has evaluated fiveFives here, but only as far as its outermost constructor. In fact, Hoogle says the following about seq:

Evaluates its first argument to head normal form, and then returns its second argument as the result.

seq has the following type:

seq :: a -> b -> b

So, seq forced the list to be evaluated but not the components that make up the list. This is known as weak head normal form.
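
For contrast, forcing a computation that needs every element, such as sum, evaluates the list all the way down to its values:

λ> sum fiveFives
25
λ> :sprint fiveFives
fiveFives = [5,5,5,5,5]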

In summary

From a Stack Overflow question on the topic:

An expression in weak head normal form has been evaluated to the outermost data constructor or lambda abstraction (the head).

Mounting your Android phone with mtp

This post is really just a shortcut bookmark to the Arch documentation on the topic. It walks through the steps required to mount your Android phone using FUSE.

Install the package mtpfs if you don’t already have it on your system. After that’s installed, you’ll need to uncomment the line user_allow_other in your /etc/fuse.conf file.

Plug your phone in, then mount your device with:

$ mtpfs -o allow_other /media/your_mountpoint_here

Once you’re done, you can unmount with:

$ fusermount -u /media/your_mountpoint_here

Getting started with OpenMP

Introduction

OpenMP is an API for performing shared memory multiprocessing tasks in a variety of different languages and platforms. OpenMP hides away the complexities of parallel programming so that the developer can focus on writing their applications.

In today’s post, I’ll run through building OpenMP applications, work through some examples, and explain what the various #pragma directives mean.

Building

Building applications against the OpenMP API is relatively simple. Using GCC:

$ gcc -o progname -fopenmp progname.c

Interestingly, the maximum number of threads that the framework will use (at runtime) can be controlled by setting the OMP_NUM_THREADS environment variable. This can be overridden in your programs with a call to omp_set_num_threads.
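
If you’d rather control this from code, a minimal sketch looks like the following; calling omp_set_num_threads before the first parallel region takes precedence over whatever OMP_NUM_THREADS was set to:

#include <omp.h>
#include <stdio.h>

int main(void) {
  omp_set_num_threads(2);  /* subsequent parallel regions use a team of 2 threads */

  #pragma omp parallel
  printf("hello from thread %d\n", omp_get_thread_num());

  return 0;
}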

Hello MP!

The standard “Hello, World” application that you’ll find around the web is as follows:

#include <omp.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]) {
  int th_id, nthreads;

  #pragma omp parallel private(th_id)
  {
    th_id = omp_get_thread_num();
    printf("Hello world from thread %d\n", th_id);

    #pragma omp barrier
    if (th_id == 0) {
      nthreads = omp_get_num_threads();
      printf("There are %d threads\n", nthreads);
    }
  }

  return EXIT_SUCCESS;
}

Immediately, you’ll notice the use of #pragma directives throughout the source. These are used to control the behaviour of the OpenMP API within your application. I’ll go through a few of these below.

#pragma omp parallel private(th_id) starts a parallel region and gives each thread its own private copy of th_id. omp_get_thread_num (as the name suggests) gives you back the current thread’s number. #pragma omp barrier tells OpenMP to synchronize all threads at that point before execution continues.

Some example invocations of this program look as follows:

$ OMP_NUM_THREADS=2 ./hello
Hello world from thread 0
Hello world from thread 1
There are 2 threads

$ OMP_NUM_THREADS=4 ./hello
Hello world from thread 0
Hello world from thread 3
Hello world from thread 2
Hello world from thread 1
There are 4 threads

Moving on, we’ll take a look at a few of the #pragma directives you can use to control OpenMP.

Pragmas

All of the directives in the table below are applied in your source code with a #pragma omp prefix; a short example combining a couple of them follows the table.

Pragma           Description
atomic           Applied to a memory update, forcing OpenMP to perform it atomically
parallel         Parallelizes a segment of code
for              Distributes loop iterations over a group of threads
ordered          Forces a block of code to be executed in sequential order
parallel for     Combines the parallel and for directives
single           Forces a block of code to be run by a single thread
master           Forces a block of code to be run by the master thread
critical         Forces a block of code to be run by only one thread at a time
barrier          Forces execution to wait for all threads to reach this point
flush            Gives all threads a refreshed view of the specified objects in memory
threadprivate    Makes named file-scope, namespace-scope or static block-scope variables private to each thread
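
As a quick sketch of a couple of these directives working together (this isn’t from any official example, just an illustration), parallel for splits the loop iterations across the thread team while critical serializes the update to the shared total:

#include <omp.h>
#include <stdio.h>

int main(void) {
  int total = 0;

  /* distribute the loop iterations across the thread team */
  #pragma omp parallel for
  for (int i = 1; i <= 100; i++) {
    /* only one thread at a time may update the shared total */
    #pragma omp critical
    total += i;
  }

  printf("total = %d\n", total);  /* always 5050 */
  return 0;
}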

References

There is plenty of information around on this particular topic. Take a look at the following links to dig into these topics even further: