Sometimes it’s necessary to run programs (applications or daemons) in the background at different stages of a machine’s uptime. The most common use case is starting something at boot and stopping it at shutdown.
In today’s post, I’ll walk through preparing init scripts for Sys-V style systems.
How it works
Debian’s Sys-V style init system relies on the scripts under /etc/init.d to instruct the operating system on how to start or stop particular programs. A header that sits at the top of these scripts informs the init system under what conditions this script should start and stop.
Here’s an example of this header, taken from /etc/init.d/README:
All init.d scripts are expected to have a LSB style header documenting
dependencies and default runlevel settings. The header look like this
(not all fields are required):
### BEGIN INIT INFO
# Provides: skeleton
# Required-Start: $remote_fs $syslog
# Required-Stop: $remote_fs $syslog
# Should-Start: $portmap
# Should-Stop: $portmap
# X-Start-Before: nis
# X-Stop-After: nis
# Default-Start: 2 3 4 5
# Default-Stop: 0 1 6
# X-Interactive: true
# Short-Description: Example initscript
# Description: This file should be used to construct scripts to be
# placed in /etc/init.d.
### END INIT INFO
The insserv man page will take you further in-depth as to what each of these means for your init script.
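To get a feel for how a tool might read this header, here’s a minimal sketch in shell. The `lsb_field` helper is my own hypothetical name, not something insserv provides; it just pulls one field out of the INIT INFO block with sed.

```shell
# Hypothetical helper: print the value of one LSB header field.
# Usage: lsb_field FILE FIELD
lsb_field() {
  # The first sed keeps only the INIT INFO block; the second strips
  # the "# Field:" prefix and prints what is left.
  sed -n '/^### BEGIN INIT INFO/,/^### END INIT INFO/p' "$1" |
    sed -n "s/^#[[:space:]]*$2:[[:space:]]*//p"
}

# Demonstration against a scratch copy of part of the skeleton header.
demo=$(mktemp)
cat > "$demo" <<'EOF'
#! /bin/sh
### BEGIN INIT INFO
# Provides:          skeleton
# Required-Start:    $remote_fs $syslog
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
### END INIT INFO
EOF

lsb_field "$demo" Default-Start   # prints: 2 3 4 5
lsb_field "$demo" Default-Stop    # prints: 0 1 6
rm -f "$demo"
```

This is only a reading aid. The real dependency ordering is worked out by insserv itself.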
A full example of an init script is given on the Debian Administration website. It doesn’t use the header above, which is fine for trying the script out by hand, although tools such as insserv may warn about the missing LSB tags when the script is installed.
#! /bin/sh
# /etc/init.d/blah
#

# Some things that run always
touch /var/lock/blah

# Carry out specific functions when asked to by the system
case "$1" in
  start)
    echo "Starting script blah "
    echo "Could do more here"
    ;;
  stop)
    echo "Stopping script blah"
    echo "Could do more here"
    ;;
  *)
    echo "Usage: /etc/init.d/blah {start|stop}"
    exit 1
    ;;
esac

exit 0
Put this script into a text file in your home directory and make it executable, and you can test it at the console. I’ve put this script into a file called “blah”.
$ chmod 755 blah
$ ./blah
Usage: /etc/init.d/blah {start|stop}
$ ./blah start
Starting script blah
Could do more here
$ ./blah stop
Stopping script blah
Could do more here
The script implements a “start” and “stop” instruction.
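Many init scripts also accept a “restart” instruction. The script above doesn’t, so treat the following as a sketch of one way to add it, under my own assumptions: the dispatch is wrapped in a hypothetical `blah_ctl` function, and the echo lines stand in for real work.

```shell
# Sketch: the same start/stop dispatch, plus an assumed "restart" action.
blah_ctl() {
  case "$1" in
    start)
      echo "Starting script blah"
      ;;
    stop)
      echo "Stopping script blah"
      ;;
    restart)
      # Restart is simply stop followed by start.
      blah_ctl stop
      blah_ctl start
      ;;
    *)
      echo "Usage: /etc/init.d/blah {start|stop|restart}" >&2
      return 1
      ;;
  esac
}

blah_ctl restart
```

Routing restart through the existing stop and start branches means there’s only one place to change when either of them grows.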
Installation
Now that you’ve developed your script and are happy with its operation (after testing it at the console), you can install it on your system.
The first step is to copy your script (in this case “blah”) up into /etc/init.d/. This makes it available to the init system to use. It won’t actually use it though until you establish some symlinks between the script and the runlevel script sets.
You can check that it’s available to your system using the “service” command.
Once “blah” is in place under /etc/init.d/, it can be started and stopped using the “service” command:
$ sudo service blah start
Starting script blah
Could do more here
$ sudo service blah stop
Stopping script blah
Could do more here
Now we’ll attach this init script to the startup and shutdown of the computer. update-rc.d will help with this process. You can get the script installed with the following command:
$ sudo update-rc.d blah defaults
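To picture what “establishing symlinks” means, here’s an emulation in a scratch directory. It never touches the real /etc. The runlevels are the usual defaults (start in 2 3 4 5, stop in 0 1 6). The sequence number 20 is an assumption for illustration, since systems with dependency-based boot calculate their own numbers.

```shell
# Emulation only: mimic the kind of symlinks update-rc.d sets up,
# inside a scratch directory instead of the real /etc.
root=$(mktemp -d)
mkdir -p "$root/init.d"
: > "$root/init.d/blah"

seq=20   # assumed sequence number, for illustration only

# Start links (S) for the usual default start runlevels.
for level in 2 3 4 5; do
  mkdir -p "$root/rc$level.d"
  ln -s ../init.d/blah "$root/rc$level.d/S${seq}blah"
done

# Kill links (K) for the usual default stop runlevels.
for level in 0 1 6; do
  mkdir -p "$root/rc$level.d"
  ln -s ../init.d/blah "$root/rc$level.d/K${seq}blah"
done

# List the resulting links, one per line.
( cd "$root" && ls -d rc?.d/* )
```

On a real system, listing /etc/rc2.d after running update-rc.d should show a similar S-prefixed link pointing back at ../init.d/blah.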
If you no longer want the script to execute on your system, you can remove it from the script sets with the following command:
$ sudo update-rc.d -f blah remove
After removing it, you’ll still have your script in /etc/init.d/, just in case you want to set it up again.
A few utilities exist to manage builds, dependencies and test runs for Java projects. One that I’ve found quite intuitive (once you wrap your head around the XML structure) is Maven. According to the website, Maven is a “software project management and comprehension tool”.
The main benefit I’ve seen already is how the developer’s work-cycle is managed using the “POM” (project object model). The POM is just an XML file that accompanies your project to describe to Maven what your requirements are to build, test & package your software unit.
An excellent, short post can be found on the Maven website called “Maven in 5 minutes”.
Today’s post will focus on Maven installation and getting a “Hello, world” project running.
Installation
I’m on a Debian-flavored Linux distribution, so you may need to translate slightly between package managers. To get Maven installed, issue the following command at the prompt:
sudo apt-get install maven
Check that everything has installed correctly with the following command:
mvn --version
You should see Maven report its version, along with details of the Java runtime it found.
If you’re seeing that, that’s it. Maven is installed.
First Project
Getting your first application together is pretty easy. A “quick start” approach is to use the quick start templates to generate a project structure like so:
cd ~/Source
mvn archetype:generate -DgroupId=org.temp -DartifactId=hello -DarchetypeArtifactId=maven-archetype-quickstart -DinteractiveMode=false
Maven will then go out and grab all that it needs from the web to get your project set up. It’s now generated a project structure for you (in a directory called “hello”) that looks like this:
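The listing isn’t reproduced here. As a rough guide, and assuming the standard maven-archetype-quickstart layout where the package defaults to the groupId, the files can be predicted from the two IDs. The `quickstart_paths` helper below is a hypothetical name of my own.

```shell
# Hypothetical helper: list the files the quickstart archetype is expected
# to create for a given groupId and artifactId.
quickstart_paths() {
  group_id=$1
  artifact_id=$2
  # Dots in the package name become directory separators.
  pkg_dir=$(printf '%s' "$group_id" | tr '.' '/')
  printf '%s\n' \
    "$artifact_id/pom.xml" \
    "$artifact_id/src/main/java/$pkg_dir/App.java" \
    "$artifact_id/src/test/java/$pkg_dir/AppTest.java"
}

quickstart_paths org.temp hello
```

Comparing this against `find hello -type f` after generation is a quick way to confirm the archetype did what you expected.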
In a previous post, I walked through the very basic operations of getting a Maven project up and running so that you can start writing Java applications using this managed environment.
In today’s post, I’ll walk through the modifications required to your POM to get a MapReduce job running on Hadoop 2.2.0.
If you don’t have Maven installed yet, do that . . . maybe even have a bit of a read up on what it is, how it helps and how you can use it. Of course you’ll also need your Hadoop environment up and running!
Project Setup
The first thing you’ll need to do is create a project structure using Maven in your workspace/source folder. I do this with the following command:
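The command itself is missing from this copy of the post. A plausible reconstruction, inferred from the earlier post and from the directory tree further down, is sketched here. The groupId and artifactId are my assumptions, not the original values. Interactive mode is left at Maven’s default, which is why it asks questions.

```shell
# Assumed coordinates, inferred from the com/test/wordcount directory tree.
GROUP_ID="com.test.wordcount"
ARTIFACT_ID="wordcount"

# Interactive mode stays on (Maven's default), so Maven prompts for the rest.
generate_cmd="mvn archetype:generate -DgroupId=$GROUP_ID -DartifactId=$ARTIFACT_ID -DarchetypeArtifactId=maven-archetype-quickstart"

# Print the command for review rather than running it here.
echo "$generate_cmd"
```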
As it runs, this command will ask you a few questions on the details of your project. For all of the questions, I’ve found selecting the default value was sufficient. So . . . enter enter enter !
Once the process is complete, you’ll have a project folder created for you. In this example, my project folder is “wordcount” (you can probably see where this tutorial is now headed). Changing into this folder and having a look at the directory tree, you should see the following:
~/src/wordcount$ tree
.
├── pom.xml
└── src
    ├── main
    │   └── java
    │       └── com
    │           └── test
    │               └── wordcount
    │                   └── App.java
    └── test
        └── java
            └── com
                └── test
                    └── wordcount
                        └── AppTest.java

11 directories, 3 files
Now it’s time to change the project environment so that it’ll suit our Hadoop application target.
Adjusting the POM for Hadoop
Only a few minor alterations are required here. The first is referencing the Hadoop libraries so that they are available to program against. The second is specifying the type of packaging for the application. The last is changing the language version to something higher than the default.
Open up “pom.xml” in your editor of choice and add the following lines into the “dependencies” node.
This tells the project that we need the “hadoop-client” library (version 2.2.0).
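For reference, the dependency entry for hadoop-client usually takes the shape shown in the scratch file below. The small check around it is a hypothetical helper of my own (`declares_hadoop_client`). It relies on plain grep rather than real XML parsing, so it’s a sanity check and nothing more.

```shell
# Hypothetical check: does a POM declare hadoop-client at a given version?
# Plain grep is used for brevity; it does not parse XML properly.
declares_hadoop_client() {
  grep -qF '<artifactId>hadoop-client</artifactId>' "$1" &&
    grep -qF "<version>$2</version>" "$1"
}

# Scratch fragment showing the usual shape of the dependency entry.
pom=$(mktemp)
cat > "$pom" <<'EOF'
<dependencies>
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>2.2.0</version>
  </dependency>
</dependencies>
EOF

if declares_hadoop_client "$pom" 2.2.0; then
  echo "hadoop-client 2.2.0 is declared"
fi
```

Pointing the same function at your real pom.xml is a quick way to confirm the edit was saved.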
We’re now going to tell Maven to make us an executable JAR. Unfortunately, this is where the post gets slightly ahead of itself: in order to tell Maven that we want an executable JAR, we need to tell it which class holds our “main” function, and we haven’t written any code yet. We will, though!
Create a “build” node and within that node create a “plugins” node and add the following to it:
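The fragment is missing from this copy of the post. What follows is a sketch of what such a “plugins” node commonly looks like, not the original. It assumes maven-jar-plugin sets the manifest’s main class and maven-compiler-plugin raises the language level. The class name com.test.wordcount.WordCount and the 1.7 level are my assumptions.

```shell
# Scratch file holding an assumed "plugins" fragment, not the original one.
plugins=$(mktemp)
cat > "$plugins" <<'EOF'
<plugins>
  <plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-jar-plugin</artifactId>
    <configuration>
      <archive>
        <manifest>
          <!-- assumed main class, matching the package used in this post -->
          <mainClass>com.test.wordcount.WordCount</mainClass>
        </manifest>
      </archive>
    </configuration>
  </plugin>
  <plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-compiler-plugin</artifactId>
    <configuration>
      <!-- assumed language level -->
      <source>1.7</source>
      <target>1.7</target>
    </configuration>
  </plugin>
</plugins>
EOF

# Pull out the configured main class with sed (no real XML parsing).
main_class=$(sed -n 's:.*<mainClass>\(.*\)</mainClass>.*:\1:p' "$plugins")
echo "Main class: $main_class"
```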
That’s all that should be needed now to perform compilation and packaging of our Hadoop application.
The Job
I’ll leave writing Hadoop Jobs to another post, but we still need some code to make sure our project is working (for today).
All I’ve done for today is take the WordCount code from the Hadoop Wiki (http://wiki.apache.org/hadoop/WordCount), change the package name to com.test.wordcount to match my project, and save it as src/main/java/com/test/wordcount/WordCount.java.
I removed the template-provided App.java that was in this folder, and I made one minor patch to the code. Here’s the full listing I used, for reference.
Our project is set up and our code is in place; it’s now time to compile our project.
$ mvn clean install
Lots of downloading of dependencies and a bit of compilation go on . . . If all has gone to plan, you now have a package to run. As usual, you’ll need a text file of words to count. I’ve popped one up on HDFS called “input.txt”.
$ hadoop jar target/wordcount-1.0-SNAPSHOT.jar input.txt wcout
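Notice that no class name is given on that command line, so the JAR’s manifest has to name the main class. If the job fails to launch, the manifest is worth a look. Here’s a small sketch for reading it; `main_class_of` is a hypothetical helper, and the sample manifest stands in for a real build.

```shell
# Hypothetical helper: read manifest text on stdin and print its Main-Class.
main_class_of() {
  # Manifests may use CRLF line endings, so strip carriage returns first.
  tr -d '\r' | sed -n 's/^Main-Class:[[:space:]]*//p'
}

# Against a real build this would be fed from the JAR, for example:
#   unzip -p target/wordcount-1.0-SNAPSHOT.jar META-INF/MANIFEST.MF | main_class_of
# Here a sample manifest stands in for it.
printf 'Manifest-Version: 1.0\r\nMain-Class: com.test.wordcount.WordCount\r\n' | main_class_of
```

An empty result from the real JAR would suggest the “build” section of the POM isn’t being picked up.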