Working with HBase

Apache HBase is a data storage technology that allows random, realtime read/write access to your big data stores. It’s modelled on Google’s Bigtable paper and runs on top of Apache Hadoop. In today’s article, I’ll walk through some very simple usage of this technology.

Installation

First up, we’ll need to get some software installed. From the downloads page, you can grab a release. Once it’s downloaded, unpack it onto your machine. In this instance, we’ll be running HBase in standalone mode.

This is the default mode, and it’s what the “Quick Start - Standalone HBase” section of the HBase documentation describes. In standalone mode, HBase does not use HDFS; it uses the local filesystem instead, and it runs all HBase daemons and a local ZooKeeper in the same JVM. ZooKeeper binds to a well-known port so clients may talk to HBase.

If you need to perform any further configuration, the conf/ folder holds the XML files required. To put your root folders in more sensible places, change the values in conf/hbase-site.xml:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>file:///DIRECTORY/hbase</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/DIRECTORY/zookeeper</value>
  </property>
</configuration>

Start up your server:

$ ./bin/start-hbase.sh 
starting master, logging to /opt/hbase-1.2.3/bin/../logs/hbase--master-0f0ebda04483.out
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/hbase-1.2.3/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop-2.7.0/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
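
To check that the daemon came up, the jps tool that ships with the JDK should show a single HMaster process; in standalone mode the master, region server and ZooKeeper all live inside this one JVM. The process ids below are just an example and will differ on your machine:

$ jps
2456 HMaster
2789 Jps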

Shell!

Now that HBase is running, we can shell into it and have a poke around.

$ ./bin/hbase shell
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/hbase-1.2.3/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop-2.7.0/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 1.2.3, rbd63744624a26dc3350137b564fe746df7a721a4, Mon Aug 29 15:13:42 PDT 2016

hbase(main):001:0> 
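
Before creating anything, the status command gives a quick summary of the cluster. On a standalone instance it should report a single active master and one server; your exact numbers will vary:

hbase(main):001:0> status
1 active master, 0 backup masters, 1 servers, 0 dead, 1.0000 average load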

First up, we’ll create a table called person with a column family of 'name':

hbase(main):002:0> create 'person', 'name'
0 row(s) in 1.5290 seconds

=> Hbase::Table - person
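
If you want to see how the table has been set up, describe will print the column family attributes it was created with (versions kept, bloom filter, compression and so on):

hbase(main):003:0> describe 'person'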

Now we can insert some rows into our table:

hbase(main):004:0> put 'person', 'row1', 'name:first', 'John'
0 row(s) in 0.1430 seconds

hbase(main):005:0> put 'person', 'row2', 'name:first', 'Mary'
0 row(s) in 0.0150 seconds

hbase(main):006:0> put 'person', 'row3', 'name:first', 'Bob'
0 row(s) in 0.0080 seconds

hbase(main):007:0> scan 'person'
ROW                   COLUMN+CELL                                               
 row1                 column=name:first, timestamp=1475030956731, value=John    
 row2                 column=name:first, timestamp=1475030975840, value=Mary    
 row3                 column=name:first, timestamp=1475030988587, value=Bob
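
A column family can hold arbitrary qualifiers, so we’re not restricted to name:first. For example, a last name could sit alongside it in the same family (the name:last qualifier and its value here are purely illustrative):

hbase(main):008:0> put 'person', 'row1', 'name:last', 'Smith'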

Values can also be read out of our table:

hbase(main):009:0> get 'person', 'row1'
COLUMN                CELL                                                      
 name:first           timestamp=1475030956731, value=John                       
1 row(s) in 0.0250 seconds
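
If you only want a single column rather than the whole row, get also accepts a column specifier:

hbase(main):010:0> get 'person', 'row1', 'name:first'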

Now we can clean up our test. A table has to be disabled before it can be dropped:

hbase(main):011:0> disable 'person'
0 row(s) in 2.3000 seconds

hbase(main):012:0> drop 'person'
0 row(s) in 1.2670 seconds
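
To confirm everything is gone, list should no longer show the person table:

hbase(main):013:0> list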

Following up

Now that we’re up and running with HBase, further posts will focus on designing schemas and processing data into the store.