Cogs and Levers A blog full of technical stuff

systemd Services and Sockets

Introduction

systemd is a set of basic tools that any system can use to build more sophisticated service applications. Using these building you can create units which can be a:

  • service
  • socket
  • device
  • mount
  • automount
  • swap
  • target
  • path
  • timer
  • slice
  • scope

In today’s article, we’ll go through an example that uses service and socket to build a simple server.

Hello, World

To start with, let’s create a “Hello, World” service that will do nothing more than take your connection and send you back the string "Hello, world". First we define our service in a file. Ours is hw.service.

You can install this at ~/.config/systemd/user/hw.service.

# hw.service
[Unit]
Description=Hello World Service
After=network.target hw.socket
Requires=hw.socket

[Service]
Type=simple
ExecStart=/usr/bin/python3 %h/tmp/serve.py
TimeoutStopSec=5

[Install]
WantedBy=default.target

The ExecStart holds what systemd will hand the socket connection off to. In this case, we’re going to hand the connection off to a python socket server running from our ~/tmp directory.

You can see that our requires hw.socket. It needs to be up before it will respond to requests. You install this one at ~/.config/systemd/user/hw.socket.

# hw.socket
[Unit]
Description=Hello World Socket
PartOf=hw.service

[Socket]
ListenStream=127.0.0.1:7777

[Install]
WantedBy=sockets.target

Our socket will listen on 7777 waiting for connections.

The serve.py mentioned in the service file is what systemd will hand the connection off to. The implementation of that server is a simple socket server:

#!/usr/bin/env python3

from socketserver import TCPServer, StreamRequestHandler
import socket
import logging

class Handler(StreamRequestHandler):
    def handle(self):
        self.data = self.rfile.readline().strip()
        logging.info("From <%s>: %s" % (self.client_address, self.data))
        self.wfile.write("Hello, world!\r\n".encode("utf-8"))

class Server(TCPServer):
    
    # The constant would be better initialized by a systemd module
    SYSTEMD_FIRST_SOCKET_FD = 3

    def __init__(self, server_address, handler_cls):
        # ignore the bind/listen steps
        TCPServer.__init__(
            self, server_address, handler_cls, bind_and_activate=False
        )
        
        # take the socket from systemd
        self.socket = socket.fromfd(
            self.SYSTEMD_FIRST_SOCKET_FD, self.address_family, self.socket_type
        )

if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)

    # host and port values are ignored here
    server = Server(("localhost", 9999), Handler)
    server.serve_forever()

Inside of the Handler class, in the constructor you can see that we avoid the bind and listen steps. This is because systemd has already done this for us. We’re just going to be handed a file descriptor with the socket already attached.

# ignore the bind/listen steps
TCPServer.__init__(
    self, server_address, handler_cls, bind_and_activate=False
)

# take the socket from systemd
self.socket = socket.fromfd(
    self.SYSTEMD_FIRST_SOCKET_FD, self.address_family, self.socket_type
)

That’s exactly what’s happening with fromfd here. We’re given a socket to work with via descriptor 3.

The actual implementation of our handler is not doing much more than taking in the request data, and sending back "Hello World".

def handle(self):
    self.data = self.rfile.readline().strip()
    logging.info("From <%s>: %s" % (self.client_address, self.data))
    self.wfile.write("Hello, world!\r\n".encode("utf-8"))

Getting it installed

You can start your server listening with the following now:

systemctl --user daemon-reload
systemctl --user start hw.socket

You should be up and running.

Testing

You can use telnet to take a look at your server:

➜  ~ telnet 127.0.0.1 7777
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
Hello
Hello, world!
Connection closed by foreign host.

Alternatively, you can just use netcat:

➜  ~ echo 'test' | netcat 127.0.0.1 7777 
Hello, world!

Check that it’s working

After you’ve tested it a few times, you’ll be albe to see requests in the logs.

journalctl -f --user-unit hw.service

You should see the lines from the logging.info calls.

Nov 05 18:23:15 ditto systemd[2245]: Started Hello World Service.
Nov 05 18:23:18 ditto python3[9883]: INFO:root:From <('127.0.0.1', 38292)>: b'Hello'

Cleaning up

Once you’re done and you’d like to remove these, simply stop the service, remove the units, and reload.

# stop the service
systemctl --user stop hw.socket

# remove the service and socket
rm ~/.config/systemd/user/hw.*

# reload systemd state
systemctl --user daemon-reload 

Find Gems

Introduction

The find utility is an extremely flexible query tool for your file system. It allows you to navigate and traverse your folder structures.

From the man page:

This manual page documents the GNU version of find. GNU find searches the directory tree rooted at each given file name by evaluating the given expression from left to right, according to the rules of precedence (see section OPERATORS), until the outcome is known (the left hand side is false for and operations, true for or), at which point find moves on to the next file name.

In this article, we’ll go through some usages.

Find by name

You can simply find by name with the -name switch:

find /the/directory -name the-file-name.txt

You can also ignore case by switching out -name for -iname.

Find by permissions

If you’d like to find files with a specific permission (in this case 777):

find . -type f -perm 0777 -print

You can take the converse of this with !:

find . -type f ! -perm 0777 -print

Average file size

Let’s say you have a directory of text files, and you’d like to find the average size of them. By joining find with awk, you can use the following to do just that:

find . -type f -exec wc -w {} \; | awk '{numfiles=numfiles+1;total += $1} END{print total/numfiles}'

Random sampling

You may need to take a random sample of files in a given folder. You can start this process off with find, and then through clever usage of sort, and tail you can get a random sample.

This will take 500 random files from a directory:

find /the/path -type f -print0 | sort -zR | head -zn 500

find /the/path -type f -print0 prints out the files using \0 as a delimiter thanks to -print0.

sort is told to use \0 as its delimiter with -z and -R is to sort them randomly.

head now takes the first 500, again using \0 as the delimiter.

Parquet

The Apache Parquet file format is used widely in the data space. It’s a column-oriented format that focuses on storing data as efficiently as possible, with emphasis on data retrieval.

Why?

The most common storage format that you’d use to hold data is CSV. It’s a good, simple format but has quite a number of short-comings when it comes to data analysis.

Parquet is a column-oriented format which is much more sympathetic to data analysis. CSV is row-oriented, which is a much better application for an OLTP scenario.

Parquet offers compression and partitioning that is simply not available to the CSV format.

So, it stores information - but is better suited to the data warehouse idea.

Python

A really easy way to get started using Parquet is with Python. The PyArrow library is a set of utilities that allow you to work with in-memory analytics tools. PyArrow plays nicely with Pandas and NumPy so it’s a good fit.

Make sure you have pyarrow installed as a dependency.

import pandas as pd
import numpy as np
import pyarrow as pa

import pyarrow.parquet as pq

Create a table

First off, we’ll create a DataFrame from some raw data.

data = [
    ["John", "Smith", 23],
    ["Amber", "Brown", 31],
    ["Mark", "Green", 22],
    ["Jane", "Thomas", 26]
]

df = pd.DataFrame(
    data, 
    columns=["first_name", "last_name", "age"]
)

We can then use from_pandas function to create a pyarrow.Table from this DataFrame.

table = pa.Table.from_pandas(df)

Basic I/O

Now that we have loaded a Table, we can save this data using write_table and pick it back up off disk using read_table.

pq.write_table(table, 'people.parquet')

people = pq.read_table('people.parquet')
people.to_pandas()

Finishing up

There’s a lot more benefits to picking a more optimal data storage solution when working in the data analytics space.

Libuv

—uvBuf layout: post title: libuv date: 2023-07-04 comments: false categories: [ “” ] —

libuv is a multi-platform library that provides your programs with asynchronous capabilities through the use of an event loop. node.js has been the most mainstream usage of this library.

Today’s post will talk about this library and show some working examples.

Features

libuv provides a quite a set of features:

  • Event loop
  • Async file and network I/O
  • File system events
  • IPC
  • Thread pool
  • Signal handling
  • High resolution clock

Event loop

When you’re programming in an event-driven environment, you need a medium that can transfer control over to your program when an event occurs. The event loop’s job is to do exactly this, running forever.

If you were to think about it in c-pseudo code, it might look something like this.

while (events_to_process) {
    event = get_next_event();
    
    if (event.callback) {
        event.callback();
    }
}

Watchers

The list of handles that can send us events, and signal our application are here:

/* Handle types. */
typedef struct uv_loop_s uv_loop_t;
typedef struct uv_handle_s uv_handle_t;
typedef struct uv_stream_s uv_stream_t;
typedef struct uv_tcp_s uv_tcp_t;
typedef struct uv_udp_s uv_udp_t;
typedef struct uv_pipe_s uv_pipe_t;
typedef struct uv_tty_s uv_tty_t;
typedef struct uv_poll_s uv_poll_t;
typedef struct uv_timer_s uv_timer_t;
typedef struct uv_prepare_s uv_prepare_t;
typedef struct uv_check_s uv_check_t;
typedef struct uv_idle_s uv_idle_t;
typedef struct uv_async_s uv_async_t;
typedef struct uv_process_s uv_process_t;
typedef struct uv_fs_event_s uv_fs_event_t;
typedef struct uv_fs_poll_s uv_fs_poll_t;
typedef struct uv_signal_s uv_signal_t;

/* Request types. */
typedef struct uv_req_s uv_req_t;
typedef struct uv_getaddrinfo_s uv_getaddrinfo_t;
typedef struct uv_getnameinfo_s uv_getnameinfo_t;
typedef struct uv_shutdown_s uv_shutdown_t;
typedef struct uv_write_s uv_write_t;
typedef struct uv_connect_s uv_connect_t;
typedef struct uv_udp_send_s uv_udp_send_t;
typedef struct uv_fs_s uv_fs_t;
typedef struct uv_work_s uv_work_t;

/* None of the above. */
typedef struct uv_cpu_info_s uv_cpu_info_t;
typedef struct uv_interface_address_s uv_interface_address_t;
typedef struct uv_dirent_s uv_dirent_t;

These are the handles that we can register interest in; so the system will raise interesting events to us.

Get started

Before we get started, the libuv library needs to be installed along with the development files. In order to do this on my Debian machine, I’ll install the development and the runtime files.

libuv1 - asynchronous event notification library - runtime library
libuv1-dev - asynchronous event notification library - development files

Now, when we build an executable we need to link to the uv library using -luv. For the CMake test application that I’m writing with this article, I used:

target_link_libraries(uvtest uv)

Where uvtest is the name of my application.

First program

The “hello, world” of event loops. We’ll allocate the event loop, run the loop, and then cleanup.

int main() {
    /* allocate and init the loop */
    uv_loop_t *loop = malloc(sizeof(uv_loop_t));
    uv_loop_init(loop);

    /* run the loop */
    uv_run(loop, UV_RUN_DEFAULT);

    /* clean up */
    uv_loop_close(loop);
    free(loop);
    
    return 0;
}

Idle

While our program is doing “nothing”, waiting for the next event we can register a function to execute. You’ll notice in this code that we’re using a uv_idle_t rather than a uv_loop_t (as above). Using uv_idle_t provides us access to register an “idler” function.

int64_t counter = 0;

void count_to_10(uv_idle_t* handle) {
    printf("Counter at: %d\n", counter++);

    if (counter > 10) {
        uv_idle_stop(handle);
    }
}

int main() {
    uv_idle_t idler;

    uv_idle_init(uv_default_loop(), &idler);
    uv_idle_start(&idler, count_to_10);

    uv_run(uv_default_loop(), UV_RUN_DEFAULT);

    uv_loop_close(uv_default_loop());
    return 0;
}

The idle function, count_to_10 counts up until we exceed 10 and then calls uv_idle_stop which is our exit.

Finishing up

This has just been an introduction to the absolute basics of libuv.

The Rrrola Constant

When you’re writing demos in 320x200 mode, you can quickly estimate X and Y coordinates from a screen offset with one simple multiply. That value that you use is 0xcccd, and it’s called the Rrrola Constant.

To start this, we need to adjust our video address. Remember, this is an estimate (it’s good enough); but it does need a bit more “help” in the setup.

  push  0xa000 - 10			
  pop   es				


top:
  xor   di, di
  mov   cx, 64000

Right now, es:[di] is pointing to the start of video memory (adjusted).

Now we perform the multiply

pat:
  mov  ax, 0xcccd
  mul  di				

At this point the (x, y) pair is now available to us in (dh, dl). This is really handy for use in your rendering functions.

In this example, we just make a pixel that’s x xor y.

  xor  dh, dl
  mov  al, dh
  stosb
  loop pat

This works because the offset into the video buffer is worked out as (y * 320) + x. Multiplying this formula out by Oxcccd we end up with (y * 0x1000040) + (x * 0xcccd)

The top byte is y * 0x1000000. The next byte along is now (x * 0xcccd / 0x10000) which approximates to (x * 256/320), which is useful to us. The lower two bytes from the product are garbage.

Full example

The following is a .com demo-style example which uses the above technique:

  org  100h

start:
  ; setup 320x200x256
  mov  ax, 0x0013
  int  0x10

  ; adjust screen segment to work
  ; with the Rrrloa trick
  push 0xa000 - 10
  pop  es

top:
  ; start at the beginning of
  ; the video buffer
  xor  di, di
  mov  cx, 64000

pat:
  ; (dh, dl) -> (x, y)
  mov  ax, 0xcccd			 
  mul  di					

  ; col = x ^ y
  xor  dh, dl
  mov  al, dh
  
  ; paint
  stosb
  loop pat

  ; check for esc
  in  al, 60h				
  cmp al, 1
  
  jne top

  ; return to text
  mov  ax, 0x0003			
  int  0x10

  ; return to dos
  mov  ax, 0x4c00			
  int  0x21

Pretty!

XOR Demo