Cogs and Levers A blog full of technical stuff

Binary recon techniques

Introduction

In the world of cybersecurity, understanding how binaries operate and interact with a system is critical for both defense and troubleshooting. Whether you’re analyzing potential malware, reverse engineering an application, or simply trying to understand how a piece of software works, performing a thorough system reconnaissance is a vital skill.

Linux offers a wide range of powerful tools that allow you to investigate and inspect binaries, revealing everything from their internal structure to their external behavior. These tools help in uncovering useful information such as file type, embedded strings, dependencies, system calls, and much more. By using them effectively, you can gain valuable insights into how a binary interacts with the system and what potential risks or vulnerabilities it might pose.

In this article, we’ll walk through a collection of essential Linux tools for binary analysis, starting with the basics and working toward more advanced techniques. Whether you’re a seasoned engineer or just getting started with system-level investigation, these tools will arm you with the knowledge to perform comprehensive binary analysis in a Linux environment.

Shell Tools

In this section, I’ll cover a variety of shell-based tools designed to provide insights into files you encounter. These tools help you gather critical intel, allowing you to better understand and reason about the file’s nature and potential behavior.

file

The file command will attempt to classify the file that you’ve given it. The classification that comes back can give you valuable information about the contents of the file itself.

The man page says the following:

file tests each argument in an attempt to classify it. There are three sets of tests, performed in this order: filesystem tests, magic tests, and language tests. The first test that succeeds causes the file type to be printed.

Usage

$ file /bin/sh
/bin/sh: symbolic link to bash
$ file README.md
README.md: ASCII text, with no line terminators
$ file gouraud.png
gouraud.png: PNG image data, 320 x 157, 8-bit/color RGB, non-interlaced

Immediately, you’re given feedback classifying the content of the file.

strings

The strings command is useful when you want to extract readable text from a binary file. This can be particularly helpful for identifying embedded text, error messages, function names, or other human-readable content hidden within the binary. By filtering out non-text data, strings allows you to gain clues about the file’s purpose or origin without needing to disassemble the entire binary.

The man page describes it as follows:

For each file given, GNU strings prints the sequences of printable characters that are at least 4 characters long (or the number given with the options below) and are followed by an unprintable character.

Usage

As an example, let’s look at the first few strings in /bin/bash.

$ strings /bin/bash

/lib64/ld-linux-x86-64.so.2
DJ
E $
@B       
B8"
@P@@

. . .
. . .

fchmod
libreadline.so.8
libc.so.6
initialize_job_control
sys_error
maybe_make_export_env
dispose_word

The presence of libc.so.6 tells us that the binary likely uses the standard C library, for instance.

lsof

The lsof command, short for “list open files,” shows all the files currently opened by active processes. Since everything in Linux is treated as a file (including directories, devices, and network connections), lsof is a powerful tool for monitoring what a binary interacts with in real time. By using lsof, you can track which resources the binary accesses, providing insight into its operations.

From the man page:

lsof lists on its standard output file information about files opened by processes for the given conditions.

Usage

For these examples, the program that we’re probing needs to be running. We’ll list the process’s open files, and then its network connections.

$ lsof -p 25524       

COMMAND   PID    USER  FD   TYPE             DEVICE SIZE/OFF     NODE NAME
zsh     25524    user cwd    DIR                8,2     4096  1703938 /home/user
zsh     25524    user rtd    DIR                8,2     4096        2 /
zsh     25524    user txt    REG                8,2   947360 14055379 /usr/bin/zsh
zsh     25524    user mem    REG                8,2  3060208 14038967 /usr/lib/locale/locale-archive
zsh     25524    user mem    REG                8,2    76240 14055393 /usr/lib/zsh/5.9/zsh/computil.so

. . .

We can also look at the network connections. One gotcha: lsof ORs multiple selection conditions together by default, so -i -p 23316 lists the network files of all processes as well as everything PID 23316 has open. Add -a (as in lsof -a -i -p 23316) to AND the conditions if you only want that one process’s network activity:

$ lsof -i -p 23316
COMMAND     PID    USER  FD      TYPE             DEVICE  SIZE/OFF     NODE NAME
jetbrains  1103    user  74u     IPv6              14281       0t0      TCP localhost:52829 (LISTEN)
kdeconnec  1109    user  26u     IPv6              34398       0t0      UDP *:xmsg
kdeconnec  1109    user  27u     IPv4              17425       0t0      UDP *:mdns
kdeconnec  1109    user  28u     IPv6              17426       0t0      UDP *:mdns
kdeconnec  1109    user  29u     IPv6              34399       0t0      TCP *:xmsg (LISTEN)
kdeconnec  1109    user  30u     IPv4              39158       0t0      UDP *:60200
kdeconnec  1109    user  31u     IPv6              39159       0t0      UDP *:49715

. . .

You can also look for what files are being accessed by users on your system:

$ lsof -u user

ldd

The ldd command lists the shared libraries that a binary depends on. Since many binaries rely on external libraries for functionality, ldd helps you map out these dependencies and check whether all required libraries are present. This is particularly useful when investigating dynamically linked binaries or troubleshooting issues related to missing libraries.

From the man page:

ldd prints the shared libraries required by each program or shared library specified on the command line.

Usage

$ ldd /bin/bash
  linux-vdso.so.1 (0x00007aa7b7743000)
  libreadline.so.8 => /usr/lib/libreadline.so.8 (0x00007aa7b759e000)
  libc.so.6 => /usr/lib/libc.so.6 (0x00007aa7b73ad000)
  libncursesw.so.6 => /usr/lib/libncursesw.so.6 (0x00007aa7b733e000)
  /lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007aa7b7745000)

This is really helpful. We can see all of the shared libraries that /bin/bash relies on. One word of caution: ldd typically works by invoking the program’s dynamic loader, so never run it on an untrusted binary; objdump -p or readelf -d will show the needed libraries without executing anything.

objdump

The objdump command provides detailed information about object files, offering insight into a binary’s internal structure. With objdump, you can disassemble the binary, inspect its headers, and examine its machine code and data sections. This tool is invaluable when you’re conducting a deep analysis, as it gives you a granular look at the file’s components.

From the man page:

objdump displays information about one or more object files. The options control what particular information to display.

Usage

In a previous post, I wrote about objdump usage while creating shellcode to execute.

nm

The nm command allows you to list symbols from object files, providing insight into the binary’s functions, variables, and other symbols. It’s a useful tool when trying to reverse engineer a binary, as it helps map out its structure and function entry points. You can also use it to debug symbol-related issues in your own compiled binaries.

From the man page:

nm lists the symbols from object files. The symbol names are shown in the name column, and additional information includes the type and the value associated with each symbol.

Usage

I’ve just written and compiled a “Hello, world” executable.

After a standard compilation gcc hello.c -o hello, the following is returned.

$ nm hello
0000000000004018 B __bss_start
w __cxa_finalize@GLIBC_2.2.5
0000000000004008 D __data_start
0000000000004008 W data_start
0000000000004010 D __dso_handle
0000000000003de0 d _DYNAMIC
0000000000004018 D _edata
0000000000004020 B _end
0000000000001160 T _fini
0000000000003fe8 d _GLOBAL_OFFSET_TABLE_
w __gmon_start__
0000000000002014 r __GNU_EH_FRAME_HDR
0000000000001000 T _init
0000000000002000 R _IO_stdin_used
w _ITM_deregisterTMCloneTable
w _ITM_registerTMCloneTable
U __libc_start_main@GLIBC_2.34
0000000000001139 T main
U puts@GLIBC_2.2.5
0000000000001040 T _start
0000000000004018 D __TMC_END__

readelf

The readelf command is similar to objdump, but it focuses specifically on displaying information from ELF (Executable and Linkable Format) files. This tool can show detailed information about sections, headers, program headers, and other parts of an ELF binary. readelf is a go-to tool for investigating how ELF binaries are structured, particularly in understanding what segments the binary contains and how it’s loaded into memory.

From the man page:

readelf displays information about one or more ELF format object files. The options control what particular information to display.

Usage

For the “Hello, world” program, readelf breaks the ELF header down:

$ readelf -h hello
ELF Header:
Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class:                             ELF64
Data:                              2's complement, little endian
Version:                           1 (current)
OS/ABI:                            UNIX - System V
ABI Version:                       0
Type:                              DYN (Position-Independent Executable file)
Machine:                           Advanced Micro Devices X86-64
Version:                           0x1
Entry point address:               0x1040
Start of program headers:          64 (bytes into file)
Start of section headers:          13520 (bytes into file)
Flags:                             0x0
Size of this header:               64 (bytes)
Size of program headers:           56 (bytes)
Number of program headers:         13
Size of section headers:           64 (bytes)
Number of section headers:         30
Section header string table index: 29

strace

The strace command traces the system calls made by a binary as it runs. System calls are the interface between user-space applications and the Linux kernel, so tracing them can give you a real-time view of how the binary interacts with the system. Whether you’re debugging or investigating suspicious behavior, strace is an essential tool for following what a binary is doing at a low level.

From the man page:

strace is a diagnostic, debugging, and instructional userspace utility for Linux. It is used to monitor and tamper with interactions between processes and the Linux kernel, including system calls, signal deliveries, and changes of process state.

Usage

I’ve run the “Hello, world” program through strace below, trimming most of the calls for brevity; you can trace through to the underlying system calls.

$ strace ./hello
execve("./hello", ["./hello"], 0x7ffc231b4fc0 /* 82 vars */) = 0

. . .
. . .

write(1, "Hello, world\n", 13Hello, world
)          = 13
exit_group(0)                           = ?
+++ exited with 0 +++

ltrace

The ltrace command works similarly to strace, but instead of tracing system calls, it tracks the dynamic library calls made by a binary. If you want to see how a program interacts with shared libraries, such as the C standard library (libc), ltrace is the tool to use. It’s particularly useful when debugging issues related to dynamically linked functions.

From the man page:

ltrace is a program that simply runs the specified command until it exits. It intercepts and records the dynamic library calls that are called by the executed process and the signals received by that process.

Usage

Looking again at our “Hello, world” program:

$ ltrace ./hello
puts("Hello, world"Hello, world
)                                         = 13
+++ exited (status 0) +++

gdb

The GNU Debugger (gdb) is a powerful tool for interactively debugging binaries. You can use it to set breakpoints, inspect memory and variables, and step through a binary’s execution line by line. gdb is a versatile tool not only for developers debugging their own code but also for reverse engineers looking to analyze how a binary works.

From the man page:

gdb is a portable debugger that works for many programming languages, including C, C++, and Fortran. The main purpose of gdb is to allow you to see what is going on inside another program while it is executing or what another program was doing at the moment it crashed.

I have a write-up about gdb in a previous post.
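Usage

gdb is interactive by nature, but it can also be driven non-interactively, which is handy for quick recon. Here’s a sketch, assuming the same hello binary used in the nm example; --batch runs the supplied commands and exits:

```shell
# Break at main, run to the breakpoint, then disassemble the next
# few instructions from the current program counter
gdb --batch \
    -ex 'break main' \
    -ex 'run' \
    -ex 'x/4i $pc' \
    ./hello
```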

hexedit

The hexedit command is a hex editor that allows you to directly view and edit the raw binary content of a file. This can be useful for making minor modifications to a binary or simply for inspecting its content at the byte level. It’s especially helpful when you need to look at binary structures or strings that aren’t visible using regular text-based tools.

From the man page:

hexedit shows a file both in hexadecimal and in ASCII. The file can be modified, and the changes take effect immediately.

objcopy

The objcopy command allows you to copy and translate object files from one format to another. It’s often used to extract or remove sections from binaries, making it a useful tool for tailoring object files to specific requirements. objcopy can be helpful when you need to analyze or modify specific sections of a binary, such as stripping debugging symbols.

From the man page:

objcopy copies the contents of an object file to another. It can also extract specific sections from the object file or remove sections from it.

patchelf

The patchelf command lets you modify ELF binaries, enabling you to change key properties like the dynamic loader path or RPATH. This is useful when you want to adjust how an ELF binary locates shared libraries, or when you’re working in an environment where libraries are stored in non-standard locations.

From the man page:

patchelf is a simple utility for modifying existing ELF executables and libraries. It can change the dynamic loader (“ELF interpreter”) of executables and change the RPATH and RUNPATH of executables and libraries.

checksec

The checksec command provides a quick way to check the security properties of a binary. It examines whether the binary uses common security mechanisms like stack canaries, non-executable memory (NX), or position-independent execution (PIE). This tool is great for assessing how hardened a binary is against common vulnerabilities.

From the man page:

checksec is a bash script to check the properties of executables (e.g., whether they are compiled with stack protection, DEP, ASLR, etc.).

Usage

Let’s look at Hello, world.

$ checksec --format=cli --file=hello
RELRO           STACK CANARY      NX            PIE             RPATH      RUNPATH      Symbols    FORTIFY  Fortified       Fortifiable     FILE
Partial RELRO   No canary found   NX enabled    PIE enabled     No RPATH   No RUNPATH   24 Symbols   N/A    0               0               hello

Other Tools

While the shell-based tools discussed above are invaluable for quick inspection and analysis of binaries, there are several more advanced tools that provide deeper functionality and broader capabilities. Below are some additional tools worth exploring for more in-depth binary analysis and reverse engineering:

  • Ghidra: A powerful open-source reverse engineering tool developed by the NSA. It supports analysis of binary code for a wide variety of architectures and provides a graphical interface for decompilation and analysis.
  • Radare2: An advanced open-source framework for reverse engineering and analyzing binaries. It provides a rich command-line interface as well as a visual mode for inspecting file structures.
  • Binary Ninja: A commercial reverse engineering platform offering a clean interface and scriptable analysis features for binary inspection and disassembly.
  • Hopper Disassembler: A reverse engineering tool designed for macOS and Linux that helps with disassembly, decompilation, and debugging of binaries.
  • IDA Pro: A well-known, industry-standard disassembler and debugger, widely used in reverse engineering for deeply analyzing executables and debugging code across various architectures.
  • capstone: A lightweight, multi-architecture disassembly framework that can be integrated into other tools or used to write custom disassemblers.
  • RetDec: An open-source decompiler developed by Avast, designed to convert machine code into human-readable code.
  • pwntools: A CTF framework and exploit development library, useful for writing scripts to exploit vulnerabilities in binaries and automate tasks.
  • Angr: A platform for analyzing binaries, capable of both static and dynamic analysis, widely used in vulnerability research and symbolic execution.

These tools are generally more sophisticated than the shell-based ones and are essential for deep binary analysis, reverse engineering, and exploit development. Many are extensible with scripting capabilities, allowing for custom and automated analysis workflows.

Framebuffer Drawing in Linux

Introduction

Some of my favourite graphics programming is done simply with a framebuffer pointer. The simplicity of accessing pixels directly can be a lot of fun. In today’s article, I’ll walk through a couple of different ways that you can achieve this in Linux.

/dev/fb*

Probably the easiest way to get started with writing to the framebuffer is to start working directly with the /dev/fb0 device.

cat /dev/urandom > /dev/fb0

If your system is anything like mine, this results in zsh: permission denied: /dev/fb0. To get around this, add yourself to the “video” group (you’ll need to log out and back in for the new group membership to take effect).

sudo adduser $USER video

You can now fill your screen with garbage by sending all of those bytes from /dev/urandom.
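Before writing anything more deliberate than noise, it’s worth checking the framebuffer’s geometry. sysfs exposes it (the exact paths may vary between systems):

```shell
# Resolution, reported as "width,height"
cat /sys/class/graphics/fb0/virtual_size

# Bits per pixel (commonly 32)
cat /sys/class/graphics/fb0/bits_per_pixel

# Blank the screen rather than filling it with noise; dd stops
# when the device is full
dd if=/dev/zero of=/dev/fb0 bs=64k 2>/dev/null || true
```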

This works, but it’s not the best way to get it done.

Xlib

Next up we’ll try again using Xlib. This isn’t exactly what we’re after, but I’ve included this one for completeness.

#include <X11/Xlib.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main() {
    Display *display;
    Window window;
    XEvent event;
    int screen;

    // Open connection to X server
    display = XOpenDisplay(NULL);
    if (display == NULL) {
        fprintf(stderr, "Unable to open X display\n");
        exit(1);
    }

    screen = DefaultScreen(display);

    // Create a window
    window = XCreateSimpleWindow(
      display, 
      RootWindow(display, screen), 
      10, 10, 
      800, 600, 
      1,
      BlackPixel(display, screen), 
      WhitePixel(display, screen)
    );

    // Select kind of events we are interested in
    XSelectInput(display, window, ExposureMask | KeyPressMask);

    // Map (show) the window
    XMapWindow(display, window);

    // Create a simple graphics context
    GC gc = XCreateGC(display, window, 0, NULL);
    XSetForeground(display, gc, BlackPixel(display, screen));

    // Allocate a buffer for drawing
    XImage *image = XCreateImage(
      display, 
      DefaultVisual(display, screen), DefaultDepth(display, screen), 
      ZPixmap, 
      0, 
      NULL, 
      800, 
      600, 
      32, 
      0
    );

    image->data = malloc(image->bytes_per_line * image->height);

    // Main event loop
    while (1) {
        XNextEvent(display, &event);
        if (event.type == Expose) {
            // Draw something to the buffer
            for (int y = 0; y < 600; y++) {
                for (int x = 0; x < 800; x++) {
                    unsigned long pixel = ((x ^ y) & 1) ? 0xFFFFFF : 0x000000; // Simple checker pattern
                    XPutPixel(image, x, y, pixel);
                }
            }

            // Copy buffer to window
            XPutImage(display, window, gc, image, 0, 0, 0, 0, 800, 600);
        }
        if (event.type == KeyPress)
            break;
    }

    // Cleanup
    XDestroyImage(image);
    XFreeGC(display, gc);
    XDestroyWindow(display, window);
    XCloseDisplay(display);

    return 0;
}

We are performing double-buffering here, but the buffer is only copied to the window when an Expose event comes through. This can be useful, but it’s not great if you want to do any animation.

In order to compile this particular example, you need to make sure that you have libx11-dev installed.

sudo apt-get install libx11-dev
gcc -o xlib_demo xlib_demo.c -lX11

SDL

For our last example, we’ll use SDL to get pixel access to a backbuffer (or framebuffer) by creating a streaming texture. In this example we continuously flip the back image onto video memory, which allows for smooth animation.

#include <SDL2/SDL.h>
#include <stdio.h>
#include <stdlib.h> /* malloc, free */

int main(int argc, char* argv[]) {
    if (SDL_Init(SDL_INIT_VIDEO) < 0) {
        fprintf(stderr, "Could not initialize SDL: %s\n", SDL_GetError());
        return 1;
    }

    SDL_Window *window = SDL_CreateWindow(
      "SDL Demo", 
      SDL_WINDOWPOS_UNDEFINED, 
      SDL_WINDOWPOS_UNDEFINED, 
      800, 
      600, 
      SDL_WINDOW_SHOWN
    );

    if (window == NULL) {
        fprintf(stderr, "Could not create window: %s\n", SDL_GetError());
        SDL_Quit();
        return 1;
    }

    SDL_Renderer *renderer = SDL_CreateRenderer(
      window, 
      -1, 
      SDL_RENDERER_ACCELERATED
    );

    SDL_Texture *texture = SDL_CreateTexture(
      renderer, 
      SDL_PIXELFORMAT_ARGB8888, 
      SDL_TEXTUREACCESS_STREAMING, 
      800, 
      600
    );

    Uint32 *pixels = malloc(800 * 600 * sizeof(Uint32));

    // Main loop
    int running = 1;
    while (running) {
        SDL_Event event;
        while (SDL_PollEvent(&event)) {
            if (event.type == SDL_QUIT) {
                running = 0;
            }
        }

        // Draw something to the buffer
        for (int y = 0; y < 600; y++) {
            for (int x = 0; x < 800; x++) {
                pixels[y * 800 + x] = ((x ^ y) & 1) ? 0xFFFFFFFF : 0xFF000000; // Simple checker pattern
            }
        }

        SDL_UpdateTexture(texture, NULL, pixels, 800 * sizeof(Uint32));
        SDL_RenderClear(renderer);
        SDL_RenderCopy(renderer, texture, NULL, NULL);
        SDL_RenderPresent(renderer);

        SDL_Delay(16); // ~60 FPS
    }

    free(pixels);
    SDL_DestroyTexture(texture);
    SDL_DestroyRenderer(renderer);
    SDL_DestroyWindow(window);
    SDL_Quit();

    return 0;
}

Before being able to compile and run this, you need to make sure you have SDL installed on your system.

sudo apt-get install libsdl2-dev
gcc -o sdl_demo sdl_demo.c -lSDL2

That covers a few different framebuffer options; which one suits you will depend on your appetite for dependencies and how much programming effort you want to invest.

Getting Started with GNUstep

Introduction

GNUstep is a development framework for writing GUI applications. It aims to follow Apple’s Cocoa API, but allows you to write applications for more platforms than just macOS.

In today’s article we’ll set up a local environment for writing GNUstep programs, and we’ll also write and compile a simple “Hello, world” application to make sure everything is working.

Brief History

GNUstep is an open-source implementation of the OpenStep specification, which originated from NeXT, a company founded by Steve Jobs after he left Apple in 1985. NeXT developed the NeXTSTEP operating system, which introduced an advanced object-oriented framework for software development. In 1993, NeXT partnered with Sun Microsystems to create the OpenStep standard, which aimed to make NeXT’s frameworks available on other platforms.

When Apple acquired NeXT in 1996, the technology from NeXTSTEP and OpenStep formed the foundation of Apple’s new operating system, Mac OS X. Apple’s Cocoa framework, a core part of macOS, is directly derived from OpenStep. GNUstep, initiated in the early 1990s, aims to provide a free and portable version of the OpenStep API, allowing developers to create cross-platform applications with a foundation rooted in the same principles that underpin macOS development.

For those of us developing outside of Apple’s ecosystem, this leaves GNUstep as the way to get up and running.

Developer environment

First up, we need to install the dependencies for our developer environment. I’m using Debian, so my package management commands are specific to that distribution, but equivalent packages are available for most distributions.

sudo apt-get install build-essential gnustep gnustep-devel

Once this is installed, we can move on to writing some code.

“Hello World” alert

The following program is very basic. It’ll show an alert to screen, and then exit after the user has dismissed the alert.

// hello.m

#include <AppKit/AppKit.h>

@interface AppDelegate : NSObject<NSApplicationDelegate>

@end

@implementation AppDelegate

- (void)applicationDidFinishLaunching:(NSNotification *)notification {
  NSAlert *alert = [[[NSAlert alloc] init] autorelease];
  [alert setMessageText:@"Hello, World!"];
  [alert runModal];
  [NSApp terminate:nil];
}

@end

int main(int argc, char *argv[]) {
  NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
  NSApplication *app = [NSApplication sharedApplication];

  AppDelegate *appDelegate = [[[AppDelegate alloc] init] autorelease];
  [app setDelegate:appDelegate];

  [app run];
  [pool drain];

  return 0;
}

Walkthrough

First of all, we include AppKit/AppKit.h so that we get access to the programming API. We then define our own AppDelegate so we can implement the applicationDidFinishLaunching: delegate method:

#include <AppKit/AppKit.h>

@interface AppDelegate : NSObject<NSApplicationDelegate>

@end

@implementation AppDelegate

- (void)applicationDidFinishLaunching:(NSNotification *)notification {
  NSAlert *alert = [[[NSAlert alloc] init] autorelease];
  [alert setMessageText:@"Hello, World!"];
  [alert runModal];
  [NSApp terminate:nil];
}

@end

The handler sets up the alert, shows it on screen by running it modally with runModal, and then finishes with a call to terminate:.

Next is the main program itself.

We start with an NSAutoreleasePool to give our program some automatic memory management. This is cleaned up at the end with a call to [pool drain].

The app variable is set up as an NSApplication, which allows us to instantiate and attach our AppDelegate via the [app setDelegate:appDelegate]; call.

int main(int argc, char *argv[]) {
  NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
  NSApplication *app = [NSApplication sharedApplication];

  AppDelegate *appDelegate = [[[AppDelegate alloc] init] autorelease];
  [app setDelegate:appDelegate];

  [app run];
  [pool drain];

  return 0;
}

Building

Now that we have our code written into hello.m, we can compile and link. There are some compile and link time libraries required to get this running. For this, we’ll use gnustep-config to do all of the heavy lifting for us.

gcc `gnustep-config --objc-flags` -o hello hello.m `gnustep-config --gui-libs`

If everything has gone to plan, you should be left with an executable called hello.

GNUstep Hello World

Happy GNUstep’ing!

Simple usages of rsync

Introduction

rsync is an excellent tool to have at your disposal when working with file data.

In today’s article, I’ll go through some usages.

Progress Copy

To securely and efficiently transfer data to (or from) a remote server while providing you progress updates you can use the following:

rsync -avzh --progress --stats user@server:/path/to/file output_name

  • -a puts rsync into archive mode, preserving your file and link structures
  • -v pumps up the logging output to give you verbose information
  • -z will compress the file data as it’s sent over the network
  • -h makes the output human readable
  • --progress provides a progress bar to be displayed, which tracks how far through we are
  • --stats provides a full statistical run down after the process has completed

Simple Backups with rsync

Introduction

Rsync is a powerful tool often utilized for its proficiency in file synchronization and transfer. Widely adopted in various computing environments, rsync excels in efficiently mirroring data across multiple platforms and minimizing data transfer by sending only changes in files. This utility is not only pivotal for maintaining backups but also ensures that the copies are consistent and up-to-date.

Today’s article is designed to guide you through the steps of setting up a basic yet effective backup system using rsync. By the end of this guide, you’ll have the knowledge to implement a reliable backup solution that can be customized to fit your specific needs.

Daily

The daily backup captures all of the changes for the day, giving you a backup with the fastest cadence.

#!/bin/bash

# daily-backup.sh
#
# Basic copy script to perform daily backups of the home dir

USER=$(whoami)
HOST=$(hostname)

rsync -aAX \
    --delete \
    --rsync-path="mkdir -p /path/to/backups/$HOST/daily/ && rsync" \
    --exclude-from=/home/$USER/.local/bin/rsync-homedir-excludes.txt \
    /home/$USER/ backup-user@backup-server:/path/to/backups/$HOST/daily/ 

Using whoami and hostname we can generalise this script so that you can reuse it between all of your machines.

The rsync-homedir-excludes.txt file allows you to define files that you’re not interested in backing up.

The switches that we’re sending into rsync are quite significant:

  • -a puts rsync into archive mode, preserving file structures, and links
  • -A preserves the ACLs so all of our permissions are preserved
  • -X any extra attributes that are stored by your file system will be preserved
  • --delete will delete files in the destination that are no longer present in the source, making the backup a true mirror

Weekly

The weekly backup is very similar to the daily backup. It’ll target different folders, and will have a different execution cadence.

#!/bin/bash

# weekly-backup.sh
#
# Basic copy script to perform weekly backups of the home dir

USER=$(whoami)
HOST=$(hostname)

rsync -aAX \
    --delete \
    --rsync-path="mkdir -p /path/to/backups/$HOST/weekly/ && rsync" \
    --exclude-from=/home/$USER/.local/bin/rsync-homedir-excludes.txt \
    /home/$USER/ backup-user@backup-server:/path/to/backups/$HOST/weekly/ 

There isn’t much difference here; it just writes to the weekly/ folder instead.

Monthly

The longest cadence we have is a monthly process, which captures the current state into a dated archive so that older snapshots can be kept and referred back to later.

#!/bin/bash

# monthly-backup.sh
#
# Monthly archive and copy

ARCHIVE=$(date +%Y-%m-%d).tar.gz
USER=$(whoami)
HOST=$(hostname)

tar --exclude-from=/home/$USER/.local/bin/rsync-homedir-excludes.txt \
    -zcvf \
    /tmp/$ARCHIVE \
    /home/$USER

scp /tmp/$ARCHIVE backup-user@backup-server:/path/to/backups/$HOST/monthly/

rm /tmp/$ARCHIVE

Using tar this script builds a full archive, and then sends that off to the backup server.

Scheduling

Finally, we need to setup these scripts to execute automatically for us. For this, we’ll use cron.

Here’s an example crontab scheduling these scripts:

# m h  dom mon dow   command
00 22 * * * /home/user/.local/bin/daily-backup.sh
00 21 * * 6 /home/user/.local/bin/weekly-backup.sh
00 22 1 * * /home/user/.local/bin/monthly-backup.sh