FreeBSD Jails are one of the earliest implementations of operating
system-level virtualization—dating back to the early 2000s, long before Docker popularized the idea of lightweight
containers. Despite their age, jails remain a powerful, flexible, and minimal way to isolate services and processes on
FreeBSD systems.
This post walks through a minimal “Hello World” setup using Jails, with just enough commentary to orient new users and
show where jails shine in the modern world of virtualization.
Why Jails?
A FreeBSD jail is a chroot-like environment with its own file system, users,
network interfaces, and process table. But unlike chroot, jails extend control to include process isolation, network
access, and fine-grained permission control. They’re more secure, more flexible, and more deeply integrated into the
FreeBSD base system.
Here’s how jails compare with some familiar alternatives:
Versus VMs: Jails don’t emulate hardware or run separate kernels. They’re faster to start, lighter on resources, and simpler to manage. But they’re limited to the same FreeBSD kernel as the host.
Versus Docker: Docker containers typically run on a Linux host and rely on a container runtime, layered filesystems, and extensive tooling. Jails are simpler, arguably more robust, and don’t require external daemons. However, they lack some of the ecosystem and portability benefits that Docker brings.
If you’re already running FreeBSD and want to isolate services or test systems with minimal overhead, jails are a
perfect fit.
Setup
Let’s build a bare-bones jail. The goal here is simplicity: get a jail running with minimal commands. This is the BSD
jail equivalent of “Hello, World.”
# Make a directory to hold the jail
mkdir hw

# Install a minimal FreeBSD userland into that directory
sudo bsdinstall jail /home/michael/src/jails/hw

# Start the jail with a name, IP address, and a shell
sudo jail -c name=hw host.hostname=hw.example.org \
    ip4.addr=192.168.1.190 \
    path=/home/michael/src/jails/hw \
    command=/bin/sh
You now have a running jail named hw, with a hostname and IP, running a shell isolated from the host system.
192.168.1.190 is just a static address I picked arbitrarily; you'll want to pick an address that is reachable on your local network.
Poking Around
With your jail up and running, you can start working with it. To enter the jail, use the following:
sudo jexec hw /bin/sh
jexec lets you run any command you need inside the jail:
sudo jexec hw ls /
Querying
You can list running jails with:
jls
You should see something like this:
JID   IP Address       Hostname          Path
  2   192.168.1.190    hw.example.org    /home/michael/src/jails/hw
You can also look at what’s currently running in the jail:
ps -J hw
You should see the /bin/sh process:
 PID  TT  STAT     TIME     COMMAND
2390   5  I+J      0:00.01  /bin/sh
Finishing up
To terminate the jail:
sudo jail -r hw
This is a minimal setup with no automated networking, no jail management frameworks, and no persistent configuration.
And that’s exactly the point: you can get a working jail in three commands and tear it down just as easily.
When to Use Jails
Jails make sense when:
You want process and network isolation on FreeBSD without the overhead of full VMs.
You want to run multiple versions of a service (e.g., Postgres 13 and 15) on the same host.
You want stronger guarantees than chroot provides for service containment.
You’re building or testing FreeBSD-based systems and want a reproducible sandbox.
For more complex jail setups, FreeBSD offers tools like ezjail, iocage, and bastille that add automation and
persistence. But it’s worth knowing how the pieces fit together at the core.
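If you later want persistence without a framework, the base system's own mechanism is /etc/jail.conf. As a rough sketch (not something this post sets up), an entry for the jail above might look like this:

hw {
    host.hostname = "hw.example.org";
    ip4.addr = 192.168.1.190;
    path = "/home/michael/src/jails/hw";
    exec.start = "/bin/sh /etc/rc";
    exec.stop = "/bin/sh /etc/rc.shutdown";
    mount.devfs;
}

With jail_enable="YES" in /etc/rc.conf, service jail start hw would then bring the jail up from that definition.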
Conclusion
FreeBSD jails offer a uniquely minimal, powerful, and mature alternative to both VMs and containers. With just a few
commands, you can create a secure, isolated environment for experimentation, testing, or even production workloads.
This post only scratched the surface, but hopefully it’s enough to get you curious. If you’re already on FreeBSD, jails
are just sitting there, waiting to be used—no extra software required.
Modern Linux systems provide a fascinating feature for overriding shared library behavior at runtime: LD_PRELOAD.
This environment variable lets you inject a custom shared library before anything else is loaded — meaning you can
intercept and modify calls to common functions like open, read, connect, and more.
In this post, we’ll walk through hooking the open() function using LD_PRELOAD and a simple shared object. No extra
tooling required — just a few lines of C, and the ability to compile a .so file.
Intercepting open()
Let’s write a tiny library that intercepts calls to open() and prints the file path being accessed. We’ll also
forward the call to the real open() so the program behaves normally.
Create a file named hook_open.c with the following:
#define _GNU_SOURCE
#include <stdio.h>
#include <stdarg.h>
#include <dlfcn.h>
#include <fcntl.h>

int open(const char *pathname, int flags, ...)
{
    static int (*real_open)(const char *, int, ...) = NULL;
    if (!real_open)
        real_open = dlsym(RTLD_NEXT, "open");

    va_list args;
    va_start(args, flags);
    mode_t mode = va_arg(args, int);
    va_end(args);

    fprintf(stderr, "[HOOK] open() called with path: %s\n", pathname);

    return real_open(pathname, flags, mode);
}
This function matches the signature of open, grabs the “real” function using dlsym(RTLD_NEXT, ...), and then
forwards the call after logging it.
Note: We use va_list to handle the optional mode argument safely.
Compiling the Hook
Compile your code into a shared object:
gcc -fPIC -shared -o hook_open.so hook_open.c -ldl
Now you can use this library with any dynamically linked program that calls open.
Testing with a Simple Program
Try running a standard tool like cat to confirm that it’s using open():
LD_PRELOAD=./hook_open.so cat hook_open.c
You should see:
[HOOK] open() called with path: hook_open.c
#define _GNU_SOURCE
...
Each time the program calls open(), your hook intercepts it, logs the call, and passes control along.
Notes and Gotchas
This only works with dynamically linked binaries — statically linked programs don’t go through the dynamic linker.
Some programs (like ls) may use openat() instead of open(). You can hook that too with the same method; see the sketch after this list.
If your hook causes a crash or hangs, it’s often due to incorrect use of va_arg or missing dlsym resolution.
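As a sketch, an openat() hook looks almost identical to the open() hook above (this is an extra example, not part of the original hook_open.c):

#define _GNU_SOURCE
#include <stdio.h>
#include <stdarg.h>
#include <dlfcn.h>
#include <fcntl.h>

int openat(int dirfd, const char *pathname, int flags, ...)
{
    static int (*real_openat)(int, const char *, int, ...) = NULL;
    if (!real_openat)
        real_openat = dlsym(RTLD_NEXT, "openat");

    /* openat() also takes an optional mode when O_CREAT is passed */
    va_list args;
    va_start(args, flags);
    mode_t mode = va_arg(args, int);
    va_end(args);

    fprintf(stderr, "[HOOK] openat() called with path: %s\n", pathname);
    return real_openat(dirfd, pathname, flags, mode);
}

Compile it into the same (or a separate) shared object and preload it exactly as before.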
Where to Go From Here
You can expand this basic example to:
Block access to specific files
Redirect file paths
Inject fake contents
Hook other syscalls like connect(), write(), execve()
LD_PRELOAD is a powerful mechanism for debugging, sandboxing, and learning how programs interact with the system.
Just don’t forget — you’re rewriting the behavior of fundamental APIs at runtime.
Hexagonal Architecture, also known as Ports and
Adapters, is a compelling design pattern that encourages the decoupling of domain logic from infrastructure concerns.
In this post, I’ll walk through a Rust project called banker that adopts this architecture, showing how it helps
keep domain logic clean, composable, and well-tested.
You can follow along with the full code in my GitHub Repository to get this running locally.
Project Structure
The banker project is organized as a set of crates:
crates/
├── banker-core # The domain and business logic
├── banker-adapters # Infrastructure adapters (e.g. in-memory repo)
├── banker-fixtures # Helpers and test data
└── banker-http # Web interface via Axum
Each crate plays a role in isolating logic boundaries:
banker-core defines the domain entities, business rules, and traits (ports).
banker-adapters implements the ports with concrete infrastructure (like an in-memory repository).
banker-fixtures provides test helpers and mock repositories.
banker-http exposes an HTTP API with axum, calling into the domain via ports.
Structurally, the project flows as follows:
graph TD
subgraph Core
BankService
AccountRepo[AccountRepo trait]
end
subgraph Adapters
HTTP[HTTP Handler]
InMemory[InMemoryAccountRepo]
Fixtures[Fixture Test Repo]
end
HTTP -->|calls| BankService
BankService -->|trait| AccountRepo
InMemory -->|implements| AccountRepo
Fixtures -->|implements| AccountRepo
Defining the Domain (banker-core)
In Hexagonal Architecture, the domain represents the core of your application—the rules, behaviors, and models that
define what your system actually does. It’s intentionally isolated from infrastructure concerns like databases or HTTP.
This separation ensures the business logic remains testable, reusable, and resilient to changes in external technology
choices.
The banker-core crate contains the central business model:
pub struct Bank<R: AccountRepo> {
    repo: R,
}

impl<R: AccountRepo> Bank<R> {
    pub fn deposit(&self, cmd: Deposit) -> Result<Account, BankError> {
        let mut acct = self.repo.get(&cmd.id)?.ok_or(BankError::NotFound)?;
        acct.balance_cents += cmd.amount_cents;
        self.repo.upsert(&acct)?;
        Ok(acct)
    }

    // ... open and withdraw omitted for brevity
}
The Bank struct acts as the use-case layer, coordinating logic between domain entities and ports.
Implementing Adapters
In Hexagonal Architecture, adapters are the glue between your domain and the outside world. They translate external
inputs (like HTTP requests or database queries) into something your domain understands—and vice versa. Adapters
implement the domain’s ports (traits), allowing your application core to remain oblivious to how and where the data
comes from.
The in-memory repository implements the AccountRepo trait and lives in banker-adapters:
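Here is a minimal sketch of that adapter, with the core types (Account, BankError, and the AccountRepo trait) stubbed in as assumptions so the snippet stands alone; the real definitions live in banker-core:

use std::collections::HashMap;
use std::sync::Mutex;

// Assumed shapes from banker-core; the real crate defines these.
#[derive(Clone)]
pub struct Account {
    pub id: String,
    pub balance_cents: i64,
}

pub enum BankError {
    NotFound,
}

pub trait AccountRepo {
    fn get(&self, id: &str) -> Result<Option<Account>, BankError>;
    fn upsert(&self, acct: &Account) -> Result<(), BankError>;
}

// The in-memory adapter: a HashMap behind a Mutex.
#[derive(Default)]
pub struct InMemoryAccountRepo {
    accounts: Mutex<HashMap<String, Account>>,
}

impl AccountRepo for InMemoryAccountRepo {
    fn get(&self, id: &str) -> Result<Option<Account>, BankError> {
        Ok(self.accounts.lock().unwrap().get(id).cloned())
    }

    fn upsert(&self, acct: &Account) -> Result<(), BankError> {
        self.accounts
            .lock()
            .unwrap()
            .insert(acct.id.clone(), acct.clone());
        Ok(())
    }
}

Nothing here knows about HTTP or serialization; it only satisfies the port.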
The outermost layer of a hexagonal architecture typically handles transport—the mechanism through which external actors
interact with the system. In our case, that’s HTTP, implemented using the axum framework. This layer invokes domain
services via the ports defined in banker-core, ensuring the business logic remains insulated from the specifics of web
handling.
In banker-http, we wire up the application for HTTP access using axum:
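A rough sketch of that wiring, assuming axum 0.7 and a few pieces not shown here (a Bank::new constructor, a Deposit command type deriving Deserialize, and a Serialize derive on Account):

use std::sync::Arc;

use axum::{extract::State, routing::post, Json, Router};

// Assumes Deposit: serde::Deserialize and Account: serde::Serialize.
async fn deposit(
    State(bank): State<Arc<Bank<InMemoryAccountRepo>>>,
    Json(cmd): Json<Deposit>,
) -> Json<Account> {
    // Error handling elided; a real handler would map BankError to a status code.
    let acct = bank.deposit(cmd).unwrap_or_else(|_| panic!("deposit failed"));
    Json(acct)
}

#[tokio::main]
async fn main() {
    let bank = Arc::new(Bank::new(InMemoryAccountRepo::default()));

    let app = Router::new()
        .route("/deposit", post(deposit))
        .with_state(bank);

    let listener = tokio::net::TcpListener::bind("0.0.0.0:3000").await.unwrap();
    axum::serve(listener, app).await.unwrap();
}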
Each handler invokes domain logic through the Bank service, returning simple JSON responses.
This is one example of a primary adapter—other adapters (e.g., CLI, gRPC) could be swapped in without changing the core.
Takeaways
Traits in Rust are a perfect match for defining ports.
Structs implementing those traits become adapters—testable and swappable.
The core domain crate (banker-core) has no dependencies on infrastructure or axum.
Tests can exercise the domain logic via fixtures and in-memory mocks.
Hexagonal Architecture in Rust isn’t just theoretical—it’s ergonomic. With traits, lifetimes, and ownership semantics,
you can cleanly separate concerns while still writing expressive, high-performance code.
One of the most powerful ideas behind deep learning is backpropagation—the algorithm that lets a neural network learn
from its mistakes. But while modern tools like PyTorch and TensorFlow
make it easy to use backprop, they also hide the magic.
In this post, we’ll strip things down to the fundamentals and implement a neural network from scratch in NumPy
to solve the XOR problem.
Along the way, we’ll dig into what backprop really is, how it works, and why it matters.
What Is Backpropagation?
Backpropagation is a method for computing how to adjust the weights—the tunable parameters of a neural network—so
that it improves its predictions. It does this by minimizing a loss function, which measures how far off the
network’s outputs are from the correct answers. To do that, it calculates gradients, which tell us how much each
weight contributes to the overall error and how to adjust it to reduce that error.
Think of it like this:
In calculus, we use derivatives to understand how one variable changes with respect to another.
In neural networks, we want to know: How much does this weight affect the final error?
Enter the chain rule—a calculus technique that lets us break down complex derivatives into manageable parts.
Backpropagation applies the chain rule across all the layers in a network, allowing us to efficiently compute the
gradient of the loss function for every weight.
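Concretely, for the two-layer network built below (hidden activation \(a_1\), output activation \(a_2\), loss \(L\)), the chain rule in scalar form expands the gradient for a first-layer weight into a product of local derivatives:

\[
\frac{\partial L}{\partial W_1}
= \frac{\partial L}{\partial a_2}
\cdot \frac{\partial a_2}{\partial z_2}
\cdot \frac{\partial z_2}{\partial a_1}
\cdot \frac{\partial a_1}{\partial z_1}
\cdot \frac{\partial z_1}{\partial W_1}
\]

Each factor is cheap to compute at its own layer, and backpropagation evaluates the product by sweeping from the output back toward the input, which is exactly what the backward-pass code below does step by step.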
Neural Network Flow
graph TD
A[Input Layer] --> B[Hidden Layer]
B --> C[Output Layer]
C --> D[Loss Function]
D -->|Backpropagate| C
C -->|Backpropagate| B
B -->|Backpropagate| A
We push inputs forward through the network to get predictions (forward pass), then pull error gradients backward to
adjust the weights (backward pass).
Solving XOR with a Neural Network
The XOR problem is a classic test for neural networks. It looks like this:
Input     Output
[0, 0]    0
[0, 1]    1
[1, 0]    1
[1, 1]    0
A simple linear model can’t solve XOR because it’s not linearly separable. But with a small neural network—just one
hidden layer—we can crack it.
We’ll walk through our implementation step by step.
The X matrix defines all of our inputs. You can see these as the bit pairs that you'd normally pass through an XOR operation. The y matrix then defines the expected outputs.
The input_size is the number of input features. We have two values going in as an input here.
The hidden_size is the number of “neurons” in the hidden layer. Hidden layers are where the network transforms
input into internal features. XOR requires non-linear transformation, so at least one hidden layer is essential. Setting
this to 2 keeps the network small, but expressive enough to learn XOR.
output_size is the number of output neurons. XOR is a binary classification problem so we only need a single output.
Finally, learning_rate controls how fast the network learns. This value scales the size of the weight updates
during training. By increasing this value, we get the network to learn faster but we risk overshooting optimal values.
Lower values are safer, but slower.
We initialize weights randomly and biases to zero. The small network has two hidden units.
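Here's a sketch of that setup in NumPy; the shapes follow the description above, but the exact learning rate is an assumption:

import numpy as np

# XOR inputs and their expected outputs
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([[0], [1], [1], [0]])

input_size = 2       # two input features per sample
hidden_size = 2      # two hidden neurons: small, but enough for XOR
output_size = 1      # a single output for binary classification
learning_rate = 0.1  # assumed value

# Random weights, zero biases
W1 = np.random.randn(input_size, hidden_size)
b1 = np.zeros((1, hidden_size))
W2 = np.random.randn(hidden_size, output_size)
b2 = np.zeros((1, output_size))

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def sigmoid_derivative(a):
    # derivative of the sigmoid, written in terms of its output a = sigmoid(z)
    return a * (1 - a)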
Training Loop
We run a “forward pass” and a “backward pass” many times (we refer to these as epochs).
Forward pass
The forward pass takes the input X, feeds it through the network layer by layer, and computes the output a2. Then it
calculates how far off the prediction is using a loss function.
In this step, we are calculating the loss for the current set of weights.
This loss is a measure of how “wrong” the network is, and it’s what drives the learning process in the backward pass.
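A sketch of that forward pass, assuming the sigmoid activations and mean squared error loss implied by the backward-pass code below:

# Forward pass: input -> hidden -> output
z1 = np.dot(X, W1) + b1   # hidden pre-activation
a1 = sigmoid(z1)          # hidden activation
z2 = np.dot(a1, W2) + b2  # output pre-activation
a2 = sigmoid(z2)          # network output

# Mean squared error between predictions and targets
loss = np.mean((a2 - y) ** 2)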
Backward pass
The backward pass is how the network learns—by adjusting the weights based on how much they contributed to the final
error. This is done by applying the chain rule in reverse across the network.
# Step 1: Derivative of loss with respect to output (a2)
d_loss_a2 = 2 * (a2 - y) / y.size
This computes the gradient of the mean squared error loss with respect to the output. It answers: How much does a
small change in the output affect the loss?
# Step 2: Derivative of sigmoid at output layer
d_a2_z2 = sigmoid_derivative(a2)
d_z2 = d_loss_a2 * d_a2_z2
Now we apply the chain rule. Since the output passed through a sigmoid function, we compute the derivative of the
sigmoid to see how a change in the pre-activation \(z_2\) affects the output.
# Step 3: Gradients for W2 and b2
d_W2 = np.dot(a1.T, d_z2)
d_b2 = np.sum(d_z2, axis=0, keepdims=True)
a1.T is the transposed output from the hidden layer.
d_z2 is the error signal coming back from the output.
The dot product calculates how much each weight in W2 contributed to the error.
The bias gradient is simply the sum across all samples.
# Step 4: Propagate error back to hidden layer
d_a1 = np.dot(d_z2, W2.T)
d_z1 = d_a1 * sigmoid_derivative(a1)
Now we move the error back to the hidden layer:
d_a1 is the effect of the output error on the hidden layer output.
We multiply by the derivative of the hidden layer activation to get the true gradient of the hidden pre-activations.
# Step 5: Gradients for W1 and b1
d_W1 = np.dot(X.T, d_z1)
d_b1 = np.sum(d_z1, axis=0, keepdims=True)
X.T is the input data, transposed.
We compute how each input feature contributed to the hidden layer error.
This entire sequence completes one application of backpropagation—moving from output to hidden to input layer,
using the chain rule and computing gradients at each step.
The final gradients (d_W1, d_W2, d_b1, d_b2) are then used in the weight update step:
# Apply the gradients to update the weights
W2 -= learning_rate * d_W2
b2 -= learning_rate * d_b2
W1 -= learning_rate * d_W1
b1 -= learning_rate * d_b1
This updates the model just a little bit—nudging the weights toward values that reduce the overall loss.
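Putting it together, training simply repeats the forward pass, the backward pass, and the weight update for many epochs. A compact sketch, with the epoch count as an assumption:

for epoch in range(10_000):
    # forward pass
    a1 = sigmoid(np.dot(X, W1) + b1)
    a2 = sigmoid(np.dot(a1, W2) + b2)

    # backward pass (steps 1-5 above)
    d_z2 = 2 * (a2 - y) / y.size * sigmoid_derivative(a2)
    d_W2 = np.dot(a1.T, d_z2)
    d_b2 = np.sum(d_z2, axis=0, keepdims=True)
    d_z1 = np.dot(d_z2, W2.T) * sigmoid_derivative(a1)
    d_W1 = np.dot(X.T, d_z1)
    d_b1 = np.sum(d_z1, axis=0, keepdims=True)

    # weight update
    W2 -= learning_rate * d_W2
    b2 -= learning_rate * d_b2
    W1 -= learning_rate * d_W1
    b1 -= learning_rate * d_b1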
The network is getting better, but not perfect. Let’s look at what these predictions mean:
Input     Expected    Predicted    Interpreted
[0, 0]    0           0.1241       0
[0, 1]    1           0.4808       ~0.5
[1, 0]    1           0.8914       1
[1, 1]    0           0.5080       ~0.5
It’s nailed [1, 0] and is close on [0, 0], but it’s uncertain about [0, 1] and [1, 1]. That’s okay—XOR is a
tough problem when learning from scratch with minimal capacity.
This ambiguity is actually a great teaching point: neural networks don’t just “flip a switch” to get things right.
They learn gradually, and sometimes unevenly, especially when training conditions (like architecture or learning rate)
are modest.
You can tweak the hidden layer size, activation functions, or even the optimizer to get better results—but the core
algorithm stays the same: forward pass, loss computation, backpropagation, weight update.
Conclusion
As it stands, this tiny XOR network is a full demonstration of what makes neural networks learn.
Rust programmers encounter combinators all the time: map(), and_then(), filter(). They’re everywhere in
Option, Result, Iterator, and of course, Future. But if you’re coming from a functional programming
background — or just curious how these things work — you might ask:
What actually is a combinator?
Let’s strip it down to the bare minimum: a value, a function, and some deferred execution.
A Lazy Computation
We’ll start with a structure called Thunk. It wraps a closure that does some work, and it defers that work until we
explicitly ask for it via .run().
It’s essentially a one-shot deferred computation. We stash a closure inside, and we invoke it only when we’re ready.
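Here's a sketch of what Thunk can look like (the PhantomData field is my own addition to keep the result type as an explicit parameter; the original may differ slightly):

use std::marker::PhantomData;

pub struct Thunk<F, R>
where
    F: FnOnce() -> R,
{
    f: Option<F>,
    _result: PhantomData<R>,
}

impl<F, R> Thunk<F, R>
where
    F: FnOnce() -> R,
{
    pub fn new(f: F) -> Self {
        Thunk { f: Some(f), _result: PhantomData }
    }

    pub fn run(mut self) -> R {
        // take() moves the closure out of the Option so we can call it exactly once
        (self.f.take().expect("thunk already run"))()
    }
}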
Here, F is the type of the closure (the function) we’re wrapping, and R is the result it will produce once called. This lets
Thunk be generic over any one-shot computation.
The work happens in run(): self.f.take() pulls the closure out of the Option, and calling it forces the value.
Simple.
Example
Here’s what this looks like in practice:
fn main() {
    let add_one = Thunk::new(|| 3 + 1);
    let result = add_one.run();
    println!("Result: {}", result); // prints 4
}
No magic. No threading. No async. Just a delayed function call.
Composing Thunks
The real value in combinators is that they compose. We can make more complex computations out of simpler ones —
without immediately evaluating them.
Here’s how we can build on top of multiple Thunks:
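A sketch of that composition; the names m, n, and o match the description below, but the arithmetic itself is made up:

fn main() {
    let m = Thunk::new(|| 2 + 3);
    let n = Thunk::new(|| 10 * 4);

    // o captures m and n; nothing has run yet
    let o = Thunk::new(|| m.run() + n.run());

    // Only now do m and n execute, each exactly once
    println!("Result: {}", o.run()); // prints 45
}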
We’ve built a new computation (o) that depends on two others (m and n). They won’t run until o.run() is
called — and then they run in the correct order, and just once.
Look Familiar?
If you’ve spent time in Haskell, this structure might look suspiciously familiar:
fmap :: Functor f => (a -> b) -> f a -> f b
This is a form of fmap. We’re not building a full trait implementation here, but the shape is the same. We can even
imagine extending Thunk with a map() method:
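A sketch of one possible map(), building a new Thunk that applies the extra function when run (one plausible signature, not necessarily the exact one):

impl<F, R> Thunk<F, R>
where
    F: FnOnce() -> R,
{
    // Wrap this thunk in a new one that applies g to the result when run
    pub fn map<G, S>(self, g: G) -> Thunk<impl FnOnce() -> S, S>
    where
        G: FnOnce(R) -> S,
    {
        Thunk::new(move || g(self.run()))
    }
}

So Thunk::new(|| 21).map(|x| x * 2).run() would produce 42, still without doing any work until the final call.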
No typeclasses, no lifetimes — just combinator building blocks.
From Lazy to Async
Now here’s the twist. What if our .run() method couldn’t give us a value right away? What if it needed to register a
waker, yield, and be polled later?
That’s exactly what happens in Rust’s async system. The structure is the same — a value and a function bundled
together — but the execution context changes. Instead of calling .run(), we implement Future and respond to
.poll().
Here’s what that looks like for a simple async Map combinator:
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll};

use pin_project::pin_project;

// Our Map combinator
#[pin_project]
pub struct Map<Fut, F> {
    #[pin]
    future: Fut,
    f: Option<F>, // Option to allow taking ownership in poll
}

impl<Fut, F> Map<Fut, F> {
    pub fn new(future: Fut, f: F) -> Self {
        Self { future, f: Some(f) }
    }
}

impl<Fut, F, T, U> Future for Map<Fut, F>
where
    Fut: Future<Output = T>,
    F: FnOnce(T) -> U,
{
    type Output = U;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        let mut this = self.project();
        match this.future.poll(cx) {
            Poll::Pending => Poll::Pending,
            Poll::Ready(val) => {
                let f = this.f.take().expect("polled Map after completion");
                Poll::Ready(f(val))
            }
        }
    }
}

// Helper function to use it ergonomically
pub fn map<Fut, F, T, U>(future: Fut, f: F) -> Map<Fut, F>
where
    Fut: Future<Output = T>,
    F: FnOnce(T) -> U,
{
    Map::new(future, f)
}
Let’s take a step back and notice something: this structure is almost identical to Thunk. We’re still storing a value
(future) and a function (f), and the combinator (Map) still controls when that function is applied. The
difference is that we now interact with the asynchronous task system via poll(), instead of calling .run() ourselves.
This is how Future combinators in futures and tokio work under the hood — by carefully pinning, polling, and
composing smaller futures into larger ones.
This is essentially a hand-rolled version of what futures::FutureExt::map() gives you for free.
As a simple example, we can use this as follows:
#[tokio::main]
async fn main() {
    let fut = async { 21 };
    let mapped = map(fut, |x| x * 2);
    let result = mapped.await;
    println!("Result: {}", result); // Should print 42
}
Conclusion
We often think of combinators as “just utility functions.” But they’re really more than that: they’re
a way of thinking. Package a value and a transformation together. Delay the work. Compose more when you’re ready.
So the next time you write .map(), remember — it’s just a Thunk waiting to happen.