Sometimes you just want the raw power of assembly, but still enjoy the ergonomics of Rust. In this article, we’ll
walk through how to call routines in an external .s assembly file from your Rust project — the right way, using build.rs.
The functions we’ve defined in the assembly module need to be declared as extern. We do this at the top via extern "C",
with "C" indicating that we’re using the C calling convention: the standard way functions pass arguments and return
values on most platforms.
Note: these functions need to be called in unsafe blocks, as the Rust compiler cannot make any guarantees about how they treat memory and resources while they’re executing.
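Here’s a sketch of what those declarations and calls might look like. The function names zero and add are assumptions for illustration, chosen to match the expected output later in this article:

// Routines implemented in test.s. The names `zero` and `add`
// are assumptions; match them to whatever your .s file exports.
extern "C" {
    fn zero() -> u64;
    fn add(a: u64, b: u64) -> u64;
}

fn main() {
    // Calls into foreign code must be wrapped in `unsafe`.
    unsafe {
        println!("Zero: {}", zero());
        println!("42 + 58 = {}", add(42, 58));
    }
}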
The key here is the build entry in Cargo.toml, which tells Cargo to run our custom build script, build.rs.
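A minimal sketch of the relevant Cargo.toml (the package name is an assumption):

[package]
name = "asm-demo"   # assumed name for this example
version = "0.1.0"
edition = "2021"
build = "build.rs"  # tell Cargo to run our custom build script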
Why do we need build.rs?
Rust’s build system (Cargo) doesn’t natively compile .s files or link in .o files unless you explicitly tell it
to. That’s where build.rs comes in — it’s a custom build script executed before compilation.
Here’s what ours looks like:
use std::process::Command;

fn main() {
    // Compile test.s into test.o
    let status = Command::new("as")
        .args(["test.s", "-o", "test.o"])
        .status()
        .expect("Failed to assemble test.s");

    if !status.success() {
        panic!("Assembly failed");
    }

    // Link the object file
    println!("cargo:rustc-link-search=.");
    println!("cargo:rustc-link-arg=test.o");

    // Rebuild if test.s changes
    println!("cargo:rerun-if-changed=test.s");
}
We’re invoking as, the GNU assembler, to compile the assembly, then passing the resulting object file to the Rust linker.
Build and Run
cargo run
Expected output:
Zero: 0
42 + 58 = 100
Conclusion
You’ve just learned how to:
Write standalone x86_64 assembly and link it with Rust
Use build.rs to compile and link external object files
Safely call assembly functions using Rust’s FFI
This is a powerful setup for performance-critical code, hardware interfacing, or even educational tools. You can take
this further by compiling C code too, or adding multiple .s modules for more complex logic.
Imagine two people, Alice and Bob. They’re standing in a crowded room — everyone can hear them. Yet somehow, they want
to agree on a secret password that only they know.
Sounds impossible, right?
That’s where Diffie–Hellman key exchange comes in. It’s a bit of mathematical magic that lets two people agree on a
shared secret — even while everyone is listening.
Let’s walk through how it works — and then build a toy version in code to see it with your own eyes.
Mixing Paint
Let’s forget numbers for a second. Imagine this:
Alice and Bob agree on a public color — let’s say yellow paint.
Alice secretly picks red, and Bob secretly picks blue.
They mix their secret color with the yellow:
Alice sends Bob the result of red + yellow.
Bob sends Alice the result of blue + yellow.
Now each of them adds their secret color again:
Alice adds red to Bob’s mix: (yellow + blue) + red
Bob adds blue to Alice’s mix: (yellow + red) + blue
Both end up with the same final color: yellow + red + blue!
But someone watching only saw:
The public yellow
The mixes: (yellow + red), (yellow + blue)
They can’t reverse it to figure out the red or blue.
Mixing paint is easy, but un-mixing it is really hard.
From Paint to Numbers
In the real world, computers don’t mix colors — they work with math.
Specifically, Diffie–Hellman uses something called modular arithmetic: math where we “wrap around” at some number.
For example:
\[7 \mod 5 = 2\]
We’ll also use exponentiation — raising a number to a power.
And here’s the core of the trick: it’s easy to compute this:
\[\text{result} = g^{\text{secret}} \mod p\]
But it’s hard to go backward and find the secret, even if you know result, g, and p.
This is the secret sauce behind Diffie–Hellman.
A Toy Implementation
Let’s see this story in action.
import random

# Publicly known numbers
p = 23  # A small prime number
g = 5   # A primitive root modulo p (more on this later)

print("Public values: p =", p, ", g =", g)

# Alice picks a private number
a = random.randint(1, p - 2)
A = pow(g, a, p)  # A = g^a mod p

# Bob picks a private number
b = random.randint(1, p - 2)
B = pow(g, b, p)  # B = g^b mod p

print("Alice sends:", A)
print("Bob sends:  ", B)

# Each computes the shared secret
shared_secret_alice = pow(B, a, p)  # B^a mod p
shared_secret_bob = pow(A, b, p)    # A^b mod p

print("Alice computes shared secret:", shared_secret_alice)
print("Bob computes shared secret:  ", shared_secret_bob)
Running this (your results may vary due to random number selection), you’ll see something like this:
Public values: p = 23 , g = 5
Alice sends: 10
Bob sends: 2
Alice computes shared secret: 8
Bob computes shared secret: 8
The important part here is that Alice and Bob both end up with the same shared secret.
Let’s break down this code, line by line.
p = 23
g = 5
These are public constants. Going back to the paint analogy, you can think of p as the size of the palette and g
as our base “colour”. We are ok with these being known to anybody.
a = random.randint(1, p - 2)
A = pow(g, a, p)
Alice chooses a secret number a, and then computes \(A = g^a \mod p\). This is her public key: the equivalent of
“red + yellow”.
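The final step from the script:

shared_secret_alice = pow(B, a, p)  # B^a mod p
shared_secret_bob = pow(A, b, p)    # A^b mod p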
They both raise the other’s public key to their secret power. And because of how exponentiation works, both arrive at
the same final value:
\[(g^b)^a \mod p = (g^a)^b \mod p\]
This simplifies to:
\[g^{ab} \mod p\]
This is the shared secret.
Try it yourself
Try running the toy code above multiple times. You’ll see that:
Every time, Alice and Bob pick new private numbers.
They still always agree on the same final shared secret.
And yet… if someone was eavesdropping, they’d only see p, g, A, and B. That’s not enough to figure out a,
b, or the final shared secret (unless they can solve a very hard math problem called the discrete logarithm problem —
something computers can’t do quickly, even today).
It’s not perfect
Diffie–Hellman is powerful, but there’s a catch: it doesn’t authenticate the participants.
If a hacker, Mallory, can intercept the messages, she could do this:
Pretend to be Bob when talking to Alice
Pretend to be Alice when talking to Bob
Now she has two separate shared secrets — one with each person — and can man-in-the-middle the whole conversation.
So in practice, Diffie–Hellman is used with authentication — like digital certificates or signed messages — to
prevent this attack.
So, the sorts of applications you’ll see this used in are:
TLS / HTTPS (the “S” in secure websites)
VPNs
Secure messaging (like Signal)
SSH key exchanges
It’s one of the fundamental building blocks of internet security.
Conclusion
Diffie–Hellman feels like a magic trick: two people agree on a secret, in public, without ever saying the secret out
loud.
It’s one of the most beautiful algorithms in cryptography — simple, powerful, and still rock-solid almost 50 years
after it was invented.
Fuzz testing is the art of breaking your software on purpose. By feeding random or malformed input into a program, we
can uncover crashes, logic errors, or even security vulnerabilities — all without writing specific test cases.
In memory-unsafe languages like C, fuzzing is especially powerful. In just a few lines of shell script, we can hammer a
binary until it falls over.
This guide shows how to fuzz a tiny C program using just cat /dev/urandom, and how to track down and analyze the
crash with gdb.
The Target
First off, we need our test candidate. By design, this program is vulnerable through its use of strcpy.
In main, we read up to 1 KB of data from stdin. A pointer to this data is then passed into the vulnerable function,
where a buffer is defined that is well under the 1 KB that could come through the front door.
strcpy doesn’t care though. It’ll keep copying data until it encounters a null terminator.
This is our problem.
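Here’s a minimal sketch of such a program; the 64-byte buffer size and the function name vulnerable are assumptions for illustration:

#include <stdio.h>
#include <string.h>

// Assumed name and buffer size for illustration.
void vulnerable(const char *input) {
    char buffer[64];         // well under the 1 KB that stdin can deliver
    strcpy(buffer, input);   // no bounds check: long input overflows
    printf("%s\n", buffer);
}

int main(void) {
    char data[1024];
    // Read up to 1 KB from stdin
    size_t n = fread(data, 1, sizeof(data) - 1, stdin);
    data[n] = '\0';
    vulnerable(data);
    return 0;
}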
Let’s get this program built with some debugging information:
gcc -g -o vuln vuln.c
Basic “Dumb” Fuzzer
We have plenty of tools at our disposal directly at the Linux console, so we can put together a fuzz tester, albeit a
simple one, without any extra tooling.
Here’s fuzzer.sh:
# allow core dumps
ulimit -c unlimited

# send in some random data
cat /dev/urandom | head -c 100 | ./vuln
100 bytes should be enough to trigger some problems internally.
Running the fuzzer, we get some immediate feedback: the program aborts with a stack smashing detected error.
Where’s the Core Dump?
On modern Linux systems, core dumps don’t always appear in your working directory. Instead, they may be captured by
systemd-coredump and stored elsewhere.
In order to get a list of core dumps, you can use coredumpctl:
coredumpctl list
You’ll get a report of every core dump your system has collected. Use the PID of the crashed process to find the dump
that is specifically yours.
TIME PID UID GID SIG COREFILE EXE SIZE
Sun 2025-04-20 11:02:14 AEST 4775 1000 1000 SIGABRT present /path/to/vuln 19.4K
Debugging the dump
We can get our hands on these core dumps in a couple of ways.
We can launch gdb directly via coredumpctl, which will load the crashing binary and the core file into GDB.
coredumpctl gdb 4775
I added the specific failing PID to my command; otherwise, this will use the latest core dump.
Inside GDB:
bt # backtrace
info registers # cpu state at crash
list # show source code around crash
Alternatively, if you want a physical copy of the dump in your local directory, you can get your hands on it with this:
coredumpctl dump --output=core.vuln
AFL
Once you’ve had your fun with cat /dev/urandom, it’s worth exploring more sophisticated fuzzers that generate inputs
intelligently — like AFL (American Fuzzy Lop).
AFL instruments your binary to trace code coverage and then evolves inputs that explore new paths.
Install
First of all, we need to install afl on our system.
pacman -S afl
Running
Now we can re-compile our executable but this time with AFL’s instrumentation:
afl-cc -g -o vuln-afl vuln.c
Before we can run our test, we need to create an input corpus: a minimal set of valid (or near-valid) inputs.
AFL will mutate these seeds to generate new inputs.
mkdir input
echo "AAAA" > input/seed
Before we run, there are some performance settings that need to be pushed out to the kernel first.
We tell the CPU to run at maximum frequency with the following:
cd /sys/devices/system/cpu
echo performance | tee cpu*/cpufreq/scaling_governor
For more details about these settings, have a look at the CPU frequency scaling
documentation.
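Now we can kick off the fuzzer, pointing it at our seed corpus and an output directory (the directory names match the ones used above):

afl-fuzz -i input -o output -- ./vuln-afl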
You should now see a live updating dashboard, detailing all of the events that are occurring through the many
different runs of your application.
It’s like the /dev/urandom method — but on steroids, with data-driven evolution.
The output folder will hold all the telemetry from the many runs that AFL is performing. Any crashes and hangs are
kept for later inspection: AFL saves the exact inputs that triggered them, and you can replay those through the binary
under gdb.
Conclusion
Fuzzing is cheap, dumb, and shockingly effective. If you’re writing C code, run a fuzzer against your tools. You may
find bugs that formal tests would never hit — and you’ll learn a lot about your program’s internals in the process.
If you’re interested in going deeper, check out more advanced fuzzers like:
AFL (American Fuzzy Lop): coverage-guided fuzzing via input mutation
Genetic algorithms (GAs) are one of those wild ideas in computing where the solution isn’t hand-coded — it’s grown.
They borrow inspiration straight from biology. Just like nature evolved eyes, wings, and brains through selection and
mutation, we can evolve solutions to problems in software. Not by brute-force guessing, but by letting generations of
candidates compete, reproduce, and adapt.
At a high level, a genetic algorithm looks like this:
Create a population of random candidate solutions.
Score each one — how “fit” or useful is it?
Select the best performers.
Breed them together to make the next generation.
Mutate some of them slightly, to add variation.
Repeat until something good enough evolves.
There’s no central intelligence. No clever algorithm trying to find the best answer. Just selection pressure pushing
generations toward better solutions — and that’s often enough.
What’s powerful about GAs is that they’re not tied to any specific kind of problem. If you can describe what a “good”
answer looks like, even fuzzily, a GA might be able to evolve one. People have used them to:
Evolve art or music
Solve optimization problems
Train strategies for games
Design antennas for NASA
In this post, we’re going to build a genetic algorithm from scratch — in pure Python — and show it working on a fun
little challenge: evolving a string of text until it spells out "HELLO WORLD".
It might be toy-sized, but the core principles are exactly the same as the big stuff.
Defining the parts
Here, we’ll break down the genetic algorithm idea into simple, solvable parts.
Solution
First of all, we need to define a solution. The solution is what we want to work towards. It can be considered our
“chromosome” in this example.
TARGET="HELLO WORLD"
Every character of this string can then be considered a “gene”.
Defining Fitness
Now we need a function that tells us how fit our individual is, or how close it is to our defined target:
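def fitness(individual):
    # Score one point for every character that matches the target
    return sum(1 for i, j in zip(individual, TARGET) if i == j)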
We also need a way to generate individuals. random_individual returns a string the same size as our solution, but with
random characters at each index. This provides our starting point:
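import random
import string

def random_char():
    # Genes are uppercase letters plus the space character
    return random.choice(string.ascii_uppercase + " ")

def random_individual():
    return ''.join(random_char() for _ in range(len(TARGET)))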
Genetics
Two key operations give genetic algorithms their evolutionary flavor: crossover and mutation. This is what
gives us generations, allowing this algorithm to grow.
Crossover (Recombination)
In biology, crossover happens when two parents create a child: their DNA gets shuffled together. A bit from mum, a bit
from dad, spliced at some random point. The child ends up with a new mix of traits, some from each.
We can do exactly that with strings of characters (our “DNA”). Here’s the basic idea:
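def crossover(a, b):
    # Take parent a's prefix and parent b's suffix, split at a random point
    split = random.randint(0, len(a) - 1)
    return a[:split] + b[split:]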
This picks a random point in the string, then takes the first part from parent a and the second part from parent b.
So if a is "HELLOXXXX" and b is "YYYYYWORLD", a crossover might give you "HELLYWORLD". New combinations,
new possibilities.
Mutation
Of course, biology isn’t just about inheritance — it also relies on randomness. DNA can get copied imperfectly: a
flipped bit, a swapped base. Most mutations are useless. But every once in a while, one’s brilliant.
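We can model that by giving each character a small chance of being replaced (this is the mutate function from the full listing below):

MUTATION_RATE = 0.01  # probability that any given character is replaced

def mutate(individual):
    return ''.join(
        c if random.random() > MUTATION_RATE else random_char()
        for c in individual
    )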
This gives each character a small chance of being replaced with a new random one, maybe turning an "X" into a "D",
or an "O" into an "E". It adds diversity to the population and prevents us from getting stuck in a rut.
Together, crossover and mutation give us the raw machinery of evolution: recombination and novelty. With just these two
tricks, plus a way to score fitness and select the best candidates, we can grow something surprisingly smart from
totally random beginnings.
Putting it all together
Now, we just loop on this. We do this over and over, until we land at a solution that matches the target we fed the
system to begin with. You can see the “HELLO WORLD” example in action here, and exactly how the algorithm came to its
answer:
Gen 0 | Best: RAVY OTRTK | Score: 2
Gen 1 | Best: RAVY OTRLH | Score: 3
Gen 2 | Best: LZELORMOHLD | Score: 5
Gen 3 | Best: SFLLORMOHLD | Score: 6
Gen 4 | Best: SFLLO OTRLD | Score: 7
Gen 5 | Best: SFLLO OTRLD | Score: 7
Gen 6 | Best: SFLLO OORLD | Score: 8
Gen 7 | Best: SFLLO MORLD | Score: 8
Gen 8 | Best: SFLLO MORLD | Score: 8
Gen 9 | Best: SFLLO MORLD | Score: 8
Gen 10 | Best: SFLLO MORLD | Score: 8
Gen 11 | Best: SFLLO MORLD | Score: 8
Gen 12 | Best: SFLLO SORLD | Score: 8
Gen 13 | Best: NFLLO MORLD | Score: 8
Gen 14 | Best: SFLLO MORLD | Score: 8
Gen 15 | Best: SFLLO WORLD | Score: 9
Gen 16 | Best: SFLLO WORLD | Score: 9
Gen 17 | Best: HFLLO WORLD | Score: 10
Gen 18 | Best: HFLLO WORLD | Score: 10
Gen 19 | Best: HFLLO WORLD | Score: 10
Gen 20 | Best: HFLLO WORLD | Score: 10
Gen 21 | Best: HFLLO WORLD | Score: 10
Gen 22 | Best: HELLO WORLD | Score: 11
It obviously depends on how your random number generator is feeling, so your mileage may vary.
Code listing
A full code listing of this in action is here:
import random
import string

TARGET = "HELLO WORLD".upper()
POP_SIZE = 100
MUTATION_RATE = 0.01
GENERATIONS = 100000

def random_char():
    return random.choice(string.ascii_uppercase + " ")

def random_individual():
    return ''.join(random_char() for _ in range(len(TARGET)))

def fitness(individual):
    return sum(1 for i, j in zip(individual, TARGET) if i == j)

def mutate(individual):
    return ''.join(
        c if random.random() > MUTATION_RATE else random_char()
        for c in individual
    )

def crossover(a, b):
    split = random.randint(0, len(a) - 1)
    return a[:split] + b[split:]

# Initial population
population = [random_individual() for _ in range(POP_SIZE)]

for generation in range(GENERATIONS):
    scored = [(ind, fitness(ind)) for ind in population]
    scored.sort(key=lambda x: -x[1])
    best = scored[0]
    print(f"Gen {generation:4d} | Best: {best[0]} | Score: {best[1]}")

    if best[1] == len(TARGET):
        print("🎉 Target reached!")
        break

    # Keep top 10 as parents
    parents = [ind for ind, _ in scored[:10]]

    # Make new population
    population = [
        mutate(crossover(random.choice(parents), random.choice(parents)))
        for _ in range(POP_SIZE)
    ]
Conclusion
Genetic algorithms are a beautiful way to turn randomness into results. You don’t need deep math or fancy machine
learning models — just a way to measure how good something is, and the patience to let evolution do its thing.
What’s even more exciting is that this toy example uses the exact same principles behind real-world tools that tackle
complex problems in scheduling, design, game playing, and even neural network tuning.
If you’re curious to take this further, there are full-featured libraries and frameworks out there built for serious
applications:
DEAP (Distributed Evolutionary Algorithms in Python) – A flexible framework for evolutionary algorithms. Great for research and custom workflows.
PyGAD – Simple, powerful, and easy to use — especially good for optimizing neural networks and functions.
ECJ – A Java-based evolutionary computing toolkit used in academia and industry.
Jenetics – Another Java library that’s modern, elegant, and geared toward real engineering problems.
These libraries offer more advanced crossover strategies, selection techniques (like tournament or roulette-wheel),
and even support for parallel processing or multi-objective optimization.
But even with a simple string-matching example, you’ve now seen how it all works under the hood — survival of the
fittest, one generation at a time.
In Part 3, we explored ARM calling conventions, debugging,
and cleaned up our UART driver. While assembly has given us fine control over the hardware, writing an entire OS in
assembly would be painful.
It’s time to enter C land.
This post covers:
Modifying the bootloader to transition from assembly to C
Updating the UART driver to be callable from C
Writing our first C function (kmain())
Adjusting the Makefile and linker script for C support
Booting into C
We still need a bit of assembly to set up the stack and call kmain(). Let’s start by modifying our bootloader.
Updated bootloader.s
.section .text
.global _start
_start:
LDR sp, =stack_top @ Set up the stack
BL kmain @ Call the C kernel entry function
B . @ Hang forever
.section .bss
.align 4
.space 1024         @ Reserve 1 KB for the stack; it grows downward
stack_top:          @ Label the top (highest address) so sp starts here
What’s changed?
We load the stack pointer (sp) before calling kmain(). This ensures C has a valid stack to work with.
We branch-and-link (BL) to kmain(), following ARM’s calling conventions.
The infinite loop (B .) prevents execution from continuing into unknown memory if kmain() ever returns.
With this setup, execution will jump to kmain()—which we’ll define next.
Our First C Function: kmain()
Now that we can transition from assembly to C, let’s create our first function.
kmain.c
#include"uart.h"voidkmain(){uart_puts("Hello from C!\n");while(1);}
What’s happening?
We include our uart.h header so we can call uart_puts().
kmain() prints "Hello from C!" using our UART driver.
The infinite while(1); loop prevents execution from continuing into unknown territory.
At this point, our OS will boot from assembly, call kmain(), and print text using our UART driver—but we need to
make a few more changes before this compiles.
Making the UART Driver Callable from C
Right now, uart_puts and uart_putc are assembly functions. To call them from C, we need to export their symbols with
.global directives in the assembly source, and declare matching prototypes in a C header so the compiler knows how to
call them.
We also need to confirm that our linker script lays the kernel out correctly:
Code starts at 0x10000, ensuring it is loaded correctly.
.text, .rodata, .data, and .bss sections are properly defined.
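A minimal uart.h might look like this; it’s a sketch, since the exact header isn’t shown in this excerpt:

#ifndef UART_H
#define UART_H

// Implemented in assembly and exported with .global
void uart_putc(char c);
void uart_puts(const char *s);

#endif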
Build
Now that all of these changes are in place, we can make our kernel and run it. If everything has gone to plan, you
should see our kernel telling us that it’s jumped to C:
Hello from C!
Conclusion
We’ve successfully transitioned from pure assembly to using C for higher-level logic, while keeping low-level
hardware interaction in assembly.
The code for this article is available in the GitHub repo.