Radio signals surround us every day—whether it’s FM radio in your car, WiFi on your laptop, or Bluetooth connecting
your phone to wireless headphones. These signals are all based on the same fundamental principles: frequency,
modulation, and bandwidth.
In today’s article, I want to go through some of the fundamentals:
How do radio signals travel through the air?
What is the difference between AM and FM?
How can multiple signals exist without interfering?
What is digital modulation (FSK & PSK), and why does it matter?
How can we capture and transmit signals using HackRF?
By the end of this article, we should be running some experiments with our own software defined radio devices.
Getting Setup
Before getting started in this next section, you’ll want to make sure that you have some specific software installed
on your computer, as well as a software defined radio device.
I’m using a HackRF One as my device of choice as it integrates with all of
the software that I use.
Make sure you have the following packages installed:
hackrf
gqrx
gnuradio
sudo pacman -S hackrf gqrx gnuradio
Create yourself a new python project and virtual environment, and install the libraries that will wrap some of these
tools to give you an easy to use programming environment for your software defined radio.
The output of some of these examples is a graph, so we use matplotlib to save the plots to look at later.
Be Responsible!
A note before we get going - you will be installing software that will allow you to transmit signals that could
potentially be dangerous and against the law, so before transmitting:
Know the laws – Unlicensed transmission can interfere with emergency services.
Use ISM Bands – 433 MHz, 915 MHz, 2.4 GHz are allowed for low-power use.
Start in Receive Mode – Learning to capture first avoids accidental interference.
Basics of Radio Waves
What is a Radio Signal?
A radio signal is a type of electromagnetic wave that carries information through the air. These waves travel at the
speed of light and can carry audio, video, or digital data.
Radio waves are defined by:
Frequency (Hz) – How fast the wave oscillates.
Wavelength (m) – The distance between peaks.
Amplitude (A) – The height of the wave (strength of the signal).
A high-frequency signal oscillates faster and has a shorter wavelength. A low-frequency signal oscillates slower and
has a longer wavelength.
Since radio waves travel at the speed of light, their wavelength (\(\lambda\)) can be calculated using:
\[\lambda = \frac{c}{f}\]
Where:
\(\lambda\) = Wavelength in meters
\(c\) = Speed of light (\(\approx 3.0 \times 10^8\) m/s)
\(f\) = Frequency in Hz
What is Frequency?
Frequency is measured in Hertz (Hz), meaning cycles per second. You may have heard of kilohertz, megahertz, and
gigahertz. These are all common frequency units:
1 kHz (kilohertz) = 1,000 Hz
1 MHz (megahertz) = 1,000,000 Hz
1 GHz (gigahertz) = 1,000,000,000 Hz
Every device that uses radio has a specific frequency range. For example:
AM Radio: 530 kHz – 1.7 MHz
FM Radio: 88 MHz – 108 MHz
WiFi (2.4 GHz Band): 2.4 GHz – 2.5 GHz
Bluetooth: 2.4 GHz
GPS Satellites: 1.2 GHz – 1.6 GHz
Each of these frequencies belongs to the radio spectrum, which is carefully divided so that signals don’t interfere
with each other.
What is Bandwidth?
Bandwidth is the amount of frequency space a signal occupies.
A narrowband signal (like AM radio) takes up less space. A wideband signal (like WiFi) takes up more space to
carry more data.
Example:
AM Radio Bandwidth: ~10 kHz per station
FM Radio Bandwidth: ~200 kHz per station
WiFi Bandwidth: 20–80 MHz (much larger, more data)
The more bandwidth a signal has, the more data it can carry (and, generally, the better the quality).
How Can Multiple Signals Exist Together?
One analogy you can use is imagining a highway—each lane is a different frequency. Cars (signals) stay in their lanes and don’t interfere unless they drift into another lane (overlap in frequency). This is why:
FM stations are spaced apart (88.1 MHz, 88.3 MHz, etc.).
WiFi has channels (1, 6, 11) to avoid congestion.
TV channels each have a dedicated frequency band.
This method of dividing the spectrum is called Frequency Division Multiplexing (FDM).
Using the following python code, we can visualise FDM in action by sweeping the FM spectrum:
# Basic FM Spectrum Capture
from hackrf import *
import matplotlib.pyplot as plt
import numpy as np
from scipy.signal import welch

with HackRF() as hrf:
    hrf.sample_rate = 20e6     # 20 MHz sample rate
    hrf.center_freq = 104.5e6  # FM Radio
    samples = hrf.read_samples(2e6)  # Capture 2 million samples

# Compute PSD using Welch's method (handling complex IQ data)
freqs, psd_values = welch(samples, fs=hrf.sample_rate, nperseg=1024, return_onesided=False)

# Convert frequency axis to MHz
freqs_mhz = (freqs - (hrf.sample_rate / 2)) / 1e6 + (hrf.center_freq / 1e6)

# Plot Power Spectral Density
plt.figure(figsize=(10, 5))
plt.plot(freqs_mhz, 10 * np.log10(psd_values))  # Convert power to dB
plt.xlabel('Frequency (MHz)')
plt.ylabel('Power Spectral Density (dB/Hz)')
plt.title(f'FM Radio Spectrum at {hrf.center_freq/1e6} MHz')
plt.grid()

# Save and show
plt.savefig("fm_spectrum.png")
plt.show(block=True)
Running this code gave me the following resulting plot (it will be different for you depending on where you live!):
Each sharp peak that you see here represents an FM station at a unique frequency. These are the lanes.
Understanding Modulation
What is Modulation?
Radio signals don’t carry useful information by themselves. Instead, they use modulation to encode voice, music, or
data.
There are two main types of modulation:
Analog Modulation – Used for traditional radio (AM/FM).
Digital Modulation – Used for WiFi, Bluetooth, GPS, and modern systems.
AM (Amplitude Modulation)
AM works by varying the height (amplitude) of the carrier wave to encode audio.
As an example, the carrier frequency stays the same (e.g., 900 kHz), but the amplitude changes based on the sound wave.
AM is prone to static noise (because any electrical interference changes amplitude).
You can capture a sample of AM signals using the hackrf_transfer utility that was installed on your system:
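The invocation might look something like the following (the output file name is an arbitrary choice here; -r receives samples into a file, -f sets the centre frequency in Hz, and -s sets the sample rate):

```shell
hackrf_transfer -r am_capture.bin -f 900000 -s 10000000
```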
This will capture AM signals at 900 kHz into a file for later analysis.
We can write some python to capture an AM signal and plot the samples so we can visualise this information.
# AM Signal Demodulation
from hackrf import *
import matplotlib.pyplot as plt
import numpy as np

with HackRF() as hrf:
    hrf.sample_rate = 10e6   # 10 MHz sample rate
    hrf.center_freq = 693e6  # 693 MHz AM station
    samples = hrf.read_samples(1e6)  # Capture 1M samples

# AM Demodulation - Extract Magnitude (Envelope Detection)
demodulated = np.abs(samples)

# Plot Demodulated Signal
plt.figure(figsize=(10, 5))
plt.plot(demodulated[:5000])  # Plot first 5000 samples
plt.xlabel("Time")
plt.ylabel("Amplitude")
plt.title("AM Demodulated Signal")
plt.grid()

# Save and show
plt.savefig("am_demodulated.png")
plt.show(block=True)
Running this code should give you a plot of what’s happening at 693 MHz:
The plot above represents the amplitude envelope of a real AM radio transmission.
The X-axis represents time, while the Y-axis represents amplitude.
The variations in amplitude correspond to the audio signal encoded by the AM station.
FM (Frequency Modulation)
FM works by varying the frequency of the carrier wave to encode audio.
As an example, the amplitude stays constant, but the frequency changes based on the audio wave.
FM is clearer than AM because it ignores amplitude noise.
You can capture a sample of FM signals with the following:
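Again via hackrf_transfer (file name is arbitrary; the frequency and sample rate here match the Python example that follows):

```shell
hackrf_transfer -r fm_capture.bin -f 104500000 -s 20000000
```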
We can write some python code to capture and demodulate an FM signal as well:
from hackrf import *
import matplotlib.pyplot as plt
import numpy as np

with HackRF() as hrf:
    hrf.sample_rate = 2e6      # 2 MHz sample rate
    hrf.center_freq = 104.5e6  # Example FM station
    samples = hrf.read_samples(1e6)

# FM Demodulation - Phase Differentiation
phase = np.angle(samples)        # Extract phase
fm_demodulated = np.diff(phase)  # Differentiate phase

# Plot FM Demodulated Signal
plt.figure(figsize=(10, 5))
plt.plot(fm_demodulated[:5000])  # Plot first 5000 samples
plt.xlabel("Time")
plt.ylabel("Frequency Deviation")
plt.title("FM Demodulated Signal")
plt.grid()

# Save and show
plt.savefig("fm_demodulated.png")
plt.show(block=True)
If you pick a frequency that has a local radio station, you should get a strong signal like this:
Unlike AM, where the signal’s amplitude changes, FM signals encode audio by varying the frequency of the carrier wave.
The graph above shows the frequency deviation over time:
The X-axis represents time, showing how the signal changes.
The Y-axis represents frequency deviation, showing how much the carrier frequency shifts.
The spikes and variations represent audio modulation, where frequency shifts encode sound.
If your FM demodulation appears too noisy:
Try tuning to a stronger station (e.g., 100.3 MHz).
Increase the sample rate for a clearer signal.
Apply a low-pass filter to reduce noise in post-processing.
Bandwidth of a Modulated Signal
Modulated signals require bandwidth (\(B\)), and the amount depends on the modulation type.
AM
The total bandwidth required for AM signals is:
\[B = 2f_m\]
Where:
\(B\) = Bandwidth in Hz
\(f_m\) = Maximum audio modulation frequency in Hz
If an AM station transmits audio up to 5 kHz, the bandwidth is:
\[B = 2 \times 5\ \text{kHz} = 10\ \text{kHz}\]
This explains why AM radio stations typically require ~10 kHz per station.
FM
The bandwidth required for an FM signal follows Carson’s Rule:
\[B = 2 (f_d + f_m)\]
Where:
\(f_d\) = Peak frequency deviation (how much the frequency shifts)
\(f_m\) = Maximum audio frequency in Hz
For an FM station with a deviation of 75 kHz and max audio frequency of 15 kHz, the total bandwidth is:
\[B = 2 (75 + 15)\ \text{kHz} = 180\ \text{kHz}\]
This explains why FM radio stations require much more bandwidth (~200 kHz per station).
Digital Modulation
For digital signals, we need to transmit binary data (1’s and 0’s). These methods of modulation are
focused on doing this more robustly and efficiently than their analog counterparts.
What is FSK (Frequency Shift Keying)?
FSK is digital FM—instead of smoothly varying frequency like FM radio, it switches between two frequencies for 0’s and
1’s. This method of modulation is used in technologies like Bluetooth, LoRa, and old-school modems.
Example:
A “0” might be transmitted as a lower frequency (e.g., 915 MHz).
A “1” might be transmitted as a higher frequency (e.g., 917 MHz).
The receiver detects these frequency changes and reconstructs the binary data.
What is PSK (Phase Shift Keying)?
PSK is digital AM—instead of changing amplitude, it shifts the phase of the wave. This method of modulation is used
in technologies like WiFi, GPS, 4G LTE, Satellites.
Example:
0° phase shift = Binary 0
180° phase shift = Binary 1
More advanced PSK (like QPSK) uses four phase shifts (0°, 90°, 180°, 270°) to send two bits per symbol (faster data transmission).
Wrapping Up
In this post, we explored the fundamentals of radio signals—what they are, how they work, and how different modulation
techniques like AM and FM allow signals to carry audio through the air.
This really is only the start of what you can get done with software defined radio. Here are some further resources to
check out:
Previously, we’ve explored WASM in rust as well as some more advanced
concepts with Pixel Buffer Rendering again from
Rust. In today’s article, we’ll go through WebAssembly from a more fundamental perspective.
WebAssembly (Wasm) is a powerful technology that enables high-performance execution in web browsers and beyond. If
you’re just getting started, this guide will walk you through writing a simple WebAssembly program from scratch,
running it in a browser using JavaScript.
What is WebAssembly?
WebAssembly is a low-level binary instruction format that runs at near-native speed. It provides a sandboxed execution
environment, making it secure and highly portable. While it was initially designed for the web, Wasm is now expanding
into cloud computing, serverless, and embedded systems.
Unlike JavaScript, Wasm allows near-native performance, making it ideal for gaming, video processing, and even AI in
the browser.
First program
Before we start, we need to make sure all of the tools are available on your system. Make sure you have
wabt installed on your system:
sudo pacman -S wabt
WAT
We’ll start by writing a WebAssembly module using the WebAssembly Text Format (WAT).
Create a file called add.wat with the following code:
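A minimal module along those lines (this matches the description below; the parameter names are a matter of taste):

```wat
(module
  (func $add (param $a i32) (param $b i32) (result i32)
    local.get $a
    local.get $b
    i32.add)
  (export "add" (func $add)))
```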
This module defines a function $add that takes two 32-bit integers (i32) as parameters and returns their sum.
local.get retrieves the parameters.
i32.add performs the addition.
The function is exported as "add", making it accessible from JavaScript.
wat2wasm
To convert our add.wat file into a .wasm binary, we’ll use a tool called wat2wasm from the
WebAssembly Binary Toolkit (wabt) that we installed earlier:
wat2wasm add.wat -o add.wasm
This produces a binary add.wasm file, ready for execution.
Running WebAssembly from Javascript
Now, let’s create a JavaScript file (index.js) to load and execute our Wasm module:
async function runWasm() {
  // Fetch and compile the Wasm module
  const response = await fetch("add.wasm");
  const buffer = await response.arrayBuffer();
  const wasmModule = await WebAssembly.instantiate(buffer);

  // Get the exported add function
  const add = wasmModule.instance.exports.add;

  // Call the function
  console.log("5 + 7 = ", add(5, 7));
}

runWasm();
We can execute this JavaScript by referencing it from an HTML file and opening that page in a browser.
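A minimal page for this might look like the following (note that fetch won’t work from a file:// URL, so serve the directory with a local web server, e.g. python -m http.server):

```html
<!DOCTYPE html>
<html>
  <body>
    <script src="index.js"></script>
  </body>
</html>
```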
In the world of computational geometry, Delaunay triangulation stands out as one of the most versatile and powerful
algorithms. Its ability to transform a scattered set of points into a structured mesh of triangles has applications
ranging from terrain modeling to wireless network optimization.
This blog explores the concept of Delaunay triangulation, the algorithms that implement it, and its real-world
applications.
What is Delaunay Triangulation?
Delaunay triangulation is a method for connecting a set of points in a plane (or higher dimensions) to form a network
of triangles. The primary property of this triangulation is that no point lies inside the circumcircle of any triangle.
This ensures the triangulation is “optimal” in the sense that it avoids skinny triangles and maximizes the smallest
angles in the mesh.
A key relationship is its duality with the Voronoi diagram: Delaunay
triangulation and Voronoi diagrams together provide complementary ways to describe the spatial relationships between
points.
Why is Delaunay Triangulation Important?
The importance of Delaunay triangulation stems from its geometric and computational properties:
Optimal Mesh Quality: By avoiding narrow angles, it produces meshes suitable for simulations, interpolation, and rendering.
Simplicity and Efficiency: It reduces computational overhead by connecting points with minimal redundant edges.
Wide Applicability: From geographic information systems (GIS) to computer graphics and engineering, Delaunay triangulation plays a foundational role.
Real-World Applications
Geographic Information Systems (GIS)
Terrain Modeling: Delaunay triangulation is used to create Triangulated Irregular Networks (TINs), which model landscapes by connecting elevation points into a mesh.
Watershed Analysis: Helps analyze water flow and drainage patterns.
Computer Graphics
Mesh Generation: Triangles are the fundamental building blocks for 3D modeling and rendering.
Collision Detection: Used in simulations to detect interactions between objects.
Telecommunications
Wireless Network Optimization: Helps optimize the placement of cell towers and the connections between them.
Voronoi-based Coverage Analysis: Delaunay edges represent backhaul connections between towers.
Robotics and Pathfinding
Motion Planning: Robots use triangulated graphs to navigate efficiently while avoiding obstacles.
Terrain Navigation: Triangulation simplifies understanding of the environment for autonomous vehicles.
Engineering and Simulation
Finite Element Analysis (FEA): Generates triangular meshes for simulating physical systems, such as stress distribution in materials.
Fluid Dynamics: Simulates the flow of fluids over surfaces.
Environmental Science
Flood Modeling: Simulates how water flows across landscapes.
Resource Management: Models the distribution of natural resources like water or minerals.
A Practical Example
To illustrate the concept, let’s consider a set of points representing small towns scattered across a region. Using
Delaunay triangulation:
The towns (points) are connected with lines (edges) to form a network.
These edges represent potential road connections, ensuring the shortest and most efficient routes between towns.
By avoiding sharp angles, this network is both practical and cost-effective for infrastructure planning.
Here’s a Python script that demonstrates this idea:
import numpy as np
import matplotlib.pyplot as plt
from scipy.spatial import Delaunay

# Generate random points in 2D space
np.random.seed(42)  # For reproducibility
points = np.random.rand(20, 2)  # 20 points in 2D

# Perform Delaunay triangulation
tri = Delaunay(points)

# Plot the points and the triangulation
plt.figure(figsize=(8, 6))
plt.triplot(points[:, 0], points[:, 1], tri.simplices, color='blue', linewidth=0.8)
plt.scatter(points[:, 0], points[:, 1], color='red', s=50, label='Points')
plt.title("Delaunay Triangulation Example")
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
plt.legend()
plt.grid(True)
plt.show()
The following is the output from this program.
Note how the paths between the points are the most optimal for connecting the towns efficiently. The triangulation
avoids unnecessary overlaps or excessively sharp angles, ensuring practicality and simplicity in the network design.
Limitations and Challenges
While Delaunay triangulation is powerful, it has its challenges:
Degenerate Cases: Points that are collinear or on the same circle can cause issues.
Scalability: Large datasets may require optimized algorithms to compute triangulations efficiently.
Extensions to Higher Dimensions: In 3D or higher, the algorithm becomes more complex and computationally expensive.
Conclusion
Delaunay triangulation is a cornerstone of computational geometry, offering an elegant way to structure and connect
scattered points. Its versatility makes it applicable across diverse domains, from GIS to robotics and environmental
science. Whether you’re modeling terrains, optimizing networks, or simulating physical systems, Delaunay triangulation
is an indispensable tool for solving real-world problems.
Rust is celebrated for its emphasis on safety and performance, largely thanks to its robust compile-time checks.
However, there are situations where you need to bypass these checks to perform low-level operations—this is where
Rust’s unsafe keyword comes in. While unsafe opens the door to powerful features, it also comes with significant
risks.
The solution?
Encapsulating unsafe code in safe abstractions.
This post explores what that means, why it’s important, and how to do it effectively.
Understanding unsafe in Rust
Rust enforces strict memory safety guarantees by default. However, some operations are inherently unsafe and require
explicit acknowledgment from the programmer. These include:
Raw pointer manipulation: Directly accessing memory without bounds or validity checks.
Foreign Function Interface (FFI): Interacting with non-Rust code (e.g., calling C functions).
Manual memory management: Allocating and freeing memory without Rust’s usual safeguards.
Concurrency primitives: Implementing data structures that require custom synchronization logic.
When you write unsafe code, you’re essentially telling the compiler, “I know what I’m doing; trust me.”
While this is sometimes necessary, it’s critical to minimize the potential for misuse by others.
Why Wrap Unsafe Code in Safe Abstractions?
Using unsafe is a trade-off. It gives you access to low-level features and optimizations but requires you to
manually uphold the invariants that Rust would otherwise enforce. Safe abstractions address this challenge by:
Avoiding Undefined Behavior: Preventing common pitfalls like null pointer dereferences, data races, or buffer overflows.
Improving Maintainability: Reducing the scattering of unsafe blocks across the codebase makes it easier to audit and debug.
Providing Ease of Use: Enabling most developers to rely on Rust’s safety guarantees without needing to understand the intricacies of the underlying unsafe implementation.
What is a Safe Abstraction?
A safe abstraction is an API or module where the internal implementation may use unsafe code, but the external
interface ensures that incorrect usage is either impossible or extremely difficult.
Let’s look at how to create one.
Example: Safe Wrapping of Unsafe Memory Allocation
Here’s a simplified example of wrapping unsafe memory management into a safe abstraction:
pub struct SafeAllocator {
    // Internal raw pointer or other unsafe constructs
    ptr: *mut u8,
    size: usize,
}

impl SafeAllocator {
    pub fn new(size: usize) -> Self {
        let ptr = unsafe { libc::malloc(size) as *mut u8 };
        if ptr.is_null() {
            panic!("Failed to allocate memory");
        }
        Self { ptr, size }
    }

    pub fn allocate(&self, offset: usize, len: usize) -> &[u8] {
        if offset + len > self.size {
            panic!("Out of bounds access");
        }
        unsafe { std::slice::from_raw_parts(self.ptr.add(offset), len) }
    }

    pub fn deallocate(self) {
        // Dropping `self` frees the memory via the Drop impl below; calling
        // libc::free here as well would cause a double free.
        drop(self);
    }
}

impl Drop for SafeAllocator {
    fn drop(&mut self) {
        unsafe {
            libc::free(self.ptr as *mut libc::c_void);
        }
    }
}
In this example:
unsafe is confined to specific, well-defined sections of the code.
The API ensures that users cannot misuse the allocator (e.g., by accessing out-of-bounds memory).
Drop ensures memory is automatically freed when the allocator goes out of scope.
Example Usage of SafeAllocator
Here’s how you might use the SafeAllocator in practice:
fn main() {
    // Create a new SafeAllocator with 1024 bytes of memory
    let allocator = SafeAllocator::new(1024);

    // Allocate a slice of 128 bytes starting from offset 0
    let slice = allocator.allocate(0, 128);
    println!("Allocated slice of length: {}", slice.len());

    // The allocator will automatically deallocate memory when it goes out of scope
}
This usage demonstrates:
How to create and interact with the SafeAllocator API.
That memory is automatically managed via Rust’s Drop trait, preventing leaks.
Leveraging Rust’s Type System
Rust’s type system is another powerful tool for enforcing invariants. For example, you can use:
Lifetimes: To ensure references don’t outlive the data they point to.
PhantomData: To associate types or lifetimes with otherwise untyped data.
Ownership and Borrowing Rules: To enforce safe access patterns at compile time.
Documentation of Safety Contracts
Any unsafe code should include clear documentation of the invariants it relies on. For example:
// Safety:
// - `ptr` must be non-null and point to a valid memory region.
// - `len` must not exceed the bounds of the allocated memory.
unsafe { std::slice::from_raw_parts(ptr, len) }
This makes it easier for future maintainers to understand and verify the correctness of the code.
Real-World Examples of Safe Abstractions
Many Rust libraries provide excellent examples of safe abstractions over unsafe code:
std::sync::Mutex: Internally uses unsafe for thread synchronization but exposes a safe API for locking and unlocking.
Vec: The Rust standard library’s Vec type uses unsafe for raw memory allocation and resizing but ensures bounds checks and proper memory management externally.
crossbeam: Provides safe concurrency primitives built on low-level atomic operations.
Costs and Benefits
While writing safe abstractions requires extra effort and careful thought, the benefits outweigh the costs:
Benefits:
Reduced Risk of Bugs: Encapsulating unsafe code minimizes the chance of introducing undefined behavior.
Improved Developer Experience: Safe APIs make it easier for others to use your code without worrying about low-level details.
Easier Auditing: With unsafe code isolated, it’s easier to review and verify its correctness.
Costs:
Initial Effort: Designing a robust safe abstraction takes time and expertise.
Performance Overhead: In rare cases, adding safety layers may incur slight overhead (though usually negligible in well-designed abstractions).
Conclusion
Writing safe abstractions for unsafe Rust code is both an art and a science. It involves understanding the invariants
of your unsafe code, leveraging Rust’s type system to enforce safety, and documenting your assumptions clearly. By
doing so, you can harness the power of unsafe while maintaining Rust’s guarantees of memory safety and concurrency
correctness—the best of both worlds.
In today’s post, we’ll build a simple key value server; but we’ll do it in an iterative way. We’ll build it up simple
and then add safety, concurrency, and networking as we go.
Implementation
Now we’ll get started with our iterations. The finished code will be available at the end of this post.
Baseline
All of our implementations will deal with a KeyValueStore struct. This struct will hold all of the variables that
we want to keep track of in our server.
String is pretty limiting to store as far as the value side is concerned. We can upgrade this to specifically use
data types that we will find useful via an enum:
#[derive(Debug, Clone)]
enum Value {
    String(String),
    Integer(i64),
    Float(f64),
    Boolean(bool),
    Binary(Vec<u8>),
    // Add more variants as needed
}
We can swap out the value side of our data member now, too.
struct KeyValueStore {
    data: HashMap<String, Value>,
}
The implementation simply swaps the String for Value:
We’re now able to not only store strings. We can store integers, floats, binary, and booleans. This makes our key value
store a lot more versatile.
Thread Safety
We will have multiple threads of execution trying to perform actions on this structure at the same time, so we will
add some thread safety to the process now. Wrapping data in Arc will give us a thread safe, reference counting
pointer. We’re also going to need to lock this data structure for reading and for writing. We can use RwLock to
take care of that for us.
We update our data structure to include these new types:
These functions are now thread-safe, which means calling code can be multithreaded and we can guarantee that our
data structure will be treated consistently.
fn main() {
    let store = Arc::new(KeyValueStore::new());

    // Create a vector to hold thread handles
    let mut handles = vec![];

    // Spawn threads to perform inserts
    for i in 0..5 {
        let store = Arc::clone(&store);
        let handle = thread::spawn(move || {
            let key = format!("key{}", i);
            let value = Value::Integer(i * 10);
            store.insert(key.clone(), value);
            println!("Thread {} inserted: {}", i, key);
        });
        handles.push(handle);
    }

    // Spawn threads to read values
    for i in 0..5 {
        let store = Arc::clone(&store);
        let handle = thread::spawn(move || {
            let key = format!("key{}", i);
            if let Some(value) = store.get(&key) {
                println!("Thread {} read: {} -> {:?}", i, key, value);
            } else {
                println!("Thread {} could not find: {}", i, key);
            }
        });
        handles.push(handle);
    }

    // Spawn threads to delete keys
    for i in 0..5 {
        let store = Arc::clone(&store);
        let handle = thread::spawn(move || {
            let key = format!("key{}", i);
            store.delete(&key);
            println!("Thread {} deleted: {}", i, key);
        });
        handles.push(handle);
    }

    // Wait for all threads to complete
    for handle in handles {
        handle.join().unwrap();
    }

    println!("Final state of the store: {:?}", store.data.read().unwrap());
}
Error handling
You can see that we’re using unwrap in the implementation functions, which might be ok for tests or short scripts. If
we’re going to expect to run this code in production, we’d be best replacing these with actual error handling counterparts.
In order to do that, we need to define our error domain first. We create an enum called StoreError. As we fill out
our implementation, we’ll run into a number of different error cases. We’ll use StoreError to centralise all of these
errors so we can express them clearly.
We’ve implemented PoisonError for our StoreError because the PoisonError type is an error which can be returned
whenever a lock is acquired. If something goes wrong and we’ve acquired a lock, it’s a PoisonError that’s used.
Our insert, get, and delete methods now need an upgrade. We’ll be returning Result<T, E> values from our
functions now to accommodate potential failures.
fn insert(&self, key: String, value: Value) -> Result<(), StoreError> {
    let mut locked = self.data.write()?;
    locked.insert(key, value);
    Ok(())
}

fn get(&self, key: &str) -> Result<Option<Value>, StoreError> {
    let locked = self.data.read()?;
    Ok(locked.get(key).cloned()) // Clone the value to return an owned copy
}

fn delete(&self, key: &str) -> Result<(), StoreError> {
    let mut locked = self.data.write()?;
    if locked.remove(key).is_none() {
        return Err(StoreError::KeyNotFound(key.to_string()));
    }
    Ok(())
}
We’ve removed the use of unwrap now, swapping out to using the ? operator. This will allow us to actually handle
any failure that is bubbled out of calling code.
Using the File System
We need to be able to persist the state of our key value store out to disk for durability. In order to do this, we need
to keep track of where we’ll write the file. We add a file_path member to our structure:
Starting out this implementation simply, we just write a load and save function that we can call at any time. Before
we do this we need some extra dependencies added for serialisation:
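The Cargo additions might look like this (version numbers are illustrative); the Value enum also needs #[derive(Serialize, Deserialize)] added for this to work:

```toml
[dependencies]
serde = { version = "1", features = ["derive"] }
serde_json = "1"
```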
This will allow us to reduce our internal state to JSON.
Loading the database off disk
/// Load the state from a file
fn load(&self) -> Result<(), StoreError> {
    if let Some(ref path) = self.file_path {
        match fs::read_to_string(path) {
            Ok(contents) => {
                let deserialized: HashMap<String, Value> = serde_json::from_str(&contents)?;
                let mut locked = self.data.write()?;
                *locked = deserialized; // Replace the current state with the loaded one
                Ok(())
            }
            Err(e) if e.kind() == ErrorKind::NotFound => {
                // File doesn't exist, just return Ok (no data to load)
                Ok(())
            }
            Err(e) => Err(e.into()),
        }
    } else {
        Err(StoreError::IoError("File path not set".to_string()))
    }
}
We need to make sure that a file_path was specified. We read the entire file into contents as one big
string. Using serde_json::from_str we can turn that contents into the deserialised representation. From there, we
simply swap out the underlying content.
We’ve got some new errors to deal with here in IoError.
This will be used for our write implementation which looks like this:
/// Save the current state to a file
fn save(&self) -> Result<(), StoreError> {
    if let Some(ref path) = self.file_path {
        let locked = self.data.read()?;
        let serialized = serde_json::to_string(&*locked)?;
        fs::write(path, serialized)?;
        Ok(())
    } else {
        Err(StoreError::IoError("File path not set".to_string()))
    }
}
The magic here really is the serde_json::to_string taking our internal state and writing it as json.
Networking
Finally, we’ll add some networking to the solution. A really basic network interface will allow remote clients to
perform the get, set, and delete operations for us.
The handle_client function is the heart of the server process, performing the needed processing on incoming requests
and routing them to the database instance:
fn handle_client(mut stream: TcpStream, store: Arc<KeyValueStore>) {
    let mut buffer = [0; 512];

    // Read the incoming request
    match stream.read(&mut buffer) {
        Ok(_) => {
            let request = String::from_utf8_lossy(&buffer);
            let mut parts = request.trim().split_whitespace();
            let command = parts.next();

            let response = match command {
                Some("SET") => {
                    let key = parts.next().unwrap_or_default().to_string();
                    let value = parts.next().unwrap_or_default().to_string();
                    store.insert(key, Value::String(value));
                    "OK\n".to_string()
                }
                Some("GET") => {
                    let key = parts.next().unwrap_or_default();
                    if let Ok(Some(value)) = store.get(key) {
                        format!("{:?}\n", value)
                    } else {
                        "Key not found\n".to_string()
                    }
                }
                Some("DEL") => {
                    let key = parts.next().unwrap_or_default();
                    store.delete(key);
                    "OK\n".to_string()
                }
                _ => "Unknown command\n".to_string(),
            };

            // Send the response back to the client
            stream.write_all(response.as_bytes()).unwrap();
        }
        Err(e) => eprintln!("Failed to read from socket: {}", e),
    }
}
Our networking “protocol” looks like this:
-- set the key "key1" to the value "hello"
SET key1 hello
-- get the value of the key "key1"
GET key1
-- remove the value and key "key1"
DEL key1
We read in the request data from the client into request. This gets split on whitespace into parts, with command
given the first of these parts. The code expects command to be either SET, GET, or DEL, which is then
handled in the following pattern match.
This function gets mounted onto the server in the main function which now looks like this:
fn main() {
    let store = Arc::new(KeyValueStore::new(None));
    let listener = TcpListener::bind("127.0.0.1:7878").unwrap();
    println!("Server running on 127.0.0.1:7878");

    for stream in listener.incoming() {
        match stream {
            Ok(stream) => {
                let store = Arc::clone(&store);
                std::thread::spawn(move || handle_client(stream, store));
            }
            Err(e) => eprintln!("Connection failed: {}", e),
        }
    }
}
We’re starting our server on port 7878 and handling each connection with our handle_client function.
Running this and giving it a test with telnet gives us the following:
➜ telnet 127.0.0.1 7878
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
SET key1 hello
OK
Connection closed by foreign host.
➜ telnet 127.0.0.1 7878
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
GET key1
String("hello")
Connection closed by foreign host.
➜ telnet 127.0.0.1 7878
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
DEL key1
OK
Connection closed by foreign host.
➜ telnet 127.0.0.1 7878
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
GET key1
Key not found
Connection closed by foreign host.
So, it works. It’s crude and needs more work to be production ready - but this is a start.
Conclusion
In this article, we walked through building a thread-safe, persistent key-value store in Rust. We started with a simple
in-memory implementation and iteratively improved it by:
Adding support for multiple data types using an enum.
Ensuring thread safety with RwLock and Arc.
Replacing unwrap with proper error handling.
Adding file persistence using JSON serialization and deserialization.
Adding some basic network access.
This provides a solid foundation for a more robust and scalable key-value server. Next steps could include:
Implementing advanced features like snapshots or replication.
Optimizing for performance with tools like async I/O or a custom storage engine.