The ? operator in Rust is one of the most powerful features for handling errors concisely and gracefully. However,
it’s often misunderstood as just syntactic sugar for .unwrap(). In this post, we’ll dive into how the ? operator
works, its differences from .unwrap(), and practical examples to highlight its usage.
What is it?
The ? operator is a shorthand for propagating errors in Rust. It simplifies error handling in functions that return a
Result or Option. Here’s what it does:
For Result:
If the value is Ok(v), the expression evaluates to v.
If the value is Err(e), the enclosing function returns early, passing the error to the caller (converting it with From if the error types differ).
For Option:
If the value is Some(v), the expression evaluates to v.
If the value is None, the enclosing function returns None early.
This allows you to avoid manually matching on Result or Option in many cases, keeping your code clean and readable.
How ? Differs from .unwrap()
At first glance, the ? operator might look like a safer version of .unwrap(), but they serve different purposes:
Error Propagation:
? propagates the error to the caller, allowing the program to handle it later.
.unwrap() panics and crashes the program if the value is Err or None.
Use in Production:
? is ideal for production code where you want robust error handling.
.unwrap() should only be used when you are absolutely certain the value will never be an error (e.g., in tests or prototypes).
Examples
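Here’s a minimal sketch of what that looks like (the function name read_file is illustrative):

```rust
use std::fs;
use std::io;

// Reads a file into a String; any I/O error is propagated to the caller.
fn read_file(path: &str) -> Result<String, io::Error> {
    let contents = fs::read_to_string(path)?;
    Ok(contents)
}
```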
In this example, the ? operator automatically returns any error from std::fs::read_to_string to the caller, saving
you from writing a verbose match.
Handling the error is then left as an exercise to the calling code; in this case, main.
The Difference in Practice
Compare the ? operator to .unwrap():
Using ?:
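A minimal sketch, wrapping the read in a function (the file name is arbitrary):

```rust
fn load() -> Result<String, std::io::Error> {
    // On failure, the error is returned to load's caller.
    let contents = std::fs::read_to_string("config.txt")?;
    Ok(contents)
}
```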
Using .unwrap():
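And the equivalent with .unwrap(), which panics on failure:

```rust
fn load() -> String {
    // Panics with the error's Debug output if the read fails.
    std::fs::read_to_string("config.txt").unwrap()
}
```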
If std::fs::read_to_string fails:
The ? operator propagates the error to the caller.
.unwrap() causes the program to panic, potentially crashing your application.
Error Propagation in Action
The ? operator shines when you need to handle multiple fallible operations:
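A sketch that copies the contents of one file to another (the function and file names are illustrative):

```rust
use std::fs;
use std::io;

// Either the read or the write can fail; ? propagates both.
fn copy_contents(src: &str, dst: &str) -> Result<(), io::Error> {
    let contents = fs::read_to_string(src)?;
    fs::write(dst, contents)?;
    Ok(())
}
```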
Here, the ? operator simplifies error handling for both read_to_string and write, keeping the code concise and
readable.
Saving typing
Using ? is equivalent to a common error propagation pattern:
Without ?:
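```rust
fn read_file(path: &str) -> Result<String, std::io::Error> {
    let contents = match std::fs::read_to_string(path) {
        Ok(contents) => contents,
        Err(e) => return Err(e),
    };
    Ok(contents)
}
```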
With ?:
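```rust
fn read_file(path: &str) -> Result<String, std::io::Error> {
    let contents = std::fs::read_to_string(path)?;
    Ok(contents)
}
```

One subtlety worth noting: ? also runs the error through From::from, so it can convert between error types; the manual match above only covers the simple case where the types already line up.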
Chaining
You can also chain multiple operations with ?, making it ideal for error-prone workflows:
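A sketch that chains File::open and read_to_string in a single expression:

```rust
use std::fs::File;
use std::io::{self, Read};

fn read_file(path: &str) -> Result<String, io::Error> {
    let mut contents = String::new();
    // Both the open and the read can fail; each ? propagates its error.
    File::open(path)?.read_to_string(&mut contents)?;
    Ok(contents)
}
```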
Conclusion
The ? operator is much more than syntactic sugar for .unwrap(). It’s a powerful tool that:
Simplifies error propagation.
Keeps your code clean and readable.
Encourages robust error handling in production.
By embracing the ? operator, you can write concise, idiomatic Rust code that gracefully handles errors without
sacrificing clarity or safety.
Rust’s async and await features bring modern asynchronous programming to the language, enabling developers to write
non-blocking code efficiently. In this blog post, we’ll explore how async and await work, when to use them, and
provide practical examples to demonstrate their power.
What Are async and await?
Rust uses an async and await model to handle concurrency. These features allow you to write asynchronous code that
doesn’t block the thread, making it perfect for tasks like I/O operations, networking, or any scenario where waiting on
external resources is necessary.
Key Concepts:
async:
Marks a function or block as asynchronous.
Calling it returns a Future rather than running the body immediately; nothing executes until the future is polled.
await:
Suspends the current async function until the Future completes, yielding the thread to other tasks in the meantime.
Only allowed inside an async function or block.
Getting Started
To use async and await, you’ll need an asynchronous runtime such as Tokio or
async-std. These provide the necessary infrastructure to execute asynchronous tasks.
Practical Examples
A Basic async Function
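A minimal sketch using the Tokio runtime (tokio::time::sleep and the #[tokio::main] macro come from the tokio crate):

```rust
use tokio::time::{sleep, Duration};

async fn say_hello() {
    println!("Hello...");
    // Pause for 2 seconds without blocking the thread.
    sleep(Duration::from_secs(2)).await;
    println!("...world!");
}

#[tokio::main]
async fn main() {
    say_hello().await;
}
```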
Explanation:
say_hello is an async function that prints messages and waits for 2 seconds without blocking the thread.
The .await keyword pauses execution until the sleep operation completes.
Running Tasks Concurrently with join!
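A sketch with two tasks of different durations:

```rust
use tokio::time::{sleep, Duration};

async fn task_one() {
    sleep(Duration::from_secs(2)).await;
    println!("task one finished");
}

async fn task_two() {
    sleep(Duration::from_secs(1)).await;
    println!("task two finished");
}

#[tokio::main]
async fn main() {
    // Both futures make progress concurrently; join! waits for both.
    tokio::join!(task_one(), task_two());
}
```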
Explanation:
join! runs multiple tasks concurrently.
Task two finishes first, even though task one started earlier, demonstrating concurrency.
Handling Errors in Asynchronous Code
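A minimal sketch using the reqwest crate (the URL is a placeholder):

```rust
// Fetches a URL and returns the body, propagating any network error.
async fn fetch(url: &str) -> Result<String, reqwest::Error> {
    let body = reqwest::get(url).await?.text().await?;
    Ok(body)
}

#[tokio::main]
async fn main() -> Result<(), reqwest::Error> {
    let body = fetch("https://example.com").await?;
    println!("fetched {} bytes", body.len());
    Ok(())
}
```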
Explanation:
Uses the reqwest crate to fetch data from a URL.
Error handling is built-in with Result and the ? operator.
Each .await ensures the request has fully completed before execution moves on.
Asynchronous File I/O
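A sketch using Tokio’s filesystem helpers (the file name is arbitrary):

```rust
use tokio::fs;

#[tokio::main]
async fn main() -> std::io::Result<()> {
    // The read completes without blocking the executor's threads.
    let contents = fs::read_to_string("example.txt").await?;
    println!("{}", contents);
    Ok(())
}
```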
Explanation:
Uses tokio::fs for non-blocking file reading.
Handles file errors gracefully with Result.
Key Points to Remember
Async Runtime:
You need an async runtime like Tokio or async-std to execute async functions.
Concurrency:
Rust’s async model is cooperative: tasks yield control at .await points so that other tasks can run.
Error Handling:
Combine async with Result for robust error management.
State Sharing:
Use Arc and Mutex for sharing state safely between async tasks.
Conclusion
Rust’s async and await features empower you to write efficient, non-blocking code that handles concurrency
seamlessly. By leveraging async runtimes and best practices, you can build high-performance applications that scale
effortlessly.
Start experimenting with these examples and see how async and await can make your Rust code more powerful and
expressive. Happy coding!
IO_URING is an advanced asynchronous I/O interface introduced in the Linux kernel (version 5.1). It’s designed to
provide significant performance improvements for I/O-bound applications, particularly those requiring high throughput
and low latency.
It’s well worth taking a look at the Linux man pages for io_uring
and having a read through the function interface.
In today’s article we’ll discuss IO_URING in depth and follow with some examples to see it in practice.
What is IO_URING
IO_URING is a high-performance asynchronous I/O interface introduced in Linux kernel version 5.1. It was developed
to address the limitations of traditional Linux I/O mechanisms like epoll, select, and aio. These earlier
approaches often suffered from high overhead due to system calls, context switches, or inefficient batching, which
limited their scalability in handling modern high-throughput and low-latency workloads.
At its core, IO_URING provides a ring-buffer-based mechanism for submitting I/O requests and receiving their
completions, eliminating many inefficiencies in older methods. This allows applications to perform non-blocking,
asynchronous I/O with minimal kernel involvement, making it particularly suited for applications such as databases, web
servers, and file systems.
How does IO_URING work?
IO_URING’s architecture revolves around two primary shared memory ring buffers between user space and the kernel:
Submission Queue (SQ):
The SQ is a ring buffer where applications enqueue I/O requests.
User-space applications write requests directly to the buffer without needing to call into the kernel for each operation.
The requests describe the type of I/O operation to be performed (e.g., read, write, send, receive).
Completion Queue (CQ):
The CQ is another ring buffer where the kernel places the results of completed I/O operations.
Applications read from the CQ to retrieve the status of their submitted requests.
The interaction between user space and the kernel is simplified:
The user-space application adds entries to the Submission Queue and notifies the kernel when ready (via a single syscall like io_uring_enter).
The kernel processes these requests and posts results to the Completion Queue, which the application can read without additional syscalls.
Key Features
Batching Requests:
Multiple I/O operations can be submitted in a single system call, significantly reducing syscall overhead.
Zero-copy I/O:
Certain operations (like reads and writes) can leverage fixed buffers, avoiding unnecessary data copying between kernel and user space.
Kernel Offloading:
The kernel can process requests in the background, allowing the application to continue without waiting.
Efficient Polling:
Supports event-driven programming with low-latency polling mechanisms, reducing idle time in high-performance applications.
Flexibility:
IO_URING supports a wide range of I/O operations, including file I/O, network I/O, and event notifications.
Code
Let’s get some code examples going to see exactly what we’re dealing with.
First of all, check to see that your kernel supports IO_URING. It should; it’s been available since 5.1.
You’ll also need liburing available to you in order to compile these examples.
Library setup
In this first example, we won’t perform any actions; but we’ll set up the library so that we can use these operations.
All of our other examples will use this as a base.
We’ll need some basic I/O headers as well as liburing.h.
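Something like the following is enough for these sketches:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>

#include <liburing.h>
```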
We initialize our uring queue using io_uring_queue_init:
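```c
struct io_uring ring;

// A queue depth of 8 entries is an arbitrary choice for these examples;
// flags of 0 selects the default behaviour.
if (io_uring_queue_init(8, &ring, 0) < 0) {
    perror("io_uring_queue_init");
    exit(1);
}
```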
When we’re finished with the ring, we cleanup with io_uring_queue_exit.
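```c
// Tear down the ring and release its shared memory.
io_uring_queue_exit(&ring);
```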
Simple Write
In this example, we’ll queue up a write of a string out to a file and that’s it.
First, we need to open the file like usual:
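```c
// The file name here is arbitrary.
int fd = open("hello.txt", O_WRONLY | O_CREAT | O_TRUNC, 0644);

if (fd < 0) {
    perror("open");
    exit(1);
}
```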
Now, we setup the write job to happen.
The io_uring_get_sqe function will get us the next available submission queue entry from the job queue. Once we have
secured one of these, we fill a vector I/O structure (an iovec) with the details of our data. Here it’s just the
data pointer and length.
Finally, we prepare a vector write request using io_uring_prep_writev.
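Putting those three steps together, a sketch looks like this:

```c
const char *message = "Hello, io_uring!\n";

// Grab the next free submission queue entry.
struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);

// Describe our data: just the pointer and the length.
struct iovec iov = {
    .iov_base = (void *)message,
    .iov_len  = strlen(message),
};

// Prepare a vectored write of one iovec to fd, starting at offset 0.
io_uring_prep_writev(sqe, fd, &iov, 1, 0);
```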
We submit the job off to be processed now with io_uring_submit:
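```c
// A single syscall hands our queued request to the kernel.
io_uring_submit(&ring);
```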
We can wait for the execution to complete; even more powerful, though, is that we could be off doing other things instead if we’d like!
In order to wait for the job to finish, we use io_uring_wait_cqe:
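```c
struct io_uring_cqe *cqe;

// Blocks until at least one completion is available.
io_uring_wait_cqe(&ring, &cqe);
```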
We check the result of the job through the io_uring_cqe structure filled by the io_uring_wait_cqe call:
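```c
// res is the byte count on success, or a negative errno on failure.
if (cqe->res < 0) {
    fprintf(stderr, "write failed: %s\n", strerror(-cqe->res));
} else {
    printf("wrote %d bytes\n", cqe->res);
}
```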
Finally, we mark the uring event as consumed and close the file.
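```c
io_uring_cqe_seen(&ring, cqe);
close(fd);
```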
Multiple Operations
Finally, we’ll write an example that processes multiple operations in parallel.
The following for loop sets up 3 read jobs:
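A sketch, assuming fd is an open file and giving each job its own 4 KiB buffer:

```c
#define NJOBS    3
#define BUF_SIZE 4096

char bufs[NJOBS][BUF_SIZE];
struct iovec iovs[NJOBS];

for (int i = 0; i < NJOBS; i++) {
    iovs[i].iov_base = bufs[i];
    iovs[i].iov_len  = BUF_SIZE;

    struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);

    // Each job reads its own slice of the file.
    io_uring_prep_readv(sqe, fd, &iovs[i], 1, (off_t)i * BUF_SIZE);
}
```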
All of the requests now get submitted for processing:
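```c
// Again, one syscall submits all three requests at once.
io_uring_submit(&ring);
```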
Finally, we wait on each of the jobs to finish. The important thing to note here is that we could be busy off doing
other things rather than just waiting for these jobs to finish.
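A sketch of the wait loop:

```c
for (int i = 0; i < NJOBS; i++) {
    struct io_uring_cqe *cqe;

    io_uring_wait_cqe(&ring, &cqe);

    // Completions can arrive in any order; io_uring_sqe_set_data can be
    // used to tag requests if you need to match them up.
    if (cqe->res < 0)
        fprintf(stderr, "read failed: %s\n", strerror(-cqe->res));
    else
        printf("a read completed: %d bytes\n", cqe->res);

    io_uring_cqe_seen(&ring, cqe);
}
```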
IO_URING represents a transformative step in Linux asynchronous I/O, providing unparalleled performance and flexibility
for modern applications. By minimizing syscall overhead, enabling zero-copy I/O, and allowing concurrent and batched
operations, it has become a vital tool for developers working on high-performance systems.
Through the examples we’ve covered, you can see the practical power of IO_URING, from simple write operations to complex
asynchronous processing. Its design not only simplifies high-throughput I/O operations but also opens up opportunities
to optimize and innovate in areas like database systems, networking, and file handling.
SIMD (Single Instruction, Multiple Data) is a computing technique used in modern CPUs and GPUs to perform the same
operation on multiple pieces of data simultaneously. SIMD instructions are critical for optimizing tasks in
data-parallel applications, such as multimedia processing, scientific computing, and machine learning.
What is SIMD?
SIMD allows a single instruction to operate on multiple data elements in parallel. It is a subset of parallel computing
focused on data-level parallelism. Traditional instructions operate on a single data element (Single Instruction,
Single Data).
Most modern CPUs have SIMD instruction sets built into their architecture. These include:
Intel/AMD x86:
MMX (legacy)
SSE (Streaming SIMD Extensions)
AVX (Advanced Vector Extensions)
AVX-512 (latest in Intel’s Xeon and some desktop processors)
ARM:
NEON
PowerPC:
AltiVec (also known as VMX)
RISC-V:
Vector extensions.
When to Use SIMD
SIMD is ideal for applications with:
Data Parallelism: Repeated operations on arrays or vectors (e.g., adding two arrays).
Heavy Computation:
Multimedia processing (e.g., video encoding/decoding, image manipulation).
Scientific simulations (e.g., matrix operations).
Machine learning (e.g., tensor computations).
Regular Data Access Patterns: Data laid out in contiguous memory blocks.
SIMD support in your CPU provides vector registers that hold multiple data elements (e.g., four floats in a 128-bit register).
From there, a single vectorized instruction operates on all of those elements at once. SIMD requires aligned memory for optimal performance.
Misaligned data incurs penalties or falls back to scalar processing.
How to use it
Intel Intrinsics for AVX
The following example simply adds two vectors together, and prints the results out to the terminal.
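A minimal sketch using 256-bit AVX intrinsics from immintrin.h:

```c
#include <immintrin.h>
#include <stdio.h>

int main(void) {
    float a[8]   = {1, 2, 3, 4, 5, 6, 7, 8};
    float b[8]   = {10, 20, 30, 40, 50, 60, 70, 80};
    float out[8];

    __m256 va = _mm256_loadu_ps(a);    // load 8 floats (unaligned ok)
    __m256 vb = _mm256_loadu_ps(b);
    __m256 vc = _mm256_add_ps(va, vb); // 8 additions in one instruction
    _mm256_storeu_ps(out, vc);

    for (int i = 0; i < 8; i++) {
        printf("%.1f ", out[i]);
    }
    printf("\n");

    return 0;
}
```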
In order to compile this you need to use:
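```sh
gcc -mavx -O2 -o avx_add avx_add.c
```

(The source file name here is an assumption; the important part is -mavx, which allows gcc to emit AVX instructions.)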
When we disassemble this program, we can see evidence that the extended instruction set is being used; illustratively (exact registers and addressing will vary):
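```asm
vmovups ymm0, YMMWORD PTR [rsp]
vaddps  ymm0, ymm0, YMMWORD PTR [rsp+0x20]
vmovups YMMWORD PTR [rsp+0x40], ymm0
```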
Compiler Auto-Vectorisation
SIMD is so common these days that you could just write the code above in plain old C:
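```c
// The same addition as above, written as a plain scalar loop.
void add_arrays(const float *a, const float *b, float *out, int n) {
    for (int i = 0; i < n; i++) {
        out[i] = a[i] + b[i];
    }
}
```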
If you were to compile this code with either -O2 or -O3, you’ll find that vectorisation gets enabled.
Without any optimisation, we get something along these lines (registers and offsets will vary):
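```asm
movss  xmm0, DWORD PTR [rbp-0x430+rax*4]
addss  xmm0, DWORD PTR [rbp-0x830+rax*4]
movss  DWORD PTR [rbp-0xc30+rax*4], xmm0
```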
The movss and addss instructions are indeed SIMD instructions, but they operate on only one scalar value at a time.
Now, if we turn the optimisation up, you’ll notice that the compiler starts using SIMD instructions that work on
packed values; illustratively:
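```asm
movups xmm0, XMMWORD PTR [rdi+rax]
movups xmm1, XMMWORD PTR [rsi+rax]
addps  xmm0, xmm1
movups XMMWORD PTR [rdx+rax], xmm0
```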
Packed instructions like addps (and the wider AVX and AVX-512 variants) can add 4, 8, or 16 numbers at once.
Assembly
If you really feel the need to get that extra bit of power, you can crack out the assembly language yourself and have
a go.
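As a sketch (x86-64 System V, NASM syntax; assumes both inputs hold 8 floats and the pointers arrive in rdi, rsi, and rdx):

```asm
; out = a + b for 8 packed floats
; rdi = a, rsi = b, rdx = out
add8:
    vmovups ymm0, [rdi]        ; load 8 floats from a
    vaddps  ymm0, ymm0, [rsi]  ; add 8 floats from b
    vmovups [rdx], ymm0        ; store the 8 results
    vzeroupper
    ret
```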
For the work that it’s doing, this is very tidy code.
High Level Libraries
Finally, there are a number of high-level libraries (for example, Eigen, xsimd, or Google’s Highway) that industrialise the usage of SIMD
instructions really well. Using these makes these operations much easier to write!
Challenges
Branching can be an issue: SIMD struggles with divergent execution paths (e.g., if statements).
The alignment requirements are quite strict for the best performance. SIMD often requires data to be aligned
to specific byte boundaries (e.g., 16 bytes for SSE, 32 bytes for AVX).
SIMD operates on a fixed number of elements per operation, determined by the vector register width, so larger
vectors must be processed in register-sized chunks.
Code written with specific intrinsics or assembly may not run on CPUs with different SIMD instruction sets. So, if you’re
not using one of those higher level libraries - portability can be an issue.
Conclusion
SIMD is a powerful tool for optimizing performance in data-parallel applications, allowing modern CPUs and GPUs to
handle repetitive tasks more efficiently. By leveraging intrinsics, compiler optimizations, or high-level libraries,
developers can unlock significant performance gains with relatively little effort.
However, like any optimization, SIMD has its challenges, such as branching, memory alignment, and portability.
Understanding these limitations and balancing them with the benefits is key to effectively integrating SIMD into your
projects.
Whether you’re working on scientific simulations, multimedia processing, or machine learning, SIMD offers a compelling
way to accelerate your computations. Start small, experiment with intrinsics or auto-vectorization, and explore the
high-level libraries to see how SIMD can transform your application’s performance.
In a previous post we made a simple water droplet demonstration. This
is all built on the VGA work that we’ve already done.
In today’s post, we’ll use this information again and put a waving flag demo together.
The effect that we’re looking to produce should look something like this:
The Idea
The waving flag effect simulates a piece of fabric rippling in the wind. Here’s the high-level approach:
Flag Bitmap: Create a bitmap with alternating horizontal stripes of green and white.
Wave Dynamics: Use trigonometric functions to displace pixels horizontally and vertically, simulating waves.
Buffering: Use an offscreen buffer to avoid flickering during rendering.
Building the Flag
The flag is a simple bitmap composed of horizontal stripes alternating between green and white. Here’s how it’s
generated:
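A sketch of how it might be built (the dimensions and the palette indices for green and white are assumptions):

```c
#define FLAG_WIDTH  100
#define FLAG_HEIGHT 60

// Mode 13h default palette indices: 2 is green, 15 is white.
#define GREEN 2
#define WHITE 15

unsigned char flag[FLAG_HEIGHT][FLAG_WIDTH];

void build_flag(void) {
    for (int y = 0; y < FLAG_HEIGHT; y++) {
        // Alternate the colour every 10 rows.
        unsigned char colour = ((y / 10) % 2 == 0) ? GREEN : WHITE;

        for (int x = 0; x < FLAG_WIDTH; x++) {
            flag[y][x] = colour;
        }
    }
}
```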
Each stripe spans 10 rows, and the alternating colors give the flag a distinctive look.
Adding the Wave Effect
The waving effect is achieved by modifying the position of each pixel based on sine and cosine functions. Here’s the
core logic:
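A sketch of that logic (the flag’s screen position and the wave amplitudes are assumptions; buffer is the 320x200 offscreen buffer):

```c
#include <math.h>

void draw_flag(unsigned char *buffer, double theta) {
    for (int yy = 0; yy < FLAG_HEIGHT; yy++) {
        for (int xx = 0; xx < FLAG_WIDTH; xx++) {
            // Horizontal ripple, plus a smaller vertical one for realism.
            int wave_offset = (int)(sin(theta + xx * 0.1) * 5.0);
            int x = 110 + xx + wave_offset;
            int y = 70 + yy + (int)(cos(theta + xx * 0.1) * 2.0);

            // Keep the pixel and its neighbours within the screen bounds.
            if (x < 0 || x >= 319 || y < 0 || y >= 199)
                continue;

            unsigned char colour = flag[yy][xx];

            // Render to the surrounding cells too, so gaps don't open up
            // between displaced pixels.
            buffer[y * 320 + x]           = colour;
            buffer[y * 320 + x + 1]       = colour;
            buffer[(y + 1) * 320 + x]     = colour;
            buffer[(y + 1) * 320 + x + 1] = colour;
        }
    }
}
```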
Key Features:
Wave Dynamics: The wave_offset creates a horizontal ripple effect based on sin(theta + xx * 0.1). A secondary vertical ripple adds realism.
Boundary Checks: Ensures pixels remain within the screen bounds.
Direct Pixel Copy: Pixels are copied from the flag bitmap to the appropriate buffer position.
Redundant Pixel Render: We render to all surrounding cells as well, so we don’t experience tearing.
Main Loop
The main loop ties everything together, handling synchronization, rendering, and input:
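A sketch of the loop (wait_vsync and kbhit come from the earlier VGA work; copy_buffer, which blits the offscreen buffer to VGA memory, is an assumed helper):

```c
double theta = 0.0;

while (!kbhit()) {
    // Start each frame from a clean offscreen buffer.
    memset(buffer, 0, 320 * 200);

    draw_flag(buffer, theta);

    // Wait for the vertical retrace, then show the finished frame.
    wait_vsync();
    copy_buffer(buffer);

    // Advance the animation.
    theta += 0.1;
}
```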
Highlights:
Synchronization: The wait_vsync() call ensures smooth updates.
Animation: The theta value incrementally changes, creating continuous movement.
Keyboard Interrupt: The kbhit() function allows the user to exit gracefully.
Conclusion
This waving flag effect combines simple algorithms with creative use of VGA mode 13h to create a visually stunning
effect. By leveraging trigonometry, palette manipulation, and efficient buffer handling, we replicate the mesmerizing motion of a flag in the wind.
You can find the complete code on GitHub as a gist.
Try it out, tweak the parameters, and share your own effects! There’s a lot of joy in creating beautiful visuals with
minimal resources.