Cogs and Levers A blog full of technical stuff

Learning Rust Part 7 - Macros and Metaprogramming

Introduction

Rust’s macros offer powerful metaprogramming tools, enabling code generation, compile-time optimizations, and even domain-specific languages (DSLs). Unlike functions, macros operate at compile time, which makes them flexible and efficient but also requires careful usage. Rust’s macro system includes two primary types: declarative macros and procedural macros. In this post, we’ll explore both types and look at some practical examples.

Declarative Macros (macro_rules!)

Declarative macros, created with macro_rules!, use pattern matching to expand code at compile time. These macros are ideal for handling repetitive code patterns and for defining custom DSLs.

Defining a Declarative Macro

Here’s an example of a logging macro that can handle multiple arguments. The macro uses pattern matching to determine how to expand the code.

macro_rules! log {
    ($msg:expr) => {
        println!("[LOG]: {}", $msg);
    };
    ($fmt:expr, $($arg:tt)*) => {
        println!("[LOG]: {}", format!($fmt, $($arg)*));
    };
}

fn main() {
    log!("Starting application");
    log!("Hello, {}", "world");
}

This log! macro can be called with either a single expression or a format string with additional arguments.

Repeaters ($()*)

Macros can use repeaters like $(...)* to handle variable numbers of arguments. Here’s a macro that generates a Vec<String> from a list of string literals:

macro_rules! vec_of_strings {
    ($($x:expr),*) => {
        vec![$(String::from($x)),*]
    };
}

fn main() {
    let v = vec_of_strings!["apple", "banana", "cherry"];
    println!("{:?}", v);
}

This macro makes it easy to create a vector of strings without repeating String::from for each element.
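Repeaters come in other flavours as well. As a minimal sketch (the sum_all! macro is a hypothetical name, not something from the standard library), the + repeater demands at least one match, and an optional $(,)? accepts a trailing comma:

```rust
// Hypothetical macro for illustration: sums one or more expressions.
macro_rules! sum_all {
    // `+` requires at least one expression; `$(,)?` tolerates a trailing comma.
    ($($x:expr),+ $(,)?) => {
        0 $(+ $x)+
    };
}

fn main() {
    println!("{}", sum_all!(1, 2, 3));
    println!("{}", sum_all!(2 * 3, 4,));
}
```

The repetition in the expansion, $(+ $x)+, emits one "+ value" pair per captured expression, so the call expands to an ordinary chain of additions.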

Metavariables

In Rust macros, each metavariable carries a fragment specifier (the part after the colon, as in $msg:expr) that determines what kind of syntax it can match. Here’s a look at the most commonly used specifiers, with examples to help clarify each one.

expr: Expressions

The expr metavariable type represents any valid Rust expression. This includes literals, function calls, arithmetic operations, and more.

macro_rules! log {
    ($msg:expr) => { 
        println!("[LOG]: {}", $msg); 
    };
}

fn main() {
    log!(42);                // 42 is a literal expression
    log!(5 + 3);             // 5 + 3 is an arithmetic expression
    log!("Hello, world!");   // "Hello, world!" is a string literal expression
}

tt: Token Tree

The tt metavariable stands for token tree and is the most flexible type. A single tt matches either one token (a literal, an identifier, an operator, a comma) or one group of tokens wrapped in (), [], or {}, so a whole block or function body counts as a single token tree. tt is often used for parameters with variable length, as in $($arg:tt)*.

In the example below, $($arg:tt)* lets the macro accept any number of trailing tokens, which are forwarded to format! unchanged.

macro_rules! log {
    ($fmt:expr, $($arg:tt)*) => {
        println!("[LOG]: {}", format!($fmt, $($arg)*));
    };
}

fn main() {
    log!("Hello, {}", "world");    // "Hello, {}" is matched as $fmt, "world" as $arg
    log!("Values: {} and {}", 1, 2); // Two arguments matched as $arg
}

In this case, ($fmt:expr, $($arg:tt)*):

  • $fmt:expr matches a single format string.
  • $($arg:tt)* matches all remaining tokens, like "world" or 1, 2 (the separating commas are captured as token trees too and passed through to format!).
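To see what counts as one token tree, here is a small sketch using a hypothetical count_tts! macro (an illustrative name) that recursively counts the token trees it receives:

```rust
// Hypothetical macro for illustration: counts the token trees passed to it.
macro_rules! count_tts {
    // No tokens left: the count is zero.
    () => { 0usize };
    // Peel off one token tree and recurse on the rest.
    ($head:tt $($tail:tt)*) => { 1usize + count_tts!($($tail)*) };
}

fn main() {
    println!("{}", count_tts!(a b c));         // three identifiers
    println!("{}", count_tts!(1, 2));          // two literals and the comma between them
    println!("{}", count_tts!({ 1 + 2 } foo)); // a braced group and an identifier
}
```

A comma is a token tree in its own right, while everything inside a pair of braces is a single token tree, no matter how many tokens the group contains.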

Other Common Metavariable Types

Rust macros support additional metavariable types, each providing a different kind of flexibility. Here are some other commonly used types:

  • ident: Matches an identifier (variable, function, or type name).
macro_rules! make_var {
    ($name:ident) => {
        let $name = 10;
    };
}

fn main() {
    make_var!(x); // Expands to: let x = 10;
    println!("{}", x);
}
  • ty: Matches a type (like i32 or String).
macro_rules! make_vec {
    ($type:ty) => {
        Vec::<$type>::new()
    };
}

fn main() {
    let v: Vec<i32> = make_vec!(i32); // Expands to Vec::<i32>::new()
}
  • pat: Matches a pattern, often used in match arms.
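macro_rules! match_num {
    // $value is the expression being matched; $pattern is used as a match arm
    ($value:expr, $pattern:pat) => {
        match $value {
            $pattern => println!("Matched"),
            _ => println!("No match"),
        }
    };
}

fn main() {
    match_num!(1, 1);     // Prints "Matched"
    match_num!(7, 1..=5); // Prints "No match"
}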
  • literal: Matches literal values like numbers, characters, or strings. Useful when you need to capture only literal values.
macro_rules! print_literal {
    ($x:literal) => {
        println!("Literal: {}", $x);
    };
}

fn main() {
    print_literal!(42);       // Works, as 42 is a literal
    // print_literal!(5 + 5); // Error: 5 + 5 is not a literal
}

Metavariable types

Here’s a quick reference of metavariable types commonly used in Rust macros:

Metavariable   Matches                     Example
expr           Any valid Rust expression   5 + 3, hello, foo()
tt             Any token tree              1, { 1 + 2 }, foo, bar
ident          Identifiers                 my_var, TypeName
ty             Types                       i32, String
pat            Patterns                    _, Some(x), 1..=10
literal        Literals                    42, 'a', "text"

Procedural Macros

Procedural macros allow more advanced metaprogramming by directly manipulating Rust’s syntax. They operate on tokens (the syntactic elements of code) rather than strings, offering greater control over code generation. Procedural macros are defined as functions that take and return a TokenStream, and they must live in a dedicated crate with proc-macro = true set under [lib] in its Cargo.toml.

Types of Procedural Macros

Rust supports three main types of procedural macros:

  • Function-like macros: Called like functions but with macro-level flexibility.
  • Attribute macros: Add custom behavior to items like functions and structs.
  • Derive macros: Automatically implement traits for structs or enums.

Creating a Function-like Macro

A function-like macro uses the proc_macro crate to manipulate tokens directly. Here’s an example that generates a function called hello that prints a greeting:

use proc_macro::TokenStream;

#[proc_macro]
pub fn hello_macro(input: TokenStream) -> TokenStream {
    // A string literal argument such as "Rust" stringifies with its quotes,
    // so strip them before splicing the name into the generated code.
    let input_str = input.to_string();
    let name = input_str.trim_matches('"');
    format!("fn hello() {{ println!(\"Hello, {}!\"); }}", name).parse().unwrap()
}

This macro generates a hello function that prints a customized message. It would typically be used by adding hello_macro!("Rust"); at item level in the main code; calling hello() then prints Hello, Rust!.

Attribute Macros

Attribute macros attach custom attributes to items, making them useful for adding behaviors to functions, structs, or enums. For instance, an attribute macro can automatically log messages when entering and exiting a function.

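use proc_macro::TokenStream;

#[proc_macro_attribute]
pub fn log(_attr: TokenStream, item: TokenStream) -> TokenStream {
    let input = item.to_string();
    // Nest the original `fn main` inside a new `main` and call it between the
    // log lines; inside the wrapper, `main()` resolves to the nested function.
    let output = format!(
        "fn main() {{
            println!(\"Entering function\");
            {}
            main();
            println!(\"Exiting function\");
        }}", input
    );
    output.parse().unwrap()
}

When applied to a parameterless main, this macro logs messages before and after the function body runs, helping with function-level tracing. Because the wrapper's name is hard-coded, it only works for main; a general version would parse the item (for example with syn) and reuse its real name and signature.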

Derive Macros

Derive macros are a powerful feature in Rust, enabling automatic trait implementation for custom data types. Commonly used for traits like Debug, Clone, and PartialEq, derive macros simplify code by eliminating the need for manual trait implementation.

Implementing a Derive Macro

Suppose we want to implement a custom Hello trait that prints a greeting. We can create a derive macro to automatically implement Hello for any struct annotated with #[derive(Hello)].

First, define the Hello trait:

pub trait Hello {
    fn say_hello(&self);
}

Then, implement the derive macro in a procedural macro crate:

use proc_macro::TokenStream;
use quote::quote;

#[proc_macro_derive(Hello)]
pub fn hello_derive(input: TokenStream) -> TokenStream {
    let ast: syn::DeriveInput = syn::parse(input).unwrap();
    let name = &ast.ident;

    let expanded = quote! {
        impl Hello for #name {
            fn say_hello(&self) {
                println!("Hello from {}", stringify!(#name));
            }
        }
    };
    expanded.into()
}

Now, any struct tagged with #[derive(Hello)] will automatically implement the Hello trait, making the code more modular and concise.
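To make the expansion concrete, here is roughly what the derive would generate, written out by hand for a hypothetical Robot struct (the struct name is an assumption for illustration). This sketch needs no procedural macro crate, so it runs on its own:

```rust
pub trait Hello {
    fn say_hello(&self);
}

// Hypothetical struct used only for this illustration.
struct Robot;

// Roughly what #[derive(Hello)] would emit for Robot:
// every #name in the quote! template becomes the identifier `Robot`.
impl Hello for Robot {
    fn say_hello(&self) {
        println!("Hello from {}", stringify!(Robot));
    }
}

fn main() {
    Robot.say_hello();
}
```

Comparing the hand-written impl with the quote! template is a handy way to check what a derive macro should produce before writing it.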

Domain-Specific Languages (DSLs) with Macros

Rust’s macros can be used to create DSLs, enabling specialized, readable syntax for specific tasks.

Example: Creating a Simple DSL

Here’s an example of a DSL for building SQL-like queries. The query! macro translates the input syntax into a formatted SQL query string.

macro_rules! query {
    ($table:expr => $($col:expr),*) => {
        // Join the column names at runtime; stringify! would keep their quotes.
        format!("SELECT {} FROM {}", [$($col),*].join(", "), $table)
    };
}

fn main() {
    let sql = query!("users" => "id", "name", "email");
    println!("{}", sql); // Outputs: SELECT id, name, email FROM users
}

This example uses macro_rules! to create a custom query builder, transforming macro input into SQL syntax in a natural format.

Summary

Rust’s macros and metaprogramming features provide versatile tools for code generation, manipulation, and optimization. With declarative macros for straightforward pattern matching, procedural macros for syntax manipulation, and derive macros for auto-implementing traits, Rust enables developers to write efficient, flexible, and concise code. Macros can help create DSLs or extend functionality in powerful ways, making Rust an excellent choice for both performance and code expressiveness.

Learning Rust Part 6 - Traits and Generics

Introduction

Rust’s traits and generics offer powerful tools for creating reusable, flexible, and type-safe code. Traits define shared behaviors, while generics allow code to work with multiple types. Combined, they make Rust’s type system expressive and robust, enabling high-performance applications with minimal redundancy. In this post, we’ll explore how traits and generics work in Rust and how they enhance code reusability.

Trait Definition and Implementation

Traits in Rust are similar to interfaces in other languages, defining a set of methods that a type must implement. Traits allow different types to share behavior in a type-safe manner.

Defining and Implementing Traits

To define a trait, use the trait keyword. Traits can include method signatures without implementations or with default implementations, which can be overridden by specific types.

trait Describe {
    fn describe(&self) -> String; // Required method

    fn greeting(&self) -> String { // Optional with a default implementation
        String::from("Hello!")
    }
}

struct Person {
    name: String,
}

impl Describe for Person {
    fn describe(&self) -> String {
        format!("My name is {}", self.name)
    }
}

In this example, the Person struct implements the Describe trait, providing a specific implementation for describe.
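As a small sketch of how the default method behaves, the following adds a hypothetical Robot type (not part of the original example) that overrides greeting, while Person keeps the default:

```rust
trait Describe {
    fn describe(&self) -> String; // Required method

    fn greeting(&self) -> String { // Default implementation
        String::from("Hello!")
    }
}

struct Person {
    name: String,
}

impl Describe for Person {
    fn describe(&self) -> String {
        format!("My name is {}", self.name)
    }
}

// Hypothetical second type that overrides the default greeting.
struct Robot;

impl Describe for Robot {
    fn describe(&self) -> String {
        String::from("I am a robot")
    }

    fn greeting(&self) -> String {
        String::from("Beep!")
    }
}

fn main() {
    let alice = Person { name: String::from("Alice") };
    println!("{} {}", alice.greeting(), alice.describe());
    println!("{} {}", Robot.greeting(), Robot.describe());
}
```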

Trait Objects (Dynamic Dispatch)

Rust supports dynamic dispatch with trait objects, allowing runtime polymorphism. This is useful when a function or collection must handle multiple types implementing the same trait.

Using Trait Objects with dyn

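Trait objects are created by specifying dyn before the trait name. The trait must be object-safe (also called dyn-compatible): roughly, its methods must not have generic type parameters or return Self, and the trait must not require Self: Sized.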

fn print_description(item: &dyn Describe) {
    println!("{}", item.describe());
}

let person = Person { name: String::from("Alice") };
print_description(&person); // Works with any type implementing `Describe`

In this example, print_description can accept any type that implements Describe, thanks to dynamic dispatch.
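Trait objects are also what make mixed collections possible. A minimal sketch, assuming a hypothetical City type alongside Person and a helper named describe_all:

```rust
trait Describe {
    fn describe(&self) -> String;
}

struct Person {
    name: String,
}

// Hypothetical second implementor, added for illustration.
struct City {
    name: String,
}

impl Describe for Person {
    fn describe(&self) -> String {
        format!("My name is {}", self.name)
    }
}

impl Describe for City {
    fn describe(&self) -> String {
        format!("The city of {}", self.name)
    }
}

// Each element may be a different concrete type; the call to describe
// is resolved at runtime through the trait object's vtable.
fn describe_all(items: &[Box<dyn Describe>]) -> Vec<String> {
    items.iter().map(|item| item.describe()).collect()
}

fn main() {
    let items: Vec<Box<dyn Describe>> = vec![
        Box::new(Person { name: String::from("Alice") }),
        Box::new(City { name: String::from("Oslo") }),
    ];
    for line in describe_all(&items) {
        println!("{}", line);
    }
}
```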

Generics and Bounds

Generics in Rust allow writing code that can operate on different types. Generics are declared with angle brackets (<T>) and can be constrained with trait bounds to ensure they meet specific requirements.

Defining Generics

Generics can be used in functions or structs, acting as placeholders for any type.

fn largest<T: PartialOrd>(a: T, b: T) -> T {
    if a > b { a } else { b }
}

Trait Bounds

Trait bounds restrict a generic type to those implementing specific traits, enabling functions and structs to safely assume certain behaviors.

struct Container<T: Describe> {
    item: T,
}

impl<T: Describe> Container<T> {
    fn show(&self) {
        println!("{}", self.item.describe());
    }
}

In this example, the Container struct accepts only types implementing the Describe trait, ensuring show can safely call describe.

Standard Traits

Rust includes standard traits that add common behavior to types. Here are a few of them.

Clone: Duplicate a Value

The Clone trait enables a type to be duplicated.

#[derive(Clone)]
struct Point { x: i32, y: i32 }

let p1 = Point { x: 5, y: 10 };
let p2 = p1.clone();

Copy: Lightweight Copies for Simple Types

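The Copy trait marks types whose values can be duplicated by a simple bitwise copy, such as integers and structs whose fields are all Copy. Assigning or passing a Copy value leaves the original usable instead of moving it. Copy requires Clone, which is why the two are derived together.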

#[derive(Copy, Clone)]
struct Point { x: i32, y: i32 }

Display and Debug: Print-Friendly Output

  • Display: Used to define how types are formatted in a user-friendly way, with {} in println!.
  • Debug: Used for formatting types in a developer-friendly way, with {:?} in println!. It’s often used for logging and debugging.

You typically implement Display for custom types if they will be user-facing, while Debug is helpful for logging.

 
use std::fmt;

struct Point { x: i32, y: i32 }

impl fmt::Display for Point { 
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result { write!(f, "({}, {})", self.x, self.y) } 
}

impl fmt::Debug for Point { 
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result { write!(f, "Point {{ x: {}, y: {} }}", self.x, self.y) } 
}

let point = Point { x: 5, y: 10 }; 

println!("Display: {}", point); // Uses Display 
println!("Debug: {:?}", point); // Uses Debug

Iterator: Sequentially Access Elements

The Iterator trait allows types to be iterated over in a sequence. Implementing Iterator requires defining the next method, which returns an Option<Self::Item>: either Some(value) for each item in the sequence or None to signal the end.

struct Counter { count: u32, }

impl Counter { 
    fn new() -> Self { Counter { count: 0 } } 
}

impl Iterator for Counter { 
    type Item = u32;

    fn next(&mut self) -> Option<Self::Item> {
        self.count += 1;
        if self.count <= 5 {
            Some(self.count)
        } else {
            None
        }
    }
}

let mut counter = Counter::new(); 
while let Some(num) = counter.next() { 
    println!("{}", num); // Prints 1 through 5 
}
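Once next is defined, the type also gets every adapter method that Iterator provides. A short sketch reusing the Counter above (the sum_of_even_squares helper is a hypothetical name):

```rust
struct Counter {
    count: u32,
}

impl Counter {
    fn new() -> Self {
        Counter { count: 0 }
    }
}

impl Iterator for Counter {
    type Item = u32;

    fn next(&mut self) -> Option<Self::Item> {
        self.count += 1;
        if self.count <= 5 {
            Some(self.count)
        } else {
            None
        }
    }
}

// filter, map, and sum all come for free from the Iterator trait.
fn sum_of_even_squares() -> u32 {
    Counter::new()
        .filter(|n| n % 2 == 0)
        .map(|n| n * n)
        .sum()
}

fn main() {
    let all: Vec<u32> = Counter::new().collect();
    println!("{:?}", all);
    println!("{}", sum_of_even_squares());
}
```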

Into: Converting to a Specified Type

The Into trait allows an instance of one type to be converted into another type with .into(). In practice you rarely implement Into directly: implementing From on the target type is the idiomatic route, because the standard library then supplies the matching Into implementation automatically.

struct Celsius(f64);
struct Fahrenheit(f64);

// Implementing From<Celsius> for Fahrenheit also makes Celsius: Into<Fahrenheit>
impl From<Celsius> for Fahrenheit {
    fn from(c: Celsius) -> Self { Fahrenheit(c.0 * 1.8 + 32.0) }
}

let temp_c = Celsius(30.0); 
let temp_f: Fahrenheit = temp_c.into(); // Converts Celsius to Fahrenheit 

From: Converting from Another Type

The From trait is the counterpart to Into, providing a way to create an instance of a type from another type. Types that implement From for a specific type enable that conversion via From::from.

 
struct Millimeters(u32);

impl From<u32> for Millimeters { 
    fn from(value: u32) -> Self { Millimeters(value) } 
}

let length = Millimeters::from(100); // Converts a u32 into Millimeters 
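Implementing From also gives you the reverse-direction Into through a blanket implementation in the standard library. A minimal sketch (the double helper is a hypothetical name used to show a generic Into bound):

```rust
struct Millimeters(u32);

impl From<u32> for Millimeters {
    fn from(value: u32) -> Self {
        Millimeters(value)
    }
}

// Hypothetical helper: accepts anything that can be converted into Millimeters.
fn double<T: Into<Millimeters>>(value: T) -> u32 {
    let mm: Millimeters = value.into();
    mm.0 * 2
}

fn main() {
    let a = Millimeters::from(100u32);
    let b: Millimeters = 250u32.into(); // Into comes for free from the From impl
    println!("{} {}", a.0, b.0);
    println!("{}", double(21u32));
}
```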

PartialEq and Eq: Equality Comparison

The PartialEq trait enables types to be compared for equality with == and inequality with !=. Custom types must implement (or derive) PartialEq before they can be compared with these operators. Eq is a marker trait that indicates total equality, meaning every value equals itself; floats are only PartialEq because NaN != NaN. Eq has PartialEq as a supertrait, so any type that implements Eq also implements PartialEq.

// Debug is derived as well because assert_eq! prints both values on failure
#[derive(Debug, PartialEq, Eq)]
struct Point { x: i32, y: i32 }

let p1 = Point { x: 5, y: 10 }; 
let p2 = Point { x: 5, y: 10 }; 

assert_eq!(p1, p2); // Checks equality using == 

PartialOrd and Ord: Ordering and Comparison

PartialOrd allows types to be compared for ordering with <, >, <=, and >=, while Ord requires that the ordering is total (i.e., any two values can be compared). Ord is often used with types that have a logical sequence or order.

 
#[derive(PartialOrd, PartialEq, Ord, Eq)] 
struct Temperature(i32);

let t1 = Temperature(30); 
let t2 = Temperature(40); 

assert!(t1 < t2); // Checks if t1 is less than t2 

Default: Default Values

The Default trait provides a way to create a default instance of a type with Default::default(). This trait is particularly useful in generic programming when you want a type to have an initial state.

 
#[derive(Default)] 
struct Config { debug: bool, timeout: u32, }

let config = Config::default(); // Initializes Config with default values 
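Default pairs well with struct update syntax, which lets you override a few fields and take the rest from the default. A small sketch using the same Config shape:

```rust
#[derive(Debug, Default, PartialEq)]
struct Config {
    debug: bool,
    timeout: u32,
}

fn main() {
    // Derived Default uses each field's own default: false and 0.
    let defaults = Config::default();

    // Override one field and fill in the rest from Default.
    let custom = Config { timeout: 30, ..Default::default() };

    println!("{:?}", defaults);
    println!("{:?}", custom);
}
```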

Drop: Custom Cleanup Logic

The Drop trait’s drop method is called automatically when a value goes out of scope, making it ideal for managing resources, like closing files or network connections. Implement drop to supply custom cleanup logic.

 
struct File { name: String, }

impl Drop for File { 
    fn drop(&mut self) { println!("Closing file: {}", self.name); } 
}

fn main() {
    let _f = File { name: String::from("data.txt") };
} // _f goes out of scope here: drop runs and prints "Closing file: data.txt"

AsRef and AsMut: Lightweight Borrowing

AsRef and AsMut enable types to convert themselves to references of another type, often used when you want to treat multiple types uniformly. They’re frequently used in APIs that need flexible input types.

 

fn print_length<T: AsRef<str>>(s: T) { 
    println!("Length: {}", s.as_ref().len()); 
}

print_length("hello"); // &str 
print_length(String::from("hello")); // String

Deref and DerefMut: Custom Dereferencing

The Deref and DerefMut traits allow custom types to behave like references, enabling access to the inner data with the * operator. This is particularly useful for types like Box, which act as smart pointers to heap-allocated values.

 

use std::ops::Deref;

struct MyBox<T>(T);

impl<T> Deref for MyBox<T> { 
    type Target = T;

    fn deref(&self) -> &Self::Target {
        &self.0
    }
}

let x = MyBox(String::from("Hello")); 
println!("Deref: {}", *x); // Prints "Hello" due to Deref implementation
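Deref also enables deref coercion, where the compiler inserts the dereferences for you when a reference is passed to a function or a method is called. A minimal sketch (the shout function is a hypothetical helper):

```rust
use std::ops::Deref;

struct MyBox<T>(T);

impl<T> Deref for MyBox<T> {
    type Target = T;

    fn deref(&self) -> &Self::Target {
        &self.0
    }
}

// Hypothetical helper that only knows about &str.
fn shout(s: &str) -> String {
    s.to_uppercase()
}

fn main() {
    let boxed = MyBox(String::from("hello"));

    // &MyBox<String> is coerced to &String and then to &str.
    println!("{}", shout(&boxed));

    // Method calls auto-dereference too: len is a method on String/str.
    println!("{}", boxed.len());
}
```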

Advanced Trait Bounds and Lifetimes

In more complex scenarios, multiple trait bounds and lifetimes help enforce requirements on generic types and references.

Combining Multiple Trait Bounds

Multiple trait bounds can be combined with +, allowing a function to require several capabilities from a type.

use std::fmt::Debug;

fn describe<T: Describe + Debug>(item: T) {
    println!("{:?}", item);
    println!("{}", item.describe());
}

Lifetimes in Generics

When generics involve references, lifetimes ensure the references remain valid for the required scope. We went over lifetimes in part 2 of this series.

fn longest<'a, T>(x: &'a T, y: &'a T) -> &'a T
where
    T: PartialOrd,
{
    if x > y { x } else { y }
}

Operator Overloading with Traits

Rust allows operator overloading for custom types through traits in the std::ops module. This enables intuitive syntax for user-defined types.

Implementing Add for Custom + Behavior

The Add trait allows custom behavior for the + operator.

use std::ops::Add;

struct Point {
    x: i32,
    y: i32,
}

impl Add for Point {
    type Output = Point;

    fn add(self, other: Point) -> Point {
        Point { x: self.x + other.x, y: self.y + other.y }
    }
}

fn main() {
    let p1 = Point { x: 1, y: 2 };
    let p2 = Point { x: 3, y: 4 };
    let result = p1 + p2;
    println!("Result: ({}, {})", result.x, result.y);
}

This code allows adding two Point instances with the + operator, thanks to the Add trait implementation.

Summary

Rust’s traits and generics enable developers to write flexible, reusable code while maintaining type safety. Traits define shared behavior, making it easy to build common functionality for different types, while generics allow for code that adapts to various types. The combination of traits, generics, and advanced features like dynamic dispatch and operator overloading makes Rust’s type system both powerful and expressive, allowing you to build complex, maintainable applications with ease.

Learning Rust Part 5 - Data Structures

Introduction

Rust offers a versatile range of data structures that make working with various types of data efficient and safe. In this post, we’ll cover fundamental data structures like vectors, hash maps, and arrays, along with Rust’s powerful enums and structs for creating custom data types.

Text

Strings and String Slices

Rust provides two main string types:

  • String: A growable, heap-allocated string.
  • &str (string slice): An immutable view into a string, commonly used for read-only access.

String example:

let mut s = String::from("Hello");
s.push_str(", world!");
println!("{}", s);

String slice example:

let greeting = "Hello, world!";
let slice = &greeting[0..5]; // "Hello"
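One caveat worth a sketch: slice ranges are byte offsets into UTF-8 text, so an index that lands inside a multi-byte character panics with the [] syntax. The get method returns None instead, and char_indices finds safe boundaries. The first_chars helper below is a hypothetical name for illustration:

```rust
// Hypothetical helper: returns the first `n` characters without
// ever cutting through a multi-byte character.
fn first_chars(s: &str, n: usize) -> &str {
    match s.char_indices().nth(n) {
        Some((byte_idx, _)) => &s[..byte_idx],
        None => s, // fewer than n characters: return everything
    }
}

fn main() {
    let word = "h\u{e9}llo"; // "héllo": the é takes two bytes in UTF-8

    // Byte 2 falls in the middle of é, so this is None rather than a panic.
    println!("{:?}", word.get(0..2));

    // Byte 3 is a character boundary.
    println!("{:?}", word.get(0..3));

    println!("{}", first_chars(word, 2));
}
```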

Collections

Vectors

Vectors (Vec<T>) are dynamic arrays that can grow or shrink in size, making them ideal for storing sequences of values when the size isn’t known at compile-time.

fn main() {
    let mut numbers = vec![1, 2, 3];
    numbers.push(4); // Add an element
    println!("{:?}", numbers);
}

LinkedList

A LinkedList<T> is a doubly linked list that allows fast insertion and deletion of elements at both ends of the list. It is less commonly used than Vec but can be useful when you need to insert and remove elements frequently from both the front and back of the collection.

 
use std::collections::LinkedList;

fn main() { 
    let mut list = LinkedList::new(); 

    list.push_back(1); 
    list.push_back(2); 
    list.push_front(0);

    for value in &list {
        println!("{}", value); // Prints 0, 1, 2
    }
} 

HashMap

A HashMap<K, V> stores key-value pairs, enabling efficient value retrieval based on keys.

use std::collections::HashMap;

fn main() {
    let mut scores = HashMap::new();
    scores.insert("Alice", 10);
    scores.insert("Bob", 20);

    println!("{:?}", scores.get("Alice")); // Prints: Some(10)
}
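A common HashMap pattern is the entry API, which inserts a value only when the key is missing and hands back a mutable reference either way. A small sketch with a hypothetical word_counts function:

```rust
use std::collections::HashMap;

// Hypothetical helper: counts how often each word appears.
fn word_counts(text: &str) -> HashMap<&str, u32> {
    let mut counts = HashMap::new();
    for word in text.split_whitespace() {
        // Insert 0 for a new word, then increment in place.
        *counts.entry(word).or_insert(0) += 1;
    }
    counts
}

fn main() {
    let counts = word_counts("the cat saw the other cat");
    println!("the: {:?}", counts.get("the"));
    println!("dog: {:?}", counts.get("dog"));
}
```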

BTreeMap

BTreeMap<K, V> is similar to HashMap but keeps keys sorted, making it useful for scenarios where sorted keys are necessary.

use std::collections::BTreeMap;

fn main() {
    let mut scores = BTreeMap::new();
    scores.insert("Alice", 10);
    scores.insert("Bob", 20);

    for (key, value) in &scores {
        println!("{}: {}", key, value);
    }
}

BinaryHeap

A BinaryHeap<T> is a priority queue implemented with a binary heap, where elements are always ordered so the largest (or smallest) element can be accessed quickly. By default, BinaryHeap maintains a max-heap, but it can be customized for min-heap operations.

 
use std::collections::BinaryHeap;

fn main() { 
    let mut heap = BinaryHeap::new(); 

    heap.push(1); 
    heap.push(5); 
    heap.push(3);

    while let Some(top) = heap.pop() {
        println!("{}", top); // Prints values in descending order: 5, 3, 1
    }
}
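For min-heap behaviour, one option is to wrap each element in std::cmp::Reverse, which flips the ordering. A minimal sketch (the ascending helper is a hypothetical name):

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

// Hypothetical helper: drains a min-heap to return values smallest first.
fn ascending(values: &[i32]) -> Vec<i32> {
    let mut heap = BinaryHeap::new();
    for &v in values {
        heap.push(Reverse(v)); // Reverse inverts the comparison
    }

    let mut out = Vec::new();
    while let Some(Reverse(v)) = heap.pop() {
        out.push(v);
    }
    out
}

fn main() {
    println!("{:?}", ascending(&[1, 5, 3]));
}
```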

HashSet

A HashSet<T> is a collection of unique values, implemented as a hash table. It provides fast membership checking and is useful when you need to store non-duplicate items without any specific order.

 
use std::collections::HashSet;

fn main() { 
    let mut set = HashSet::new(); 

    set.insert("apple"); 
    set.insert("banana"); 
    set.insert("apple"); // Duplicate, ignored by the set

    println!("{:?}", set.contains("banana")); // true
    println!("{:?}", set); // e.g. {"apple", "banana"}; HashSet iteration order is not guaranteed
}

BTreeSet

A BTreeSet<T> is a sorted set backed by a B-tree. Like HashSet, it only stores unique values, but unlike HashSet, it maintains its items in sorted order. This makes it suitable for range queries and ordered data.

 
use std::collections::BTreeSet;

fn main() { 
    let mut set = BTreeSet::new(); 

    set.insert(10); 
    set.insert(20); 
    set.insert(15);

    for value in &set {
        println!("{}", value); // Prints 10, 15, 20 in sorted order
    }
}
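The range queries mentioned above can be sketched with the range method, which iterates only over the elements inside the given bounds (in_range is a hypothetical helper name):

```rust
use std::collections::BTreeSet;

// Hypothetical helper: collects every element between low and high, inclusive.
fn in_range(set: &BTreeSet<i32>, low: i32, high: i32) -> Vec<i32> {
    set.range(low..=high).copied().collect()
}

fn main() {
    let set = BTreeSet::from([10, 20, 15, 30]);
    println!("{:?}", in_range(&set, 12, 20));
}
```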

VecDeque

A VecDeque<T> (double-ended queue) is a resizable, efficient data structure that supports adding and removing elements from both the front and back. It’s ideal for queue-like operations where both ends need to be accessible.

 
use std::collections::VecDeque;

fn main() { 
    let mut deque = VecDeque::new(); 

    deque.push_back(1); 
    deque.push_front(0);

    println!("{:?}", deque.pop_back()); // Some(1)
    println!("{:?}", deque.pop_front()); // Some(0)

} 

Option and Result Types

Rust’s Option and Result types are enums that enable safe handling of optional values and errors. We went over error handling in part 3 of this series.

Option

The Option<T> type represents an optional value: Some(T) for a value, or None if absent.

fn find_element(index: usize) -> Option<i32> {
    let numbers = vec![1, 2, 3];
    numbers.get(index).copied()
}

fn main() {
    match find_element(1) {
        Some(number) => println!("Found: {}", number),
        None => println!("Not found"),
    }
}

Result

The Result<T, E> type is used for functions that may succeed (Ok(T)) or fail (Err(E)), promoting explicit error handling.

fn divide(a: f64, b: f64) -> Result<f64, &'static str> {
    if b == 0.0 {
        Err("Cannot divide by zero")
    } else {
        Ok(a / b)
    }
}
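Callers can either match on the Result or pass errors upward with the ? operator. A short sketch built on divide, where ratio_sum is a hypothetical function added for illustration:

```rust
fn divide(a: f64, b: f64) -> Result<f64, &'static str> {
    if b == 0.0 {
        Err("Cannot divide by zero")
    } else {
        Ok(a / b)
    }
}

// Hypothetical helper: `?` returns early with the error if either division fails.
fn ratio_sum(a: f64, b: f64, c: f64) -> Result<f64, &'static str> {
    let first = divide(a, b)?;
    let second = divide(a, c)?;
    Ok(first + second)
}

fn main() {
    match ratio_sum(10.0, 2.0, 5.0) {
        Ok(value) => println!("Sum: {}", value),
        Err(e) => println!("Error: {}", e),
    }

    match ratio_sum(1.0, 0.0, 5.0) {
        Ok(value) => println!("Sum: {}", value),
        Err(e) => println!("Error: {}", e),
    }
}
```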

Custom Data Types and Enums

Rust’s enums and structs allow you to create custom data types, essential for building complex and expressive programs.

Enums

Rust’s enums can hold different types of data within each variant, enabling versatile data representations.

enum Message {
    Text(String),
    Image { url: String, width: u32, height: u32 },
    Quit,
}

fn main() {
    let msg = Message::Text(String::from("Hello"));
    match msg {
        Message::Text(text) => println!("Text: {}", text),
        Message::Image { url, width, height } => println!("Image at {}, size: {}x{}", url, width, height),
        Message::Quit => println!("Quit message"),
    }
}

Structs and Tuple Structs

Structs allow for creating complex types with named fields.

struct Person {
    name: String,
    age: u8,
}

fn main() {
    let person = Person {
        name: String::from("Alice"),
        age: 30,
    };
    println!("Name: {}, Age: {}", person.name, person.age);
}

Tuple structs are useful for grouping values without naming fields, often for simpler data types.

struct Color(u8, u8, u8);

fn main() {
    let red = Color(255, 0, 0);
    println!("Red: {}, {}, {}", red.0, red.1, red.2);
}

Arrays, Slices, and Compile-Time Length Arrays

Arrays

Arrays in Rust are fixed-size collections of elements whose length is part of the type and known at compile-time. A local array lives on the stack with no heap allocation, offering efficiency and safety.

fn main() {
    let numbers: [i32; 3] = [1, 2, 3];
    println!("First element: {}", numbers[0]);
}

Slices

Slices provide a way to view sections of an array or vector, avoiding the need to copy data.

fn main() {
    let numbers = [1, 2, 3, 4];
    let slice = &numbers[1..3];
    println!("{:?}", slice); // [2, 3]
}

Slices work with both arrays and vectors and are typically used as function parameters to avoid copying large data structures.

Reference Counting

Rc<Vec<T>> and Arc<Vec<T>>

Rc (Reference Counting) and Arc (Atomic Reference Counting) are common wrappers around collections like Vec to allow multiple ownership. Rc is single-threaded, while Arc is thread-safe, and both are used frequently for sharing collections between parts of a program.

 
use std::rc::Rc; 
use std::sync::Arc;

fn main() { 
    let vec = vec![1, 2, 3]; 
    let shared_rc = Rc::new(vec.clone()); 
    let shared_arc = Arc::new(vec);

    println!("Rc count: {}", Rc::strong_count(&shared_rc)); // Rc count: 1
    println!("Arc count: {}", Arc::strong_count(&shared_arc)); // Arc count: 1

} 
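The counts above stay at 1 because nothing is shared yet. As a small sketch of what sharing looks like, the hypothetical counts function below records the strong count before, during, and after a second owner exists:

```rust
use std::rc::Rc;

// Hypothetical helper: returns the strong count at three points in time.
fn counts() -> (usize, usize, usize) {
    let shared = Rc::new(vec![1, 2, 3]);
    let before = Rc::strong_count(&shared);

    let during = {
        // Rc::clone copies the pointer and bumps the count; the Vec is not duplicated.
        let second_owner = Rc::clone(&shared);
        assert_eq!(second_owner.len(), 3);
        Rc::strong_count(&shared)
    }; // second_owner is dropped here, so the count goes back down

    let after = Rc::strong_count(&shared);
    (before, during, after)
}

fn main() {
    println!("{:?}", counts());
}
```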

Summary

Rust’s data structures—from collections like Vec and HashMap to custom types with struct and enum—enable flexible, efficient, and safe data handling. With tools like Option and Result, Rust enforces a safety-first approach without compromising on performance, making it an ideal language for robust application development.

Learning Rust Part 4 - Concurrency

Introduction

Rust’s concurrency model provides a unique approach to safe parallel programming by eliminating data races and encouraging structured, reliable concurrent code. Through its ownership model, concurrency primitives, and async/await syntax, Rust enables developers to write efficient, parallel programs. In this post, we’ll explore Rust’s key tools and patterns for safe concurrency.

Threads and Thread Safety

Rust’s std::thread module allows developers to create threads, enabling programs to perform multiple tasks concurrently.

Creating Threads

Rust threads are created with std::thread::spawn, and they can run independently of the main thread. The join method is used to wait for threads to complete.

use std::thread;

fn main() {
    let handle = thread::spawn(|| {
        for i in 1..5 {
            println!("Thread: {}", i);
        }
    });

    for i in 1..5 {
        println!("Main: {}", i);
    }

    handle.join().unwrap(); // Wait for the thread to finish
}

Thread Safety

Rust’s ownership model ensures that data shared across threads is managed safely. Rust achieves this through two primary mechanisms:

  • Ownership Transfer: Data can be transferred to threads, where the original owner relinquishes control.
  • Immutable Sharing: If data is borrowed immutably, it can be accessed concurrently across threads without modification.
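Both mechanisms can be sketched with the standard library alone. The function names here are hypothetical; the second one assumes std::thread::scope, which is available in recent stable Rust:

```rust
use std::thread;

// Ownership transfer: `move` hands `data` to the spawned thread,
// so the caller can no longer use it afterwards.
fn sum_in_thread(data: Vec<i32>) -> i32 {
    thread::spawn(move || data.iter().sum::<i32>())
        .join()
        .unwrap()
}

// Immutable sharing: scoped threads may borrow `data` because they are
// guaranteed to finish before `scope` returns.
fn shared_len_and_sum(data: &[i32]) -> (usize, i32) {
    thread::scope(|s| {
        let len = s.spawn(|| data.len());
        let sum = s.spawn(|| data.iter().sum::<i32>());
        (len.join().unwrap(), sum.join().unwrap())
    })
}

fn main() {
    println!("{}", sum_in_thread(vec![1, 2, 3]));
    println!("{:?}", shared_len_and_sum(&[1, 2, 3, 4]));
}
```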

Concurrency Primitives (Mutex, RwLock)

Rust offers concurrency primitives, such as Mutex and RwLock, to allow safe mutable data sharing across threads.

Mutex (Mutual Exclusion)

A Mutex ensures that only one thread can access the data at a time. When using lock() on a Mutex, it returns a guard that releases the lock automatically when dropped.

use std::sync::{Mutex, Arc};
use std::thread;

fn main() {
    let data = Arc::new(Mutex::new(0));
    let mut handles = vec![];

    for _ in 0..10 {
        let data = Arc::clone(&data);
        let handle = thread::spawn(move || {
            let mut num = data.lock().unwrap();
            *num += 1;
        });
        handles.push(handle);
    }

    for handle in handles {
        handle.join().unwrap();
    }

    println!("Result: {}", *data.lock().unwrap());
}

RwLock (Read-Write Lock)

An RwLock allows multiple readers or a single writer, making it ideal for scenarios where data is read often but updated infrequently.

use std::sync::{RwLock, Arc};

fn main() {
    let data = Arc::new(RwLock::new(0));

    {
        let read_data = data.read().unwrap();
        println!("Read: {}", *read_data);
    }

    {
        let mut write_data = data.write().unwrap();
        *write_data += 1;
    }
}

Atomic Types

Atomic types like AtomicBool, AtomicIsize, and AtomicUsize enable lock-free, atomic operations on shared data, which is useful for simple counters or flags.

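use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

fn main() {
    // Arc gives each thread its own handle to the same counter;
    // thread::spawn cannot borrow a local variable directly.
    let counter = Arc::new(AtomicUsize::new(0));

    let handles: Vec<_> = (0..10).map(|_| {
        let counter = Arc::clone(&counter);
        thread::spawn(move || {
            counter.fetch_add(1, Ordering::SeqCst);
        })
    }).collect();

    for handle in handles {
        handle.join().unwrap();
    }

    println!("Counter: {}", counter.load(Ordering::SeqCst));
}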

Channel Communication

Rust’s channels, provided by the std::sync::mpsc module, allow message passing between threads. Channels provide safe communication without shared memory, following a multiple-producer, single-consumer pattern.

Creating and Using Channels

use std::sync::mpsc;
use std::thread;

fn main() {
    let (tx, rx) = mpsc::channel();

    thread::spawn(move || {
        let message = String::from("Hello from thread");
        tx.send(message).unwrap();
    });

    let received = rx.recv().unwrap();
    println!("Received: {}", received);
}

Multi-Threaded Producers

To enable multiple threads to send messages to the same receiver, you can clone the transmitter.

let tx = mpsc::Sender::clone(&tx);
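Put together, a multi-producer setup might look like the following sketch. The collect_from_workers function is a hypothetical name, and the results are sorted because the arrival order of messages is not deterministic:

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical helper: each worker thread sends one value to a shared receiver.
fn collect_from_workers(workers: u32) -> Vec<u32> {
    let (tx, rx) = mpsc::channel();

    for id in 0..workers {
        let tx = tx.clone(); // one sender per thread
        thread::spawn(move || {
            tx.send(id * 10).unwrap();
        });
    }

    // Drop the original sender; the receiver's iterator ends
    // once every clone has been dropped as well.
    drop(tx);

    let mut received: Vec<u32> = rx.iter().collect();
    received.sort();
    received
}

fn main() {
    println!("{:?}", collect_from_workers(3));
}
```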

Async/Await and Asynchronous Programming

Rust’s async/await syntax supports asynchronous programming, allowing tasks to pause (await) without blocking the entire thread. Async functions in Rust return Future types, which represent values available at a later time.

Defining and Using Async Functions

Calling an async function returns a Future; the function body does not run until that future is awaited or otherwise polled by an executor.

async fn fetch_data() -> u32 {
    42
}

#[tokio::main]
async fn main() {
    let data = fetch_data().await;
    println!("Data: {}", data);
}

.await suspends the current task until fetch_data() completes. The thread itself is not blocked, so the runtime can run other tasks in the meantime.
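
The laziness of futures can be observed without any runtime crate. The sketch below uses only the standard library; block_on and ThreadWaker are a deliberately tiny, hypothetical executor written for this illustration, and are not how Tokio is implemented.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// Wakes the thread that is blocked inside `block_on`.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Hypothetical minimal executor: polls one future to completion
/// on the current thread, parking while the future is pending.
fn block_on<F: Future>(future: F) -> F::Output {
    let mut future = pin!(future);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match future.as_mut().poll(&mut cx) {
            Poll::Ready(value) => return value,
            Poll::Pending => thread::park(),
        }
    }
}

async fn fetch_data(started: &AtomicBool) -> u32 {
    // Record that the body has actually begun executing.
    started.store(true, Ordering::SeqCst);
    42
}

/// Returns (ran before polling, value, ran after polling).
fn lazy_future_demo() -> (bool, u32, bool) {
    let started = AtomicBool::new(false);
    let future = fetch_data(&started); // creates the future, runs nothing
    let ran_before_polling = started.load(Ordering::SeqCst);
    let value = block_on(future); // polling drives the body
    let ran_after_polling = started.load(Ordering::SeqCst);
    (ran_before_polling, value, ran_after_polling)
}

fn main() {
    let (before, value, after) = lazy_future_demo();
    println!("ran before polling: {before}, value: {value}, ran after polling: {after}");
}
```

Creating the future leaves the flag unset; only driving it with the executor runs the body and produces 42.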

Combining Async Functions

Multiple async calls can be combined with tokio::join!, which polls them concurrently within the same task and returns once all of them have completed. This gives concurrency without additional threads.

async fn first() -> u32 { 10 }
async fn second() -> u32 { 20 }

async fn run() {
    let (a, b) = tokio::join!(first(), second());
    println!("Sum: {}", a + b);
}

Task-Based Concurrency with Tokio and async-std

Rust offers runtime libraries like Tokio and async-std for task-based concurrency, providing asynchronous runtimes suited for managing complex async workflows.

Using Tokio

Tokio is a popular async runtime, offering tools for task management, timers, and network I/O.

use tokio::time::{sleep, Duration};

#[tokio::main]
async fn main() {
    let handle = tokio::spawn(async {
        sleep(Duration::from_secs(1)).await;
        println!("Task completed!");
    });

    handle.await.unwrap();
}

async-std Example

async-std offers similar functionality through an API modeled on the standard library.

use async_std::task;
use std::time::Duration;

fn main() {
    task::block_on(async {
        task::sleep(Duration::from_secs(1)).await;
        println!("Task completed!");
    });
}

Summary

Rust’s concurrency model provides robust tools for safe multithreading and asynchronous programming. By combining threads, async/await syntax, and concurrency primitives like Mutex and RwLock, Rust enables safe data sharing and task-based concurrency, making it a powerful choice for high-performance concurrent applications.

Learning Rust Part 16 - Interoperability

Introduction

Rust’s FFI (Foreign Function Interface) capabilities and rich library support enable it to integrate seamlessly with other languages like C, C++, and Python. Rust can also produce shared libraries and handle various data interchange formats such as JSON, Protobuf, and MsgPack, making it a great choice for cross-language applications and APIs. This post covers essential tools and techniques for interfacing Rust with other languages.

FFI with C and C++

Rust’s FFI makes it possible to interact directly with C libraries, letting Rust leverage existing C code or integrate with languages like C++. The extern keyword and the libc crate facilitate this interoperability.

Calling C Functions from Rust

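To call a C function, declare it in an extern "C" block, and add a #[link(name = "...")] attribute when the library is not already linked into the program. The example below declares sqrt from the C math library, which is typically linked by default, so no #[link] attribute is needed. Under the Rust 2024 edition the block must be written as unsafe extern "C".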

extern "C" {
    fn sqrt(x: f64) -> f64;
}

fn main() {
    let x = 25.0;
    unsafe {
        println!("sqrt({}) = {}", x, sqrt(x));
    }
}
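
A common pattern is to wrap each foreign call in a small safe function, so that the unsafe block and its preconditions live in one place. The sketch below assumes Rust 1.82 or newer (for the unsafe extern spelling) and a platform C library that provides abs and strlen. The wrapper names c_abs and c_strlen are made up for this example.

```rust
use std::ffi::{c_char, c_int, CString};

// Declarations for two functions from the platform's C library.
unsafe extern "C" {
    fn abs(value: c_int) -> c_int;
    fn strlen(s: *const c_char) -> usize;
}

/// Safe wrapper around C `abs`. Returns `None` for `i32::MIN`,
/// whose absolute value is not representable (undefined behavior in C).
fn c_abs(value: i32) -> Option<i32> {
    if value == i32::MIN {
        return None;
    }
    // SAFETY: `abs` has no other preconditions.
    Some(unsafe { abs(value) })
}

/// Safe wrapper around C `strlen`. Returns `None` if `text` contains an
/// interior NUL byte and so cannot be turned into a C string.
fn c_strlen(text: &str) -> Option<usize> {
    let c_text = CString::new(text).ok()?;
    // SAFETY: `c_text` is NUL-terminated and outlives the call.
    Some(unsafe { strlen(c_text.as_ptr()) })
}

fn main() {
    println!("abs(-7) = {:?}", c_abs(-7));
    println!("strlen(\"hello\") = {:?}", c_strlen("hello"));
}
```

Callers of the wrappers never need to write unsafe themselves, and the invariants each call relies on are documented next to it.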

Exposing Rust Functions to C

To expose Rust functions for use in C, mark them #[no_mangle] and declare them as extern "C". The attribute keeps the symbol name exactly as written, and extern "C" selects the C calling convention.

#[no_mangle]
pub extern "C" fn rust_add(a: i32, b: i32) -> i32 {
    a + b
}
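
Functions that accept pointers from C need a little more care than rust_add. Here is a hedged sketch of a function taking a pointer and a length. The name rust_sum and its null-handling policy are assumptions for illustration, and the #[unsafe(no_mangle)] spelling assumes Rust 1.82 or newer.

```rust
/// Sums `len` values starting at `ptr`; returns 0 for a null pointer or empty input.
///
/// # Safety
/// If non-null, `ptr` must point to `len` readable, properly aligned `i32` values.
#[unsafe(no_mangle)]
pub unsafe extern "C" fn rust_sum(ptr: *const i32, len: usize) -> i64 {
    if ptr.is_null() || len == 0 {
        return 0;
    }
    // SAFETY: the caller guarantees `ptr` is valid for `len` elements.
    let values = unsafe { std::slice::from_raw_parts(ptr, len) };
    // Widen to i64 so the sum of many i32 values cannot overflow easily.
    values.iter().map(|&v| i64::from(v)).sum()
}

fn main() {
    // Called from Rust here for demonstration; a C caller would pass its own array.
    let values = [1, 2, 3, 4];
    let total = unsafe { rust_sum(values.as_ptr(), values.len()) };
    println!("total = {total}");
}
```

On the C side, the matching declaration would be int64_t rust_sum(const int32_t *ptr, size_t len);.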

Interfacing with C++ using the cxx crate

The cxx crate provides an interface for calling C++ code from Rust and vice versa, handling C++ types like std::string and std::vector.

Add cxx to Cargo.toml and define a C++ bridge file (bridge.rs):

#[cxx::bridge]
mod ffi {
    extern "C++" {
        include!("example.h");
        fn cpp_function(x: i32) -> i32;
    }
}

fn main() {
    let result = ffi::cpp_function(42);
    println!("Result from C++: {}", result);
}

Rust and Python Interfacing with pyo3

The pyo3 crate allows Rust to execute Python code, call Python functions, and even create Python modules directly from Rust.

Calling Python Code from Rust

Use pyo3 to execute Python code within Rust. First, add pyo3 to Cargo.toml:

[dependencies]
pyo3 = { version = "0.15", features = ["auto-initialize"] }

Then, write a Rust function that interacts with Python:

use pyo3::prelude::*;

fn main() -> PyResult<()> {
    Python::with_gil(|py| {
        let sys = py.import("sys")?;
        let version: String = sys.getattr("version")?.extract()?;
        println!("Python version: {}", version);
        Ok(())
    })
}

Building a Python Module in Rust

Rust can also create native Python modules. Annotate functions with #[pyfunction] and use #[pymodule] to define the module.

use pyo3::prelude::*;

#[pyfunction]
fn sum_as_string(a: i64, b: i64) -> PyResult<String> {
    Ok((a + b).to_string())
}

#[pymodule]
fn my_rust_module(_py: Python, m: &PyModule) -> PyResult<()> {
    m.add_function(wrap_pyfunction!(sum_as_string, m)?)?;
    Ok(())
}

Build this as a shared library (crate-type = ["cdylib"], with pyo3’s extension-module feature enabled), and it can be imported into Python just like a native module.

Building Shared Libraries

Rust can produce shared libraries (e.g., .dll on Windows, .so on Linux, and .dylib on macOS), making it easy to share Rust code across multiple languages.

Compiling Rust to a Shared Library

To build a Rust project as a shared library, set the crate-type in Cargo.toml:

[lib]
crate-type = ["cdylib"]

Then build the library with:

cargo build --release

This generates a .dll, .so, or .dylib file, depending on your operating system, which other languages can link to and use.

Using the Shared Library

From another language, import the shared library and call its functions. For instance, in Python, you can use ctypes to load and call functions from the Rust shared library:

import ctypes

lib = ctypes.CDLL('./target/release/libmy_rust_lib.so')
result = lib.rust_add(10, 20)
print(f"Result from Rust: {result}")

Using Rust with Other Languages

Rust can interface with languages like JavaScript, Ruby, and Go by using FFI or compiling Rust to WebAssembly or shared libraries.

WebAssembly (Wasm) for JavaScript Interoperability

WebAssembly allows Rust code to run in the browser or JavaScript environments. Using wasm-bindgen, Rust functions can be exposed to JavaScript.

Add wasm-bindgen as a dependency in Cargo.toml, then annotate the functions you want to export with #[wasm_bindgen]:

use wasm_bindgen::prelude::*;

#[wasm_bindgen]
pub fn greet(name: &str) -> String {
    format!("Hello, {}!", name)
}

Build the Rust code as WebAssembly and import it in JavaScript, making Rust interoperable with frontend applications.

Data Interchange Formats (JSON, Protobuf, MsgPack)

Rust supports serialization formats that allow data interchange with other systems and languages.

JSON with serde_json

The serde_json crate is the standard for JSON serialization and deserialization in Rust.

use serde::{Serialize, Deserialize};

// Debug is required to print the deserialized value with {:?}.
#[derive(Serialize, Deserialize, Debug)]
struct User {
    id: u32,
    name: String,
}

fn main() -> serde_json::Result<()> {
    let user = User { id: 1, name: "Alice".to_string() };
    let json = serde_json::to_string(&user)?;
    println!("Serialized JSON: {}", json);

    let deserialized: User = serde_json::from_str(&json)?;
    println!("Deserialized: {:?}", deserialized);
    Ok(())
}

Protobuf with prost

Google’s Protocol Buffers (Protobuf) is a fast, language-agnostic format used for efficient data serialization. Rust’s prost crate generates Rust types from .proto files.

Define a .proto file for your data structures and use prost-build (typically from build.rs) to generate Rust types. The generated structs resemble this hand-written equivalent:

use prost::Message;

#[derive(Message)]
struct User {
    #[prost(uint32, tag = "1")]
    pub id: u32,
    #[prost(string, tag = "2")]
    pub name: String,
}

MsgPack with rmp-serde

MsgPack is a compact, binary format for data serialization, providing efficiency for high-performance applications. rmp-serde allows Rust to serialize and deserialize MsgPack data using serde.

use serde::{Serialize, Deserialize};
use rmp_serde::{to_vec, from_slice};

#[derive(Serialize, Deserialize, Debug)]
struct User {
    id: u32,
    name: String,
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let user = User { id: 1, name: "Alice".to_string() };
    let msgpack = to_vec(&user)?; // Serialize to MsgPack

    let deserialized: User = from_slice(&msgpack)?; // Deserialize
    println!("Deserialized: {:?}", deserialized);
    Ok(())
}

Summary

Rust’s interoperability capabilities make it ideal for building cross-language applications. Whether through FFI, shared libraries, or data interchange formats like JSON and Protobuf, Rust can integrate seamlessly with various ecosystems, enabling it to act as a high-performance backend or computational layer in multi-language projects.