Rust, known for its performance, memory safety, and low-level control, is gaining traction in domains traditionally
dominated by Python, such as machine learning (ML). While Python is the go-to for prototyping ML models due to its
mature ecosystem, Rust shines in scenarios demanding high performance, safety, and seamless system-level integration.
In this post, we’ll explore how to implement logistic regression in Rust and discuss the implications of the model’s
output.
Why use Rust?
Before diving into code, it’s worth asking: why choose Rust for ML when Python’s libraries like TensorFlow and PyTorch
exist?
Benefits of Rust:
Performance: Rust offers near-C speeds, making it ideal for performance-critical tasks.
Memory Safety: Its ownership model ensures memory safety, preventing bugs like segmentation faults and data races.
Integration: Rust can easily integrate with low-level systems, making it a great choice for embedding ML models into IoT, edge devices, or game engines.
Control: Rust provides fine-grained control over execution, allowing developers to optimize their models at a deeper level.
While Rust’s ML ecosystem is still evolving, libraries like ndarray, linfa, and smartcore provide foundational
tools for implementing machine learning models.
Logistic Regression
Logistic regression is a simple yet powerful algorithm for binary classification. It predicts whether a data point
belongs to class 0 or 1 based on a weighted sum of features passed through a sigmoid function.
Below is a Rust implementation of logistic regression using the ndarray crate for numerical operations.
use ndarray::{Array1, Array2};
use ndarray_rand::RandomExt;
use ndarray_rand::rand_distr::Uniform;

fn sigmoid(x: f64) -> f64 {
    1.0 / (1.0 + (-x).exp())
}

fn logistic_regression(
    x: &Array2<f64>,
    y: &Array1<f64>,
    learning_rate: f64,
    epochs: usize,
) -> Array1<f64> {
    let (n_samples, n_features) = x.dim();
    let mut weights = Array1::<f64>::random(n_features, Uniform::new(-0.01, 0.01));
    let mut bias = 0.0;

    for _ in 0..epochs {
        let linear_model = x.dot(&weights) + bias;
        let predictions = linear_model.mapv(sigmoid);

        // Compute the error
        let error = &predictions - y;

        // Compute gradients
        let gradient_weights = x.t().dot(&error) / n_samples as f64;
        let gradient_bias = error.sum() / n_samples as f64;

        // Update weights and bias
        weights -= &(learning_rate * gradient_weights);
        bias -= learning_rate * gradient_bias;
    }

    weights
}

fn main() {
    let x = Array2::random((100, 2), Uniform::new(-1.0, 1.0)); // Random features
    let y = Array1::random(100, Uniform::new(0.0, 1.0))
        .mapv(|v| if v > 0.5 { 1.0 } else { 0.0 }); // Random labels

    let weights = logistic_regression(&x, &y, 0.01, 1000);
    println!("Trained Weights: {:?}", weights);
}
Key Concepts:
Sigmoid Function: Converts the linear combination of inputs into a value between 0 and 1.
Gradient Descent: Updates weights and bias iteratively to minimize the error between predictions and actual labels.
Random Initialization: Weights start with small random values and are fine-tuned during training.
Output
When you run the code, the trained weights are printed to the console. To turn those weights into class predictions, pass each sample’s weighted sum through the sigmoid: predictions close to 1 indicate class 1, while predictions close to 0 indicate class 0.
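As a hypothetical sketch of that final step (the function above returns only the weights, so this assumes the bias term is also kept around), prediction might look like:

fn predict(x: &Array2<f64>, weights: &Array1<f64>, bias: f64) -> Array1<f64> {
    // Apply the learned linear model, squash through the sigmoid,
    // and threshold at 0.5 to produce hard class labels.
    // Assumes `bias` is returned alongside the weights.
    (x.dot(weights) + bias).mapv(|s| if sigmoid(s) >= 0.5 { 1.0 } else { 0.0 })
}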
Why Does This Matter?
This simple implementation demonstrates the flexibility and control Rust provides for machine learning tasks. While
Python excels in rapid prototyping, Rust’s performance and safety make it ideal for deploying models in production,
especially in resource-constrained or latency-critical environments.
When Should You Use Rust for ML?
Rust is a great choice if:
Performance is critical: For example, in real-time systems or embedded devices.
Memory safety is a priority: Rust eliminates common bugs like memory leaks.
Integration with system-level components is needed: Rust can seamlessly work in environments where Python may not be ideal.
Custom ML implementations are needed: You want more control over how the algorithms are built and optimized.
For research or quick prototyping, Python remains the best choice due to its rich ecosystem and community. However,
for production-grade systems, Rust’s strengths make it a compelling alternative.
Conclusion
While Rust’s machine learning ecosystem is still maturing, it’s already capable of handling fundamental ML tasks like
logistic regression. By combining performance, safety, and control, Rust offers a unique proposition for ML developers
building high-performance or production-critical applications.
Loss functions are the unsung heroes of machine learning. They guide the
learning process by quantifying the difference between the predicted and actual outputs. While frameworks like
PyTorch and TensorFlow offer a plethora of standard loss
functions such as Cross-Entropy and
Mean Squared Error, there are times when a custom loss function
is necessary.
In this post, we’ll explore the why and how of custom loss functions by:
Setting up a simple neural network.
Using standard loss functions to train the model.
Introducing and implementing custom loss functions tailored to specific needs.
Pre-reqs
Before we begin, you’ll need to set up a Python project and install some dependencies. We’ll be using PyTorch and
torchvision. To install these dependencies, use the following command:
pip install torch torchvision
Once installed, verify the installation by running:
python -c "import torch; print(torch.__version__)"
Network Setup
Let’s start by creating a simple neural network to classify data. For simplicity, we’ll use a toy dataset like the
MNIST digits dataset.
Dataset Preparation
Use the MNIST dataset (handwritten digits) as an example.
Normalize the dataset for faster convergence during training.
import torch
import torch.optim as optim
from torchvision import datasets, transforms

# Data preparation
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))
])
train_data = datasets.MNIST(root='./data', train=True, transform=transform, download=True)
train_loader = torch.utils.data.DataLoader(train_data, batch_size=64, shuffle=True)
Model Architecture
Input layer flattens the 28x28 pixel images into a single vector.
Two hidden layers with 128 and 64 neurons, each followed by a ReLU activation.
An output layer with 10 neurons (one for each digit) and no activation (handled by the loss function).
# Simple Neural Network
import torch.nn as nn

class SimpleNN(nn.Module):
    def __init__(self):
        super(SimpleNN, self).__init__()
        self.fc1 = nn.Linear(28 * 28, 128)
        self.fc2 = nn.Linear(128, 64)
        self.fc3 = nn.Linear(64, 10)

    def forward(self, x):
        x = x.view(x.size(0), -1)  # Flatten the input
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        x = self.fc3(x)
        return x
Training Setup
Use an optimizer (e.g., Adam) and CrossEntropyLoss for training.
Loop over the dataset for a fixed number of epochs, computing loss and updating weights.
# Initialize model, optimizer, and device
model = SimpleNN()
optimizer = optim.Adam(model.parameters(), lr=0.001)
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model.to(device)
Standard Loss
Let’s train the model using the standard Cross-Entropy Loss, which is suitable for classification tasks.
Combines log_softmax and negative log likelihood into one step.
Suitable for classification tasks as it penalizes incorrect predictions heavily.
# Standard loss function
criterion = nn.CrossEntropyLoss()

# Training loop
def train_model(model, train_loader, criterion, optimizer, epochs=5):
    model.train()
    for epoch in range(epochs):
        total_loss = 0
        for images, labels in train_loader:
            images, labels = images.to(device), labels.to(device)

            # Forward pass
            outputs = model(images)
            loss = criterion(outputs, labels)

            # Backward pass and optimization
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

            total_loss += loss.item()

        print(f'Epoch {epoch+1}/{epochs}, Loss: {total_loss/len(train_loader):.4f}')

train_model(model, train_loader, criterion, optimizer)
During training, the loop prints the average loss for each epoch, and you should see it decrease steadily as the model learns.
Standard loss functions, however, may not work well in cases like:
Imbalanced Datasets: Classes have significantly different frequencies.
Multi-Task Learning: Different tasks require different weights.
Task-Specific Goals: Optimizing for metrics like precision or recall rather than accuracy.
Example: Weighted Loss
Suppose we want to penalize misclassifying certain classes more heavily. We can achieve this by implementing a
weighted Cross-Entropy Loss.
# Custom weighted loss function
class WeightedCrossEntropyLoss(nn.Module):
    def __init__(self, class_weights):
        super(WeightedCrossEntropyLoss, self).__init__()
        self.class_weights = torch.tensor(class_weights).to(device)

    def forward(self, outputs, targets):
        log_probs = torch.log_softmax(outputs, dim=1)
        loss = -torch.sum(
            self.class_weights[targets] * log_probs[range(len(targets)), targets]
        ) / len(targets)
        return loss

# Example: Higher weight for class 0
class_weights = [2.0 if i == 0 else 1.0 for i in range(10)]
custom_criterion = WeightedCrossEntropyLoss(class_weights)

# Training with custom loss function
train_model(model, train_loader, custom_criterion, optimizer)
After running this, training proceeds as before, but the reported loss now penalizes misclassifications of class 0 twice as heavily.
Example: Combined Loss
Sometimes, you might want to combine multiple objectives into a single loss function.
# Custom loss combining Cross-Entropy and L1 regularization
class CombinedLoss(nn.Module):
    def __init__(self, alpha=0.1):
        super(CombinedLoss, self).__init__()
        self.ce_loss = nn.CrossEntropyLoss()
        self.alpha = alpha

    def forward(self, outputs, targets, model):
        ce_loss = self.ce_loss(outputs, targets)
        l1_loss = sum(torch.sum(torch.abs(param)) for param in model.parameters())
        return ce_loss + self.alpha * l1_loss

custom_criterion = CombinedLoss(alpha=0.01)

# Training with combined loss
train_model(model, train_loader,
            lambda outputs, targets: custom_criterion(outputs, targets, model),
            optimizer)
Comparing Results
To compare the results of standard and custom loss functions, you need to evaluate the following:
Training Loss: Plot the loss per epoch for both standard and custom loss functions.
Accuracy: Measure training and validation accuracy after each epoch, and compare how well the model performs in predicting each class.
Precision and Recall: Useful for imbalanced datasets to measure performance on minority classes.
Visualization: A confusion matrix shows how often each class is misclassified, while a loss curve shows convergence speed and stability for different loss functions.
We can use graphs to visualise how these metrics perform:
from sklearn.metrics import classification_report, confusion_matrix
import matplotlib.pyplot as plt
import numpy as np

# After training
model.eval()
all_preds, all_labels = [], []
with torch.no_grad():
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        outputs = model(images)
        preds = torch.argmax(outputs, dim=1)
        all_preds.extend(preds.cpu().numpy())
        all_labels.extend(labels.cpu().numpy())

# Confusion Matrix
cm = confusion_matrix(all_labels, all_preds)
plt.imshow(cm, cmap='Blues')
plt.title('Confusion Matrix')
plt.colorbar()
plt.show()

# Classification Report
print(classification_report(all_labels, all_preds))
We can also produce visualisations of our loss curves:
# Assuming loss values are stored during training
plt.plot(range(len(train_losses)), train_losses, label="Standard Loss")
plt.plot(range(len(custom_losses)), custom_losses, label="Custom Loss")
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.title('Loss Curve')
plt.show()
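Note that train_model as written only prints the loss. To populate train_losses and custom_losses, a small variant (a sketch, not part of the original code) can record and return the per-epoch averages:

# Sketch: identical to train_model, but records the average loss per epoch
def train_model_with_history(model, train_loader, criterion, optimizer, epochs=5):
    model.train()
    losses = []
    for epoch in range(epochs):
        total_loss = 0
        for images, labels in train_loader:
            images, labels = images.to(device), labels.to(device)
            outputs = model(images)
            loss = criterion(outputs, labels)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            total_loss += loss.item()
        losses.append(total_loss / len(train_loader))
    return losses

# e.g. train_losses = train_model_with_history(model, train_loader, nn.CrossEntropyLoss(), optimizer)
# and custom_losses likewise with the custom criterion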
Conclusion
Custom loss functions empower you to fine-tune your neural networks for unique problems. By carefully designing and
experimenting with loss functions, you can align your model’s learning process with the specific goals of your
application.
Some closing tips for custom loss functions:
Always start with a simple baseline (e.g., Cross-Entropy Loss) to understand your model’s behavior.
Visualize performance across metrics, especially when using weighted or multi-objective losses.
Experiment with different weights and loss combinations to find the optimal setup for your task.
The key is to balance complexity and interpretability—sometimes, even simple tweaks can significantly impact
performance.
When building high-performance software, caches often play a vital role in optimizing performance by reducing redundant
computations or avoiding repeated I/O operations. One such common caching strategy is the Least Recently Used (LRU)
cache, which ensures that the most recently accessed data stays available while evicting the least accessed items when
space runs out.
What Is an LRU Cache?
At its core, an LRU cache stores a limited number of key-value pairs. When you access or insert an item:
If the item exists, it is marked as “recently used.”
If the item doesn’t exist and the cache is full, the least recently used item is evicted to make space for the new one.
LRU caches are particularly useful in scenarios where access patterns favor recently used data, such as:
Web page caching in browsers.
Database query caching for repeated queries.
API response caching to reduce repeated external requests.
In this post, we’ll build a simple and functional implementation of an LRU cache in Rust. Instead of diving into
complex data structures like custom linked lists, we’ll leverage Rust’s standard library collections
(HashMap and VecDeque) to achieve:
Constant-time access and updates using HashMap.
Efficient tracking of usage order with VecDeque.
This straightforward approach is easy to follow and demonstrates Rust’s powerful ownership model and memory safety.
LRUCache Structure
We’ll begin with a struct that defines the cache:
pub struct LRUCache<K, V> {
    capacity: usize,      // Maximum number of items the cache can hold
    map: HashMap<K, V>,   // Key-value store
    order: VecDeque<K>,   // Tracks the order of key usage
}
This structure holds:
capacity: The maximum number of items the cache can store.
map: The main storage for key-value pairs.
order: A queue to maintain the usage order of keys.
Implementation
Our implementation of LRUCache includes some constraints on the generic types K (key) and V (value). Specifically,
the K type requires the following traits:
The Clone trait allows us to create a copy of the key when needed (via .clone()). Eq ensures that keys can be compared for equality and are either strictly equal or not. The Hash trait enables us to hash the keys, which is a requirement for using HashMap, and finally the PartialEq trait allows for equality comparisons between two keys.
Technically, Eq already implies PartialEq, but we explicitly include it here for clarity.
HashMap::with_capacity: Preallocates space for the HashMap to avoid repeated resizing.
VecDeque::with_capacity: Allocates space for tracking key usage.
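The constructor itself isn’t shown above; a minimal sketch, assuming the methods that follow live in the same impl block, looks like this:

use std::collections::{HashMap, VecDeque};
use std::hash::Hash;

impl<K: Clone + Eq + Hash + PartialEq, V> LRUCache<K, V> {
    // Preallocate both collections to the cache's capacity
    pub fn new(capacity: usize) -> Self {
        LRUCache {
            capacity,
            map: HashMap::with_capacity(capacity),
            order: VecDeque::with_capacity(capacity),
        }
    }
}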
Value access via get
The get method retrieves a value by key and updates its usage order:
pub fn get(&mut self, key: &K) -> Option<&V> {
    if self.map.contains_key(key) {
        // Move the key to the back of the order queue
        self.order.retain(|k| k != key);
        self.order.push_back(key.clone());
        self.map.get(key)
    } else {
        None
    }
}
Check if the key exists via contains_key
Remove the key from its old position in order and push it to the back
Return the value from the HashMap
In cases where a value never existed or has been evicted, this function returns None to the caller.
Value insertion via put
The put method adds a new key-value pair or updates an existing one:
pub fn put(&mut self, key: K, value: V) {
    if self.map.contains_key(&key) {
        // Update existing key's value and mark it as most recently used
        self.map.insert(key.clone(), value);
        self.order.retain(|k| k != &key);
        self.order.push_back(key);
    } else {
        if self.map.len() == self.capacity {
            // Evict the least recently used item
            if let Some(lru_key) = self.order.pop_front() {
                self.map.remove(&lru_key);
            }
        }
        self.map.insert(key.clone(), value);
        self.order.push_back(key);
    }
}
If the key exists:
The value is updated in map
The key is moved to the back of order
If the key is new and the cache is full:
The least recently used key (the front of order) is removed from map
The new key-value pair is then inserted and marked as most recently used
Size
Finally, we add a helper method to get the current size of the cache.
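A minimal sketch of that helper simply delegates to the underlying map (the full version is in the gist linked below):

pub fn len(&self) -> usize {
    // The map always reflects the cache's current contents
    self.map.len()
}

Putting it all together, usage looks like this:

let mut cache: LRUCache<&str, i32> = LRUCache::new(2);
cache.put("a", 1);
cache.put("b", 2);
cache.put("c", 3); // capacity reached, so "a" (least recently used) is evicted
assert_eq!(cache.get(&"a"), None);
assert_eq!(cache.len(), 2);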
Conclusion
In this post, we built a simple yet functional LRU cache in Rust. A full implementation can be found as a gist here.
While this implementation is perfect for understanding the basic principles, it can be extended further with:
Thread safety using synchronization primitives like Mutex or RwLock.
Custom linked structures for more efficient eviction and insertion.
Diagnostics and monitoring to observe cache performance in real-world scenarios.
If you’re looking for a robust cache for production, libraries like lru offer feature-rich implementations. But for
learning purposes, rolling your own cache is an excellent way to dive deep into Rust’s collections and ownership model.
Network packet sniffing is an essential skill in the toolbox of any systems programmer or network engineer. It enables
us to inspect network traffic, debug communication issues, and even learn how various networking protocols function
under the hood.
In this article, we will walk through the process of building a simple network packet sniffer in C using raw sockets.
Before we begin, it might help to run through a quick networking primer.
OSI and Networking Layers
Before diving into the code, let’s briefly revisit the OSI model—a conceptual framework that standardizes network
communication into seven distinct layers:
Physical Layer: Deals with the physical connection and transmission of raw data bits.
Data Link Layer: Responsible for framing and MAC addressing. Ethernet operates at this layer.
Network Layer: Handles logical addressing (IP addresses) and routing. This layer is where IP packets are structured.
Transport Layer: Ensures reliable data transfer with protocols like TCP and UDP.
Session Layer: Manages sessions between applications.
Presentation Layer: Transforms data formats (e.g., encryption, compression).
Application Layer: Interfaces directly with the user (e.g., HTTP, FTP).
Our packet sniffer focuses on Layers 2 through 4. By analyzing Ethernet, IP, TCP, UDP, and ICMP headers, we gain
insights into packet structure and how data travels across a network.
The Code
In this section, we’ll run through the functions that are needed to implement our packet sniffer. The layers that we’ll
focus on are:
Layer 2 (Data Link): Capturing raw Ethernet frames and extracting MAC addresses.
Layer 3 (Network): Parsing IP headers for source and destination IPs.
Layer 4 (Transport): Inspecting TCP, UDP, and ICMP protocols to understand port-level communication and message types.
Layer 2 (Data Link)
The Data Link Layer is responsible for the physical addressing of devices on a network. It includes the Ethernet
header, which contains the source and destination MAC addresses. In this section, we analyze and print the Ethernet
header.
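The original listing for this helper isn’t reproduced here, but a sketch consistent with the IP helper below would be:

void print_ethernet_header(unsigned char *buffer, int size) {
    // Sketch only: assumes <linux/if_ether.h> and <arpa/inet.h> are included
    struct ethhdr *eth = (struct ethhdr *)buffer;
    printf("\nEthernet Header\n");
    printf(" |-Source MAC      : %02X:%02X:%02X:%02X:%02X:%02X\n",
           eth->h_source[0], eth->h_source[1], eth->h_source[2],
           eth->h_source[3], eth->h_source[4], eth->h_source[5]);
    printf(" |-Destination MAC : %02X:%02X:%02X:%02X:%02X:%02X\n",
           eth->h_dest[0], eth->h_dest[1], eth->h_dest[2],
           eth->h_dest[3], eth->h_dest[4], eth->h_dest[5]);
    printf(" |-Protocol        : %u\n", ntohs(eth->h_proto));
}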
Layer 3 (Network)
The Network Layer handles logical addressing and routing. In our code, this corresponds to the IP header, where we extract source and destination IP addresses.
void print_ip_header(unsigned char *buffer, int size) {
    struct iphdr *ip = (struct iphdr *)(buffer + sizeof(struct ethhdr));
    printf("\nIP Header\n");
    printf(" |-Source IP       : %s\n", inet_ntoa(*(struct in_addr *)&ip->saddr));
    printf(" |-Destination IP  : %s\n", inet_ntoa(*(struct in_addr *)&ip->daddr));
    printf(" |-Protocol        : %d\n", ip->protocol);
}
Here, we use the iphdr structure to parse the IP header. The inet_ntoa function converts the source and destination
IP addresses from binary format to a human-readable string.
Layer 4 (Transport)
The Transport Layer manages end-to-end data transfer and includes protocols like TCP, UDP, and ICMP.
We have specific functions to parse and display these packets:
The TCP version of this function prints the packet’s source and destination ports, along with the sequence and acknowledgement numbers that are key features of this protocol.
void print_tcp_packet(unsigned char *buffer, int size) {
    struct iphdr *ip = (struct iphdr *)(buffer + sizeof(struct ethhdr));
    struct tcphdr *tcp = (struct tcphdr *)(buffer + sizeof(struct ethhdr) + ip->ihl * 4);
    printf("\nTCP Packet\n");
    print_ip_header(buffer, size);
    printf("\n |-Source Port       : %u\n", ntohs(tcp->source));
    printf(" |-Destination Port  : %u\n", ntohs(tcp->dest));
    printf(" |-Sequence Number   : %u\n", ntohl(tcp->seq));
    printf(" |-Acknowledgement   : %u\n", ntohl(tcp->ack_seq));
}
The UDP counterpart doesn’t have sequencing or acknowledgement, as it’s a connectionless protocol with no delivery guarantees.
void print_udp_packet(unsigned char *buffer, int size) {
    struct iphdr *ip = (struct iphdr *)(buffer + sizeof(struct ethhdr));
    struct udphdr *udp = (struct udphdr *)(buffer + sizeof(struct ethhdr) + ip->ihl * 4);
    printf("\nUDP Packet\n");
    print_ip_header(buffer, size);
    printf("\n |-Source Port       : %u\n", ntohs(udp->source));
    printf(" |-Destination Port  : %u\n", ntohs(udp->dest));
    printf(" |-Length            : %u\n", ntohs(udp->len));
}
ICMP’s type, code, and checksum are used in the verification process of this protocol.
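The ICMP helper isn’t shown in full here; a sketch in the same style as the TCP and UDP functions might be:

void print_icmp_packet(unsigned char *buffer, int size) {
    // Sketch only: assumes <linux/icmp.h> is included
    struct iphdr *ip = (struct iphdr *)(buffer + sizeof(struct ethhdr));
    struct icmphdr *icmp = (struct icmphdr *)(buffer + sizeof(struct ethhdr) + ip->ihl * 4);
    printf("\nICMP Packet\n");
    print_ip_header(buffer, size);
    printf("\n |-Type     : %d\n", icmp->type);
    printf(" |-Code     : %d\n", icmp->code);
    printf(" |-Checksum : %d\n", ntohs(icmp->checksum));
}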
The architecture of this code is fairly simple. The main function sets up a loop which continually receives raw information from the socket. From there, a determination is made about which protocol the packet carries, and we dispatch to a function that specialises in that layer.
int main() {
    int sock_raw;
    struct sockaddr saddr;
    socklen_t saddr_len = sizeof(saddr);
    unsigned char *buffer = (unsigned char *)malloc(BUFFER_SIZE);

    if (buffer == NULL) {
        perror("Failed to allocate memory");
        return 1;
    }

    sock_raw = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
    if (sock_raw < 0) {
        perror("Socket Error");
        free(buffer);
        return 1;
    }

    printf("Starting packet sniffer...\n");

    while (1) {
        int data_size = recvfrom(sock_raw, buffer, BUFFER_SIZE, 0, &saddr, &saddr_len);
        if (data_size < 0) {
            perror("Failed to receive packets");
            break;
        }
        process_packet(buffer, data_size);
    }

    close(sock_raw);
    free(buffer);
    return 0;
}
The recvfrom call receives the raw bytes from the socket.
The process_packet function is responsible for dispatching the incoming information. It is essentially a switch statement over the packet’s protocol.
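The original listing isn’t reproduced here; a sketch of that dispatcher (using the hedged print_icmp_packet from earlier) looks like this:

void process_packet(unsigned char *buffer, int size) {
    // Read the protocol field from the IP header and dispatch accordingly
    struct iphdr *ip = (struct iphdr *)(buffer + sizeof(struct ethhdr));
    switch (ip->protocol) {
        case IPPROTO_TCP:
            print_tcp_packet(buffer, size);
            break;
        case IPPROTO_UDP:
            print_udp_packet(buffer, size);
            break;
        case IPPROTO_ICMP:
            print_icmp_packet(buffer, size);
            break;
        default:
            // Other protocols are simply ignored by this sniffer
            break;
    }
}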
Because of the nature of the information this application pulls from your system, you will need to run it as root: raw sockets require that low-level access to your networking stack.
sudo ./psniff
Conclusion
Building a network packet sniffer using raw sockets in C offers valuable insight into how data flows through the
network stack and how different protocols interact. By breaking down packets layer by layer—from the Data Link Layer
(Ethernet) to the Transport Layer (TCP, UDP, ICMP)—we gain a deeper understanding of networking concepts and
system-level programming.
This project demonstrates key topics such as:
Capturing raw packets using sockets.
Parsing headers to extract meaningful information.
Mapping functionality to specific OSI layers.
Packet sniffers like this are not only useful for learning but also serve as foundational tools for network
diagnostics, debugging, and security monitoring. However, it’s essential to use such tools ethically and responsibly,
adhering to legal and organizational guidelines.
In the future, we could extend this sniffer by writing packet payloads to a file, adding packet filtering (e.g., only
capturing HTTP or DNS traffic), or even integrating with libraries like libpcap for more advanced use cases.
A full gist of this code is available to check out.
In this tutorial, we will explore how to write a Linux kernel module that intercepts system calls using kernel probes
(kprobes).
Instead of modifying the syscall table—a risky and outdated approach—we will use kprobes, an officially supported and
safer method to trace and modify kernel behavior dynamically.
What Are System Calls?
System calls are the primary mechanism by which user-space applications interact with the operating system’s kernel.
They provide a controlled gateway to hardware and kernel services. For example, opening a file uses the open syscall,
while reading data from it uses the read syscall.
What Are Kernel Probes?
Kprobes are a powerful debugging and tracing mechanism in the Linux kernel. They allow developers to dynamically
intercept and inject logic into almost any kernel function, including system calls. Kprobes work by placing breakpoints
at specific addresses in kernel code, redirecting execution to custom handlers.
Using kprobes, you can intercept system calls like close to log parameters, modify behavior, or gather debugging
information, all without modifying the syscall table or kernel memory structures.
The Code
We have some preparation steps in order to be able to do Linux kernel module development. If your system is already set up to do this, you can skip the first section here.
Before we start, remember to do this in a safe environment. Use a virtual machine or a disposable system for
development. Debugging kernel modules can lead to crashes or instability.
Prerequisites
First up, we need to install the prerequisite software in order to write and build modules.
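On a Debian or Ubuntu system, for example, that means the compiler toolchain and the headers for your running kernel (package names are distribution-specific):

# Debian/Ubuntu example; package names vary by distribution
sudo apt update
sudo apt install build-essential linux-headers-$(uname -r)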
Next, we declare the kprobe itself and point it at the function we want to intercept. This tells the kernel which function to monitor dynamically.
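The declaration itself isn’t shown in the original text; a minimal sketch, assuming an x86_64 kernel where the close entry point is named __x64_sys_close, looks like this:

#include <linux/kernel.h>
#include <linux/kprobes.h>
#include <linux/module.h>

// The symbol name is kernel- and architecture-specific (an assumption here);
// check /proc/kallsyms on your system for the exact name.
static struct kprobe kp = {
    .symbol_name = "__x64_sys_close",
};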
The handler_pre function is executed before the intercepted function runs. It logs the file descriptor (fd) argument
passed to the close syscall:
static int handler_pre(struct kprobe *p, struct pt_regs *regs) {
    printk(KERN_INFO "Intercepted close syscall: fd=%ld\n", regs->di);
    return 0;
}
In this case, regs->di holds the first argument to the syscall (the file descriptor). Note that on modern x86_64 kernels, syscalls are wrapped so the probed entry point actually receives a pointer to a pt_regs structure; that is why the logged values in the sample output below don’t look like small integer file descriptors.
The kprobe_init function initialises the kprobe, registers the handler, and logs its status. If registration fails, an
error message is printed:
static int __init kprobe_init(void) {
    int ret;
    kp.pre_handler = handler_pre;
    ret = register_kprobe(&kp);
    if (ret < 0) {
        printk(KERN_ERR "register_kprobe failed, returned %d\n", ret);
        return ret;
    }
    printk(KERN_INFO "Kprobe registered\n");
    return 0;
}
The kprobe_exit function unregisters the kprobe to ensure no stale probes are left in the kernel.
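That listing isn’t reproduced here; a minimal sketch, together with the module boilerplate it implies, looks like this:

static void __exit kprobe_exit(void) {
    // Remove the probe so no stale breakpoints remain in kernel code
    unregister_kprobe(&kp);
    printk(KERN_INFO "Kprobe unregistered\n");
}

module_init(kprobe_init);
module_exit(kprobe_exit);
MODULE_LICENSE("GPL");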
Now that we’ve got our module code, we can build and install it. The following Makefile will allow us to build our code:
obj-m += syscall_interceptor.o

all:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules

clean:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) clean
We build the module:
make
After a successful build, you should be left with a .ko file; in my case it’s called syscall_interceptor.ko. This is the module that we’ll install into the kernel with the following:
sudo insmod syscall_interceptor.ko
Verify
Let’s check dmesg to verify it’s working. As we’ve hooked the close syscall, we should see a flood of messages:
dmesg | tail
You should see something like this:
[ 266.615596] Intercepted close syscall: fd=-60473131794600
[ 266.615596] Intercepted close syscall: fd=-60473131794600
[ 266.615597] Intercepted close syscall: fd=-60473131794600
[ 266.615600] Intercepted close syscall: fd=-60473131794600
[ 266.615731] Intercepted close syscall: fd=-60473131925672
You can unload this module with rmmod:
sudo rmmod syscall_interceptor
Understand Kprobe Handlers
Kprobe handlers allow you to execute custom logic at various stages of the probed function’s execution:
Pre-handler: Runs before the probed instruction.
Post-handler: Runs after the probed instruction (not used in this example).
Fault handler: Runs if an exception occurs during the probe.
Modify the module to add post- or fault-handling logic as needed.
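As a hypothetical sketch, a post-handler has this shape; you would register it by setting kp.post_handler before calling register_kprobe:

// Runs after the probed instruction has executed (sketch only)
static void handler_post(struct kprobe *p, struct pt_regs *regs,
                         unsigned long flags) {
    printk(KERN_INFO "close syscall probe completed\n");
}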
Clean Up
Always unregister kprobes in the module’s exit function to prevent leaving stale probes in the kernel. Use dmesg to
debug any issues during module loading or unloading.
Caveats and Considerations
System Stability: Ensure your handlers execute quickly and avoid blocking operations to prevent affecting system performance.
Kernel Versions: Kprobes are supported in modern kernels, but some symbols may vary between versions.
Ethical Usage: Always ensure you have permission to test and use such modules.
Conclusion
Using kprobes, you can safely and dynamically intercept system calls without modifying critical kernel structures. This
tutorial demonstrates a clean and modern approach to syscall interception, avoiding deprecated or risky techniques like
syscall table modification.