Modern compilers are remarkably sophisticated, capable of transforming straightforward and even naive code into tightly
optimized machine instructions. Recursive algorithms, often seen as elegant yet potentially costly in terms of
performance, present a fascinating case study for these optimizations. From reducing function call overhead to
transforming recursion into iteration, compilers employ a range of techniques that balance developer productivity with
runtime efficiency.
In this article, we’ll explore how GCC optimizes a recursive algorithm. We’ll examine key techniques such as tail-call
optimization and stack management through a simple, easy-to-understand example. By the end, you’ll have a clearer
understanding of the interplay between recursive algorithms and compiler optimizations, equipping you to write code that
performs better while retaining clarity.
Factorial
The first example that we’ll look at is calculating a factorial.
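Reconstructed from the source lines that appear interleaved in the disassembly listings in this article, the function looks like this:

```c
/* Tail-recursive factorial: acc carries the running product. */
int factorial(int n, int acc) {
    if (n == 0) {
        return acc;
    }
    return factorial(n - 1, n * acc);
}
```

Calling `factorial(5, 1)` evaluates to 120. The starting value of 1 for `acc` is an assumption about how the function is called; the listings only show the function itself.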
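This block of code is fairly simple. n is the number whose factorial we want to calculate, and acc is an accumulator
that carries the running product through the recursion. Carrying the result in acc puts the recursive call in tail
position, which is what we’re looking to optimise.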
-O0
First of all, we’ll compile this function with -O0 (no optimisation):
int factorial(int n, int acc) {
0: 55 push %rbp
1: 48 89 e5 mov %rsp,%rbp
4: 48 83 ec 10 sub $0x10,%rsp
8: 89 7d fc mov %edi,-0x4(%rbp)
b: 89 75 f8 mov %esi,-0x8(%rbp)
if (n == 0) {
e: 83 7d fc 00 cmpl $0x0,-0x4(%rbp)
12: 75 05 jne 19 <factorial+0x19>
return acc;
14: 8b 45 f8 mov -0x8(%rbp),%eax
17: eb 16 jmp 2f <factorial+0x2f>
}
return factorial(n - 1, n * acc);
19: 8b 45 fc mov -0x4(%rbp),%eax
1c: 0f af 45 f8 imul -0x8(%rbp),%eax
20: 8b 55 fc mov -0x4(%rbp),%edx
23: 83 ea 01 sub $0x1,%edx
26: 89 c6 mov %eax,%esi
28: 89 d7 mov %edx,%edi
2a: e8 00 00 00 00 call 2f <factorial+0x2f>
}
2f: c9 leave
30: c3 ret
The compiler generates straightforward assembly that closely follows the original C code. No optimizations are applied
to reduce function call overhead or improve performance. You would use this level of optimisation (or lack thereof)
when debugging, where a straightforward translation of your code is useful.
Stack operations (push, mov, sub, etc.) are explicitly performed for each recursive call. This results in the
largest amount of assembly code and the highest function call overhead.
-O1
Next, we’ll re-compile this function at -O1 which will give us basic optimisations:
int factorial(int n, int acc) {
0: 89 f0 mov %esi,%eax
if (n == 0) {
2: 85 ff test %edi,%edi
4: 75 01 jne 7 <factorial+0x7>
return acc;
}
return factorial(n - 1, n * acc);
}
6: c3 ret
int factorial(int n, int acc) {
7: 48 83 ec 08 sub $0x8,%rsp
return factorial(n - 1, n * acc);
b: 0f af c7 imul %edi,%eax
e: 89 c6 mov %eax,%esi
10: 83 ef 01 sub $0x1,%edi
13: e8 00 00 00 00 call 18 <factorial+0x18>
}
18: 48 83 c4 08 add $0x8,%rsp
1c: c3 ret
The first thing to notice here is the stack management at the start of the function.
-O0:
push %rbp
mov %rsp,%rbp
sub $0x10,%rsp
The stack frame is explicitly set up and torn down for every function call, regardless of whether it is needed. This
includes saving the base pointer and reserving 16 bytes of stack space.
We then have slower execution due to redundant stack operations and higher memory overhead.
-O1:
sub $0x8,%rsp
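The stack frame is more compact, reducing overhead. The base pointer (%rbp) is no longer saved, as it’s not strictly
necessary, and the arguments stay in registers rather than being spilled to the stack. The remaining 8-byte adjustment
keeps the stack 16-byte aligned at the call, as the x86-64 System V ABI requires. This gives us reduced stack usage and
faster function calls.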
Next up is the recursive call itself, which is where tail-call optimisation (TCO) comes into play.
-O0:
call 2f <factorial+0x2f>
Recursive calls are handled traditionally, with each call creating a new stack frame.
-O1:
call 18 <factorial+0x18>
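-O1 still retains the recursion: the call instruction is emitted as before, because GCC only enables its sibling-call
and tail-recursion optimisation (-foptimize-sibling-calls) from -O2 upwards. What -O1 does remove is the unnecessary
loading and storing before and after the call.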
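To make "tail position" concrete, here is a small sketch contrasting two ways of writing the same function. The names `factorial_plain` and `factorial_acc` are hypothetical and only used for this comparison.

```c
/* Not a tail call: the multiplication happens after the recursive call
   returns, so each level still has work pending when it makes the call. */
int factorial_plain(int n) {
    if (n == 0) {
        return 1;
    }
    return n * factorial_plain(n - 1);
}

/* Tail call: the recursive call is the last thing the function does,
   so nothing from the current level needs to survive it. */
int factorial_acc(int n, int acc) {
    if (n == 0) {
        return acc;
    }
    return factorial_acc(n - 1, n * acc);
}
```

Both return the same value for the same n (with `acc` starting at 1). The accumulator form is the one that hands the compiler a call it can replace with a jump, although an optimising compiler may still manage to rewrite the plain form as well.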
We also see some arithmetic simplification between the optimisation levels:
-O0:
mov -0x4(%rbp),%eax
imul -0x8(%rbp),%eax
sub $0x1,%edx
Arithmetic operations explicitly load and store intermediate results in memory, reflecting a direct translation of the
high-level code.
-O1:
imul %edi,%eax
sub $0x1,%edi
Intermediate results are kept in registers (%eax, %edi), avoiding unnecessary memory access.
There’s also some instruction elimination between the optimisation levels.
At -O0, each variable is explicitly loaded from the stack and moved between registers, leading to redundant instructions.
-O1:
mov %esi,%eax
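The compiler copies acc straight into %eax, the return-value register, and uses it as the accumulator from then on. The loads, stores and register shuffling seen at -O0 disappear, reducing the instruction count.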
We finish off with a return path optimisation.
-O0:
leave
ret
Explicit leave and ret instructions are used to restore the stack and return from the function.
-O1:
ret
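The leave instruction is gone because there is no %rbp-based frame left to tear down. The base case returns with a bare ret, and the recursive path only has to undo its 8-byte stack adjustment (add $0x8,%rsp) before returning.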
With reduced stack overhead and fewer instructions, the function executes faster and consumes less memory at -O1
compared to -O0. Now we’ll see if we can squeeze things even further.
-O2
We re-compile the same function again, turning optimisations up to -O2. The resulting generated code is this:
int factorial(int n, int acc) {
0: 89 f0 mov %esi,%eax
if (n == 0) {
2: 85 ff test %edi,%edi
4: 74 28 je 2e <factorial+0x2e>
6: 8d 57 ff lea -0x1(%rdi),%edx
9: 40 f6 c7 01 test $0x1,%dil
d: 74 11 je 20 <factorial+0x20>
return acc;
}
return factorial(n - 1, n * acc);
f: 0f af c7 imul %edi,%eax
12: 89 d7 mov %edx,%edi
if (n == 0) {
14: 85 d2 test %edx,%edx
16: 74 17 je 2f <factorial+0x2f>
18: 0f 1f 84 00 00 00 00 nopl 0x0(%rax,%rax,1)
1f: 00
return factorial(n - 1, n * acc);
20: 0f af c7 imul %edi,%eax
23: 8d 57 ff lea -0x1(%rdi),%edx
26: 0f af c2 imul %edx,%eax
if (n == 0) {
29: 83 ef 02 sub $0x2,%edi
2c: 75 f2 jne 20 <factorial+0x20>
}
2e: c3 ret
2f: c3 ret
First, we see changes that give the CPU more room to overlap work.
-O2 picks instructions with fewer dependencies between them. This is visible in the addition of the lea (load
effective address) instruction and in the explicit conditional branching.
-O1:
imul %edi,%eax
sub $0x1,%edi
call 18 <factorial+0x18>
-O2:
lea -0x1(%rdi),%edx
imul %edi,%eax
mov %edx,%edi
test %edx,%edx
je 2f <factorial+0x2f>
At -O2, the compiler uses lea to compute n - 1 into %edx without overwriting %edi or touching the flags, so the imul
that still needs the old value of n does not have to wait for it. The conditional branch (test and je) checks the
termination condition inline instead of making another function call.
Next, we see that the compiler has replaced the recursion with a loop and partially unrolled it.
-O1 Recursion is preserved:
call 18 <factorial+0x18>
-O2 Loop structure replaces recursion:
imul %edi,%eax
lea -0x1(%rdi),%edx
sub $0x2,%edi
jne 20 <factorial+0x20>
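The recursion is transformed into a loop that uses the jne (jump if not equal) instruction to iterate until the base
case is met. Each pass through the loop performs two steps of the original recursion: it multiplies by n, then by
n - 1, and subtracts 2 from n. The parity check near the top of the function (test $0x1,%dil) peels off a single step
when n is odd, so the loop always has an even number of steps left. This eliminates the overhead associated with
recursive function calls, such as managing stack frames.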
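Read back into C, the -O2 listing behaves roughly like the sketch below. This is a hand-written interpretation of the disassembly for non-negative n, not compiler output, and the name `factorial_loop` is made up for illustration.

```c
/* Loop form of the tail-recursive factorial, unrolled by two. */
int factorial_loop(int n, int acc) {
    if (n == 0) {
        return acc;
    }
    /* Peel one step when n is odd so that an even count remains. */
    if (n & 1) {
        acc *= n;
        n -= 1;
        if (n == 0) {
            return acc;
        }
    }
    /* Two recursion steps per pass: multiply by n, then by n - 1. */
    do {
        acc *= n;
        acc *= n - 1;
        n -= 2;
    } while (n != 0);
    return acc;
}
```

For n = 5 and acc = 1, the odd step multiplies by 5, and the loop then handles 4 and 3 in one pass and 2 and 1 in the next.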
More redundant operations are removed from the code. With no call left in the function, there is nothing to save or
restore around it. This is particularly noticeable in how the return path is optimized.
-O1:
add $0x8,%rsp
ret
-O2:
ret
-O2 eliminates the stack pointer adjustments because, with the call gone, the function no longer touches the stack at all.
Finally, we see some more sophisticated conditional simplifications.
-O1:
test %edi,%edi
jne 7 <factorial+0x7>
-O2:
test %edi,%edi
je 2e <factorial+0x2e>
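At -O1 the branch is taken on the common path (n != 0) and jumps to the recursive case. At -O2 the test is inverted:
only the base case branches, straight to the return sequence (2e <factorial+0x2e>), while the common path falls through
into the loop. Keeping the hot path as straight-line code tends to be friendlier to instruction fetch and branch prediction.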
These transformations further reduce the number of instructions executed per step of the computation, and because no
calls remain, stack usage stays constant no matter how large n is.
-O3
When we re-compile this code for -O3, we notice that the output code is identical to -O2. This suggests that the
compiler found all of the performance opportunities in previous optimisation levels.
This highlights an important point: not all functions benefit from the most aggressive optimization level.
The factorial function is simple and compact, meaning that the optimizations applied at -O2 (tail-recursion
transformation, register usage, and instruction alignment) have already maximized its efficiency. -O3 doesn’t
introduce further changes because:
The function is too small to benefit from aggressive inlining.
There are no data-parallel computations that could take advantage of SIMD instructions.
Loop unrolling is unnecessary since the tail-recursion has already been transformed into a loop.
For more complex code, -O3 often shines by extracting additional performance through aggressive heuristics, but in
cases like this, the improvements plateau at -O2.
Conclusion
Recursive algorithms can often feel like a trade-off between simplicity and performance, but modern compilers
significantly narrow this gap. By employing advanced optimizations such as tail-call elimination, inline expansion, and
efficient stack management, compilers make it possible to write elegant, recursive solutions without sacrificing runtime
efficiency.
Through the factorial example in this article, we’ve seen how these optimizations work in practice, as well as their limitations.
Understanding these techniques not only helps you write better code but also deepens your appreciation for the compilers
that turn your ideas into reality. Whether you’re a developer crafting algorithms or just curious about the magic
happening behind the scenes, the insights from this exploration highlight the art and science of compiler design.
Visual effects like water droplets are mesmerizing, and they showcase how simple algorithms can produce complex, beautiful
animations. In this article, I’ll walk you through creating a water droplet effect using VGA mode 13h.
We’ll rely on some of the code that we developed in the “VGA routines from Watcom C” article for VGA setup and utility
functions, focusing on how to implement the effect itself.
The effect that we’re looking to produce should look something like this:
The Idea
The water droplet effect simulates circular ripples spreading out from random points on the screen. Here’s the high-level approach:
Drops: Represent each drop with a structure containing its position, energy, and ripple generation.
Drawing Ripples: Use trigonometry to create circular patterns for each ripple generation.
Blur Effect: Smooth the buffer to simulate water’s fluid motion.
Palette: Set a blue-themed palette to enhance the watery feel.
Setting the Water Palette
First, we set a blue-tinted gradient palette. Blue is held at its maximum while red and green ramp up with the index, so the colours run from pure blue at index 0 to white at index 255.
void set_water_palette() {
    uint16_t i;
    uint8_t r, g, b;

    for (i = 0; i < 256; i++) {
        r = i >> 2;  // Dim red
        g = i >> 2;  // Dim green
        b = 63;      // Maximum blue
        set_palette(i, r, g, b);
    }
}
Representing Drops
Each drop is represented by a structure that tracks:
(x, y): The origin of the drop.
e: Energy, which fades with time.
g: Current ripple generation.
struct drop {
    int x; /* original x-coordinate */
    int y; /* original y-coordinate */
    int e; /* energy left in the drop */
    int g; /* current generation */
};

struct drop drops[N_DROPS];
Creating and Advancing Drops
Drops are reset with random positions, maximum energy, and zero ripple generation:
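As a minimal sketch of these two steps: the names `reset_drop` and `advance_drop` match the calls in the main loop, but the function bodies and the constants `MAX_ENERGY` and `ENERGY_DECAY` are assumptions for illustration, not the original implementation.

```c
#include <stdlib.h>

#define MAX_ENERGY   200  /* assumed starting energy */
#define ENERGY_DECAY 4    /* assumed energy lost per frame */

struct drop {
    int x; /* original x-coordinate */
    int y; /* original y-coordinate */
    int e; /* energy left in the drop */
    int g; /* current generation */
};

/* Place the drop at a random on-screen point with full energy. */
void reset_drop(struct drop *d) {
    d->x = rand() % 320;
    d->y = rand() % 200;
    d->e = MAX_ENERGY;
    d->g = 0;
}

/* Grow the ripple by one generation and fade its energy toward zero. */
void advance_drop(struct drop *d) {
    if (d->e > 0) {
        d->g++;
        d->e -= ENERGY_DECAY;
        if (d->e < 0) {
            d->e = 0;
        }
    }
}
```

With these values a drop lives for 50 frames before its energy reaches zero and it stops being advanced.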
Ripples are drawn using polar coordinates. We calculate x and y offsets using cosine and sine functions for each
angle and scale by the current generation.
void draw_drop(struct drop *d, uint8_t *buffer) {
    // if this droplet still has some energy
    if (d->e > 0) {
        // 0 to 2π
        for (float rad = 0.0f; rad < 6.28f; rad += 0.05f) {
            // x, y co-ordinates to go around the circle
            int xx = (int)(cos(rad) * (float)d->g);
            int yy = (int)(sin(rad) * (float)d->g);

            // translate them into the field
            xx += d->x;
            yy += d->y;

            // clip them to the visible field
            if ((xx >= 0) && (xx < 320) && (yy >= 0) && (yy < 200)) {
                uint16_t offset = xx + (yy << 6) + (yy << 8); // VGA offset
                uint16_t c = buffer[offset];

                // clamp the pixel colour to 255
                if ((c + d->e) > 255) {
                    c = 255;
                } else {
                    c += d->e;
                }

                // set the pixel
                buffer[offset] = c;
            }
        }
    }
}
The colour that is rendered to the buffer is additive. We take the current colour at the pixel position and add the
drop’s energy to it (clamped at 255), giving the droplets a sense of collision when they overlap.
Simulating Fluid Motion
A blur effect smooths the ripples, blending them into neighboring pixels for a more fluid appearance. This is done by
averaging surrounding pixels.
void blur_buffer(uint8_t *buffer) {
    memset(buffer, 0, 320);          // Clear top border
    memset(buffer + 63680, 0, 320);  // Clear bottom border

    for (uint16_t i = 320; i < 63680; i++) {
        buffer[i] = (
            buffer[i - 321] + buffer[i - 320] + buffer[i - 319] +
            buffer[i - 1]   + buffer[i + 1]   +
            buffer[i + 319] + buffer[i + 320] + buffer[i + 321]
        ) >> 3; // Average of 8 neighbors
    }
}
Main Loop
The main loop handles:
Adding new drops randomly.
Advancing and drawing existing drops.
Applying the blur effect.
Rendering the buffer to the VGA screen.
int main() {
    uint8_t *back_buffer = (uint8_t *)malloc(64000);
    uint8_t drop_index = 0;

    set_mcga();                       // Switch to VGA mode
    set_water_palette();              // Set blue gradient
    clear_buffer(0x00, back_buffer);  // Clear the back buffer

    while (!kbhit()) {                // Continue until a key is pressed
        // Randomly reset a drop
        if ((rand() % 10) == 0) {
            reset_drop(&drops[drop_index]);
            drop_index++;
            drop_index %= N_DROPS;
        }

        // Process and draw each drop
        for (int i = 0; i < N_DROPS; i++) {
            advance_drop(&drops[i]);
            draw_drop(&drops[i], back_buffer);
        }

        blur_buffer(back_buffer);         // Apply the blur effect
        wait_vsync();                     // Synchronize with vertical refresh
        copy_buffer(vga, back_buffer);    // Copy back buffer to screen
        clear_buffer(0x00, back_buffer);  // Clear back buffer for next frame
    }

    free(back_buffer);
    set_text();  // Return to text mode
    return 0;
}
Conclusion
This water droplet effect combines simple algorithms with creative use of VGA mode 13h to create a visually stunning effect. By leveraging circular ripples, energy fading, and a blur filter, we replicate the mesmerizing motion of water.
You can find the complete code on GitHub as a gist.
Try it out, tweak the parameters, and share your own effects! There’s a lot of joy in creating beautiful visuals with minimal resources.
The VGA era was all about getting the most out of limited hardware. It required clever tricks to push pixels and make
things move. To make it easier to work with VGA and related concepts, I put together a library called freak.
This library includes tools for VGA handling, keyboard input, vector and matrix math, and
fixed-point math. In this post, I’ll go through each part of the library and explain how it works, with examples.
Be warned - this stuff is old! You’ll need a Watcom compiler as well as a DOS-like environment to be able to run any
of your code. I’ve previously written about getting Watcom up and running with DOSBox. If you want to get this running
you can read the following:
The code that this article outlines is available here.
Video routines
First of all, we’re going to take care of shifting in and out of good old mode 13h.
Setting a video mode
To shift in and out of video modes we use the int 10h BIOS interrupt.
#define BIOS_VIDEO_80x25 0x03
#define BIOS_VIDEO_320x200x256 0x13
void freak_set_video(uint8_t mode);
#pragma aux freak_set_video = \
"mov ah, 0" \
"int 0x10" \
parm [al];

/** Sets the video to 320x200x256 */
inline void freak_set_mcga() {
    freak_set_video(BIOS_VIDEO_320x200x256);
}

/** Sets the video to 80x25 text */
inline void freak_set_text() {
    freak_set_video(BIOS_VIDEO_80x25);
}
Passing the video mode into al and setting ah to 0 allows us to change modes.
We also need to define where we want to draw to. VGA memory is mapped at A000:0000 in real mode. Because we’re in
protected mode with a flat address space (thanks to DOS/4G), we set our pointer to the linear address 0xA0000.
We defined the pointer freak_vga as a location in memory. The 64,000 bytes that follow it (320 x 200 pixels at one
byte each, just under 64k) hold every pixel on the screen.
That means we can treat the screen like any old memory buffer. That also means that we can define virtual buffers as
long as we have 64k to spare; which we do.
You could imagine doing something like this pretty easily:
uint8_t *back_buffer = (uint8_t *)malloc(64000);
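Because the screen and any back buffer share the same linear layout, plotting a pixel is just an index calculation. The helpers below are a small sketch of that idea; `pixel_offset` and `put_pixel` are hypothetical names, not functions from the library.

```c
#include <stdint.h>

#define SCREEN_W 320
#define SCREEN_H 200

/* Offset of (x, y) in a linear 320x200 buffer.
   y * 320 is computed as (y << 8) + (y << 6), i.e. y * 256 + y * 64. */
static uint16_t pixel_offset(int x, int y) {
    return (uint16_t)((y << 8) + (y << 6) + x);
}

/* Write one pixel, ignoring points that fall outside the buffer. */
static void put_pixel(uint8_t *buf, int x, int y, uint8_t colour) {
    if (x >= 0 && x < SCREEN_W && y >= 0 && y < SCREEN_H) {
        buf[pixel_offset(x, y)] = colour;
    }
}
```

The same `put_pixel` call works whether `buf` is a malloc’d back buffer or a pointer to video memory, which is what makes double buffering so cheap to set up here.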
We could use memset and memcpy to work with these buffers, or we could write our own optimised implementations that
move a doubleword (four bytes) at a time using instructions like movsd and stosd:
/** Clears a buffer with a value */
void freak_clear_buffer(uint8_t c, uint8_t *buf);
#pragma aux freak_clear_buffer = \
"mov ah, al" \
"mov bx, ax" \
"shl eax, 16" \
"mov ax, bx" \
"mov ecx, 16000" \
"rep stosd" \
modify [eax ebx ecx] \
parm [al] [edi];
/** Copies a buffer onto another */
void freak_copy_buffer(uint8_t *dest, uint8_t *src);
#pragma aux freak_copy_buffer = \
"mov ecx, 16000" \
"rep movsd" \
modify [ecx] \
parm [edi] [esi];
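For readers less familiar with x86 string instructions, here is a portable C sketch of what those two routines do. The function names are hypothetical, and this is an illustration of the idea rather than a replacement for the inline assembly.

```c
#include <stdint.h>
#include <string.h>

#define BUFFER_DWORDS 16000  /* 64,000 bytes / 4 bytes per doubleword */

/* Replicate one byte into all four bytes of a 32-bit word,
   the same result the mov/shl sequence builds in eax. */
static uint32_t replicate_byte(uint8_t c) {
    return (uint32_t)c * 0x01010101u;
}

/* Fill a 64,000-byte buffer one 32-bit word at a time, like rep stosd. */
void clear_buffer_c(uint8_t c, uint8_t *buf) {
    uint32_t fill = replicate_byte(c);
    for (int i = 0; i < BUFFER_DWORDS; i++) {
        /* memcpy of 4 bytes avoids any alignment assumptions about buf */
        memcpy(buf + 4 * i, &fill, sizeof fill);
    }
}

/* Copy 64,000 bytes; memcpy plays the role of rep movsd here. */
void copy_buffer_c(uint8_t *dest, const uint8_t *src) {
    memcpy(dest, src, BUFFER_DWORDS * 4);
}
```

The count of 16,000 comes from moving four bytes per iteration across a 64,000-byte buffer.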
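Before flipping a back buffer onto the VGA surface, we wait for a vertical retrace to pass by polling bit 3 of the status register at port 0x3da. This avoids updating the screen while it is being drawn, which removes flicker and tearing.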
/** Waits for a vertical sync to occur */
void freak_wait_vsync();
#pragma aux freak_wait_vsync = \
"mov dx, 03dah" \
"@@vsync1:" \
"in al, dx" \
"test al, 08h" \
"jz @@vsync1" \
"@@vsync2:" \
"in al, dx" \
"test al, 08h" \
"jnz @@vsync2" \
modify [ax dx];
Colours
In mode 13h, we are given 256 colour slots, and for each slot we can control the red, green, and blue components.
Whilst the default palette does provide a wide array of different colours, it kinda sucks.
In order to set the r, g, b components of a colour we first need to write the colour index out to port 0x3c8. We then
write the r, g, and b components sequentially out to 0x3c9. Each component is a 6-bit value (0 to 63).
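The write sequence can be sketched as follows. To keep the example runnable anywhere, `out_byte` is a hypothetical stand-in that records each port write in a log instead of touching hardware; under Watcom you would replace it with a real port write such as `outp` from conio.h. The name `set_palette_sketch` is also made up for this example.

```c
#include <stdint.h>

#define DAC_WRITE_INDEX 0x3c8
#define DAC_DATA        0x3c9

/* Hypothetical stand-in for a hardware port write: logs the write. */
struct port_write {
    uint16_t port;
    uint8_t value;
};

static struct port_write port_log[16];
static int port_log_len = 0;

static void out_byte(uint16_t port, uint8_t value) {
    if (port_log_len < 16) {
        port_log[port_log_len].port = port;
        port_log[port_log_len].value = value;
        port_log_len++;
    }
}

/* Select a palette slot, then send its red, green and blue components.
   Each component is masked to the 6-bit range 0..63. */
void set_palette_sketch(uint8_t index, uint8_t r, uint8_t g, uint8_t b) {
    out_byte(DAC_WRITE_INDEX, index);
    out_byte(DAC_DATA, r & 0x3f);
    out_byte(DAC_DATA, g & 0x3f);
    out_byte(DAC_DATA, b & 0x3f);
}
```

One index write followed by three data writes is all that is needed to define a colour.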
The fixed point article that I had previously written walks you through
the basic mechanics of the topic. The bit lengths of the whole and fractional parts used there are too small to be
usable in practice, so we’re going to use the same technique, but scale it up.
Conversions
First of all, we need to be able to go from the “C type world” (the world of int and double, for instance) into the
“fixed point world”. We also need to make our way back:
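A minimal sketch of such conversions, assuming a signed 16.16 layout (16 bits for the whole part, 16 for the fraction). The layout, the type name `fixed_t`, and the function names are assumptions for illustration and may not match what the library actually uses.

```c
#include <stdint.h>

typedef int32_t fixed_t;  /* assumed 16.16 fixed-point layout */

#define FIXED_SHIFT 16
#define FIXED_ONE   ((fixed_t)1 << FIXED_SHIFT)  /* 1.0 in fixed point */

/* Integer to fixed: scale up by 2^16. */
static fixed_t int_to_fixed(int i) {
    return (fixed_t)i * FIXED_ONE;
}

/* Fixed to integer: drop the fraction (truncates toward zero). */
static int fixed_to_int(fixed_t f) {
    return (int)(f / FIXED_ONE);
}

/* Double to fixed: scale, then truncate to an integer. */
static fixed_t double_to_fixed(double d) {
    return (fixed_t)(d * FIXED_ONE);
}

/* Fixed to double: undo the scaling. */
static double fixed_to_double(fixed_t f) {
    return (double)f / FIXED_ONE;
}
```

In this layout 1.5 is stored as 98304 (1.5 x 65536), and converting it back with `fixed_to_double` recovers 1.5 exactly because the fraction is a power of two.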
Our trig tables are based around a nerd number of 1,024 making this a little easier to reason about and giving us an
acceptable level of precision between fractions of radians for what we need.
These are then nicely wrapped up in macros.
Operations
The fixed multiply is a very simple integer-based operation (by design):
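As a sketch of the idea, again assuming a signed 16.16 layout (the type name `fixed_t` and function name `fixed_mul` are hypothetical): multiplying two fixed-point values produces a result with twice as many fractional bits, so the product is widened to 64 bits and shifted back down.

```c
#include <stdint.h>

typedef int32_t fixed_t;  /* assumed 16.16 fixed-point layout */

#define FIXED_SHIFT 16

/* The raw product carries 32 fractional bits, so widen to 64 bits and
   shift 16 of them back out. The right shift of a negative value relies
   on arithmetic shifting, which mainstream compilers provide. */
fixed_t fixed_mul(fixed_t a, fixed_t b) {
    return (fixed_t)(((int64_t)a * (int64_t)b) >> FIXED_SHIFT);
}
```

For example, 1.5 (98304) times 2.0 (131072) gives 196608, which is 3.0 in this layout.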
To rotate around an arbitrary axis \(\mathbf{a} = \begin{bmatrix} a_x \\ a_y \\ a_z \end{bmatrix}\) by an angle \(\theta\),
the rotation matrix is defined as:
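For reference, the standard axis-angle (Rodrigues) form of this matrix is given below. It assumes that \(\mathbf{a}\) is a unit vector and uses the shorthand \(c = \cos\theta\), \(s = \sin\theta\), \(t = 1 - \cos\theta\). The column-vector, right-handed convention is an assumption here; the library’s own layout may differ.

\[
R(\mathbf{a}, \theta) =
\begin{bmatrix}
t a_x^2 + c & t a_x a_y - s a_z & t a_x a_z + s a_y \\
t a_x a_y + s a_z & t a_y^2 + c & t a_y a_z - s a_x \\
t a_x a_z - s a_y & t a_y a_z + s a_x & t a_z^2 + c
\end{bmatrix}
\]

As a quick check, setting \(\mathbf{a} = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}\) reduces this to the familiar rotation about the z axis.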
The freak library is my attempt to distill the essence of classic VGA programming into a modern, accessible toolkit. By
combining essential building blocks like graphics handling, input, and math operations, it provides everything you need
to recreate the magic of the demoscene or explore retro-style programming.
I hope this article inspires you to dive into the world of low-level programming and experiment with the techniques that
defined a generation of creativity. Whether you’re building your first polygon renderer or optimizing an effect with
fixed-point math, freak is here to make the journey both rewarding and fun.
Let me know what you think or share what you build—there’s nothing quite like seeing new creations come to life with
tools like these!
Sometimes, you need to squeeze more performance out of your Python code, and one great way to do that is to offload some of your CPU-intensive tasks to an extension. Traditionally, you might use a language like C for this. I’ve covered this topic in a previous post.
In today’s post, we’ll use the Rust language to create an extension that can be called from Python. We’ll also explore the reverse: allowing your Rust code to call Python.
Setup
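Start by creating a new project. Older pyo3 releases required the nightly Rust compiler; pyo3 0.11 and later also build on stable Rust, so the override below is only needed if you are pinned to an older version: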
# Create a new project
cargo new hello_world_ext
cd hello_world_ext
# Set the preference to use the nightly compiler
rustup override set nightly
Next, ensure pyo3 is installed with the extension-module feature enabled. Update your Cargo.toml file:
This function simply returns the string "Hello, world!".
The #[pyfunction] attribute macro exposes Rust functions to Python. The return type PyResult<T> is an alias for Result<T, PyErr>, which handles Python function call results.
The #[pymodule] attribute macro defines the module. The add_wrapped method adds the wrapped function to the module.
Building
With the code in place, build the module:
cargo build
Once built, install it as a Python package using maturin. First, set up a virtual environment and install maturin:
# Create a new virtual environment
python -m venv venv
# Activate the environment
source ./venv/bin/activate
# Install maturin
pip install maturin
Now, build and install the module:
maturin develop
The develop command that we use here builds our extension, and automatically installs the result into our virtual
environment. This makes life easy for us during the development and testing stages.
use pyo3::prelude::*;
use pyo3::types::IntoPyDict;

fn main() -> PyResult<()> {
    Python::with_gil(|py| {
        let sys = py.import("sys")?;
        let version: String = sys.getattr("version")?.extract()?;

        let locals = [("os", py.import("os")?)].into_py_dict(py);
        let user: String = py
            .eval("os.getenv('USER') or os.getenv('USERNAME') or 'Unknown'", None, Some(&locals))?
            .extract()?;

        println!("Hello {}, I'm Python {}", user, version);
        Ok(())
    })
}
Rewriting critical pieces of your Python code in a lower-level language like Rust can significantly improve performance. With pyo3, the integration between Python and Rust becomes seamless, allowing you to harness the best of both worlds.
In a previous post we covered the basic setup for
drawing to a <canvas> object via WebAssembly (WASM). In today’s article, we’ll create animated graphics directly on an
HTML5 canvas.
We’ll break down the provided code into digestible segments and walk through each part to understand how it works. By
the end of this article, you’ll have a clear picture of how to:
Set up an HTML5 canvas and interact with it using Rust and WebAssembly.
Generate random visual effects with Rust’s rand crate.
Build an animation loop with requestAnimationFrame.
Use shared, mutable state with Rc and RefCell in Rust.
Let’s get started.
Walkthrough
I won’t cover the project setup and basics here. The previous post
has all of that information for you. I will cover some dependencies that you need for your project here:
There are a number of features in use there from web-sys. These will become clearer as we go through the code. The
getrandom dependency has WebAssembly support, so
we can use it to make our animations slightly generative.
Getting Browser Access
First thing we’ll do is to define some helper functions that will try and acquire different features in the browser.
We need to be able to access the browser’s window object.
fn window() -> web_sys::Window {
    web_sys::window().expect("no global `window` exists")
}
This function requests the common window object from the JavaScript environment. The expect will give us an error
context if it fails, telling us that no window exists.
The browser function being requested here is requestAnimationFrame, which is documented as follows:
The window.requestAnimationFrame() method tells the browser you wish to perform an animation. It requests the browser to call a user-supplied callback function before the next repaint.
This will come in handy to do our repaints.
Now, in our run function, we can start to access parts of the HTML document that we’ll need references for. Sitting in
our HTML template, we have the <canvas> tag that we want access to:
When we double-buffer graphics, we need to allocate the block of memory that will act as our “virtual screen”. We draw
to that virtual screen, and then “flip” or “blit” that virtual screen (piece of memory) onto video memory to give the
graphics movement.
The size of our buffer will be width * height * number_of_bytes_per_pixel. With a red, green, blue, and alpha channel
that makes 4 bytes.
Animation Loop
We can now set up our animation loop.
The closure has to reference itself so that it can schedule the next frame. Rust’s strict ownership and borrowing
rules don’t allow that directly, so we share the closure through an Rc wrapping a RefCell:
let f = Rc::new(RefCell::new(None));
let g = f.clone();

*g.borrow_mut() = Some(Closure::new(move || {
    // do the animation code here

    // queue up another re-draw request
    request_animation_frame(f.borrow().as_ref().unwrap());
}));

// queue up the first re-draw request, to start animation
request_animation_frame(g.borrow().as_ref().unwrap());
This pattern is common in Rust for managing shared, mutable state when working with closures in scenarios where you need
to reference a value multiple times or recursively, such as with event loops or callback-based systems. Let me break it
down step-by-step:
The Components
Rc (Reference Counted Pointer):
Rc allows multiple ownership of the same data by creating a reference-counted pointer. When the last reference to the data is dropped, the data is cleaned up.
In this case, it enables both f and g to share ownership of the same RefCell.
RefCell (Interior Mutability):
RefCell allows mutable access to data even when it is inside an immutable container like Rc.
This is crucial because Rc only hands out shared references to its contents, so it cannot give mutable access on its own. RefCell moves the borrow check to runtime instead, panicking if two mutable borrows ever overlap.
Closure:
A closure in Rust is a function-like construct that can capture variables from its surrounding scope.
In the given code, a Closure is being stored in the RefCell for later use.
What’s Happening Here?
Shared Ownership:
Rc is used to allow multiple references (f and g) to the same underlying RefCell. This is required because the closure may need to reference f while being stored in it, which is impossible without shared ownership.
Mutation with RefCell:
RefCell enables modifying the underlying data (None → Some(Closure)) despite Rc being immutable.
Setting the Closure:
The closure is created and stored in the RefCell via *g.borrow_mut().
This closure may reference f for recursive or repeated access.
We follow this particular pattern here because the closure needs access to itself in order to recursively schedule calls
to requestAnimationFrame. By storing the closure in the RefCell, the closure can call itself indirectly.
If we didn’t use this pattern, we’d have some lifetime/ownership issues. Referencing the closure while defining it
would create a circular reference problem that Rust wouldn’t allow.
Drawing
We’re going to find a random point on our virtual screen to draw, and we’re going to pick a random shade of grey. We’re
going to need a random number generator:
Blitting refers to copying pixel data from the backbuffer to the canvas in a single operation. This ensures the displayed
image updates smoothly.
Now we need to blit that back buffer onto our canvas. We need to create an ImageData object in order to do this.
Passing in our backbuffer object, we can create one with the following:
let image_data = ImageData::new_with_u8_clamped_array_and_sh(
    Clamped(&backbuffer), // Wrap the slice with Clamped
    width as u32,
    height as u32,
).unwrap();
We then use our 2d context to simply draw the image:
And there you have it—a complete walkthrough of creating dynamic canvas animations with Rust and WebAssembly! We covered
how to:
Set up the canvas and prepare a backbuffer for pixel manipulation.
Use Rust’s rand crate to generate random visual effects.
Manage mutable state with Rc and RefCell for animation loops.
Leverage requestAnimationFrame to achieve smooth, frame-based updates.
This approach combines Rust’s strengths with the accessibility of modern web technologies, allowing you to build fast,
interactive graphics directly in the browser.