Cogs and Levers A blog full of technical stuff

Making a REPL with NASM and glibc

In the previous article we learned something important:

Assembly becomes dramatically more productive the moment you stop rewriting libc.

Printing text, formatting numbers, comparing strings, and handling input are already solved problems — and they’ve been solved extremely well.

Now we’re going to push that idea to its natural conclusion. We are going to write a real interactive program in pure assembly. A program that stays alive, reads commands, parses arguments, and performs actions.

In other words — a REPL.

By the end, this will work:

> help
commands: help add quit

> add 5 7
12

> add 1 2
3

> what
unknown command

> quit
bye

And we still won’t write a single syscall.

The full code listing for this article can be found here. We will be covering this code, piece by piece.

The Shape of the Program

Before writing any code, we need to understand the structure.

A REPL is just a loop:

  1. print a prompt
  2. read a line
  3. decide what it means
  4. run a handler
  5. repeat

There is no magic here. A REPL in a high-level language is the same loop, just hidden from you.

In assembly, we simply write the loop ourselves.
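
For comparison, the five steps map onto a few lines of C. This is a minimal sketch rather than the article’s listing: run_loop, line_handler and stop_on_quit are hypothetical names, and the handler stands in for the command dispatch that the assembly builds later.

```c
#define _POSIX_C_SOURCE 200809L   /* getline is POSIX, not ISO C */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical handler type: return 0 to keep looping, non-zero to stop. */
typedef int (*line_handler)(const char *line, FILE *out);

/* Runs the loop until the handler asks to stop or input ends.
   Returns the number of lines passed to the handler. */
int run_loop(FILE *in, FILE *out, line_handler handle)
{
    assert(in != NULL && out != NULL && handle != NULL);

    char *line = NULL;    /* getline allocates and grows this buffer */
    size_t cap = 0;
    int handled = 0;

    for (;;) {
        fputs("> ", out);                      /* 1. print a prompt      */
        if (getline(&line, &cap, in) == -1)    /* 2. read a line         */
            break;                             /*    EOF or read error   */
        handled++;
        if (handle(line, out) != 0)            /* 3 and 4. decide, run   */
            break;
    }                                          /* 5. repeat              */

    free(line);                                /* we own the buffer      */
    return handled;
}

/* Illustrative handler: stop when the line is exactly "quit\n". */
int stop_on_quit(const char *line, FILE *out)
{
    (void)out;
    return strcmp(line, "quit\n") == 0;
}
```

Called as run_loop(stdin, stdout, stop_on_quit), this prompts until the user types quit or input ends.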

External Functions

We will use these glibc functions:

  • printf — formatted output
  • getline — dynamic input
  • strcmp — command matching
  • atoi — integer parsing
  • free — memory ownership

Let’s declare them.

BITS 64
DEFAULT REL

extern printf
extern getline
extern strcmp
extern atoi
extern free
extern stdin            ; FILE *stdin: a data symbol, passed to getline later

global main

Exactly like before, these symbols live inside glibc. The linker records the references, and the dynamic loader binds them when the program runs.

Static Data

We now define the strings our program will use.

section .rodata
prompt      db "> ", 0
bye_msg     db "bye", 10, 0
unk_msg     db "unknown command", 10, 0
help_msg    db "commands: help add quit", 10, 0
add_fmt     db "%d", 10, 0

cmd_help    db "help", 0
cmd_add     db "add", 0
cmd_quit    db "quit", 0

This is exactly like C string constants — null terminated and stored in read-only memory.

Writable Storage

We now need somewhere to store input state.

getline allocates memory for us, but we must own the pointer.

section .bss
lineptr     resq 1
linesize    resq 1

This is important.

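getline does not return a string.

It fills a pointer that we provide.
The buffer behind that pointer may be reallocated between calls, and getline updates the pointer when that happens.

So the pointer needs storage that outlives each call. Globals in .bss are the simplest choice.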

Program Entry

We now write main.

section .text
main:
  push rbp
  mov  rbp, rsp

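We create a normal stack frame. The push does real work as well: on entry to main the stack is 8 bytes off 16-byte alignment, and pushing rbp restores the alignment the System V ABI requires before each call. It also keeps debugging sane and mirrors C expectations.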

Now we initialise the buffer state.

mov qword [lineptr], 0
mov qword [linesize], 0

This tells getline:

I do not own a buffer yet — please allocate one.

The REPL Loop

Here is the heart of the program.

repl:

A label is all a loop really is.

Printing the Prompt

lea rdi, [rel prompt]
xor eax, eax
call printf wrt ..plt

We load the format string into rdi.

Why xor eax, eax?

Because printf is variadic.
The System V ABI requires al to hold an upper bound on the number of vector registers used for arguments, which is zero in our case.

C hides this rule. Assembly makes you honest.

Reading a Line

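lea rdi, [rel lineptr]
lea rsi, [rel linesize]
mov rdx, [rel stdin]             ; needs "extern stdin" next to the other externs
call getline wrt ..plt
test rax, rax                    ; getline returns -1 at end of input or on error
js do_quit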

getline signature:

ssize_t getline(char **lineptr, size_t *n, FILE *stream);

So we pass:

register  value
rdi       pointer to buffer pointer
rsi       pointer to size
rdx       stdin (a FILE *)

This function may:

  • allocate memory
  • grow memory
  • reuse memory

Which means:

We must eventually call free.
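
The same contract is easier to see in C. The sketch below is an illustration under assumptions, not the article’s code: read_trimmed is a hypothetical helper that wraps getline, reuses the caller’s buffer across calls, and strips the trailing newline that getline leaves in place.

```c
#define _POSIX_C_SOURCE 200809L   /* getline is POSIX, not ISO C */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>            /* ssize_t */

/* Reads one line into *buf, letting getline allocate or grow it.
   Strips the trailing newline if there is one.
   Returns the remaining length, or -1 on EOF or error.
   The caller frees *buf once, after the last call. */
ssize_t read_trimmed(char **buf, size_t *cap, FILE *in)
{
    assert(buf != NULL && cap != NULL && in != NULL);

    ssize_t len = getline(buf, cap, in);
    if (len == -1)
        return -1;                       /* nothing was read */

    if (len > 0 && (*buf)[len - 1] == '\n')
        (*buf)[--len] = '\0';            /* getline keeps the newline */

    return len;
}
```

Starting from char *buf = NULL and size_t cap = 0, every call passes the same two variables, and a single free(buf) after the loop releases whatever getline ended up allocating.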

Extract Command

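We now compare the input against commands. getline keeps the trailing newline, so we first overwrite it with a terminator; otherwise strcmp would never report a match.

mov rdi, [lineptr]
cmp byte [rdi + rax - 1], 10     ; rax still holds the length getline returned
jne check_help
mov byte [rdi + rax - 1], 0      ; replace the newline with the terminator
check_help:
lea rsi, [rel cmd_help]
call strcmp wrt ..plt
test eax, eax
je do_help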

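strcmp returns zero when equal.

So we branch. This is effectively our switch and case. quit is tested the same way. add is followed by its arguments on the same line, so a whole-line strcmp can never match it; a prefix test, such as strncmp against "add " with a length of 4, is one way to reach do_add.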
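
In C, the same chain of comparisons is an if ladder. The sketch below is one possible shape, not the article’s listing: classify and the enum names are hypothetical, and it assumes the trailing newline has already been removed, since getline keeps it and strcmp compares the whole string.

```c
#include <assert.h>
#include <string.h>

enum command { CMD_HELP, CMD_ADD, CMD_QUIT, CMD_UNKNOWN };

/* Classifies a line whose trailing newline has already been stripped. */
enum command classify(const char *line)
{
    assert(line != NULL);

    if (strcmp(line, "help") == 0)        /* whole-line match */
        return CMD_HELP;
    if (strcmp(line, "quit") == 0)
        return CMD_QUIT;
    if (strncmp(line, "add ", 4) == 0)    /* prefix match: arguments follow */
        return CMD_ADD;

    return CMD_UNKNOWN;                   /* the default case */
}
```

Each strcmp line corresponds to one compare-and-branch block in the assembly, and the final return plays the role of the unknown-command fallback.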

Unknown Command Fallback

lea rdi, [rel unk_msg]
xor eax, eax
call printf wrt ..plt
jmp repl

This is our default case.

Help Command

do_help:
lea rdi, [rel help_msg]
xor eax, eax
call printf wrt ..plt
jmp repl

No surprises — just structured control flow.

Assembly is not chaotic.
It just doesn’t auto-indent for you.

Quit Command

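mov rdi, [lineptr]
lea rsi, [rel cmd_quit]
call strcmp wrt ..plt
test eax, eax
je do_quit

This test belongs in the loop next to the help test, ahead of the unknown-command fallback. The handler itself sits outside the loop:

do_quit: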
  lea rdi, [rel bye_msg]
  xor eax, eax
  call printf wrt ..plt

  mov rdi, [lineptr]
  call free wrt ..plt

  xor eax, eax
  leave
  ret

Here we finally release memory ownership.

This is the most important rule in the entire article:

If libc allocates it, libc expects you to free it.

Assembly didn’t make this hard — ignoring ownership did.

Add Command (Parsing Arguments)

Now the interesting part.

We skip "add " and parse numbers.

do_add:
mov rbx, [lineptr]
add rbx, 4

We manually advance past "add ".

This is plain pointer arithmetic; in C it would be line + 4. Now we process the first number.

mov rdi, rbx
call atoi wrt ..plt
mov r12d, eax

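atoi converts text to an integer, stopping at the first character that cannot be part of the number.

We store the result in r12d. Both r12 and rbx are callee-saved under the System V ABI, so they survive the calls to atoi. Strictly, main should also save them on entry and restore them before returning, because its own caller expects them unchanged. Now we’ll look for the second parameter.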

find_space:
cmp byte [rbx], 0
je repl
cmp byte [rbx], ' '
je found_space
inc rbx
jmp find_space

found_space:
inc rbx

We manually walk the string.

This is what string parsing actually is:
a loop and a condition.

mov rdi, rbx
call atoi wrt ..plt
add eax, r12d

Now we have the result.

mov esi, eax
lea rdi, [rel add_fmt]
xor eax, eax
call printf wrt ..plt
jmp repl

And the loop continues.
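
Put together, the add handler does the following, written here in C for comparison. This is a sketch that mirrors the assembly step by step; add_from_line is a hypothetical name, and like the assembly it assumes the line already starts with "add ".

```c
#include <assert.h>
#include <stdlib.h>

/* Mirrors do_add: skip "add ", parse, walk to the next space, parse again.
   Returns 1 and stores the sum, or 0 when there is no second argument.
   Precondition: line begins with "add " (at least four characters). */
int add_from_line(const char *line, int *sum)
{
    assert(line != NULL && sum != NULL);

    const char *p = line + 4;            /* add rbx, 4                   */
    int first = atoi(p);                 /* held in r12d in the assembly */

    while (*p != '\0' && *p != ' ')      /* find_space                   */
        p++;
    if (*p == '\0')
        return 0;                        /* je repl: nothing to add      */
    p++;                                 /* found_space: skip the space  */

    *sum = first + atoi(p);              /* add eax, r12d                */
    return 1;
}
```

Because atoi stops at the first non-digit, it does not matter whether the trailing newline is still present when the second number is parsed.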

Building

Same as before.

nasm -felf64 repl.asm -o repl.o
gcc repl.o -o repl

What We Actually Built

We did not implement:

  • input buffering
  • dynamic allocation
  • number parsing
  • formatted output
  • terminal handling

Yet this is undeniably a real interactive program.

The difference between C and assembly is not capability.

It is visibility.

C hides the machine.
Assembly exposes it.
glibc carries the weight in both cases.

Conclusion

Assembly feels impossible when you try to do everything yourself.

But real programs were never written that way — not even in the 1970s.

They were written as small pieces of logic sitting on top of shared libraries.

That’s exactly what we built here.