This post contains implementations of echo and printenv in 32-bit x86 assembly for Linux.
echo is a Unix utility that prints its arguments to standard output.
printenv is a Unix utility that prints the environment to standard output.
The core functionality of these programs can be written in a few lines of C, where program arguments and the environment are passed as function arguments to main.
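For reference, a minimal sketch of what that C version might look like is shown here. The function names `echo_to` and `printenv_to`, and the `FILE *out` parameter, are assumptions made for illustration and do not come from the original programs; they stand in for the bodies of the two `main` functions so that the output stream can be chosen by the caller.

```c
#include <stdio.h>

/* Hypothetical helper: the core of echo's main(), writing to `out`.
 * Skips argv[0] (the program name) and puts one space between arguments. */
void echo_to(FILE *out, int argc, char *argv[]) {
    for (int i = 1; i < argc; i++) {
        if (i > 1)
            fputc(' ', out);
        fputs(argv[i], out);
    }
    fputc('\n', out);
}

/* Hypothetical helper: the core of printenv's main(), writing to `out`.
 * `envp` is a NULL-terminated array of "NAME=value" strings. */
void printenv_to(FILE *out, char *envp[]) {
    for (char **entry = envp; *entry != NULL; entry++) {
        fputs(*entry, out);
        fputc('\n', out);
    }
}
```

A real program would call `echo_to(stdout, argc, argv)` from `main(int argc, char *argv[])`. For printenv, the environment array could come from the three-argument form of `main` (a common Unix extension) or from the POSIX `environ` variable.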
When a process is executed on Linux (or other Unix-like systems), its stack contains pointers to the program arguments and the environment, as shown below.
                  |--------------------------|
Low       0(%esp) | Argument Count           |
Addresses         |--------------------------|
          4(%esp) | Argument Pointers        |
                  | ...                      |
                  |--------------------------|
                  | 0                        |
                  |--------------------------|
                  | Environment Pointers     |
                  | ...                      |
                  |--------------------------|
                  | 0                        |
                  |--------------------------|
                  | Additional Data          |
                  | ...                      |
High              |--------------------------|
Addresses
The stack grows downward in memory. That is, the top of the stack has the lowest memory address. When a program is executed, the top of the process stack contains 1) the argument count (conventionally referred to as argc in source code), followed by 2) pointers to the argument strings, 3) zero, 4) pointers to the environment strings, 5) zero, and 6) additional data (including the data that the argument/environment pointers reference).
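The layout described above can be modeled in C as an array of pointer-sized words, which makes the slot arithmetic easy to check. This is only a sketch: the helper names (`first_arg_slot`, `first_env_slot`, `slot_string`, `slot_offset_x86`) are hypothetical, and `sp` stands in for the value of %esp at process entry.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: sp[0] holds the argument count, followed by the
 * argument pointers, a zero word, the environment pointers, and a zero word. */

/* Slot of the first argument pointer: one word above the argument count. */
size_t first_arg_slot(void) {
    return 1;
}

/* Slot of the first environment pointer: past the argument count, the
 * argument pointers, and the zero word that terminates them. */
size_t first_env_slot(const uintptr_t *sp) {
    size_t argc = (size_t)sp[0];
    return 1 + argc + 1;
}

/* The string referenced by a (non-zero) pointer slot. */
const char *slot_string(const uintptr_t *sp, size_t slot) {
    return (const char *)sp[slot];
}

/* Byte offset of a slot from %esp on 32-bit x86, where each word is 4 bytes. */
size_t slot_offset_x86(size_t slot) {
    return slot * 4;
}
```

For example, with an argument count of 2, the first environment pointer sits in slot 4, which corresponds to 16(%esp) on 32-bit x86.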
echo and printenv can be implemented in assembly language by traversing the stack and printing out the relevant strings.
Helper Functions
Both assembly programs have a helper macro, print, for writing to standard output. They also share a helper function, strlen, which returns the length of a string.
.section .text

# ************************************
# * print macro
# * Caller is responsible for setting
# * %ecx and %edx, and saving %eax and
# * %ebx if necessary.
# ************************************
.macro print
    movl $4, %eax
    movl $1, %ebx
    int $0x80
.endm

.section .text

# ************************************
# * int strlen(char* str);
# * Returns the length of a string.
# ************************************
.type strlen, @function
strlen:
    pushl %ebp
    movl %esp, %ebp
    movl $0, %eax                # Index
    movl 8(%ebp), %ecx           # Address of str
strlen_loop:
    movb (%ecx,%eax,1), %dl      # Current char
    cmpb $0, %dl
    je strlen_end
    incl %eax
    jmp strlen_loop
strlen_end:
    movl %ebp, %esp
    popl %ebp
    ret
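As a point of comparison, the two helpers can be approximated in C. This is a rough sketch rather than a translation of the author's code: `my_strlen` and `print_to` are hypothetical names, and `print_to` takes the file descriptor as a parameter, whereas the print macro always writes to descriptor 1 (standard output) by loading system call number 4 (write on 32-bit x86 Linux) into %eax.

```c
#include <stddef.h>
#include <unistd.h>

/* Mirrors the assembly strlen: advance an index until the terminating
 * NUL byte is found, then return the index. */
size_t my_strlen(const char *str) {
    size_t index = 0;
    while (str[index] != '\0')
        index++;
    return index;
}

/* Rough counterpart of the print macro, using the libc write() wrapper
 * instead of invoking the system call through int $0x80. The caller
 * supplies the buffer and length, as with %ecx and %edx in the macro. */
ssize_t print_to(int fd, const char *buf, size_t len) {
    return write(fd, buf, len);
}
```

Printing a string to standard output would then be `print_to(STDOUT_FILENO, s, my_strlen(s))`, which is the same address-then-length pattern that both assembly programs follow.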
echo Assembly Code
The assembly code for echo iterates over the argument pointers on the stack, printing the string that each one references, with a space between consecutive arguments and a newline at the end. The iteration starts at the third element on the stack (past the argument count and the first argument pointer, which conventionally references the program name), and stops when reaching a zero.
/*** echo.s ***/

// Description
//   echo - print arguments to stdout
// Synopsis
//   echo [STRING]...
// Build
//   $ as --32 -o echo.o echo.s
//   $ ld -m elf_i386 -o echo echo.o

# ************************************
# * print macro and strlen function
# * are omitted.
# ************************************

.section .rodata
newline_char:
    .ascii "\n"
space_char:
    .ascii " "

.section .text

# ************************************
# * echo [STRING]...
# * Prints the string(s) to stdout.
# ************************************
.globl _start
_start:
    movl %esp, %ebp

    # Local variables (offset from %ebp)
    .equ index, -4               # Stack index being operated on
    .equ address, -8             # Current arg address
    .equ length, -12             # Current arg length
    subl $12, %esp

    # Start at index 1. The 4-byte displacement used below skips argc,
    # and index 1 skips the first argument pointer (the program name).
    movl $1, index(%ebp)

echo_loop:
    # Set address local variable
    movl index(%ebp), %ecx
    movl 4(%ebp, %ecx, 4), %eax
    movl %eax, address(%ebp)

    # Check if we're done (reached a NULL pointer)
    cmpl $0, address(%ebp)
    je echo_loop_end

    # Calculate length of string
    pushl address(%ebp)
    call strlen
    addl $4, %esp

    # Set length local variable
    movl %eax, length(%ebp)

    # Print leading space if index > 1
    cmpl $1, index(%ebp)
    jle leading_space_end
    movl $space_char, %ecx
    movl $1, %edx
    print
leading_space_end:

    # Print current argument
    movl address(%ebp), %ecx
    movl length(%ebp), %edx
    print

    incl index(%ebp)
    jmp echo_loop

echo_loop_end:
    # Print newline char
    movl $newline_char, %ecx
    movl $1, %edx
    print

    # Exit
    movl $1, %eax
    movl $0, %ebx
    int $0x80
The full source code for echo, including the print macro and strlen function, is available at https://gist.github.com/dstein64/890e02e8e277f17d931c8a250ceaaf44.
printenv Assembly Code
The assembly code for printenv is similar to the code above for echo, but starts iteration a few elements deeper into the stack, at the first environment variable pointer. It uses the argument count on the stack to jump past the argument pointers.
/*** printenv.s ***/

// Description
//   printenv - print the environment to stdout
// Synopsis
//   printenv
// Build
//   $ as --32 -o printenv.o printenv.s
//   $ ld -m elf_i386 -o printenv printenv.o

# ************************************
# * print macro and strlen function
# * are omitted.
# ************************************

.section .rodata
newline_char:
    .ascii "\n"

.section .text

# ************************************
# * printenv
# * Prints the environment to stdout.
# ************************************
.globl _start
_start:
    movl %esp, %ebp

    # Local variables (offset from %ebp)
    .equ index, -4               # Stack index being operated on
    .equ address, -8             # Current env var address
    .equ length, -12             # Current env var length
    subl $12, %esp

    # Start at index argc+1 (this skips the argument vector
    # and the zero that terminates it)
    movl $0, index(%ebp)
    movl (%ebp), %eax            # argc
    incl %eax
    addl %eax, index(%ebp)

printenv_loop:
    # Set address local variable
    movl index(%ebp), %ecx
    movl 4(%ebp, %ecx, 4), %eax
    movl %eax, address(%ebp)

    # Check if we're done (reached a NULL pointer)
    cmpl $0, address(%ebp)
    je printenv_loop_end

    # Calculate length of string
    pushl address(%ebp)
    call strlen
    addl $4, %esp

    # Set length local variable
    movl %eax, length(%ebp)

    # Print current env var
    movl address(%ebp), %ecx
    movl length(%ebp), %edx
    print

    # Print newline char
    movl $newline_char, %ecx
    movl $1, %edx
    print

    incl index(%ebp)
    jmp printenv_loop

printenv_loop_end:
    # Exit
    movl $1, %eax
    movl $0, %ebx
    int $0x80
The full source code for printenv, including the print macro and strlen function, is available at https://gist.github.com/dstein64/a52146a3c6a12c8c0b84cfd4e084bb15.