By the end of
part Ⅰ,
the classic phrase,
“Hello, world!”,
had appeared on screen
in beautiful bright green
and all was well.
It was, though,
not at all close
to being any sort of game.
To get closer to
being able to implement Tetris,
more I/O functions are necessary.
And in implementing these functions,
it will be useful to be able to debug them.
Debugging
Assembly is just as easy to debug as C.
In fact, the same tools work for both.
As with C,
the first thing to do
is compile with debugging information.
The flag for NASM is the same as for GCC, -g
.
It’s also useful to set a flag
for conditional compilation with
-d DEBUG
.
This can be tested in NASM
very similarly to C.
%ifdef DEBUG
; ...
%endif
To have QEMU listen for a debugger,
use the -s
flag.
To have it pause execution
until a debugger is attached
(and the continue
command issued),
use the -S
flag as well.
After starting GDB
with the path to the kernel binary,
it can be attached
to a listening QEMU instance with
target remote localhost:1234
.
This command can be executed on start
with the -ex
command-line option.
To have GDB output the current instruction
at each step,
use display/i $pc
and set disassembly-flavor intel
.
Most GDB commands can be used as normal.
However,
there are alternate commands
to step by instruction:
nexti
and stepi
(or ni
and si
).
Register values can be printed
with p $eax
, for example.
Memory can be printed with x
.
(gdb) break main
Breakpoint 1 at 0x101a00
(gdb) c
Continuing.
Breakpoint 1, 0x00101a00 in main ()
1: x/i $pc
=> 0x101a00 <main>: push 0x20
(gdb) si
0x00101a02 in main ()
1: x/i $pc
=> 0x101a02 <main+2>: call 0x101d97 <clear>
(gdb) p $esp
$1 = (void *) 0x1040fc
(gdb) x $esp
0x1040fc: 0x00000020
The puts
function from last time
puts strings on the screen,
but provides no way
to print numbers.
The traditional name for a function
that formats numbers as strings
is itoa
.
The function will take a double-word number
and a word containing
the radix in the upper byte
and the desired width in the lower byte.
It will return a pointer
into a static string buffer.
Since the game will be single-threaded,
a static buffer works fine here.
Since the longest possible output
is a 32-bit double-word formatted as binary,
the static buffer needs to be 32 bytes long
with a null-terminator.
The function will also need a list
of digit characters to use.
section .data
output db '00000000000000000000000000000000', 0
digits db '0123456789ABCDEF'
The implementation will be using
all of the callee-saved registers,
so its prologue and epilogue look like this.
section .text
itoa:
push ebp
mov ebp, esp
push ebx
push esi
push edi
pop edi
pop esi
pop ebx
mov esp, ebp
pop ebp
ret
Algorithm
The algorithm for formatting numbers
is the same as for converting numbers between radixes.
- Divide the number by the radix
- The remainder is the next digit, starting from the least significant
- Repeat with the quotient until it is zero
For example,
the remainders from dividing 26 by 2
give its binary representation,
11010.
- 26 = 2 × 13 + 0
- 13 = 2 × 6 + 1
- 6 = 2 × 3 + 0
- 3 = 2 × 1 + 1
- 1 = 2 × 0 + 1
To use this to format numbers,
the remainders will be used
as offsets into the list of digit characters.
Additionally,
since itoa
takes a width,
it will simply loop that number of times,
rather than stopping at a quotient of zero.
Implementation
Since the digits are calculated least-significant first,
itoa
needs to start at the end of the string buffer.
To work backwards using string operations,
it uses the std
instruction.
lea edi, [output + 32]
std
Next, the number is loaded into eax
and the base into ebx
.
The width is loaded into ecx
for use with loop
.
mov eax, [ebp + 8]
movzx ebx, byte [ebp + 13]
movzx ecx, byte [ebp + 12]
The loop itself first clears edx
on each iteration,
since 32-bit division treats it as the high bits of the dividend.
It then divides,
indexes into the digit characters,
and writes to the buffer.
The quotient remains in eax
for further division.
.loop:
xor edx, edx
div ebx
lea esi, [digits + edx]
movsb
loop .loop
At the end of this loop,
edi
has actually been decremented
one too many times,
so one is added
when setting eax
,
the return value.
According to the calling convention,
the direction flag is also reset
with cld
.
Hello, numbers
Now that itoa
is implemented,
it can be combined with puts
to print numbers.
main:
push word BG.BLACK | ' '
call clear
push word 0x0208
push dword 26
call itoa
push dword 0x0101 << 16 | FG.BRIGHT | FG.GREEN
push eax
call puts
add esp, 16
jmp halt
Way better than “Hello, world!”, right?
That covers the O of “I/O,”
now for the I.
There are two ways to read keyboard input on x86:
with interrupts
and through polling.
Interrupts are complicated to set up,
and they aren’t needed
for this particular case,
so polling will be used.
Keyboard input comes in the form
of byte-sized “scancodes.”
There are key-down and key-up scancodes
for each key,
which are read
from port 0x60
.
To be useful,
the scan
function
will keep track
of the previously read scancode
to detect when it changes.
Since the function won’t be taking any parameters,
it doesn’t need a prologue or epilogue.
It will, however,
start by zeroing eax
,
since it will set only al
for its return value.
section .text
scan:
xor eax, eax
Then,
the current scancode is read
and compared to the previous code.
If the scancode has not changed,
zero is returned.
Otherwise,
the previous scancode is updated
and returned.
in al, 0x60
cmp al, [key]
je .zero
mov [key], al
ret
.zero:
xor al, al
ret
Putting everything together,
an input loop can
show the last scancode
on the screen.
This is useful
finding scancodes of new keys.
main:
push BG.BLACK | ' '
call clear
add esp, 2
.loop:
call scan
test al, al
jz .loop
push word 0x0A02
push eax
call itoa
push dword 0x0101 << 16 | FG.BRIGHT | FG.GREEN
push eax
call puts
add esp, 16
jmp .loop
This is what holding the left shift key looks like.
To reset using the keyboard,
the returned scancode can be compared
to the R key-down scancode, 0x13
.
call scan
test al, al
jz .loop
cmp al, 0x13
je reset
Next time…
Speaking of time,
the next set of basic functions
to implement will be for timing.
After that,
it should be possible
to start making the actual game.