This is a series about Star Anise Chronicles: Cheezball Rising, an expansive adventure game about my cat for the Game Boy Color. Follow along as I struggle to make something with this bleeding-edge console!
GitHub has intermittent prebuilt ROMs, or you can get them a week early on Patreon if you pledge $4. More details in the README!
In this issue, I fill in the remaining bits necessary to have something that looks like a game.
Previously: drawing a sprite.
Next: a little spring cleaning.
Recap
So far, I have this.

It took unfathomable amounts of effort, but it’s something! Now to improve this from a static image to something a bit more game-like.
Quick note: I’ve been advised to use the de facto standard hardware.inc file, which gives symbolic names to all the registers and some of the flags they use. I hadn’t introduced it yet while doing the work described in this post, but for the sake of readability, I’m going to pretend I did and use that file’s constants in the code snippets here.
Interrupts
To get much further, I need to deal with interrupts. And to explain interrupts, I need to briefly explain calls.
Assembly doesn’t really have functions, only addresses and jumps. That said, the Game Boy does have call and ret instructions. A call will push the PC register (the program counter, which by that point holds the address of the instruction after the call) onto the stack and perform a jump; a ret will pop into the PC register, effectively jumping back to just past the source of the call.
There are no arguments, return values, or scoping; input and output must be mediated by each function, usually via registers. Of course, since registers are global, a “function” might trample over their values in the course of whatever work it does. A function can manually push and pop 16-bit register pairs to preserve their values, or leave it up to the caller for speed/space reasons. All the conventions are free for me to invent or ignore. A “function” can even jump directly to another function and piggyback on the second function’s ret, kind of like Perl’s goto &sub… which I realize is probably less common knowledge than how call/return work in assembly.
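A minimal sketch of those conventions, using two made-up routines (bump_byte and bump_twice are illustrative names, not taken from the game's source):

```asm
; Adds 1 to the byte that hl points at, without disturbing bc.
; Input: hl = address of the byte. Output: none.
bump_byte:
    push bc             ; save the caller's bc on the stack
    ld b, [hl]
    inc b
    ld [hl], b
    pop bc              ; restore it before returning
    ret

; Bumps the same byte twice. The second time is a jump rather than
; a call, so bump_byte's ret returns straight to our own caller.
bump_twice:
    call bump_byte
    jp bump_byte
```

The tail jump saves one byte and one stack round trip compared with a call followed by a ret.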
Interrupts, then, are calls that can happen at any time. When one of a handful of conditions occurs, the CPU can immediately (or, rather, just before the next instruction) call an interrupt handler, regardless of what it was already doing. When the handler returns, execution resumes in the interrupted code.
Of course, since they might be called anywhere, interrupt handlers need to be very careful about preserving the CPU state. Pushing af is especially important (and this is the one place where af is used as a pair), because a is necessary for getting almost anything done, and f holds the flags which most instructions will invisibly trample.
Naturally, I completely forgot about this the first time around.
The Game Boy has five interrupts, each with a handler at a fixed address very low in ROM. Each handler only has room for eight bytes’ worth of instructions, which is enough to do a very tiny amount of work — or to just jump elsewhere.
A good start is to populate each one with only the reti instruction, which returns as usual and re-enables interrupts. The CPU disables interrupts when it calls an interrupt handler (so they thankfully can’t interrupt themselves), and returning with only ret will leave them disabled.
Naturally, I completely forgot about this the first time around.
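A sketch of what five do-nothing handlers could look like. The addresses are the hardware's fixed vectors; the section names are descriptive labels chosen here, not necessarily the ones in the game's source:

```asm
SECTION "Vblank interrupt", ROM0[$0040]
    reti

SECTION "LCD status interrupt", ROM0[$0048]
    reti

SECTION "Timer interrupt", ROM0[$0050]
    reti

SECTION "Serial interrupt", ROM0[$0058]
    reti

SECTION "Joypad interrupt", ROM0[$0060]
    reti
```

Pinning each section to its address lets the linker complain if a handler ever grows past the eight bytes available before the next vector.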
These will do nothing. I mean, obviously, but they’ll do even less than nothing until I enable them. Interrupts are enabled by the dedicated ei instruction, which enables any interrupts whose corresponding bit is set in the IE register ($ffff).
So… which one do I want?
Game loop
To have a game, I need a game loop. The basic structure of pretty much any loop looks like:
1. Load stuff.
2. Check for input.
3. Update the game state.
4. Draw the game state.
5. GOTO 2
(If you’ve never seen a real game loop written out before, LÖVE’s default loop is a good example, though even a huge system like Unity follows the same basic structure.)
The Game Boy seems to introduce a wrinkle here. I don’t actually draw anything myself; rather, the hardware does the drawing, and I tell it what to draw by using the palette registers, OAM, and VRAM.
But in fact, this isn’t too far off from how LÖVE (or Unity) works! All the drawing I do is applied to a buffer, not the screen; once the drawing is complete, the main loop calls present(), which waits until vblank and then draws the buffer to the screen. So what you see on the screen is delayed by up to a frame, and the loop really has an extra “wait for vsync” step at 3½. Or, with a little rearrangement:
1. Load stuff.
2. Wait for vblank.
3. Draw the game state.
4. Check for input.
5. Update the game state.
6. GOTO 2
This is approaching something I can implement! It works out especially well because it does all the drawing as early as possible during vblank. That’s good, because the LCD spends most of every frame redrawing the screen one scanline at a time, then sits idle for a short vblank period before starting over.
While the LCD is refreshing, I can’t (easily) update anything it might read from. I only have free control over VRAM et al. during a short interval after vblank, so I need to do all my drawing work right then to ensure it happens before the LCD starts refreshing again. Then I’m free to update the world while the LCD is busy.
First, right at the entry point, I enable the vblank interrupt. It’s bit 0 of the IE register, but hardware.inc has me covered.
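A minimal sketch of that step, assuming hardware.inc is included near the top of the file and is a version that spells the constants IEF_VBLANK and rIE:

```asm
    ; Allow only the vblank interrupt: set bit 0 of IE and nothing else.
    ld a, IEF_VBLANK
    ldh [rIE], a
    ; Then switch interrupt handling on as a whole.
    ei
```

Writing the mask before ei means no unwanted interrupt can sneak in between the two steps.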
Next I need to make the handler actually do something. The obvious approach is for the handler to call one iteration of the game loop, but there are a couple problems with that. For one, interrupts are disabled when a handler is called, so I would never get any other interrupts. I could explicitly re-enable interrupts, but that raises a bigger question: what happens if the game lags, and updating the world takes longer than a frame? With this approach, the game loop would interrupt itself and then either return back into itself somewhere and cause untold chaos, or take too long again and eventually overflow the stack. Neither is appealing.
An alternative approach, which I found in gb-template but only truly appreciated after some thought, is for the vblank handler to set a flag and immediately return. The game loop can then wait until the flag is set before each iteration, just like LÖVE does. If an update takes longer than a frame, no problem: the loop will always wait until the next vblank, and the game will simply run more slowly.
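One way such a handler could be written, replacing the bare reti at $0040. The name vblank_flag and its placement in work RAM are assumptions made for this sketch:

```asm
SECTION "Vblank interrupt", ROM0[$0040]
    push hl             ; 1 byte: preserve the interrupted code's hl
    ld hl, vblank_flag  ; 3 bytes
    ld [hl], 1          ; 2 bytes: mark that a vblank has happened
    pop hl              ; 1 byte
    reti                ; 1 byte: return and re-enable interrupts

SECTION "Vblank flag", WRAM0
vblank_flag:
    ds 1                ; nonzero once vblank has fired
```

This version comes to exactly eight bytes, and none of its instructions modify the flags, so saving hl is enough.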
The handler fits in eight bytes (the linker would yell at me if it didn’t, since another section starts at $0048!) and leaves all the registers in their previous states. As I mentioned before, I originally neglected to preserve registers, and some zany things started to happen as a and f were abruptly altered in the middle of other code. Whoops!
Now the main loop can look like this:
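A sketch of that loop, assuming the vblank handler sets a one-byte flag called vblank_flag (a hypothetical name) and that setup has already run:

```asm
vblank_loop:
    ; Sleep until an interrupt fires.
    halt
    ; Conventional padding after halt.
    nop

    ; Any enabled interrupt wakes the CPU, so confirm it was vblank.
    ld a, [vblank_flag]
    and a               ; sets the zero flag if a is zero
    jr z, vblank_loop

    ; Clear the flag so the next pass waits for the next vblank.
    xor a               ; a = 0
    ld [vblank_flag], a

    ; Draw first, while video memory is still safe to touch:
    ; for example, copy the OAM buffer to the real OAM here.

    ; Then check for input and update the game state.

    jp vblank_loop
```

Sleeping with halt instead of spinning on the flag keeps the CPU idle for whatever is left of the frame.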
It’s looking all the more convenient that I have my own copy of OAM — I can update it whenever I want during this loop! I might need similar facilities later on for editing VRAM or changing palettes.
Doing something and reading input
I have a loop, but since nothing’s happening, that’s not especially obvious. Input would take a little effort, so I’ll try something simpler first: making Anise move around.
I don’t actually track Anise’s position anywhere right now, except for in the OAM buffer. Good enough. In my main loop, I add:
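Something along these lines would do it, assuming the OAM buffer is reachable through a label called oam_buffer (a hypothetical name) and Anise's sprite is its first entry:

```asm
    ; Each OAM entry is four bytes: y, x, tile, attributes.
    ; Point at the x-coordinate of the first entry and bump it.
    ld hl, oam_buffer + 1
    inc [hl]
```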
The second byte in each OAM entry is the x-coordinate, and indeed, this causes Anise’s torso to glide rightwards across the screen at 60ish pixels per second. Eventually the x-coordinate overflows, but that’s fine; it wraps back to zero and moves the sprite back on-screen from the left.

Excellent. I mean, sorry, this is extremely hard to look at, but bear with me a second.
This would be a bit more game-like if I could control it with the buttons, so let’s read from them.
There are eight buttons: up, down, left, right, A, B, start, select. There are also eight bits in a byte. You might suspect that I can simply read an I/O register to get the current state of all eight buttons at once.
Ha, ha! You naïve fool. Of course it’s more convoluted than that. That single byte thing is a pretty good idea, though, so what I’ll do is read the input at the start of the frame and coax it into a byte that I can consult more easily later.
Turns out I pretty much have to do that, because button access is slightly flaky. Even the official manual advises reading the buttons several times to get a reliable result. Yikes.
Here’s how to do it. The buttons are wired in two groups of four: the dpad and everything else. Reading them is thus also done in two groups of four. I need to use the P1 register, which I assume is short for “player 1” and is so named because the people who designed this hardware had also designed the two-player NES?
Bits 4 and 5 of P1 (counting from zero) determine which set of four buttons I want to read, and then the lower nybble contains the state of those buttons. A group is selected by clearing its bit. Note that each button bit is set to 1 if the button is released; I think this is a quirk of how they’re wired, and what I’m doing is extremely direct hardware access. Exciting! (Also very confusing on my first try, where Anise’s movement was inverted.)
The code, which is very similar to an example in the official manual, thus looks like this:
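A sketch in the shape of the widely circulated joypad routine. The variable name buttons, the number of repeated reads, and the choice to put the d-pad in the high nybble are all assumptions made here:

```asm
    ; Select the d-pad: clear bit 4, leave bit 5 set.
    ld a, %00100000
    ldh [rP1], a
    ; Read more than once so the lines can settle; keep the last read.
    ldh a, [rP1]
    ldh a, [rP1]
    ; Released buttons read as 1, so flip every bit, keep the low nybble.
    cpl
    and $0f
    ; Park the d-pad state in the high nybble of b.
    swap a
    ld b, a

    ; Select the other four buttons: clear bit 5, leave bit 4 set.
    ld a, %00010000
    ldh [rP1], a
    ldh a, [rP1]
    ldh a, [rP1]
    ldh a, [rP1]
    ldh a, [rP1]
    cpl
    and $0f
    ; Merge both groups into one byte where 1 means pressed.
    or b
    ld [buttons], a

    ; Deselect both groups again.
    ld a, %00110000
    ldh [rP1], a

SECTION "Input state", WRAM0
buttons:
    ds 1                ; one bit per button, 1 = held down
```

With this layout, bits 0 to 3 are A, B, select, and start, and bits 4 to 7 are right, left, up, and down.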
Phew. That was a bit of a journey, but now I have the button state as a single byte. To help with reading the buttons, I’ll also define a few constants labeling the individual bits. (There are instructions for reading a particular bit by number, so I don’t need to mask a single bit out.)
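A sketch of such constants, assuming the button byte keeps A, B, select, and start in the low nybble and the d-pad in the high nybble (one plausible layout, not necessarily the one the game uses). The BUTTON_ names are made up for illustration:

```asm
; Bit numbers, for use with the bit instruction. These definitions
; must start in the first column. Newer RGBDS releases spell this
; "DEF BUTTON_A EQU 0" instead.
BUTTON_A      EQU 0
BUTTON_B      EQU 1
BUTTON_SELECT EQU 2
BUTTON_START  EQU 3
BUTTON_RIGHT  EQU 4
BUTTON_LEFT   EQU 5
BUTTON_UP     EQU 6
BUTTON_DOWN   EQU 7
```

A test then reads as bit BUTTON_LEFT, a, which sets the zero flag when left is not held.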
Now to adjust the sprite position based on what directions are held down. Delete the old code and replace it with:
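A sketch of that replacement as a small routine. It assumes a button byte called buttons where 1 means pressed, bit-number constants named BUTTON_UP and so on, and an OAM buffer label oam_buffer; all of those names are hypothetical:

```asm
move_anise:
    ld a, [buttons]
    ld b, a
    ; The first OAM entry starts with y, followed by x.
    ld hl, oam_buffer

    bit BUTTON_UP, b
    jr z, .skip_up      ; zero flag set means the button is not held
    dec [hl]
.skip_up:
    bit BUTTON_DOWN, b
    jr z, .skip_down
    inc [hl]
.skip_down:
    inc hl              ; now pointing at the x-coordinate
    bit BUTTON_LEFT, b
    jr z, .skip_left
    dec [hl]
.skip_left:
    bit BUTTON_RIGHT, b
    jr z, .skip_right
    inc [hl]
.skip_right:
    ret
```

Holding two opposite directions simply cancels out, since both the decrement and the increment run.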
Miraculously, Anise’s torso now moves around on command!

Neat! But this still looks really, really, incredibly bad.
Aesthetics
It’s time to do something about this artwork.
First things first: I’m really tired of writing out colors by hand, in binary, so let’s fix that. In reality, I did this bit after adding better art, but doing it first is better for everyone.
I think I’ve mentioned before that rgbasm has (very, very rudimentary) support for macros, and this seems like a perfect use case for one. I’d like to be able to write colors out in typical rrggbb hex fashion, so I need to convert a 24-bit color to a 16-bit one.
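A sketch of such a macro in the older label-style rgbasm syntax. The helper symbols _r, _g, and _b are names chosen here, and newer RGBDS releases write MACRO dcolor and DEF _r = ... instead:

```asm
dcolor: MACRO           ; \1 is a color written as $rrggbb
; Keep the top five bits of each eight-bit channel.
_r = (((\1) >> 16) & $ff) >> 3
_g = (((\1) >> 8) & $ff) >> 3
_b = ((\1) & $ff) >> 3
    ; Game Boy Color layout: red in bits 0-4, green in 5-9, blue in 10-14.
    dw _r | (_g << 5) | (_b << 10)
ENDM
```

With this definition, dcolor $ff0044 emits the word $201f: red becomes 31, green 0, and blue 8.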
This is going to need a whole paragraph of caveats.
A macro is contained between MACRO and ENDM. The assembler has a curious sort of universal assignment syntax, where even ephemeral constructs like macros are introduced by labels. Macros can take arguments, but they aren’t declared; they’re passed more like arguments to shell scripts, where the first argument is \1 and so forth. (There’s even a SHIFT command for accessing arguments beyond the ninth.) Also, passing strings to a macro is some kind of byzantine nightmare where you have to slap backslashes in just the right places and I will probably avoid doing it altogether if I can at all help it.
Oh, one other caveat: compile-time assignments like I have above must start in the first column. I believe this is because assignments are also labels, and labels have to start in the first column. It’s a bit weird and apparently rgbasm’s lexer is horrifying, but I’ll take it over writing my own assembler and stretching this project out any further.
Anyway, all of that lets me write dcolor $ff0044 somewhere and have it translated at compile time to the appropriate 16-bit value. (I used dcolor to parallel db and friends, but I’m strongly considering using CamelCase exclusively for macros? Guess it depends how heavily I use them.)
With that on hand, I can now doodle some little sprites in Aseprite and copy them in. This part is not especially interesting and involves a lot of squinting at zoomed-in sprites.
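To show the shape of the data only: the colors and pixel pattern below are placeholders invented for this sketch, not the game's real artwork, and the labels PALETTE_BG0 and TEXTURE_TILE are hypothetical. It assumes a dcolor macro that emits one 16-bit color.

```asm
SECTION "Graphics data", ROM0

; One palette is four colors.
PALETTE_BG0:
    dcolor $80c870
    dcolor $48b038
    dcolor $287020
    dcolor $103010

; One 8x8 tile is eight rows. A backtick introduces a row of eight
; pixels, each digit from 0 to 3 picking a color from the palette.
TEXTURE_TILE:
    dw `00000000
    dw `00100000
    dw `01210000
    dw `00100000
    dw `00000100
    dw `00001210
    dw `00000100
    dw `00000000
```

Each dw row assembles to two bytes, so a tile occupies 16 bytes.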
Gorgeous. You may notice that I put the colors as data instead of inlining them in code, which incidentally makes the code for setting the palette vastly shorter as well:
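A sketch of a data-driven palette load, assuming four colors stored at a label called PALETTE_BG0 (a hypothetical name) and hardware.inc's rBCPS and rBCPD register names:

```asm
    ; Start at byte 0 of background palette 0. Setting bit 7 makes
    ; the index advance automatically after every write.
    ld a, %10000000
    ldh [rBCPS], a

    ld hl, PALETTE_BG0
    ld b, 8             ; four colors, two bytes each
.bg_palette_loop:
    ld a, [hl+]
    ldh [rBCPD], a
    dec b
    jr nz, .bg_palette_loop
```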
Loading sprites into VRAM also becomes a bit less of a mess:
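A sketch of an unrolled copy for a single tile. The source label TEXTURE_TILE is hypothetical, and the destination is simply the start of tile memory; the right address depends on how LCDC is configured:

```asm
    ld hl, $8000        ; start of tile data in VRAM
    ld de, TEXTURE_TILE
    ; A tile is 16 bytes. REPT pastes the body 16 times at assembly
    ; time, so there is no loop counter at run time.
    REPT 16
    ld a, [de]
    inc de
    ld [hl+], a
    ENDR
```

Unrolling trades ROM space for speed: the body is three bytes, so this copy costs 48 bytes of code for 16 bytes of data.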
Someday I should write an actual copy function, since at the moment, I’m using an alarming amount of space for pointlessly unrolled loops. Maybe later.
You may notice I now have two tiles, whereas before I was relying on filling the entire screen with one tile, tile 0. I want to dot the landscape with tile 1, which means writing a bit more to the actual background grid, which begins at $9800 and has one byte per tile.
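A sketch of dotting the map with tile 1. The positions are arbitrary examples, and the writes are assumed to happen while VRAM is accessible (with the LCD off, or during vblank):

```asm
    ; The background map is 32 tiles wide, so the tile at a given
    ; row and column lives at $9800 + row * 32 + column.
    ld a, 1
    ld [$9800 + 2 * 32 + 3], a      ; row 2, column 3
    ld [$9800 + 5 * 32 + 11], a     ; row 5, column 11
    ld [$9800 + 9 * 32 + 6], a      ; row 9, column 6
    ld [$9800 + 14 * 32 + 16], a    ; row 14, column 16
```

Only the top-left 20 by 18 tiles of the map are visible unless the background is scrolled.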
Sorry for all these big blocks of code, but check out this payoff!

POW! Gorgeous.
And hey, why stop there? With a little more pixel arting against a very reduced palette…
Yes, I am having trouble deciding on a naming convention.
This is now a 16×16 sprite, made out of two 8×16 parts. This post has enough code blocks as it is, and the changes to make this work are relatively minor copy/paste work, so the quick version is:
- Set the LCDC flag (bit 2, or LCDCF_OBJ16) that makes objects be 8×16. This mode uses pairs of tiles, so an object that uses either tile 0 or 1 will draw both of them, with tile 0 on top of tile 1.
- Extend the code that loads object tiles to load four instead.
- Define a second sprite that’s 8 pixels to the right of the first one.
- Remove the hard-coded object palette, and instead load the PALETTE_ANISE that I sneakily included above. This time the registers are called rOCPS and rOCPD.
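The first and last of those steps could be sketched as follows, assuming PALETTE_ANISE holds four 16-bit colors and that the rest of LCDC should be left as it is:

```asm
    ; Switch objects to 8x16 without disturbing the other LCDC bits.
    ldh a, [rLCDC]
    or LCDCF_OBJ16
    ldh [rLCDC], a

    ; Load object palette 0, with the index auto-incrementing (bit 7).
    ld a, %10000000
    ldh [rOCPS], a
    ld hl, PALETTE_ANISE
    ld b, 8             ; four colors, two bytes each
.obj_palette_loop:
    ld a, [hl+]
    ldh [rOCPD], a
    dec b
    jr nz, .obj_palette_loop
```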
Finally, extend the code that moves the sprite to also move the second half:
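One way to do it, which is not necessarily how the game does it: move the first half as before, then derive the second half from it. The label oam_buffer is hypothetical, and the two halves are assumed to be the first two OAM entries:

```asm
    ; Entry 0 occupies bytes 0-3 and entry 1 bytes 4-7.
    ; Copy y across unchanged.
    ld a, [oam_buffer]
    ld [oam_buffer + 4], a
    ; Place the second half 8 pixels to the right of the first.
    ld a, [oam_buffer + 1]
    add a, 8
    ld [oam_buffer + 5], a
```

Deriving the second position keeps the halves from drifting apart, at the cost of a few extra loads per frame.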
Cross my fingers, and…

Hey hey hey! That finally looks like something!
To be continued
It was a surprisingly long journey, but this brings us more or less up to commit 313a3e, which happens to be the first commit I made a release of! It’s been more than a week, so you can grab it on Patreon or GitHub. I strongly recommend playing it with a release of mGBA prior to 0.7, for… reasons that will become clear next time.
Next time: I’ll take a breather and clean up a few things.