A New Game
I’m writing a new game. Technically it’s an old game; it’s something I’ve been meaning to get around to for decades now, even if the plan has evolved over the years and what I’m trying to make now doesn’t have much, if anything, in common with my original ideas.
What I’m working on now is a Metroidvania-like platformer for the GBA. I have a few goals:
Like old LucasArts adventures, there should be no way to die. I’ve found that being forced to repeat the same few minutes of a game over and over until I get it right just isn’t interesting for me any more, and I don’t have time for it.
It should be possible to save anywhere. Castlevania, Metroid, and even more modern games like Bloodstained that really should know better, devolve into a frantic scramble from one save room to the next so that I can switch off the console and go deal with something more urgent.
The game should tell an interesting story.
The game should focus on exploration. Exploration should reward the player with new capabilities that allow access to new areas and more pieces of the story.
It shouldn’t be too large or complicated. No-one cares about GBA homebrew so there is no point.
I have the basics of story figured out, influenced too much by Greg Bear’s “Hull Zero Three”, Aldiss’ “Non-Stop”, Heinlein’s “Orphans of the Sky”, Galouye’s “Dark Universe”, and the original Star Trek episode “For The World is Hollow and I Have Touched the Sky”. Derivative? Yes. It’s even been done before in videogames. The Sega Genesis/Mega Drive game “Generations Lost” has an intro scene that is almost identical to the opening I’d envisioned for this game. I’ve got some original twists planned, though.
I don’t have a good answer for what the challenge is yet. There’s exploration; that’s good. And there’s a story; great. But what do you do in the game if there’s no way to die, and no monsters? Just wander around a big empty environment finding things to collect and chatting to NPCs? Is that a compelling game? Sounds a lot like Treasure Island Dizzy and that kept me entertained for hours - once I got the invulnerability cheat working - so maybe that’s enough if I throw in a few take-thing-x-to-place-y puzzles. Generations Lost is a more traditional platformer and has enemies and weapons, but that game only managed to keep my attention for a few minutes before I switched it off in annoyance. Is it a “walking simulator”? I’ve never played one, but it sounds like it might be. Perhaps I’ll start with that and see where I get to.
The other big concern is the game’s graphics. It takes me days to laboriously create terrible pixel art. The sub-Amiga-public-domain graphics in Professor Sinister are just about the limit of what I can do. I’m bad at pixel art and I don’t want to practice. Graphics are a chore that I’d rather someone else did.
The ECS Pattern
Coding the game is a different matter entirely. The game is an excuse for a fun experiment with the “entity component system” pattern. Anything that you’d think of as a “game object”, like the player’s character, NPCs, moving platforms, or collectible items, is represented by an “entity”. An entity is essentially just a container for a bunch of different “components”. A component is a struct that holds a set of basic information about some aspect of the entity. For example, a “velocity” component might contain the current x and y speed of the entity, the delta by which to mutate those values, and the maximum values that can be attained. Lastly, a “system” is a stateless function that iterates over all relevant entities (perhaps those with both a “velocity” component and a “position” component, for example), and mutates the state of their components using the components themselves as the inputs.
The ECS pattern is neat because giving game objects behaviors is simply a matter of giving the objects the relevant component(s). Want an entity to be affected by gravity? Add the “gravity” and “velocity” components. Now the gravity system will recognize the entity and mutate its velocity appropriately. It’s far more flexible than the approach I used in my unfinished shoot-em-up Chroma-X, which tried to achieve something similar with protocols/interfaces: a game object had a number of “sockets”, like an “engine” socket, and changing how the object behaved meant crafting a new class that implemented the appropriate interface and could be plugged into the appropriate socket. Want an enemy to move in a sine wave? Plug in an instance of the “sine wave” engine. Adding an additional behavior to a game object with that design meant modifying its class to support the behavior’s interface rather than just appending a component and letting the rest of the system do its thing automatically.
Professor Sinister used object-oriented C. Data was stored in reference-counted opaque types and everything had to be
malloced. Dynamic memory management allowed me to create my own implementations of variable-length mutable collections like arrays, sets, hashmaps, strings, etc. This time around I’m going in exactly the opposite direction. There isn’t a single call to
malloc anywhere in the code, unless I really want that chunk of memory to live in EWRAM.
Achieving the loosely-coupled ECS panacea (in which an entity can contain 0…n components, each of which has a different size) in C is impossible in any sane way without using
malloc() and variably-sized collections, so my solution gives all entities all possible components. Components include an “enabled” flag that systems look for when determining if an entity is relevant or not. It’s wasteful of memory but efficient in terms of processing time, as it eliminates searching through an entity’s components.
(On the subject of “loose coupling”, it’s worth pointing out that the ECS pattern doesn’t achieve it. For example, the jump system is responsible for determining if an entity can jump, and it needs input from numerous components to make that decision: Does the jump component indicate that the entity is already jumping? Does the input component say that the jump button is pressed? Does the duck component indicate that the entity is ducking? Does the environment collision component indicate that the entity is colliding with a tile above it? Systems are dependent on multiple components; those dependencies require that the components are updated in a specific order; and that ordering creates implicit dependencies between systems. Those dependencies can’t be enforced by a compiler, and they’re an impediment to multithreading the game logic.)
Deciding on the boundaries between components is interesting. For example, I have a “duck” component that tracks whether or not an entity is ducking. The component consists of two bools:
ducking. The ducking system checks the state of other components (is the entity jumping, or falling, or climbing?) and, if they are in a good state, sets the
ducking bool to match the state of the
down button on the d-pad. Meanwhile, running is implemented not as a separate component but as part of the velocity system. If
left on the d-pad is held down then the entity will move left; if
left are held, then the entity will move left more quickly. Should there be a separate component for the run state of the entity? A separate component feels wrong but I’m not entirely sure why, other than the additional overhead it would incur; possibly because it’s a modification of the walk function rather than a separate capability?
Similarly, it’s tricky to decide on which system should update the components. For example, if an entity is flying and grabs hold of a ladder, which system is responsible for setting the state of the
flight component from
inactive? The climbing system? Or the flying system? Perhaps an appropriate rule is “any system can read a component’s state, but only the main system for that component can mutate that state”. In our example, the flying system mutates the flying component. That rule feels like it will quickly collapse under the weight of too many edge-cases.
The platforming engine is coming along nicely. It supports climbable tiles (ladders), solid tiles, tiles with solid edges (any or all of the edges can be solid, so it’s possible to walk into the tile from the left but then hit its solid right edge), tiles with one-way edges (you could jump up through the tile from beneath and then land on its solid top edge), and tiles with a height map. Height maps are expressed as an array of 8 values between 0…8 that specify the height of the tile at a given x co-ordinate. They allow me to implement slopes with arbitrary gradients, bumps, dips, etc. This page was particularly helpful when getting slopes to work:
Map tiles can be marked as destructible. The player can stand next to destructible tiles and press a button, which results in the tiles being removed from the map. The GBA’s hardware limitations made that feature more complex to implement than you’d expect. Possible approaches I considered were:
- Mutating the level data (this isn’t desirable because it would require copying the level data from ROM to EWRAM, which is limited to 256K);
- Storing a dictionary of destroyed tile co-ordinates (this isn’t desirable because looking up the data in the dictionary would not necessarily be fast; each co-ordinate would require a couple of uint32_ts to store; and the maximum size for the dictionary would either need to be the size of the map (see previous problem regarding RAM limits) or some arbitrary smaller limit, which would mean we could unexpectedly run out of space in the dictionary);
- Create a small ring buffer of destroyed tile co-ordinates. Tiles would be overwritten automatically with newly destroyed co-ordinates when the buffer wraps, which means the oldest tiles would heal themselves like Lode Runner when new tiles were destroyed (not exactly what I wanted, but it would work);
- Create a bit array in which each bit represents the destroyed state of a tile in the map.
After playing around with the first idea for a while I realised it wasn’t going to work with the way I was representing tiles. Each tile was a complex struct that stored a large amount of metadata. I rewrote that to use something like the flyweight pattern. I maintain a palette of unique tile definitions and represent the map with a second array of indexes into the tile palette. That shaved about 20% off the size of the ROM, which currently only has a couple of mostly-empty test maps.
Even with that change, the bit array option was far more efficient. A single screen map consumes a minimum of 600 bytes if copied to RAM (assuming one byte per tile entry). The equivalent bit array consumes 75 bytes.
Initially I had everything implemented with integer math, but it became increasingly clear that the engine needed to use fixed-point math instead. The jump was too stiff; gravity was too aggressive; and running was too fast. Making the switch introduced an exciting selection of new problems and bugs, most of which were related to rounding problems. For example,
maxX() used to be implemented like this:
return rect.origin.x + rect.size.width - 1;
If I have a rect at point (0, 0) with width and height both equal to 2,
maxX() will give me a value of 1. That’s what I’d expect: the maximum X coordinate contained within the rect is 1. I can use the rect functions to step through the columns in the rect like this:
for (int i = minX(rect); i <= maxX(rect); ++i)
That falls apart with fractional values, though. Suppose I have a rect with a width of 0.5. The integer version of the function would return a maximum X coordinate of -0.5, which is clearly wrong. If we were to retain the decision to have
maxX() return the largest X coordinate that falls within the rect the correct implementation would look like this:
return rect.origin.x + rect.size.width - FIXED_EPSILON
FIXED_EPSILON is the smallest magnitude value representable by a fixed point number.
It was more effort, but less surprising for users of the function, to change
maxX() so that it simply returned the first possible coordinate outside the bounds of the rectangle:
return rect.origin.x + rect.size.width;
Stepping through the columns in the rect then requires this alternative iterator:
for (int i = minX(rect); i < maxX(rect); ++i)
Writing anything for the GBA requires time fixing performance problems. The most significant cause of problems so far, and the most surprising, was the
Div BIOS function. I’d taken care to use the BIOS
Mod functions everywhere thinking that they were the most performant way to divide on the system. Given that I’ve already experienced bugs in the DS’ BIOS (cough CpuFastSet cough) you’d think I’d know better. In any case, ROMs built using
Div performed noticeably worse than ROMs using the standard C divide operator.
The Testing Dance
Testing ROMs requires an elaborate dance with three laptops and a GBA:
- New laptop to write and build the code;
- Old laptop with an SD slot to copy the ROM to an SD card;
- GBA to test;
- Very old laptop running Windows to check no$gba’s CPU meter.
Grabbing the REG_VCOUNT value gives me a reasonable estimate of CPU usage and should mean I don’t need no$gba, but my first attempt at writing a profiler wasn’t accurate enough. mGBA’s logging functions (which I was calling to see the contents of the profiler) are expensive. If I improve the profiler hopefully I won’t need the Windows machine again.