2017-10-15

SZLib 3DS

Just in time for it to be killed off by Nintendo in favour of the Switch (if you could ever find them in the shops; grumble grumble), Hanky Alien and Chuckie Egg now run on the 3DS. Getting the new platform up and running was pretty easy. Input was straightforward: the d-pad required switching to a couple of new functions for reading the button state but otherwise the constants are the same in libctru as those in libnds. Touch somehow just worked, which was surprising.

The hardest part was getting the graphics up and running. Not only does the 3DS introduce yet another set of RGB encodings, but the output of the LCDs is rotated 90 degrees from the VRAM. What’s that? You have a bunch of 2D graphics functions that are all optimized around VRAM being laid out in rows, from left to right? Bwahaha! Now VRAM is organized into columns, from bottom to top.
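A minimal sketch of what that change means for pixel addressing, assuming a 400x240 top screen stored as 400 columns of 240 pixels and a 16-bit pixel format. The function names are hypothetical; they are not part of SZLib or libctru.

```c
#include <stddef.h>
#include <stdint.h>

/* Logical (on-screen) dimensions of the 3DS top screen. */
enum { SCREEN_WIDTH = 400, SCREEN_HEIGHT = 240 };

/* Conventional layout: rows run left to right, top to bottom. */
static size_t rowMajorIndex(int x, int y) {
    return (size_t)y * SCREEN_WIDTH + (size_t)x;
}

/* Rotated layout: memory is organized into columns, and each column
 * runs from the bottom of the screen to the top. */
static size_t rotatedIndex(int x, int y) {
    return (size_t)x * SCREEN_HEIGHT + (size_t)(SCREEN_HEIGHT - 1 - y);
}

/* Plots one 16-bit pixel using ordinary on-screen coordinates. */
static void plotRotated(uint16_t *vram, int x, int y, uint16_t colour) {
    vram[rotatedIndex(x, y)] = colour;
}
```

With this layout, two horizontally adjacent pixels sit 240 entries apart in memory, so a fill routine tuned for contiguous rows has to become a fill routine tuned for contiguous columns.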

Though graphics were the trickiest part, the most laborious exercise was checking that the projects continued to build correctly for all five of the currently supported platforms (DS, 3DS, PSP, Dreamcast and SDL). I really need to come up with a better build system for the PSP and Dreamcast, and it’d be neat if devkitARM supported makefile names other than “makefile”.

There are a few things still to do:

  • libctru introduces yet another sound API, so there’s no sound yet;
  • SZLib doesn’t cater for the possibility that a device has multiple screens with different dimensions, so the bottom screen is off-center.

Anyhow, here are early 3DS builds, which I’ve tested out on my old 3DS XL and Citra (which is shaping up to be a great homebrew tool):

2017-04-10

Chuckie Egg 20170410

With egg hunting season almost upon us, it seems appropriate to release a little update for my favorite egg hunting game. Grab it now:

I’d hoped to get a level editor into the next release, but it’s not finished. This version fixes a few bugs and adds a couple of minor new features.

Changelog:

  • Scores are persisted on DS and PSP.
  • PSP analog stick can control Harry.
  • Fixed crash when completing a level if a cheat label is showing.
  • Fixed crash when switching to high score table.
  • Harry dismounts ladders correctly when climbing and jump is pressed.
  • Player does not get points when hens eat grain.
  • Bonus cannot be negative.

I haven’t looked into working with the Dreamcast’s VMU yet.

2016-11-19

Chuckie Egg Releases

Just in time for the holidays, here are final versions of Chuckie Egg for DS and PSP, and an almost-final version for the Dreamcast:

Chuckie Egg DS

Post Mortem

PSP

The PSP has a huge screen. Well, relative to the DS; maybe not so huge if you compare it to a modern phone. What do you do with all that space if you’re remaking a game that fits in 256x192 pixels? The obvious thing to do is double up the pixels and make the graphics twice the size. But wait; at 272 pixels tall the PSP’s screen is only about 42% taller than the Dragon 32’s 192-pixel display. Doubling the game to 384 pixels tall would leave just 70% of it visible, which means we can’t double up the resolution without losing 30% of the game’s display. I imagine it’s possible to wrap the game’s bitmap onto a quad and get the GPU to stretch it, but I eventually decided just to centre the game at its native resolution.

The PSP’s screen is 480 pixels wide but its default framebuffer is 512 pixels wide. Trying to change the width of the framebuffer to match the screen does not work (it results in the display being smeared horizontally, which always means “the dimensions of your bitmap do not match the layout of memory”). I had to add in the concept of “usable” vs “available” framebuffer space (“physical” vs “logical”, I suppose) and base the layout of the various screens on that.
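The “usable” vs “available” idea might be sketched like this. It assumes a simple struct that records both widths; the type and function names are hypothetical and not SZLib’s real API.

```c
#include <stddef.h>

typedef struct {
    int stride;        /* pixels per row in memory ("available" width) */
    int usableWidth;   /* pixels per row that are actually visible */
    int usableHeight;  /* visible rows */
} FrameBufferLayout;

/* Offset, in pixels, of (x, y) from the start of the buffer. Rows
 * advance by the stride, not by the visible width. */
static size_t pixelOffset(const FrameBufferLayout *fb, int x, int y) {
    return (size_t)y * (size_t)fb->stride + (size_t)x;
}

/* Top-left corner at which content of the given size is centred
 * within the usable area. */
static void centreOrigin(const FrameBufferLayout *fb, int width, int height,
                         int *outX, int *outY) {
    *outX = (fb->usableWidth - width) / 2;
    *outY = (fb->usableHeight - height) / 2;
}
```

For a layout of { 512, 480, 272 } and a 256x192 game, the origin works out to (112, 40). Basing the calculation on the 512-pixel stride instead of the usable width would push everything 16 pixels to the right.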

It’s possible to push the PSP’s “home” button and quit from a game back to the launcher. Making this work entails creating a background thread and wiring up a callback that the OS can use to signal to the game that it needs to quit. It’s very simplistic; if the game doesn’t implement the callback then the system just hangs forever. The OS doesn’t appear to forcibly quit rogue processes. I’m not even sure that it’s really a multi-process system. I imagine Sony handled errors like this in their QA process so they didn’t bother with anything more elaborate to catch badly-implemented games.

Different geographic regions have different PSP button layouts. I don’t quite know what Sony thought the advantage of swapping the O and X buttons was, but it means well-behaved homebrew has to work around it. Games typically either allow the buttons to be remapped or come in different builds for different layouts. Official games were region-locked so that wasn’t an issue; it was probably fixed during localization. I’m too lazy to deal with it.

Getting sound working was troublesome. The basic PSPSDK libraries offer a way - I think - to craft PCM data and play it back, which really doesn’t help if you just want to play a WAV and not put any thought into it. I ended up using libmikmod, which has some peculiarities. Its MikMod_Update() function must be called regularly in order for sounds to work, but that can be stuffed into another background thread and forgotten about. It offers two ways to play back samples: Sample_Play(), which will automatically choose a channel to play back on, and Voice_Play() which lets you choose the channel. I found that Sample_Play() always chose exactly the wrong channel and ended up stopping sounds that I wanted to continue playing, so I switched to the alternative function. Voice_Play() doesn’t have any useful documentation. Inspecting the library’s code reveals that the trick is to set the volume, frequency and panning of the voice each time a new sample begins:

Voice_Play(channel, sound, 0);
Voice_SetVolume(channel, sound->volume << 2);
Voice_SetPanning(channel, PAN_CENTER);
Voice_SetFrequency(channel, sound->speed);

DS

The DS has a second screen. What do you do with two screens when remaking a game designed for one? I had a few ideas:

  • Level editor;
  • Chuckie Egg title screen;
  • Simian Zombie logo;
  • Dragon 32 image.

In the end the SZ logo was simplest, so I went with that. Otherwise, the DS is my default platform, so there aren’t any surprises here.

Dreamcast

It appears that calling snd_sfx_stop() on a channel playing a stereo WAV file will only stop one of the channels, which explains why sounds weren’t behaving themselves in the beta. Switching to mono files fixed it. Another problem was caused by limitations of the audio hardware: the Dreamcast can only play back the first 65534 samples in a WAV file, which is far shorter than several of the sounds in the game. To work around this I halved the sample rates of the longer samples. That fixed most of them, but the title song and Spectrum loading sound were still too long. I converted those to MP3 and used the streaming API, but that created problems of its own. Sounds still cut out or don’t play, and the emulator I’ve been using to help development crashes whenever I call mp3_stop(). There’s still some work to do there.
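Halving a sample rate can be sketched as follows, assuming mono 16-bit signed PCM and a crude “average each pair of samples” approach. This is an illustration only; the actual conversion may well have been done with an audio tool that filters properly, and the function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Halves the sample rate of mono 16-bit PCM by averaging each pair of
 * adjacent samples. Writes count / 2 samples to out and returns that
 * number; a trailing odd sample is dropped. */
static size_t halveSampleRate(const int16_t *in, size_t count, int16_t *out) {
    size_t outCount = count / 2;

    for (size_t i = 0; i < outCount; ++i) {
        /* Sum in 32 bits so two large samples cannot overflow. */
        int32_t sum = (int32_t)in[2 * i] + (int32_t)in[2 * i + 1];
        out[i] = (int16_t)(sum / 2);
    }

    return outCount;
}
```

A 100,000-sample sound becomes 50,000 samples, which fits under the 65534-sample limit as long as it is played back at half the original rate.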

The DC has a similar problem to the PSP: I’m using the 320x240 NTSC graphics mode, but a good portion of that is off-screen on my TV. The “usable” framebuffer space concept helped out here.

The DC feels noticeably faster than the other two platforms, which is odd, as I’d expect them to run at identical framerates. Perhaps it’s just because it runs on such a massive display.

The Dreamcast’s reversed ARGB ordering means I’ve had to convert bitmaps into both DS and DC sourcecode and use #ifdefs to choose the right version of the data for the current target platform at compile time.
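The relationship between the two encodings can be sketched as a red/blue swap. This assumes both formats are 16-bit with one alpha bit at the top and three 5-bit colour fields below it (ABGR on the DS, ARGB on the DC). The function is hypothetical; the project converts the bitmaps ahead of time rather than at runtime.

```c
#include <stdint.h>

/* Swaps the red and blue fields of a 16-bit 1555 pixel, leaving the
 * alpha bit (bit 15) and green (bits 5-9) untouched. Applying it to an
 * ABGR pixel gives the ARGB equivalent, and vice versa. */
static uint16_t swapRedBlue1555(uint16_t pixel) {
    uint16_t alphaGreen = pixel & 0x83E0;      /* bit 15 and bits 5-9 */
    uint16_t low = pixel & 0x001F;             /* bits 0-4 */
    uint16_t high = (pixel >> 10) & 0x001F;    /* bits 10-14 */

    return (uint16_t)(alphaGreen | (low << 10) | high);
}
```

Because the swap is its own inverse, the same function converts in either direction.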

Cross Platform

What was most surprising was how little effort I had to put into Chuckie Egg in order to make it performant on these different platforms. The effort that went into Hanky Alien presumably helped a lot. I was testing HankyAlien with up to ~140 views on screen at once and up to ~45 of those updating simultaneously. In contrast, despite superficially being a more complex game, Chuckie Egg has ~45 views on screen at once and only ~10 updating simultaneously. (It’s interesting to note that, while both the DS and DC struggled with the HankyAlien test scenario and their framerates suffered, the PSP sauntered through it without a hitch. I really must write something specifically for the PSP in future.)

Writing object-oriented C is fun, but sometimes I look at the amount of boilerplate necessary to create a new “class” and decide to do things another way, particularly if I’m creating a base class. The most simple class requires:

  • A static metaclass struct, which forms a linked list back through the chain of superclasses (it enables the program to determine the type of an object at runtime).
  • A static callback struct that includes pointers to equality/comparison/hash/copy/dealloc functions.
  • Declarations of those functions.
  • Implementations of those functions.
  • A struct with the member variables of the class.
  • A typedef that defines a pointer to the struct, allowing the struct to be hidden.
  • A constructor that calls the allocator and initializer methods of the class.
  • An initializer function that sets up the members of class instances and calls back to the superclass’ initializer.
  • A convenience release method that wraps the basic release method.
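The metaclass chain from the first bullet can be sketched like this. All of the names are hypothetical and the real library’s structs will differ; the point is only to show how a linked list of metaclasses enables runtime type checks.

```c
#include <stdbool.h>
#include <stddef.h>

/* One static metaclass per "class"; the superclass pointers form a
 * linked list back to the root. */
typedef struct MetaClass {
    const char *name;
    const struct MetaClass *superclass;
} MetaClass;

/* Every object starts with a pointer to its metaclass. */
typedef struct {
    const MetaClass *isa;
} Object;

static const MetaClass ObjectMeta = { "Object", NULL };
static const MetaClass ItemMeta = { "Item", &ObjectMeta };
static const MetaClass EggMeta = { "Egg", &ItemMeta };

/* Walks the superclass chain to test whether obj is an instance of
 * cls or of one of its subclasses. */
static bool objectIsKindOf(const Object *obj, const MetaClass *cls) {
    for (const MetaClass *m = obj->isa; m != NULL; m = m->superclass) {
        if (m == cls) {
            return true;
        }
    }

    return false;
}
```

An object whose isa points at EggMeta is a kind of Item and a kind of Object, but an Item is not a kind of Egg.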

If the class is intended to be subclassed it must also:

  • Provide an additional private header including:
    • A callback struct definition that nests the superclass’ callback struct and enables “methods” to be “overridden” (really it’s the template pattern with function pointers).
    • The struct itself, so subclasses can directly access properties of the struct and include the struct as the first item in their own structs.
  • Call the functions in the callback struct when appropriate (including those for the superclass).
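A minimal sketch of the subclassing arrangement, assuming a hypothetical “Item” base class and “Egg” subclass (the game itself deliberately avoids this hierarchy). The names and point values are made up for illustration.

```c
#include <stddef.h>

/* Root callbacks shared by every class. */
typedef struct {
    const char *(*describe)(const void *self);
} ObjectCallbacks;

typedef struct Item Item;

/* Item's callbacks nest the superclass' callbacks and add an
 * overridable "points" method. */
typedef struct {
    ObjectCallbacks object;
    int (*points)(const Item *self);
} ItemCallbacks;

/* Exposed in the private header so that subclasses can embed it. */
struct Item {
    const ItemCallbacks *callbacks;
    int x;
    int y;
};

/* Public entry point: dispatches through the callback struct. */
static int itemPoints(const Item *item) {
    return item->callbacks->points(item);
}

/* Subclass: the Item struct is the first member, so a pointer to an
 * Egg can be treated as a pointer to an Item. */
typedef struct {
    Item base;
    int hatched;
} Egg;

static const char *eggDescribe(const void *self) {
    (void)self;
    return "Egg";
}

static int eggPoints(const Item *self) {
    const Egg *egg = (const Egg *)self;
    return egg->hatched ? 0 : 100;
}

static const ItemCallbacks eggCallbacks = { { eggDescribe }, eggPoints };

static void eggInit(Egg *egg, int x, int y) {
    egg->base.callbacks = &eggCallbacks;
    egg->base.x = x;
    egg->base.y = y;
    egg->hatched = 0;
}
```

Code that only knows about Item calls itemPoints() and gets the Egg behaviour, which is the “override”. Even this cut-down version runs to about fifty lines before any memory management or type information is added.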

Every time I create a new class I do so by copying and pasting an existing class and modifying it, rather than starting from scratch and retyping all of the boilerplate. Sometimes I’ll think about the work involved in setting one of these things up, pine for C++ or Objective-C, and then use a different structure that’s less “correct” but will still get the job done without smelling too bad. The grain and eggs, for example, should really inherit from a single “collectible item” class, but the correctness of that design doesn’t in any way compensate for the amount of work involved in setting it all up.

Keeping a strict separation between the model (game logic), the views (the stuff you see on screen) and the view controller (delegate of the model; updates the views to match incoming events) results in some beautifully clean code but makes prototyping difficult, especially early on when none of the infrastructure is in place. I did most of the early prototyping in the view controller before moving it into a separate model.

Each level renders its static content (ladders, platforms and bird cage) as a bitmap when it begins. It passes that up through the model to the view controller, which uses the bitmap as the background view for the game. That eliminated the need to create dozens of views to represent tiles that never change during the course of a level.

There are now 134,000 lines of code in the project, but a sizable number of those are included in bitmaps that have been converted to C.

2016-11-14

PSP Surprise

I’ve mentioned before that SZLib presents challenges:

The downside to completely ignoring the capabilities of the DS hardware and doing all of the graphics work with just the CPU and a framebuffer is that it is insanely expensive.

However, its bare-bones approach - it just needs a framebuffer and a way to read a joystick, pretty much - made getting it working on the Dreamcast fairly trivial. Getting it working on other platforms should be simple too, but when choosing a new platform for the library I eliminated everything but the Dreamcast as a potential target.

I was too hasty when I crossed the PSP off that list. The handheld does offer the ability to write directly to the framebuffer. It took about an hour or so to get to this point:

ChuckieEggPSP

By far the hardest part was getting MikMod set up to play sound effects, which took a couple of hours on its own.

There’s a little bit of work to do. The PSP has a physical screen size of 480x272, but for some reason it seems that the frame buffer is 512 pixels wide. The disconnect between the physical and logical sizes means everything is slightly off-center. Also, the “home” button isn’t wired up yet. Overall, though, the game works just as well on the PSP as it does on the Dreamcast and NDS.

Excitingly, because they’re based on the same libraries, HankyAlien also now runs on the PSP.

2016-11-10

Chuckie Egg

Back in the early eighties, when the home computer boom was booming, my parents decided to buy themselves a computer. Following the advice of various computing magazines of the era they ended up with what was supposedly the best of the bunch: the Dragon 32. It featured 32K of RAM, a real keyboard and a Centronics parallel port. It was powered by Bill Gates’ favourite 8-bit CPU, the Motorola 6809. Unfortunately for my parents and their new Dragon, the manufacturer ran into financial troubles and collapsed not long after.

The unusual choice of the 6809 CPU meant very little software was written for or ported to the platform. One of the few games that made it was the BBC and Spectrum classic “Chuckie Egg”. This obscure port (the World of Spectrum doesn’t list it) is by far my favourite version, and it’s the game I spent the most time playing on our Dragon. There has never been a remake of it. Worse, neither the Dreamcast nor the Nintendo DS has a native version of Chuckie Egg. Let’s fix that.

Here’s a beta version of my remake of the Dragon 32 version of Chuckie Egg for Nintendo DS and Dreamcast:

A couple of screenshots:

Chuckie Egg Logo Chuckie Egg Game

The game itself is fully playable, but there’s still some polishing to do:

  • Sound on the Dreamcast is misbehaving;
  • The Dreamcast version runs faster than the DS, so some timings are off;
  • Later levels do not have increased numbers of hens or faster moving hens;
  • Hens don’t eat grain;
  • The high score table is not functional (it’s just a bitmap);
  • There’s no “game over” screen;
  • The score/timer/etc display layout and font is all wrong;
  • Remaining lives aren’t displayed;
  • The font in the title screen scroller is wrong.

This is a non-exhaustive list of bugs I spotted in the original game while making this:

  • On the first level, climb the largest ladder to the third platform on the right (the start of the staircase). Stand on the ladder just below the top of the floor and jump right. Instead of bouncing off the first step in the staircase, Harry lands inside it. Now press right and Harry will fall through the platform on the right. (The remake fixes this.)

  • On level 7, stand next to the two blocks immediately on the right when the level starts. Face them, hold “right”, and jump. Harry stands in the middle of the upper block. (The remake reproduces this.)

  • Animation frame doesn’t reset when Harry stands on an elevator. (The remake fixes this.)

  • When a new level rolls in the remaining time is set to 999; it immediately jumps to 898 when the level starts. After dying it resets to 899.

  • Harry gets killed if the elevator he is standing on hits the top of the screen, but if he is jumping then he falls through the elevator. (The remake doesn’t kill Harry; he just drops down when the elevator disappears.)

  • There are a bunch of inconsistencies with ladder behavior. Sometimes it is possible to walk from a platform all the way across a ladder; sometimes Harry falls through half-way along the ladder. This looks more like a bug than desired behavior.

  • One of the extra hens on level 3 (at the bottom of the twin ladders) starts one block below where it should do.

Here are some of the differences between the remake and the original:

  • Jump isn’t quite the same.
  • Frame rate is much better.
  • No flickering.
  • Hen patterns aren’t the same (but the Dragon doesn’t match the BBC or ZX, and they don’t match each other, so that’s OK).
  • Duck moves smoothly.
  • Hens move more smoothly.

I think this will be the last remake for a while. This is the list of remakes I’ve managed to complete (more or less) so far:

  • Pong (Blitz Basic/C/C++/Java/Flash; Amiga/Cybiko/GBA/DS)
  • Asteroids (Flash)
  • Mario Bros (Flash)
  • Tetris (Flash)
  • IK+ (Flash)
  • PacMan (Java; Linux)
  • Space War (Java; web applet)
  • Earth Shaker (C++; DS/SDL)
  • Super Foul Egg (C++/Objective-C; DS/SDL/OSX/iOS)
  • BioShock sub game (C++; DS/SDL)
  • Space Invaders (C; DS/SDL/DC)
  • Chuckie Egg (C; DS/SDL/DC)

This is the list of remakes I started but abandoned:

  • World Grand Prix (Flash)
  • Trick or Treat (Java; web applet)
  • Atic Atac (C/C++; Cybiko/DS/SDL)
  • Defender (C/C++; DS)
  • Dan Dare (C; DS/SDL)

2016-10-13

HankyAlien v20161013

New releases of HankyAlien for Dreamcast and Nintendo DS!

Changes for Dreamcast are:

  • Added sound;
  • Fixed palette;
  • Improved performance;
  • Fixed layout for NTSC TVs.

Note that the sound doesn’t work at all well with lxdream.

Aside from the massive behind-the-scenes rewrite, there shouldn’t be anything noticeably different between this DS version and the last release.

2016-10-10

HankyAlienDC

A few more evenings of reorganising things for the Dreamcast and here’s HankyAlienDC:

HankyAlienDC

The archive contains a .elf version that will run in lxdream, and presumably other emulators, and a .bin version that will run via an SD adaptor and DreamShell RC4. It will presumably run if you somehow burn it to a CD, but it’s been so long since I’ve made a CD of Dreamcast homebrew that I’ve completely forgotten how to do it. Plus I’m on a different OS and don’t have an optical drive any more.

There are still a few issues to be fixed:

  • The colors are wrong. The components in DS pixels are ordered ABGR; DC pixels are ordered ARGB.
  • Elements at the top and bottom of the screen are cropped out. The display is set to 320x240@60Hz, but of course NTSC TVs don’t display 240 vertical pixels; they display 200.
  • There’s no sound yet.

Here’s my Dreamcast test hardware, acquired from Amazon and eBay:

Dreamcast

Things I was surprised to learn about the Dreamcast:

  • There were multiple revisions of the hardware. The final US revision doesn’t support homebrew because Sega removed support for the “Mil-CD” multimedia disc format that had enabled all piracy on the system. There was only one PAL revision, which did support Mil-CD, which explains why I’d never heard of incompatible hardware before.
  • On a modern flatscreen TV - or at least, on my TV - the quality of the composite video output is awful. Really, really awful.
  • The performance of the emulator isn’t at all representative of the performance of the hardware.

2016-10-05

Hanky Alien Refactored

I’ve been toying with the idea of writing another homebrew DS game using the libraries I developed for Hanky Alien. Unfortunately the existing code has one considerable flaw: the game logic is intimately entwined with the rendering system, to the point that writing a new game unavoidably involves rewriting a bunch of graphics code that really should have been separate.

To fix this I’ve pulled apart Hanky Alien and reassembled it into three distinct pieces:

  • A layering library that handles all of the rect stuff I insist on using instead of just writing a sprite simulator for SDL like any sensible person;
  • An MVC GUI framework with the components necessary to get the game running (view hierarchy, controllers, timers, label views, bitmap views, event-based stylus and button handling, transitions, etc);
  • The game itself.

The game logic is currently included in the same build target as its presentation layer, but those can be trivially separated once I get around to it.

The new structure has a multitude of benefits. In order to place text on the screen, the code no longer needs to figure out where the text will go, figure out if it needs to erase what was previously at those co-ordinates, and then render the text; it just updates the string in a label view and lets the underlying libraries do all of the hard work. There’s a complete separation between the game logic and all I/O. Timer events piped down from the controller kick the game’s update logic, and game events (such as movement, or the addition or removal of objects within the game) are sent up to the controller via delegate callbacks. The controller is responsible for creating, moving, updating and removing views. I can change the game logic without having to think about any rendering. Each scene in the game - the “Simian Zombie” scene, the title scene, and the game scene itself - is a separate controller. The GUI framework handles animating the transition from one controller to the next, which means a bunch of complex custom state machine logic became redundant. On top of that, the whole shebang is written in pure C.
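The flow of events between model and controller can be sketched as a struct of function pointers. This is a minimal illustration with hypothetical names, a single bullet standing in for the whole game, and a logging stub standing in for the controller.

```c
#include <stddef.h>

/* Events that the model sends up to its delegate (the controller). */
typedef struct {
    void *context;
    void (*objectMoved)(void *context, int objectId, int x, int y);
    void (*objectRemoved)(void *context, int objectId);
} GameDelegate;

/* The model knows nothing about views; it only holds game state. */
typedef struct {
    GameDelegate delegate;
    int bulletY;
    int bulletAlive;
} GameModel;

/* Called by the controller on each timer event. */
static void gameModelUpdate(GameModel *model) {
    if (!model->bulletAlive) {
        return;
    }

    model->bulletY -= 4;

    if (model->bulletY < 0) {
        model->bulletAlive = 0;
        if (model->delegate.objectRemoved) {
            model->delegate.objectRemoved(model->delegate.context, 0);
        }
    } else if (model->delegate.objectMoved) {
        model->delegate.objectMoved(model->delegate.context, 0, 0,
                                    model->bulletY);
    }
}

/* Stand-in for a controller: a real one would move or remove views. */
typedef struct {
    int moves;
    int removals;
    int lastY;
} ControllerLog;

static void logMoved(void *context, int objectId, int x, int y) {
    ControllerLog *log = context;
    (void)objectId;
    (void)x;
    log->moves++;
    log->lastY = y;
}

static void logRemoved(void *context, int objectId) {
    ControllerLog *log = context;
    (void)objectId;
    log->removals++;
}
```

The model never touches a view or the framebuffer, so swapping the logging stub for a real controller requires no change to the game logic.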

All of this structure came with a horrible cost, however. Performance was nowhere near acceptable.

One huge benefit of manually managing the layout and redraw of the screen is that each situation can be handled as efficiently as possible. A generic approach to updating the framebuffer leaves no opportunity for taking shortcuts that make sense in one place but would break other code. This forced me to dig through the entire codebase and try to optimize everything.

First step: profiling with Instruments. Hanky Alien builds happily using SDL2, so I can profile it on my Mac. It doesn’t necessarily give me accurate information, obviously; the Mac and DS have wildly different architectures and capabilities. It does give me a starting point, though.

One mistake that quickly became apparent was the amount of boxing and unboxing of rect structs that the layering library was performing. Subtracting one rectangle from another produces a set of leftover rectangles, and those were returned in an array. The array provided by the core library would only store objects, so I’d opted for the naive approach of boxing the leftover rect structs into value objects and sticking those in an array. Unfortunately each boxing operation required a malloc(), and any attempt at using the data in the objects required them to be unboxed back into structs first. The fix for this was easy: create a new array class specifically for storing collections of rect structs. As long as a resize isn’t required - which inevitably demands a realloc() call - appending rects involves nothing more than changing an int and copying a struct.
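A rough sketch of both ideas: a growable array that stores rect structs by value, and a subtraction routine that appends its leftovers to it. The names and the splitting order are assumptions, not the library’s real code, and allocation failures are ignored inside the subtraction for brevity.

```c
#include <stdlib.h>

typedef struct { int x, y, width, height; } Rect;

/* Growable array that stores Rect structs by value (no boxing). */
typedef struct {
    Rect *items;
    int count;
    int capacity;
} RectArray;

/* Returns 1 on success, 0 if memory could not be allocated. */
static int rectArrayAppend(RectArray *array, Rect rect) {
    if (array->count == array->capacity) {
        int newCapacity = array->capacity ? array->capacity * 2 : 4;
        Rect *grown = realloc(array->items,
                              (size_t)newCapacity * sizeof(Rect));
        if (!grown) {
            return 0;
        }
        array->items = grown;
        array->capacity = newCapacity;
    }

    array->items[array->count++] = rect;  /* copy a struct, bump an int */
    return 1;
}

/* Appends the parts of a that are not covered by b (at most four). */
static void rectSubtract(Rect a, Rect b, RectArray *leftovers) {
    int ax2 = a.x + a.width, ay2 = a.y + a.height;
    int bx2 = b.x + b.width, by2 = b.y + b.height;

    /* No overlap: all of a is left over. */
    if (b.x >= ax2 || bx2 <= a.x || b.y >= ay2 || by2 <= a.y) {
        rectArrayAppend(leftovers, a);
        return;
    }

    if (b.y > a.y) {   /* strip above b; then shrink a to start at b.y */
        rectArrayAppend(leftovers, (Rect){ a.x, a.y, a.width, b.y - a.y });
        a.height -= b.y - a.y;
        a.y = b.y;
    }
    if (by2 < ay2) {   /* strip below b; then shrink a to end at by2 */
        rectArrayAppend(leftovers, (Rect){ a.x, by2, a.width, ay2 - by2 });
        a.height = by2 - a.y;
    }
    if (b.x > a.x) {   /* strip to the left of b, within the middle band */
        rectArrayAppend(leftovers, (Rect){ a.x, a.y, b.x - a.x, a.height });
    }
    if (bx2 < ax2) {   /* strip to the right of b, within the middle band */
        rectArrayAppend(leftovers, (Rect){ bx2, a.y, ax2 - bx2, a.height });
    }
}
```

Punching a 4x4 hole out of the middle of a 10x10 rect leaves four rects. Emptying the array afterwards is just a matter of setting count back to 0, which keeps the allocated storage for the next subtraction.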

Creating those arrays was also slow, particularly when multiple rectangles are divided in a tight loop. The code used to look something like this:

for (int i = 0; i < SZRectArrayLength(rectsToDivide); ++i) {
    SZRect r = SZRectArrayRectAtIndex(rectsToDivide, i);
    SZRectArrayRef a = SZRectCreateArrayBySubtractingRect(rect, r);

    // Do stuff with a

    SZRectArrayRelease(a);
}

It now looks like this:

static SZRectArrayRef a = NULL;

if (!a) {
    a = SZRectArrayCreate();
}

for (int i = 0; i < SZRectArrayLength(rectsToDivide); ++i) {
    SZRect r = SZRectArrayRectAtIndex(rectsToDivide, i);
    SZRectSubtractRectAndPopulateArray(rect, r, a);

    // Do stuff with a

    SZRectArrayRemoveAll(a);
}

The “remove all” function just sets the array’s size property to 0. We create a single static array and re-use it in each call of the function and each iteration of the loop. We can get away with this because all of the code is single-threaded.

I’ve also been trimming the work done inside loops. The code above does a bunch of work on every iteration that can be hoisted out. We can move the variable declarations out of the loop and stop querying the array length on each iteration:

int rectCount = SZRectArrayLength(rectsToDivide);
SZRect r;

for (int i = 0; i < rectCount; ++i) {
    r = SZRectArrayRectAtIndex(rectsToDivide, i);
    SZRectSubtractRectAndPopulateArray(rect, r, a);

    // Do stuff with a

    SZRectArrayRemoveAll(a);
}

We’re passing structs around by value; we should probably stop doing that too:

for (int i = 0; i < rectCount; ++i) {
    r = SZRectArrayRectAtIndex(rectsToDivide, i);
    SZRectSubtractRectAndPopulateArray(&rect, &r, a);

    // Do stuff with a

    SZRectArrayRemoveAll(a);
}

The game logic itself had some inefficiencies. Something I’d annotated with a “todo” in the original game was the alien collision code. In order to check if the player’s bullets were colliding with an alien, the code looped over all aliens in the game. The new version treats each column of aliens as a unit and checks collisions with those before checking against the aliens themselves.
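The column-first check might look something like this. It is a sketch with hypothetical types; the grid size and the layout of the structs are assumptions.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct { int x, y, width, height; } Rect;

enum { ROWS_PER_COLUMN = 5 };

typedef struct {
    Rect bounds;
    bool alive;
} Alien;

typedef struct {
    Rect bounds;                    /* encloses every alien in the column */
    Alien aliens[ROWS_PER_COLUMN];
} AlienColumn;

static bool rectsIntersect(const Rect *a, const Rect *b) {
    return a->x < b->x + b->width && b->x < a->x + a->width &&
           a->y < b->y + b->height && b->y < a->y + a->height;
}

/* Returns the living alien hit by the bullet, or NULL if there is none.
 * Columns that the bullet cannot touch are rejected with one test. */
static Alien *findHitAlien(AlienColumn *columns, int columnCount,
                           const Rect *bullet) {
    for (int c = 0; c < columnCount; ++c) {
        if (!rectsIntersect(&columns[c].bounds, bullet)) {
            continue;
        }

        for (int r = 0; r < ROWS_PER_COLUMN; ++r) {
            Alien *alien = &columns[c].aliens[r];
            if (alien->alive && rectsIntersect(&alien->bounds, bullet)) {
                return alien;
            }
        }
    }

    return NULL;
}
```

A narrow bullet usually overlaps at most one column, so most checks stop at the column test rather than visiting every alien.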

The code uses a huge number of rect calculations, so those needed to be as fast as possible. All of the functions had code to cater for rects with a negative width or height. That would never be possible with the data coming out of the various libraries, so I added a bunch of “fast” versions that didn’t do any sanity checking.
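The difference between the two flavours can be sketched as follows, using an intersection test as the example. The names are hypothetical; the real functions are not shown in this post.

```c
#include <stdbool.h>

typedef struct { int x, y, width, height; } Rect;

/* Rewrites a rect with a negative width or height as the equivalent
 * rect with positive dimensions. */
static Rect rectNormalize(Rect r) {
    if (r.width < 0) {
        r.x += r.width;
        r.width = -r.width;
    }
    if (r.height < 0) {
        r.y += r.height;
        r.height = -r.height;
    }
    return r;
}

/* "Fast" version: assumes both rects already have non-negative
 * dimensions and does no sanity checking. */
static bool rectIntersectsFast(const Rect *a, const Rect *b) {
    return a->x < b->x + b->width && b->x < a->x + a->width &&
           a->y < b->y + b->height && b->y < a->y + a->height;
}

/* Safe version: copes with negative dimensions by normalizing first. */
static bool rectIntersects(Rect a, Rect b) {
    a = rectNormalize(a);
    b = rectNormalize(b);
    return rectIntersectsFast(&a, &b);
}
```

The fast version gives wrong answers for rects with negative dimensions, so it is only appropriate where the caller can guarantee its inputs, as the libraries here can.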

The rendering code now uses Cearn’s ARM assembly copy functions for blitting stuff to the framebuffer.

The code is much, much faster than it used to be, but it’s not enough to throw around as many bullets as the older version.

Some of the exciting failures were:

  • Flattening the view hierarchy into a single view and manually handling a bunch of the redrawing logic. Initially it seemed like this would lead to a massive speed boost, but as the redrawing code got more and more complex the game got slower and slower.
  • Replacing bitmaps with solid blocks of color to see if the DMA would help.
  • Using quadtrees to help partition the space and reduce the amount of collision detection being done.
  • Creating my own profiler to record the execution time of various functions.

My final idea was to ditch the DS completely and switch platforms to something with more CPU power. Requirements for the new platform:

  • Must be a console or handheld (rules out anything with a keyboard or without a joypad).
  • An easily-installed homebrew dev kit with great C support (rules out pretty much everything prior to 32 bit).
  • A faster CPU than the DS’ 66MHz ARM9 (no GBA).
  • No fighting firmware upgrades (so long, Vita/3DS/Wii U).
  • Well-designed input devices (no GPH devices).
  • Must be easy to test on the real device.
  • Must have a reasonably capable OSX emulator.

By my estimation, that leaves three potential devices: the PSP, the Wii, and the Dreamcast. I’ve had a look at the dev kits for the PSP and Wii and neither one really seemed to fit with the archaic software rendering that I’m trying to do. The Dreamcast, though, appears to have a wonderfully simplistic framebuffer graphics mode and even supports DS-style RGB555-encoded pixel data. It could use some more homebrew.

One trip to Amazon later and a new (old) NTSC Dreamcast is winging its way to me! After a few evenings of tinkering I managed to get this running in lxdream:

HankyAlien-Dreamcast

Unfortunately, despite the Dreamcast being fantastically more capable on paper (200MHz CPU, 16MB RAM, 8MB VRAM vs the DS’ 66MHz CPU, 4MB RAM and 656KB VRAM), it turns out that in terms of raw CPU performance there’s not a whole lot between them. On a more positive note, programming the Dreamcast looks like it’ll be fun, so once this refactoring project is complete I think I’ll try writing something exclusively for it.