2012-09-11

Blogging Engines

I’ve been looking for a replacement for BitBlogger. It’s a neat little blogging system, but it has two problems:

  • AppHarbor applications get shut down after 10 minutes of inactivity;
  • It’s written in C#.

The first issue means that my BitBlogger site usually takes far too long to respond. BitBlogger is designed to use the cache exclusively to store all of its content. BitBucket notifies it of any data changes, at which point it sucks the latest version of the blogpost repository into the cache. This works well if the cache doesn’t get wiped, but every time the app gets killed the cache goes too. The application has to rebuild the cache by reloading everything from BitBucket when it gets restarted, which takes time.

I designed BitBlogger in this way to avoid spending a fortune on AppHarbor’s data storage plans, but I didn’t know about the shut down policy until later.

As for the second issue, using C# means I have to maintain a Windows virtual machine, an installation of VMWare Fusion and a copy of Visual Studio. That’s far too much overhead for something that could be written in a few hundred lines of Go and deployed on any cheap Linux server.

With hindsight, having a blog that is stored in Mercurial and gets deployed on each commit isn’t as useful as I’d hoped. There’s no way to write a post on an iPhone, for example. Something I’ve been considering lately is using DropBox for storage instead, and it seems that a few other people have had the same idea:

My favourite implementation so far is Scriptogram. Getting a blog set up is completely painless. Creating a post is as easy as writing a Markdown file in a specific DropBox folder, and syncing them just involves clicking a button on the Scriptogram website.

Scriptogram was so frictionless that I seriously considered moving this blog - I even wrote a WordPress to Scriptogram converter - until I realised that Scriptogram doesn’t have a comment system. I could add Disqus to the theme files, but I find that the comments are the best part of this blog and I don’t want to outsource their storage. It looks like we’re keeping WordPress.

2011-08-11

Really Bad Eggs - Objective-C Edition

The Objective-C version of Really Bad Eggs is coming along nicely. I’ll try to get some screenshots and maybe a video up, but in the meantime here’s some stream-of-consciousness pontificating on the port so far.

When I wrote DS code I used to complain about the lack of a debugger. I’ve written most of the Objective-C version of Really Bad Eggs whilst out and about on a Windows laptop, so I haven’t had a debugger or even a compiler. Despite the lack of tools and the game being my first Objective-C program it’s working well so far. Rewriting the code in a new language has allowed me to re-examine a lot of the original ideas. In some cases I’ve refactored both the C++ and Objective-C code to be simpler or more versatile. In others I’ve used features of Objective-C to improve on the original code.

Bug Fixes and General Improvements

The DS’s representation of incoming garbage eggs was inaccurate due to a typo. I’ve fixed this in both versions of the game. I’ve also improved the placement of garbage eggs that are dropped into the grid.

Garbage eggs are always dropped into the columns with the least eggs first. The (simplified) algorithm looks like this:

  • Get a list of columns and their heights, sorted in ascending order by height (insertion sort).
  • Add the leftmost column to the list of “active” columns.
  • While there are eggs to place:
    • Place a garbage egg in each active column.
    • If the column to the right of the last active column has the same height as the last active column, add it to the active column list.

It essentially fills the grid from bottom to top as though it were pouring in water.

There is a subtle problem with the algorithm. The sorting system produces a list of columns that are sorted by height and column index. In SQL, it would look like this:

SELECT
    columnHeight
   ,columnIndex
FROM
    columns
ORDER BY
    columnHeight ASC
   ,columnIndex ASC

If columns 0, 1, 2, and 5 are of height 2, whilst columns 3 and 4 are of height 1, and we add a garbage egg, we know that the egg will be added to column 3. Here’s the column state:

        000  0
        000000
        ------
Column: 012345

Here’s the sorted data:

Column: { 3, 4, 0, 1, 2, 5 }
Height: { 1, 1, 2, 2, 2, 2 }

The first column to receive an egg will be column 3, as it is the left-most column in the sorted list.

This predictability means that advanced players can start using it to their advantage. They can drop a block here because they know they’ve got an incoming garbage egg and want it to go there. They can change their plans because they know what’s going to happen next. To prevent this, the sorting algorithm now randomises columns of the same height within the sorted array.

Memory Management in Objective-C

Memory management is a pain. I can see exactly why Apple have implemented reference counting the way they have. I’ve worked extensively with low-level memory management in C++ due to the combo attack of the DS’ tiny stack/slow CPU. The combination makes both smart pointers and stack-based objects mostly unusable, leading to the same questions time and time again:

  • If I call a function, who owns the memory it allocates? Do I delete it or not?
  • Does this function expect to receive an existing block of memory that I allocate or will it allocate its own?
  • When writing functions, should I allocate memory or should I expect it to be allocated in advance?
  • Should I return allocated memory via the return statement or via a pointer-to-a-pointer passed as a parameter?
  • Should I return an object on the stack and hope RVO kicks in, or should I return a pointer to an object on the heap?
  • Should functions receive and return pointers or references to objects?
  • If an object is told to store a pointer to another object, should it store the pointer or a copy? What if the pointed-to object changes during the lifetime of my object?

I’ve become used to considering these issues every time I work with memory in C++, but Objective-C tries to solve the issues itself. Mostly it uses conventions to suggest the correct approach. Functions with “new” or “init” at the start of the names, for example, will retain the returned object and increase its reference count. The caller therefore becomes the owner of the object. Convenience methods, such as NSArray’s arrayWithObjects method, return autoreleased objects. These are released automatically by the Cocoa framework when the current iteration of the application’s state terminates. If the creating code needs them around longer then it needs to retain them. Collections retain objects added to them, so they can be deemed to be the owners of those objects. It is the responsibility of the object’s previous owner to ensure that he releases the object.

Getting the hang of the conventions is tricky, particularly when dealing with frameworks like cocos2d which aren’t the best architected examples I’ve seen. There are a number of edge cases, too. Suppose you retain a reference to object A, which creates and retains a reference to object B, which itself creates and retains a reference to object C. During its creation, object C retains object B. What happens to B and C when A is released? Why, nothing, of course. B and C have a cyclic relationship. Releasing object A reduces the reference count of B by one, but B is still referenced by C (and vice-versa) so neither will be deallocated. This isn’t an issue in C++ as the developer manually deletes objects. Deleting A will delete B which will delete C (assuming the destructors tidy things up correctly). Nor is it an issue for C#’s garbage collector because it can tell that the only references to B and C are each other.

For the most part, I’ve used what Objective-C calls “weak references” - I’ve got pointers to objects without retaining them, the same as I would in C++. My code probably looks hideous to experienced Objective-C coders as I’ve trampled conventions all over the place.

One of the main reasons I upgraded to OSX Lion so quickly was the promise of ARC, or “Automatic Reference Counting”. Skimming through the docs suggests that there are a lot of edge cases to it, though (that seems to be a common theme with Objective-C - watch out for the edge cases), so I’m sticking with the traditional method first in order to understand fully what ARC is replacing.

A bug that caught me out for a while was a classic Objective-C memory leak. The “Leaks” utility insisted I’d leaked memory from one class like crazy, but it also told me that the reference count for all leaked objects was 0. That makes no sense until you remember that Objective-C allows you to make boneheaded mistakes like this:

- (void)dealloc { }

Oops. All dealloc methods should call the superclass’ dealloc method:

- (void)dealloc {
    [super dealloc];
}

I’d have suggested that Apple use the template pattern for this instead. NSObject could have methods like this:

- (void)onDealloc { }

- (void)dealloc {
    [self onDealloc];
    [super dealloc];
}

User code would look like this:

- (void)onDealloc {
    // Cleanup here
}

It’s now impossible to leak memory by forgetting to dealloc the superclass. I imagine there’s a valid case for giving developers the power to shoot themselves in the foot, but there’s a lot of boilerplate code needed in classes that could be excised very easily by using the template pattern. The same is true for object initialisation. Here’s what you do now:

- (id)init {
    if ((self = [super init])) {
    // Initialise the object
    }

    return self;
}

This is what a templated NSObject could look like:

- (void)onInit { }

- (id)init {
    if ((self = [super init])) {
        [self onInit];
    }

    return self;
}

This is what a user class would look like:

- (void)onInit {
    // Initialise the object
}

Tidier, but potentially less powerful.

Performance

Running the OSX version of Really Bad Eggs through Xcode’s CPU profiler demonstrates how efficiently the game works. Most of the game code barely registers; cocos2d’s drawing routines use several magnitudes more CPU time than anything else in the game. It also demonstrated how abysmally slow the NSArray objects are. One function I used them in took up more CPU power than anything else in the entire game (except the cocos2d drawing routines), so I refactored the function without NSArrays and the problem disappeared.

Dynanism

The dynamic nature of Objective-C means I’m happy to use calls to an object’s isKindOfClass method, whereas in C++ I’d have been more inclined to refactor to remove the need for type checking or calls to dynamic_cast<>(). For example, the C++ function to test if two eggs can be connected looks like this:

void NormalBlock::connect(const BlockBase* top,
                          const BlockBase* right,
                          const BlockBase* bottom,
                          const BlockBase* left) {
    setConnections(top != NULL && top->getColour() == _colour && top->getState() == BlockBase::BLOCK_STATE_NORMAL,
                   right != NULL && right->getColour() == _colour && right->getState() == BlockBase::BLOCK_STATE_NORMAL,
                   bottom != NULL && bottom->getColour() == _colour && bottom->getState() == BlockBase::BLOCK_STATE_NORMAL,
                   left != NULL && left->getColour() == _colour && left->getState() == BlockBase::BLOCK_STATE_NORMAL);
}

All egg classes inherit from the BlockBase class (descriptive name, I know) which includes a “_colour” member. Each subclass sets this to a colour representative of the bitmaps used to display the egg. The red egg has an RGB value of (31, 0, 0) whilst the blue egg has an RGB value of (0, 0, 31). Checking if two eggs are of the same type simply entails a comparison between two 16-bit integers.

The Objective-C function just compares the eggs’ classes instead:

- (void)connect:(BlockBase*)top right:(BlockBase*)right bottom:(BlockBase*)bottom left:(BlockBase*)left {

    BOOL topSet = top != NULL && [top class] == [self class] && top.state == BlockNormalState;
    BOOL rightSet = right != NULL && [right class] == [self class] && right.state == BlockNormalState;
    BOOL bottomSet = bottom != NULL && [bottom class] == [self class] && bottom.state == BlockNormalState;
    BOOL leftSet = left != NULL && [left class] == [self class] && left.state == BlockNormalState;

    [self setConnectionTop:topSet right:rightSet bottom:bottomSet left:leftSet];
}

It’s not a major change, but it does remove some cruft from the code.

Blocks as Callback Functions

The Grid class represents the well of eggs and manages all of the eggs within it. It includes functions for rotating the live eggs, identifying and exploding chains of eggs, and so on. In the C++/DS version, the Grid class is also responsible for calling methods of the SoundPlayer class to trigger audio playback every time anything sound-worthy occurs. This works, but it’s not terribly clean - the Grid shouldn’t need to deal with sounds. The design ties the logic of the game to the platform it is running on.

A better solution would be to trigger an event, but if you’ve ever implemented an event system you’ll know that it’s a very verbose pattern. At the very least, you’ll need interfaces and classes that represent event arguments and event listeners. I’d like to tidy up the design but I’m not that bothered.

A considerably terser option would be to use a function pointer as a callback method, but that opens up a whole can of non-OO, global namespace-polluting nastiness. C++’s pointers-to-members are so limited - the exact type of the pointed-to object must be known - that they aren’t even worth considering.

Another possibility would be to create a message queue and have the grid push messages into it. A message pump would dispatch messages to the sound player object as necessary. Again, this is ugly and involves far too much new code for virtually no benefit.

It was only after working through this familiar thought process that I remembered that Apple has added closures (“blocks”) as a non-standard extension to C. With closures, the problems disappear. No bloated event system, no global namespace pollution with random C functions, and a simple way to extract the sound triggers from the Grid class.

The “Grid.h” file includes the following typedefs:

typedef void(^GridEvent)(Grid*);
typedef void(^GridBlockEvent)(Grid* grid, BlockBase* block);

I think of the typedefs as being analogous to C#’s delegate signatures. The class itself includes the following members:

GridEvent _onGarbageRowAdded;
GridEvent _onLand;
GridEvent _onGarbageLand;
GridBlockEvent _onBlockAdd;
GridBlockEvent _onBlockRemove;
GridBlockEvent _onGarbageBlockLand;

The class triggers the closures like so:

if (_onBlockLand != nil) _onBlockLand(self);

Wiring up code to an instance of the Grid class is simple:

Grid* grid = [[Grid alloc] initWithHeight:2 playerNumber:0];

grid.onBlockLand = ^(Grid* sender) {
    printf("A block has landed");
};

It looks like C, and it tastes like C, but I can perform JavaScript-like tricks with it. ‘Tis a beautiful abomination.

I’ve used this pattern all over the code to trigger sounds and visual effects as well as add/move/remove sprites. The game engine code is completely separate from the presentation layer.

Classes as Objects

The BlockFactory class has also benefited a little from the switch to Objective-C. When a grid needs to place a new egg, it fetches a new egg instance from the BlockFactory rather than create the egg itself.

The BlockFactory is necessary because all players receive the same sequence of eggs. Suppose player 1 requests his third pair of eggs and is given a red egg and a blue egg. When player 2 requests his third pair of eggs, he must also be given a red egg and a blue egg. No player can “cheat” due to a particularly advantageous sequence of eggs. Winning is a matter of skill and speed rather than luck.

The BlockFactory must, therefore, maintain a list of all egg types that have been generated. It also needs to remember which pair of eggs all players will receive next. Once a pair of eggs has been used by all players it can be removed from the list. Attempts to retrieve a pair of eggs not in the list will cause a new pair of random egg types to be created and appended to the list. The BlockFactory instance is shared across all players.

In the C++ version, the list stores enum values that represent an egg type. When a new egg is requested, the correct constructor is called for that value using a switch() statement. Here’s the code that adds a new item to the list:

BlockFactory::BlockType BlockFactory::getRandomBlockType() const {
    return (BlockFactory::BlockType)(rand() % _blockColourCount);
}

void BlockFactory::addRandomBlock() {
    _blockList.push_back(getRandomBlockType());
}

Here’s the code that creates a new egg from a given type:

BlockBase* BlockFactory::newBlockFromType(BlockFactory::BlockType type) const {
    switch (type) {
        case BLOCK_RED:
            return new RedBlock();
        case BLOCK_PURPLE:
            return new PurpleBlock();
        case BLOCK_YELLOW:
            return new YellowBlock();
        case BLOCK_BLUE:
            return new BlueBlock();
        case BLOCK_GREEN:
            return new GreenBlock();
        case BLOCK_ORANGE:
            return new OrangeBlock();
    }

    // Included to silence compiler warning
    return new RedBlock();
}

The BlockFactory class includes a BlockType enum necessary for this scheme to work.

The Objective-C version dumps the enum and uses a tidier solution. As classes in Objective-C are themselves objects, storing enum values in the list is unnecessary. Instead, the list stores pointers to the class objects. Here’s the code that adds a new item to the list:

- (void)addRandomBlockClass {
    [_blockList addObject:[self randomBlockClass]];
}

- (Class)randomBlockClass {
    int type = rand() % _blockColourCount;

    switch (type) {
        case 0:
            return [RedBlock class];
        case 1:
            return [BlueBlock class];
        case 2:
            return [YellowBlock class];
        case 3:
            return [PurpleBlock class];
        case 4:
            return [GreenBlock class];
        case 5:
            return [OrangeBlock class];
    }

    // Included to silence compiler warning
    return [RedBlock class];
}

Creating a new BlockBase object from the list becomes trivial. A single line of code will create a new BlockBase object from an item in the list:

BlockBase* block = [[[_blockList objectAtIndex:0] alloc] init];

The item in the list is a class, so calling its alloc and init methods gives us a new object automatically.

The difference between the two sets of code is minor, but conceptually it’s a big leap. Objective-C lets me store classes in arrays. I’m pretty certain that the only other language I’ve used that could do anything even remotely like this is JavaScript.

2011-07-14

Farewell, DS Homebrew

It’s time for me to say goodbye to the DS. I’ve been working on DS homebrew projects in my spare time since March 2007, and I’ve learned a huge amount doing so. However, I’ve noticed recently that I’m not really learning much from any of the new projects I work on. It’s still hugely enjoyable, particularly now that I can get games whipped up in a few days and polished for release in a few weeks, but I’m just repeating the same patterns in the same language. Worse, there isn’t an audience for DS homebrew any more. The audience, and most of the developers, have moved over to iOS and Android devices. The DS itself is two generations out of date.

To be honest, I wanted to ditch the DS back when Quirky switched to Android development. In fact, I wanted to ditch the DS as soon as Apple announced officially-supported apps for the iPhone. I’ve stuck with it for three reasons:

  • I wanted to see Woopsi through to completion;
  • The iPhone’s input devices aren’t a patch on the DS’ physical buttons for the kind of games I enjoy playing;
  • Having grown up playing Atari, Sega and Nintendo consoles, actually writing software for one fills me with childlike delight.

I’ve achieved everything I set out to with the DS. I’ve written developer tools, libraries and games for a handheld console, and even entered a homebrewing competition. EarthShakerDS placed 7th out of 24 in GBATemp’s homebrew bounty competition, which isn’t bad for a game knocked up in a couple of weeks.

It’s time to move on.

I do have a few updates to release before I say goodbye to devKitARM - bugfixes to Woopsi, WoopsiGfx and EarthShakerDS, and Really Bad Eggs will get a final release as soon as the title screen is done. I won’t be starting any new DS projects, though.

Unlike Quirky, I’m switching to Apple development. My current project is to port Really Bad Eggs from C++/DS/SDL to Objective-C/cocos2d. I know that I could just use the C++ version on the Mac, but I want to learn Objective-C. Porting something I’ve recently written seems like a good first project. The game engine and AI are already running; I/O is next in the list.

2011-02-17

BitBug and BitBugPy

Another pair of C#/Python programs. This time it’s an idea branched from BugBare, my distributed bug tracker. The real problem with distributed bug trackers is their complete lack of utility and visibility for people who aren’t programmers on the project in question. BitBug dumps the distributed part and is really just a command line interface to BitBucket. All bugs are stored in BitBucket’s issues database. They obviously don’t work offline, but they’re as quick to use as BugBare and come with a ready-made web interface.

The two programs are here:

Of the two, BitBugPy is the tidier program. It uses the BitBucketAPI library for all BitBucket interaction, so the script itself is tiny. Again, it uses Python 3.2 (release candidate) so I’m sure 99% of the world won’t be able to use it. I’d look into making it a Mercurial extension, but - shock! - Mercurial is written in Python 2, not 3, and I’m not interesting in backporting it.

The C# version - BitBug - uses my MurkyCommando library as its command line UI. Since playing with the various command line argument parsers in Python I’m a lot less happy with it now.

They are both rather limited at the moment, mainly because the API offered by BitBucket is still a work in (very slow) progress.

2011-02-04

BitSync

I’ve been playing with the BitBucket API recently. It’s pretty neat - it’s a REST API that transmits data in JSON format. I can’t imagine a lower-friction way of getting at BitBucket’s data. It’s not complete, unfortunately, and does have some bugs, but hopefully it will improve over time.

My main reason for experimenting with it was to find a way of pulling down all Mercurial repositories belonging to a specific user. I’ve decided to take on curation duties for CaptainChemo’s repositories rather than leave them orphaned and unloved. I’m not sure what I’m going to do with them yet, though. There are a few private repositories in there that might contain proprietary code that I can’t open-source, but unfortunately I’m not in a position to find out.

Rather than use the command line to pull down the repositories one at a time, I knocked up a quick C# program to do it for me:

BitSync is command line driven. Example usage:

 bitsync /username ant512 /password mypassword
 /localpath C:\Mercurial /repoowner CaptainChemo

Note that the “repoowner” argument is case-sensitive.

BitSync will get a list of the repositories on the server and compare it with a list of repositories stored in the local destination path. If a repository already exists locally, BitSync will attempt to pull and update it. If a repository does not exist locally, BitSync will attempt to clone it.

Private repositories are only visible if the username and password combination have access to them.

The program uses a combination of Hammock and Json.NET for accessing the API, and my own PrettyConsole library developed in haste for the Woopsi font/bitmap tools.

2010-11-13

Side Projects

I’ve been working on a few side projects lately that I’ve neglected to mention here, so I’ve decided to rectify that.

Working Week JS

This is the latest project. It’s a port of the C# WorkingWeek library that I mentioned a while ago to JavaScript. It’s functionally identical, except that it includes a TimeSpan class (JavaScript doesn’t have one) and it doesn’t include the custom iterators. JavaScript 1.7 apparently includes these, but as only FireFox supports this version (and even FireFox wouldn’t recognise the “yield” keyword) I replaced them with a more traditional approach.

MurkyCommando

This is a little C# framework for creating console applications with a UI like Mercurial. Mercurial lets you perform all manner of shortcuts to get at the commands you want:

hg sta
hg help sta

These are automatically translated back to:

hg status
hg help status

MurkyCommando lets you achieve the same thing. It includes the help functionality and will pre-parse command line arguments into argument/value combinations.

BugBare

BugBare is a distributed bug tracker that uses MurkyCommando to present a very Mercurial-like command line UI. It’s written in C#, and works in both .NET and Mono.

There are quite a few distributed bug trackers out there. For example:

They all have problems, though:

  • Bugs Everywhere doesn’t work in Windows
  • DITrack only works with Subversion
  • Distract has vanished from the net
  • pitz doesn’t have a UI and requires you to use the Python interpreter and its object model to create bug reports
  • ditz was abandoned 2 years ago and doesn’t work with the latest version of Ruby
  • Fossil logically separates bugs from code, so bugs cannot be branch-specific

I haven’t used Artemis yet, so I can’t comment on that one.

BugBare is designed to allow Windows developers (or anyone else who doesn’t mind installing Mono) to get in on the distributed fun.

Distributed bug tracking sounds terribly complicated, but it’s not at all. All but Fossil in the list above work the same way. They create a sub directory within the source repository and add bugs as human-readable text files. This makes merging bugs trivial for most source control systems. Fossil is only different because it’s backed by a SQLite3 database.

One of the main aims I had for BugBare was to make logging bugs as easy, or easier, than just updating a text file. A few years ago I went to the FogBugz World Tour in Cambridge to hear former uber-blogger Joel Spolsky speak. The one thing that really struck me about FogBugz was just how easy and fast it was to enter a bug into the database. Spolsky noted that most developers use a text file to log bugs because it’s quick. I do the same - if I spot a bug whilst I’m in the middle of coding something, I really don’t want to spend 5 minutes filling in dozens of tiny boxes in trac and lose my concentration in the process. I just want to note the bug down somewhere and forget about it until later. Developers don’t want a high-friction bug tracker.

Logging a bug in BugBare (which is entirely command line based, like all of the other trackers mentioned, barring Fossil (again)) takes just one command:

bg add "Something went wrong."

Assuming you’re using a command line DVCS, it is extremely easy to work BugBare into your work flow. For example:

hg pull
hg update
hg commit -m "Made some changes."
bg add "Found a bug somewhere."
hg add
hg commit -m "Made some more changes, added a bug."
hg push

It’s as quick as using a text file, but you get extra benefits:

  • You can add comments to bugs
  • Bugs store the date they were logged/modified
  • Bugs store the user who logged them, which defaults to the logged-in user
  • Bugs can be open/closed/rejected
  • Bugs can have several different priorities (low, normal, high, critical)
  • You can search the database for bugs based on various criteria

Sounds great! However, I’m not entirely convinced that distributed bug trackers are the way to go. Imagine you’ve downloaded a new version of your favourite open-source program; FireFox, for example. You spot a bug and want to report it. However, FireFox has switched to using git and Bugs Everywhere. To log the bug, you have to:

  • Install git
  • Learn the basics of how it works
  • Clone the FireFox source repository
  • Install Bugs Everywhere and its dependencies
  • Learn how to use Bugs Everywhere
  • Log the bug
  • Commit it to the repository
  • Send a patch to the FireFox team that contains the bug (you won’t be able to push to their repo)

From an end-user’s perspective, distributed bug trackers are awful. For most open-source projects, which rely on the “many eyeballs” way of identifying bugs, they just won’t work. It gets worse when you consider that each branch of each different repository clone can have its own separate bug database. Even if someone wrote a web front end to a distributed bug tracker, to allow it to be used by non-developers, it would quickly become confusing if it had to provide access to all bugs in all branches of the repository.

This is probably why most of the trackers listed in this post see very little activity these days. They’re a great idea, and fantastic for individual developers and small teams, but they’re extremely counter productive for open source projects.

So where does that leave BugBare? I’m announcing a new program and at the same time saying that using it probably isn’t a great idea. Perhaps I should get a job in marketing? Personally, I don’t think I’ll use it much. BitBucket’s bug tracker is fairly quick. More importantly, anyone can see it and add bug reports with little effort.

2010-07-05

More Mini Projects

I’ve been tinkering with a few mini projects lately, all of which are available from my BitBucket page:

www.bitbucket.org/ant512

Networked Pacman

Though I released this ages ago, I published it as a zip archive instead of hosting the code in a source code management system. I’ve rectified this and added it to BitBucket:

Networked Pacman

PyScrabble

This is a simple implementation of the logic of Scrabble, written in Python 3. It doesn’t have a UI, so it’s not terribly exciting for anyone but Python programmers, but perhaps someone will write a front-end for it. This was written in a couple of days and was an exercise in trying to think “Pythonically”. Whether or not I achieved this I’m not sure…

PyScrabble

WorkingWeek

Finally, this is a C# library that provides date functions that operate on a user-defined working week. For example, you can define a week of 9:30am to 5:30pm shifts (with time for lunch, natch) and use the functions provided by the library to determine whether or not a given date/time falls within a shift, or how many working hours there are between two dates, etc. It does some neat things with custom iterators and the yield statement to condense a lot of fiddly logic into a couple of very powerful methods.

WorkingWeek

2009-12-22

Canvas Tag Clipping 3, AKA a Javascript Window Manager

Building on yesterday’s code, here’s the latest version:

At this point, the similarities to Woopsi should be apparent. This is a basic but workable window manager for the canvas tag. New things since yesterday include window and button gadgets, a top-level container gadget, event handling (click, release, release outside, focus and blur), focus and blur functionality, and probably some other stuff that I’m forgetting.

In various places I’ve just ported code straight from C++/Woopsi to Javascript, whilst in others I’ve taken advantage of lessons I learned whilst writing Woopsi to come up with better approaches. For example, the list of gadget event handlers is a separate class. I’ll probably update Woopsi to follow the same pattern.

As for why I’m doing this - I haven’t the faintest idea. Seemed like a good idea at the time.

2009-11-26

Multiple Inheritance and the Dreaded Diamond

Woopsi refactoring continues apace. Today’s latest changes were an attempt to consolidate the ListBox, ContextMenu and CycleButton gadgets.

Each of these gadgets is a different kind of view onto a list of items. The ContextMenu is the most primitive - it shows all of the items in its list as a vertical list. The ListBox is basically the same thing, except it introduces scrolling into the mix. The CycleButton shows just one option at a time; the other options can be paged through by clicking the button.

At present, the ListBox has the most advanced data mechanism. It leaves all of the data functionality to the ListData class, which wraps around a WoopsiArray (ie. vector) and provides such handy functions as sorting and item selection/deselection (enforcing single-only or multiple selection rules). It raises events to indicate changes to the data or selections within the data.

Meanwhile, the CycleButton and ContextMenu classes have very primitive data manipulation functionality. They both include an instance of the WoopsiArray and work with it directly. Wouldn’t it be better if they could benefit from the wealth of functionality in the ListData class?

Since the ContextMenu and ListBox work in very similar ways - the ContextMenu is basically a customised ListBox without scrolling - it makes sense to create a base class providing the most basic list view functionality that they can both inherit from. I split up the ListBox into “BasicListBox” (no scrolling) and “ListBox” (inherits from BasicListBox and ScrollingPanel, which adds the scrolling functionality). Here the problems began.

The BasicListBox inherits from the fundamental Gadget class, from which all Woopsi UI components inherit. The ScrollingPanel also inherits from the Gadget class. When the ListBox attempts to inherit from them both, it ends up with two copies of the Gadget class floating around in its inheritance hierarchy. Any attempts at working with the features of the Gadget class are now ambiguous and the compiler throws errors all over the place. This is called a “diamond” inheritance pattern:

           Gadget
       /             \
      /               \
     /                 \
BasicListBox    ScrollingPanel
     \                 /
      \               /
       \            /
           ListBox

The C++ FAQ Lite has a whole page about the diamond inheritance problem. The solution is to declare the inheritance from Gadget as virtual, which eliminates the ambiguities by eliminating the second copy of Gadget. However, attempting to do this has horrible problems in the Woopsi codebase. C-style casting cannot be used (uh oh) and the class at the top of the diamond (in this case, Gadget) should not include data members that need to be initialised (it can, but it shouldn’t).

The upshot of all this is that it is impossible to create a gadget that inherits from two other gadgets in Woopsi. If you think about it, this does make sense. It doesn’t help my goal of rationalising the list display gadgets, though. I could include an instance of the ScrollingPanel inside the ListBox to achieve the same effect, but that would be less efficient than the current solution. I could alternatively make the ListBox the basic list and have the ContextMenu inherit from that, and just not use the scrolling feature (though it would be handy if there are too many items to fit on-screen at once), but again that’s less efficient than the current solution.

I kept some of the changes I made before I reverted everything else back. ListData::swapItems() no longer raises a data changed event, as the only place it is called is in the sort() method. That method now raises the event instead. The ListDataItem struct has been replaced with a ListDataItem class, and it includes a C#/Java-style compareTo() method. Subclassing the ListDataItem and overriding this method allows the ListData class’ sort() method to sort the list differently. The ListBox’s scrolling canvas now has the correct height and the extra 1px at the bottom of every ListBox has now gone.

Finally, there are a couple of fixes in other places. The Woopsi class’ constructor no longer tries to retrieve the system font before it has been initialised, and the ContextMenu does not cast away the const-ness of the ContextMenuItem when raising a selection event to its listeners.

2009-11-05

Strings, C and C++

A function to read a single line of text from a text file is a tricky thing to write in C++, at least when one is limited to fgetc() and fgets() from the standard C library.

Memory allocation is the main problem. A readline() function must obviously store its output somewhere. There are three possible approaches. In the first, and most usually recommended, the function should receive a block of pre-allocated memory. In the second, the function creates and returns a new block of memory. An in-between method could receive pre-allocated memory but enlarge it as needed.

The fgets() function is the most primitive example of a readline() function in the standard C library. It relies on the caller to supply it with pre-allocated memory in which to store its output, and it looks like this:

char* fgets(char* buffer, int count, FILE* stream)

The function reads from stream until it hits either a newline, end of file, or has read the number of characters indicated by count (this should be at most the size of buffer). It stores the output in buffer, and also returns a pointer to buffer when the function ends.

fgets() has a major problem, of course. How many characters are there in the line being read? Answer: No idea. fgets() is a low-level function that may read a whole line or just part of a line, and it is the programmer’s responsibility to identify when a partial line has been returned and repeat the process until all characters are retrieved.

A function that solves this problem is getline() from the GNU C library (not the same as the getline() method available to C++ istreams). It looks like this:

ssize_t getline(char** lineptr, size_t* count, FILE* stream)

The function is guaranteed to read an entire line of text. lineptr performs the same function as buffer in fgets() - the text retrieved by the function is stored here. count contains the size of the lineptr buffer. stream is a pointer to the file being read. In this function, memory can be allocated before the function is called, but it does not need to be (in that case - deep breath - the pointer to which lineptr points must point to NULL and count must be 0). The function uses realloc() to increase the size of the lineptr buffer as necessary. The eventual size of the buffer is stored in count and returned as the output of the function.

It can be thought of as an example of the “in-between” approach described earlier. It relies on externally-created memory but can enlarge the memory area to accommodate the full string. Choosing a generous initial buffer size will reduce the number of realloc() operations performed. Assuming realloc() is usually successful in increasing a block of allocated memory, rather than needing to allocate a new region and copy the old data into it, getline() should be comparable in speed to fgets(). The minor speed penalty is worth the ease of use the function affords.

However, this function has its own set of problems, at least within the scope of Woopsi. It is not available for starters - devkitARM does not include the GNU C library. Even if I were to borrow the code, it really highlights one of the problems with C++.

getline() expects to be passed a pointer to either memory allocated via malloc() or NULL via the lineptr parameter. C++, however, essentially deprecates the malloc/free method of allocating heap memory. A C++ programmer should instinctively use the new/delete memory allocation system. This will potentially cause all manner of chaos when getline() calls realloc() on memory that was allocated using the “new” keyword. Since memory allocated with “new” cannot be reallocated, a C++ re-implementation of getline() will always copy memory every time the buffer is resized. getline() is a good example of why C++’s palimpsest-like layering over C was perhaps not such a good idea.

Rather than allowing memory to be allocated outside of the function and populated within the function, a different approach is to handle all memory allocation internally:

char* getline(FILE* file)

This version of getline() would be more familiar to programmers using higher-level languages such as C# and Java. It doesn’t perform magic with pointers-to-pointers. It will allocate memory within itself, grow it as necessary, and simply return a pointer to the string once it has been retrieved.

Outside of memory-managed languages, though, this function has problems. How has the memory been allocated - new or malloc()? We can document this in the function’s comments, and we can also follow the C++ standard of using “new” to make this more intuitive, but the question still needs to be asked before the function can be used. More importantly, who is responsible for deleting the memory when it is no longer being used? If this is a standalone function it is fairly obvious that it is the caller’s responsibility to delete the string once it has been used, but what if the function is actually a public method of a class? Does the class use the memory internally, or has it allocated the memory solely for the caller’s exclusive use?

My reason for pondering this is directly related to the last post. I’m writing a simple textfile class that abstracts away the filesystem and can do useful things such as reading individual lines of text. The problem I’m coming up against is how to make the API for the class familiar to C and C++ programmers, embody the ease-of-use of a Java or C# class, whilst simultaneously trying to make it efficient for the DS hardware.

Following the fgets() pattern makes sense since fgets() is one of the functions offered by libfat. The standard getline() method is considerably more user-friendly, however, as it eliminates the tedium of ensuring that fgets() does really return a complete line. The alternative version of getline() is easier to use than both of the others, but ownership of the resultant memory is not at all clear solely from the code itself.

It is worth mentioning that all of the usual C++ ways of handling this are off-limits due to Woopsi’s strict adherence to binary-size-reducing “no STL” policy.