Unicode support in Woopsi is coming along nicely. Most of the gadgets now expect to receive WoopsiString objects instead of chars or char arrays when dealing with text. As the WoopsiString now works with UTF-8, that means most of the gadgets now also support UTF-8.
I’ve overloaded the WoopsiString’s “=” operator for WoopsiString objects, char arrays and u32s, which means that any parameter that should be a WoopsiString can accept char arrays or u32s instead. Converting from string literals to WoopsiStrings is therefore seamless and automatic.
The glyphs are now stored in a separate “GlyphFont” font. The default glyph font is stored in the defaultGadgetStyle global variable, in the same way that the default system font is stored. Converting this unearthed a couple of bugs in the Bmp2Font program - it doesn’t write out the resultant font constuctor properly (it misses out the transparent colour and uses the raw bitmap data instead of the bitmap wrapper object). I still need to fix those.
The most problematic changes were convincing buttons that displayed glyphs to centre their text properly (turns out that they were centring based on the wrong font) and getting the MultiLineTextBox to redraw at a reasonable speed. After some cunning hacking about with the WoopsiString and the MultiLineTextBox I’ve got it back up to something approaching its old speed.
Regarding speed, unicode definitely has a negative impact. Having to scan through an entire string to locate a codepoint is a horrible performance killer. WoopsiString::getCharAt() is definitely not a function you should use if you’re reading out sequential characters, as it looks for the codepoint from the start of the string each time it is called. It’s much more advisable to use getToken() to get a pointer to the first token, read out the codepoint, then increase the pointer by the size of each character as it is read. Before I went back to optimise things, the MultiLineTextBox was using getCharAt() when drawing and so was the PackedFontBase, meaning the string was being scanned through twice for each character being printed. Xcode’s “Shark” profiling utility noted that the MultiLineTextBox was using 31% of the total CPU time when printing a line of text solely because of all the UTF-8 parsing. Yikes.
There are still a lot of gadgets to convert over, and probably a hefty set of bugfixes and optimisations to make.