2009-12-13

Font Dilemma

Jeff’s PackedFont classes can do everything that my Font and MonoFont classes can do. They store their data more efficiently and require far less processing time to render their glyphs than my fonts. Why not just remove my fonts and simplify things?

I think I’ll do just that unless I can think of a good reason not to.

Comments

Lakedaemon on 2009-12-13 at 23:15 said:

Is there a way to make woopsi work with Japanese/Chinese fonts ? Or better yet, with freetype 2 ?

Ant on 2009-12-14 at 07:43 said:

I’d suggest talking to Chase-san about that - he seems to have solved the Japanese problem. Woopsi’s font system is flexible and extendible - it defines an interface for fonts, and implementing classes can do any wacky things they need to. To support FreeType, for example, you’d need to create a font class that used FreeType to render glyphs to a 16-bit bitmap.

Thinking about it, since the font classes do all of the text rendering, there’s nothing to stop someone from working in Unicode support too - the font could convert from ASCII to Unicode before it rendered the text.

However, this isn’t something I’m interested in adding at the moment. For the time being, I’m focused on improving the existing code and coding experience rather than adding in new features. If someone were to write a patch, though, I might include it…

Lakedaemon on 2009-12-14 at 15:46 said:

I’m trying to port to woopsi a quite simple nintendo ds app that already supports utf-8 (through ascii to codepoint conversion) and that uses freetype to display characters.

considering font.h and this method :

s16 drawChar(MutableBitmapBase* bitmap, char letter, s16 x, s16 y, u16 clipX1, u16 clipY1, u16 clipX2, u16 clipY2);

It looks like creating a font class won’t be enough (as this api only gobbles 1 char …and you need to be able to gobble 1~4 char for unicode), so I will also have to tinker with the text parser too.

I’m not sure I’ll be able to do it, as I’m only starting in c++ (and I was looking at woopsi NOT to have to do that), but well…as I have an already working model, I might just manage it.

Ant on 2009-12-15 at 07:27 said:

Ahh, yes. That’s a problem. I’ll need to think about that one.

Lakedaemon on 2009-12-15 at 10:51 said:

In fact, I have started implementing utf-8 support in woopsi :

that way, users will be able to use any fonts through freetype and they’ll still be able to use the built-in packedfonts

My plans are to :

1) slightly change the woopsistring class so that it stores in char array (as before) only valid utf-8 string (this is new) and to make it return u32 instead of char for the following functions :

u32& operator[](const u32 index) const {return _text[index]; }

virtual inline const u32 getCharAt(u32 index) { return _text[index]; }; 

2) modifying the 21 woopsifiles that are dependent on the woopsistring class to accept u32 instead of char (through a cast at first) if needed

3) modify the text functions that use string length/count chars/try to wrap things, to make them work (they’ll still see 1 utf-8 token made of 1~4 bytes as a single char… so that’s one less problem).

I am a real beginner in c++ but I learn fast and I have quite a lot of experience in python/programming complex stuff (japanese parser/xml parser) in really really horrible languages (TeX ^^;), so well, who knows, I might do a good job…

by the way, is there a reason why, inside methods, you implement

    char* newText = new char[2];
    newText[0] = text;
    newText[1] = '\0';

    setText(newText);
    delete[] newText;

instead of

    char* newText[2];
    newText[0] = text;
    newText[1] = '\0';

    setText(newText);

Ant on 2009-12-15 at 20:40 said:

Hmm, seems to be a longwinded way to do it, but then I’ve never looked at implementing UTF-8. As I understand it, UTF-8 is designed to be a variable-length code, using between 1 and 4 bytes per character. If you store each character in a u32 then you lose the advantage of the variable-length - all characters are immediately 4 bytes whether they need to be or not.

A simpler approach would be to change the drawChar() method in the Font classes.

I could change the method to this:

s16 drawChar(MutableBitmapBase* bitmap, char* letter, u8* consumedBytes, s16 x, s16 y, u16 clipX1, u16 clipY1, u16 clipX2, u16 clipY2);

Instead of passing it a single char, pass it a char array. The “consumedBytes” parameter would be an output parameter and would contain the number of bytes used by the function in drawing the character. For the existing ASCII strings, this would always be 1 (since a single byte is consumed when an ASCII char is drawn). For a 2-byte UTF-8 character, this would be 2. Etc.

The text drawing methods of the Graphics classes would also need to be changed so that they work like this:

stringPointer = string to print
while stringPointer != terminator
    drawChar(stringPointer, consumedBytes)
    stringPointer += consumedBytes

The string rendering methods rely on the Font classes to tell them how many bytes have been consumed in drawing a character and advance their string pointer by that amount in order to reach the next character.
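A minimal sketch of that loop in C++, under stated assumptions: `drawCharStub()` is a hypothetical stand-in for the proposed `drawChar()` that draws nothing and only reports a byte count taken from the UTF-8 lead byte, and `drawString()` is an invented name for the Graphics-side loop. The real methods would also take the bitmap, coordinates and clipping rectangle.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using u8 = std::uint8_t;  // stand-in for the libnds typedef

// Hypothetical stand-in for the proposed drawChar(): it draws nothing and
// only reports how many bytes the character at `letter` claims to occupy,
// judged from its UTF-8 lead byte. Continuation bytes met out of place
// count as one byte; no further validation is done.
void drawCharStub(const char* letter, u8* consumedBytes) {
    unsigned char lead = static_cast<unsigned char>(*letter);
    if (lead >= 0xF0)      *consumedBytes = 4;
    else if (lead >= 0xE0) *consumedBytes = 3;
    else if (lead >= 0xC0) *consumedBytes = 2;
    else                   *consumedBytes = 1;
}

// Walks the string as in the pseudocode and returns the byte offset at
// which each character started.
std::vector<int> drawString(const char* string) {
    assert(string != nullptr);
    std::vector<int> starts;
    const char* p = string;
    while (*p != '\0') {
        u8 consumed = 0;
        drawCharStub(p, &consumed);
        starts.push_back(static_cast<int>(p - string));
        // Advance by the consumed bytes, but never step past the
        // terminator if the last character is truncated.
        for (u8 i = 0; i < consumed && *p != '\0'; ++i) {
            ++p;
        }
    }
    return starts;
}
```

The guard in the inner loop is an addition of this sketch: without it, a string that ends in the middle of a multi-byte character would push the pointer past the terminator.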

There’d be some more things to change around in the Font classes. Basically, any char parameters would need to be swapped to char arrays.

Any thoughts on that? As I say, I haven’t done any work in implementing UTF-8, so this might be completely the wrong way to go about it.

Regarding the heap/stack string issue - you’re right. I should change that!

Ant on 2009-12-15 at 21:33 said:

Hmm, reading this through again I think you’re probably on the right track anyway!

Lakedaemon on 2009-12-15 at 22:12 said:

Update : today, I worked on the woopsistring class and “converted it internally to utf-8” (phase 1 is basically done) and tested it : multilinebox still works for me. I’m going to tinker a bit further into phase 2.

I’m no specialist in utf-8 either, so take what I say with a grain of salt: I tend to favour u32.

1) char* and u32 should both allow us to pass 1~4 byte utf-8 tokens to the font class.

2) In the freetype library, it looks like there are lots of types (int and bytes) and lots of macros to convert from one type to another. http://www.freetype.org/freetype2/docs/reference/ft2-toc.html

It seems to me though that u32 is the preferred way to get glyphs, as shown by this snippet of code that uses freetype on the nintendo ds :

FTC_SBit FontCache::GetGlyph(u32 c) {
    u32 flags = FT_LOAD_RENDER|FT_LOAD_TARGET_NORMAL;
    u32 index = FTC_CMapCache_Lookup(cmapCache, scaler.face_id, font->cmapIndex, c);

    FTC_SBitCache_LookupScaler(sbitCache, &scaler, flags, index, &sbit, NULL);
    return sbit;
}

s8 FontCache::GetAdvance(u32 c, FTC_SBit sbit) {
    if (c == '\t') return (GetAdvance(' ') < xadvance;

3) I may be mistaken but it should be possible to cast u32 into char which means that with u32 the other font class should still work. With char*, it is possible to make the conversion too : I’m using this for the moment :

/* Returns the number of bytes in the codepoint that has been read. Returns 0 on an error. */
u8 getCodePoint(const char* string, u32* codepoint) {
    char char0 = *string;

    u8 result = 0;

    if (char0 < 0x80) {
        // 1-byte (ASCII) sequence
        *codepoint = char0;
        result = 1;
    } else if (char0 >= 0xC2 && char0 <= 0xDF) {
        // 2-byte sequence
        if (char0 == '\0' || string[1] == '\0') {
            return 0;
        }
        *codepoint = ((char0-192)<<6) | (string[1]-128);
        result = 2;
    } else if (char0 >= 0xE0 && char0 <= 0xEF) {
        // 3-byte sequence
        if (char0 == '\0' || string[1] == '\0' || string[2] == '\0') {
            return 0;
        }
        *codepoint = ((char0-224)<<12) | ((string[1]-128)<<6) | (string[2]-128);
        result = 3;
    } else if (char0 >= 0xF0 && char0 <= 0xF4) {
        // 4-byte sequence
        if (char0 == '\0' || string[1] == '\0' || string[2] == '\0' || string[3] == '\0') {
            return 0;
        }
        *codepoint = ((char0-240)<<18) | ((string[1]-128)<<12) | ((string[2]-128)<<6) | (string[3]-128);
        result = 4;
    }

    return result;
}

4) in the woopsistring class : the strings aren't stored as arrays of u32 because it would waste space, they are still stored as arrays of char.

To support utf8 this way, it is needed to implement 2 functions :

1) for Loops/iterators on the string : given a position in the string, one function that is able to return the next utf-8 token

2) for random reads : a function that is able to return the ith utf-8 token in the string.

function 2 will be slow but it doesn't matter because function 1 is very fast and much much more important as it allows you to loop over the utf-8 string and so to :

a) render the string
b) compute its length (for wrapping purposes)
c) compute its width even if the font is not of fixed width (there are apis in freetype to get the advance of a glyph without rendering it)
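The two functions described above could look roughly like this. It is only a sketch: the names `nextToken()`, `tokenAt()` and `tokenCount()` are made up for illustration and are not the WoopsiString API, and the code assumes the stored string is already valid utf-8, so it only needs to skip continuation bytes.

```cpp
#include <cassert>
#include <cstddef>

// True for UTF-8 continuation bytes (bit pattern 10xxxxxx).
inline bool isContinuation(char c) {
    return (static_cast<unsigned char>(c) & 0xC0) == 0x80;
}

// Function 1: given the start of one token, return the start of the next
// token (or the terminator). Touches at most 4 bytes of valid UTF-8.
const char* nextToken(const char* p) {
    assert(p != nullptr);
    if (*p == '\0') return p;
    ++p;
    while (isContinuation(*p)) ++p;
    return p;
}

// Function 2: return the start of the i-th token (0-based), or the
// terminator if the string is shorter. Linear in `index`, hence slow.
const char* tokenAt(const char* s, std::size_t index) {
    while (index > 0 && *s != '\0') {
        s = nextToken(s);
        --index;
    }
    return s;
}

// Length in tokens rather than bytes, built on function 1.
std::size_t tokenCount(const char* s) {
    std::size_t count = 0;
    for (; *s != '\0'; s = nextToken(s)) ++count;
    return count;
}
```

Rendering and width computation follow the same pattern as `tokenCount()`: one forward pass with `nextToken()`, so the slow `tokenAt()` is only needed for true random access.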

Ant on 2009-12-15 at 23:02 said:

Sounds good!

Wordpress has mangled the formatting in your comment, I’m afraid. When posting code wrap it in the following HTML:

<pre lang="cpp"> Code here </pre>

That’ll stop it getting mangled, and syntax highlight it too.

Ant on 2009-12-16 at 04:07 said:

Oh, note that the stack/heap thing should look like this:

char newText[2];
newText[0] = text;
newText[1] = '\0';
 
setText(newText);

If you declare the array as “char* newText[2]”, what you get is an array of pointers to chars, not an array of chars.

Lakedaemon on 2009-12-18 at 22:09 said:

Today, I have succeeded in displaying a Kanji (a Japanese character) inside a MultilineTextBox.

Because I still have some problems with global/local variables (I’m learning C++ as I go), I can’t print a complete string yet, but it is just a matter of time (I know what the problem is as well as a way to solve it ..but I have to learn how to implement it in c++).

Anyway, displaying unicode strings through the freetype library is possible in woopsi with only slight changes :

Got to use the woopsistring class to store unicode strings and return a u32 codepoint for char number i (done).

in graphport.cpp, got to modify the font->drawChar function that is called inside

void GraphicsPort::clipText(s16 x, s16 y, FontBase* font, u16 length, const char* string, const Rect& clipRect)
to use freetype (nearly done)

got to implement a nice font class for freetype.

But this will have to wait. I ran out of time. Next week is for snowboarding in the Alps. See you next week !

ps : Does anyone know a way to make libfat work in desmume ? More precisely, I’m looking for a way to access a default.ttf font file at the root of the fat filesystem. I read through the desmume forum but couldn’t make it work.

Ant on 2009-12-19 at 22:33 said:

Sounds like you’ve made some good progress. Interested in sharing the code when it’s done?

As for libfat - you need to create a FAT image with Linux’s dd tool. As to how you achieve this, I have no idea. Let me know if you find out…

Lakedaemon on 2009-12-20 at 15:46 said:

Well, of course, i’m going to share the code. I’ll send you what I have done when I manage to make the content of a multilinetextbox work well with freetype (not that far off)…and I’ll collaborate to get freetype support for the remaining gadget.

I got the fat image created using a java program found in this thread : http://forums.desmume.org/viewtopic.php?id=102

My problem is that I haven’t succeeded in making it work with desmume… have tried a few things, but none worked for me…

Lakedaemon on 2009-12-28 at 21:35 said:

Update : I am now able to use freetype to display arbitrary utf8 strings inside a multiline text box.

Next step is modifying the spacing and wrapping algorithm to accommodate the variable width/height characters

Ant on 2009-12-29 at 10:17 said:

More good progress! Do you know the ROM size of the UTF and FreeType build of Woopsi versus the standard build?

Lakedaemon on 2009-12-29 at 11:20 said:

The size of the libfreetype.a that I use is 2322 kb.

But as I’m only using a subset of the library, it only adds around 400kb to the rom size (libfat included). The fontcache the code is using should add like 32kb (you can choose its size)

And you can store a gazillion fonts on the sd flash card. The one I’m using is like 3700 KB (high quality japanese font); there are much smaller fonts of course.

I have had a quick look at the wrapping code. It’s going to be a pain to modify it and it’ll take time to make it support variable width/height chars and variable height lines.

Maybe a first step would be to modify the fontbase class to allow non fixed width/height chars.

The following virtual methods should be ok : getStringWidth getCharWidth

But I think some problems will arise from this one :
getHeight

Maybe it should be getStringHeight getCharHeight

Because the fontbase class should only worry about char heights or about the height of a substring formatted on 1 line but shouldn’t be concerned with wrapped text heights

Here is a link to what I have done till now (that’s not much and it uses some of Jake Probst’s code from the ankidssync application but it should be a good base for more improvement) : http://www.lakedaemon.org/libwoopsifreetype.zip

In fact, I don’t have much time and I have a lot of ongoing projects (work, learning japanese, rewriting from scratch an app using woopsi), so I fear that I’ll have to let you do most of the integrating work… but I’ll help if I can :

The way it is, it should be possible to use fixed height/width freetype fonts without modifying the wrapping algorithm…which is already quite nice…but it would be nice if, in the future variable width/height fonts were supported in woopsi.

One thing that I can (and will try to) do today/tomorrow is : making use of the 255 levels of gray of the freetype font to make the chars display with respect to opacity (i.e. “alpha channel”).

That way, it should be possible to write white chars on a black background or write black chars “transparently” on an image.

Lakedaemon on 2009-12-29 at 11:22 said:

put the files ankids.nds and default.ttf that are in the release folder at the root of your nintendo ds, for an example of use

Lakedaemon on 2009-12-29 at 13:29 said:

Ok “transparency/alpha channel” is done, it was just a matter of coding it like :

rgb = 0x8000;
rgb = rgb | ((((bitmapColour & 0x7C00) * (maxGrays-grayLevel) + (sourceColour & 0x7C00) * grayLevel) / maxGrays) & 0x7C00);
rgb = rgb | ((((bitmapColour & 0x03E0) * (maxGrays-grayLevel) + (sourceColour & 0x03E0) * grayLevel) / maxGrays) & 0x03E0);
rgb = rgb | ((((bitmapColour & 0x001F) * (maxGrays-grayLevel) + (sourceColour & 0x001F) * grayLevel) / maxGrays) & 0x001F);
 

This is obviously not optimised as it could be computed with

a*gL/mG + b*(mG-gL)/mG = b + (a-b)*gL/mG

That is half the number of multiplications and divisions…but I wanted to keep it simple to make it more maintainable (and for debugging purposes…it didn’t work on the first try)
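For comparison, here is a sketch of the shorter form b + (a-b)*gL/mG, with a the source channel and b the bitmap channel. The function names are invented for illustration, and the sketch works on extracted 5-bit channels rather than masked values, so it is not a drop-in replacement for the lines above. Because (a-b) can be negative and C++ integer division truncates towards zero, results can differ by one from the two-product form.

```cpp
#include <cassert>
#include <cstdint>

using u16 = std::uint16_t;  // stand-in for the libnds typedef

// Blends one 5-bit channel: bitmap + (source - bitmap) * grayLevel / maxGrays.
inline int blendChannel(int source, int bitmap, int grayLevel, int maxGrays) {
    return bitmap + (source - bitmap) * grayLevel / maxGrays;
}

// Blends two 15-bit colours channel by channel and sets bit 15, as the
// original snippet does with rgb = 0x8000.
u16 blendPixel(u16 sourceColour, u16 bitmapColour, int grayLevel, int maxGrays) {
    assert(maxGrays > 0 && grayLevel >= 0 && grayLevel <= maxGrays);
    u16 result = 0x8000;
    for (int shift = 0; shift <= 10; shift += 5) {
        int a = (sourceColour >> shift) & 0x1F;
        int b = (bitmapColour >> shift) & 0x1F;
        result |= static_cast<u16>(blendChannel(a, b, grayLevel, maxGrays) << shift);
    }
    return result;
}
```

With grayLevel at 0 the bitmap colour comes back unchanged, and with grayLevel equal to maxGrays the source colour replaces it completely.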

I’ll update the link and then I’ll begin rewriting my app from scratch using woopsi… and try to improve the utf/freetype support of woopsi on the way… (I’m waiting to see how you’ll change woopsi too)

Ant on 2009-12-29 at 17:00 said:

The wrapping code should already handle proportional width fonts, as it gets the width for each letter from the font itself. Regarding non-fixed height fonts - I don’t think I’ve ever seen an application that supports that. The .NET API, which is probably the most modern font implementation I’ve looked at recently, has a fixed line height for all fonts. Having randomly-sized rows in text would make it extraordinarily difficult to read. Do you know of a program that uses proportional height fonts?

Other than that, the example seems to work really well. I’ll have a look through the code and see how it works. My one concern is the file sizes involved. The DS has 4MB of RAM, minus the size of the ROM. If the ROM is 745K just for “hello world” and the font is 3.8MB, that leaves a grand total of, um, -540K for actual functionality. This has been a concern of mine since unicode/TTF support was suggested. If we add in support for this, is there any scope for making an application as well as showing nice typefaces?

Lakedaemon on 2009-12-29 at 17:15 said:

The font file isn’t loaded in memory : it sits on the sd card and stays there.

The rom and Freetype use a cache whose size you set (30kb in my example), in which Freetype puts bitmaps of the fonts. It refreshes the cache if needed.

You’ll still get like 3.3 MB on a DS and 15.3 MB on a DSi to build your App. I say 400k of Freetype & libfat + 30kb of font cache is a small price to pay to make software coded with woopsi usable by 6+ billion people ^^ and… it spares developers from worrying about fonts (and brings antialiasing… and maybe kerning later).

The App I want to rewrite from scratch with woopsi uses Freetype and the 3.3 MB font and only uses a total of 1 MB of memory.

Here is the promised link to the improvements I made today : http://www.lakedaemon.org/libwoopsi2.zip I cleaned my code a bit, I implemented the transparency and I started modifying things in text.cpp/text.h to make the text class work with utf8 strings.

Lakedaemon on 2009-12-29 at 17:27 said:

I’m coming from the (La)TeX World which has had a very gorgeous/professional font implementation/text formatting for the last 30 years… ^^

I don’t even know if Word/other programs have caught up to the quality of TeX yet ^^

Basically, at one point, I might want to display lines where each token is a box (be it a Font or an image) and those might have different height/depth relative to the baseline. The height of this line should be the maximum of all heights (and the depth of the line, the max of all depths)…
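As a small illustration of that TeX-style rule, here is a sketch with hypothetical `Box` and `LineMetrics` types (nothing like them exists in woopsi as far as this thread shows). Height is measured above the baseline and depth below it.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical box: a glyph or an image placed on the baseline.
struct Box {
    int height;  // extent above the baseline
    int depth;   // extent below the baseline
};

struct LineMetrics {
    int height;
    int depth;
};

// The line is as tall as its tallest box and as deep as its deepest one.
LineMetrics measureLine(const std::vector<Box>& boxes) {
    LineMetrics metrics = {0, 0};
    for (const Box& box : boxes) {
        assert(box.height >= 0 && box.depth >= 0);
        metrics.height = std::max(metrics.height, box.height);
        metrics.depth = std::max(metrics.depth, box.depth);
    }
    return metrics;
}
```

The total vertical space a line needs is then `height + depth`, and the two maxima can come from different boxes.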

This is why I asked about height. But this can come in the far far future. At this point, I’m mainly concerned about width…

It’s good to know that it should handle variable width. I have just begun looking at the code, I’ll give it a harder look in the next few days. (Even if the dimensions are wrong, I still have to make a multilinetextbox break text into multiple lines before I can consider moving further).

In the next few days, I’ll maybe try to implement a Freetype font class as a child of the fontbase class, if it’s not too hard for me and if it’s possible.

Ant on 2009-12-29 at 19:24 said:

Just tried out the latest demo - that does look pretty spectacular!

Lakedaemon on 2009-12-30 at 13:38 said:

Update : http://www.lakedaemon.org/Release.zip Try it with the 53kb default.ttf font

Finally, I have managed to write my font class for the freetype fontcache. it’s in myfont.cpp and myfont.h (and it uses the fontcache.h and fontcache.cpp code that I should clean too)

It solves a few issues but brings some other (notably I can’t display non ascii chars till I have resolved problem 2 below). The next steps (not necessarily in this order) are :

1) improve the support for utf8 strings in woopsi :

There are a lot of places that use : a) char* b) getCharArray()[i] c) strlen() d) loops …

and that must be adapted to work with utf-8 strings

2) Make the font class pass unicode code point u32 instead of char

3) fix the wrapping code : it doesn’t work yet

4) fix the “bug” that makes the blank char [ ] not get displayed (i.e the pen doesn’t advance)

Could you please look at the last demo and give me a hint where to look/what to look for to fix 3 & 4 ?

Ant on 2009-12-30 at 14:12 said:

Point 4 is probably related to the getCharWidth() method in myfont - the blank char presumably has a width of 0, so that’s what Woopsi is being told to advance by when it hits a space. You should catch any zero widths and return a default width instead.

Lakedaemon on 2009-12-30 at 16:26 said:

Mmmmh, and it looks like the wrapping algorithm works just fine… when TEXT_ALIGNMENT_HORIZ_LEFT is set but not when TEXT_ALIGNMENT_HORIZ_CENTRE is set…

I got my clues… investigating further…

Lakedaemon on 2009-12-30 at 16:47 said:

Ok, I got the wrapping in the multilinebox to work fine for an Ascii String + Freetype.

The fault lay with an inverted boolean value in both of my MyFont::getStringWidth functions, which made them NOT increase the string width ^^;

3) Wrapping is ok

Let’s turn to the “mysterious disappearance of the blank space chars” Case now…

Lakedaemon on 2009-12-30 at 18:03 said:

Just found the bug that messed space chars too…

it was another of my brilliant boolean tests. I hadn’t realised that freetype would give the space char metrics but would not give it a buffer…Which is quite logical.
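The shape of that fix can be sketched as follows. The `Glyph` and `Pen` structs and `putGlyph()` are hypothetical stand-ins, not the myfont.cpp code or the freetype types: the point is only that a missing bitmap should skip the blit without skipping the advance.

```cpp
#include <cassert>
#include <cstdint>

using u8 = std::uint8_t;  // stand-in for the libnds typedef

// Hypothetical glyph record: metrics are always present, but the bitmap
// buffer is null for glyphs with nothing to draw, such as a space.
struct Glyph {
    const u8* buffer;
    int width;
    int xadvance;
};

struct Pen {
    int x;
    int glyphsBlitted;
};

void putGlyph(const Glyph& glyph, Pen& pen) {
    assert(glyph.xadvance >= 0);
    // Only blit when there is a bitmap to copy.
    if (glyph.buffer != nullptr && glyph.width > 0) {
        ++pen.glyphsBlitted;  // a real font would copy the bitmap here
    }
    // Advance the pen whether or not anything was drawn.
    pen.x += glyph.xadvance;
}
```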

So… the only remaining roadblock for total freetype goodness in woopsi is unicode support. Starting tomorrow, I’ll try to get everything to work with utf-8 strings and to pass u32 codepoint instead of char.

Lakedaemon on 2009-12-30 at 23:14 said:

Last update : it looks quite good now http://www.lakedaemon.org/Working.zip

Still got artefacts when redrawing the multiline box (and with \n that are displayed by the japanese font but this will be fixed very quick)

The code in fontcache.cpp and .h must still be cleaned up and integrated into myfont.h and myfont.cpp

Ant on 2010-01-03 at 17:05 said:

The latest archive seems to be missing a prebuilt ROM file and the FreeType code. Also, do you have a makefile for the project?

Ant on 2010-01-03 at 19:46 said:

I’m adding your UTF-8 additions to the WoopsiString class into the main Woopsi codebase. I’ve come across a few problems that I’ve managed to fix.

First off, the overloaded [] operator. In C++, it’s possible to return a reference to a value. It isn’t the value itself, nor is it a pointer to the value, but it provides immediate access to the value (I think Stroustrup describes it as an “alias”). This operator should return a reference, so that this is possible:

mystring[2] = 'a';

The operator function returns a reference, and can therefore be used as an lvalue. If, however, you try the same thing with an overload that doesn’t return a reference you should get a compiler error. Since it isn’t really possible to set a char at given index (since each char is of a different length) I’ve removed the operator overload entirely.
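To illustrate the difference, here is a sketch using a hypothetical fixed-size `ByteString` class (it is not WoopsiString). The non-const overload returns a reference and so works as an lvalue; the const overload returns a copy and is read-only.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical single-byte string, used only to show the two overloads.
class ByteString {
public:
    explicit ByteString(const char* text) : _length(0) {
        while (text[_length] != '\0' && _length < kCapacity) {
            _text[_length] = text[_length];
            ++_length;
        }
    }

    // Returns a reference: "s[2] = 'a'" writes into _text.
    char& operator[](std::size_t index) {
        assert(index < _length);
        return _text[index];
    }

    // Returns a copy: usable for reading, but "s[2] = 'a'" on a const
    // ByteString is rejected by the compiler.
    char operator[](std::size_t index) const {
        assert(index < _length);
        return _text[index];
    }

    std::size_t length() const { return _length; }

private:
    static constexpr std::size_t kCapacity = 16;
    char _text[kCapacity];
    std::size_t _length;
};
```

This only works because every element has the same size. With utf-8 storage, replacing one character may change the number of bytes it occupies, which is the reason given above for dropping the operator.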

I’ve renamed a couple of the functions to fit with Woopsi’s coding style. unicodeGetCharAt() should really be called getUnicodeCharAt(), to follow typical conventions in which functions start with a verb. I’ve swapped that (and a few others) around.

The getCodePoint() signature seemed a bit backwards to me. This was the original signature:

u8 getCodePoint(const char* string, u32* codepoint)

It returns the number of bytes in the codepoint as its return value, and populates the codepoint pointer as an output parameter. However, the main point of the function is to return the codepoint, so having it return the count (a side-effect) is counter-intuitive. I’ve swapped it around to this:

u32 getCodePoint(const char* string, u8* numBytes)

The codepoint is returned as the primary parameter, and the number of bytes is an output parameter. This has enabled me to remove the “trash” value from the filterString() method.
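A sketch of a decoder with the swapped signature, written as a free function for illustration only (in Woopsi it is described as a protected WoopsiString method, and its actual body is not shown in this thread). This version signals an error by setting numBytes to 0; that convention is an assumption carried over from the original function. It checks continuation bytes but does not reject overlong 3- and 4-byte forms or surrogates.

```cpp
#include <cassert>
#include <cstdint>

using u8 = std::uint8_t;    // stand-ins for the libnds typedefs
using u32 = std::uint32_t;

// Returns the codepoint at the start of `string` and writes the number of
// bytes it occupies to `numBytes`. On a malformed sequence it returns 0
// and sets `numBytes` to 0.
u32 getCodePoint(const char* string, u8* numBytes) {
    assert(string != nullptr && numBytes != nullptr);
    const unsigned char* s = reinterpret_cast<const unsigned char*>(string);
    u32 codepoint = 0;
    u8 length = 0;

    if (s[0] < 0x80) {
        codepoint = s[0];
        length = 1;
    } else if (s[0] >= 0xC2 && s[0] <= 0xDF) {
        codepoint = s[0] & 0x1F;
        length = 2;
    } else if (s[0] >= 0xE0 && s[0] <= 0xEF) {
        codepoint = s[0] & 0x0F;
        length = 3;
    } else if (s[0] >= 0xF0 && s[0] <= 0xF4) {
        codepoint = s[0] & 0x07;
        length = 4;
    } else {
        *numBytes = 0;  // stray continuation byte or invalid lead byte
        return 0;
    }

    // Each continuation byte must look like 10xxxxxx. A terminator fails
    // this test, so the loop never reads past the end of the string.
    for (u8 i = 1; i < length; ++i) {
        if ((s[i] & 0xC0) != 0x80) {
            *numBytes = 0;
            return 0;
        }
        codepoint = (codepoint << 6) | (s[i] & 0x3F);
    }

    *numBytes = length;
    return codepoint;
}
```

A caller that only wants the codepoint can now write `u32 c = getCodePoint(p, &n);` and advance `p` by `n` afterwards.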

Some of the functions seemed to be floating around in the global namespace (unicodeGetCharAt() and getCodePoint()), whilst others were defined in the .cpp file but not given a signature in the .h file. I’ve merged these into the WoopsiString class as protected methods.

Lastly, I’ve fixed the insert() method so that it only allocates memory if it needs to. It also keeps the _allocatedSize value up to date.

This is available in the SVN repository. Hopefully I haven’t introduced any new bugs!

Lakedaemon on 2010-01-04 at 10:31 said:

Here is the last snapshot of my code for utf8 & freetype & a demo & a Makefile (showing off at least 2 remaining bugs…that shouldn’t be too hard to fix) http://www.lakedaemon.org/Multiline.zip

The overloaded operator was only used in the woopsi code to read codepoints instead of chars in the legacy code. If you removed it, it’ll probably break things but it should be easy to fix… by replacing string[i] with string->getCharAt(i).

About the getCodePoint function, I hesitated between the two syntaxes that are basically equivalent. I chose the first one because I liked the code to check for errors and looping (and the function I took inspiration from in fontcache.cpp did it that way) but it’s more logical your way.

About the namespace…you are right as It all started as a hack (with global variables to pass things around and test stuff) and I have only recently understood how to use namespaces…

I counted on you to do the “integrate it well in the woopsi library” part ^^

Beware… I had to do weird things with the “const” word to make things work for me, because I have only just understood what it was for too… So you might have to change a few signatures to make the utf8/freetype code consistent too

Basically, the utf8 code should be ok…but the freetype code (in fontcache.h/.cpp) needs more work as it creates a new fontcache for each new myfont instance…and I’m pretty sure it’s not the way to go (but as I only do experiments with 1 font for now, it’s ok for hacking/trying to implement initial freetype support in woopsi)

Lakedaemon on 2010-01-04 at 11:02 said:

freetype support will slow down a bit starting today : work resumes tomorrow and I’m taking a break from it : these last three days, I have begun porting/rewriting my app to woopsi (already around 33%-50% done).

I should be able to do it in 1 week (maybe 2 weeks)…and I’ll resume focusing on freetype support after that. Besides, it’ll help and motivate me to improve utf8/freetype support in woopsi.

I haven’t looked at your changes yet but I’ll do it when this is done