2011-03-28

Woopsi 1.1 Released

A new version of Woopsi is now available:

If you’ve been following the threads on the forum, you’ll have a good idea of the changes in this latest release. A new Woopsi user - carpfish - has contributed some neat features to Woopsi, such as natural string sorting, string splitting and formatting, and many others. Most of the changes in this release are either based on his work or solely his work. He has also contributed a number of bugfixes.

So, many thanks to carpfish for his contributions. Thanks also to MBMax who I believe introduced him to Woopsi.

The full changelog can be found in the archive, as usual.

2010-01-19

A Bugfix Bonanza

If you have been following the comments for the previous few posts, or the forum, you’ll have seen work progressing at a frantic pace. Most of the thanks for this goes to Lakedaemon, who has been zapping bugs throughout Woopsi’s text rendering system. Working so quickly and not regularly blogging about it means that I’ve probably got a lot of changes to list here. Let’s see if I can collate them into some sort of order…

The WoopsiString class has seen a lot of changes. I’ve added indexOf(), lastIndexOf() and subString() methods. The first will find the first index of a particular character in the string and return its index. The second does the same thing, but in reverse - it finds the last index of a character. The last will return a pointer to a WoopsiString containing a subsection of the original string.

WoopsiStrings are no longer null-terminated. They already included a member variable that stores the string length, so appending a terminator was a waste of a byte.

The Text class’ wrap() function has had some long-standing bugs fixes. The strangest was probably an invalid comparison between a line index and a char index that was supposed to prevent the function from wrapping the entire text, but actually just screwed up the function most of the time, and made it slower the rest of the time. Not sure how that stayed in there for so long.

The next most significant bugfix was to the GraphicsPort’s clipScroll() method. This is used primarily in the ScrollingPanel gadget. It performs two functions. If it is possible to copy any of the existing visible region to the new location, the function copies it. This prevents redrawing, which is slow, if the required visual data is already available. Secondly, it remembers any rectangular regions that require redrawing and passes the list back to its caller. The ScrollingPanel uses the list to redraw any non-copied regions of itself.

The clipScroll() method worked perfectly if scrolling in just one plane, but as soon as horizontal and vertical movement was attempted simultaneously, it fell apart. Rectangular regions were missed out or copied to the wrong locations, particularly in the corners.

The function was originally tackling the problem backwards. It calculated the size of the source and destination rectangles then clipped those rectangles. However, this introduces two problems:

  • The rectangles could become different sizes;
  • The rectangles could acquire different offsets from their original locations.

Neither problem is too difficult to solve, but it does mean that if either problem arises, the function will not be able to copy the regions and must therefore fall back on the much slower redraw option.

The “correct” approach to the problem is to clip before calculating the sizes of the source and destination rectangles. That will ensure that neither of the two highlighted issues arises, which means that the copy can be used in preference to redrawing much more frequently.

I’ve finally got around to removing the glyphs from the Topaz and NewTopaz fonts. As the glyphs are now an entirely separate font, keeping that data in the bitmaps was a waste of space. Neither font requires colour now that the glyphs are gone, to I’ve converted Topaz from a Font to a MonoFont, and converted NewTopaz from a PackedFont16 to a PackedFont1. This has made both fonts considerably smaller.

Related to this, PackedFontBase::isCharBlank() returns the correct value (false) if the requested character is outside the range of the available bitmap data.

Dragging a background screen below the level of the topmost screen no longer crashes the system. The Screen class was attempting to copy a region with a negative height, which the DMA hardware really doesn’t like.

The FileRequester has had yet more fixes since the last post. When running in SDL mode, it now shows a dummy list of files and directories. I had intended to get it to show a real list of directories, but I ran into The Windows Problem - every major operating system but Windows has POSIX-compliant file system access. Unfortunately, of those major operating systems Windows is the most prevalent. I really couldn’t be bothered to write the POSIX code if most Woopsi developers are using Windows, and I couldn’t be bothered to write Windows code when I only run the SDL version in OSX. The compromise, which will work on all platforms, POSIX or not, was just to create the dummy list.

Finally, both the MultiLineTextBox and the standard TextBox respond to d-pad repeats. Holding down left or right on the d-pad when either of these two gadgets has focus and a visible cursor will cause the cursor to move repeatedly until the d-pad is released.

2010-01-07

A Better UTF-8 String

The WoopsiString Class - Optimised for ASCII

The current WoopsiString class stores its data in the usual way: an array of chars. When the string is initially set, the object creates an array large enough to hold the data it should store. It then copies the data into that new array.

When appending to the string, the object checks to see if it contains enough free space within the existing array to contain the new data. If so, the data is just copied into the free space. If not, a new array is allocated large enough to contain both the existing data and the new data. The existing data is copied into it, and the new data is copied to the end.

Insertion works in a similar way, except two portions of the existing data will be copied separately into disjointed sections of the new array if the insertion point falls within the existing string.

In contrast, removal is very simple. A new terminator character is placed at the new endpoint of the string, and everything following it is deemed to be free space, ready for future insert/append operations.

Although it involves a lot of copying when inserting and appending, this system works exceptionally well for the two most frequent string operations - random access to a character within a string, and iterating sequentially over the characters in a string. Keeping all of the characters in a sequential block of RAM makes these two operations trivial.

Reducing Data Copying Via Linked Lists

When I was writing the WoopsiString class, however, the amount of copying annoyed me. I came up with a solution that performed a lot less copying by breaking the string data up into a linked list of string fragments. For example, imagine that we have this string:

hello

Drawn with pipe characters as byte boundaries, it looks like this:

| h | e | l | l | o | \0 |

Suppose we want to append the text “ world!” onto that string. The current WoopsiString class would do this:

  1. Allocate a new char array of length 13.
  2. Copy “hello” from existing array into new array.
  3. Copy “ world!” from second array into new array.
  4. Append terminator to end of new array.
  5. Delete existing array.
  6. Update array pointer to point to new array.

We end up with this arrangement:

| h | e | l | l | o |   | w | o | r | l | d | ! | \0 |

The proposed new WoopsiString class would avoid all of the tedious copying by appending the “ world!” text as a new string fragment:

  1. Allocate a new linked list item containing a char array of length 7.
  2. Copy “ world!” into the new array.
  3. Point the “next” pointer of the “hello” fragment to the new “ world!” fragment.

The data structure we end up with looks like this (“-->” indicates a pointer):

| h | e | l | l | o | --> |   | w | o | r | l | d | ! |"

This new system has clear advantages over the current class, at least when it comes to appending. Very little copying needs to be performed.

Inserting has similar benefits. If we want to insert a comma after the word “hello” we would need to perform the following operations in the current class:

  1. Allocate a new char array of length 14.
  2. Copy “hello” from existing array into new array.
  3. Copy “,” from second array into new array after the word “hello”.
  4. Copy “ world!” from existing array into new array.
  5. Append terminator to end of new array.
  6. Delete existing array.
  7. Update array pointer to point to new array.

We end up with this:

| h | e | l | l | o | , |   | w | o | r | l | d | ! | \0 |

Again, lots of copying. The new class would do this instead:

  1. Allocate a new linked list item containing a char array of length 1.
  2. Copy “,” into the new array.
  3. Point the “next” pointer of the “,” fragment to the “ world!” fragment.
  4. Point the “next” pointer of the “hello” fragment to the “,” fragment.

We get this:

| h | e | l | l | o | --> | , | --> |   | w | o | r | l | d | ! |"

No copying involved. However, we are cheating a little here; we’re taking advantage of the fact that we’re inserting between two existing fragments. What happens if we try to insert “llo he” into our “hello, world!” string just after the second character?

  1. Allocate a new linked list item containing a char array of length 2.
  2. Copy “he” from the “hello” fragment into the new array.
  3. Allocate a new linked list item containing a char array of length 6.
  4. Copy “llo he” from the second array into the new array.
  5. Allocate a new linked list item containing a char array of length 3.
  6. Copy “llo” from the “hello” fragment into the new array.
  7. Point the “next” pointer of the “he” fragment to the “llo he” fragment.
  8. Point the “next” pointer of the “llo he” fragment to the “llo” fragment.
  9. Point the “next” pointer of the “llo” fragment to the “,” fragment.
  10. Delete the “hello” fragment.

The final data structure looks like this:

| h | e | -- > | l | l | o |   | h | e | --> l | l | o | --> | , | -->
|   | w | o | r | l | d | ! |"

(We could optimise that slightly by making use of the existing “hello” fragment instead of deleting it, which would ultimately give us “hello”-->” hello”->“,”-->” world!” instead. It makes the algorithm slightly more complex but slightly more efficient.)

The existing class would need to make 1 memory allocation, copy 19 characters and perform 1 memory free to be able to perform this operation. The new class would make 3 memory allocations, copy 11 characters and perform 1 memory free to do the same thing. Though this doesn’t look like much of an improvement, imagine that the string we’re inserting into is 5,000 characters long, and has been created by concatenating five 1,000 character strings together. The existing class would copy all 5,000 characters during the insert. The proposed new class would copy just 1,000, as it could work with an individual fragment instead of the entire string.

Problems with Random Access and Iteration

If this is such a great improvement, then, why didn’t I implement it? The answer lies right at the top of this post. The two most frequent operations performed on strings are extracting substrings and iterating over their characters. If a string is stored non-contiguously, these two operations become much more complex. In a sequential string, extracting a character from a string can be done in O(1) time, and is as simple as getting a pointer to the start of the string and adding an offset. Iterating over the string is simply a matter of adding 1 to the pointer.

In the proposed class, extracting a character is far more laborious. Assuming each fragment includes an int that identifies its length, the algorithm for finding a character looks like this:

Store a pointer to the head of the list in "listPointer"
Store its length in a temporary variable "totalLength"
While the desired index is greater "totalLength"
    Point "listPointer" to its own "next" pointer
    Add the length of the "listPointer" fragment to "totalLength"

Retrieve the character from the "listPointer" char array at desired index
minus "totalLength"

The worst-case scenario, in which the desired index is the last character in a string that has been built by appending single characters, will result in a retrieval time of O(n).

For ASCII strings, this is disastrous.

When dealing with UTF-8 strings, however, this approach starts to look far more appealing. In the best-case scenario for retrieving a character from a UTF-8 string the time is O(n), because there is no way of knowing where a character is stored without iterating over the entire string.

In Defense of UTF-8

To explain this, here’s a quick description of the problem that UTF-8 tries to solve. In ASCII text, each character is represented by a single byte. The most significant bit is not used, so each byte can represent one of 128 different characters. This is great for the English-speaking world, as ASCII text can represent all characters from the standard English alphabet, all numerals, and most of the important symbols (though us Brits are screwed when trying to talk about money, as the symbol for the British pound is not in the ASCII set).

The rest of the world is not so lucky. The ASCII set does not include accent marks, so the French are stuck. It doesn’t include umlauts or the Eszett (double ’s’) so the Germans are stuck. And it definitely doesn’t include the tens of thousands of characters in any of the multiple Chinese languages, so they’re really stuck.

The solution is obviously to make characters longer than a single byte. Suppose we make all characters 32-bits wide, for example. Now we have another problem - all of our existing applications, that assume characters are 8-bits wide, no longer work. Plus we’ve quadrupled the size of all strings in all future applications. If those applications are for American users, ¾ths of that is wasted space.

UTF-8 solves this by using multiple bytes for characters if necessary. It will represent ASCII characters as single bytes, but other characters will be represented as either 2, 3 or 4-byte sequences.

The problem with variable-length encodings is that it is impossible to know where a particular character is. For example, suppose we decide that the letter “w” should be represented by 4 bytes. “Hello world” now looks like this:

| H | e | l | l | o |   | w | w | w | w | o | r | l | d | \0 |

In order to find the first “r” in the string (character 8 ) we need to retrieve byte 11. Since all of the characters in the string could be multibyte characters, the only way we can possibly guarantee that we retrieve the correct character is by starting at “H” and working forwards, a character at a time.

Optimising UTF-8 with Linked Lists

This is where the new WoopsiString idea becomes extraordinarily helpful. Suppose that the string above has been created by concatenating the strings “Hello ” and “world”:

| H | e | l | l | o |   | --> | w | w | w | w | o | r | l | d |

Locating the “r” in the string is now considerably faster. We can bypass the entire “Hello” portion of the string simply by comparing the index we’re looking for (8) with the length of the “Hello” fragment (5), which tells us that the character we’re looking for isn’t contained in that fragment. For ASCII strings, this was a horrible performance decrease, but for UTF-8 strings represents a considerable performance increase.

If strings were automatically divided into fragments (32 characters - not bytes - wide, for example), and if the fragments included “previous” as well as “next” pointers (ie. a doubly-linked list), random access performance can be improved even further.

Iterating with Linked Lists - The StringIterator Class

As always, there is a penalty to pay for this performance increase. Iterating over a UTF-8 string is more complex than iterating over an ASCII string, but not much more complex. Once the starting character has been located, iterating over the string is simply a question of consuming bytes until a valid codepoint (the “value” of the character, sans the UTF-8 multibyte markers) has been identified, returning the codepoint, moving to the next byte and starting the process again.

If linked lists are introduced into the mix, finding the starting character becomes much faster, but iterating over the string from that point (assuming the string section being iterated over crosses a fragment boundary) becomes more complicated.

At the moment, Woopsi handles iteration over strings by extracting a char pointer from the WoopsiString object, then iterates over it by adding the size of each character to the pointer (the same algorithm as outlined above). If linked lists were introduced, this would not be possible. In fact, the only practical and sturdy solution I can some up with is to have a separate StringIterator class which would handle all string iteration. You’d need to give it a starting character to locate, then use its “MoveNext()” and “MovePrevious()” methods to iterate over the string.

Summary

In summary, if it’s possible to sum up these disparate thoughts, it’s possible to create a string class that uses linked lists in order to require far fewer copy operations than the existing WoopsiString class for standard string functions. Unfortunately, for ASCII text, it would be counterproductive to do so. Conversely, the same design presents a significant performance improvement for UTF-8 text, at the cost of making iteration more complex and time consuming.

2010-01-05

Unicode News

Unicode support in Woopsi is coming along nicely. Most of the gadgets now expect to receive WoopsiString objects instead of chars or char arrays when dealing with text. As the WoopsiString now works with UTF-8, that means most of the gadgets now also support UTF-8.

I’ve overloaded the WoopsiString’s “=” operator for WoopsiString objects, char arrays and u32s, which means that any parameter that should be a WoopsiString can accept char arrays or u32s instead. Converting from string literals to WoopsiStrings is therefore seamless and automatic.

The glyphs are now stored in a separate “GlyphFont” font. The default glyph font is stored in the defaultGadgetStyle global variable, in the same way that the default system font is stored. Converting this unearthed a couple of bugs in the Bmp2Font program - it doesn’t write out the resultant font constuctor properly (it misses out the transparent colour and uses the raw bitmap data instead of the bitmap wrapper object). I still need to fix those.

The most problematic changes were convincing buttons that displayed glyphs to centre their text properly (turns out that they were centring based on the wrong font) and getting the MultiLineTextBox to redraw at a reasonable speed. After some cunning hacking about with the WoopsiString and the MultiLineTextBox I’ve got it back up to something approaching its old speed.

Regarding speed, unicode definitely has a negative impact. Having to scan through an entire string to locate a codepoint is a horrible performance killer. WoopsiString::getCharAt() is definitely not a function you should use if you’re reading out sequential characters, as it looks for the codepoint from the start of the string each time it is called. It’s much more advisable to use getToken() to get a pointer to the first token, read out the codepoint, then increase the pointer by the size of each character as it is read. Before I went back to optimise things, the MultiLineTextBox was using getCharAt() when drawing and so was the PackedFontBase, meaning the string was being scanned through twice for each character being printed. Xcode’s “Shark” profiling utility noted that the MultiLineTextBox was using 31% of the total CPU time when printing a line of text solely because of all the UTF-8 parsing. Yikes.

There are still a lot of gadgets to convert over, and probably a hefty set of bugfixes and optimisations to make.

2009-11-05

Strings, C and C++

A function to read a single line of text from a text file is a tricky thing to write in C++, at least when one is limited to fgetc() and fgets() from the standard C library.

Memory allocation is the main problem. A readline() function must obviously store its output somewhere. There are three possible approaches. In the first, and most usually recommended, the function should receive a block of pre-allocated memory. In the second, the function creates and returns a new block of memory. An in-between method could receive pre-allocated memory but enlarge it as needed.

The fgets() function is the most primitive example of a readline() function in the standard C library. It relies on the caller to supply it with pre-allocated memory in which to store its output, and it looks like this:

char* fgets(char* buffer, int count, FILE* stream)

The function reads from stream until it hits either a newline, end of file, or has read the number of characters indicated by count (this should be at most the size of buffer). It stores the output in buffer, and also returns a pointer to buffer when the function ends.

fgets() has a major problem, of course. How many characters are there in the line being read? Answer: No idea. fgets() is a low-level function that may read a whole line or just part of a line, and it is the programmer’s responsibility to identify when a partial line has been returned and repeat the process until all characters are retrieved.

A function that solves this problem is getline() from the GNU C library (not the same as the getline() method available to C++ istreams). It looks like this:

ssize_t getline(char** lineptr, size_t* count, FILE* stream)

The function is guaranteed to read an entire line of text. lineptr performs the same function as buffer in fgets() - the text retrieved by the function is stored here. count contains the size of the lineptr buffer. stream is a pointer to the file being read. In this function, memory can be allocated before the function is called, but it does not need to be (in that case - deep breath - the pointer to which lineptr points must point to NULL and count must be 0). The function uses realloc() to increase the size of the lineptr buffer as necessary. The eventual size of the buffer is stored in count and returned as the output of the function.

It can be thought of as an example of the “in-between” approach described earlier. It relies on externally-created memory but can enlarge the memory area to accommodate the full string. Choosing a generous initial buffer size will reduce the number of realloc() operations performed. Assuming realloc() is usually successful in increasing a block of allocated memory, rather than needing to allocate a new region and copy the old data into it, getline() should be comparable in speed to fgets(). The minor speed penalty is worth the ease of use the function affords.

However, this function has its own set of problems, at least within the scope of Woopsi. It is not available for starters - devkitARM does not include the GNU C library. Even if I were to borrow the code, it really highlights one of the problems with C++.

getline() expects to be passed a pointer to either memory allocated via malloc() or NULL via the lineptr parameter. C++, however, essentially deprecates the malloc/free method of allocating heap memory. A C++ programmer should instinctively use the new/delete memory allocation system. This will potentially cause all manner of chaos when getline() calls realloc() on memory that was allocated using the “new” keyword. Since memory allocated with “new” cannot be reallocated, a C++ re-implementation of getline() will always copy memory every time the buffer is resized. getline() is a good example of why C++’s palimpsest-like layering over C was perhaps not such a good idea.

Rather than allowing memory to be allocated outside of the function and populated within the function, a different approach is to handle all memory allocation internally:

char* getline(FILE* file)

This version of getline() would be more familiar to programmers using higher-level languages such as C# and Java. It doesn’t perform magic with pointers-to-pointers. It will allocate memory within itself, grow it as necessary, and simply return a pointer to the string once it has been retrieved.

Outside of memory-managed languages, though, this function has problems. How has the memory been allocated - new or malloc()? We can document this in the function’s comments, and we can also follow the C++ standard of using “new” to make this more intuitive, but the question still needs to be asked before the function can be used. More importantly, who is responsible for deleting the memory when it is no longer being used? If this is a standalone function it is fairly obvious that it is the caller’s responsibility to delete the string once it has been used, but what if the function is actually a public method of a class? Does the class use the memory internally, or has it allocated the memory solely for the caller’s exclusive use?

My reason for pondering this is directly related to the last post. I’m writing a simple textfile class that abstracts away the filesystem and can do useful things such as reading individual lines of text. The problem I’m coming up against is how to make the API for the class familiar to C and C++ programmers, embody the ease-of-use of a Java or C# class, whilst simultaneously trying to make it efficient for the DS hardware.

Following the fgets() pattern makes sense since fgets() is one of the functions offered by libfat. The standard getline() method is considerably more user-friendly, however, as it eliminates the tedium of ensuring that fgets() does really return a complete line. The alternative version of getline() is easier to use than both of the others, but ownership of the resultant memory is not at all clear solely from the code itself.

It is worth mentioning that all of the usual C++ ways of handling this are off-limits due to Woopsi’s strict adherence to binary-size-reducing “no STL” policy.

2008-10-10

More Strings and Things

Minor changes to the WoopsiString class. It now reallocs with the necessary size plus 32 chars, which should reduce the number of reallocs required. I’ve fixed an off-by-one bug in the remove() method, and added some more bounds checking in the same place.

The Text class, used by the MultiLineTextBox to store its contents, now inherits from the WoopsiString class. This means all of the string functions are stored in a single class instead of being duplicated in two places, and it also means that the Text class gets all of the new memory management code.

As a result of this, the MultiLineTextBox now has a removeText() method. The keyboard’s backspace key can now be wired up, which I’ve demonstrated in the keyboard example.

Next on the list - finish cursor support in the TextBox. I can either leave the way it works as it is, or follow Jeff’s advice and have multi-character selection in addition to the cursor. In any case, it might be handy if tapping a textbox with the stylus jumped the cursor to the click position.

Once that’s done, I need to add cursor support to the MultiLineTextBox.

2008-10-09

Strings and Things

The Text class is fixed. It now wraps lines of text that are wider than the available display window but have no natural breaking points. The wrapping function was skipping over characters when attempting to force breaks into non-broken text and drifting off into memory after the end of the string.

Next up, there’s now a WoopsiString class. This wraps around a char array and provides all of the functionality I’ve been doing manually in the TextBox class - setting the string (and moving memory about), concatenating strings (and moving memory about), inserting one string into another (and moving memory about), and removing sections of a string (and moving- well, you get the idea).

I’ve taken the opportunity afforded by this rethink to improve the way the memory allocation is handled. Instead of blindly reallocating a new char array every time the string changes length, the new WoopsiString class attempts to work within the memory it has already allocated. For example, imagine we make this string:

WoopsiString* s = new WoopsiString("abcd");

Now we append another string to end:

s->append("efgh");

We’ve got no choice here but to expand the memory used, which means allocating a new char array of length 9 (initial string length plus second string length, plus terminator) and copying the two strings into it. However, if we then truncate the string there’s no reason for us to allocate a smaller array:

s->remove(2);   // Remove chars from index 2 until end of string

All we really need to do here in order to truncate the string from index 2 onwards is put a terminating character in the 3rd array element. We’ve then got 6 empty slots in the char array. Now, should we append anything else to our string, we can either fill in the empty slots (if there are enough slots) and save time, or we can go through the memory allocation and string copy process again (if there aren’t enough slots):

s->append("xyz");

In this instance, we can append “xyz” to the “ab” we’ve got stored in the WoopsiString s without allocating any new memory.

I imagine every string class ever written does something like this, so it’s not an amazing innovation, but it’s a definite improvement over the char array mangling and memory fragmenting that the TextBox was doing before.

There’s no change to the way the TextBox interfaces with the outside world; that’s still all done with char arrays or chars. This is just an internal change.

One more class done today - Label. The Label is a borderless, read-only (from the user’s point of view) gadget that allows text to be put onto the screen. Basically, it’s the TextBox with GADGET_BORDERLESS set in its flags minus the cursor stuff I’ve been adding to TextBox. TextBox now inherits from Label.

2008-10-07

Line Clipping

I was browsing through a book on computer graphics the other day when I came across a section on clipping. The book didn’t really go into any great detail, so I googled it and discovered there are three main 2D line clipping techniques. The first is very simple; draw a line pixel by pixel, and if the pixel to be drawn falls outside the visible region, don’t draw it. Simple, effective, slow.

A better algorithm is the “Cohen-Sutherland” method, which imagines the display as a 3x3 grid of rectangles, uses a bitmask to identify which regions the two points of the line are in, then uses the parametric equation of the line to reduce its length so that it fits within the visible region. Clever. I would explain the routine in detail, but it’s hugely popular in Google already, and it seems that you’re not a Real Programmer unless you implemented it a dozen times before you had breakfast today. Of course, I’d never heard of it before.

The third method is the “Liang-Barsky” algorithm, which I confess to not really looking into. It’s allegedly faster than the Cohen-Sutherland method, but I lost interest as soon as I realised it required floating-point maths to work.

There was another method, which claimed to be several times faster than Liang-Barsky and didn’t need floats, but as that one was patented I lost interest even faster. Software patents are a very bad thing.

Anyhoo, Woopsi’s GraphicsPort class now has a drawLine() routine. It uses the same Bresenham line drawing routine that the SuperBitmap uses, but it’s wrapped inside an implementation of the Cohen-Sutherland clipping routine. I’ve had to modify it a little to replace its use of floats with fixed-point maths, but that doesn’t seem to have impacted its accuracy.

The next thing on my to-do list is to extract the basic functionality of the TextBox class into a separate “Label” class, .NET-style. Read-only textboxes really don’t need to carry around the extra baggage of the new cursor system. Also, I’m thinking of moving the string manipulation code out of the TextBox class and into the Text class, thus separating the presentation of the data from its manipulation. This, of course, means I’m effectively writing my own string class, which is one of the classic mistakes people make with C++ (the STL obviously has a perfectly good implementation) but as the STL is off-limits I think I have a good excuse.