A bit of research yesterday turned up this site:
Seems to be a collection of 15-year-old docs about C64 programming. The second issue is the most pertinent here, as it contains - hurrah! - a description of a dynamic memory allocation technique. It's designed for the C128, and is intended to support system RAM, expansion RAM, and some other sort of memory the name of which I can't remember, but all I need to know is how it works.
In brief, the technique works like this (modified slightly to remove the RAM expansion stuff). We create a linked list of free memory blocks, each with a 4-byte header. The header contains the 16-bit (2-byte) address of the next free memory block, and the 16-bit (2-byte) length of the current free memory block. When the routine starts, we have a single header followed by lots of empty RAM. To allocate memory:
- Jump to the first memory block’s header
- Check the length of the block - is it large enough to contain our data?
- If yes:
- Jump x places back from the end of the memory block (where x is the length of our data)
- Reduce the length of the free block (the header value) by x
- Insert our data at that position and return a pointer to the start of our data
- If no:
- Jump to the next address (stored as the first two bytes in the block’s header), which moves us to the next free memory block
- Repeat until memory allocated
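In modern C, the walk above might look something like this. It's only a sketch: the heap size, the function names and the end-of-list marker are all my own inventions (the article works directly with headers in C128 RAM), but the header layout - 2-byte next pointer, then 2-byte length - and the carve-from-the-end trick match the description.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sketch of the free-list scheme. Each free block begins with a
   4-byte header: 2 bytes for the address (here, an offset into the
   heap array) of the next free block, then 2 bytes for the length
   of the free space that follows this header. */

#define HEAP_SIZE 4096
#define NIL 0xFFFF /* end-of-list marker - my assumption, not the article's */

static uint8_t heap[HEAP_SIZE];
static uint16_t free_head;

static uint16_t rd16(uint16_t at) { return (uint16_t)(heap[at] | (heap[at + 1] << 8)); }
static void wr16(uint16_t at, uint16_t v) { heap[at] = v & 0xFF; heap[at + 1] = v >> 8; }

void heap_init(void) {
    free_head = 0;
    wr16(0, NIL);           /* single header, no next block yet */
    wr16(2, HEAP_SIZE - 4); /* ...followed by lots of empty RAM */
}

/* First-fit allocation. We carve the data from the END of the free
   block, so only the length field in the header needs updating. */
void *heap_alloc(uint16_t n) {
    uint16_t cur = free_head;
    while (cur != NIL) {
        uint16_t len = rd16(cur + 2);
        if (len >= n) {
            wr16(cur + 2, len - n);          /* shrink the block by x */
            return &heap[cur + 4 + (len - n)]; /* x places back from the end */
        }
        cur = rd16(cur); /* follow the next-block address in the header */
    }
    return NULL; /* no block was large enough */
}
```

Each successive allocation lands immediately below the previous one, and a request bigger than every free block falls off the end of the list and fails.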
The original article suggests that each chunk of memory should be aligned to the next 8-byte boundary; this would make each chunk a multiple of 8 bytes long. I'd have to think about that some more before I decided that it was definitely the block size to use.
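The rounding itself is trivial - "round the request up to the next multiple of 8" is a one-liner in C (the article gives no code for this; the function name is mine):

```c
#include <assert.h>
#include <stdint.h>

/* Round a request up to the next 8-byte boundary, so every
   chunk is a multiple of 8 bytes long. */
uint16_t align8(uint16_t n) {
    return (uint16_t)((n + 7u) & ~7u);
}
```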
When we de-allocate memory, we try to join the freed block up with the free chunk immediately to its left or right (or both), so that contiguous free memory doesn't end up split into separate chunks by useless headers.
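The merge decision looks something like this. For clarity I've modelled the free list as an address-sorted array of (start, length) pairs rather than the in-heap 4-byte headers the article uses, and all the names are mine - it's the left/right/both logic that matters:

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint16_t start, len; } Block;

/* Insert a freed block into an address-sorted array of free blocks,
   merging it with the left and/or right neighbour whenever they
   touch. Returns the new number of blocks in the list. */
int free_block(Block *list, int n, uint16_t start, uint16_t len) {
    int i = 0;
    while (i < n && list[i].start < start) i++; /* find insertion point */

    int merge_left  = i > 0 && list[i - 1].start + list[i - 1].len == start;
    int merge_right = i < n && start + len == list[i].start;

    if (merge_left && merge_right) { /* freed block bridges two neighbours */
        list[i - 1].len += len + list[i].len;
        for (int j = i; j < n - 1; j++) list[j] = list[j + 1];
        return n - 1; /* three chunks became one */
    }
    if (merge_left)  { list[i - 1].len += len; return n; }
    if (merge_right) { list[i].start = start; list[i].len += len; return n; }

    /* no neighbour touches: plain insert, keeping the list sorted */
    for (int j = n; j > i; j--) list[j] = list[j - 1];
    list[i].start = start;
    list[i].len = len;
    return n + 1;
}
```

Freeing a block that touches free chunks on both sides collapses all three into one, which is exactly the "no useless headers between contiguous free memory" property we want.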
Compared with my FAT system, this has pros and cons. On the positive side, huge chunks of memory can be allocated with very little waste. We could allocate a block of 16K and use only 4 bytes of memory to manage that block, whereas the FAT system would need 2K to manage the block. It’s probably faster than the FAT system, too. On the negative side, allocating a single byte would eat up at least 5 bytes of RAM, which is incredibly wasteful.
This got me thinking. What if you could group variables together and allocate them all in one go? If you were writing Pong, for example, you'd know that each player has an X value (1 byte), a Y value (1 byte), a velocity (1 signed byte) and a score (1 byte stored as binary-coded decimal). That's 4 bytes. Allocating them individually in the linked-list system would use 20 bytes (more if we align along 8-byte boundaries). However, if we decided in advance on the sequence of the bytes, we could allocate them as one chunk, then use indexed addressing to get at the correct value: offset 0 for the X value, offset 1 for the Y value, and so on. We can store the lot in just 8 bytes - 4 header bytes plus 4 data bytes. I think I must be on the right track again - I've just re-invented the struct.
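For comparison, here's that 4-byte layout written as the C struct it amounts to, with a fixed-offset accessor standing in for the 6502's indexed addressing (field and function names are illustrative):

```c
#include <assert.h>
#include <stdint.h>

/* The Pong player record from the text: four 1-byte fields
   laid out in a known, fixed order. */
typedef struct {
    uint8_t x;     /* X value */
    uint8_t y;     /* Y value */
    int8_t  vel;   /* velocity, signed */
    uint8_t score; /* score, stored as BCD */
} Player;

/* Reading a field as a fixed offset from the chunk's base pointer
   mirrors 6502 indexed addressing: offset 1 reaches the Y value. */
uint8_t get_y(const uint8_t *base) {
    return base[1];
}
```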