2007-05-17

Dynamic Memory Allocation, or Reinventing the Wheel

In my quest to learn an obsolete flavour of an ancient programming language, the biggest question I’ve found myself asking is what to do if you need to manage memory dynamically. Static memory allocation is easy enough. If I’m writing a game, I know how many enemies will be on-screen, how many players there are, how many levels; in short, I can manually map out the memory usage of the game and leave placeholders in my assembly code to fill with that data as the program is running.

It gets more complicated when you’re trying to write an application whose sole purpose is to store increasing amounts of data, such as a spreadsheet, word processor, paint package, etc. If you need to be able to dynamically allocate and free memory as it is used, how do you go about doing it?

So, I got to pondering. My first thought was that you could decide upon a large block of memory to use as dynamic storage (say the addresses $6000-$9FFF on the C64, which is at the end of the BASIC storage space). You specify that another memory location is used as a lookup table for data in that memory block. We’d use each bit in the lookup table to specify whether or not a whole byte of the memory block is free. So, say our lookup table starts at $4000, and the first value is $FF, that means that the first 8 bytes of our memory block are in use. Allocating memory is a question of scanning along the lookup until we find a clear bit; we then set the bit and use the byte in memory. If we’re trying to allocate more than a single byte, we scan along until we find enough clear bits (ie. empty bytes) to contain our data. When we free memory, we simply clear the relevant bits.

This has two major problems. First of all, our memory block uses up just under 16K of RAM, which means our lookup table uses 2K of RAM. When you’ve only got 64K to start with, and when much of that is taken up with the operating system (simple though it may be), you don’t really want to waste that much memory. One solution would be to divide the memory into chunks of 2 bytes and just remember whether we’re using a chunk or not. That would reduce the size of our lookup table by half, but it’d double the storage space needed for a single byte to two bytes. We’re not doubling the size of a byte, but we’ve reduced the resolution of our lookup to prevent us from seeing that we’ve only used one of the bytes in that 16-bit block. This solution becomes troublesome if we’re storing predominantly 8-bit data instead of 16-bit data, because we’ve effectively halved the available memory.

It occurred to me that this table is, essentially, nothing more than a FAT system implemented in RAM instead of on disk. The “chunks” are, in FAT terminology, “blocks”. Windows uses something like a 4K block size, so if you save a 2K file you end up with 4K used on the disk. The Amiga had variable block sizes, defaulting to a small size which made it great at storing small files efficiently, but hopeless at retrieving large files. First re-invention of the day! Anyway, let’s call this lookup table the “FAT”.

The second problem is that we’ll quickly end up with fragmented memory, with empty bits interspersing the used data, which will greatly increase the amount of work done by the memory allocation routine.

It’s obvious that we need some sort of defragmentation routine here. The simplest way to do this is to write a subroutine that will scan along the FAT until it hits a clear bit. It then scans along until it finds a set bit, and moves all of the back to remove the gap. It manipulates both the FAT and the data to ensure they both stay in sync.

This is, basically, what Tetris does when the game detects a full row - remove the full row, then move the rows above it down to fill the gap. And they say you can’t learn anything from videogames!

The problem with this is that any existing pointers will be thrown out - they’ll now point to the wrong location as all of the data has moved. To solve this, we need a second lookup table. The new lookup table will store pointers to the main data block, and the main program uses these pointers to refer to data (in C terminology, we double-dereference the pointer, or in asm terms we treat the address as an indirect address).

So, when we allocate memory, we store a pointer to the data in the pointer table, and pass the pointer table address back to the program. To access the data, the program would need to go to the pointer table address, go to the memory address stored there, and finally get at the desired data.

Now, there is a huge problem with this. First of all, the obvious way of storing this (one address for every byte in the memory block) would use twice as much memory as the memory block itself. The memory block is 16K in size, which is 16384 bytes of information. The pointer table would, therefore, need to store 16384 addresses, and as each address is a 16-bit, or 2 byte, value, the pointer table would use 32K of RAM. That’s half of the memory in the entire computer.

At this point I realise that there’s already a function in existence to do all of this - the malloc() function in C must do something similar. So, I look it up. It transpires that one of the more popular ways to manage memory dynamically is to produce a memory map, or “mmap”, which works in a very similar way to the FAT system outlined above (it may even be identical; I’d have to read some more to say definitively). It seems that memory fragmentation is one of the major problems with this method. The guy who wrote the Enlightenment window manager for Linux came up with a system almost identical to my pointer system above to solve this.

So, that’s twice more I’ve re-invented the wheel today. Still, it’s nice to know that I’m working on the right lines. I’ve done some more thinking, and have a few more ideas for auto-defragmenting dynamic memory allocation. The main problem with all of them is that they’re fairly complex and therefore aren’t really suited to the C64. Still, the fragmented memory map system would be easy to implement. I might try that as my first useful 6502 project.