2008-02-25

Woopsi Bugfixes

A few minor bugfixes are now in the SVN repo. First of all, I’ve removed the post-DMA operation checks for DMA activity. They’ve been replaced by calls to DC_FlushRange(), which should fix all of the DMA problems; I’ve also added these calls to the SDL code. Secondly, I’ve fixed the vector pointer problem in Gadget::moveHiddenToChildList().

Comments

Jeff on 2008-02-26 at 08:37 said:

I’m a bit nervous at the way you put those DC_FlushRange() calls everywhere.

I’m inclined to think that for the most part, they are the caller’s problem, not yours. The only places I’d have put them were where you were creating a bitmap, then DMA’ing it. For example, setting the first byte in a line, then DMA’ing it into the rest of the line, or drawing a solid line, then DMA’ing it down the page.

Constantly flushing the cache like that will impact on performance, though hopefully negligibly if there is nothing cached.

Jeff on 2008-02-26 at 08:46 said:

I’ve also been thinking about STL, specifically the use of vectors. As has been observed elsewhere, a Woopsi binary is quite big, and I wonder whether

a) STL is fairly expensive for the benefit you get
b) STL is clumsy (when you look at the delete operators that force you to create an iterator to find something whose index you actually know)
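To illustrate point b), here is a minimal sketch of removing a vector element whose index is already known. The helper name eraseAt() is hypothetical and is not part of Woopsi; the only standard calls used are std::vector::erase() and begin().

```cpp
#include <cstddef>
#include <vector>

// Remove the element at a known index. std::vector::erase() only accepts
// iterators, so the index has to be converted with begin() + index.
template <typename T>
bool eraseAt(std::vector<T>& items, std::size_t index) {
    if (index >= items.size()) {
        return false;  // Out of range: leave the vector untouched
    }
    items.erase(items.begin() + static_cast<std::ptrdiff_t>(index));
    return true;
}
```

The iterator arithmetic is cheap at runtime; the complaint is about the extra ceremony rather than the cost.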

I also wonder if including printf() in non-inline members is overly expensive, since printf() will haul in full floating point support - it might be better to use vsniprintf(), which apparently only does integer stuff. If someone really wants floating point, they can do it themselves - Woopsi diagnostics shouldn’t need it.

ant on 2008-02-26 at 09:43 said:

There are only four calls to DC_FlushRange(), two in the GraphicsPort and two in the SuperBitmap. Two of the calls flush out a single “row” of memory so that I can duplicate the top row when drawing a filled rect. The other two flush a bitmap’s memory before DMA-ing it to the framebuffer. Basically, just what you’ve said. There’s no need to put them anywhere else, as far as I can tell, as these are the only areas where the DMA could run into cached memory. It would be possible to remove the call in GraphicsPort when blitting a bitmap, but it makes things much easier to do it automatically. The call in SuperBitmap is pretty much essential as all of the manipulation that can be performed on its bitmap is done with the DMA hardware.
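The flush-before-DMA pattern described above can be sketched with a toy model that runs on any host compiler. Everything here is an assumption made for illustration: CachedMemory, flushRange(), dmaCopy() and fillRect() are hypothetical names, flushRange() only plays the role of DC_FlushRange(), and a real cache works on whole cache lines and may write data back on its own.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model of a write-back data cache in front of main memory.
// Hypothetical names throughout; this is not libnds or Woopsi code.
class CachedMemory {
public:
    explicit CachedMemory(std::size_t size)
        : ram_(size, 0), cache_(size, 0), dirty_(size, false) {}

    // CPU write: lands in the cache only and is marked dirty.
    void cpuWrite(std::size_t addr, std::uint16_t value) {
        cache_[addr] = value;
        dirty_[addr] = true;
    }

    // Stand-in for DC_FlushRange(): write dirty entries back to RAM.
    void flushRange(std::size_t start, std::size_t count) {
        for (std::size_t i = start; i < start + count; ++i) {
            if (dirty_[i]) {
                ram_[i] = cache_[i];
                dirty_[i] = false;
            }
        }
    }

    // DMA copy: bypasses the cache and only ever sees RAM.
    void dmaCopy(std::size_t src, std::size_t dst, std::size_t count) {
        for (std::size_t i = 0; i < count; ++i) {
            ram_[dst + i] = ram_[src + i];
        }
    }

    std::uint16_t ram(std::size_t addr) const { return ram_[addr]; }

private:
    std::vector<std::uint16_t> ram_;
    std::vector<std::uint16_t> cache_;
    std::vector<bool> dirty_;
};

// Fill a width x height rect starting at address 0: the CPU draws the top
// row, then DMA duplicates that row into every row below it.
void fillRect(CachedMemory& mem, std::size_t width, std::size_t height,
              std::uint16_t colour, bool flushFirst) {
    for (std::size_t x = 0; x < width; ++x) {
        mem.cpuWrite(x, colour);
    }
    if (flushFirst) {
        mem.flushRange(0, width);  // Make the top row visible to DMA
    }
    for (std::size_t y = 1; y < height; ++y) {
        mem.dmaCopy(0, y * width, width);
    }
}
```

In this model, calling fillRect() with flushFirst set to false leaves the lower rows untouched in RAM, because the DMA copies the stale top row rather than the one the CPU just wrote.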

Thinking about it, it would also be possible to scrap the two calls when drawing a filled rect - I can just replace the copy with a force, which means I don’t care if the blitter can see the latest memory state or not.

Am I using printf()? I know I’ve got a function called “printf()” in the Debug class, but I can’t remember how it works (need to look at the code). That’s the only place it would be used, I think, unless it’s crept into the TextWriter too. In any case, swapping for vsniprintf() would be easy. Will do.

Regarding STL, the vector is the only standard library class that I use. The vector class has various benefits:

  • It already exists, so using it lets you get on with more important things than recreating functionality that someone else wrote dozens of years ago;
  • It’s been optimised already;
  • It works.

However, I could swap it out for a simple gadget linked list class fairly easily. Might be interesting to see if it has any benefit.
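As a sketch of the kind of replacement being considered, here is a minimal growable array of pointers that uses neither the STL nor exceptions. It is an array rather than a linked list, and PointerArray is a hypothetical name; this is an illustration under those assumptions, not the container Woopsi adopted.

```cpp
#include <cstddef>
#include <cstdlib>

// Minimal growable array of pointers, with no STL and no exceptions.
// Hypothetical class written for illustration only.
template <typename T>
class PointerArray {
public:
    PointerArray() : data_(0), size_(0), capacity_(0) {}
    ~PointerArray() { std::free(data_); }

    // Append a pointer; returns false if memory could not be allocated.
    bool push(T* item) {
        if (size_ == capacity_) {
            std::size_t newCapacity = capacity_ ? capacity_ * 2 : 4;
            void* grown = std::realloc(data_, newCapacity * sizeof(T*));
            if (!grown) {
                return false;  // Existing contents are still intact
            }
            data_ = static_cast<T**>(grown);
            capacity_ = newCapacity;
        }
        data_[size_++] = item;
        return true;
    }

    // Remove by index directly, shuffling later entries down one slot.
    bool eraseAt(std::size_t index) {
        if (index >= size_) {
            return false;
        }
        for (std::size_t i = index + 1; i < size_; ++i) {
            data_[i - 1] = data_[i];
        }
        --size_;
        return true;
    }

    // Returns a null pointer for an out-of-range index.
    T* at(std::size_t index) const {
        return index < size_ ? data_[index] : 0;
    }

    std::size_t size() const { return size_; }

private:
    // Copying is disabled (declared, never defined) to keep buffer
    // ownership simple.
    PointerArray(const PointerArray&);
    PointerArray& operator=(const PointerArray&);

    T** data_;
    std::size_t size_;
    std::size_t capacity_;
};
```

Because it reports failure through return values and allocates with realloc(), nothing in it refers to the C++ exception or stream machinery.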

Jeff on 2008-02-26 at 20:17 said:

I think it was the DMA inside GraphicsPort that concerned me.

After all, if you are going to flush the source bitmap, why not the destination bitmap as well? That might be sitting in the cache waiting to be written, so the CPU might flush the cache after the DMA controller has copied values into it.

(i.e. I’m not seriously suggesting that, just that once you start doing things ‘just in case’, it’s a never-ending parade for negligible benefit)

As to STL, try adding the link map to your makefile

LDFLAGS := -g $(ARCH) -mno-fpu -Wl,-Map -Wl,$(OUTPUT).map

and then have a look at how much gets hauled in. I mean, who calls all that ‘stream’ stuff? My code definitely doesn’t, whereas I bet the exception paths through vector probably think they can.

Actually, I’m a bit surprised at the amount of gunk there is in there - I suspect there must be a ‘dead-code-removal’ switch missing from the link - have to pursue that further.

Jeff on 2008-02-27 at 09:16 said:

Actually, I changed my makefile to do this:

MINUSG  :=      #-g
CFLAGS  :=      $(MINUSG) -Wall -O2\
                -mcpu=arm9tdmi -mtune=arm9tdmi -fomit-frame-pointer\
                -ffunction-sections -fdata-sections \
                -ffast-math \
                $(ARCH)

#CFLAGS +=      $(INCLUDE) -DARM9 -I$(DEVKITPRO)/PAlib/include/nds
CFLAGS  +=      $(INCLUDE) -DARM9 -I$(DEVKITPRO)/libfat/include

ASFLAGS :=      $(MINUSG) $(ARCH)
#LDFLAGS        :=      $(MINUSG) $(ARCH) -mno-fpu -L$(DEVKITPRO)/PAlib/lib -Wl,-Map -Wl,$(OUTPUT).map
LDFLAGS :=      $(MINUSG) $(ARCH) -mno-fpu -Wl,-Map -Wl,$(OUTPUT).map -Wl,--gc-sections

in an attempt to get it to do dead-code-removal, and that knocked it down by less than 100K. But the linkmap shows some horrid stuff, related directly to STL.

animation.o hauls in the exception handler for bad allocations?

/Developer/NDS/devkitARM/bin/../lib/gcc/arm-eabi/4.1.1/../../../../arm-eabi/lib/libstdc++.a(functexcept.o)
                              animation.o (_ZSt17__throw_bad_allocv)

which then insists on having the standard ios failure handler

/Developer/NDS/devkitARM/bin/../lib/gcc/arm-eabi/4.1.1/../../../../arm-eabi/lib/libstdc++.a(ios_failure.o)
                              /Developer/NDS/devkitARM/bin/../lib/gcc/arm-eabi/4.1.1/../../../../arm-eabi/lib/libstdc++.a(functexcept.o) (_ZNSt8ios_base7failureD1Ev)

slidervertical.o also does something with ios_base, and that hauls in locale handling.

                              slidervertical.o (_ZNSt8ios_base4InitD1Ev)
/Developer/NDS/devkitARM/bin/../lib/gcc/arm-eabi/4.1.1/../../../../arm-eabi/lib/libstdc++.a(locale.o)
                              /Developer/NDS/devkitARM/bin/../lib/gcc/arm-eabi/4.1.1/../../../../arm-eabi/lib/libstdc++.a(ios_init.o) (_ZNSt6localeD1Ev)

Like, I’m ever going to use locale stuff in a DS binary???

Locale handling hauls in money handling.

/Developer/NDS/devkitARM/bin/../lib/gcc/arm-eabi/4.1.1/../../../../arm-eabi/lib/libstdc++.a(monetary_members.o)
                              /Developer/NDS/devkitARM/bin/../lib/gcc/arm-eabi/4.1.1/../../../../arm-eabi/lib/libstdc++.a(locale-inst.o) (_ZNSt10moneypunctIcLb0EED0Ev)

.. and time .. and punctuation .. and collating sequences

It spirals out of control amazingly. Someone even hauls in ‘cin’ - yes, C++ console stream input. Like that’s going to do anything in this environment…

Please, please, please, consider ditching <vector>

ant on 2008-02-27 at 10:26 said:

Ditching vector is already on the to-do list, don’t worry. You’ve just reinforced the fact that it definitely needs to happen.