Monday, September 25, 2006

Texture compression and register hacking

Well, today I'll talk a bit about how the DS texture compression works and what I've accomplished with it (not that much, really). It's actually quite simple, if I understood it correctly (not sure I did). A compressed texture uses one texture slot (even if not all of it) for the actual texture data, either slot 0 or slot 2 (the DS has 4 texture slots).

For every 4x4 pixel block, we read one 32-bit number from slot 0 or slot 2 that defines the actual pixel data: 8 bits per row, times 4 rows, makes 32 bits in total. So every pixel is defined by a 2-bit number, which is used later for palette indexing. We also need one more piece of data, a 16-bit value holding the palette offset and a mode that says how it will be used; that one is located in texture slot 1.

So, the general idea is quite simple in the end. Read a 32-bit value from either slot 0 or slot 2, and read a 16-bit value from slot 1. Then use the 16-bit value to determine the mode (there are 4 possible modes, each treating the block's colours in a different manner) and the palette offset. With this palette offset, go through each of the 2-bit values contained in the 32-bit number and add each of them (separately, of course, one per pixel) to the offset, to get the final palette index for each pixel. Then, with this value, you can just index into the normal palette data to get the colour. Depending on the mode, colours will (or will not) be treated in different manners. Seems it's not that simple :P
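To make the steps above a bit more concrete, here is a minimal sketch in C of the index calculation only. The bit layout is an assumption on my side (mode in the top 2 bits of the 16-bit value, palette offset in the low 14 bits counted in steps of 2 colours, row 0 in the low 8 bits of the 32-bit value), and the names `BlockInfo`, `decode_block_info` and `decode_block_indices` are made up for the example. The per-mode colour handling is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout of the 16-bit per-block value stored in slot 1:
   bits 14-15 = mode, bits 0-13 = palette offset in steps of 2 colours. */
typedef struct {
    unsigned mode;         /* 0..3 */
    uint32_t base_colour;  /* index of the block's first palette colour */
} BlockInfo;

static BlockInfo decode_block_info(uint16_t info)
{
    BlockInfo b;
    b.mode = (unsigned)(info >> 14);
    b.base_colour = (uint32_t)(info & 0x3FFFu) * 2u;
    return b;
}

/* Expand one 4x4 block into 16 palette indices, row by row.
   Assumption: row 0 is in the low 8 bits and the leftmost texel of a
   row is in the low 2 bits of that row. */
static void decode_block_indices(uint32_t texels, uint16_t info,
                                 uint32_t out[16])
{
    BlockInfo b = decode_block_info(info);
    assert(out != NULL);
    for (int i = 0; i < 16; i++) {
        uint32_t two_bits = (texels >> (2 * i)) & 3u;
        out[i] = b.base_colour + two_bits;  /* final palette index */
    }
}
```

The caller would still have to look at `mode` before fetching colours, since that is where the different treatments come in.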

My implementation is far from complete; I have severe bugs that just show lots of garbage. I'll try to fix them tomorrow.

I've also been doing some heavy register hacking. The idea is just that some memory locations are mapped to specific DS hardware. So, for example, writing to 4000490h will push a new 10b vertex to the render queue. I've hacked a few of these registers to make games/demos work better, even if it's not correct emulation.

Just a simple screenshot today:

Seems today's post was a bit complex. Anyway, have fun :)

Monday, September 18, 2006

GPU transforms

Today I'll talk about some stuff that I hate about how the DS 3D hardware works. One of the problems with DS vertex submission is that vertices are sent in fixed point, either 1.3.12 (16b per component) or 1.3.6 (10b per component). What that means is that you have signed numbers with an integer range of [-8, 8), and a fractional part with either 4096 steps per whole number (16b vertices) or 64 steps per whole number (10b vertices). The joy of ancient hardware, today.
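For reference, converting those fixed-point values to floats is just a sign extension followed by a division. This is a small sketch with helper names of my own (`fixed_to_float`, `vtx16_to_float`, `vtx10_to_float`); it only covers a single component that has already been extracted from the packed vertex.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low `bits` bits of `raw`, then scale by 2^-frac_bits. */
static float fixed_to_float(uint32_t raw, unsigned bits, unsigned frac_bits)
{
    assert(bits > 0 && bits < 32 && frac_bits < bits);
    uint32_t mask = (1u << bits) - 1u;
    int32_t v = (int32_t)(raw & mask);
    if (v & (1 << (bits - 1)))        /* sign bit set: value is negative */
        v -= (int32_t)(1u << bits);
    return (float)v / (float)(1u << frac_bits);
}

/* 1.3.12: 1 sign bit, 3 integer bits, 12 fractional bits. */
static float vtx16_to_float(uint16_t raw) { return fixed_to_float(raw, 16, 12); }

/* 1.3.6: 1 sign bit, 3 integer bits, 6 fractional bits. */
static float vtx10_to_float(uint16_t raw) { return fixed_to_float(raw, 10, 6); }
```

So 0x1000 in the 16b format and 0x040 in the 10b format both mean 1.0, and the biggest value you can express is just under 8.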

So, if you want to define objects bigger than 16 units in height/width/depth, or simply an object that is not inside this 16x16x16 cube, what has to be done? Transformations come to the rescue. Before you send one or more vertices, you just supply one or several transforms to the DS "transform unit" (it's just a matrix stack) that, for example, will double your vertex positions. Even if I find this a shitty way to work (and, more importantly, I thought this type of severely limited hardware disappeared over a decade ago), it's not the worst part.

The worst part is that these transform changes are permitted INSIDE the vertex submission list, so it's quite CPU/GPU intensive to emulate this behaviour. Probably some caching will help a lot here, at least for games that do not modify display lists while in game.
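Here is a toy version of the problem, assuming a made-up command list with only two commands (`CMD_SCALE` and `CMD_VERTEX`) and a single current matrix instead of the real matrix stack. All names are hypothetical. The point is only that the emulator has to track the current matrix while walking the list, because it can change between any two vertices.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { float m[16]; } Mat4;           /* column-major 4x4 */
typedef struct { float x, y, z; } Vec3;

typedef enum { CMD_SCALE, CMD_VERTEX } CmdType; /* hypothetical commands */
typedef struct { CmdType type; Vec3 v; } Cmd;

static Mat4 mat4_identity(void)
{
    Mat4 m = { {0} };
    m.m[0] = m.m[5] = m.m[10] = m.m[15] = 1.0f;
    return m;
}

/* cur = cur * scale(s): scales the first three columns. */
static void mat4_scale(Mat4 *m, Vec3 s)
{
    for (int r = 0; r < 4; r++) {
        m->m[0 + r] *= s.x;
        m->m[4 + r] *= s.y;
        m->m[8 + r] *= s.z;
    }
}

static Vec3 mat4_apply(const Mat4 *m, Vec3 v)
{
    Vec3 o;
    o.x = m->m[0] * v.x + m->m[4] * v.y + m->m[8]  * v.z + m->m[12];
    o.y = m->m[1] * v.x + m->m[5] * v.y + m->m[9]  * v.z + m->m[13];
    o.z = m->m[2] * v.x + m->m[6] * v.y + m->m[10] * v.z + m->m[14];
    return o;
}

/* Walk the list; every vertex is transformed by whatever matrix is
   current at that point. Returns the number of vertices written. */
static size_t run_list(const Cmd *cmds, size_t n, Vec3 *out, size_t out_cap)
{
    Mat4 cur = mat4_identity();
    size_t emitted = 0;
    assert(cmds != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        if (cmds[i].type == CMD_SCALE)
            mat4_scale(&cur, cmds[i].v);
        else if (emitted < out_cap)
            out[emitted++] = mat4_apply(&cur, cmds[i].v);
    }
    return emitted;
}
```

If a list never changes, the output of `run_list` could be cached and reused, which is the kind of caching I have in mind.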

So, my first implementation of the DS 3D gpu didn't support changing some parameters inside a "vertex submission block". As an example, I'll just show what supporting that fixed:

As usual, the left one is the one with the bug, the right one is fixed :)

Lately I've not been able to work that much on desmume, so progress is slow, but it hasn't stopped.

Tuesday, September 12, 2006

Slowdown of the day

The next version of desmume will have more accurate emulation, even if some new stuff will still be kind of hacked (for example, the 3D core uses OpenGL, even if I think a software renderer would be more accurate). The main problem with accurate emulation (or, to make it short, more stuff being emulated) is that it requires more processing power.

Today's post is about one of the features I added that will make the emulator slower in certain situations. To make it simple, I just added fading. The DS hardware can fade the backgrounds / sprites in and out, and after these backgrounds and sprites are correctly layered, there's a master brightness (more fading in/out) that affects the final mix. The problem is that this involves a multiplication and an addition per pixel. And, to make things worse, you have 3 components in a pixel (red, green, blue), and every pixel is a single 16-bit value, so some bit shifting and masking is also needed. I'll probably make it faster in the future (SIMD instructions come to mind to make it way faster), but for now, it'll only make rendering slower.
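To show where the per-pixel cost comes from, here is a minimal sketch of fading one pixel. I'm assuming 5 bits per colour component packed into the low 15 bits of the 16-bit value, and a fade factor from 0 (unchanged) to 16 (fully faded); the function name `fade_pixel` and the exact formula are illustrative and may not match the hardware bit for bit.

```c
#include <assert.h>
#include <stdint.h>

/* Fade one 15-bit colour (5 bits per component) towards white or black.
   factor: 0 = unchanged, 16 = fully faded (assumed range). */
static uint16_t fade_pixel(uint16_t colour, unsigned factor, int to_white)
{
    assert(factor <= 16);
    uint16_t out = 0;
    for (unsigned shift = 0; shift < 15; shift += 5) {
        unsigned c = (colour >> shift) & 31u;        /* extract component */
        if (to_white)
            c = c + ((31u - c) * factor) / 16u;      /* move towards 31 */
        else
            c = c - (c * factor) / 16u;              /* move towards 0 */
        out |= (uint16_t)(c << shift);               /* repack component */
    }
    return out;
}
```

That is three shift/mask/multiply/add rounds for every pixel on screen, every frame, which is why a table lookup or SIMD version looks attractive.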

Some screenshots, as usual: one at the beginning of the fade-in, and one at the end of the fade:

It looks way better at runtime; fades don't make good screenshots. More stuff coming in the next few days, if I get enough motivation to explain more stuff.

Saturday, September 09, 2006

No news is good news

I know I've been quiet for a while, but as some of you might have guessed, I (more or less) hate talking about commercial games working on an emulator. The reason for the lack of updates is that the more impressive updates only show results on commercial games.

Well, there are a lot of technical details about how I got this working, but today I'll forget about the details. I'll talk about how I got it working another day, if anyone's interested. You'll just have to keep one thing in mind while looking at these screenshots:


Seriously, I hate being rude, but: I love emulation, but as a coder, it's a tough job, and I don't want to be worried about release dates. And, more importantly, it's annoying to work on a tough problem and just see forum posts asking: "Where can I download it?". So keep in mind, these are only Work In Progress screens.

Anyway, an image (or two) is worth a thousand words:

You'll have to guess which games are those. Don't ask about framerates.