After abandoning desmume development for a few months, I've been working on it quite a lot lately. Of course, not as much as when I started working on the project, as motivation to develop desmume is rare these days (just check the official desmume CVS, and you'll see what I'm talking about). But anyway, I've made some improvements on a very specific side of the project that I've avoided working on for the last 6 months: speed.
I have to be clear: desmume's current way of handling most interrupts sucks. For example, timers, DMAs and others are checked PER emulated opcode. In an ideal world, you'd run a batch of instructions (the exact amount would be determined by some heuristics or whatever), then do the processing needed by the pending interrupt / timer handling, and then go back to running another batch of instructions. Of course, that'd be less exact than the current implementation, but it'd be WAY faster. I did something similar on my Gameboy emulator, and it gave good results. We'll see how it works on desmume, whenever I find time to implement it.
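To make the batching idea concrete, here is a minimal sketch of what such a loop could look like. It is not desmume code: `Machine`, `Timer`, `stepCpu`, `runBatch` and `advanceTimer` are hypothetical names, every opcode is assumed to cost one cycle, and capping each batch at the next timer overflow is just one possible heuristic.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical stand-ins for illustration; none of these names come from desmume.
struct Timer {
    uint32_t period;     // cycles between overflows (must be >= 1)
    uint32_t remaining;  // cycles left until the next overflow
    uint32_t overflows;  // overflows seen so far
};

struct Machine {
    uint64_t cycles = 0;
    Timer timer{1000, 1000, 0};

    // Assumption: every opcode costs one cycle. A real core would decode
    // and execute an instruction here and return its actual cost.
    uint32_t stepCpu() { return 1; }

    // Run opcodes until at least `budget` cycles have elapsed.
    uint32_t runBatch(uint32_t budget) {
        uint32_t spent = 0;
        while (spent < budget) spent += stepCpu();
        return spent;
    }

    // Catch the timer up once per batch instead of once per opcode.
    void advanceTimer(uint32_t elapsed) {
        while (elapsed >= timer.remaining) {
            elapsed -= timer.remaining;
            timer.remaining = timer.period;
            ++timer.overflows;  // an IRQ would be raised here
        }
        timer.remaining -= elapsed;
    }

    // maxBatch must be >= 1.
    void runFrame(uint64_t frameCycles, uint32_t maxBatch) {
        const uint64_t end = cycles + frameCycles;
        while (cycles < end) {
            // Do not run past the batch cap, the end of the frame,
            // or the next timer overflow.
            uint64_t budget = std::min<uint64_t>(maxBatch, end - cycles);
            budget = std::min<uint64_t>(budget, timer.remaining);
            const uint32_t spent = runBatch(static_cast<uint32_t>(budget));
            cycles += spent;
            advanceTimer(spent);
        }
    }
};
```

With a fixed batch size and no cap at the next event, the timer would fire up to one batch late, which is the loss of exactness mentioned above. Capping the batch trades some of the speed back for accuracy.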
Anyway, I'll start with two screenshots, one from before the speed ups I implemented over the past 3 days (left), and another one with them (right):
As you can see, it's a practical 33% speed up in this particular case. I must say that it affects every game I've tested by a considerable amount, the minimum being 14% (I'll explain why later). Just as a real-world example, I tested "Castlevania: Dawn of Sorrow" and it's 30% faster now... So let's talk about the speed ups.
First, desmume used/uses GDI to draw the DS screens onto the main window. GDI setup is as easy as it can get, but it seems that it's not as fast as I would wish (even more so keeping in mind that its only task is to take an image and draw it on the screen). I considered a few options, namely DDraw, OpenGL or GDI+. I excluded OpenGL at the beginning, because it would be quite a lot of work to make the current 3D core work correctly with it. So after discussing it with some friends (more exactly, hearing some DDraw bashing and why I should stop using DirectX7 features), I started working on a GDI+ blitter. In the end it was easier than I expected. Reading the documentation, implementing the blitter and a bit of adjusting was done in an hour. In fact, searching through the documentation was what took longest. The good part is that it gave a free 14% performance gain.
The rest was a matter of fixing some tidbits of how memory is accessed, avoiding a lot of conversions from floating point to integer (which are way more expensive than you'd guess), changing the way the texture cache works, and more stuff I can't remember. I won't go into more detail about this stuff, as it would take quite a few pages, which I'm not willing to write now. Maybe I'll go back to it in a future post.
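As an illustration of the float-to-integer point only (this is not the actual desmume change), here is a sketch that scales a fixed-point texture coordinate to a texel index in two ways. The names `texelViaFloat` and `texelViaFixed` and the 12 fractional bits are assumptions made for the example. On x86 compilers that target the x87 FPU, the float-to-int cast typically involves switching the rounding mode, which is one reason it tends to cost more than it looks.

```cpp
#include <cstdint>

// Assumption for this sketch: coordinates carry 12 fractional bits.
constexpr int kFracBits = 12;

// Round trip through floating point: int -> float, scale, float -> int.
// The final cast is the conversion that tends to be expensive.
inline int32_t texelViaFloat(int32_t coordFx, int32_t texSize) {
    const float c = static_cast<float>(coordFx) / static_cast<float>(1 << kFracBits);
    return static_cast<int32_t>(c * static_cast<float>(texSize));
}

// Integer-only version: widen, multiply, then shift the fraction away.
// Note: for negative inputs the shift rounds toward minus infinity on
// common compilers, while the float cast truncates toward zero, so the
// two functions only agree for non-negative coordinates.
inline int32_t texelViaFixed(int32_t coordFx, int32_t texSize) {
    return static_cast<int32_t>((static_cast<int64_t>(coordFx) * texSize) >> kFracBits);
}
```

For a coordinate of 0.5 (2048 in this format) and a 64-texel texture, both versions land on texel 32, but the second one never leaves the integer unit.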
Just as an ending, it seems that my obsession with fixing every regression bug I see is giving good results, as most games that I test have perfect 3D. Here's some proof: Hotel Dusk running perfectly, even with the motion blur effect (I rotated the image with an image editor, because I don't have screen rotation implemented):
12 comments:
Nice progress there.
I know it's hard to keep a project alive, but nor can we ask you to do something you don't want to.
I for one would love to see you continue this project.
Thank you for allowing us to experience your work
Sounds good :p
You're the best Shash! :)
Man, great work there. You have fans all around the world wishing you good work. I'm in Brazil, and I wish you all the great things of the world. Good luck.
wonderful job dude ppl are really backing u up now and seems the DS scene is picking up momentum quickly
Good Luck for future works
Speed up is very good thing to get! Keep up the good work!
Could you release your changes as a patch so we can try them out? Deving 3D stuff with OS X is a PITA right now because the 3d support isn't so great in CVS.
jevin: I'm not interested in sharing any source code atm. There are several reasons, which someday I might explain, but that's how it is now, so it'll only be a tool for personal use for the time being
Shash you are great
but you only answer insults
but when we say something good about your great job, you never say anything,
anyway, I think you are doing a great job, you are a professional, and congratulations on your own development and personal use, enjoy!
Nice improvements, thanks for sharing these shots
but you only answer insults
In fact, it's rare that I insult anyone, at least on emulation forums. I'm often harsh and not that flexible, but that's far from insulting. So better call that by its name :P
What he meant was that you mostly
answer those who are either insulting you or making a negative comment.
He never meant that you were insulting anyone as you are definitely correct on all accounts.
It was just to highlight that you rarely react upon positive feedback and kudos.
Just wanted to clarify that one out.
stephane: I'm usually more "motivated" to answer whatever I find wrong than what I already think is ok :P