I have a MacBook Pro with an Intel processor (2.33 GHz Core 2 Duo).
I tried to compile desmume with some multi-threading additions.
First, I tried to add one thread for the ARM9 and one for the ARM7, but I didn't
manage to produce a working patch, as the code is too intricate (or did I make mistakes?).
Then I tried to patch desmume-cli. I added one thread for graphics and sound (SPU and Draw)
and left the main emulation process in the main thread. It's very easy, as the code
is already well split into different functions (fewer than 10 lines to add, and a few others to move).
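
A minimal sketch of this kind of split, assuming POSIX threads: the main thread emulates and hands each finished frame to a worker that does the drawing and sound. All names here (OutputWorker, emulate_one_frame, present_frame, run_frames) are hypothetical placeholders, not functions from the desmume source.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names come from the desmume source. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool frame_ready;      /* set by the main thread, cleared by the worker */
    bool quit;             /* set once by the main thread at shutdown */
    int  frames_presented; /* written only by the worker thread */
} OutputWorker;

static void emulate_one_frame(void) { /* placeholder: CPU and memory emulation */ }

static void present_frame(OutputWorker *w) {
    /* placeholder: screen drawing and sound output would go here */
    w->frames_presented++;
}

static void *output_thread(void *arg) {
    OutputWorker *w = arg;
    pthread_mutex_lock(&w->lock);
    for (;;) {
        while (!w->frame_ready && !w->quit)
            pthread_cond_wait(&w->cond, &w->lock);
        if (!w->frame_ready)
            break;                          /* quit requested, nothing pending */
        w->frame_ready = false;             /* take the frame */
        pthread_cond_broadcast(&w->cond);   /* main may post the next one */
        pthread_mutex_unlock(&w->lock);
        present_frame(w);                   /* runs while main emulates */
        pthread_mutex_lock(&w->lock);
    }
    pthread_mutex_unlock(&w->lock);
    return NULL;
}

/* Emulates n frames on the calling thread; returns how many were presented. */
int run_frames(int n) {
    OutputWorker w = { .frame_ready = false, .quit = false, .frames_presented = 0 };
    pthread_t tid;
    assert(n >= 0);
    pthread_mutex_init(&w.lock, NULL);
    pthread_cond_init(&w.cond, NULL);
    pthread_create(&tid, NULL, output_thread, &w);
    for (int i = 0; i < n; i++) {
        emulate_one_frame();
        pthread_mutex_lock(&w.lock);
        while (w.frame_ready)               /* previous frame not taken yet */
            pthread_cond_wait(&w.cond, &w.lock);
        w.frame_ready = true;
        pthread_cond_broadcast(&w.cond);
        pthread_mutex_unlock(&w.lock);
    }
    pthread_mutex_lock(&w.lock);
    w.quit = true;
    pthread_cond_broadcast(&w.cond);
    pthread_mutex_unlock(&w.lock);
    pthread_join(tid, NULL);
    pthread_cond_destroy(&w.cond);
    pthread_mutex_destroy(&w.lock);
    return w.frames_presented;
}
```

The hand-off is a single-slot mailbox, so the main thread can run at most one frame ahead of the worker. Nothing in this sketch copies the frame data, so the worker would still see whatever the emulation is writing at that moment.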
For Bomberman I was at ~42 fps, and with the thread I get ~54 fps, but only 115% CPU usage
(remember that I have two cores, so I could theoretically reach 200%). I have a few graphical
glitches that should be easy to remove by adding some simple buffering code for the screen.
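
One way such screen buffering could look is a double buffer: the emulation draws into a back buffer and swaps it in only when the frame is complete, so the display thread never reads a half-drawn frame. This is a sketch under assumptions: the ScreenBuffers type and the screen_* functions are hypothetical, and the 16-bit pixel format is a guess at a convenient representation, not what desmume necessarily uses.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <string.h>

#define SCREEN_PIXELS (256 * 192 * 2)   /* two 256x192 DS screens */

/* Hypothetical helper type; not taken from the desmume source. */
typedef struct {
    uint16_t        buf[2][SCREEN_PIXELS];
    int             front;   /* index of the buffer the display may read */
    pthread_mutex_t lock;
} ScreenBuffers;

void screen_init(ScreenBuffers *s) {
    memset(s->buf, 0, sizeof s->buf);
    s->front = 0;
    pthread_mutex_init(&s->lock, NULL);
}

/* Emulation thread: the buffer to draw the next frame into. */
uint16_t *screen_back(ScreenBuffers *s) {
    return s->buf[1 - s->front];   /* only the emulation thread changes front */
}

/* Emulation thread: call once the back buffer holds a complete frame. */
void screen_publish(ScreenBuffers *s) {
    pthread_mutex_lock(&s->lock);
    s->front = 1 - s->front;
    pthread_mutex_unlock(&s->lock);
}

/* Display thread: copy out the last complete frame. */
void screen_copy_front(ScreenBuffers *s, uint16_t *dst) {
    assert(dst != NULL);
    pthread_mutex_lock(&s->lock);
    memcpy(dst, s->buf[s->front], sizeof s->buf[0]);
    pthread_mutex_unlock(&s->lock);
}
```

The copy is done under the lock, so a swap cannot happen in the middle of it. The price is one extra frame-sized copy per displayed frame.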
So, if I could split the core emulation, I could get better speed, but I don't know how...
Offline
Does your new threading code introduce new bugs? If not, feel free to send a patch, and whoever is able to test it will commit it to CVS.
In my humble opinion, it's a bit early to add threading, as it'll probably make the code a mess, especially keeping in mind that there's so much basic stuff still left to do.
Offline
A while back I did some timings on Desmume (before the 3D core went in) and found the following to be a common split between where it spends its time (dependent upon the .nds file).
40% ARM9
40% 2D Graphics emulation (rendering)
20% other stuff
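
As a rough way to read these numbers: if the work is spread over threads that overlap perfectly, the frame time cannot drop below the busiest thread's share, so the best possible speed-up is 1 divided by that share. The helper below is only an illustration of that bound (the function name is made up, and it ignores all synchronisation cost).

```c
#include <assert.h>
#include <stddef.h>

/* Upper bound on the speed-up when each entry of thread_load is the fraction
 * of the single-threaded frame time assigned to one thread, assuming the
 * threads overlap perfectly and synchronisation is free. */
double best_case_speedup(const double *thread_load, size_t n_threads) {
    double longest = 0.0;
    assert(thread_load != NULL && n_threads > 0);
    for (size_t i = 0; i < n_threads; i++)
        if (thread_load[i] > longest)
            longest = thread_load[i];
    assert(longest > 0.0);
    return 1.0 / longest;
}
```

With the split above, moving only the 2D rendering to a second thread gives loads of 0.40 and 0.60, so at best about 1.67x. Three threads at 0.40, 0.40 and 0.20 would give at best 2.5x. Real gains would be lower because the threads have to wait for each other.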
Multi-threading is something that is definitely worth doing at some point, especially with 2- and 4-core CPUs becoming more popular. Maybe splitting Desmume by threads could be something like:
1. 2D graphics emulation
2. 3D graphics emulation
3. ARMs
4. Other stuff
Also a while back, I did a quick test of implementing the 2D graphics emulation in another thread and it did produce a good speed improvement. Originally it was coordinated on a line-by-line basis. Although this produced correct render results, it did not produce a good speed-up: there was little overlap between the graphics thread and the other-stuff thread, which spent a lot of time waiting for the graphics thread to complete before sending it another line to render.
Next I rendered the entire screen in one go in the graphics thread, coordinating only once per frame. This gave a good improvement in speed but produced render errors, for example graphics changing halfway through the render.
From this, I feel that some of Desmume will need to be redesigned in order to get good performance and correct emulation when multi threading.
Maybe keeping two copies of the graphics render state would help: one for the current screen render and another for the ARMs to update. But this would add the complication of keeping the states coordinated and the overhead of copying the ARMs' state to the render state.
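
To make the two-copies idea concrete, here is a small sketch: the ARM side keeps writing to a live copy, and a latched copy is frozen at a chosen point for the renderer to read. The RenderState fields and the gpu_* names are hypothetical and heavily reduced; the real 2D core tracks far more state than this.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, heavily reduced render state. */
typedef struct {
    uint16_t scroll_x;
    uint16_t scroll_y;
    uint16_t palette[256];
} RenderState;

typedef struct {
    RenderState     live;      /* updated by the ARM emulation thread */
    RenderState     latched;   /* frozen copy read by the render thread */
    pthread_mutex_t lock;
} GpuShared;

void gpu_init(GpuShared *g) {
    memset(&g->live, 0, sizeof g->live);
    memset(&g->latched, 0, sizeof g->latched);
    pthread_mutex_init(&g->lock, NULL);
}

/* Emulation thread: freeze the current state for the renderer. */
void gpu_latch(GpuShared *g) {
    pthread_mutex_lock(&g->lock);
    g->latched = g->live;
    pthread_mutex_unlock(&g->lock);
}

/* Render thread: fetch a private copy of the frozen state. */
void gpu_fetch(GpuShared *g, RenderState *out) {
    assert(out != NULL);
    pthread_mutex_lock(&g->lock);
    *out = g->latched;
    pthread_mutex_unlock(&g->lock);
}
```

Writes to the live copy after a latch do not reach the renderer until the next latch, which is the point of the scheme. The struct copy in gpu_latch is the copying overhead mentioned above. Latching once per frame would presumably miss register changes a game makes on purpose in the middle of a frame; latching once per line would catch them at a higher copy cost.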
Splitting the ARM9 and ARM7 may not be of much benefit, as in most cases the ARM7 does not do much and there could be problems coordinating the ARMs' execution. A better split may be separating the ARM memory access from the instruction execution.
These are just a couple of ideas with my main point being that it will require some thought and design to get right.
Just for information, the GDB stubs, if active, run in their own threads. This was done to make them easier to write (they sit blocked on a couple of sockets waiting for messages) rather than for performance, so they will hardly stretch a Core 2 Duo.
Offline
A while back I did some timings on Desmume (before the 3D core went in) and found the following to be a common split between where it spends its time (dependent upon the .nds file).
40% ARM9
40% 2D Graphics emulation (rendering)
20% other stuff
This has changed quite a lot, if my profiler isn't failing. Due to the changes to the 2D core, it isn't so much of a bottleneck any more (I think it went down to 5-15% tops the last time I profiled, which was a bit before 0.7.1).
Offline
Does your new threading code introduce new bugs? If not, feel free to send a patch, and whoever is able to test it will commit it to CVS.
In my humble opinion, it's a bit early to add threading, as it'll probably make the code a mess, especially keeping in mind that there's so much basic stuff still left to do.
Yes. It is not perfect now and it would add some complexity to some "unfinished" code...
But it is important to keep it in mind so that threading can be added easily later!
Offline
Yes. It is not perfect now and it would add some complexity to some "unfinished" code...
Yep, that's my only concern; nothing is more annoying than modifying optimized code.
But it is important to keep it in mind so that threading can be added easily later!
I couldn't agree more.
Offline
I had my own multi-threaded version of DeSmuME on my computer, and I even tried to build a second one after the data got corrupted for some Micro$ofty reason... It gave something like a 40% increase in total speed.
So, if I could split the core emulation, I could get better speed, but I don't know how...
Though I don't think it would be a problem (I think it'd even be easier), programming something similar for Linux wouldn't be easy. My old program used 64-bit Windows' logical_processor API, and as far as multi-core goes, it sometimes failed and ran everything on the same core. If only I knew the correct syscalls, I might be able to help in this area. (After all, this IS the initial question.)
In the meantime, I've been experimenting with a VERY VERY simple dynarec system. Not very helpful.
Last edited by XTra KrazzY (2007-07-17 22:53:33)
If you are reading this signature, you SERIOUSLY need to get a life.
Offline
XTra, if I'm not asking too much, could you make a patch file of your multi-thread implementation, but for x86 instead of x64?
I just wanted to see how it would perform here.
If you're not OK with that, don't worry, I'll understand.
Athlon 64 X2 3800+ / 2Gb DDR2 800Mhz
Geforce 8600GT 256MB / Windows XP PRO SP3
http://desmume.org/compatibility-list/
Offline
x86 eh? Most of the speed came from the 64-bit effectiveness of the registers.
Even if you had an x64 machine you wouldn't notice any difference, what with the new 2D engine optimizations (which I haven't tried, nor adapted my version to).
Plus, I don't have the first version, nor the second, because I bought a new hard drive.
Sorry (I guess)... I'll build something as soon as I have the time.
Offline
Oh, don't bother, I was just curious.
Offline