#1 2010-04-26 04:27:39

lig
Member
Registered: 2010-04-24
Posts: 20

Linux num_cores command line argument

For anyone who cares:

When you type in --num_cores=2 the argument gets parsed in commandline.cpp. Line 121 says:

if(_num_cores != -1) CommonSettings.num_cores = _num_cores;

Yet when SoftRast gets initialized, in rasterize.cpp, CommonSettings.num_cores == 1. Manually setting this to 2 does not, however, make desmume use 2 cores according to System Monitor.
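
For reference, the pattern on that line (an option that defaults to a -1 sentinel and only overrides the setting when the user actually supplied a value) can be sketched roughly as below. This is a simplified illustration, not desmume's real parser: `Settings`, `parseNumCores`, and `applyNumCores` are hypothetical names, the cap of 64 is arbitrary, and only the `--num-cores=N` spelling is recognized.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for CommonSettings; only the field discussed here.
struct Settings {
    int num_cores = 1;
};

// Returns the value of a "--num-cores=N" argument, or -1 (the sentinel)
// when the option is absent or malformed.
int parseNumCores(int argc, const char* const argv[]) {
    static const char prefix[] = "--num-cores=";
    const std::size_t prefixLen = sizeof(prefix) - 1;
    int result = -1;
    for (int i = 1; i < argc; ++i) {
        if (std::strncmp(argv[i], prefix, prefixLen) != 0) continue;
        const char* digits = argv[i] + prefixLen;
        char* end = nullptr;
        const long value = std::strtol(digits, &end, 10);
        // Accept only a fully numeric, positive value (64 is an arbitrary cap).
        if (end != digits && *end == '\0' && value > 0 && value <= 64)
            result = static_cast<int>(value);
    }
    return result;
}

// Mirrors the quoted line: keep the default unless a value was given.
void applyNumCores(int parsed, Settings& settings) {
    assert(parsed == -1 || parsed > 0);
    if (parsed != -1) settings.num_cores = parsed;
}
```

In this sketch a misspelled option such as `--num_cores=2` is simply ignored, so the setting silently stays at its default of 1.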

#2 2010-04-26 05:49:05

zeromus
Radical Ninja
Registered: 2009-01-05
Posts: 4,995

Re: Linux num_cores command line argument

Yeah, the FAQ or manual is wrong. The Linux port accepts --num-cores but doesn't actually use it, because some of the multicore code is only stubbed out in the Linux port. Sorry. If you've made it that far, you may want to finish it: uncomment the pthreads code in utils/task.cpp, verify that it works, and fix whatever problems you run into.

Also, if you're using the CLI port, that command-line code doesn't actually get used (the CLI doesn't use the common command-line parsing yet), so you'll need to attend to that.

#3 2010-04-30 22:54:01

lig
Member
Registered: 2010-04-24
Posts: 20

Re: Linux num_cores command line argument

It's strange that desmume doesn't complain when you type --num_cores instead of --num-cores, but I figured out today, after reading the FAQ again, that --num-cores=2 does set CommonSettings.num_cores to 2. Most of the time it does complain, for example if you type --num-core or make some other mistake.

Last edited by lig (2010-04-30 23:01:17)

#4 2010-05-01 05:25:34

lig
Member
Registered: 2010-04-24
Posts: 20

Re: Linux num_cores command line argument

I uncommented the pthreads code in utils/task.cpp and removed the line:

if(!initialized) init( );

which seems to be a win32 code fragment.

Uncommenting this code does not help; in fact, I think it actually hurts performance a little. I can't prove it, but it seems like the threads wait at the locks and then just run one at a time when they're not waiting. I'm going to try to figure this one out, but it could take a long time.

#5 2010-05-02 09:10:49

kouteiheika
Member
Registered: 2010-05-02
Posts: 5

Re: Linux num_cores command line argument

I'm not sure if you still need it (I haven't checked desmume's SVN for some time), but I fixed the pthreads code in task.cpp a few months back and never got around to submitting it.

http://pastebin.com/ktDuPRkA

Just copy and paste everything after the #else.

#6 2010-05-02 14:02:23

lig
Member
Registered: 2010-04-24
Posts: 20

Re: Linux num_cores command line argument

Is the multi-threading supposed to accomplish something other than running multiple threads at a time (like responsiveness or something)? Long story short I think I figured out that your code only runs one thread at a time. I modified your code a little bit:

void Task::Impl::taskProc()
{
    for( ;; )
    {
        if( killed ) break;
        pthread_t tid = pthread_self( );
        pthread_mutex_lock( &work_mutex );
        pthread_cond_wait( &work_incoming, &work_mutex );

        printf("thread# %u has started work\n", (unsigned int) tid);
        fflush(stdout);

        param = work(param);

        printf("thread# %u has finished work\n", (unsigned int) tid);
        fflush(stdout);

        pending = false;
        pthread_cond_signal( &work_done );
        pthread_mutex_unlock( &work_mutex );
    }
}

/**************************
end code
**************************/

and the output is:

thread# 2912476016 has started work
thread# 2912476016 has finished work
thread# 2920868720 has started work
thread# 2920868720 has finished work
thread# 2912476016 has started work
thread# 2912476016 has finished work
thread# 2920868720 has started work
thread# 2920868720 has finished work
thread# 2920868720 has started work
thread# 2920868720 has finished work
thread# 2912476016 has started work
thread# 2912476016 has finished work

/**************************
end output
**************************/

My system monitor does claim a 10% increase in CPU usage and there is one weird spot in the output:

thread# 2920868720 has started work
thread# 2920868720 has finished work
thread# 2920868720 has started work
thread# 2920868720 has finished work

#7 2010-05-02 19:06:44

zeromus
Radical Ninja
Registered: 2009-01-05
Posts: 4,995

Re: Linux num_cores command line argument

That code doesn't even maintain a global resource to coordinate multiple threads. If you're not getting any concurrency, then something else is going wrong, such as using the OpenGL core instead of the soft rasterizer (which is the only thing that supports multiple threads), or initializing the 3D core before num_cores gets set.

#8 2010-05-02 20:46:13

kouteiheika
Member
Registered: 2010-05-02
Posts: 5

Re: Linux num_cores command line argument

lig wrote:

Long story short I think I figured out that your code only runs one thread at a time.

Does it? I was pretty sure it was running concurrently the last time I tried it. Oh well, maybe not. <; Rendering doesn't take *that* much time anyway; what this emulator needs is a recompiler. Anyway, since I've managed to make every game more-or-less playable on my 2 GHz C2D by rewriting the main loop in the CLI frontend to include some frame skipping magic, I stopped really caring. (If you want it and don't mind a CLI frontend then I can share it, although it ain't pretty since it was a pretty quick hack.)

#9 2010-05-02 21:17:56

lig
Member
Registered: 2010-04-24
Posts: 20

Re: Linux num_cores command line argument

Zeromus, which code are you talking about: the little bit I wrote to test kouteiheika's code, or the one kouteiheika posted on pastebin? The code that kouteiheika wrote is just a cleaned-up version (that doesn't crash) of what you get when you check out from the SVN.

I'm fairly certain that only one thread is running at a time. When the first thread comes in to run taskProc, it takes the lock that guards the work function. When the second thread comes in, it gets stopped at the lock until the first thread unlocks it once its work is completed, and so on.

I'm not using the opengl core by the way.

Which function or set of functions initializes the 3d core? I can tell you that CommonSettings.num_cores is set to 2 at the beginning of SoftRastInit(void) before the threads fork.

Here is something else that I find interesting: when you get down to it, the function being executed by the threads is mainLoop(SoftRasterizerEngine* const engine), and there's a for loop in that function:

for(int i = 0;i < engine->clippedPolyCounter;i++)

which is where I figure most of the work gets done. Now if you change it to:

for(int i = 0;i < engine->clippedPolyCounter / 2;i++)

to try to simulate what it might be like running 2 threads at a time, it doesn't really help performance that much. That makes me believe that something else should be done concurrently instead of the soft rasterizer.
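
For comparison, actually splitting that loop between workers (rather than dropping half of the iterations) could look like the sketch below. It rests on assumptions: `polysForWorker` is a hypothetical helper, and the interleaved split is just one simple scheme, not necessarily how rasterize.cpp divides its work.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: indices of the polygons that worker `id` (0-based)
// would handle when `polyCount` polygons are interleaved across `numWorkers`.
std::vector<int> polysForWorker(int polyCount, int id, int numWorkers) {
    assert(numWorkers > 0 && id >= 0 && id < numWorkers);
    std::vector<int> mine;
    // Worker 0 takes 0, n, 2n, ...; worker 1 takes 1, n+1, 2n+1, ...; and so on.
    for (int i = id; i < polyCount; i += numWorkers)
        mine.push_back(i);
    return mine;
}
```

With two workers, each one walks about half of the polygons, which is the situation the divide-by-two experiment approximates for a single thread.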

Last edited by lig (2010-05-02 21:18:51)

#10 2010-05-02 23:59:13

lig
Member
Registered: 2010-04-24
Posts: 20

Re: Linux num_cores command line argument

kouteiheika, can I see the code you wrote for frame skipping with the CLI frontend?

#11 2010-05-03 00:40:19

zeromus
Radical Ninja
Registered: 2009-01-05
Posts: 4,995

Re: Linux num_cores command line argument

I'm talking about all that code. Everyone's code. There is one mutex per task instance; there's no exclusion between tasks.
If only one thread runs at a time, then you're not going to see any speedup in the rasterizer, whether it's 4 or 2.
How much it helps depends on the game.
I am open to suggestions for what else can run concurrently, but I will probably shoot them all down. "Instead of" is a poor phrase to use. I KNOW the multithreaded rasterizer speeds up some games. It's staying.
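
To illustrate the point about per-instance mutexes, here is a minimal sketch of a worker object that owns its own thread, mutex, and condition variables. It is not the code from utils/task.cpp or from the pastebin: it uses C++11 `std::thread` rather than raw pthreads, and `Worker`, `rendezvous`, and `tasksOverlap` are hypothetical names chosen for this example.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical worker: one thread, one mutex, one pair of condition
// variables per instance. Nothing is shared between two Worker objects.
class Worker {
public:
    typedef void* (*WorkFn)(void*);

    Worker() : thread_(&Worker::loop, this) {}
    ~Worker() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            killed_ = true;
        }
        incoming_.notify_one();
        thread_.join();
    }

    // Hand a job to the worker thread and return immediately.
    void execute(WorkFn fn, void* param) {
        std::lock_guard<std::mutex> lock(mutex_);
        assert(!pending_);
        fn_ = fn;
        param_ = param;
        pending_ = true;
        incoming_.notify_one();
    }

    // Block until the current job is done and return its result.
    void* finish() {
        std::unique_lock<std::mutex> lock(mutex_);
        done_.wait(lock, [this] { return !pending_; });
        return param_;
    }

private:
    void loop() {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;) {
            // Predicate form: safe against spurious and early wakeups.
            incoming_.wait(lock, [this] { return pending_ || killed_; });
            if (killed_) return;
            WorkFn fn = fn_;
            void* param = param_;
            lock.unlock();  // the job itself runs without the lock held
            void* result = fn(param);
            lock.lock();
            param_ = result;
            pending_ = false;
            done_.notify_one();
        }
    }

    std::mutex mutex_;
    std::condition_variable incoming_, done_;
    WorkFn fn_ = nullptr;
    void* param_ = nullptr;
    bool pending_ = false;
    bool killed_ = false;
    std::thread thread_;  // declared last so the members above exist first
};

std::atomic<int> arrived(0);

// Each job checks in, then waits (up to 2 s) for the other job to check in.
// That can only succeed if both jobs are in progress at the same time.
void* rendezvous(void*) {
    arrived.fetch_add(1);
    const auto deadline =
        std::chrono::steady_clock::now() + std::chrono::seconds(2);
    while (arrived.load() < 2) {
        if (std::chrono::steady_clock::now() > deadline) return nullptr;
        std::this_thread::yield();
    }
    return &arrived;  // any non-null pointer means "saw the other job"
}

bool tasksOverlap() {
    arrived = 0;
    Worker a, b;
    a.execute(rendezvous, nullptr);
    b.execute(rendezvous, nullptr);
    const bool aSaw = a.finish() != nullptr;
    const bool bSaw = b.finish() != nullptr;
    return aSaw && bSaw;
}
```

`tasksOverlap()` returns true only when both jobs are in flight at once, since each job waits for the other to check in. The predicate form of `wait` also guards against a signal that arrives before the thread starts waiting, which a bare `pthread_cond_wait` call does not; whether that matters for the code discussed in this thread is not established here.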

#12 2010-05-03 03:22:15

lig
Member
Registered: 2010-04-24
Posts: 20

Re: Linux num_cores command line argument

"Instead of" was a poor choice of words. I didn't mean get rid of it or replace it. I was just doing experiments and meant to say that the benefits were slight. I have no idea what else could be run concurrently; I would not even know where to start.

#13 2010-05-03 04:04:45

zeromus
Radical Ninja
Registered: 2009-01-05
Posts: 4,995

Re: Linux num_cores command line argument

It will fairly routinely give me a 15% speedup in Windows with a quad-core system.
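
As a rough way to read that number, Amdahl's law relates an overall speedup to the fraction of run time that is parallelized. The sketch below is only an estimate under idealized assumptions (the parallel part scales perfectly across all cores, with no threading overhead); `amdahlSpeedup` and `parallelFraction` are hypothetical helper names, and nothing here is a measurement of desmume.

```cpp
#include <cassert>

// Amdahl's law: overall speedup when a fraction p of the run time
// is spread perfectly over n cores and the rest stays serial.
double amdahlSpeedup(double p, int n) {
    assert(p >= 0.0 && p <= 1.0 && n >= 1);
    return 1.0 / ((1.0 - p) + p / n);
}

// Inverse: the parallel fraction p implied by an observed speedup on n cores.
double parallelFraction(double speedup, int n) {
    assert(speedup >= 1.0 && n >= 2);
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / n);
}
```

Under those assumptions, a 1.15x speedup on four cores corresponds to p = 4/23, or about 17% of the frame time, and the same p on two cores would give about 1.10x. That is consistent with the benefit being real but modest.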

#14 2010-05-03 07:43:10

kouteiheika
Member
Registered: 2010-05-02
Posts: 5

Re: Linux num_cores command line argument

Here: http://pastebin.com/t7571HiE

The interesting parts are: 1) I nuked SDL_HWSURFACE since you *should not* use it with OpenGL (it's a flag for plain SDL, and IIRC I got a pretty big performance hit when it was there); 2) I've changed the texture filter to GL_NEAREST, since I think it looks better this way; 3) you have the rewritten main loop from the '#define FPS' onwards.

As I've said, this code is quite messy and not submission quality, and I've nuked some GDB stubs along the way, since they were an eyesore and this was supposed to be only a quick, personal hack to play some Harvest Moon. <; That said, it would be quite easy to make a proper patch from it.

#15 2010-05-03 16:03:37

lig
Member
Registered: 2010-04-24
Posts: 20

Re: Linux num_cores command line argument

It didn't compile for me. I got this error:

main.cpp: In function 'void desmume_cycle(int*, int*, configured_features*)':
main.cpp:387: error: cannot convert 'short unsigned int*' to 'ctrls_event_config*' for argument '2' to 'void process_ctrls_event(SDL_Event&, ctrls_event_config*)'

#16 2010-05-03 22:18:23

kouteiheika
Member
Registered: 2010-05-02
Posts: 5

Re: Linux num_cores command line argument

lig wrote:

It didn't compile for me. I got this error:

main.cpp: In function 'void desmume_cycle(int*, int*, configured_features*)':
main.cpp:387: error: cannot convert 'short unsigned int*' to 'ctrls_event_config*' for argument '2' to 'void process_ctrls_event(SDL_Event&, ctrls_event_config*)'

No wonder, since my main.cpp is from an older SVN snapshot. I don't have time to dabble in this for now, but you might want to copy the whole 'desmume_cycle' function from the current main.cpp into my main.cpp; that might fix it.

#17 2010-05-17 04:35:11

lig
Member
Registered: 2010-04-24
Posts: 20

Re: Linux num_cores command line argument

Kouteiheika, I got your frame skipping code to work on the SVN checkout I have, and I must say bravo! Works perfectly. This IS the way things should be. If you ever read this message, I just want to say thanks.

#18 2010-05-17 04:59:29

lig
Member
Registered: 2010-04-24
Posts: 20

Re: Linux num_cores command line argument

Never mind, it's not the perfect solution. It needs some adjustments, because sometimes all it does is skip frames, making games unplayable.

#19 2010-05-17 06:05:24

lig
Member
Registered: 2010-04-24
Posts: 20

Re: Linux num_cores command line argument

If you take your code for frame skipping and apply it to the GTK interface instead of the CLI, it will limit the number of frames skipped so that it doesn't just skip frames and make games unplayable. This is the best I have seen this emulator run.

#20 2010-05-18 21:04:20

kouteiheika
Member
Registered: 2010-05-02
Posts: 5

Re: Linux num_cores command line argument

lig wrote:

Never mind, it's not the perfect solution. It needs some adjustments, because sometimes all it does is skip frames, making games unplayable.

Yep, I think that's pretty much a corner case which I didn't really handle. On my 2 GHz C2D I don't get that behavior at all (well, it *only* happens when I'm saving my game in Pokemon), so I assume you must have something more modest. (Or you just built an inefficient binary; profile-guided optimization and '-march=native -mtune=native' are a must.)

As an additional explanation: my main loop always enforces 60 frames per second, and it only skips as many frames as necessary to achieve that. The problem arises when that is impossible, i.e. when even a skipped frame takes more than about 16.7 ms (1/60 s): the loop stops drawing any frames at all and skips all the time. As I've said, I don't get that on my hardware, so I didn't bother to handle it. Still, if you *do* get that behavior then you are basically screwed, since even with frame skipping voodoo it's impossible to run a game at full speed on your hardware. (:

(This emulator desperately needs a recompiler.)
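
The pacing logic described above, combined with the cap on skipped frames mentioned earlier for the GTK frontend, can be sketched roughly as follows. This is not the pastebin code: `FramePacer`, `shouldDraw`, and the cap value are hypothetical, and times are passed in as plain milliseconds so the decision logic stays independent of SDL or any timer API.

```cpp
#include <cassert>

// Hypothetical frame pacer: all names and the cap value are illustrative.
struct FramePacer {
    double frameMs;           // time budget per frame, e.g. 1000.0 / 60
    int maxConsecutiveSkips;  // upper bound on skipped frames in a row
    double nextDeadlineMs;    // when the frame about to be handled is due
    int skipsInARow;

    FramePacer(double fps, int maxSkips)
        : frameMs(1000.0 / fps), maxConsecutiveSkips(maxSkips),
          nextDeadlineMs(0.0), skipsInARow(0) {
        assert(fps > 0.0 && maxSkips >= 0);
    }

    // Call once per emulated frame with the current time. Returns true if
    // the frame should be drawn, false if only emulation should run.
    bool shouldDraw(double nowMs) {
        const bool behind = nowMs > nextDeadlineMs;
        nextDeadlineMs += frameMs;
        if (behind && skipsInARow < maxConsecutiveSkips) {
            ++skipsInARow;
            return false;
        }
        skipsInARow = 0;
        // Cap reached while still behind: draw anyway and stop trying to
        // catch up, so the game slows down instead of never drawing.
        if (behind && nextDeadlineMs < nowMs) nextDeadlineMs = nowMs;
        return true;
    }
};
```

A real main loop would also sleep when it is ahead of the deadline. With a cap of 2, a machine that is permanently too slow still draws every third frame, at the cost of running below full speed.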
