I have been trying out DeSmuME, compiling the newest 0.9.11 trunk version as well as the 0.9.10 version from sources. I have devkitPro's devkitARM tools and all the libnds stuff (devkitARM_r43-i686-linux, libnds-1.5.9, default_arm7-0.5.26, libfat-nds-1.0.13, dswifi-0.3.16, maxmod-nds-1.0.9, libfilesystem-0.9.11, nds-examples-2014-04-01). I compiled DeSmuME like this, just in case it makes a difference:
CFLAGS='-O2 -march=native' CXXFLAGS=$CFLAGS ./configure --enable-gdb-stub ; make ; sudo make install
I'm able to compile and run the libnds examples just fine, but DeSmuME's GDB stub functionality seems to be a bit less sturdy than I'd like. For example, using /examples/Graphics/Sprites/allocation_test, I do this
In a shell, let's call it Shell 1, I start desmume-cli
$ desmume-cli --arm9gdb=20000 allocation_test.nds
Failed to set format: Invalid argument
Microphone init failed.
DeSmuME 0.9.10 svn0 dev+ x86-JIT
Created stub on port 20000
SoftRast Initialized with cores=2
ROM game code: ####
ROM developer: Homebrew
Slot1 auto-selected device type: Retail MC+ROM
Slot2 auto-selected device type: PassME (0x08)
CPU mode: Interpreter
UNSTALL
The DeSmuME SDL window opens. Then in another shell, let's call it Shell 2, I start the GDB from the devkitARM toolset and tell it to connect to the DeSmuME GDB stub on port 20000
$ arm-none-eabi-gdb allocation_test.elf
GNU gdb (GDB) 7.7.1
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "--host=i686-pc-linux-gnu --target=arm-none-eabi".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from allocation_test.elf...done.
(gdb) target remote :20000
Remote debugging using :20000
0x02000000 in _start ()
After giving the command "target remote :20000", the following lines are printed in Shell 1:
Processing packet q
Processing packet H
Processing packet q
Processing packet ?
Processing packet H
Processing packet q
Processing packet q
Processing packet q
Processing packet g
'g' command PC = 02000000
Processing packet m
Processing packet q
Then in Shell 2, I set GDB to automatically show the instruction at the current PC position, and I start stepping instructions with the SI command. I only give the SI command once and after that I repeat it by just pressing Return repeatedly.
(gdb) display/i $pc
1: x/i $pc
=> 0x2000000 <_start>: mov r0, #67108864 ; 0x4000000
(gdb) si
0x02000000 in _start ()
1: x/i $pc
=> 0x2000000 <_start>: mov r0, #67108864 ; 0x4000000
(gdb)
0x02000004 in _start ()
1: x/i $pc
=> 0x2000004 <_start+4>: str r0, [r0, #520] ; 0x208
(gdb)
0x02000008 in _start ()
1: x/i $pc
=> 0x2000008 <_start+8>: mov r0, #19
(gdb)
0x0200000c in _start ()
1: x/i $pc
=> 0x200000c <_start+12>: msr CPSR_fc, r0
(gdb)
0x02000010 in _start ()
1: x/i $pc
=> 0x2000010 <_start+16>: mov r1, #50331648 ; 0x3000000
(gdb)
0x02000014 in _start ()
1: x/i $pc
=> 0x2000014 <_start+20>: sub r1, r1, #4096 ; 0x1000
(gdb)
0x02000018 in _start ()
1: x/i $pc
=> 0x2000018 <_start+24>: mov sp, r1
This results in the following output in Shell 1
Processing packet m
Processing packet v
Processing packet H
Processing packet s
Stepping instruction at 02000000
UNSTALL
Step watch: waiting for 02000000 at 02000000
Step hit -> 02000000
UNSTALL
Break from Emulation
Processing packet g
'g' command PC = 02000000
Processing packet m
Processing packet m
Processing packet s
Stepping instruction at 02000004
UNSTALL
Step watch: waiting for 02000004 at 02000004
Step hit -> 02000004
UNSTALL
Break from Emulation
Processing packet g
'g' command PC = 02000004
Processing packet m
Processing packet m
Processing packet m
Processing packet s
Stepping instruction at 02000008
UNSTALL
Step watch: waiting for 02000008 at 02000008
Step hit -> 02000008
UNSTALL
Break from Emulation
Processing packet g
'g' command PC = 02000008
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet s
Stepping instruction at 0200000c
UNSTALL
Step watch: waiting for 0200000c at 0200000c
Step hit -> 0200000c
UNSTALL
Break from Emulation
Processing packet g
'g' command PC = 0200000c
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet s
Stepping instruction at 02000010
UNSTALL
Step watch: waiting for 02000010 at 02000010
Step hit -> 02000010
UNSTALL
Break from Emulation
Processing packet g
'g' command PC = 02000010
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet s
Stepping instruction at 02000014
UNSTALL
Step watch: waiting for 02000014 at 02000014
Step hit -> 02000014
UNSTALL
Break from Emulation
Processing packet g
'g' command PC = 02000014
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet s
Stepping instruction at 02000018
UNSTALL
Step watch: waiting for 02000018 at 02000018
Step hit -> 02000018
UNSTALL
Break from Emulation
Processing packet g
'g' command PC = 02000018
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
...and so on. So far so good. BUT the problem comes when I just press and hold Return in Shell 2. I won't copy-paste the output here, but if I keep stepping long enough, DeSmuME eventually enters some kind of deadlock state from which it doesn't recover. How long that takes is different every time. In this example, Shell 2 looks like this when it happens, and I finally press Ctrl-C.
0x02014310 in build_argv ()
1: x/i $pc
=> 0x2014310 <build_argv+44>: beq.n 0x2014318 <build_argv+52>
(gdb)
^C
Shell 1 does show a text "Breaking execution", so it has received the signal resulting from my pressing Ctrl-C, but aside from that, there's no sign of life anymore.
Processing packet m
Processing packet m
Processing packet s
Stepping instruction at 02014310
UNSTALL
Breaking execution
In Shell 2, pressing Ctrl-C again makes GDB ask if we want to give up waiting.
^C
^CInterrupted while waiting for the program.
Give up (and stop debugging it)? (y or n) y
Quit
(gdb)
If I now try to get GDB connected to port 20000 again, it will complain about packet errors
(gdb) target remote :20000
Remote debugging using :20000
Ignoring packet error, continuing...
warning: unrecognized item "timeout" in "qSupported" response
Ignoring packet error, continuing...
Ignoring packet error, continuing...
Bogus trace status reply from target: timeout
(gdb) target remote :20000
Remote debugging using :20000
Ignoring packet error, continuing...
warning: unrecognized item "timeout" in "qSupported" response
Ignoring packet error, continuing...
Ignoring packet error, continuing...
Bogus trace status reply from target: timeout
(gdb)
This reconnection attempt generates the following output in Shell 1
Processing packet q
Processing packet q
Processing packet H
Processing packet H
Processing packet q
Processing packet q
Processing packet q
Processing packet q
Processing packet H
Processing packet H
Processing packet q
Processing packet q
I haven't found a way to get DeSmuME back on track again. I can break away from the situation by suspending desmume-cli by pressing Ctrl-Z, and killing it with "killall -9 desmume-cli".
My question is, is this how it works for everybody, and am I doing something wrong?
I won't go to source-level stepping with GDB's S command at all, because it doesn't work even as well as instruction stepping. I also cannot get breakpoints to trigger at all. I can set breakpoints at functions, and e.g. tab-completion for function names works (which has nothing to do with DeSmuME as such), but the program execution just passes through the breakpoints.
I'm not saying this has to be a bug in DeSmuME at all. Maybe GDB is sending it commands too fast? I don't know.
Last edited by xabccode (2014-12-28 19:49:01)
Offline
I wonder why both of these routines in armcpu.cpp print "UNSTALL"? I think it would feel kind of more logical if stall_cpu() printed "STALL".
static void
stall_cpu( void *instance) {
  armcpu_t *armcpu = (armcpu_t *)instance;
  printf("UNSTALL\n");
  armcpu->stalled = 1;
}
static void
unstall_cpu( void *instance) {
  armcpu_t *armcpu = (armcpu_t *)instance;
  printf("UNSTALL\n");
  armcpu->stalled = 0;
}
I'll try looking into the whole GDB debugging protocol. Maybe the things talked about in this other thread still exist
http://forums.desmume.org/viewtopic.php?id=6106
Without knowing pretty much anything about GDB or DeSmuME, this randomly occurring locking feels like a potential concurrency issue. Something that might require using a mutual exclusion mechanism.
Offline
Okay, stall_cpu() now prints "STALL" as of r5061.
Offline
I'm now trying to find out what's wrong with source-stepping on Linux. On the Mac port it seems to work. I suspected a thread-safety issue, for example the stalled flag being written from two places simultaneously, so I ran a test: I changed all writes of the stalled flag into Stall()/Unstall() function calls and protected the write inside those functions with a mutex. No change in behaviour; it still jams like before. Then I replaced even all _reads_ of the stalled flag with mutex-protected functions, which shouldn't be needed, but still nothing.
After that I ran some quick tests with DeSmuME itself under a debugger (command-line GDB actually, which made the situation feel a bit weird: GDB debugging a GDB stub that is controlled by another GDB) to see what the program is actually doing when it jams. Depending on how and when I set my breakpoints (on DeSmuME, not the emulated NDS CPUs), I got different phenomena. If I just let it run to the jam without breaking, both emulated ARM CPUs end up stalled and the program spins endlessly in the for(;;) loop in NDS_exec(), waiting for the CPUs to become unstalled. I'll have to figure out the mechanism that's supposed to kick in here: when, and by which piece of code, the CPUs are supposed to become unstalled. But I'll continue with this tomorrow.
Offline
More info. For some reason, on Linux, DeSmuME basically tells itself to enter an endless loop, waiting for someone to unstall it from the outside, but that someone has no idea about the situation. Now I'm trying to understand how the system is supposed to work as a whole: what the assumed, unwritten preconditions, postconditions and invariants over all the variables are, and what states the emulator and the stub are supposed to be in if you think of them as a pair of state machines. Here's a comparison of what happens on the Mac vs. Linux when giving the command 's' in the controlling GDB. On the Mac it works and is able to run until the next source line, but on Linux it enters an endless waiting loop. All lines prior to these snippets are identical on both systems. The source code version is the same, and the GDB "arm-none-eabi-gdb" is taken from the newest devkitPro package on both systems, printing the version name "GNU gdb (GDB) 7.7.1".
Mac:
--- Mac ------------------ NOW GIVE COMMAND 's' IN GDB
Processing packet v
Processing packet H
Processing packet s
Stepping instruction at 02000000
UNSTALL
Step watch: waiting for 02000000 at 02000000
Step hit -> 02000000
STALL
Break from Emulation
Processing packet g
'g' command PC = 02000004
Processing packet m
Processing packet m
Processing packet s
Stepping instruction at 02000004
UNSTALL
Step watch: waiting for 02000004 at 02000004
Step hit -> 02000004
STALL
Break from Emulation
Processing packet g
'g' command PC = 02000008
Processing packet m
Processing packet m
Processing packet m
Processing packet s
Stepping instruction at 02000008
UNSTALL
Step watch: waiting for 02000008 at 02000008
Step hit -> 02000008
STALL
Break from Emulation
Processing packet g
'g' command PC = 0200000c
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet s
Stepping instruction at 0200000c
UNSTALL
Step watch: waiting for 0200000c at 0200000c
Step hit -> 0200000c
STALL
Break from Emulation
Processing packet g
'g' command PC = 02000010
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet s
Stepping instruction at 02000010
UNSTALL
Step watch: waiting for 02000010 at 02000010
Step hit -> 02000010
STALL
Break from Emulation
Processing packet g
'g' command PC = 02000014
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet s
Stepping instruction at 02000014
UNSTALL
Step watch: waiting for 02000014 at 02000014
Step hit -> 02000014
STALL
Break from Emulation
Processing packet g
'g' command PC = 02000018
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet s
Stepping instruction at 02000018
UNSTALL
Step watch: waiting for 02000018 at 02000018
Step hit -> 02000018
STALL
Break from Emulation
Processing packet g
'g' command PC = 0200001c
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet s
Stepping instruction at 0200001c
UNSTALL
Step watch: waiting for 0200001c at 0200001c
Step hit -> 0200001c
STALL
Break from Emulation
Processing packet g
'g' command PC = 02000020
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet s
Stepping instruction at 02000020
UNSTALL
Step watch: waiting for 02000020 at 02000020
Step hit -> 02000020
STALL
Break from Emulation
Processing packet g
'g' command PC = 02000024
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet s
Stepping instruction at 02000024
UNSTALL
Step watch: waiting for 02000024 at 02000024
Step hit -> 02000024
STALL
Break from Emulation
Processing packet g
'g' command PC = 02000028
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet s
Stepping instruction at 02000028
UNSTALL
Step watch: waiting for 02000028 at 02000028
Step hit -> 02000028
STALL
Break from Emulation
Processing packet g
'g' command PC = 0200002c
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet s
Stepping instruction at 0200002c
UNSTALL
Step watch: waiting for 0200002c at 0200002c
Step hit -> 0200002c
STALL
Break from Emulation
Processing packet g
'g' command PC = 02000030
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet s
Stepping instruction at 02000030
UNSTALL
Step watch: waiting for 02000030 at 02000030
Step hit -> 02000030
STALL
Break from Emulation
Processing packet g
'g' command PC = 02003d68
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
Processing packet m
On Linux we notice differences very soon:
--- Linux ------------------------ NOW GIVE COMMAND 's' IN GDB
Processing packet v
Processing packet H
Processing packet s
Stepping instruction at 02000000
UNSTALL
Step watch: waiting for 02000000 at 02000000
Step hit -> 02000000
STALL <--------- Here! After this something goes wrong
Break from Emulation
Processing packet g
'g' command PC = 02000000
Processing packet m
Processing packet s
Stepping instruction at 02000000
UNSTALL
For some reason, the program counter (PC) is still at 0x02000000, even though on the Mac it proceeded to 0x02000004 at that point. If I understand this correctly, DeSmuME is supposed to run instruction by instruction, reporting the current PC to GDB and letting GDB decide whether that PC location corresponds to a new source code line, in which case GDB breaks execution. That's not happening, because the PC doesn't move anywhere.
By the way, even though the last log line is "UNSTALL", after printing that, DeSmuME internally calls NDS_debug_break(), which stalls both CPUs and after that it's jammed.
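To make the stepping scheme described above a bit more concrete, here is a rough model of the debugger side. It is only a sketch under assumptions: `LineRange`, `source_step` and the two stub functions are made-up names, and real GDB tracks more state than this. The idea is that a source step keeps issuing single instruction steps while the reported PC stays inside the address range of the current source line, so a stub that keeps reporting the same PC never lets the source step finish. (In the real failure the emulator jams outright; the model only shows why a PC that doesn't move is fatal for source stepping.)

```cpp
#include <cstdint>
#include <functional>

// Hypothetical description of one source line: addresses in [start, end).
struct LineRange {
    uint32_t start;
    uint32_t end;
};

// Model of a source-level step: keep asking the stub for single instruction
// steps until the PC leaves the current line's range. Returns the number of
// steps taken, or -1 if max_steps is reached without leaving the range.
int source_step(uint32_t& pc, const LineRange& line,
                const std::function<uint32_t(uint32_t)>& single_step,
                int max_steps) {
    int steps = 0;
    while (pc >= line.start && pc < line.end) {
        if (steps == max_steps) {
            return -1;  // the stub never moved us off the line
        }
        pc = single_step(pc);  // one 's' packet, then read back the PC
        ++steps;
    }
    return steps;
}

// A stub that behaves: each step executes one 4-byte ARM instruction.
uint32_t working_stub(uint32_t pc) { return pc + 4; }

// A stub that reports the same PC after every step.
uint32_t stuck_stub(uint32_t pc) { return pc; }
```

With a line covering 0x02000000 to 0x02000010, `working_stub` leaves the line after four steps, while `stuck_stub` hits the step limit without ever leaving it.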
Edit: I added lots of debug logging (which isn't shown above yet) and changed the gdbstub.cpp / DEBUG_LOG macro to also always write out ARM9's program counter value like this:
#define DEBUG_LOG( fmt, ...) fprintf(stdout, "R[15]:%x, instruct_adr:%x ", NDS_ARM9.R[15], NDS_ARM9.instruct_adr); fprintf(stdout, fmt, ##__VA_ARGS__)
After that, the behaviour changed a little bit. Now the instruction-stepping proceeds a few times, until PC = 0x0200001c, and then it jams. I guess there's a timing-dependent race condition or something - one of the threads is sometimes able to write some value somewhere before a critical moment and sometimes it is not. At least it happens after the STALL log row I marked with an arrow above.
Last edited by xabccode (2015-01-10 12:51:41)
Offline
After adding some more log messages, I was even able to run a source-step once. But then the next one jammed. Maybe it's a race condition on the CPU registers and instruct_adr. My earlier analysis log with Valgrind's DRD tool has stuff like the following:
==13042== Conflicting load by thread 3 at 0x082f5428 size 4
==13042== at 0x8053DD4: read_cpu_reg(void*, unsigned int) (armcpu.cpp:97)
==13042== Location 0x82f5428 is 0 bytes inside NDS_ARM9.instruct_adr,
==13042== a global variable declared at armcpu.cpp:46
==13042== Other segment start (thread 1)
==13042== at 0x402CD94: pthread_mutex_unlock (drd_pthread_intercepts.c:667)
==13042== by 0x7B6AED6: _nss_nis_endpwent (nis-pwd.c:137)
==13042== by 0x7B60060: internal_endpwent (compat-pwd.c:316)
==13042== by 0x7B60C13: _nss_compat_getpwnam_r (compat-pwd.c:875)
==13042== by 0x45C6E34: getpwnam_r@@GLIBC_2.1.2 (getXXbyYY_r.c:256)
==13042== by 0x425FE87: ??? (in /lib/i386-linux-gnu/libglib-2.0.so.0.3200.4)
==13042== by 0x444E4956: ???
==13042== Other segment end (thread 1)
==13042== at 0x402C597: pthread_mutex_lock (drd_pthread_intercepts.c:615)
==13042== by 0x812FF64: Task::Impl::finish() (task.cpp:296)
==13042== by 0xFFE43403: ???
==13042==
==13042== Conflicting load by thread 3 at 0x082f5470 size 4
==13042== at 0x8053DC4: read_cpu_reg(void*, unsigned int) (armcpu.cpp:101)
==13042== Location 0x82f5470 is 0 bytes inside NDS_ARM9.CPSR,
==13042== a global variable declared at armcpu.cpp:46
==13042== Other segment start (thread 1)
==13042== at 0x402CD94: pthread_mutex_unlock (drd_pthread_intercepts.c:667)
==13042== by 0x7B6AED6: _nss_nis_endpwent (nis-pwd.c:137)
==13042== by 0x7B60060: internal_endpwent (compat-pwd.c:316)
==13042== by 0x7B60C13: _nss_compat_getpwnam_r (compat-pwd.c:875)
==13042== by 0x45C6E34: getpwnam_r@@GLIBC_2.1.2 (getXXbyYY_r.c:256)
==13042== by 0x425FE87: ??? (in /lib/i386-linux-gnu/libglib-2.0.so.0.3200.4)
==13042== by 0x444E4956: ???
==13042== Other segment end (thread 1)
==13042== at 0x402C597: pthread_mutex_lock (drd_pthread_intercepts.c:615)
==13042== by 0x812FF64: Task::Impl::finish() (task.cpp:296)
==13042== by 0xFFE43403: ???
==13042==
==13042== Conflicting load by thread 3 at 0x082f5430 size 4
==13042== at 0x8053DB4: read_cpu_reg(void*, unsigned int) (armcpu.cpp:94)
==13042== Location 0x82f5430 is 0 bytes inside NDS_ARM9.R[0],
==13042== a global variable declared at armcpu.cpp:46
==13042== Other segment start (thread 1)
==13042== at 0x402CD94: pthread_mutex_unlock (drd_pthread_intercepts.c:667)
==13042== by 0x7B6AED6: _nss_nis_endpwent (nis-pwd.c:137)
==13042== by 0x7B60060: internal_endpwent (compat-pwd.c:316)
==13042== by 0x7B60C13: _nss_compat_getpwnam_r (compat-pwd.c:875)
==13042== by 0x45C6E34: getpwnam_r@@GLIBC_2.1.2 (getXXbyYY_r.c:256)
==13042== by 0x425FE87: ??? (in /lib/i386-linux-gnu/libglib-2.0.so.0.3200.4)
==13042== by 0x444E4956: ???
==13042== Other segment end (thread 1)
==13042== at 0x402C597: pthread_mutex_lock (drd_pthread_intercepts.c:615)
==13042== by 0x812FF64: Task::Impl::finish() (task.cpp:296)
==13042== by 0xFFE43403: ???
Add logging and bug disappears: http://en.wikipedia.org/wiki/Heisenbug
Last edited by xabccode (2015-01-10 16:13:55)
Offline
Another remark: if I step through DeSmuME's code in the debugger IDE Code::Blocks I'm using now, then occasionally it happens that DeSmuME misses the instruction step breakpoint and continues running forever. A bit like suggested by Zeromus in this post:
http://forums.desmume.org/viewtopic.php … 649#p13649
Anyway, I'm coming to the conclusion that the interaction system between the emulation main loop and the GDB stub threads is broken, because there's no real synchronization. There's just a bunch of flags that both threads read and write, and there are some waiting loops. I think that when it works, on the platforms where it does work, it is only by chance.
I see two alternative fixes:
A) with semaphores or events that have well-defined meanings, like PypeBros did in the thread linked above (if I understood correctly).
B) with a message queue system where the auxiliary threads (GDB stub in this case) ask the main thread (emulation main thread) to execute small pieces of code for them, while the auxiliary thread (sender of the message) waits until its request has been fulfilled by the main thread. And the GDB stub code must not access or touch the main thread's memory in any way outside the "please execute this piece of code in the main thread" functions.
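Alternative B could look roughly like the sketch below. None of this exists in DeSmuME; `MainThreadQueue`, `run_on_main`, `drain` and the demo function are hypothetical names, and it assumes a C++11 compiler. The stub thread hands a small job to the queue and blocks until the emulation thread has run it, so only the emulation thread ever touches the CPU state.

```cpp
#include <atomic>
#include <cstdint>
#include <functional>
#include <future>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical queue of jobs that must run on the emulation (main) thread.
class MainThreadQueue {
public:
    // Auxiliary thread: enqueue a job and block until the main thread ran it.
    void run_on_main(std::function<void()> job) {
        std::packaged_task<void()> task(std::move(job));
        std::future<void> done = task.get_future();
        {
            std::lock_guard<std::mutex> lock(mutex_);
            jobs_.push(std::move(task));
        }
        done.wait();
    }

    // Main thread: run all queued jobs, e.g. between two emulated
    // instructions. Returns how many jobs were executed.
    int drain() {
        int ran = 0;
        for (;;) {
            std::packaged_task<void()> task;
            {
                std::lock_guard<std::mutex> lock(mutex_);
                if (jobs_.empty()) {
                    break;
                }
                task = std::move(jobs_.front());
                jobs_.pop();
            }
            task();  // executed outside the queue lock
            ++ran;
        }
        return ran;
    }

private:
    std::mutex mutex_;
    std::queue<std::packaged_task<void()>> jobs_;
};

// Demo: a "stub" thread reads the PC without ever touching it directly.
uint32_t demo_read_pc_on_main() {
    MainThreadQueue queue;
    uint32_t pc = 0x02000000;  // only the main thread reads or writes this
    uint32_t seen = 0;
    std::atomic<bool> finished{false};

    std::thread stub([&] {
        queue.run_on_main([&] { seen = pc; });  // runs on the main thread
        finished = true;
    });

    while (!finished.load()) {
        pc += 4;        // stands in for executing one instruction
        queue.drain();  // service the stub at an instruction boundary
        std::this_thread::yield();
    }
    stub.join();
    return seen;
}
```

The demo's main loop still polls, which is only to keep the example short; the point is that the job always observes the CPU state at an instruction boundary.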
Now I see that the first fix I already tried (without understanding even what little I do now), protecting the stalled flag with mutexes, didn't address the real problem. As I see it, the problem is that because the threads poke at the same set of variables, the state that e.g. the main thread is in becomes ambiguous. Inside each thread, the program code is supposed to be a set of transitions from one state to another, and now I think the main loop's code regarding e.g. the wait-for-unstall loop is based on some assumptions that aren't reliable. This cannot be fixed by thread-protecting individual variables, because the set of variables is a whole, i.e. the thread state, that's expected to be consistent.
For every piece of code in the GDB stub that reads any of the common memory, there should be a clearly visible, explicit explanation for why the memory access can be done safely and why the main thread's state is not messed up. For example, the main thread must be waiting at a known location. I guess that was the intention with some of those "debug break" calls, but there's no actual synchronization. In gdbstub.cpp / break_execution(), for example, there's a call to NDS_debug_break(), but it doesn't actually wait for the NDS to break? Did I miss something, is there an inter-thread sync wait somewhere?
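The kind of inter-thread wait I was looking for in break_execution() might look something like this. It is a sketch only, not DeSmuME code: `BreakHandshake` and its methods are invented names, and it assumes C++11 threading. The stub asks for a break and then blocks until the emulation thread confirms that it is parked at an instruction boundary; only after that does the stub read any CPU state.

```cpp
#include <atomic>
#include <chrono>
#include <condition_variable>
#include <cstdint>
#include <mutex>
#include <thread>

// Hypothetical two-way handshake between the stub and the emulation thread.
class BreakHandshake {
public:
    // Stub thread: request a break and wait until the emulator confirms it.
    void request_break_and_wait() {
        std::unique_lock<std::mutex> lock(mutex_);
        break_requested_ = true;
        cv_.wait(lock, [this] { return halted_; });
    }

    // Stub thread: let the emulator continue.
    void resume() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            break_requested_ = false;
            halted_ = false;
        }
        cv_.notify_all();
    }

    // Emulation thread: call between instructions. Parks here while a break
    // is requested, after telling the stub that it has really stopped.
    void instruction_boundary() {
        std::unique_lock<std::mutex> lock(mutex_);
        if (!break_requested_) {
            return;
        }
        halted_ = true;
        cv_.notify_all();
        cv_.wait(lock, [this] { return !break_requested_; });
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    bool break_requested_ = false;
    bool halted_ = false;
};

// Demo: once the break is confirmed, the PC no longer changes under us.
bool demo_break_is_stable() {
    BreakHandshake handshake;
    std::atomic<bool> running{true};
    uint32_t pc = 0x02000000;

    std::thread emulator([&] {
        while (running.load()) {
            pc += 4;  // stands in for executing one instruction
            handshake.instruction_boundary();
        }
    });

    handshake.request_break_and_wait();
    uint32_t first = pc;  // safe: the emulator is parked
    std::this_thread::sleep_for(std::chrono::milliseconds(5));
    uint32_t second = pc;

    running = false;
    handshake.resume();
    emulator.join();
    return first == second;
}
```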
Last edited by xabccode (2015-01-10 18:27:36)
Offline
its always been broken, i think, or else it was broken shortly after someone other than the original author touched the insanity without understanding it
Offline
Whoah! It seems that I managed to fix it with a mutex that governs which party is allowed to do something. I made a mutex "cpu_mutex", which the function NDS_exec() locks as the first thing it does, and unlocks when it's done, and also unlocks it in the waiting loop. Maybe it's pretty coarse-grained control from the GDB stub's point of view, but it seems to work.
template<bool FORCE>
void NDS_exec(s32 nb)
{
#ifndef HOST_WINDOWS
    pthread_mutex_lock(&cpu_mutex);
#endif
    ...
    for(;;)
    {
        //trap the debug-stalled condition
#ifdef DEVELOPER
        singleStep = false;
        //(gdb stub doesnt yet know how to trigger these immediately by calling reschedule)
        if ((NDS_ARM9.stalled || NDS_ARM7.stalled) && execute)
        {
            driver->EMU_DebugIdleEnter();
            while((NDS_ARM9.stalled || NDS_ARM7.stalled) && execute)
            {
#ifndef HOST_WINDOWS
                pthread_mutex_unlock(&cpu_mutex);
#endif
                driver->EMU_DebugIdleUpdate();
#ifndef HOST_WINDOWS
                pthread_mutex_lock(&cpu_mutex);
#endif
                nds_debug_continuing[0] = nds_debug_continuing[1] = true;
            }
            driver->EMU_DebugIdleWakeUp();
        }
#endif
        ...
    if (cheats)
        cheats->process();
#ifndef HOST_WINDOWS
    pthread_mutex_unlock(&cpu_mutex);
#endif
}
Then in the GDB stub, I put lock/unlock around the processPacket_gdb() routine, like this
/**
 * Returns -1 if there is a socket error.
 */
static int
processPacket_gdb( SOCKET_TYPE sock, const uint8_t *packet,
                   struct gdb_stub_state *stub) {
    // uint8_t remcomOutBuffer[BUFMAX_GDB];
    struct debug_out_packet *out_packet = getOutPacket();
    uint8_t *out_ptr = out_packet->start_ptr;
    int send_reply = 1;
    uint32_t send_size = 0;
    DEBUG_LOG("Processing packet %c\n", packet[0]);
#ifndef HOST_WINDOWS
    pthread_mutex_lock(&cpu_mutex);
#endif
    switch( packet[0]) {
    case 3:
    ...
#ifndef HOST_WINDOWS
    pthread_mutex_unlock(&cpu_mutex);
#endif
    if ( send_reply) {
        return putpacket( sock, out_packet, send_size);
    }
    return 0;
}
There's still some code outside these two routines that might access the CPU structures, but even the code above seemed enough to separate the fighters. Now I'm able to source-step for as long as I want, and DeSmuME keeps running happily. I said 's' and held down Enter for several minutes and it just went on correctly. I'm able to set source breakpoints and everything, and it works. Set breakpoint to P$ALLOCATIONTEST_UPDATESPRITES, Continue and hold down Enter, and the sprites start moving slowly on the emulator screen. Great.
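For anyone who wants to play with the idea outside DeSmuME, here is a small stand-alone model of the same locking pattern. It is a sketch under assumptions: the `Emulator` struct and all function names are invented for the example, it uses C++11 std::mutex instead of pthreads, and "executing an instruction" is just `pc += 4`. The emulation loop holds the lock while it runs and only releases it inside the stalled-wait loop, and the stub takes the lock around each packet it handles.

```cpp
#include <cstdint>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the emulator state that both threads touch.
struct Emulator {
    std::mutex cpu_mutex;  // must be held to access any field below
    bool stalled = true;
    bool execute = true;
    uint32_t pc = 0x02000000;
};

// Emulation thread: modelled on the NDS_exec() loop shown above.
void emulation_loop(Emulator& emu) {
    std::unique_lock<std::mutex> lock(emu.cpu_mutex);
    while (emu.execute) {
        while (emu.stalled && emu.execute) {
            lock.unlock();
            std::this_thread::yield();  // stands in for EMU_DebugIdleUpdate()
            lock.lock();
        }
        if (!emu.execute) {
            break;
        }
        emu.pc += 4;         // execute exactly one instruction
        emu.stalled = true;  // single step done: stall again
    }
}

// Stub thread: handle one 's' packet while holding the lock.
void handle_step_packet(Emulator& emu) {
    std::lock_guard<std::mutex> lock(emu.cpu_mutex);
    emu.stalled = false;
}

// Stub thread: wait until the step has completed, then report the PC.
uint32_t wait_for_stop(Emulator& emu) {
    for (;;) {
        {
            std::lock_guard<std::mutex> lock(emu.cpu_mutex);
            if (emu.stalled) {
                return emu.pc;
            }
        }
        std::this_thread::yield();
    }
}

// Demo: issue `count` single steps and return the final PC.
uint32_t demo_single_steps(int count) {
    Emulator emu;
    std::thread emulator([&emu] { emulation_loop(emu); });
    uint32_t pc = wait_for_stop(emu);
    for (int i = 0; i < count; ++i) {
        handle_step_packet(emu);
        pc = wait_for_stop(emu);
    }
    {
        std::lock_guard<std::mutex> lock(emu.cpu_mutex);
        emu.execute = false;
    }
    emulator.join();
    return pc;
}
```

In this model every step advances the PC by exactly one instruction, because the stub can never observe or change the state while the emulator is in the middle of a step.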
I think I'll check out a fresh SVN trunk and make a minimal fix-pack on top of that.
The mutex doesn't address the busy-loop waiting issue. If that's some sort of an issue... I guess it just wastes some watts, compared to a proper event signaling system.
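A proper event signaling replacement for the busy loop might be built on a condition variable, roughly like this. Again just a sketch with invented names (`StallGate`, `wait_until_running`), assuming C++11. The timeout is there on the assumption that the emulator still wants to wake up regularly, e.g. to keep its window responsive the way EMU_DebugIdleUpdate() presumably does.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical replacement for polling the stalled flag in a loop.
class StallGate {
public:
    void stall() {
        std::lock_guard<std::mutex> lock(mutex_);
        stalled_ = true;
    }

    void unstall() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stalled_ = false;
        }
        wake_.notify_all();  // wake the sleeping emulation thread
    }

    // Emulation thread: sleep until unstalled or until the timeout expires.
    // Returns true if the CPU may run, false if it is still stalled.
    bool wait_until_running(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(mutex_);
        return wake_.wait_for(lock, timeout, [this] { return !stalled_; });
    }

private:
    std::mutex mutex_;
    std::condition_variable wake_;
    bool stalled_ = false;
};

// Demo: the first wait times out while stalled, the second one is woken
// by another thread calling unstall().
bool demo_gate() {
    StallGate gate;
    gate.stall();
    bool still_stalled = !gate.wait_until_running(std::chrono::milliseconds(10));

    std::thread stub([&gate] {
        std::this_thread::sleep_for(std::chrono::milliseconds(20));
        gate.unstall();
    });
    bool woke = gate.wait_until_running(std::chrono::seconds(5));
    stub.join();
    return still_stalled && woke;
}
```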
Btw, like you said, it's easy to break a system like that if you don't understand the principles of how it's supposed to work, the governing rules and roles between components, or what the components and domains really are. Each function, file, variable, struct etc. has rules, and if the design principles aren't clearly written out and explained, it's easy to miss them. It took me quite many hours of tinkering and trial and error to make that mutex duct-tape fix.
Offline
LOL now that I think of it, that was alternative C) because I didn't use a whole set of "well-defined" events, and I didn't make a full-fledged synchronization message/callback thingy either. At first I thought of a "raise your hand when you want to talk" system, but then I thought why do something even that complicated, just keep the GDB stub from performing any NDS CPU related operations while the emulation engine is in the middle of executing an instruction. Instructions run pretty fast, so the stub won't have to wait too long anyway.
Edit: well, NDS_exec() runs until the next NDS vblank (if I understood correctly), so I guess that sets some kind of a granularity boundary for the GDB stub. GDB commands are only handled once per frame. Unless it's single-stepping, in which case the GDB stub is allowed to do something in the busy-loop even once per NDS instruction, if it wants to. I guess this shouldn't be any kind of a problem. By default, GDB's time-out for remote command execution is two seconds, so one NDS frame is nothing.
Last edited by xabccode (2015-01-10 22:21:33)
Offline
Maybe this forum isn't a good way to do this, but here's an "svn diff" report from adding only the mutex fix on top of r5068. Now source-stepping and everything works with the GDB stub, without jamming the emulator.
I haven't tested this on the Mac, let alone Windows, and it will probably break on Mac, because the main program doesn't declare or init the mutex variable. Looking at it now I think it would have been better to place the mutex declaration and allocation in a central location like gdbstub.cpp, and make the main programs just call some functions. There was already a bunch of structure clean-ups between 5067-5068. I'm also not particularly fond of spreading platform-named #ifdefs all over the place, but at least this shows the main idea of how the fix works.
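What I mean by a central location is something along these lines. This is a sketch, not part of the diff below: the `gdbstub_*_cpu` functions and the `CpuLock` class are hypothetical names. The definitions would live in one .cpp file (with only declarations in gdbstub.h), so the platform #ifdef appears once, and a statically initialised mutex means no main program has to call init or destroy.

```cpp
#ifndef HOST_WINDOWS
#include <pthread.h>

// Statically initialised, so no main program has to init or destroy it.
static pthread_mutex_t cpu_mutex = PTHREAD_MUTEX_INITIALIZER;

void gdbstub_lock_cpu()     { pthread_mutex_lock(&cpu_mutex); }
void gdbstub_unlock_cpu()   { pthread_mutex_unlock(&cpu_mutex); }
bool gdbstub_try_lock_cpu() { return pthread_mutex_trylock(&cpu_mutex) == 0; }
#else
// Mirrors the patch, which takes no lock on Windows builds.
void gdbstub_lock_cpu()     {}
void gdbstub_unlock_cpu()   {}
bool gdbstub_try_lock_cpu() { return true; }
#endif

// Scope guard: locks on construction, unlocks when it goes out of scope,
// so an early return cannot leave the mutex locked.
class CpuLock {
public:
    CpuLock()  { gdbstub_lock_cpu(); }
    ~CpuLock() { gdbstub_unlock_cpu(); }
    CpuLock(const CpuLock&) = delete;
    CpuLock& operator=(const CpuLock&) = delete;
};

// Demo (only meaningful on the pthread path): the lock is held inside the
// guarded scope and free again afterwards.
bool demo_guard_releases_lock() {
    bool held_inside = false;
    {
        CpuLock guard;
        held_inside = !gdbstub_try_lock_cpu();  // already held, so this fails
    }
    bool free_after = gdbstub_try_lock_cpu();
    if (free_after) {
        gdbstub_unlock_cpu();
    }
    return held_inside && free_after;
}
```

The stalled-wait loop in NDS_exec() would still need explicit unlock/lock calls around EMU_DebugIdleUpdate(), since a scope guard alone cannot express "release for a moment, then take it back".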
Index: src/gdbstub/gdbstub.cpp
===================================================================
--- src/gdbstub/gdbstub.cpp (revision 5068)
+++ src/gdbstub/gdbstub.cpp (working copy)
@@ -527,6 +527,9 @@
uint32_t send_size = 0;
DEBUG_LOG("Processing packet %c\n", packet[0]);
+ #ifndef HOST_WINDOWS
+ pthread_mutex_lock(&cpu_mutex);
+ #endif
switch( packet[0]) {
case 3:
@@ -899,6 +902,10 @@
break;
}
+ #ifndef HOST_WINDOWS
+ pthread_mutex_unlock(&cpu_mutex);
+ #endif
+
if ( send_reply) {
return putpacket( sock, out_packet, send_size);
}
Index: src/cli/main.cpp
===================================================================
--- src/cli/main.cpp (revision 5068)
+++ src/cli/main.cpp (working copy)
@@ -62,8 +62,17 @@
#ifdef GDB_STUB
#include "../armcpu.h"
#include "../gdbstub.h"
+
+#ifndef HOST_WINDOWS
+// Now both the GTK main and CLI main have this mutex variable defined, allocated
+// and destroyed, just like all mains (Cocoa and Windows included) also have code
+// for creation and destruction of the GDB stubs. It would be better to place all
+// that in a common location to avoid unnecessary duplication of logic.
+ pthread_mutex_t cpu_mutex; // to access the CPUs in any way, a thread has to get a lock on this first
#endif
+#endif
+
volatile bool execute = false;
static float nds_screen_size_ratio = 1.0f;
@@ -596,6 +605,11 @@
driver = new BaseDriver();
#ifdef GDB_STUB
+
+#ifndef HOST_WINDOWS
+ pthread_mutex_init(&cpu_mutex, NULL);
+#endif
+
/*
* Activate the GDB stubs
* This has to come after NDS_Init() where the CPUs are set up.
@@ -826,7 +840,12 @@
destroyStub_gdb( arm7_gdb_stub);
arm7_gdb_stub = NULL;
+
+#ifndef HOST_WINDOWS
+ pthread_mutex_destroy(&cpu_mutex);
#endif
+
+#endif
SDL_Quit();
NDS_DeInit();
Index: src/gdbstub.h
===================================================================
--- src/gdbstub.h (revision 5068)
+++ src/gdbstub.h (working copy)
@@ -19,12 +19,29 @@
#ifndef _GDBSTUB_H_
#define _GDBSTUB_H_ 1
+// For cpu_mutex
+#ifdef HOST_WINDOWS
+#include <windows.h>
+#else
+#include <pthread.h>
+#if defined HOST_LINUX
+#include <unistd.h>
+#elif defined HOST_BSD || defined HOST_DARWIN
+#include <sys/sysctl.h>
+#endif
+#endif // HOST_WINDOWS
+
+
#include "types.h"
typedef void *gdbstub_handle_t;
struct armcpu_t;
struct armcpu_memory_iface;
+#ifndef HOST_WINDOWS
+extern pthread_mutex_t cpu_mutex;
+#endif
+
/*
* The function interface
*/
Index: src/gtk/main.cpp
===================================================================
--- src/gtk/main.cpp (revision 5068)
+++ src/gtk/main.cpp (working copy)
@@ -67,8 +67,17 @@
#ifdef GDB_STUB
#include "armcpu.h"
#include "gdbstub.h"
+
+#ifndef HOST_WINDOWS
+// Now both the GTK main and CLI main have this mutex variable defined, allocated
+// and destroyed, just like all mains (Cocoa and Windows included) also have code
+// for creation and destruction of the GDB stubs. It would be better to place all
+// that in a common location to avoid unnecessary duplication of logic.
+ pthread_mutex_t cpu_mutex; // to access the CPUs in any way, a thread has to get a lock on this first
#endif
+#endif
+
#if defined(HAVE_LIBOSMESA) || defined(HAVE_GL_GLX)
#include <GL/gl.h>
#include <GL/glu.h>
@@ -2919,6 +2928,11 @@
* where the cpus are set up.
*/
#ifdef GDB_STUB
+
+#ifndef HOST_WINDOWS
+ pthread_mutex_init(&cpu_mutex, NULL);
+#endif
+
gdbstub_handle_t arm9_gdb_stub = NULL;
gdbstub_handle_t arm7_gdb_stub = NULL;
@@ -3277,8 +3291,13 @@
destroyStub_gdb( arm7_gdb_stub);
arm7_gdb_stub = NULL;
+
+#ifndef HOST_WINDOWS
+ pthread_mutex_destroy(&cpu_mutex);
#endif
+#endif
+
return EXIT_SUCCESS;
}
Index: src/NDSSystem.cpp
===================================================================
--- src/NDSSystem.cpp (revision 5068)
+++ src/NDSSystem.cpp (working copy)
@@ -55,6 +55,10 @@
#include "SPU.h"
#include "wifi.h"
+#ifdef GDB_STUB
+#include "gdbstub.h"
+#endif
+
//int xxctr=0;
//#define LOG_ARM9
//#define LOG_ARM7
@@ -1828,6 +1832,13 @@
template<bool FORCE>
void NDS_exec(s32 nb)
{
+ #ifdef GDB_STUB
+ #ifndef HOST_WINDOWS
+ pthread_mutex_lock(&cpu_mutex);
+ #endif
+ #endif
+
+
LagFrameFlag=1;
sequencer.nds_vblankEnded = false;
@@ -1860,7 +1871,17 @@
while((NDS_ARM9.stalled || NDS_ARM7.stalled) && execute)
{
+ #ifdef GDB_STUB
+ #ifndef HOST_WINDOWS
+ pthread_mutex_unlock(&cpu_mutex);
+ #endif
+ #endif
driver->EMU_DebugIdleUpdate();
+ #ifdef GDB_STUB
+ #ifndef HOST_WINDOWS
+ pthread_mutex_lock(&cpu_mutex);
+ #endif
+ #endif
nds_debug_continuing[0] = nds_debug_continuing[1] = true;
}
@@ -1961,6 +1982,12 @@
DEBUG_Notify.NextFrame();
if (cheats)
cheats->process();
+
+ #ifdef GDB_STUB
+ #ifndef HOST_WINDOWS
+ pthread_mutex_unlock(&cpu_mutex);
+ #endif
+ #endif
}
template<int PROCNUM> static void execHardware_interrupts_core()
Last edited by xabccode (2015-01-11 14:46:51)
Offline
do you have a sourceforge account
Offline
Yes. But before it makes sense to make a commit or patch, I have to do this properly
- Make it work on the Mac/Cocoa port as well
- Maybe test on Windows.
After those things and the commit (patch submission), next in line could be adding event/signal based waiting to replace the current busyloop. I think the original idea might have been to put the emulator into some kind of idle mode while it's waiting, but it doesn't seem to be doing that.
I have a Windows box that has Visual Studio Express 2008 and 2010. I guess one of those will do? Or was it Visual C Express. It's the variety that produces plain old x86 code for "desktop" applications, without any .net stuff.
Last edited by xabccode (2015-01-12 18:03:35)
Offline
some of the very first things we did when TASers raided the emulator wouldve been to change how the frame loop works, so anything you see which looks like the emulator is schizo would date back to that. We werent gonna let the debugger get in our way, and people kept telling us it kept working.
visual c++ express works.
Offline