This game seems to be doing something with its graphics to cause some severe slowdown. In either the software renderer or the OpenGL renderer, it runs at half speed for me when I start moving around. With the OpenGL renderer and texture scaling set to 2x, the FPS drops to below 5 (see https://i.imgur.com/01TeL0A.png). Seems odd for such a simple-looking game. No other games drop below 55 FPS for me with the same settings.
Just so you know, just because a game looks simple doesn't always mean it is simple to emulate. This is actually a fully 3D game that just so happens to use 2D sprites within the 3D engine.
But to answer your question: It's the texture thrashing that makes this game slow. Texture thrashing can occur if the textures are changed within emulated VRAM or if a palette change occurs, causing the emulator to have to reload those textures. The situation becomes much worse if you are using texture upscaling, since all textures have to undergo an additional (and computationally expensive) processing step. It gets worse still with OpenGL, since the textures then have to be uploaded to the host GPU before use. In other words, the settings that you are using right now are the worst possible scenario for this type of game.
The solution is to use SoftRasterizer, which doesn't need to upload textures to the host GPU, nor does it support texture upscaling to begin with. SoftRasterizer easily makes this game playable.
SoftRasterizer easily makes this game playable.
Not really (well, maybe if you have the latest Intel® processor). Like I've already mentioned, for me the software renderer gets the same speed as OpenGL without scaling (~30 FPS) once I move around the level a bit. The only place where it's running at a playable speed is the very start of the level.
Guess I'll just dig out my physical copy and play on the real thing.
Last edited by windwakr (2016-06-30 04:44:46)
Well admittedly, I did test this issue using the latest SVN (r5477 as of this writing) on my 2008 Mac Pro. The 2.8GHz Xeon E5462 CPUs it contains aren't the latest and greatest Intel CPUs by any stretch, but the fact that it sports 12MB of L2 cache per CPU probably helps hide some of the performance issues. Also, getting the latest SVN revision certainly helps, as I have been actively working on optimizations for DeSmuME's graphics rendering and the performance gains over v0.9.11 are significant.
I retested on a 2.53 GHz Core 2 Duo and did experience severe slowdown issues on native-resolution SoftRasterizer. I haven't done any real profiling on this issue, but I'm sure most of the problem does come from texture thrashing, which is why OpenGL and texture upscaling cause the frame rate to drop. But this frame rate issue is still pretty bad compared to all the other stuff I've experienced so far. More research and some profiling should help clear up exactly what is going on here, since it is possible that this performance issue is more than just texture thrashing.
Also, if your CPU emulation isn't set to dynamic recompiler, you should use it to help with performance.
Okay, I see the problem. It's nothing more than A LOT of texture thrashing. That is what's causing all of the slowdown.
It seems like the texture thrashing is occurring because the texture data in the emulated VRAM is getting changed very frequently. While there are some things we can do to optimize our texture handling system, there really is no way around dealing with frequent VRAM changes. This is an issue with the game itself.
The solution? There really isn't one at this time. The current texture handling system works by loading textures on an as-needed basis -- whenever a 3D renderer needs a texture, the texture handling system checks both emulated VRAM and palettes for any changes, unpacks the texture data into a format suitable for the 3D renderer, and then caches the unpacked texture in some host memory buffer (host RAM in SoftRasterizer's case, or the host GPU's VRAM in OpenGL's case).
But this game constantly changes what's in emulated VRAM, forcing our texture handling system to continually reload and unpack the textures. As you see, this is very computationally expensive if it happens too much.
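To make the check/unpack/cache cycle described above a bit more concrete, here is a rough C++ sketch. It is not DeSmuME's actual code. The names (EmulatedTexture, TextureCache, reloads) are made up for illustration, it assumes a 16-color paletted texture with the low nibble holding the first texel, and it uses a hash of the raw bytes as the change check, which is only one possible way to detect changes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for one texture's data in emulated VRAM.
struct EmulatedTexture {
    std::vector<uint8_t>  texels;   // packed 4-bit palette indices, two per byte
    std::vector<uint16_t> palette;  // 16 entries of 15-bit color
};

// FNV-1a, used only as a cheap "did the bytes change?" signature (an assumption).
inline uint64_t fnv1a(const void* data, size_t size, uint64_t h) {
    const uint8_t* p = static_cast<const uint8_t*>(data);
    for (size_t i = 0; i < size; ++i) { h ^= p[i]; h *= 0x100000001b3ULL; }
    return h;
}

// Expand a 15-bit color (5 bits per channel, red in the low bits) to 32-bit RGBA.
inline uint32_t expand555(uint16_t c) {
    uint32_t r = c & 0x1F, g = (c >> 5) & 0x1F, b = (c >> 10) & 0x1F;
    r = (r << 3) | (r >> 2);
    g = (g << 3) | (g >> 2);
    b = (b << 3) | (b >> 2);
    return r | (g << 8) | (b << 16) | 0xFF000000u;
}

class TextureCache {
public:
    int reloads = 0;  // how many times a texture had to be unpacked

    // Returns the unpacked texture, re-unpacking only if texels or palette changed.
    const std::vector<uint32_t>& get(uint32_t key, const EmulatedTexture& src) {
        assert(src.palette.size() >= 16);
        uint64_t sig = fnv1a(src.texels.data(), src.texels.size(),
                             0xcbf29ce484222325ULL);
        sig = fnv1a(src.palette.data(),
                    src.palette.size() * sizeof(uint16_t), sig);

        Entry& e = entries_[key];
        if (!e.valid || e.signature != sig) {
            // Cache miss or stale entry: this is the expensive path.
            e.pixels.resize(src.texels.size() * 2);
            for (size_t i = 0; i < src.texels.size(); ++i) {
                e.pixels[2 * i]     = expand555(src.palette[src.texels[i] & 0x0F]);
                e.pixels[2 * i + 1] = expand555(src.palette[src.texels[i] >> 4]);
            }
            e.signature = sig;
            e.valid = true;
            ++reloads;
        }
        return e.pixels;
    }

private:
    struct Entry {
        std::vector<uint32_t> pixels;  // unpacked copy kept in host memory
        uint64_t signature = 0;
        bool valid = false;
    };
    std::unordered_map<uint32_t, Entry> entries_;
};
```

If a game rewrites the texels or the palette every frame, every call to get() takes the expensive path and reloads climbs once per texture per frame. That is the thrashing. With upscaling or a GPU upload added to that path, each reload only gets more expensive.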
The true solution is to eliminate the texture handling system altogether and sample the raw data directly from emulated VRAM. Since there would be no load/unpack process, we can avoid texture thrashing completely. However, keep in mind that the current texture handling system is a major optimization in a lot of ways -- by caching the unpacked texture data in a preferred format, it speeds up texture sampling immensely. Unpacking the texture data also allows us to perform whatever post-processing we want on it, such as texture upscaling. If we sample the raw data directly, then additional post-processing features, such as texture upscaling, are no longer possible.
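As a rough illustration of that tradeoff, here is what the two sampling paths might look like in C++. This is only a sketch under assumptions: RawTexture4bpp, sampleDirect and sampleCached are hypothetical names, the texture is assumed to be 16-color paletted with the low nibble first, and none of this is taken from DeSmuME's source.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical view of a raw 16-color texture as it might sit in emulated VRAM.
struct RawTexture4bpp {
    const uint8_t*  texels;   // packed 4-bit indices, two texels per byte
    const uint16_t* palette;  // 16 entries of 15-bit color
    size_t width, height;
};

// Expand a 15-bit color (5 bits per channel, red in the low bits) to 32-bit RGBA.
inline uint32_t expand555(uint16_t c) {
    uint32_t r = c & 0x1F, g = (c >> 5) & 0x1F, b = (c >> 10) & 0x1F;
    r = (r << 3) | (r >> 2);
    g = (g << 3) | (g >> 2);
    b = (b << 3) | (b >> 2);
    return r | (g << 8) | (b << 16) | 0xFF000000u;
}

// Direct sampling: decode on every single fetch. Nothing is cached,
// so there is nothing to invalidate when VRAM changes.
inline uint32_t sampleDirect(const RawTexture4bpp& t, size_t x, size_t y) {
    assert(x < t.width && y < t.height);
    size_t i = y * t.width + x;
    uint8_t packed = t.texels[i / 2];
    uint8_t index = (i & 1) ? (packed >> 4) : (packed & 0x0F);
    return expand555(t.palette[index]);
}

// Cached sampling: one array read from a texture unpacked ahead of time.
inline uint32_t sampleCached(const std::vector<uint32_t>& unpacked,
                             size_t width, size_t x, size_t y) {
    return unpacked[y * width + x];
}
```

sampleDirect always sees the current VRAM contents, but it pays for the index extraction, palette lookup and color expansion on every texel fetch, and there is no unpacked image left around to upscale. sampleCached is a single read, which is why the cache is such a win when textures stay put.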
I don't feel like throwing out the current texture handling system is the way to go, as the current benefits greatly outweigh any issues with slowdown due to texture thrashing. However, it may be possible to do some hand-coded SSE2 optimizations on the texture unpacking loops in order to make texture thrashing a little less painful.
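For a feel of what that kind of SSE2 work could look like, here is a small sketch that converts 15-bit colors to 32-bit RGBA eight at a time. It assumes the unpacking loop in question is a plain 15-bit to 32-bit color expansion with red in the low bits; the real loops in DeSmuME may well look different, and unpackScalar and unpackFast are names made up for this example. On builds without SSE2 it simply falls back to the scalar loop.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>
#define TEXUNPACK_HAVE_SSE2 1
#endif

// Scalar reference: expand one 15-bit color to 32-bit RGBA.
inline uint32_t expand555(uint16_t c) {
    uint32_t r = c & 0x1F, g = (c >> 5) & 0x1F, b = (c >> 10) & 0x1F;
    r = (r << 3) | (r >> 2);
    g = (g << 3) | (g >> 2);
    b = (b << 3) | (b >> 2);
    return r | (g << 8) | (b << 16) | 0xFF000000u;
}

inline void unpackScalar(const uint16_t* src, uint32_t* dst, size_t count) {
    for (size_t i = 0; i < count; ++i) dst[i] = expand555(src[i]);
}

// Same result as unpackScalar, but processes 8 colors per iteration with SSE2.
inline void unpackFast(const uint16_t* src, uint32_t* dst, size_t count) {
    assert(count == 0 || (src != nullptr && dst != nullptr));
    size_t i = 0;
#ifdef TEXUNPACK_HAVE_SSE2
    const __m128i mask5 = _mm_set1_epi16(0x1F);
    const __m128i alpha = _mm_set1_epi16(static_cast<short>(0xFF00));
    for (; i + 8 <= count; i += 8) {
        __m128i c = _mm_loadu_si128(reinterpret_cast<const __m128i*>(src + i));
        // Isolate each 5-bit channel in its own set of 16-bit lanes.
        __m128i r = _mm_and_si128(c, mask5);
        __m128i g = _mm_and_si128(_mm_srli_epi16(c, 5), mask5);
        __m128i b = _mm_and_si128(_mm_srli_epi16(c, 10), mask5);
        // Expand 5 bits to 8 bits: (v << 3) | (v >> 2).
        r = _mm_or_si128(_mm_slli_epi16(r, 3), _mm_srli_epi16(r, 2));
        g = _mm_or_si128(_mm_slli_epi16(g, 3), _mm_srli_epi16(g, 2));
        b = _mm_or_si128(_mm_slli_epi16(b, 3), _mm_srli_epi16(b, 2));
        // Pack into byte pairs (R,G) and (B,A), then interleave to R,G,B,A.
        __m128i rg = _mm_or_si128(r, _mm_slli_epi16(g, 8));
        __m128i ba = _mm_or_si128(b, alpha);
        _mm_storeu_si128(reinterpret_cast<__m128i*>(dst + i),
                         _mm_unpacklo_epi16(rg, ba));
        _mm_storeu_si128(reinterpret_cast<__m128i*>(dst + i + 4),
                         _mm_unpackhi_epi16(rg, ba));
    }
#endif
    // Leftover colors (or everything, without SSE2) go through the scalar path.
    unpackScalar(src + i, dst + i, count - i);
}
```

This doesn't remove the thrashing, it only makes each reload cheaper, so how much it helps depends on how much of the reload time is actually spent in loops like this one. That would need profiling to confirm.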