#1 2011-01-22 05:01:41

swiftless
Member
Registered: 2011-01-22
Posts: 5

OpenGL Integration?

Hey everyone,

I've been interested in the DeSmuME project ever since I found out it works on OS X. Looking at the source code, however, it appears to use a drawing library that doesn't support hardware acceleration. If my memory is correct, it was a C++ library of some sort last time I checked.

I'm not sure about the development progress of DeSmuME, or whether it is even still being developed, but I was wondering if there are any plans to integrate OpenGL. The big kick in the guts is the rendering stage on my MacBook. However, I haven't had an in-depth look at the code, just a peek at this stage.

I might have a go at optimising it myself for OSX if there aren't any plans at this stage.

Cheers,
Swiftless

Offline

#2 2011-01-22 06:04:35

zeromus
Radical Ninja
Registered: 2009-01-05
Posts: 6,169

Re: OpenGL Integration?

it's antigrain. it's only used for the gui and even then barely (if at all) in the unix ports. if you can't find oglrender.cpp that is already in the codebase, then you're in the wrong forum.

if you want to optimize it for osx i would start by finding out what libraries it uses and which ones are impacting the speed (hint: it's probably none of them)

if you still want to optimize it for osx i would start by making the pthreads code work for the multithreaded software rasterizer, which does a better job than opengl and runs faster in many cases. further optimizations would involve gcc versions of the SSE and SSE2 optimized functions, as well as work on the cocoa or wx ports, since the GTK osx port sucks. nobody plans to do any of this, except for some people who occasionally talk about working on the wx port a little bit. the cocoa port should be deleted, it is quite out of date and likely always to be that way.

Offline

#3 2011-01-22 06:43:32

swiftless
Member
Registered: 2011-01-22
Posts: 5

Re: OpenGL Integration?

Hey Zeromus,

Thanks a lot for your response. I had mentioned I only took a quick look, so I probably saw the aggdraw and gfx3d files and assumed these were all that it was using.

Having a look at the OpenGL section of the code, I see for example it uses the fixed function pipeline for rendering triangles and even though it says in the comments that it is using display lists, it simply isn't (maybe a mix-up in terminology in the comments?). This is exactly the kind of thing I was looking at finding and optimising.

However if I can't speed this up, I might as well take a look at the pthreads integration. That shouldn't be too hard.

Offline

#4 2011-01-22 07:25:43

zeromus
Radical Ninja
Registered: 2009-01-05
Posts: 6,169

Re: OpenGL Integration?

There is a mix-up in your interpretation of the terminology. The NDS produces a display list. It is a list of things. To be displayed. That's what gets drawn.

The opengl isn't slow enough for any of that to matter. NDS scenes are trivial for modern hardware. The bulk of the cost comes from the framebuffer fetch.  The individual glVertex commands don't help. Vertex batch(es) would help--though noticeably?? I doubt it. They need to be rendered in a certain order, and they need to switch between triangles and quads, so you can't necessarily batch them up very well. You're wasting your time though, this is all speedup chump change.
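The ordering constraint can be illustrated with a minimal Python sketch. This is not DeSmuME code: the `(kind, id)` tuples and the `batch_in_order` helper are hypothetical. It merges only consecutive polygons of the same primitive type, so draw order is preserved and every switch between triangles and quads starts a new batch.

```python
from itertools import groupby


def batch_in_order(polys):
    """Group consecutive polygons of the same primitive type.

    `polys` is a list of (kind, poly_id) tuples in required draw order.
    Only neighbours are merged, so the draw order never changes.
    """
    batches = []
    for kind, group in groupby(polys, key=lambda poly: poly[0]):
        batches.append((kind, [poly_id for _, poly_id in group]))
    return batches


# Hypothetical scene: the primitive type changes three times.
scene = [("tri", 0), ("tri", 1), ("quad", 2), ("tri", 3), ("quad", 4), ("quad", 5)]
for kind, ids in batch_in_order(scene):
    print(kind, ids)
```

Six polygons still need four batches here, because each change of primitive type breaks the run. A scene that alternates types on every polygon would not batch at all.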

You may find some bug in the use of opengl on OSX or some suboptimality in how GTK or SDL interact with it that provide a speedup with regards to the framebuffer fetch. I don't know how carefully that has been addressed. I suspect not much. But even in the best case we'd be talking 2fps probably.

The opengl renderer is not dominated by speed concerns. It is dominated by compatibility concerns. Maybe you can find some things to speed up (very slightly) in it, but we won't accept quality penalties, and we haven't exactly ignored its speed.

Offline

#5 2011-01-22 07:48:36

swiftless
Member
Registered: 2011-01-22
Posts: 5

Re: OpenGL Integration?

I assumed that was the meaning, it just seemed out of place with display lists being a cache of fixed function commands in OpenGL.

Vertex batching is what I am thinking at the moment. Vertex Buffer Objects can provide drastic improvements: usually at least 200% in my personal projects, and over 1000% in some cases. Even several different batches are faster than no vertex batches. Don't get me wrong, this is where I am going to start as it's my specialty, but I plan to look at a lot more things to change. Also, the code I have only renders triangles; the quad code is commented out, so I assumed gfx3d was already converting the quads to triangles.

You've got me curious about the framebuffers now. From what I saw in the code, I doubt asynchronous readbacks from the GPU are being used; they can significantly increase performance when used correctly.

I don't plan on sacrificing quality :)

Offline

#6 2011-01-22 09:33:24

zeromus
Radical Ninja
Registered: 2009-01-05
Posts: 6,169

Re: OpenGL Integration?

you must be looking at old code. I have no clue what the code on date X known only to you looks like.

a 1000% speedup of something that is 1% of the entire workload only gains you 0.999% of the entire workload. The entire 3d workload is some fraction of the entire emulation workload. Maybe 20%, maybe 50%. I'll be generous and call it 50%. Then, good luck fighting for that 0.4995% fraction of an FPS boost.
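The arithmetic here is Amdahl's law. A minimal sketch in Python follows, assuming the quoted figures treat "1000%" as a 1000x speedup, which is what reproduces 0.999% and 0.4995%. A literal 10x speedup of the same 1% slice would save about 0.9%. The function name `overall_gain` is made up for illustration.

```python
def overall_gain(fraction, speedup):
    """Share of total runtime saved when `fraction` of the work
    becomes `speedup` times faster (Amdahl's law)."""
    return fraction * (1.0 - 1.0 / speedup)


# 1% of the workload made 1000x faster.
print(f"{overall_gain(0.01, 1000.0):.4%}")

# The same slice when 3D is only 50% of the whole emulation workload.
print(f"{overall_gain(0.01 * 0.5, 1000.0):.4%}")

# A literal 10x speedup of the 1% slice, for comparison.
print(f"{overall_gain(0.01, 10.0):.4%}")
```

However large the speedup factor gets, the saving can never exceed the fraction of time the optimised part took in the first place.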

Asynchronous readbacks don't work when you need the results immediately. To know why you need the results immediately, you need to know things about the NDS and how we choose to emulate it. Although I think we are supposed to be displaying 3d a frame later than we are.

Ok, now that I'm through exaggerating:

I commented out glBegin, glVertex, glViewport, etc. and moved a 66 fps scene to 69.
I commented out glReadPixels and it moved from 66 to 71.
For control, I disabled 3d rendering altogether and it moved from 66 to 73.
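Those fps figures are easier to compare as frame time. The sketch below is in Python and only uses the numbers quoted above. It assumes each change removes a fixed cost per frame and nothing else, which is a simplification, and the helper names are hypothetical.

```python
def frame_ms(fps):
    """Milliseconds spent per frame at a given frame rate."""
    return 1000.0 / fps


def share_of_frame(base_fps, new_fps):
    """Fraction of the baseline frame time saved by going from base_fps to new_fps."""
    return (frame_ms(base_fps) - frame_ms(new_fps)) / frame_ms(base_fps)


BASELINE_FPS = 66
experiments = {
    "draw calls removed (glBegin, glVertex, ...)": 69,
    "glReadPixels removed": 71,
    "3d rendering disabled": 73,
}

for name, fps in experiments.items():
    saved_ms = frame_ms(BASELINE_FPS) - frame_ms(fps)
    share = share_of_frame(BASELINE_FPS, fps)
    print(f"{name}: {saved_ms:.2f} ms saved, {share:.1%} of the frame")
```

Under that assumption the three cases come to roughly 4%, 7% and 10% of the baseline frame time, so the readback costs more than the draw calls, and all of 3D is about a tenth of the frame in this scene.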

Now, something else you need to consider: opengl is for old, crappy computers. New computers should all be using software rasterizer anyway. If you use newer opengl features than what we're already using, someone may arbitrarily decide not to check it in, because breaking opengl for old computers is a bad plan. Not to mention, touching the opengl code is portability trouble.

Offline

#7 2011-01-22 10:37:24

swiftless
Member
Registered: 2011-01-22
Posts: 5

Re: OpenGL Integration?

Oh ok, I downloaded the 0.9.6 source on the download page, modified 25th of March last year I believe. I'll download the svn as I just found that :)

Asynchronous readbacks give the results immediately, unlike synchronous which waits. In OpenGL you can get the framebuffer back instantaneously if you know how: it actually makes the glReadPixels call asynchronous, to be exact, and you can also map that directly to memory on the GPU. The risk is that it is so instantaneous that you can get back an incomplete buffer if your OpenGL calls have not completed at that stage, which you can guard against anyway.

A 66fps to 73fps boost isn't bad in my opinion. On my MacBook Pro I get about 40-50fps at times, and an extra 7 gets you that much closer to 60, although I doubt I would see that increase on my system; maybe a 4fps increase if I'm lucky.

I don't want to start a flame war. I'm here because I'm interested in DeSmuME and am keen to do a little to help. But OpenGL is far from old and in no way meant for crappy computers. I'm a computer graphics developer primarily, and I work extensively with OpenGL on performance-critical applications. That's like saying Direct3D is dead and should be replaced with a software rasterizer, which in turn makes all GPUs redundant. Nothing beats GPU acceleration in my opinion and OpenGL code itself is 100% portable; if not, I would be in a world of pain :P

Offline

#8 2011-01-22 19:44:58

zeromus
Radical Ninja
Registered: 2009-01-05
Posts: 6,169

Re: OpenGL Integration?

We mean it for crappy computers and crappy video game consoles which are too slow to run the software rasterizer. It is inferior in desmume to the software rasterizer which is more accurate and quite nearly as fast (faster in some cases).  I'm not making some grand sweeping statement about the usefulness of opengl in every context to you and to me and to every single other person without exception. That seems to be your department.

Here is a grand, sweeping statement: opengl code is not 100% portable, you live in a fantasy world.

You might find that it's the pixel format conversion in the framebuffer-fetch step that is slow. I think we have to do that there because we don't know exactly what format the opengl device rendered.
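To give a sense of what a per-pixel conversion involves, here is a minimal Python sketch. The formats are assumptions chosen for illustration, not taken from the DeSmuME source: the input is 8-bit RGBA as glReadPixels can return it, and the output is 15-bit colour with red in the low five bits, the layout the NDS uses. The function name is hypothetical.

```python
def rgba8888_to_rgb555(pixels):
    """Convert packed RGBA bytes (4 per pixel) to 15-bit colour values.

    Output layout: bits 0-4 red, bits 5-9 green, bits 10-14 blue.
    Alpha is dropped and each channel keeps its top 5 bits.
    """
    out = []
    for i in range(0, len(pixels), 4):
        r, g, b = pixels[i], pixels[i + 1], pixels[i + 2]
        out.append((r >> 3) | ((g >> 3) << 5) | ((b >> 3) << 10))
    return out


# Two pixels: opaque pure red, then opaque white.
frame = bytes([255, 0, 0, 255, 255, 255, 255, 255])
print([hex(c) for c in rgba8888_to_rgb555(frame)])
```

A 256x192 screen means 49,152 of these shift-and-mask steps per frame on top of the readback itself, which is why the conversion is worth measuring separately from glReadPixels.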

Offline

#9 2011-01-22 23:15:25

swiftless
Member
Registered: 2011-01-22
Posts: 5

Re: OpenGL Integration?

Ah. That makes a lot more sense. Definitely interpreted that wrong. I get touchy when it sounds like people bag the main library I've used for the last 5 years. Although what you say holds solid for me wanting to boost the speed on my laptop. It doesn't have the grunt of my desktop, hence the GPU acceleration.

I'll give you half of that statement though. OpenGL code is perfectly portable, but limited by the quality of the implementation. For example, I've seen a few bugs in Apple's implementation over the years. Luckily DeSmuME uses it fairly little, meaning I shouldn't run into any problems if it's all basic OpenGL. Most of the issues I've seen are with rarely used methods.

I saw that code as well, you can usually tell OpenGL to store framebuffers in a specific format, but now that I think about it, I didn't see actual OpenGL framebuffer object code being used so I'll double check that and start there. Thanks!

Offline
