#1 2024-07-27 03:56:00

Cyanea
Member
Registered: 2024-07-27
Posts: 1

What is the optimum sample rate when re-encoding audio for DS games?

Hi, wasn't too sure whether to stick this in Support or General, so feel free to move if necessary.

I'm currently working on a mod/romhack for FFIV DS that aims to replace the low-quality audio used for the voice lines with higher-quality encodes.
The source for these encodes is the PC version. I already know how to extract the PC files and encode them for the DS version (and the results look quite promising).
But I'm unsure what the best sample rate would be for the newly encoded files. I hope I can find the answer here, since DeSmuME would be my preferred way to play the game.

Some details first:
- The PC version uses 44100Hz sample rate for voices
- The DS original uses 16384Hz sample rate for voices
- (As a side note, DS intro cinematic uses 32000Hz for its music)

From my research it seems like (please correct me if wrong) :
- real DS hardware outputs audio at 32768Hz
- DeSmuME's emulated system outputs audio at 44100Hz
- MelonDS's emulated system outputs audio at 32768Hz

Subjective tests by me listening to the replaced voices at various sample rates:
- No audible difference between 32000Hz and 32768Hz on either emulator
- DeSmuME sounds better than MelonDS (and also best overall) IF using 44100Hz encodes
- in MelonDS 44100Hz encodes sound worse than 32768Hz encodes
- MelonDS sounds better than DeSmuME IF using 32768/32000Hz encodes

I disabled interpolation in both emulators before testing, obviously.
Do these observations make sense from a technical point of view? Because if so, then it seems like it'd be best if I created multiple versions of my mod (1 for each emulator/sample rate)?
What about playback on real hardware though? Going to assume that the best sample rate in that case would be 32768Hz?

And what happens if DeSmuME or another emulator decides to change their audio subsystem to reproduce audio at the host system's rate (e.g. 48000Hz)?
Would a 32000Hz encode actually be preferable in that case since going from 32000Hz to 48000Hz should be less prone to (audible) errors than 44100Hz to 48000Hz?
Guess that last one would depend on what kind of resampler the emulator uses.
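
For reference, the rate ratios in question can be reduced to lowest terms with a few lines of Python. This is only arithmetic on the rates listed above; the function name is made up for illustration, and it assumes nothing about how any emulator actually resamples. A simpler ratio does not by itself guarantee a better-sounding result.

```python
from fractions import Fraction

def resample_ratio(src_rate, dst_rate):
    """Output samples per input sample, reduced to lowest terms."""
    return Fraction(dst_rate, src_rate)

# Hypothetical host rate of 48000 Hz, as in the question above.
for src in (32000, 32768, 44100):
    ratio = resample_ratio(src, 48000)
    print(f"{src} Hz -> 48000 Hz: {ratio.numerator}/{ratio.denominator}")
```

32000 Hz to 48000 Hz reduces to 3/2, while 44100 Hz to 48000 Hz reduces to 160/147.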


#2 2024-07-28 09:48:50

zeromus
Radical Ninja
Registered: 2009-01-05
Posts: 6,211

Re: What is the optimum sample rate when re-encoding audio for DS games?

You shouldn't turn off interpolation: no workstation outputs at "32768 Hz", so resampling must occur. melonDS isn't outputting at 32768 Hz, or if it is, it's relying on your OS to resample to 44100 or 48000 internally, and you can't control that interpolation (so it's not a fair comparison between the emulators).

No situation should be prone to any audible errors when interpolation is used as intended.

If you could pick timer values that yielded exactly 44100 or 48000 then the samples could play 1:1 on an emulator, but it is impossible to pick such timer values. The timing is based on dividing either the ugly value of 33513982 (as DeSmuME does) or the prettier value of 33554432 (0x2000000) (which is defensible but not what we do). Neither of those has 44100 or 48000 as a divisor. You can only play, for instance, at 33513982 / 759 ~= 44155.444 Hz. Consequently, interpolation must occur under all circumstances.
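
The divisibility claim is easy to check with a short Python sketch. It uses only the two clock values quoted above; the function names are made up for illustration, and it treats the rate as plain clock / integer divider, ignoring any extra factor of 2 the real register configuration may involve.

```python
# Clock values quoted in the post above; names are illustrative only.
CLOCK_DESMUME = 33513982   # the value the post says DeSmuME divides
CLOCK_POW2 = 33554432      # 0x2000000

def exact_divider(clock, rate):
    """Return the integer divider that yields exactly `rate`, or None."""
    return clock // rate if clock % rate == 0 else None

def rate_for(clock, divider):
    """Playback rate in Hz for a given integer divider."""
    return clock / divider

def nearest_divider(clock, rate):
    """Pick whichever neighbouring integer divider lands closest to `rate`."""
    lo = clock // rate
    return min((lo, lo + 1), key=lambda d: abs(rate_for(clock, d) - rate))

for clock in (CLOCK_DESMUME, CLOCK_POW2):
    for rate in (44100, 48000):
        d = nearest_divider(clock, rate)
        print(clock, rate, exact_divider(clock, rate), d, rate_for(clock, d))
```

For both clocks and both rates `exact_divider` returns `None`, so an integer divider can only get close to 44100 or 48000, never hit them exactly.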

Personally, I'd just use the unmodified 44100 Hz sounds from the PC version, for both emulators. Once you redo your listening test with interpolation enabled, I doubt you will hear much difference. Of course melonDS may sound different from system to system, since it relies on the OS to do the resampling work, so that's an impossible goal to chase.

Now as for 1:1 results on the device itself: this is the time to say I believe all the extant documentation is wrong. The audio is NOT actually 32768 Hz. It is simply CLOCK/1024: 32728.49.. or thereabouts, depending on the oscillator. What you want to do is set the timer so that everything divides evenly. In other words, set it to divide by 1024 (give or take a factor of 2 in the register configuration). This way, every input sample from the voice will be played for exactly one "output sample". In this case I would downsample the 44100 Hz audio from the PC game to 32728 Hz using the best tool you have; this is simply for the sake of making the pitch match expectations once played 1:1. In terms of fidelity, choosing that channel clock divider will make sure whatever waveform you have is played in a relatively comprehensible way. That said, I doubt anyone could hear any minor discrepancies on the NDS, even with headphones, after the signal has gone through the system's other strange audio circuitry, which makes it all harsh and cheap sounding.
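
A rough Python sketch of the numbers involved, assuming the 33513982 clock value mentioned earlier (a real unit's oscillator will differ slightly) and a 44100 Hz source. The helper names are hypothetical; this does not do the resampling itself, it only works out the target rate, the resulting sample count, and how far off the pitch would be if a clip were encoded for 32768 Hz instead.

```python
import math
from fractions import Fraction

CLOCK = 33513982      # assumed clock; the true value depends on the oscillator
DIVIDER = 1024
SOURCE_RATE = 44100   # PC version voice files

native_rate = CLOCK / DIVIDER        # about 32728.498 Hz
target_rate = int(native_rate)       # 32728, the rate suggested above
resample_ratio = Fraction(target_rate, SOURCE_RATE)

def resampled_length(n_source_samples):
    """Number of samples a clip has after resampling to target_rate."""
    return round(n_source_samples * resample_ratio)

def pitch_error_cents(encoded_rate, playback_rate):
    """Pitch shift in cents when audio encoded at one rate plays at another."""
    return 1200 * math.log2(playback_rate / encoded_rate)

print(native_rate, resample_ratio, resampled_length(SOURCE_RATE))
print(pitch_error_cents(32768, native_rate))
```

Under these assumptions, a clip encoded for 32768 Hz but played at CLOCK/1024 comes out about 2 cents flat, which fits the doubt above that the discrepancy would be audible.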
