Dr. Venom, what fd value are you using? I believe that with the new fd implementation it'll be beneficial to increase fd as much as possible. The wait for vblank is a tight loop that'll eat all cpu cycles, so the shorter the time we are in it, the more idle cpu cycles there'll be for other stuff. Just in case.
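To put rough numbers on that reasoning, here is a minimal Python sketch. It assumes that a frame delay of N (0 to 9) postpones emulation by N tenths of the frame period, that this postponement is idle time, and that whatever is left after emulating the frame is spent in the vblank wait loop. The function name and the `emulation_ms` figure are hypothetical, not taken from the build discussed here.

```python
def vblank_wait_ms(refresh_hz: float, frame_delay: int, emulation_ms: float) -> float:
    """Estimate the time spent busy-waiting for vblank in one frame.

    Assumption: frame_delay N (0-9) delays emulation by N/10 of the
    frame period, and the delay itself is idle (not a busy loop).
    emulation_ms is a hypothetical cost of emulating one frame.
    """
    if not 0 <= frame_delay <= 9:
        raise ValueError("frame_delay must be between 0 and 9")
    period_ms = 1000.0 / refresh_hz
    delay_ms = period_ms * frame_delay / 10.0
    # Whatever remains after the delay and the emulation is the wait loop.
    remaining = period_ms - delay_ms - emulation_ms
    return max(remaining, 0.0)


if __name__ == "__main__":
    for fd in (2, 8):
        wait = vblank_wait_ms(refresh_hz=60.0, frame_delay=fd, emulation_ms=2.0)
        print(f"fd{fd}: about {wait:.2f} ms per frame in the vblank wait loop")
```

Under these assumptions a higher fd directly shrinks the busy-wait portion of each frame, which is the effect described above.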
I was using fd2, purely for testing purposes, as I was still under the impression that fd negatively affected the ASIO latency (as it did with the earlier builds). With the latest release I see this is no longer the case, on the contrary even.
Edit just one more time:
I've uploaded the build used for these tests to here https://mega.nz/#!jsREGTKT!v0lQ2TUziRnFrOpzEn5VuYo3i5G8ShFKnXNjlTG7i1w , but it's really, really broken and should only be used to check my statements above. Also, with this particular build, asio_holdoff works the other way around, and should be set to 672 with a latency setting of 48. Also I've attached the Octave script used for calculating mean/max etc.
Thanks for posting all of your test results. After testing this release, I can say one thing: truly awesome performance indeed! The Genesis driver now runs with fd8 (!) and a ~96 samples buffer (!!) consistently without any issues in D3D! I've attached a picture with the Octave profile*, see:
genesis-d3d_fd8-2x48-mt.png
* Note that I'm deleting the very first unstable values from the log, to get the scale as small as possible for these comparison purposes.
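For anyone without Octave, the same kind of summary (mean, max and so on, with the first unstable values dropped) can be sketched in Python. This is not the attached script. The log format is not shown in this thread, so the sketch simply takes a list of buffer-fill values, and the `skip` parameter is a hypothetical name for the number of warm-up values to discard.

```python
import statistics


def summarize_buffer_log(samples, skip=0):
    """Return mean/max/min/stdev of buffer-fill values.

    skip: number of initial (unstable) values to drop before
    computing the statistics. Hypothetical parameter name.
    """
    trimmed = list(samples)[skip:]
    if not trimmed:
        raise ValueError("no samples left after trimming")
    return {
        "mean": statistics.fmean(trimmed),
        "max": max(trimmed),
        "min": min(trimmed),
        "stdev": statistics.pstdev(trimmed),  # population standard deviation
    }


if __name__ == "__main__":
    # Made-up values for illustration only, not measured data.
    log = [500, 300, 96, 100, 92, 96]
    print(summarize_buffer_log(log, skip=2))
```

Dropping the warm-up values keeps the max and the plot scale from being dominated by the startup transient.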
Even more interesting is that at these settings, when it comes to the "screenswitching" done by Mega Turrican (using super resolution 1280x0), d3d is clearly outperforming ddraw. Apparently ddraw is not as quick doing these x4 and x5 width switches as d3d, resulting in a far less stable soundbuffer at these switching moments. You can see this by comparing the d3d picture with the large spikes at the switching moments in the audio buffer for ddraw, as shown in
genesis-ddraw_fd8-2x48-mt.png
Note that I use a trick here to get the game to switch screenmode more often. If you'd like to replicate that:
when the intro runs press fire -> title screen (switch)-> wait for about 10 seconds -> level 1 starts to demo (switch)
-> press fire -> title screen (switch)-> wait for about 10 seconds -> level 2 starts to demo (switch)
-> rinse and repeat, and you'll get level 3 and level 4 demo'd also (enjoy the great Chris Huelsbeck level 3 tune!).
So within a test of 2 minutes you can get it to generate a number of screenswitches. These are exactly the big spikes you're seeing in the ddraw graph genesis-ddraw_fd8-2x48-mt.png. To make this even more convincing just compare it to the graph
genesis-ddraw_fd8-2x48-mt_v2.png, where I let the whole intro run (so you only have one screenswitch in the beginning). This also shows in the graph, as there's only one spike in the audio buffer in the beginning, but for the rest it's smooth.
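If you want to count the spikes rather than eyeball them in the graphs, a rough Python sketch is below. It assumes the buffer log is available as a plain list of values and treats any run of consecutive values that strays more than `threshold` from the median as one spike. Both the function name and the threshold are hypothetical choices, not something taken from the build or the Octave script.

```python
import statistics


def count_spikes(samples, threshold):
    """Count runs of values deviating from the median by more than threshold.

    Consecutive out-of-range values are counted as a single spike, so one
    screen switch that disturbs several log entries is counted once.
    """
    samples = list(samples)
    if not samples:
        return 0
    baseline = statistics.median(samples)
    spikes = 0
    in_spike = False
    for value in samples:
        outside = abs(value - baseline) > threshold
        if outside and not in_spike:
            spikes += 1
        in_spike = outside
    return spikes


if __name__ == "__main__":
    # Made-up values for illustration only, not measured data.
    log = [96, 96, 300, 310, 96, 96, 95, 280, 96]
    print(count_spikes(log, threshold=50))
```

Comparing the count against the number of screen switches you triggered by hand is a quick check that the spikes really line up with the mode changes.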
Verdict: d3d clearly outperforms ddraw in this test. Add these to your extensive set of d3d/ddraw comparisons, and (at least to me) it seems clear that D3D is the winner with this latest build. I think Calamity will be happy now.
I've also attached a run of 1944 using framedelay 8 and the same insanely small soundbuffer. It runs completely without issues, see
1944-d3d_fd8-2x48-mt.png
With regard to using multithreading or not, I can confirm your test results: there doesn't seem to be any adverse effect from using it, at least when it comes to ASIO stability. My preference now is to actually keep it on by default. Calamity, I guess you may not need to change the multithreading implementation given these test results?
Also, I think now is the time to get a better understanding of how the sound system works, since I think there's intermediary buffering going on, which may or may not be driver-dependent. I find it difficult to believe that the actual pbobble hardware would buffer as many samples as shown in the topmost plot in this post http://forum.arcadecontrols.com/index.php/topic,141869.msg1469093.html#msg1469093, however actual hardware tests would need to be conducted. I've got a NeoGeo sitting here that I haven't hooked up yet, to use for comparison (though not with pbobble).
I would love to see the real hardware compared to the current implementation and see where we really are in the total latency chain now. Hopefully you get around to doing these tests.
Lastly, could you possibly explain how the current ASIO implementation keeps the audio buffer stable? Does this affect sound quality in any way?