Hi Dr.Venom, long time no hear!
Yes, it's been a while. Good to see you around too.
I've improved the setup since the last post: I can now actuate the button deterministically at a predetermined time after vblank.
It's quite simple - an MCU (Teensy) is monitoring vblank (from the UMSA), and pulls the button low on the RAP for 125 ms, x ms after vblank. x is configurable to any delay. Runtime operation is monitored by the oscilloscope.
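To make the timing concrete, here is a rough sketch of the schedule in Python (a simulation, not the actual Teensy firmware). The 60 Hz refresh rate and the choice to fire on every 30th vblank are assumptions for illustration only; the 125 ms hold and the configurable delay x come from the description above.

```python
def schedule_actuations(vblank_times_ms, delay_ms, hold_ms=125.0, every_n=30):
    """Return (press, release) times in ms.

    The button is pressed `delay_ms` after every `every_n`-th vblank and
    held for `hold_ms`. `every_n` is a hypothetical parameter: the hold is
    much longer than one frame, so not every vblank can trigger a press.
    """
    events = []
    for index, vblank in enumerate(vblank_times_ms):
        if index % every_n != 0:
            continue  # skip vblanks that should not trigger a press
        press = vblank + delay_ms
        events.append((press, press + hold_ms))
    return events


FRAME_MS = 1000.0 / 60.0  # assumed 60 Hz display, not stated in the post
vblanks = [i * FRAME_MS for i in range(90)]

for press, release in schedule_actuations(vblanks, delay_ms=3.6):
    print(f"press at {press:8.2f} ms, release at {release:8.2f} ms")
```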
Other than that, the testing procedure is the same as in previous endeavours (240 fps camera + LED), so next-frame response still needs to be determined manually by looking at the video. It's quite easy with this setup though: since the button actuation interval is deterministic, a fixed number of video frames lies between actuations, which makes it possible to quickly fast-forward from one actuation to the next.
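As a small worked example of the frame counting, the sketch below converts durations into 240 fps camera frames. The 500 ms actuation period is a made-up value for illustration; only the 240 fps and the 125 ms hold are from the post.

```python
CAMERA_FPS = 240  # camera frame rate from the post


def camera_frames(duration_ms, fps=CAMERA_FPS):
    """Number of camera frames that span `duration_ms`."""
    return duration_ms * fps / 1000.0


print(camera_frames(125))   # the 125 ms button hold -> 30.0 camera frames
print(camera_frames(500))   # hypothetical 500 ms actuation period -> 120.0
print(1000.0 / CAMERA_FPS)  # time resolution of one camera frame in ms
```

So with a fixed actuation period you can skip ahead by the same number of camera frames each time and land on the next press.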
Ah now I understand, thanks for explaining. Great setup.
OK, so I did a few runs.
I set the button actuation to about 3.6 ms from vblank (window is about 1.9 ms), and set the polling rate to Default, 500 Hz and 1000 Hz using the SweetLow driver and logged the result. The timing can be seen in the attached image. I started counting from about frame 250 and onwards.
1000 Hz: 51 pass, 6 fail, 89% next frame response.
500 Hz: 21 pass, 36 fail, 37% next frame response.
Default: All fail, 0% next frame response.
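For reference, the percentages follow directly from the pass/fail counts. A quick check (the Default run is left out because its total count isn't given, only that every press failed):

```python
def next_frame_rate(passes, fails):
    """Percentage of presses that produced a next-frame response."""
    return 100.0 * passes / (passes + fails)


runs = {"1000 Hz": (51, 6), "500 Hz": (21, 36)}

for name, (passes, fails) in runs.items():
    rate = next_frame_rate(passes, fails)
    print(f"{name}: {rate:.0f}% next frame response over {passes + fails} presses")
```

Both runs cover 57 presses, so the two rates are directly comparable.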
So even though it's still unknown if the Hori in actuality handles 1000 Hz polling in a graceful manner, it performs a lot better with 1000 Hz polling set. With multiple button presses and the like, who knows. Maybe it'll choke.
I uploaded the videos here: mega.nz.
I use VLC to convert the videos to avi (MJPEG) and then VirtualDub with ffdshow to view them, which allows very fast scrubbing.
That's a pretty awesome finding. Thanks for sharing, I'll take a look at the videos later!
The below got a bit longer than I anticipated, but since it may have some relevance to the topic at hand...
Some time ago I did some more elaborate testing on the latency of WinUAE versus a real Amiga (which I still own). With the help of the WinUAE author, a small application for the Amiga side was made which lets you set the rasterline where the Amiga polls for input, and which then shows a color on screen after vsync when a button is pressed (Very tiny Amiga DOS program that takes rasterline number as input). I.e. you can have the Amiga side poll early or late in the frame and see how it affects the latency measurements on the Windows host.
One of the takeaways was that WinUAE, when compared like for like with the real hardware, shows a lag of 1.2 frames when the Amiga side acquires input early in the frame and about 1.8 frames when it acquires input late in the frame. WinUAE's D3D9Ex implementation as far as I know is quite similar to that of GM, the main difference being that WinUAE doesn't have a framedelay feature like GM.
I guess the above may have similar implications for GM: especially when there is "some" inherent lag on the host (USB stack?), this can become a factor.* (Framedelay cannot fully compensate, especially when the game polls late in the frame.) So depending on which rasterline an arcade game reads its input, it will be easier or more difficult to match the response with emulation. It may also imply that if we find next frame response for one game, it doesn't have to be the same for all games.
*If I remember correctly some tests done by Calamity way back suggested something like half a frame of inherent host lag (next frame response was only seen when LED active and rasterbeam had not yet crossed 1/3rd of the screen.. but correct me if I'm wrong..)
In case you're interested, a (very long) thread spawned from it on the EAB board:
Input latency measurements (and D3D11), see here:
http://eab.abime.net/showthread.php?t=88777.
The WinUAE testing brought up an interesting other point, which has to do with a Microsoft comment that mentioned that in most situations there is 1 frame of inherent video latency in Windows applications (this may be interesting to Calamity especially). With Windows 8.1 a new feature is introduced called "waitable swap chains" that has the potential to implement next frame response to input, see this post out of the earlier mentioned thread specifically:
http://eab.abime.net/showpost.php?p=1188236&postcount=19
How does waiting on the back buffer reduce latency?
With the flip model swap chain, back buffer "flips" are queued whenever your game calls IDXGISwapChain::Present. When the rendering loop calls Present(), the system blocks the thread until it is done presenting a prior frame, making room to queue up the new frame, before it actually presents. This causes extra latency between the time the game draws a frame and the time the system allows it to display that frame. In many cases, the system will reach a stable equilibrium where the game is always waiting almost a full extra frame between the time it renders and the time it presents each frame. It's better to wait until the system is ready to accept a new frame, then render the frame based on current data and queue the frame immediately.
It's part of the reason why the thread derailed into a DXGI/D3D11 thread, which Toni is currently implementing in WinUAE. Time will tell, once the low latency vsync stuff is implemented, whether it will result in shaving off another frame of latency compared to its (WinUAE's) D3D9Ex implementation.
Here's the descriptor from the Hori stick. The MSDN descriptor page, in conjunction with the information from USBView, seems to suggest that the device in fact uses a standard 8 ms polling rate (bInterval = 0x0A and Full Speed).
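For anyone wondering how bInterval = 0x0A (10) ends up as 8 ms: as I read the table on the MSDN USB_ENDPOINT_DESCRIPTOR page, Windows rounds the requested period of a full-speed interrupt endpoint down to a power of two, capped at 32 ms. The sketch below just encodes that reading; the function name is my own and the rounding rule is an assumption based on that page, not something I verified on the Hori itself.

```python
def effective_full_speed_interval_ms(b_interval):
    """Polling period in ms for a full-speed interrupt endpoint.

    Assumed rule: round bInterval (1..255) down to the nearest power of
    two, with a maximum of 32 ms.
    """
    if not 1 <= b_interval <= 255:
        raise ValueError("bInterval must be in 1..255 for full-speed interrupt endpoints")
    period = 1
    while period * 2 <= b_interval and period < 32:
        period *= 2
    return period


print(effective_full_speed_interval_ms(0x0A))  # the Hori's bInterval -> 8
print(effective_full_speed_interval_ms(1))     # a 1 ms gaming device -> 1
```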
Right, that's a configuration seen on many "normal" (non gaming) USB2 input devices.
I wasn't aware of the fact that modern devices actually set a faster polling rate. Maybe the proper thing to do would be to find an encoder that operates at 1 kHz, and not use the SweetLow hack!
Sure, that seems like the best solution. It's not easy to find though, as there is some nuance to modern devices setting a faster polling rate. It seems to be the modern (expensive) gaming gear that is almost without exception full speed with 1 ms polling (or configurable as 1, 2, 4 or 8 ms via a hardware switch, like e.g. my Corsair K70 gaming keyboard). At the other end is the category of modern "cheap" gear, like most of the gamepads and joystick / gamepad adapters from China, which sadly are mostly low speed with 8 ms polling, or worse...
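To put those intervals in perspective, here is a back-of-the-envelope sketch of how long a press waits for the next poll. It assumes the press happens at a random moment relative to the polling schedule and ignores everything else in the chain (device scan time, host stack); the 60 Hz frame time is also an assumption.

```python
def polling_delay_ms(interval_ms):
    """Return (average, worst-case) wait in ms until the next poll."""
    return interval_ms / 2.0, float(interval_ms)


FRAME_MS = 1000.0 / 60.0  # assumed 60 Hz display

for interval in (1, 2, 4, 8):
    average, worst = polling_delay_ms(interval)
    share = 100.0 * worst / FRAME_MS
    print(f"{interval} ms polling: avg {average:.1f} ms, "
          f"worst {worst:.1f} ms ({share:.0f}% of a frame)")
```

At 8 ms the worst case alone eats roughly half a 60 Hz frame, which is why the polling rate matters so much when chasing next frame response.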
I'll elaborate a bit on this, as we're all freaks, right?
This is irrefutably the case! Albeit in a good way.
Definitely. It's great to have people apply a little science to these topics which have been so obscure for many years.