
Topic: Successfully reducing MAME input lag via 120Hz monitor (applies to LCD and CRT)

mdrejhon

Hello,

Historically, MAME adds a bit of input lag compared to original machines, because MAME needs to build a complete framebuffer before outputting it to the display.   So you get a minimum of +16.7ms added input lag (1/60 second).   One big improvement that 120Hz provides is that this is cut to only 8.3ms.   This makes a big difference in games such as Street Fighter.
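
A quick sketch of the arithmetic behind those two figures, assuming the added lag is simply one full refresh period (the function name is just for illustration):

```python
def frame_period_ms(refresh_hz: float) -> float:
    """Time taken by one full refresh, in milliseconds."""
    return 1000.0 / refresh_hz

lag_60 = frame_period_ms(60)    # one frame at 60 Hz
lag_120 = frame_period_ms(120)  # one frame at 120 Hz

print(f"60 Hz: {lag_60:.1f} ms, 120 Hz: {lag_120:.1f} ms, saved: {lag_60 - lag_120:.1f} ms")
```

This prints 16.7 ms, 8.3 ms and a saving of 8.3 ms, which is the half-frame figure discussed in the rest of the thread.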

There are three ways to run MAME at 120Hz:
-- Use a 31.5kHz CRT, running at 240p @ 120Hz
-- Use a desktop CRT supporting 120Hz
-- Use a 120Hz LCD

However, to eliminate the double-frame effect at 120Hz, you need to add black-frame insertion (black frames between refreshes), which keeps the 60Hz CRT style effect at 120Hz.  It also reduces motion blur on LCDs -- see http://www.blurbusters.com/mame/

Here's a source code diff by cpharlock
Command Line: mame.exe romname -nomultithreading -nothrottle -video d3d -syncrefresh -strobe

But that is hacky.  Less input lag in MAME is always good.  Less input lag is not niche.
When will black-frame insertion become an official feature of the main MAME source code tree?

Thanks,
Mark Rejhon
« Last Edit: July 10, 2013, 11:53:11 am by mdrejhon »

jdubs

Curious about this.  Is there a list of CRTs that do 240p at 120hz that ALSO accept RGB input - arcade or desktop?

-Jim
« Last Edit: July 10, 2013, 07:32:04 pm by jdubs »

Silverwind

Why do you disable the multi-threading? (sorry, I have been out of the loop for a while)

Falkentyne

I think you get a flashing image or the speed is completely broken if you don't disable multithreading. 

jdubs

Super intrigued by this.  I don't think my Sony PVM will accomplish it....but which arcade or other CRTs can handle this signal?

-Jim

rCadeGaming

I'm really interested in trying this for MAME, but my problem is that I don't think I can find a monitor that works with both 120Hz 31kHz 240p AND normal 60Hz 15kHz 240p that I'll be getting from my consoles.  Any ideas?

jdubs

Quote from: rCadeGaming
> I'm really interested in trying this for MAME, but my problem is that I don't think I can find a monitor that works with both 120Hz 31kHz 240p AND normal 60Hz 15kHz 240p that I'll be getting from my consoles.  Any ideas?

This is precisely what I am after as well!  I think MAYBE the trimode Makvision 27 29?  At least the Makvision 2929D works....it's 31kHz only, though....

-Jim

mdrejhon

By the way, here's a web animation of software-based black-frame insertion.
(Blur Busters Motion Test website, requires Chrome, IE 10+ or FireFox 24+)

http://www.testufo.com/#test=blackframes

If you run at 60Hz, it will demonstrate 30fps + black frames insertion.
If you run at 120Hz, it will demonstrate 60fps + black frames insertion.
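
For anyone wondering what software black-frame insertion boils down to, here is a minimal sketch, assuming one black refresh is inserted after every game frame. The names (`insert_black_frames`, `visible_fps`, `BLACK`) are made up for illustration and are not taken from the MAME patch or the TestUFO page.

```python
from typing import List, Optional

BLACK = None  # stand-in for an all-black refresh

def insert_black_frames(game_frames: List[str]) -> List[Optional[str]]:
    """Interleave one black refresh after every game frame.

    N game frames become 2*N display refreshes.
    """
    refreshes: List[Optional[str]] = []
    for frame in game_frames:
        refreshes.append(frame)   # refresh 1: the real frame
        refreshes.append(BLACK)   # refresh 2: black
    return refreshes

def visible_fps(display_hz: float) -> float:
    """Game frames shown per second when every second refresh is black."""
    return display_hz / 2

print(insert_black_frames(["A", "B", "C"]))
print(visible_fps(60), visible_fps(120))
```

So a 60Hz display can only show 30fps plus black frames, while a 120Hz display shows the full 60fps plus black frames.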

Calamity

Hi Mark,

Quote from: mdrejhon
> Historically MAME adds a bit of input lag compared to original machines, because MAME needs to create a framebuffer before outputting to the display.   So you get a minimum added +16.7ms input lag (1/60second).   One big improvement that 120Hz provides, is the large reduction in input lag, down to only 8.3ms.   This creates a big improvement in games such as Street Fighter.

This is true, frame-based emulation is, by definition, one-frame laggier than the real thing. However this doesn't seem to be the most expensive source of lag: http://www.mameworld.info/ubbthreads/showthreaded.php?Cat=&Number=307291&page=&view=&sb=5&o=&fpart=1&vc=1

So running at 120 Hz can, in theory, remove half frame of input lag, but although it's definitely an additional benefit, I wouldn't call this a "large" reduction of lag that you're going to notice, as compared to the 3 frames of lag that may be introduced by undesired frame queues built in the drivers.

Quote
But that is hacky.  Less input lag in MAME is always good.  Less input lag is not niche.
When will black-frame insertion become an official feature of the main MAME source code tree?

In my opinion, it's unlikely that something like this will find its way into mainline MAME, though I might be wrong. It will indeed be added to the new Switchres patch (whenever it's finished), so it will be a feature of GroovyMAME. But, as you have probably found, the MAMEdevs will have a good time mocking this if you post at MAMEWorld.

Important note: posts reporting GM issues without a log will be IGNORED.
Steps to create a log:
 - From command line, run: groovymame.exe -v romname >romname.txt
 - Attach resulting romname.txt file to your post, instead of pasting it.

CRT Emudriver, VMMaker & Arcade OSD downloads, documentation and discussion:  Eiusdemmodi

mdrejhon

Quote from: Calamity
> So running at 120 Hz can, in theory, remove half frame of input lag, but although it's definitely an additional benefit, I wouldn't call this a "large" reduction of lag that you're going to notice, as compared to the 3 frames of lag that may be introduced by undesired frame queues built in the drivers.

Trust me, I didn't believe it until I owned a 120Hz computer monitor.  When I play my 60Hz games in *anything*, not just MAME, I actually notice the slight reduction in input lag.  8ms is a large difference when you're teetering between "can feel it" and "can't feel it"; Microsoft did some tests and found humans were consistently able to detect eye-hand-coordination "input lag differentials" even of just 10ms (see article with Microsoft YouTube video), and competitive online FPS gamers claim they feel one-frame differences -- I now believe them!

It is not noticeable in PAC MAN, but it's definitely noticeable in Street Fighter type games.
Not everyone notices, but I've received at least 3 testimonials of people who could tell it "felt more connected".

And Calamity, it benefits you too, for a different reason, even if you can't "feel it": Also, sometimes input lag is a "win-the-race" game.  If you press a button 1ms too late, you lose (e.g. enemy shoots you first).  Having 8ms less input lag allows you to win if you're racing your reaction time against the game, and 8ms can "push you over the edge" in evenly-matched situations, even if you cannot feel the input lag difference.   That's an additional reason why competition gamers often care about small input lag differences; it allows them to "shoot first" in a draw.  It "adds" on top-and-beyond your reaction time.  If your reaction time was 180ms, you're now averaging 172ms if you shave off 8ms of input lag.  A big improvement when competing against another in "shoot simultaneously" situations in FPS.

Yes, human reaction times are often around 200ms, but humans are more sensitive to input lag differentials than you might think.  During fast motion (e.g. a fast moving "laser bullet" in a space shoot-em-up game) at 20 pixels per frame, half a frame of extra input lag means the bullet is visually 10 pixels behind (relative to where your eye is tracking)!  If it's at the very edge of your reaction time, this can mean the difference between pressing the "shields up" button on time, or being killed.

(Note: a 20-pixel step per frame is 1200 pixels/second, or nearly two screen widths per second in a 640x480 game -- roughly 530ms to cross a screen width.  You do not have much time to react against fast moving objects such as this!  On an arcade CRT in this situation, 8ms translating to 10 pixels is approximately one centimeter!   So you see, in this specific situation, 8ms means the bullet is one centimeter behind!  This type of input lag difference is visible to the human eye when you have two displays edge-to-edge and one of them is lagged by 8ms (two displays sitting top-to-bottom for visually comparing horizontal lag, or displays sitting side-by-side for visually comparing vertical lag).)
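
The numbers in that note can be checked with a few lines. This is only a sketch of the arithmetic, assuming a 60fps game, a 640-pixel-wide screen and a lag difference of half a 60Hz frame; the function names are made up for illustration.

```python
def pixels_per_second(step_px_per_frame: float, game_fps: float = 60.0) -> float:
    """On-screen speed of an object moving a fixed step each frame."""
    return step_px_per_frame * game_fps

def displacement_px(speed_px_per_s: float, lag_ms: float) -> float:
    """How far the object travels during the given lag."""
    return speed_px_per_s * lag_ms / 1000.0

def crossing_time_ms(width_px: float, speed_px_per_s: float) -> float:
    """Time for the object to cross a screen of the given width."""
    return 1000.0 * width_px / speed_px_per_s

speed = pixels_per_second(20)            # 1200 px/s
half_frame_ms = 1000.0 / 120.0           # about 8.3 ms
print(speed, round(displacement_px(speed, half_frame_ms), 1), round(crossing_time_ms(640, speed)))
```

That gives 1200 pixels/second, a 10-pixel offset for half a frame of lag, and a little over half a second to cross a 640-pixel screen.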

In button mash games, there are many situations where the enemy can hit/kick/shoot/kill you first.  In any given minute of punches and kicks, exciting moments can have something like 100+ "time races against the enemy" (react before the enemy does).  Over that many races, an 8ms advantage starts to become really noticeable to seasoned players.

Examples of "races-against-time" situations:
- Timing a punch/kick button during enemy vulnerable moments, etc.
- Timing a shields-up button at close range when you have no time to dodge; etc.
- Timing a shoot as immediately as possible because you'll die quickly upon glancing enemy, etc.
- Other situations where the edge between win/lose is very thin.

So there are actually TWO benefits:
(1) Some people sensitive to input lag, actually feel differentials of 8ms; and
(2) Even if you can't feel it, it allows you to "shoot first" in a draw situation (win because you shoot first).
« Last Edit: July 14, 2013, 02:52:51 pm by mdrejhon »

jdubs

I'm not sure Calamity is saying that 1/2 frame of lag is not noticeable, just that it's not large relative to the other sources of lag in MAME.

-Jim

Rigby

Quote from: jdubs
> I'm not sure Calamity is saying that 1/2 frame of lag is not noticeable, just that it's not large relative to the other sources of lag in MAME.
>
> -Jim

That's exactly what he was saying, and it is this reason that attention from the MAME devs will not be very positive.  There are much bigger latency hurdles to overcome before beginning to even consider worrying about 8ms.

jdubs

Quote from: Rigby
>> I'm not sure Calamity is saying that 1/2 frame of lag is not noticeable, just that it's not large relative to the other sources of lag in MAME.
>>
>> -Jim
>
> That's exactly what he was saying, and it is this reason that attention from the MAME devs will not be very positive.  There are much bigger latency hurdles to overcome before beginning to even consider worrying about 8ms.

Makes perfect sense.  Need to focus on the big(ger) causes for the lag, first.

-Jim

Calamity

Quote from: Calamity
> So running at 120 Hz can, in theory, remove half frame of input lag, but although it's definitely an additional benefit, I wouldn't call this a "large" reduction of lag that you're going to notice, as compared to the 3 frames of lag that may be introduced by undesired frame queues built in the drivers.

Just to clarify, by drivers I mean video drivers, NOT MAME drivers.

There seems to be a frame queue built in by the main video card manufacturers (ATI, Nvidia) that affects their Direct3D v-synced surface flipping implementations, as well as OpenGL. So this indirectly affects MAME, but it is NOT a defect of MAME itself. It can result in 3 frames of lag. This problem is bypassed by GroovyMAME when using the -frame_delay option (although its purpose was not exactly *that*).

There would be another possible source of lag in MAME, associated to the way the input is handled in the same thread that manages the video output, so it gets locked for input while waiting for v-sync. But this source of lag is probably residual and minor as compared to the first one. This problem is bypassed by GroovyMAME's 3-thread implementation.

Then, we would have the 1-frame lag which is inherent to frame-based emulators. This is the part that may be reduced by 1/2 frame when rendering at 120 Hz with black frame insertion. But that is just because it takes half the time to get the frame completely drawn, so you can see the bottom of the frame sooner than you would on a 60 Hz CRT display, not because the input is processed any faster.
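
To put the sources of lag above on the same scale, here is a rough sketch that converts frames into milliseconds, assuming a 60 Hz game. The 3-frame and 1/2-frame figures are the ones quoted in this post; the function name is made up for illustration.

```python
def frames_to_ms(frames: float, refresh_hz: float = 60.0) -> float:
    """Convert a lag expressed in frames into milliseconds."""
    return frames * 1000.0 / refresh_hz

driver_queue_ms = frames_to_ms(3)       # up to 3 frames queued by the video driver
saved_by_120hz_ms = frames_to_ms(0.5)   # the half frame gained at 120 Hz

print(f"driver queue: {driver_queue_ms:.1f} ms")
print(f"saved by 120 Hz: {saved_by_120hz_ms:.1f} ms")
print(f"ratio: {driver_queue_ms / saved_by_120hz_ms:.0f}x")
```

The driver queue can cost about 50 ms, roughly six times what the 120 Hz trick recovers.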

mamenewb100

Quote from: Calamity
> Just to clarify, by drivers I mean video drivers, NOT MAME drivers.
>
> There seems to be a frame queue built in by the main video card manufacturers (ATI, Nvidia) that affects their Direct3D v-synced surface flipping implementations, as well as OpenGL. So this indirectly affects MAME, but it is NOT a defect of MAME itself. It can result in 3 frames of lag. This problem is bypassed by GroovyMAME when using the -frame_delay option (although its purpose was not exactly *that*).


Really? 3 Frames is very harsh. I never knew what the -frame_delay option was for. So technically I should see an improvement in response time by changing it to 3 or so? Thanks for the tip.
Life is a Game and we are all being Played.

mdrejhon

If you run various emulators in windowed mode (or don't do fullscreen mode properly), then you also have the window compositing manager too -- 1 extra frame of lag.   
So you need to run in Direct3D full screen mode to eliminate 1 frame of lag caused by the window compositing (e.g. Aero layer).

Another solution to further reducing input lag.
- Wait until just right before page flip (idle loop).
- Execute 1/60th of a second worth of MAME emulation as quickly as possible (e.g. run a 1MHz chip at 10MHz equivalent when you're already 80% of the way towards a vsync timing/pageflip). This reads the control inputs at the very last minute before vsync, reducing input lag.

That way, you read the control inputs as late as possible, right before a vsync/pageflip, which reduces input lag.  You are effectively surging the emulated CPU at full tilt 60 times a second.  This only works when the monitor is the only external device you need precise time synchronization with: you simply freeze the emulation until shortly before a page flip, execute a 1/60sec timeslice of emulation at full tilt, and have the framebuffer ready just before the pageflip.  This can reduce input lag by almost 1 frame, by eliminating the create-framebuffer-then-wait-for-flip delay.   Some simpler game machines emulate so fast that they use less than 10% CPU; you can easily idle until 80% of the way to the next pageflip, execute a 1/60sec timeslice of emulation quickly, and still finish in time for the pageflip, with fresher control inputs.   Assuming you run in full screen mode, and eliminate driver buffers except the front buffer, this can reduce MAME input lag to less than 1 frame behind a real game machine.  This is only practical for arcade machines whose emulation reliably needs only a small percentage of your system's CPU, since missed pageflips will cause stutters in emulation.   I had been thinking about this for a long time now.
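
Here is a rough sketch of the timing involved, assuming the emulation cost is known as a fraction of the frame period and that some safety margin is kept before the pageflip. The function names and the 10% default margin are my own assumptions for illustration, not anything from MAME.

```python
def latest_start_fraction(emulation_fraction: float, safety_fraction: float = 0.10) -> float:
    """Fraction of the frame period to idle before emulating the next frame.

    emulation_fraction: share of the frame period needed to emulate one frame.
    safety_fraction: headroom kept so a slow frame does not miss the flip.
    """
    start = 1.0 - emulation_fraction - safety_fraction
    if start < 0.0:
        raise ValueError("emulation is too slow to delay at all")
    return start

def input_age_at_flip_ms(start_fraction: float, refresh_hz: float = 60.0) -> float:
    """Age of the control inputs when the pageflip happens."""
    period_ms = 1000.0 / refresh_hz
    return (1.0 - start_fraction) * period_ms

start = latest_start_fraction(0.10)   # game needs 10% of the frame time
print(f"idle until {start:.0%} of the frame, inputs are {input_age_at_flip_ms(start):.1f} ms old at the flip")
```

With a game that needs 10% of the frame time, you can idle until 80% of the way to the flip, and the inputs are about 3.3 ms old at the flip instead of a full 16.7 ms.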

I believe GroovyMAME's -frame_delay option does exactly this.
But there can be other buffers getting in the way.

So, in theory, if you force full screen mode (bypass window compositing), and you've got only one framebuffer that will, upon the upcoming vsync, get immediately output to the monitor, you can theoretically get less than 1 frame of input lag.  Maybe less than 0.5 frame input lag.   Combine this with a 120Hz monitor to kill half a frame of lag, and you may actually get less average input lag than the real original arcade machine itself!  (Even when accounting for USB controller latency)   This is because CRT's slowly scan out their image over a period of 1/60sec (16.7ms) and it's possible to get ahead in this race with a faster-refresh display, that displays the individual 60Hz frame quicker.   Essentially, last-moment-before-refresh-emulation combined with the fast-refresh.

For that to happen (getting less input lag than the actual original arcade machine!) will require a long chain of perfectly-executed events and a fast-refresh display (e.g. 120Hz monitor), as well as delaying emulation to nearly the beginning of the upcoming refresh, bypassed windows compositor, and properly configured graphics driver, low-latency controller....  But it is, mathematically possible.  :applaud:
« Last Edit: July 16, 2013, 05:19:06 am by mdrejhon »

Rigby

An interesting solution to this frame buffering problem is to implement the native hardware on an FPGA.  With an FPGA you can draw the graphics to the screen in realtime, probably even if it's 120hz.  Original hardware couldn't do that so you'll have to implement that somehow but if I understand it correctly you'll get an average frame latency of 0.5 frames with a method such as that.

The MAME project has done a great deal of the hard work by documenting the hardware to the games thoroughly, especially the older ones.  MESS has done the same for a lot of systems.  Describe that hardware in an FPGA, do the virtual wiring that the physical board would have done, (FPGA programming is almost entirely hardware description), supply the roms, and boom you have Pacman on a chip.  Not in emulation, mind you (an important point to remember.)  The actual hardware is recreated inside the chip out of logic gates and RAM on the FPGA itself and runs exactly as real hardware would, if you describe it accurately.

Think about that... actual hardware, on a chip, programmed via software...  the possibilities...

http://fpgaarcade.com/ has a board that is coming out in a few days, designed just for this, and it can drive arcade monitors natively.  (There isn't a CGA connector on the board, the 5 analog pins of the DVI connector serve this purpose.)

I'm going to be looking at this very closely and doing some FPGA development studying over the next few weeks.  I have no idea if this effort will bear fruit but this is amazing to me.
« Last Edit: July 16, 2013, 10:31:53 pm by Rigby »

Dr.Venom

Quote from: mamenewb100
>> Just to clarify, by drivers I mean video drivers, NOT MAME drivers.
>>
>> There seems to be a frame queue built in by the main video card manufacturers (ATI, Nvidia) that affects their Direct3D v-synced surface flipping implementations, as well as OpenGL. So this indirectly affects MAME, but it is NOT a defect of MAME itself. It can result in 3 frames of lag. This problem is bypassed by GroovyMAME when using the -frame_delay option (although its purpose was not exactly *that*).
>
> Really? 3 Frames is very harsh. I never knew what the -frame_delay option was for. So technically I should see an improvement in response time by changing it to 3 or so? Thanks for the tip.

This is a misunderstanding.  The possible 3 frames of lag is -not- fully eliminated by the -frame_delay option. It's a video driver thing, just search for "Flip Queue Size" for ATI drivers or "Max Frames to Render Ahead" for NVidia. Even if you use the -frame_delay setting with its custom flipping method you'll still suffer from the >0 Flip Queue Size on ATI.

The only way to eliminate the FlipQueueSize delay fully is by using the RadeonPro tool and its possibility to set flip queue size to zero (here: http://www.radeonpro.info/). You may or may not notice the difference, but it's there. In case you'll be testing in Windows 7 be also very sure to have turned Aero (desktop composition) off, as it adds delay by itself also. And while we're at it, you'd possibly want to overclock your USB ports also to 1000Hz (1ms), instead of the default 125Hz (8ms). Useful for all Windows versions (Windows 7 link here: http://www.ngohq.com/news/15043-how-to-increase-usb-sample-rate-in-windows-vista-7-a.html). But be warned, this may not work for all hardware out there, so use at your own risk.
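
To see what the USB polling rate means in milliseconds, here is a small sketch. The 125Hz and 1000Hz figures are the ones above; the "average wait is half the interval" part is my own simplifying assumption (a button press landing at a random moment between two polls).

```python
def polling_interval_ms(poll_hz: float) -> float:
    """Time between two USB polls, i.e. the worst-case wait for a press to be seen."""
    return 1000.0 / poll_hz

def average_wait_ms(poll_hz: float) -> float:
    """Average wait, assuming presses land uniformly between polls."""
    return polling_interval_ms(poll_hz) / 2.0

for hz in (125, 1000):
    print(f"{hz} Hz: worst case {polling_interval_ms(hz):.1f} ms, average {average_wait_ms(hz):.1f} ms")
```

So the default rate can add up to 8 ms (4 ms on average under that assumption), while 1000Hz brings it down to 1 ms at worst.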

What the -frame_delay really attempts is minimizing the time between emulation of a frame and displaying it (thus trying to eliminate the "1-frame lag which is inherent to frame-based emulators."). The closer these two are, the less input lag. I could point you to the very lengthy discussion Calamity and I had on the topic, but I won't. Just search my posts and you'll find enough interesting stuff to read into.

How to use -frame_delay? It is set in steps of 1/10th of a frame time, so the setting goes from 1 to 10. A rule of thumb for getting a safe and proper value is to run your MAME game unthrottled (for testing only!), look at its achieved speed, and raise frame_delay by 1 for every 250% of unthrottled speed. So if your game runs at 1000% unthrottled, you set it to 4 when vsynced.

Do take notice that there'll be individual frames that take much longer than the average. You don't want those frames to be skipped because frame_delay is set too high. A "wait" command on a PC can also sometimes take longer than the wait value itself. Both combined are basically the reason for the safety margin that the rule of thumb builds in. So even if your games run really, really fast unthrottled (2000%+), I would advise against setting it higher than 8. What if you set it too high? Then it will randomly start missing the vertical blank for some frames. You'll notice this as irregular video, audio and/or -added- input delay instead of lowered input delay.
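
The rule of thumb can be written down as a small helper. This is only a sketch of the rule as described here (1 step per 250% of unthrottled speed, never above 8); the function name is made up, and returning 0 to mean "leave frame_delay off" for games slower than 250% is my own assumption.

```python
def suggested_frame_delay(unthrottled_speed_percent: float, cap: int = 8) -> int:
    """Rule-of-thumb frame_delay: +1 per 250% of unthrottled speed, capped."""
    steps = int(unthrottled_speed_percent // 250)
    return max(0, min(steps, cap))

for speed in (200, 1000, 2500):
    print(f"{speed}% unthrottled -> frame_delay {suggested_frame_delay(speed)}")
```

A game running at 1000% unthrottled gets 4, and anything at 2000% or more stays capped at 8. Treat the result as a starting point and confirm it with the "-v" checks before trusting it.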

To be sure that you're not degrading the emulation by setting too high a value, you need to run your games with the "-v" flag, for example "MAME.exe toki -v". This will log two things on exit of a game:

- Average speed:  make sure this is very close to 100%
- Sound: buffer overflows=value and/or underflows=value. Make sure that the "value" for both of them is close to zero after running a gaming session for at least 5 or more minutes. Ideally they are zero after a gaming session (in which case they'll not be reported).

If either of the two above doesn't hold in vsynced, throttled mode, then you have set the frame_delay too high, and/or you have set the MAME audio_latency value in mame.ini too low, and/or MAME is opened on a video screen that has a very different refresh rate than the emulated game, and/or your PC is too slow. You choose ;)
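
The two checks can be expressed as a tiny helper once you have read the numbers off the log by hand. This is a sketch only: it does not parse the "-v" output, and the function name, the 1% speed tolerance and the "zero glitches" default are my own assumptions about what "very close to 100%" and "close to zero" mean.

```python
def session_looks_healthy(
    average_speed_percent: float,
    overflows: int = 0,
    underflows: int = 0,
    speed_tolerance: float = 1.0,   # assumed meaning of "very close to 100%"
    max_glitches: int = 0,          # assumed meaning of "close to zero"
) -> bool:
    """Return True if the logged values suggest frame_delay is not set too high."""
    speed_ok = abs(average_speed_percent - 100.0) <= speed_tolerance
    sound_ok = overflows <= max_glitches and underflows <= max_glitches
    return speed_ok and sound_ok

print(session_looks_healthy(100.0))                 # clean session
print(session_looks_healthy(97.5, underflows=12))   # likely frame_delay too high
```

If this comes back False, lower frame_delay by one step and test again.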

Given all of the above, if you're willing to invest some time and are running GroovyMAME on a CRT, you can get _really_ close to real hardware regarding display and input lag, largely thanks to Calamity's efforts in continually improving GM. (Thanks dude, hope real life work is starting to relax a bit and you'll soon have some more time for GM. I'm really looking forward to the resolution masking feature!) Only sound will lag by 2 to 3 frames in Windows, even on an audio_latency setting of 1. But this is mostly due to having DirectSound as the audio API.

« Last Edit: July 17, 2013, 07:52:41 pm by Dr.Venom »

adder

> The only way to eliminate the FlipQueueSize delay fully is by using the RadeonPro tool and its possibility to set flip queue size to zero

just having a read around it looks like radeonpro tool only supports hd 2000 cards or higher?

im wondering for those of us using older ati cards, if ati tray tools could eliminate the flip queue size? i will give that a try when i get a chance.
i found the paragraph below at this link:
http://www.tweakguides.com/ATICAT_9.html
_________________________________________
Flip Queue Size: This setting is similar to the 'Max Frames to Render Ahead' Nvidia setting. It works in much the same way, controlling the number of frames which are calculated in advance of being displayed. The default is 3, or Undefined, however by lowering this setting you may be able to resolve mouse lag problems, and even prevent graphics freezes in certain games. Experiment by setting this value to 2 first, and then if necessary try an extreme value like 0. For most people however I recommend either 3, 2 or 1 at the lowest as setting a value of 0 can disable the performance benefits of dual core CPUs for example, and in general lowering this setting will reduce overall FPS the lower the setting. You can try raising it if you want to see if you can gain performance, however again you may experience mouse lag or input lag.
_________________________________________
ati tray tools: http://downloads.guru3d.com/download.php?det=733

Calamity

Quote from: Dr.Venom
> This is a misunderstanding.

This needs some clarification.

The -frame_delay option was designed to do exactly what Dr.Venom is explaining.

But... as an unexpected side effect, it also serves to bypass the flip queue. Let's explain this a bit.

The infamous flip queue seems to be hardcoded in the ATI drivers. There's no way to remove it, because although you can use a program like RadeonPro, this only stores a new FlipQueueSize value in the registry. The driver reads this value, but it's ignored if it's lower than 3, as the driver's disassembly proved (it's somewhere in the original discussion that Dr.Venom pointed to). There's some evidence of this being the actual behaviour in the tests done by DaRayu here: http://www.mameworld.info/ubbthreads/showthreaded.php?Cat=&Number=307291&page=&view=&sb=5&o=&fpart=1&vc=1  (It's a shame that thread ended up as pure sh*t).

The flip queue seems to affect only v-synced flip operations done by DirectDraw and Direct3D (and probably OpenGL). In MAME, v-synced flip operations are used by the following configurations:

  - DirectDraw + triplebuffer
  - Direct3D + (triplebuffer or syncrefresh or waitvsync)

This means that if you use -video ddraw & syncrefresh, you're theoretically bypassing the flip queue. But DirectDraw is not an option for everybody nowadays. So, if you use Direct3D and any sort of vertical synchronization on an ATI card, be ready to enjoy 3 frames of lag.

Now, what does -frame_delay have to do with the flip queue?? Well, while experimenting to implement the frame_delay option, I found that it worked perfectly fine for the DirectDraw side, which uses the IDirectDraw7::WaitForVerticalBlank method. But surprisingly it was virtually impossible to get a stable frame_delay implementation on the Direct3D side, where the v-sync is encapsulated inside the 'Present' method. I was about to ditch it altogether, until I figured out how to implement the v-sync externally, before the 'Present' method, by using the GetRasterStatus method, and then using a non-v-synced 'Present' operation right after that. This provided a perfectly stable frame_delay implementation for the Direct3D interface too. But recently, there has been some evidence that, as an unexpected side effect, removing the v-sync operation from the 'Present' method seems to bypass the frame queue too! So you get this extra benefit that is much larger than what it was originally meant to resolve.
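
The idea of doing the v-sync wait yourself and then presenting without v-sync can be illustrated with a small simulation. This is not GroovyMAME's code and does not call Direct3D; it is a sketch in Python with a fake raster. `RasterStatus` loosely mirrors the InVBlank and ScanLine fields that Direct3D 9 reports, and `fake_raster`, `wait_for_vblank` and `present_frame` are hypothetical names made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class RasterStatus:
    in_vblank: bool   # True while the beam is in vertical blank
    scanline: int     # current scanline

def fake_raster(total_lines: int = 262, visible_lines: int = 240,
                start_line: int = 200) -> Callable[[], RasterStatus]:
    """Simulated beam that advances one scanline per poll (assumption for the demo)."""
    line = start_line
    def poll() -> RasterStatus:
        nonlocal line
        status = RasterStatus(in_vblank=line >= visible_lines, scanline=line)
        line = (line + 1) % total_lines
        return status
    return poll

def wait_for_vblank(get_raster_status: Callable[[], RasterStatus],
                    max_polls: int = 1_000_000) -> int:
    """Busy-poll the raster until vertical blank starts; return the polls used."""
    for polls in range(1, max_polls + 1):
        if get_raster_status().in_vblank:
            return polls
    raise TimeoutError("vertical blank never reported")

def present_frame(get_raster_status: Callable[[], RasterStatus],
                  present_immediately: Callable[[], None]) -> None:
    """External v-sync: wait for vblank ourselves, then flip without v-sync."""
    wait_for_vblank(get_raster_status)
    present_immediately()

presented = []
present_frame(fake_raster(), lambda: presented.append("frame"))
print(presented)
```

The point of the design is that the flip itself is never asked to wait for v-sync, so whatever queueing the driver attaches to v-synced flips is not involved.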

Please take all this with a grain of salt, until we have some videos that can prove things, I might be totally wrong.


adder

Quote from: Calamity
The infamous flip queue seems to be hardcoded in the ATI drivers......the driver reads this value, but it's ignored if it's lower than 3

hello Calamity, could your crt emudriver be modified to overcome this problem, or is it too big a problem/too much work to try to correct this ati drivers issue

was also wondering, could you release your frame_delay feature as a standalone diff which other people who are not using groovymame could include in their versions of mame, or is that impossible/too much work because eg. the frame_delay code is written specifically for groovymame?
best wishes

Dr.Venom

Quote from: Calamity
> The infamous flip queue seems to be hardcoded in the ATI drivers. There's no way to remove it, because although you can use a program like the RadeonPro, this only stores a new FlipQueueSize value in the registry. The driver reads this value, but it's ignored if it's lower than 3, as the driver's disassembly proved (it's somewhere in the original discussion that Dr.Venom pointed).

I think we need to be careful with generalizing results across drivers and OS's. This result, although valuable, was for an "old" XP driver (9.3 I believe). The new Windows Display Driver model of Windows 7+ (on which I'm running) is quite different, and officially supports custom flipping values (even adding techniques to be able to discard any pre-rendered frames chain at the very last before vsync come up). Any "hardcoded" flip queue size would be very much at odds with this new WDD model.

I can positively confirm though that using RadeonPro in Windows 7, with GM in D3D+frame_delay, there's a noticeable difference between using a flip queue size setting of zero versus anything higher. To me it seems that it eliminates the lower limit of 1 on flip queue size that Windows 7 sets.

I'm somewhat curious as to what RadeonPro does exactly. Could you point me to the registry key that, according to your findings, RadeonPro changes? I can't find any in Windows 7, which may suggest that the flip queue size "patch" works in a different way, at least in Windows 7.

Quote
Please take all this with a grain of salt, until we have some videos that can prove things, I might be totally wrong.

Yes, that research would be very interesting. It will also be interesting to see how XP fares against Windows 7 (Aero disabled), with and without applying the RadeonPro flip queue size of 0.

I don't know if you have access to one, or if you had already thought about it, but it may be valuable to use a "gaming keyboard" for testing, specifically one of those that do "1000Hz driverless polling / 1ms response time". See for example the Coolermaster CM Storm Trigger here: http://www.cmstorm.com/en/products/keyboards/Trigger/ . That would at least safely eliminate any possible lag from the input device. Or of course any other device of which you can be -100%- certain that it sends input at 1 ms intervals.
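As a rough illustration of why the polling rate matters: a device polled at a fixed rate can add anything from zero up to one full polling interval of delay, depending on when the key press lands relative to the next poll. The sketch below compares 1000 Hz with 125 Hz. The 125 Hz figure is an assumed typical default for a USB keyboard, not a number from this thread, and the function name is made up for the example.

```python
def polling_latency_ms(poll_hz):
    """Return (worst_case_ms, average_ms) of delay added by polling at poll_hz.

    Simple model: a key press waits for the next poll, so the delay is
    uniformly distributed between 0 and one polling interval.
    """
    interval_ms = 1000.0 / poll_hz
    return interval_ms, interval_ms / 2.0


# 125 Hz is an assumed "ordinary keyboard" rate, used only for comparison.
for hz in (125, 1000):
    worst, avg = polling_latency_ms(hz)
    print(f"{hz} Hz polling: up to {worst:.1f} ms, about {avg:.1f} ms on average")
```

Under this model an ordinary keyboard could account for a sizeable fraction of a 60 Hz frame on its own, which is why a 1 ms device is the safer choice for lag tests.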

mamenewb100

  • Trade Count: (0)
  • Full Member
  • ***
  • Offline Offline
  • Posts: 210
  • Last login:April 01, 2022, 03:32:29 pm
I thoroughly tested the frame_delay option and had interesting results.

First off, I can't use RadeonPro as Venom suggested because it requires Vista or above, while I use XP. So first I tested the black frame insertion version of GroovyMAME: the frame_delay option in D3D was unplayable and often displayed choppy or no video. It worked in DirectDraw, but the emulation speed and frame rate were screwed up. But I knew this option might conflict with the frame insertion version of Groovy.

So then I tried regular GroovyMAME and had a lot better results, although still odd ones. With D3D and frame_delay set to 1, the emulation speed was too fast. With it set to 2, the DirectDraw version played near perfect, but the D3D version still ran too fast. Setting it to 3 gave the same results. Setting it to 4 suddenly had D3D working at the correct speed. Maybe because D3D uses more frames? Doesn't make a lot of sense. Setting frame_delay to 4, 5 or 6 didn't affect the frame rate, but when I got to about 7 or 8 it did. I didn't change any of the settings I normally use in GroovyMAME: SyncRefresh on 1, Throttle on 1, WaitVSync 0, etc. I tried changing WaitVSync to 1 and it made no difference that I could tell. :\  The only thing I forgot to try was AudioLatency, which was set at 1.

The fact that the setting made an impact on performance proves that it is having an effect, but whether it is really improving response time is hard to tell. Of course it would be difficult to verify that it works, unless there is some program that can measure the delay between a command being given and that command registering. Otherwise I might *think* there is less delay when in fact there isn't.
« Last Edit: July 18, 2013, 10:18:44 pm by mamenewb100 »
Life is a Game and we are all being Played.

Calamity

  • Moderator
  • Trade Count: (0)
  • Full Member
  • *****
  • Offline Offline
  • Posts: 7461
  • Last login:May 23, 2025, 06:07:25 am
  • Quote me with care
@Dr.Venom, you're probably right. I didn't remember, but my results were from using ATI Tray Tools, not RadeonPro, which doesn't work in XP. In the case of ATI Tray Tools in XP, a new key named FrameQueueSize is created in the registry, and searching for it inside the drivers led to the disassembly I posted, where you could see how the value was ignored. So this could probably be patched to remove the queue, but I don't know if there's more to it than that.

@jadder, the -frame_delay patch works in combination with the modeline generator: it needs to know the actual timing of the video mode in order to schedule the wait with the proper length.

@mamenewb100, actually the -frame_delay option can't be used with the black-frame-insertion modified GM, because it calculates the frame duration based on the original timing (e.g. 16.67 ms) instead of the halved duration (e.g. 8.33 ms) that you have when running at 120 Hz. However, with the normal build of GM running at 120 Hz, a frame_delay value higher than 4 or 5 causes it to miss exactly one out of two retraces: this has the effect of achieving perfect 60 Hz v-sync on a 120 Hz display, and that's indeed what you're seeing.
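A rough way to check the arithmetic behind this, as a sketch only. It assumes frame_delay is a 0-9 value meaning tenths of the game's frame period (an assumption about the option's scale, not something stated above), and that the wait is computed from the original 60 Hz timing as described. The function name is made up for illustration and is not taken from the GM source.

```python
import math


def retraces_missed(frame_delay, game_hz=60.0, display_hz=120.0):
    """Count the display retraces that pass while the emulator is waiting.

    Assumes frame_delay is in tenths of the game's frame period
    (hypothetical scale, labeled as an assumption in the text above).
    """
    game_period_ms = 1000.0 / game_hz        # e.g. 16.67 ms at 60 Hz
    retrace_period_ms = 1000.0 / display_hz  # e.g. 8.33 ms at 120 Hz
    wait_ms = game_period_ms * frame_delay / 10.0
    return math.floor(wait_ms / retrace_period_ms)


for fd in (1, 2, 3, 4, 6, 7, 8, 9):
    wait = (1000.0 / 60.0) * fd / 10.0
    print(f"frame_delay {fd}: wait {wait:.2f} ms, retraces missed: {retraces_missed(fd)}")
```

Under these assumptions the wait only exceeds one 120 Hz retrace period (8.33 ms) once frame_delay goes past the middle of its range, with the boundary sitting right at 5. Below that the game syncs to every retrace and runs too fast; above it every other retrace is skipped and the speed is correct.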
« Last Edit: July 19, 2013, 08:01:07 am by Calamity »
Important note: posts reporting GM issues without a log will be IGNORED.
Steps to create a log:
 - From command line, run: groovymame.exe -v romname >romname.txt
 - Attach resulting romname.txt file to your post, instead of pasting it.

CRT Emudriver, VMMaker & Arcade OSD downloads, documentation and discussion:  Eiusdemmodi

Dr.Venom

  • Trade Count: (0)
  • Full Member
  • ***
  • Offline Offline
  • Posts: 270
  • Last login:May 08, 2018, 05:06:54 am
  • I want to build my own arcade controls!
Quote from: mamenewb100
First off, I can't use RadeonPro as Venom suggested because it requires Vista or above, while I use XP.

Interesting. Apparently XP support has been ditched along the way by John, the author. A quick search on google shows the following reason: "New features like DVC are not possible to be implemented in XP, other than that XP doesn't support DX10 and a lot of other things that might just slowdown RP development."

Judging from the changelog, the previous RC 1.1.0.6 version (from over two years ago) may still have supported XP. It seems to be the only older version floating around. I wouldn't know how well it works in XP, though. If I remember correctly, 64-bit apps weren't fully supported in the past, so if you're going to try it you are possibly restricted to the 32-bit build of MAME. A Google search shows that it can still be downloaded here: http://www.radeon3d.org/downloads/ati_software_und_tools/radeonpro/get/ .

Quote from: Calamity
@Dr.Venom, you're probably right. I didn't remember, but my results were from using ATI Tray Tools, not RadeonPro, which doesn't work in XP. In the case of ATI Tray Tools in XP, a new key named FrameQueueSize is created in the registry, and searching for it inside the drivers led to the disassembly I posted, where you could see how the value was ignored. So this could probably be patched to remove the queue, but I don't know if there's more to it than that.

Ah, okay that explains it. My best guess is that RadeonPro does do some patching, but I could be wrong and maybe it works differently in Windows 7. In response to mamenewb I found that an older version of RadeonPro floating around possibly has XP support (see link above). Maybe it's useful.

adder

  • Trade Count: (0)
  • Full Member
  • ***
  • Offline Offline
  • Posts: 640
  • Last login:February 04, 2021, 10:51:51 am
  • Location: Easy St.
Quote from: Calamity
The infamous flip queue seems to be hardcoded in the ATI drivers. There's no way to remove it, because although you can use a program like RadeonPro, this only stores a new FlipQueueSize value in the registry. The driver reads this value, but it's ignored if it's lower than 3, as the driver's disassembly proved

hi all, i won't write a long post with all the details of my setup (but if anyone wants to know anything about it, just ask)...
regarding the ati flipqueuesize bug: today i had success on my mame pc (regular mame, not groovymame; win xp 32-bit; hp dx2200 desktop with onboard radeon xpress 200 gfx). i guess i got lucky, as i was successful (using ati tray tools) with my video drivers/setup and was not affected by the bug.

i don't know why it worked for me, maybe it's because i am using older drivers/older hardware?
i did tests of course to make sure my findings were correct. i only used my eyes for testing ::) but for me there WAS a noticeable difference in mame games between flipqueuesize '0' and '3'. an example game i tested was 'scramble', which already has 1 frame of lag in it by default, so it's a good game to test: with flipqueuesize 3 your spaceship moves around the screen a little sluggishly, but at flipqueuesize 0 it really flies around the screen!

my final test was testing several mame games with 'directdraw + vsync' against 'direct3d + vsync + flipqueuesize 0'.
i couldn't see or feel any difference between these two. i then set flipqueuesize back to 3 and redid this final test, and the sluggish controls were back when using direct3d instead of directdraw.
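To put a rough number on the difference between those two settings, here is a minimal sketch. It assumes the simplest possible model, where each frame sitting in the flip queue is displayed one refresh later than it otherwise would be. That is only an upper bound on the added lag, not a measurement from this setup, and the function name is made up for the example.

```python
def flip_queue_lag_ms(queue_size, refresh_hz=60.0):
    """Upper bound on the lag added by queue_size pre-rendered frames.

    Simple model (an assumption): every queued frame costs one full
    refresh period before it reaches the screen.
    """
    return queue_size * 1000.0 / refresh_hz


for size in (0, 1, 3):
    print(f"flip queue size {size}: up to {flip_queue_lag_ms(size):.1f} ms added at 60 Hz")
```

With that model a queue of 3 can cost up to three frames at 60 Hz, which would be easy to feel in a game like scramble, while a queue of 0 adds nothing on top of plain vsync.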

so, im a happy chap, cheers to all

ps. i did have a backup plan in case i didn't have any success resolving the ati flipqueuesize bug on my mame pc. i will post details of that second plan later/tomorrow in case anyone wants to try it (note: it's a quick, easy method, but it seems you will need to be using directx version 9 or it won't work...)
« Last Edit: July 19, 2013, 08:00:05 pm by jadder »

adder

  • Trade Count: (0)
  • Full Member
  • ***
  • Offline Offline
  • Posts: 640
  • Last login:February 04, 2021, 10:51:51 am
  • Location: Easy St.
ok, in case anyone wants to try it, i have attached a zip to this message: a program called direct3d 9 antilag tool 1.01

so for anyone using directx 9: extract all the files in the zip into your mame folder, then just launch mame

if anyone tries it and it works for them, please report

(unfortunately for me i get an 'unable to initialize direct3d' error message.  my mame cab uses a very stripped down version of windows xp (called micro xp 0.87))