Author Topic: Greatly Improved Black Frame Insertion on 240Hz Monitors (& "Temporal HLSL") (Read 31779 times)

mdrejhon · « **on:** May 28, 2020, 01:53:19 pm »

UPDATE for software developers:
See improved MAME feature request description

Hello!

I'm the founder of Blur Busters / TestUFO. Long time no post!

Some exciting MAME-benefitting info....

240Hz 1ms IPS panels Appear To Be Excellent 60Hz MAME Arcade Panels

I want to mention the discovery that strobed 240Hz 1ms IPS panels have excellent 120Hz hardware strobe + 60Hz software BFI.
No color degradation, no patterns, no chess board artifact, no vertical lines artifact, no horizontal lines artifact.
It just looks like a beautiful 60Hz CRT.

The reason is that LCD GTG is easier to hide in the blanking interval at higher refresh rates. For example, a 240Hz panel is a 1/240sec scanout. When you reduce the refresh rate, you have a 120Hz panel at 1/120sec scanout (8.3ms). So the panel is idling for 8.3 milliseconds, whether or not you're using software BFI or a hardware strobe backlight, or both (using software BFI to reduce the strobe frequency of a hardware strobe backlight, like 120Hz LightBoost + 60Hz BFI).

The bottom line is that on 240Hz monitors, 1/120sec (or less) visible, 1/120sec (or more) dark is now easily possible. That leaves a humongous full 1/120sec (8.3 milliseconds) to hide LCD GtG in a black frame (black frame, or backlight turned off). On good panels, this dramatically reduces artifacts, massively improves quality, and reduces strobe lag.

Goodbye crappy LightBoost. The ultrafast refresh of high-Hz monitors gives plenty of opportunity to reduce display motion blur with fewer artifacts caused by LCD pixel response limitations. Even a panel hyped as 1ms GtG that is really 5ms GtG in the real world still fits into that 8.3 millisecond blanking interval -- an opening big enough to drive a slow GtG truck through.

I recently tested the ViewSonic XG270 240Hz 1ms IPS panel (a Blur Busters Approved monitor) with the latest firmware and 120Hz PureXP+, using GroovyMAME to enable software BFI. With PureXP set to brighter levels (PureXP Normal), it looks much closer than before to a Sony FW900 CRT in colors and motion clarity (except for the grey blacks). It's much better than old-fashioned LightBoost.

The IPS panel eliminates lots of TN artifacts, the faster "1ms" IPS panels avoid the strobe crosstalk, and IPS colors are better. Combined, these make 240Hz IPS panels excellent 60Hz MAME cabinet panels if you want motion blur reduction:
- Less dimming for BFI
- No ugly chessboard patterning for BFI
- No ugly color depth loss for BFI

360Hz IPS monitors are coming this summer (DELL is creating one) so "High Hz" and "IPS" are no longer mutually exclusive! Also, higher Hz reduces input lag, because a 240Hz monitor can deliver a "60Hz refresh cycle" in 1/240sec. So the refresh rate race still helps 60Hz emulation, because of lower latency + more accurate CRT emulation (less blur)!

Software BFI Feature Request for GroovyMAME
(Very easy change)

Some 240Hz monitors don't have software BFI, but it's possible to have adjustable brightness-vs-motionblur via custom duty cycles.

Several of you have seen www.testufo.com/blackframes but the test was improved to show duty cycles (e.g. 2 black frames + 1 visible frame) to create different amounts of display motion blur.

View this on a 240Hz monitor: 60fps Software BFI on 240 Hz at 25%, 50% and 75% Duty Cycles
(Don't bother to see this on a 60Hz monitor, it will flicker at 15Hz)

240Hz-compatible and 360Hz-compatible BFI can be a useful additional flexibility adjustment, given refresh rates are going up.

Is there a way to adjust the BFI duty cycle in GroovyMAME? I wasn't able to select duty cycles of 25%, 50% and 75% for 240Hz BFI. Or does this need to be a feature request? Some people want more brightness and reduced flicker, while others want the least motion blur (closer to CRT).
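As a rough illustration of the duty cycles discussed above, here is a minimal Python sketch. The function names are made up for this example and are not GroovyMAME options. It assumes the display Hz is an exact multiple of the emulated Hz, and it treats persistence as simply "visible refreshes times refresh time", ignoring GtG.

```python
def bfi_pattern(display_hz: int, content_hz: int, visible: int) -> list:
    """One emulated frame's refresh slots: True = show the frame, False = black."""
    if display_hz % content_hz != 0:
        raise ValueError("display_hz must be an integer multiple of content_hz")
    slots = display_hz // content_hz
    if not 1 <= visible <= slots:
        raise ValueError("visible must be between 1 and the slot count")
    return [slot < visible for slot in range(slots)]


def duty_cycle(pattern: list) -> float:
    """Fraction of refresh cycles that show the frame."""
    return sum(pattern) / len(pattern)


def persistence_ms(pattern: list, display_hz: int) -> float:
    """Simplified persistence: time per emulated frame the image is visible."""
    return 1000.0 * sum(pattern) / display_hz


# 60fps content on a 240Hz display at 25%, 50% and 75% duty cycle.
for visible in (1, 2, 3):
    p = bfi_pattern(240, 60, visible)
    print(p, f"{duty_cycle(p):.0%}", f"{persistence_ms(p, 240):.1f} ms")
```

A lower duty cycle means a dimmer picture with less motion blur, which is the brightness-vs-motionblur trade-off described above.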

Future 1000Hz Monitors (ETA: 2030) to Achieve "Temporal HLSL"
(Harder change, longer-term future)

Some of you may be familiar with the Blur Busters writings, such as Amazing Journey To Future 1000Hz Displays, as well as Frame Rate Amplification Technologies (future cheap 1000fps GPUs), and others such as Stroboscopic Effect of Finite Frame Rate Displays. Most of these are more relevant to modern games, not emulators.

However....1000Hz makes "Temporal HLSL" possible!

ASUS has already confirmed to us, to PC Magazine, and to a few others that it plans a 1000Hz monitor, so 1000Hz prototypes are already being developed, though it'll probably be a decade before they hit the market.

But there's also exciting possibilities for future CRT emulation algorithms. Basically a "temporal HLSL". HLSL is currently spatial, but with 1000Hz, HLSL can go temporal too (emulate CRT scanning!)
By the time 1000Hz monitors arrive, we'll finally be able to emulate 1ms CRT phosphor (1ms refresh cycles), even perhaps via a software-based segmented rolling scan.

Exciting possibilities for correct temporal emulation of a CRT tube on a high-Hz OLED panel or high-Hz FALD IPS panel with perfect blacks. (Desktop FALD for great LCD blacks is expected to fall below $1,000 by year 2025, making them practical for MAME arcade cabinets).

I'd argue that 360Hz is the right time to begin programming a "Temporal HLSL" (e.g. software based 6-segmented rolling scan with gamma-corrected alphablends on overlapping segments), with preparations to scale it towards 1000Hz.

Temporally emulating a CRT electron gun realtime at the sub-refresh level!!!!!

(It could also theoretically be combined with beamraced VSYNC to synchronize emu-raster to real-raster, to do this virtually laglessly for original-machine latency on original tubes)

The refresh rate race to retina refresh rates is exciting!

donluca · « **Reply #1 on:** May 28, 2020, 03:40:31 pm »

Thanks for posting this, it's really interesting.

Ironically, it looks like we're finally getting back to something vaguely remote to how a CRT worked through newer and cutting edge technology.

Maybe one day we'll finally have a solution which will be able to completely replace our beloved CRTs once they'll all die.

Calamity · « **Reply #2 on:** May 29, 2020, 08:10:01 am »

Hi Mark,

That's funny because I was considering deprecating BFI altogether since it was causing some scary image retention on LCDs, and aside that it gets in the middle of a critical feature like frame delay.

This 25/50/75 duty cycle implementation could be added if there's interest.

We're currently on the process of reconstructing the whole GM gears (yeah, raster "interrupts" on the roadmap finally).

arfink · « **Reply #3 on:** May 29, 2020, 09:33:21 am »

Now if only companies would stop making LCDs wider and wider and give us a proper 4:3 aspect in something bigger than 19" and I'll ditch my CRTs for good.

mdrejhon · « **Reply #4 on:** May 29, 2020, 08:32:36 pm »

Quote from: Calamity on May 29, 2020, 08:10:01 am

That's funny because I was considering deprecating BFI altogether since it was causing some scary image retention on LCDs, and aside that it gets in the middle of a critical feature like frame delay.

We know why image retention happens during BFI, and there are easy solutions to this.

Burn-In Fix For Software BFI

As you know, Blur Busters works on contract with monitor manufacturers, so we know why burn-in happens during BFI.

Burn-in happens because software-based black frame insertion defeats the LCD inversion electronics. LCD inversion alternates voltage polarity both spatially and temporally. Basically, pixels have to swap between positive voltage and negative voltage every other refresh cycle. Because the BFI cadence is even-numbered, it builds up a static electricity charge in the pixels because the opposite polarities are now unbalanced (positive voltage black, negative voltage non-black, or vice-versa). A smarter inversion would be a random error-diffusion voltage inversion pattern, but those are not used.

LCD pixels are often a chessboard of voltages.

Occasionally it creates artifacts (should be invisible, but sometimes the voltage polarities become visible)

The chessboard polarity pattern normally inverts next refresh cycle (inverted pattern, aka LCD voltage inversion). But blackframes means the visible refreshes have the same voltage polarity, creating a static chessboard + accelerated burnin effect because of static electric charge buildup, creating temporary LCD image retention.
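The imbalance can be shown with a toy model. This is only a sketch under simplifying assumptions: a single pixel, polarity flipping on every refresh cycle, and "charge" modeled as polarity times drive level. Real panels also use spatial inversion patterns, and `net_dc_bias` is a name made up for this example.

```python
def net_dc_bias(cadence, refreshes):
    """Sum of (polarity x drive level) over a run of refresh cycles.

    cadence: drive level per refresh slot, e.g. [1, 0] = visible, black.
    Polarity is assumed to alternate +1 / -1 on every refresh cycle.
    """
    bias = 0.0
    for n in range(refreshes):
        polarity = 1 if n % 2 == 0 else -1
        bias += polarity * cadence[n % len(cadence)]
    return bias


# Even-length cadences always show the visible frame on the same polarity,
# so the bias keeps growing. Odd-length cadences alternate and cancel out.
print(net_dc_bias([1, 0], 1200))        # 120Hz BFI: grows with run time
print(net_dc_bias([1, 0, 0], 1200))     # 180Hz BFI: stays balanced
print(net_dc_bias([1, 0, 0, 0], 1200))  # 240Hz BFI: grows with run time
```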

The good news is, this is the solution:

(A) Odd number of refresh cycles (e.g. create a 180Hz mode)
There's never burn in on a 240Hz monitor running at 180Hz, it can permanently run in software BFI harmlessly. Use Custom Resolution Utility to create the 180Hz refresh rate (on any 240Hz monitor) or 300Hz refresh rate (on any future 360Hz monitor), and just be done with it. Default preferred BFI cadence 0-0-100% or 0-0-0-0-100% respectively.

or

(B) Phase-shift every 20 seconds for even-divisible Hz
120Hz: 2-2-2-2-3-2-2-2-2
240Hz: 4-4-4-4-5-4-4-4-4

or

(C) Odd number of Present()s per emulator refresh cycle for VRR
- VRR means monitor slaves to the timing of Present() API; monitor instantly refreshes on call to Present()
- Must be temporally separated no faster than max-VRR Hz
- Be an odd number, to force monitor to have odd-number of voltage polarities per emulator frame
- You can still black-frame some of those Present()s to do BFI-during-VRR

BONUS
- The chessboard patterns disappear
- The colordepth becomes full color depth (no interference with 6-bit FRC algorithm)
- There is no more burnin
- BFI has better color quality too

_____

To prevent flicker during phaseshift, use a 50%:50% gamma-corrected alphablend. Gamma correction is helpful to the alphablended BFI frame because RGB(128,128,128) is not exactly half the number of photons of RGB(255,255,255). The electric static charge stops building up because the phaseshift will "rewind" the burnin.
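To make the gamma point concrete, here is a minimal sketch of a gamma-corrected brightness scale. It assumes the display follows the standard sRGB transfer curve, which a real monitor may not, and the function names are illustrative only.

```python
def srgb_to_linear(c8: int) -> float:
    """8-bit sRGB code value -> linear light in 0..1 (standard sRGB curve)."""
    c = c8 / 255.0
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def linear_to_srgb(lin: float) -> int:
    """Linear light in 0..1 -> nearest 8-bit sRGB code value."""
    c = lin * 12.92 if lin <= 0.0031308 else 1.055 * lin ** (1 / 2.4) - 0.055
    return round(c * 255.0)


def scale_brightness(c8: int, factor: float) -> int:
    """Scale the photon output of a code value by `factor`, in linear light."""
    return linear_to_srgb(factor * srgb_to_linear(c8))


# RGB 128 emits far less than half the light of RGB 255 under this curve.
print(srgb_to_linear(128))
# The code value that emits half the light of 255 is well above 128.
print(scale_brightness(255, 0.5))
```

For the 50%:50% blend of a visible frame with a black frame, each channel of the visible frame would go through `scale_brightness(value, 0.5)` instead of being halved directly.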

I suggest argument --antiburnin <seconds>
(or come up with a better name consistent with BFI arguments)
- Enabled by default for refresh rates that are an even multiple of the emulated refresh rate, like 120Hz, 240Hz, 360Hz (default: 30 seconds or 1 minute)
- Disabled by default for refresh rates that are an odd multiple, like 60Hz, 180Hz, 300Hz
- Can be force-disabled by specifying 0
- Can be force-enabled by specifying seconds
- Simply adds 1 repeat refresh cycle, to add an intentional phaseshift. Initially, I'd suggest every 30 seconds as a boilerplate default, i.e. 1 extra refresh cycle per 1800 MAME frames (at 120Hz, 3601 refresh cycles instead of 3600).
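The proposed option could be sketched roughly like this. It is an illustration only: `--antiburnin` is a suggested argument, not an existing one, and the function names are invented here. It assumes the display Hz is an exact multiple of the emulated Hz.

```python
def antiburnin_default_enabled(display_hz: int, content_hz: int = 60) -> bool:
    """Default on for even multiples (120/240/360Hz), off for odd (60/180/300Hz)."""
    return (display_hz // content_hz) % 2 == 0


def refresh_counts(display_hz, content_hz, antiburnin_seconds, total_frames):
    """Refresh cycles spent on each emulated frame.

    Every `antiburnin_seconds` worth of emulated frames, one frame is held
    for one extra refresh cycle, which swaps the polarity phase.
    A value of 0 disables the phaseshift.
    """
    base = display_hz // content_hz
    interval = int(antiburnin_seconds * content_hz)
    counts = []
    for frame in range(1, total_frames + 1):
        extra = 1 if interval > 0 and frame % interval == 0 else 0
        counts.append(base + extra)
    return counts


counts = refresh_counts(120, 60, 30, 1800)
print(counts[:5], counts[-1], sum(counts))
```

With a 30 second interval at 120Hz, every frame gets 2 refresh cycles except the 1800th, which gets 3.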

"Burn-In Version 2" Advanced Improvements
I also highly recommend using high precision clocks (e.g. RDTSC) to better guess which voltage phase the refresh cycles are currently at, so incidental computer freezes and disk freezes (that skip 1 refresh cycle) can automatically make the upcoming phaseshift refresh cycle unnecessary. In other words, count your chicken eggs of refresh cycles carefully, so you're always aware whether the first number of your BFI sequence is landing on an odd-numbered or even-numbered refresh cycle. Then goodbye burn-in: it never appears. I can run my monitor in BFI 24/7 for months with no burn-in (except what would result from, say, 60 seconds of execution -- the phaseswitch length). Say you specify --antiburnin 60: burn-in then rewinds back-and-forth continuously, e.g. 60 seconds of very faint burn-in, followed by 60 seconds of burn-in rewinding (via the phase-swap refresh cycle that swaps the voltage polarities of the black frames). For burnin-insensitive LCDs, half an hour could work. For burnin-sensitive LCDs, 15 to 30 seconds is best.
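The refresh-cycle counting idea can be sketched as follows. This is a simplified illustration, not a tested implementation: the function names are hypothetical, the timestamps would come from a high-resolution clock (in Python, `time.perf_counter()` would stand in for RDTSC), and it assumes the display runs at a fixed, known Hz.

```python
def refreshes_elapsed(t_start: float, t_now: float, display_hz: float) -> int:
    """Estimate how many refresh cycles the display has scanned out."""
    return round((t_now - t_start) * display_hz)


def phase_already_flipped(presented: int, t_start: float, t_now: float,
                          display_hz: float) -> bool:
    """True if an odd number of refresh cycles went by without a new frame.

    An odd number of missed cycles (e.g. one skipped refresh during a disk
    freeze) has already swapped the polarity phase, so the next scheduled
    phaseshift refresh cycle can be cancelled.
    """
    missed = refreshes_elapsed(t_start, t_now, display_hz) - presented
    return missed % 2 == 1


# 240 frames presented at 120Hz should take 2.0 seconds.
print(phase_already_flipped(240, 0.0, 2.0, 120))            # on schedule
print(phase_already_flipped(240, 0.0, 2.0 + 1 / 120, 120))  # one cycle skipped
```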

Actually, there's already an open-source antiburnin algorithm that someone implemented. I'll try to dig it up; it's in a thread at Blur Busters Forums (either the Programming area or the Area51 area).

Native 60 Hz Hardware Single-Strobe Is A One-Line Monitor Firmware Fix
But Manufacturers Aren't Doing It

In the past, the main reason we needed software BFI for emulators is to use software BFI to block out every other strobe for a 120Hz hardware strobe backlight.

But it shouldn't have happened this way. We are disappointed in multiple monitor manufacturers not adding a native 60Hz hardware-strobe mode (few panels have single-strobe, like BenQ XL2411P and some LG 60Hz OLED panels) because of manufacturer bias against allowing users to enable low-Hz CRT flicker. Fundamentally, it's only a 1-line firmware modification to enable 60Hz single-strobing. There's already a petition thread, Dear ViewSonic: Please Add 60 Hz Single-Strobe for PureXP on XG270. You can also hack it by adding an Arduino to the backlight controller, to force a strobe independently of the firmware.

Interim Easy Phosphor Fade Emulator Before Holy Grail of Future "Temporal HLSL"

Ideally, it should be zoned (e.g. rolling scan) in a future "Temporal HLSL" implementation but it can just be global faded BFI frames for now. Future "Temporal HLSL" (electron gun emulator) generating rolling software BFI (segmented rolling scan spread over multiple refresh cycles at 240Hz, 360Hz, 480Hz, 720Hz and 1000Hz future monitors) -- could also emulate sub-refresh phosphor fade as part of the future "Temporal HLSL". Basically the millisecond-timescale sub-refresh emulation of a CRT electron gun.

However.... easy software BFI can also be done globally too:
-- Blended full global BFI for 240Hz using formulas such as 0%-25%-100%-25% 4-frame-per-refresh cadence
-- Blended full global BFI for 360Hz using formulas such as 0%-0%-25%-100%-25%-0% 6-frame-per-refresh cadence
-- Since it's a continual cycle, "0%-0%-25%-100%-25%-0%" is identical to "25%-100%-25%-0%-0%-0%" except the latter is less laggy, since the first refresh cycle contains a visible frame

For now, the easy BFI sequence could perhaps be configurable via a string in the configuration file:
"100,0" (120Hz), the traditional version, 50% less motion blur
"100,50" (120Hz), the less flickery version (a bit more blur though), 25% less motion blur
"100,0,0" (180Hz), the lowest-blur for 180Hz, about 1/3 brightness, 66% less motion blur
"100,0,0,0" (240Hz), the lowest-blur for 240Hz, about 1/4 brightness, 75% less motion blur
"100,0,0,0,0,0" (360Hz), the lowest-blur for 360Hz, about 1/6 brightness, 83% less motion blur
"100,100,100,0" (240Hz), the brightest for 240Hz, about 3/4 brightness, 25% less motion blur
"25,100,25,0" (240Hz), a phosphor fade emulated 240Hz, 66% less motion blur
"25,100,25,0,0,0" (360Hz), a phosphor fade emulated 360Hz, 75% less motion blur
"25,50,100,50,25,0" (360Hz), a brighter phosphor fade emulated 360Hz, 40% less motion blur

Could be pre-alpha-blend corrected or uncorrected, just be consistent (stick to a standard). For compatibility with BFI anti-burnin algorithm, simply average the first & last number and use that alphablend percentage for that specific phaseshifting refresh cycle.
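A minimal sketch of how such strings could be handled. The helper names are hypothetical, and the percentages are assumed to mean linear-light fractions (the "corrected vs uncorrected" choice above is left open).

```python
def parse_bfi(text: str) -> list:
    """Parse a string like "25,100,25,0" into per-refresh levels in 0..1."""
    levels = [int(part) for part in text.split(",")]
    if any(not 0 <= v <= 100 for v in levels):
        raise ValueError("each BFI level must be between 0 and 100")
    return [v / 100 for v in levels]


def relative_brightness(levels: list) -> float:
    """Average light output relative to an unstrobed image."""
    return sum(levels) / len(levels)


def rotate_visible_first(levels: list) -> list:
    """Rotate the cycle so it starts on the first visible refresh (less lag)."""
    for i, v in enumerate(levels):
        if v > 0 and levels[i - 1] == 0:  # levels[-1] wraps around the cycle
            return levels[i:] + levels[:i]
    return list(levels)


def phase_shift_level(levels: list) -> float:
    """Blend level for the extra anti-burnin refresh: mean of first and last."""
    return (levels[0] + levels[-1]) / 2


print(relative_brightness(parse_bfi("100,0,0,0")))
print(rotate_visible_first(parse_bfi("0,0,25,100,25,0")))
print(phase_shift_level(parse_bfi("100,0")))
```

The string length also gives the refresh multiple: a 4-entry string implies 240Hz for 60Hz content.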

BFI Can Be Done At Same Time As VRR

If one wants to do BFI-in-VRR, for simplicity one can add a "--emulate-hz-via-vrr" argument or something similar, and tell the user that it must be within VRR range. Doing "--emulate-hz-via-vrr 180" would emulate a 180Hz monitor and then activate the 3-count BFI. You'd just use a high-precision software timer (with a last-minute busywait for RDTSC-league time alignment to the previous refresh) to temporally separate Present() calls by exactly 1/180sec, to make sure the monitor refreshes at 180Hz (as long as 180Hz is within the variable refresh rate range). So 180Hz fixed-Hz and 180Hz via VRR would have exactly the same BFI appearance and behavior. Obviously, you'd prefer not to be too close to the min Hz or max Hz of your VRR range when emulating a Hz via VRR, to give you some flexibility to adjust VRR timing. You could use multiples instead, e.g. "--emulate-vrr-multiple 3", which is multiplied by the emulator module's refresh rate (typically 60 but could be 50 or 53 or other), whichever is architecturally simpler...
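The timer-plus-busywait pacing could look roughly like this. It is a sketch only: `present` is a placeholder callback standing in for the real Present() call, `spin_margin` is an assumed tuning value, and Python's `time.perf_counter()` stands in for an RDTSC-class clock.

```python
import time


def paced_presents(present, hz: float, count: int, spin_margin: float = 0.002):
    """Call present(i) `count` times, spaced 1/hz seconds apart.

    Sleeps coarsely until shortly before each deadline, then busy-waits
    for the final stretch. Deadlines are advanced by a fixed interval so
    timing error does not accumulate.
    """
    interval = 1.0 / hz
    deadline = time.perf_counter()
    stamps = []
    for i in range(count):
        remaining = deadline - time.perf_counter()
        if remaining > spin_margin:
            time.sleep(remaining - spin_margin)
        while time.perf_counter() < deadline:
            pass  # last-minute busywait
        present(i)
        stamps.append(time.perf_counter())
        deadline += interval
    return stamps


# 3-count BFI at an emulated 180Hz: visible on every third Present().
shown = []
paced_presents(lambda i: shown.append("frame" if i % 3 == 0 else "black"), 180, 6)
print(shown)
```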

Either way, the idea is to keep BFIv2 programming simple by using the same BFIv2 code for both VRR and non-VRR monitors. Phaseshift during VRR could still happen if it's an even number of VRR refresh cycles (e.g. a desire for less motion blur at 240Hz and 360Hz). And since some users may forget to turn VRR off when running emulator BFI, we should at least be able to handle BFI regardless of VRR ON/OFF.

Also, generic VRR detection is theoretically easy under Microsoft Windows: monitor D3DKMTGetScanLine() after Present(). If scanline numbers start incrementing instantly after Present(), your software triggered the hardware refresh cycle. This can also be used for VRR range detection (min and max Hz can be detected this way too, since at min Hz, scanline numbers start re-incrementing when repeat-refreshes begin to occur). By doing this, an emulator can auto-adapt BFI to work universally (automatically choose correct cadence, anti-burnin, etc). A simple cascading-options configuration file format may need to be conceptualized (one that's pathable to Temporal HLSL) to make global BFI (and eventually rolling BFI) Hz-agnostic, VRR-agnostic and auto-adapting.

P.S. As Blur Busters is the resident refresh rate mythbusters: Don't believe in 1000Hz or ultrahigh refresh rates? NVIDIA has confirmed that the diminishing curve of returns only vanishes at >10,000Hz; please read Blur Busters Law: The Amazing Journey To Future 1000Hz Displays, and the other linked articles in my first post before disputing. It's very important when emulating real life in virtual reality, which is booming, like a Holodeck, since real life is infinite refresh rate, with no stroboscopic stepping effects. Think about this: 4K was a $10,000 curiosity 20 years ago, now it's a $299 Walmart special. 120Hz is becoming mainstream soon (new iPhone and Galaxy), and even 1000Hz will be only a few dollars by the end of the 21st century. We shut up the laughing about 1000Hz, and shame mainstream media that jokes about ultra Hz with total slamdunk micdrops. The benefits are visible: like a CRT tube that doesn't need to flicker -- blurless sample-and-hold -- strobeless ULMB -- lagless without needing strobing -- and it looks like per-pixel VRR, where even 24fps, 25fps, 50fps and 60fps content playing at the same time looks simultaneously stutterless -- and all sync technologies (VSYNC ON, VSYNC OFF, GSYNC, FreeSync, FastSync) converge to identical zero lag, zero blur, zero stutter, at higher and higher frame rates. Eventually, it looks like CRT motion clarity without impulsing. (The same motion blur of a 1ms flash can be replaced by 1,000 unique 1ms refresh cycles instead, creating a blurless sample-and-hold display). But it also unlocks exciting emulation possibilities like the customizable Temporal HLSL holy grail -- the ability to emulate a CRT electron gun at a millisecond timescale (software-based "bar" rolling scan, which can be in sync with beamraced vsync algorithms for emuraster=realraster original machine identicalness).

mdrejhon · « **Reply #5 on:** May 29, 2020, 09:07:21 pm »

Quote from: donluca on May 28, 2020, 03:40:31 pm

Maybe one day we'll finally have a solution which will be able to completely replace our beloved CRTs once they'll all die.

High-Hz OLED and MicroLED as CRT saviors, or Ultra-High-Hz MicroLED-FALD LCDs

Someday, I hope retina-resolution direct-view 1000Hz MicroLED screens will make it very easy to emulate most CRT tubes via "Temporal HLSL", except for the actual curvedness. MicroLED is fast enough for 1000Hz in theory, but the journey to get there will take a while (2040s maybe?)

But at least, this decade, I expect to see some "near CRT" emulators via a 240Hz OLED panel or FALD LCD panel (preferably future 10,000 LED local dimming, so that haloing is smaller than CRT phosphor dot haloing!)

I also expect ultra-high-resolution full array local dimming (FALD) with tens of thousands LEDs to become affordable before 2030s, with haloing smaller than a CRT electron gun dot, so those are also suitable horses in this "CRT replacer" refresh rate race.

Keep It Simple Approach

Do not deprecate BFI. Fix BFI. Make BFI the future of HLSL. Get prepared for future retina refresh rates.

BFIv1 is today's burn-in prone BFI
BFIv2 would be what I suggest in this thread, 240Hz/360Hz compatible, and with anti-burnin algorithm. Easy changes!
BFIv3 would be Temporal HLSL (rolling-scan BFI; emulating a CRT electron gun on a sub-refresh basis, piggybacking on the refresh rate race, using sheer Hz to provide the means for electron gun scanning emulation).

BFIv2 can also be modified to work on VRR displays (you can do software BFI during VRR).

The goal is that 1000fps high speed video of BFIv3 (60Hz rolling-BFI emulated on a 1000Hz LCD) and of a 60Hz CRT look identical -- just a rolling bar with a phosphor fade behind it. That's what Temporal HLSL would look like on a per-frame basis (but with the full dynamic range of the display's pixels). This can already begin to be prototyped on common 240Hz gaming monitors, and actually begins to look good on 360Hz panels (it starts looking like a 2-3ms persistence CRT). I'm ESPECIALLY looking forward to the Dell 360Hz IPS panel as a Temporal HLSL achievement unlocker...

Beam Raced VSYNC Refresher (Original Machine Latency in an Emulator), More Faithful Than RunAhead

Beam raced VSYNC is now implemented in some emulators such as WinUAE. Long time readers may remember my Tearline Jedi Experiments treating VSYNC OFF tearlines as rasters, and the development of a jittermargin to make it forgiving enough to have a chase beamracing between the emulator raster (of the emulator buffer) and the real world raster (of the actual display), for sub-refresh latencies. Calamity succeeded in an experimental GroovyMAME patch, though I'm not sure if this is fully implemented yet. There's a Blur Busters Forums thread about beam raced VSYNC too. I should create a better, easier article, because more emulator authors (e.g. Thomas Harte) have successfully experimented with this on the Mac.

I have more high speed videos of an LCD refreshing in real time, and most gaming monitors (at max Hz) just streams those scanlines from cable onto the screen in the same raster direction, so LCDs scan like a CRT, just like a flickerless CRT.

But we can piggyback on this to do beamraced VSYNC, either to the hardware scanout (60Hz scanout) or Temporal HLSL scanout (rolling-bar software scanout).

Beam raced VSYNC is the holy grail of original machine latency (or FPGA-mimicking latency). No RunAhead latency distortions. RetroArch's RunAhead (ArsTechnica article) is amazing, but it never correctly replicates original sub-refresh latency mechanics and can't achieve faithful latency like an FPGA, due to things like latency nonlinearity. The emulated scan rate in RunAhead is always faster than realtime, generating rasters faster than realtime to offscreen frame buffers. This distorts time between mid-screen inputreads (sawtooth latency effects at 60 sawtooths per second, distorting latency extremes to different points of the 1/60sec = 16.7ms time windows), even if you manage to match average latency to the original machine or FPGA machine. Also, Game A may inputread at the end of VBI, and another Game B may inputread at the beginning of VBI. And Game C may inputread at raster #002 or #199 or whatever.

So input lag differentials between a RunAhead emulator and the original machine (at same RunAhead setting), will vary in a window of [0..16.7ms] because of the latency distortions within the RunAhead algorithm. Compare this 16.7ms of lag nonlinearity to the 1ms lag-behavior symmetry achieved by WinUAE's GPU-beamraced sync. With RunAhead and photodiode oscilloscope, one can attempt calibrate lag of one game to match original machine. But a different game will diverge in lag from original machine as a result of that original calibration. So worst-case lag non-faithfulness difference is 16.7ms between Game X (vs original machine) and Game Y (vs original machine) with identical RunAhead settings. So RunAhead can't create universal latency faithfulness. It's a stunning emulator innovation, but latency purists know it does not predictably duplicate sub-refresh latency faithful originality.

Holy Grail: Spatial HLSL (today) + Temporal HLSL (BFIv3) + Beam Raced VSYNC (like Calamity's patch)

On a future 1000Hz display, of course...
I'd love to see both existing Spatial HLSL, simultaneously combined with a future sub-refresh Temporal HLSL, combined simultaneously with beamraced VSYNC (emuraster=realraster sync). For beam raced VSYNC, the internal memory emulator frame (in mid-raster) would need to rasterplot a little bit ahead of the software-based rolling-scan Temporal HLSL emulator, but then you'll be able to beam-race a theoretical future "Temporal HLSL" within the jitter safety margin.

Benefits
- Original faithful CRT motion blur (at least down to refresh granularity, 1ms phosphor = emulatable via 1000fps@1000Hz)
- Original faithful CRT flicker (rolling flicker)
- Original faithful machine latency at all rasters (MAME duplicates latency of original machine, MAME duplicates latency of FPGAs)

In the coming months and years, I plan to post dozens more knowledge articles on Blur Busters (for other reasons; because other markets are interested -- manufacturers, esports, FPS gamers, etc) that will be massively educational to future programmers of future Temporal HLSL algorithms. For various mutually-beneficial reasons, I do plan to raise all boats of Internet knowledge of how a display refreshes, so in five years, enough programmers will be knowledgeable enough to begin implementing Temporal HLSL in open source projects for CRT electron gun emulation. But I would like to see a few emulators start getting prepared, as I am a fan of original faithfulness in emulators.

mdrejhon · « **Reply #6 on:** May 29, 2020, 09:58:49 pm »

Quote from: Calamity on May 29, 2020, 08:10:01 am

That's funny because I was considering deprecating BFI altogether since it was causing some scary image retention on LCDs, and aside that it gets in the middle of a critical feature like frame delay.

I actually have some ideas how to solve this. Wanna do a mutual brainstorm session?

There are options to solve the BFI sync problem
-- Automatic strategic addition of alphablended BFI Frames. A 3-3-3-3-3 phase temporarily becomes 3-3-2-3-3 or 3-3-4-3-3 to resync to an external clock (audio, inputdelay, whatever)
-- Use of slight Present() time lengthen/shortening during VRR BFI (VRR = time flexibility)
-- BFIv3 Temporal HLSL path: variable count of rolling bars per refresh cycle

Some BFI paths are also beamraceable (BFIv3 Temporal HLSL idea).

BFIv3 Advanced Note: Could be an average of 6.6 bars per refresh cycle for a 400Hz refresh rate (like filming a 60Hz CRT on a 400fps camera, the phosphor bars would be approximately 1/6.6th the height of the screen). You'd just do a Temporal HLSL rolling scan like that, which is not necessarily time-aligned across refresh cycles, as long as average photons per pixel per second remains constant. So Temporal HLSL shader code may need a photon accumulator buffer array to keep track, to allow non-divisible rolling-bar BFI, decoupled from needing to be evenly divisible by Hz: the temporal aliasing equivalent of HLSL spatial aliasing! Ultimately, the theoretical Temporal HLSL should be able to run any Hz on any Hz (e.g. 53Hz at 240Hz, 59.94Hz at 1000Hz, 50Hz at 360Hz, etc).
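The non-divisible rolling bar can be illustrated with a small sketch. This is a toy model under stated assumptions, not shader code: integer Hz values only, a hard-edged bar with no phosphor fade or gamma-corrected overlap, and a "photon accumulator" that simply counts how often each row was lit. All names are made up for this example.

```python
def bar_rows(refresh_index: int, display_hz: int, content_hz: int, rows: int):
    """Rows the emulated beam sweeps during one real refresh cycle.

    Integer math keeps consecutive bars exactly adjacent, even when the
    display Hz is not a multiple of the emulated Hz.
    """
    start = (refresh_index * content_hz * rows) // display_hz
    end = ((refresh_index + 1) * content_hz * rows) // display_hz
    return [r % rows for r in range(start, end)]


def accumulate(display_hz: int, content_hz: int, rows: int, refreshes: int):
    """Photon accumulator: how many times each row was lit."""
    lit = [0] * rows
    for k in range(refreshes):
        for r in bar_rows(k, display_hz, content_hz, rows):
            lit[r] += 1
    return lit


# 60Hz emulated on 400Hz with 480 rows: each bar is 72 rows tall,
# i.e. 1/6.67 of the screen height.
print(len(bar_rows(0, 400, 60, 480)))
# Over 20 refresh cycles (3 emulated frames) every row is lit equally often.
counts = accumulate(400, 60, 480, 20)
print(min(counts), max(counts))
```

The bars wrap from the bottom of one emulated frame into the top of the next, so they are not aligned to refresh cycle boundaries, yet the light per row stays constant.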

Also, variable refresh rate is a software-BFI solution too, e.g. a 240Hz VRR monitor can make it easy to do burnin-free 180Hz by doing three Present()s per 1/60sec time period, to make sure that you're delivering three refresh cycles per emulator frame, and you can easily sync it to the emulator's Hz, sound clocks, or frame delays. VRR+BFI is simply software-controlled refresh cycles, because Present() starts a new refresh cycle if the frametime (interval between Present()s) is within VRR range. You can use black frames for some Present()s during software BFI, and you simply need to use an odd number of Present()s to avoid burn-in. And you can even vary the time interval between Present()s, perhaps with a variant of the custom configuration strings above. Because you can create ANY refresh rate within range with VRR, simply by controlling the time interval between Present()s, you can still BFI on them if you wish. And beam raced VSYNC also works on VRR in WinUAE with GSYNC+VSYNC OFF, because a Present() less than one refreshtime apart will "tear into" the current scanout, so you can piggyback on this to create beamraced refresh cycles during VRR, too. Toni of WinUAE figured this out and successfully beamraced VRR. Also, VRR+VSYNC ON and VRR+VSYNC OFF can be autodetected by determining if a Present() immediately after a Present() is blocking or not (since Present() blocks on VRR+VSYNC ON if you try to Present() faster than the refreshtime of a max-Hz refresh cycle). So there's some amazing VRR information that is automatically detectable (could use a --autodetectvrr command line option). Of course, these are platform-specific APIs and only work on platforms that can tell you the hardware raster scan line number of the real world display hardware -- like NVIDIA/AMD drivers via D3DKMTGetScanLine().

Anyway, that said, 1000Hz will probably make VRR obsolete, because 1000Hz is defacto per-pixel VRR.

Quote from: Calamity on May 29, 2020, 08:10:01 am

This 25/50/75 duty cycle implementation could be added if there's interest.

There's no interest because users haven't seen how amazing software BFI looks on a 240Hz IPS monitor yet....little known secret!!! It's that shockingly good compared to the past.

Also, one can just use 180Hz for simplicity if you don't want to code anti-burnin algorithms (see above for burn-in solution). Once more of you discover it, I'm sure 240Hz demand will go up for emulators, because for now, everyone is like "Why do I need a 240Hz monitor to do 60Hz emulation!?"

But the Holy Grail of CRT emulation is lurking within as refresh rates are getting higher, and people are missing the forest for the trees -- this might be bigger and more amazing than beamraced VSYNC because it's an easier-to-program feature -- and a simple emulator user education problem: "Hello, use a 240Hz panel if you want to reduce motion blur with fewer problems!"

Either way, even a hardcoded "0,0,100" configuration string with no antiburnin, will at least let me begin experimenting more often with software BFI, letting advanced users like me design the BFI strings, while keeping things easy on the programmer (you).

Quote from: Calamity on May 29, 2020, 08:10:01 am

We're currently on the process of reconstructing the whole GM gears (yeah, raster "interrupts" on the roadmap finally).

Yes, lots of work. But that's a reason to architect it into a generic frameslice-beamraced delivery pipeline that initially targets hardware scanout (60Hz scanout) but is modular enough to deliver to a software scanout engine (rolling-bar-BFI Temporal HLSL). I don't want to see it code-architected into a corner, shooting ourselves in the foot.

_________

While 180Hz BFI can be used software-only without hardware strobing, one can also use a 180Hz strobe backlight with software BFI to get low-persistence without needing anti-burn-in code. So a quick simple change (few lines) to at least unlock the ability for users to set up a 180Hz ViewSonic XG270 PureXP+ strobing at 180Hz, with using "0,0,100" software BFI to convert 180Hz strobing to 60Hz strobing. Then it looks like great 60Hz strobing, but WITHOUT the burn-in problem (without needing the burn-in compensation).

For now, we'd instruct users "Please use 180Hz on 240Hz, or please use 300Hz on 360Hz" to avoid burn-in problems, until antiburnin code is added.

_________

Proposed Coding Path

ASAP
For now, I propose the simplest of simplest change, which will at least unlock flexible BFI in the easiest possible manner.
- The comma-separated BFI string

This Year
Then when you have time
- Anti-burnin for BFI
- VRR-compatibility for BFI

Year 2021
Then incrementally improve once more users post glowing reviews of fixed BFI (Especially on superior IPS panels)
- Mutually design a cascading configuration file (I can help incubate conceptual designs)
- Autodetect VRR ON/OFF
- Autodetect VRR range

Year 2022
Then even more advanced steps:
- Autoconfigure best BFI pattern for current monitor Hz + monitor settings.
(Allowing BFI to work automagically with default cadences, on current monitor settings, safely, with no burn in, as long as minimum Hz is
- Permit decouple BFI Hz from real Hz if fixed Hz (allow it to autoblend/interpolate over multiple Hz in temporal adjacent-refresh-cycle blending)
(Duuuh, this is kind of how I pull off software-emulated VRR, www.testufo.com/vrr ...)

BTW, my brain is able to mentally emulate new algorithms and new TestUFO tests before I design them, and I've already emulated a few BFI algorithms in my brain. I could help the original HLSL people architect a Temporal HLSL, if they asked me.... (I even mentally saw the optical illusion www.testufo.com/eyetracking and www.testufo.com/persistence before I created the test, and that's how I invented www.testufo.com/ghosting sync track pursuit camera that is now widely used by display reviewers, 1 year before peer reviewed researchers discovered me and created a scientific paper that confirmed it a year later.)

Year 2023
First implementation of Temporal HLSL (software-based sub-refresh-cycle HLSL CRT segmented rolling-scan BFI that is also Hz-agnostic).

Delay/Accelerate these suggested dates. Whatever works. But please, incubate. I'd like to do 180Hz with GroovyMAME, since 180Hz is naturally 100% burnin-proof with 60Hz software BFI. (ViewSonic XG270 already supports 180Hz, both strobed and non-strobed).

Hz-decoupling would be problematic at low Hz, but with tomorrow's fine-granularity Hz (360Hz 480Hz 720Hz 1000Hz of the monitors of the 2020s and 2030s), it becomes human unnoticeable, as long as you do it correctly. You could do software BFI 60Hz on a 321Hz monitor or 574Hz monitor, with a temporally-antialiased software BFI, via gamma-corrected alphablends between two adjacent refresh cycles at the simplest level. And, all of these would be naturally burnin-resistant obviously due to the continual Hz-slewing. Hz-agnostic VRR-compatible antiburnin BFI becomes nicer and nicer, the higher the monitor Hz goes.
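
As a rough illustration of the temporally-antialiased idea, the Python sketch below computes how much of each real refresh overlaps a lit window that starts at every emulated frame, then gamma-encodes that fraction. It is an assumption-laden sketch, not a proposed implementation: the function names are hypothetical, the lit window is one real refresh long by default, the display is treated as a pure 2.2 power law, and it assumes display Hz is at least the emulated Hz.

```python
def bfi_linear_weights(display_hz, emulated_hz, lit_refreshes=1, count=12):
    """Linear-light fraction (0..1) of each real refresh that falls inside a lit window."""
    # Integer time base: one tick = 1 / (display_hz * emulated_hz) seconds.
    period = display_hz                 # emulated frame period in ticks
    refresh = emulated_hz               # real refresh period in ticks
    lit = lit_refreshes * refresh       # lit window length in ticks
    weights = []
    for n in range(count):
        start, end = n * refresh, (n + 1) * refresh
        k = start // period
        overlap = 0
        for frame in (k, k + 1):        # a refresh can straddle two emulated frames
            lo = max(start, frame * period)
            hi = min(end, frame * period + lit)
            overlap += max(0, hi - lo)
        weights.append(overlap / refresh)
    return weights


def gamma_encode(weights, gamma=2.2):
    """Convert linear-light fractions to 8-bit code values for an assumed power-law display."""
    return [round(255 * w ** (1 / gamma)) for w in weights]


weights = bfi_linear_weights(321, 60, count=12)
print([round(w, 3) for w in weights])
print(gamma_encode(weights))
```

The partial weights are where the blend between two adjacent refresh cycles happens; summed over time they add up to one refresh worth of light per emulated frame, which is the property that keeps brightness steady.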

Then future "complex steps" can be decided later... Temporal HLSL isn't something we need to worry about for a few years unless someone is brave enough to start it early. But we need to incubate knowledge because lots of people don't realize how amazing software BFI looks now on some panels such as LG OLED HDTVs or the 240Hz 1ms IPS panels (those users having seen crappy LightBoost, chessboard artifacts, banding artifacts, and bad burn in).

Easy Burn-In Proof 180Hz BFI for 240Hz Monitors

I'm OK if you stop at "The comma-separated BFI string", and "Anti-burnin for BFI" (so you, the programmer, are not overwhelmed).

Then that will allow impressive-looking software BFI on the new high-Hz IPS panels.

Even if you stop at this initially -- hopefully my ideas will be welcome for continued improvements after (whether in GroovyMAME, the mainline MAME, or a different fork). Don't have to implement my suggestions; I'm always a dreamer, but it would be fantastic to commandeer retina refresh rates for CRT electron gun emulation at really refresh timescales. Early tests show more impressive results than beam raced VSYNC.

donluca · « **Reply #7 on:** May 30, 2020, 08:22:36 am »

This is an enormous amount of information and way over my head.

Hope Calamity will be able to make sense of most of it and translate it into something in GM in the following years.

Although I have an arcade cab with a freshly recapped monitor, I know it's not going to last forever.
And when it dies, it'll probably be too late to get another one, which means that the cab will have to go, and I'd love to go to a full emulation, full digital solution with 100% accurate emulation of both software and video side.

Calamity · « **Reply #8 on:** May 30, 2020, 08:25:58 am »

There's something I don't like about the current BFI implementation, and it's the main reason I'd like to ditch it. The way it's done now is so that the emulator calls its frame update function normally, and after that, if BFI is enabled, it will draw a black box over the frame buffer and present it. Both present calls, the normal and the BFI one, must be vsynced. This means that BFI wastes half the frame period, resulting in half of the time left for anything else. Now think if we start introducing more than one black frame per period. Indeed, the way it's implemented now, as an excrescence in the frame update function, greatly limits what can be done.

On the other hand, I deeply dislike the idea of phase shifting to solve the burning issue. It certainly has the smell of a temporal workaround. I find much more appealing to jump directly into method BFIv3, if I correctly understood your taxonomy. Even if this means forgetting about 120 Hz LCDs.

The reason for this is that I agree it's actually easier (or at least cleaner) to implement. Instead of sticking black frames after the actually rendered one, which somewhat defeats the emulator's loop logic, I'd rather implement this as a feature of the frame slicing logic. This way, you'd have low latency and blur correction for free.

It's actually easier to implement on VRR because you don't need any fancy beam racing mechanism when the monitor's refresh can be an integer multiple of the emulated game's refresh. Just emulate 3 cpu slices per refresh period, and call Present based on the emulator clock. For each slice, alpha the correct bands on the screen with a black %, and leave the current band fully visible. I wonder if you'd be able to see horizontal banding due to this on 180 Hz. I mean, since this method enables a sort of rudimentary scanning.

Don't hold your breath on this however, we're just starting to think how frame slicing should be merged into the new GM design, and distractions coming from the LCD realm can cause us to derail easily

EDIT: Mark, 1000 Hz HLSL scanning would require monitors to be at least 10x brighter than what they are now, and you know it.

mdrejhon · « **Reply #9 on:** May 30, 2020, 05:27:09 pm »

Quote from: Calamity on May 30, 2020, 08:25:58 am

Don't hold your breath on this however, we're just starting to think how frame slicing should be merged into the new GM design, and distractions coming from the LCD realm can cause us to derail easily

Skip the Anti Burn In Logic For now...

For now, we can just educate users to use odd-multiple Hz on an even-multiple Hz monitor (180Hz on a 240Hz and 300Hz on a 360Hz), to solve the BFI burn in problem. We can just publish something in a wiki, create a new article, or have a Knowledge Base link to this post, instead.

Agreed, that's why I just want to see simple BFI, to begin with. At least add it without anti-burnin support.

Ok, how about I propose:
(A) Just implement comma-separated BFI pattern idea.
(B) Don't implement phaseshift code.

Initially, you don't even have to worry about choosing VRR or non-VRR. I'd prefer non-VRR, just so I can combine 180Hz + PureXP, because PureXP doesn't work at the same time as VRR. However, I think a simple comma-separated BFI solution would work both with VRR and non-VRR. I'd rather you didn't deprecate BFI though, but find a way to migrate it to the future.

Theoretically, BFI could be decoupled as a completely separate engine for another programmer, and let some other programmer worry about BFI. For example, a Windows virtual display driver that handles BFI. GroovyMAME wouldn't need to know. (Are there any virtual display driver people here?)

Blur Busters had actually worked on a virtual display driver project that confirmed BFI works at a Windows driver level (refresh-level driver, not frame-level driver so SweetFX / RTSS / ReShade approaches won't work). There was a Windows virtual display driver that I financed the development of, but it is in legal limbo. However, it's long been public knowledge that BFI can be achieved in a Windows virtual display driver, and we know it works... If anyone developed such as BFI driver independent of GroovyMAME, all it would need to monitor is when the frame changes, and it can use flywheel-sync algorithms to sync to the framerate, and handle its own phaseshifting.

(It would mean a 1-refresh-cycle-granularity latency change everytime a phaseshift happened, but at 240Hz, that's only a 4.2ms latency change that will rewind itself during the next phaseshift). And people who want consistency can just use VRR-BFI or the 3x-trick or 5x-trick to avoid phaseshift algorithms for anti-burnin.

Also, another argument against deprecating classical BFI, is that I read about some GroovyMAME users who use BFI with CRT tubes already, to allow 60Hz single-strobe 15KHz look on 31.5KHz arcade tubes, because some of them are doing 120Hz 240p in place of 60Hz 480p (same scanrate), and then using GroovyMAME software BFI to make it look like 60Hz 240p to make a 31.5KHz arcade tube look like a 15.3KHz tube.

Quote from: Calamity on May 30, 2020, 08:25:58 am

I wonder if you'd be able to see horizontal banding due to this on 180 Hz. I mean, since this method enables a sort of rudimentary scanning.

Yes and no. Depends on the motionspeed.

VSYNC OFF tearing is visible at 180fps at 180Hz, so sharp bands will create tearing-looks during rolling-BFI. So you need to alphablend the overlapping slices. Bigger overlap would be bigger invisibility. But higher Hz would allow higher motionspeeds without seam artifacts.

We calculated, that for most motionspeeds of original arcade games, that 360Hz would be the threshold where the seams begin to disappear while producing very useful low persistence for rolling-scan BFIv3 (Temporal HLSL).

Let's consider a very fast-panning arcade game that pans one screenwidth per second. For a 1080p LCD, that is a motionspeed of 1920 pixels/sec, 1920/180 means 11 pixel offsets between rolling-scan frameslices at 180Hz rolling-scan. Even with alphablend, there will be noticeable disjoints.
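
The arithmetic above generalizes to any motionspeed and rolling-scan rate. A small Python sketch (the function name is hypothetical) that reproduces the 1920 pixels/sec example and compares higher refresh rates:

```python
def seam_offset_pixels(motion_px_per_sec, scan_hz):
    """Horizontal disjoint between adjacent rolling-scan frameslices, in pixels."""
    return motion_px_per_sec / scan_hz


for hz in (180, 240, 360):
    print(hz, round(seam_offset_pixels(1920, hz), 1))
# 180 10.7
# 240 8.0
# 360 5.3
```

The offset shrinks in direct proportion to the refresh rate, which is why the same pan that shows seams at 180Hz becomes easier to hide with an alphablend at 360Hz.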

For slower motionspeeds (e.g. walking Super Mario Brothers), it won't be visible. But a running Super Mario pan would show a slight amount of 180Hz rolling-scan seams, unless you used extremely large persistence (reducing motion blur only a little, rather than by 2/3rds).

That's why 240Hz and 360Hz will be the very beginnings of seamless-enough rolling-scan for fast motions. For now, global BFI will work much better for 180Hz because of this. The good news is you can thought-experiment merging BFIv2 and BFIv3 if you don't mind partially including BFIv3 concepts mentally.

[Thought experiment, not necessarily final numbers]

However, I think BFIv3 can actually clone BFIv2 120Hz by configuring:
sliceduty=1 (pixel visibility time of one refresh cycle)
slicealphablend=0 (no blend percentage between slices)
slicegamma=2.2 (default, ignored if slicealphablend=0)

Or one could use:
sliceheight=100.00 (full screen height frameslice)
slicealphablend=0 (no blend percentage between slices)
slicegamma=2.2 (default, ignored if slicealphablend=0)

Probably bad parameters, but just an example of how this could be begun.

With such configuration parameters, the BFIv3 would autocompute a rolling-bar pattern that was a full-height "bar" for 1 refresh, and a full-height "black" for the next refresh. Allowing BFIv3 to clone BFIv2.

Personally I'd prefer the "sliceduty" approach over "sliceheight" because "sliceduty" is refresh-rate agnostic, meaning it would autocompute the BFI rollingbar height depending on the source refresh rate (emulated) and destination refresh rate (actual hardware).

You could even do a 1/3.333th height bar for outputting 60Hz to a 200Hz monitor.
The BFIv3 would autocompute overlaps between refreshes.

For emulating the electron gun of a 60Hz CRT on a 200Hz display using a Hz-non-divisible rolling-scan "Temporal HLSL" phosphor bar:
(rolling-bar black frame insertion)

Concept of Hz-Agnostic Rolling Scan BFI (CRT Scanning Emulation)
Situation Example of 60Hz CRT emulation onto a 200Hz LCD

Emulator Refresh Cycle #1:
....Real Refresh 1: full 60/200th height bar (30% screen height), at 0%-30% vertical position
....Real Refresh 2: full 60/200th height bar (30% screen height), at 30%-60% vertical position
....Real Refresh 3: full 60/200th height bar (30% screen height), at 60%-90% vertical position
....Real Refresh 4: 1/3 of 60/200th height bar (10% screen height), at 90%-100% vertical position

Emulator Refresh Cycle #2:
....Real Refresh 5: 2/3 of 60/200th height bar (20% screen height), at 0%-20% vertical position
....Real Refresh 6: full 60/200th height bar (30% screen height), at 20%-50% vertical position
....Real Refresh 7: full 60/200th height bar (30% screen height), at 50%-80% vertical position
....Real Refresh 8: 2/3 of 60/200th height bar (20% screen height), at 80%-100% vertical position

Emulator Refresh Cycle #3:
....Real Refresh 9: 1/3 of 60/200th height bar (10% screen height), at 0%-10% vertical position
....Real Refresh 10: full 60/200th height bar (30% screen height), at 10%-40% vertical position
....Real Refresh 11: full 60/200th height bar (30% screen height), at 40%-70% vertical position
....Real Refresh 12: full 60/200th height bar (30% screen height), at 70%-100% vertical position
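
The bar positions in the listing above can be autocomputed from the two refresh rates. Below is a minimal Python sketch under stated assumptions: the function name is hypothetical, there is no alphablend overlap, and the display Hz is at least the emulated Hz. In this sketch a bar that runs past the bottom of the screen is split into two segments that belong to the same real refresh (the tail of one emulated frame plus the head of the next), so three emulated refreshes take ten real refreshes at 200Hz. Treat that as one reading of the concept rather than final numbers.

```python
import math
from fractions import Fraction


def rolling_scan_segments(display_hz, emulated_hz, real_refreshes):
    """For each real refresh, list (emulated_frame, top, bottom) segments as screen-height fractions."""
    bar = Fraction(emulated_hz, display_hz)      # bar height, e.g. 60/200 = 30% of the screen
    plan = []
    for n in range(real_refreshes):
        top = n * bar                            # scan position measured in emulated frames
        bottom = top + bar
        frame = math.floor(top)
        if bottom <= frame + 1:
            segments = [(frame, top - frame, bottom - frame)]
        else:
            # Bar wraps: finish this emulated frame, then start the next one at the top.
            segments = [(frame, top - frame, Fraction(1)),
                        (frame + 1, Fraction(0), bottom - frame - 1)]
        plan.append(segments)
    return plan


for n, segments in enumerate(rolling_scan_segments(200, 60, 10), start=1):
    parts = ", ".join(f"emulated frame {f + 1}: {float(a):.0%}-{float(b):.0%}"
                      for f, a, b in segments)
    print(f"real refresh {n}: {parts}")
```

Using exact fractions avoids rounding drift over long runs; a real implementation would convert the fractions to scanline numbers and add the configurable overlap.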

So you get the concept of Hz-agnostic BFI. Yet the configuration parameters would be configurable to be Hz-divisor BFIv2 with no alphablend. Basically the venn diagram of BFIv3 configurability would fully overlap BFIv2!

For now, use simple mathematics, and makes sure performance stays high (framerate=Hz). Though, theoretically, one can dynamically expand/shrink height of the bars slightly (below human perceptible levels, like 1ms changes), to slew to a sync problem, e.g. audio sync, or arcade monitor sync, or emulator module running behind schedule, etc. Algorithmically, it is important to keep photons-per-pixel-per-second constant, so a photon accumulator array approach (an array the size of the screen resolution), if you want to have varying-framerate BFI for any reasons. Now, that veers into overly complex-think (thinking too far ahead), so I'll back away from those thoughts for now...

Anyway, if one decides to proceed with rolling-bar BFI, be careful that overlapped alphablends are gamma-corrected, with an adjustable-height overlap and an adjustable gamma-correction factor, so that photons-per-pixel-per-refresh is identical. And persistence should be adjustable. If adjusted to a factor of one 180Hz refreshcycle worth of photons, then a pixel that was 100% bright one refresh should be 0% bright the next refresh. In the alphablend zone, a pixel that was 25% bright should be 75% bright the next refresh (and vice versa). And obviously, it should be gamma corrected, so that 50% bright is exactly half the nit brightness. (Non-gamma-corrected RGB(128,128,128) is not half the number of photons as RGB(255,255,255), so you must always gamma correct to avoid "brightness bars" artifacts. Otherwise, you get dim bars or bright bars in the alphablend zone, depending on how you calculated or how the overdrive setting is, so the gamma-corrected overlap needs to be adjustable.)

This becomes much more seamless at higher Hz than at lower Hz. But needless to say, 180Hz and 240Hz permit early prototyping of BFIv3. It will look okay with blur-reducing slow-panning platformers (e.g. Donkey Kong Country would probably look just fine), but at 180Hz there will be blurry-tear artifacts during fast panning (e.g. a full-running-speed Super Mario). For 180Hz, global BFI will look vastly superior at fast speeds.
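
The gamma-correction point for the alphablend zone can be shown with a short Python sketch. It assumes a plain power-law display with the 2.2 default mentioned earlier (real panels, sRGB curves and overdrive will differ, which is why the factor should stay adjustable), and the function names are hypothetical.

```python
GAMMA = 2.2  # assumed power-law gamma; should be user-adjustable in practice


def encode(linear, gamma=GAMMA):
    """Linear-light fraction (0..1) -> 8-bit code value."""
    return round(255 * linear ** (1 / gamma))


def decode(code, gamma=GAMMA):
    """8-bit code value -> linear-light fraction (0..1)."""
    return (code / 255) ** gamma


def blend_pair(linear_now):
    """Code values for an overlap-zone pixel on two consecutive refreshes (photons sum to 100%)."""
    return encode(linear_now), encode(1.0 - linear_now)


print(round(decode(128), 2))   # 0.22 -> RGB 128 emits roughly 22% of the light, not 50%
print(encode(0.5))             # 186  -> the code value that is actually half brightness
print(blend_pair(0.25))        # (136, 224) -> 25% then 75% of the photons
```

Blending the raw code values instead (64 then 191, for example) would deliver noticeably less light in the overlap than in the rest of the bar under this model, which is the dim-bar artifact described above.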

This would be a simplified "Temporal HLSL" that doesn't require much modification to existing HLSL (except beamraced optimizations that work with both hardware beamraced sync and BFIv3 rolling-scan beamraced sync).

If this overwhelms you, I understand, just don't workflow-architecture this into a corner...

Ideally, I would put a simplified rolling-BFIv3 as a separate layer after a mostly-unmodified Spatial HLSL layer, and abstract Present() away from the real Present(). You'd rerun HLSL filtering for a frameslice area (with enough HLSL filter overlap to accommodate disjoints between HLSL grid, and actual pixel grid, and BFIv3 alphablend overlaps). Heck, even do full-screen HLSL reprocess every frameslice, if you want to keep it simple (and just use GPU brute overkill) to continue unmodified non-beamraced-optimized HLSL for beamraced VSYNC for now.

Either way, by abstracting Present() away from the real Present(), a separate module would handle beamraced VSYNC that is compatible with destination-hardware rasters (actual monitor raster, like your existing patch) and a software raster module (beamracing with rolling BFI). The same code can be made compatible with beamraced VSYNC. Basically a virtualized beamraceable Present() layer that is future-proof. The software raster module would be the BFIv3 module that snips an alphablend frameslice out of it into a separate frame buffer, whereupon that's what is actually Present()'d.

So workflow proposal is:
- Continue your frameslice beam racing approach as you were planning to do;
- But abstract Present() in a way to make it compatible with both beamracing the real hardware raster (beamracing to an existing low-Hz display) or future software rolling scan (beamracing to ultrahigh-Hz display with a rolling phosphor bar in sync with emulator raster)
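
One way to picture the abstracted Present() is a small interface with two interchangeable back ends. The Python sketch below is purely conceptual: every class and function name is hypothetical, nothing here is GroovyMAME or Direct3D code, and the framebuffer is just a list of rows so the example can run on its own.

```python
from abc import ABC, abstractmethod


class PresentSink(ABC):
    """Virtualized Present(): the emulator hands over each finished frameslice."""

    @abstractmethod
    def present_slice(self, framebuffer, first_line, last_line):
        ...


class HardwareRasterSink(PresentSink):
    """Stand-in for beamracing the real display raster: just records which slice was delivered."""

    def __init__(self):
        self.presented = []

    def present_slice(self, framebuffer, first_line, last_line):
        self.presented.append((first_line, last_line))


class SoftwareRollingScanSink(PresentSink):
    """Stand-in for rolling-scan BFI: emits one full output frame per slice, black outside the slice."""

    def __init__(self):
        self.frames = []

    def present_slice(self, framebuffer, first_line, last_line):
        width = len(framebuffer[0])
        frame = [row[:] if first_line <= y <= last_line else [0] * width
                 for y, row in enumerate(framebuffer)]
        self.frames.append(frame)


def run_emulated_frame(sink, framebuffer, slices):
    """Emulator side: deliver the frame as equal-height slices, unaware of which sink is attached."""
    height = len(framebuffer)
    for i in range(slices):
        first = i * height // slices
        last = (i + 1) * height // slices - 1
        sink.present_slice(framebuffer, first, last)


fb = [[255, 255] for _ in range(6)]          # tiny 2x6 all-white test frame
for sink in (HardwareRasterSink(), SoftwareRollingScanSink()):
    run_emulated_frame(sink, fb, slices=3)
```

The emulator loop is identical for both sinks, which is the point of the abstraction: the rolling-scan module can be added later without touching the frameslice logic.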

Oh, I just thought of this now: One could even do a sample-and-hold non-strobed BFIv3, simply using sheer refresh rate to simplify beamraced VSYNC (But without impulse flicker). Basically simply presenting the full partially-rasterplotted emulator framebuffer in its current mid-scanout, everytime the real hardware Hz needed a new refresh cycle. Theoretically, one could use a frameslice height for BFI too, as full height. So basically it'd look like non-strobed VSYNC ON, except with sub-refresh latency, without needing RunAhead, for people who hate flickering 60Hz CRTs.
....So theoretically, the configuration parameters should be flexible enough for BFIv3 to do a non-BFI subrefresh VSYNC ON without needing hardware beamracing; just using sheer Hz as the beamracing method instead.
....Metaphorically, you're simply Present()ing the whole partially rasterplotted framebuffer every actual refresh cycle, to achieve original-latencies (sub-emulator-refresh latency) for ordinary VSYNC ON, simply via using the sheer brute Hz instead of VSYNC OFF frameslice beamracing. Nearly the same lagless original-machine latency, within the granularity of the destination Hz, at least. So BFIv3 would be configurable to do this in theory.

For now, understandably, this is a thought experiment; but it's exciting "year 2021-think" or "year 2025-think". We have time before inexpensive arcade CRTs go extinct, but we should begin thought-experimenting this now thanks to emerging display tech...

Still, I hope you can at least quickly implement a simple rudimentary "100%,0%,0%" BFI for 180Hz to help incubate demand for improved BFI (and attract additional developers for future advanced BFIv3 in a year or two). I don't know which MAME fork deserves this (I'd prefer to see this in mainline), but right now, GroovyMAME is more daring about these kinds of endeavours...

Quote from: Calamity on May 30, 2020, 08:25:58 am

EDIT: Mark, 1000 Hz HLSL scanning would require monitors to be at least 10x brighter than what they are now, and you know it.

There's good news for that: HDR

That's what HDR headroom is for.

The world's first 10,000nit display (Sony prototype I saw at CES) is an LCD.

With that, you still can have 1,000 nits after 90% persistence reductions. That's still brighter than an arcade CRT tube. So you can keep reducing persistence all the way to ~1/16th (1ms out of 16.7ms)....still more than 500 nits.
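
The brightness budget is simple duty-cycle arithmetic. A hedged Python sketch (function names hypothetical, assuming average brightness scales linearly with the fraction of the 60Hz refresh period that a pixel is lit):

```python
REFRESH_MS_60HZ = 1000 / 60   # 16.7 ms emulated refresh period


def average_nits(peak_nits, visible_ms, refresh_period_ms=REFRESH_MS_60HZ):
    """Time-averaged brightness when a pixel is lit for visible_ms out of each refresh period."""
    return peak_nits * visible_ms / refresh_period_ms


def required_peak_nits(target_nits, visible_ms, refresh_period_ms=REFRESH_MS_60HZ):
    """Peak brightness needed to keep target_nits on average at a given persistence."""
    return target_nits * refresh_period_ms / visible_ms


print(round(average_nits(10_000, 0.1 * REFRESH_MS_60HZ)))   # 1000 -> 90% persistence reduction
print(round(average_nits(10_000, 1.0)))                     # 600  -> 1ms out of 16.7ms
print(round(required_peak_nits(100, 1.0)))                  # 1667 -> peak needed for 100 nits at 1ms
```

The last line is the other side of the same trade: how much peak output a panel needs before 1ms persistence stays usably bright.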

There's upcoming LG 240Hz 1440p IPS panels (ETA: early 2021) that are VESA HDR600 rated, and I would bet HDR1000 isn't far behind. For a 90% persistence reduction, even a 100 nit screen is still brighter than many 5-year arcade CRT tubes.

HDR exists because it makes those beautiful highlights (e.g. sun glints off a 1957 chevy that's brighter-than-white, or ultra-bright neon signs at night.... 10,000 nits looks absolutely GORGEOUS in these situations). However, the delicious nit headroom helps CRT electron gun emulation in the future.

FALD (Full Array Local Dimming) are typically 1000 nits. They are still somewhat expensive but inexpensive MicroLED backlight sheets for FALD LCDs are coming later in the 2021s-2025s. This will finally produce sub-$1000 FALD desktop monitors in the 24"-32" range, about time (because these screens are close to the appropriate sizes for embedding into a MAME cabinet). So that HDR looks better. This will be much cheaper than OLED panels and MicroLED panels initially, as a stopgap. MicroLED FALDs are super-bright (absolute minimum 1000 nits for some of them). Since these MicroLED FALDs are in the thousands of lights, this keeps haloing down a lot; some of them halo less than a phosphor tube front surface (the halo glow around an electron gun dot); so that's the venn diagram of overlap of FALD-haloing-acceptability. They may still not be found in cheap LCD displays, but the fact that these would be simultaneously FALD + ultra Hz, makes them excellent early candidates for Temporal HLSL emulation of a CRT electron gun in the coming years.

The problem is the bottom-barrel 60Hz LCD is terribly bad, and lowers many people's expectations about LCDs' ability. But anybody who's actually seen BFI on the new 240Hz 1ms IPS panels is shocked at how much better it is nowadays compared to the crappy LightBoost days. And it's going to keep getting better thanks to a convergence of factors (HDR! Sheer Hz! Etc.)

So HDR + retina refresh rates = helps solve the subrefresh CRT electron gun emulation problem.

But yes, Keep it Simple

I understand GroovyMAME's focus is on CRT tubes.

However, GroovyMAME already has BFI, and I would like to have a humble request of keeping that existing feature, and add support for 180Hz BFI (100,0,0) and 240Hz BFI (100,0,0,0), to help incubate demand.

Calamity · « **Reply #10 on:** May 31, 2020, 02:45:53 pm »

Just an update: today I tested BFI at 120 Hz on a CRT, but this time I blacked the upper and lower halves of the screen alternately. I wanted to see if banding is visible in this edge case. I've indeed experienced a glitch that's completely new to my eye. The picture looks solid as long as the eye remains still, which is a difficult task when you try, but as soon as the eyeball moves to track the picture, you can notice a brightness seam in the middle of the screen. Besides, fast scrolling games show a tearing glitch that looks exactly like classical tearing, while it's definitely not there! (I know for sure that both halves of the screen match horizontally). Using a brightness reduction of 50% instead of totally black greatly reduces the issue, while it doesn't totally remove it.

My understanding is that this method of rolling scan simulation logically doesn't play nice with a screen that's already of rolling scan nature. In order to draw the lower half of the screen, the raster has to return to the top and draw the first half in black before it starts drawing the second half. This produces a brightness discontinuity in the middle of the screen that the eye can catch, even if each line is exactly lit for the same amount of time.

Another interesting issue that's likely related to the CRT electronics is that the second half of the screen starts *slightly* narrower than the upper lines, I guess this has to do with how brightness affects the raster width and the sudden brightness change that this method triggers.

I have to say that these issues are really subtle, even in this extreme case. I'd like to think that these wouldn't happen on a sample and hold display for the reasons explained. I even think that blending won't be required. Unfortunately I don't have a higher refresh LCD to test on, everything I have is 60 Hz.

mdrejhon · « **Reply #11 on:** May 31, 2020, 05:12:47 pm »

[duplicate removed]

mdrejhon · « **Reply #12 on:** May 31, 2020, 05:15:46 pm »

What you observed is exactly what I expected. I've already done that test before, my brain correctly predicted that before I made an internal test too. Also, unfortunately it also happens if you do sharp-line BFI on a 120Hz LCD too -- try it on an LCD too! Same problem, even if you eye-track around. Artifact will look slightly different, but there will also be an artifact.

LCDs are already rolling scan (flickerless rolling scan), see high speed videos at www.blurbusters.com/scanout ... That's not the main problem (impulsing versus non-impulsing), however.

Just point a Samsung Galaxy 960fps high speed camera at an LCD running www.testufo.com/scanout and you'll see all LCDs scan in the same way as a CRT. Faster GtG makes these scanouts easier to see. Metaphorically, it's just a fadeless phosphor.

Quote from: Calamity on May 31, 2020, 02:45:53 pm

I have to say that these issues are really subtle, even in this extreme case. I'd like to think that these wouldn't happen on a sample and hold display for the reasons explained. I even think that blending won't be required. Unfortunately I don't have a higher refresh LCD to test on, everything I have is 60 Hz.

Actually, blending will be required, alas.

And blending will still be visible at lower Hz (e.g. 120Hz, 180Hz, 240Hz)

It will look like VSYNC OFF tearing of a stationary tearline in the middle of the screen, with offsets directly proportional to motionspeed in pixel-per-frame. So you will have to alphablend this to "soften the VSYNC OFF tearline effect". Thankfully, sheer Hz, e.g. 360Hz means fast motionspeeds like 1000 pixels/sec only have about 2.8 pixel offsets (1000/360), which can be alphablended. Since 1000 pixels/sec additionally creates about 2.8 pixels of motion blur at 360fps at 360Hz (2.8ms of persistence), the display motion blur combined with the alphablend hides the alphablend-seam artifacts further.

Also, scanskew is still visible on both LCD and CRT: www.testufo.com/scanskew (Try it now on an LCD!). View the above TestUFO test on a 60Hz CRT and on a native 60Hz LCD (not a 120Hz LCD running at 60Hz). iPads work, though rotate by 90 degrees to find the scanskew orientation. That's because both CRT and LCD are rolling-scan. Just LCD is a sample-and-hold rolling-scan. I'm VERY familiar with scanskew, tearing, etc, and the mathematics behind them, understanding the Present()-to-photons.

Now view www.testufo.com/scanskew at 120Hz and at 240Hz. The skew shrinks at progressively higher Hz. Also, some 240Hz monitors use scan-converting TCONs so displaying 60Hz on these 240Hz panels will still show the same quarter-scanskew as 240Hz native. The brute Hz eliminates the rolling-scan artifacts. The same behaviour of progressively-reducing seam artifacts also applies to a higher destination Hz, for a given emulated Hz and motionspeed.

Anyway, the fact that not all pixels refresh at the same time, is mostly irrelevant at higher Hz, since that's the shrunken temporal difference between first pixel and last pixel. So just worry about emulating rolling-scan at the frame (refresh cycle) level, without worrying about the display's own machinations. The bottom line is higher Hz is always better for software-based rolling scan emulators.

And regardless of refresh pattern, sample-and-hold, impulse or not, alphablending between segments will be required for software rolling BFI, no matter what kind of refresh pattern the actual displays use (guaranteed). The only time this will be tearingless will be stationary images on sample-hold for a non-strobed rolling scan.

I will create a TestUFO "CRT Emulator" demo in the coming weeks, with configurable settings. That'll help micdrop slam-dunk the correctness of my answers

Mind you, it'll still be an educational new TestUFO test for educating people in the refresh rate race to retina refresh rates.

True, you can begin with sharp bands with no alphablend to test things out first. That is still worthwhile, though. Static games like Pac Man would be very hard to see issues unless the Pac Man was on the stationary tearline (also visible on LCD, unfortunately).

Now, it will be subtle for stationary, but amplify massively during increasingly faster motion. A fast scrolling platform game or shoot-em-up would show noticeable artifacting. The faster the motion, the worse. But the higher the Hz, the less visible. And the more blending, the less visible. This is a vicious cycle effect of sorts.

Oh by the way, I've posted a GitHub request at RetroArch:
RetroArch Feature Request: (BFIv3) Emulate a CRT Electron Gun Via Rolling-Scan BFI

Calamity · « **Reply #13 on:** June 01, 2020, 04:42:16 am »

Quote from: mdrejhon on May 31, 2020, 05:15:46 pm

Oh by the way, I've posted a GitHub request at RetroArch:
RetroArch Feature Request: (BFIv3) Emulate a CRT Electron Gun Via Rolling-Scan BFI

Mark, you only offer cash to the bad guys

mdrejhon · « **Reply #14 on:** June 02, 2020, 01:07:55 pm »

Quote from: Calamity on June 01, 2020, 04:42:16 am

Quote from: mdrejhon on May 31, 2020, 05:15:46 pm
Oh by the way, I've posted a GitHub request at RetroArch:
RetroArch Feature Request: (BFIv3) Emulate a CRT Electron Gun Via Rolling-Scan BFI

Mark, you only offer cash to the bad guys

It's the reuse of the March 2018 unclaimed beam raced sync bounty that was built for Add Beam Racing Algorithm for lagless VSYNC ON.

Look at the old date on the existing bounty -- I was merely permitting rolling BFIv3 (ala "Temporal HLSL", or CRT electron gun emulation) to also claim this pot since the qualifications can be pre-programmed into it.

In retrospect, my explanations may be a bit overly complex to many developers -- but providing multiple avenues to interpret the bounty (2018's Lagless VSYNC idea, versus 2020's CRT electron beam simulator idea) may make it easier for specific developers to conceptualize.

They are independent ideas but are actually theoretically mergeable ideas. CRT electron beam simulators don't necessarily have to be beam raced, but there are multiple ways for them to be (either via hardware raster at 60Hz, or via software rolling scan via sheer Hz)! The tube can be simulated only after an emulator frame is fully finished ready. Eventually more TestUFO tests are coming that will help train future people.

Offer to clone the Bounty (rolling-scan electron beam simulator)

I have no role in emulator politics -- RetroArch was selected simply because it already has software BFI and supports a huge number of emulator modules in a crossplatform way --

So if it helps make things fair, I'll be glad to extend a duplicate bounty to any MAME-only project (likely via $250 personally + $250 funds matching through various contacts) for a crossplatform sub-refresh CRT emulator that has an optional real-hardware beamracing option. If you personally want to assign this bounty under the GroovyMAME umbrella, send me an email to mark [at] blurbusters.com

Blur Busters is all about display temporals of all kinds (latency, GtG, MPRT, refresh rates, high Hz, beam racing, sync technologies, etc), so we're a fan of the concept of emulating a CRT tube temporally at the sub-refresh timescales.

I am so familiar with display temporals, I know what it takes to create an electron gun emulator. What would be so beautiful is combining spatial HLSL with temporal HLSL to accurately temporally mimic a CRT (including its zero blur nature).

The same CRT emulation codebase can be ported to a different emulator.
I extend this offer to the first MAME implementation (whether GroovyMAME, or mainline, etc)

TestUFO CRT Emulator Animation Coming

I can help start tipping a domino by creating a basic rudimentary TestUFO CRT Emulator to prove the concept of a blended-overlap rolling scan. Minimum goal is 4-segment for 240Hz and 6-segment for 360Hz, with configurable on/off blending. The test won't look good on 60Hz displays (15Hz flicker for 4-frames-per-emulated-CRT-refresh-cycle) but I think a slow-motion feature could be added so that visitors can "watch a software based CRT emulation". This will be hugely educational, methinks.

TestUFO has become the classic slam-dunk, mic-dropping factory for settling many myths and debates (all the spinoff myths since "Humans Cannot See 30fps Versus 60fps") and for raising Internet education about display temporals, a very Blur Busters thing. I originally created TestUFO as a see-for-yourself linkable motion answer to temporal display questions, but it is now also used by display manufacturers and actual scientists/researchers (including for display measurements, such as the pursuit camera for photographing display motion blur).

Sometimes I create new tests to answer a claim -- someone claimed software-based BFI cannot create double images on LCD panels, so I created proof on TestUFO to answer that question. Even a sample-and-hold LCD can clone a CRT "30fps at 60Hz" double image. Sheer Hz is helping make it easier to clone CRT style behaviors. The more Hz, the merrier for CRT emulation, as antique tubes reach lead-turned-into-gold pricing and we're forced to use digital panels in future emulator cabinets. From a preservation perspective, this will be a bigger problem in ten or twenty years. As long as other things don't degrade (colors, HDR, etc), more Hz potentially solves a CRT simulation problem at the temporal level.

Even long before Blur Busters, I've been mythbusting since 1994 on display temporal behavior debates (MSDOS VGA screenshot of my TestUFO's grandfather app I created in 1993, was available as a BBS download)

Replacing aging CRTs may require ways to temporally mimic CRTs in pure software

GroovyMAME's specialization on CRT tubes may eventually need to include emulation of CRT tubes, so this forum thread may become prophetic to 2030s developers looking for ways to emulate a CRT tube temporally --

I often write things 5 or 10 years ahead of schedule from a hobbyist point of view, even without a profit motive. Anyone reading back on old posts by me will realize I was years ahead of schedule on a lot of display temporal topics. It's likely that by the 2030s, 120Hz becomes a few-pennies premium included in almost all displays, with 240Hz even more mainstream than today, and with 1000Hz professional displays. Just like 4K was once waved off as unnecessary, now-cheap 4K shows benefits in MAME HLSL. Likewise, tomorrow's cheap ultra-Hz can show benefits in future "Temporal HLSL" algorithms.

I consider this one of the fun "20% free time" projects. It's my philosophy to do fun projects like these outside the usual Blur Busters stuff, from time to time. Which is why I write here from an enthusiastic hobbyist point of view.

Improvements to Global BFI Desired For Now

For now, it would be nice to see simple steps like improving the existing global black frame insertion ([Feature Request] (BFIv2) BFI is more CRT-like at higher Hz. Need 60Hz BFI for 180Hz, 240Hz, 300Hz, 360Hz) -- that GitHub description is more simply written. That is much easier than even a basic alphablended rolling scan initially.

Global BFI is still helping a few CRT tube users at the moment. Users of VGA tubes (31.5KHz-only scans) can do 240p120 in place of 480p60, and use BFI to generate 240p60 that looks identical to human eyes to 240p60 on a 15.7KHz NTSC tube.

One does not have to combine beamraced VSYNC and BFI at the same time (disable the other when one is enabled), albeit both can be made theoretically compatible.

If you want to combine software BFI and beam racing, one could simply beamrace the first scanout of the first frame of the BFI sequence (emuraster-realraster sync), much like how WinUAE beamraced sync already does a 2x-faster scanout when doing lagless VSYNC of a 60Hz Amiga onto a 120Hz monitor (1/120sec accelerated beamrace, 60 times per second), with 1/120sec pauses between beamraced scanouts. This basically treats every other refresh cycle like a large VBI, regardless of whether it's a black frame or a repeat refresh cycle (so BFI + beamraced sync can be enabled simultaneously). Of course, this assumes the computer is fast enough to execute the emulator in realtime at the faster scanout. The beam racing is identical regardless of whether you do BFI or not, i.e. beamraced VSYNC to a higher Hz.

Calamity · « **Reply #15 on:** June 06, 2020, 05:21:31 am »

Hi Mark,

I was kidding about the cash thing.

I have a doubt concerning your terminology:

Quote

Default preferred BFI cadence 0-0-100% or 0-0-0-0-100% respectively.

Do your percents here mean % of brightness or % of darkness? I'm assuming it's % of darkness. E.g., for a 180 Hz presentation, your preferred cadence would be:

frame #1 [vsync] frame #1 [vsync] black_frame [vsync] frame #2 [vsync] frame #2 [vsync] black_frame ...

or would it be like this:

frame #1 [vsync] black_frame [vsync] black_frame [vsync] frame #2 [vsync] black_frame [vsync] black_frame ...

I'm also assuming we always want to present the new frame as soon as possible and then insert the black frames afterwards, to minimize latency. But by revisiting your posts it's not clear if this is your intention.

mdrejhon · « **Reply #16 on:** June 06, 2020, 06:39:39 pm »

Quote from: Calamity on June 06, 2020, 05:21:31 am

I was kidding about the cash thing...

Understood -- although, this is coming from my hobbyist mind rather than my business mind -- my offer is actually genuine and would be happy to help out other hobbyists.

From a preservation perspective, I am a big fan of preservation of originality (original CRT look, original CRT zero-blur, original CRT rolling scan, original CRT latency), and would be happy to incentivize that a bit. ...Although given low budgets (i.e. three figures rather than four figures at the moment) -- there might need to be a pain point first (e.g. lack of CRTs + wide availability of brute-Hz displays), which might not happen for a number of years yet.

Will be happy to allow the offer to stand until end of 2021 for a beamraced implementation (BFIv3 rolling-scan BFI, rather than global-refresh BFI).

Quote from: Calamity on June 06, 2020, 05:21:31 am

I have a doubt concerning your terminology:

Quote
Default preferred BFI cadence 0-0-100% or 0-0-0-0-100% respectively.

Egads -- good catch about latency -- I meant "100%-0-0" (3-frame cadence situation) and "100%-0-0-0-0" (5 frame cadence situation). I knew about latency, but was 100% focussed on trying to explain the BFI extensions conceptually, without lag-compensating the numbers. However, you're right about the best-practices to eliminate BFI lag.

And yes, the percentages are alphablends between a completely black frame and the MAME emulator frame. So 50% means a very dark MAME frame (where RGB(255,255,255) pixels are actually RGB(128,128,128) in the frame). For simplicity, you don't have to bother to gamma-correct the alphablends for global BFI (that's only for rolling-scan BFI overlap regions)
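The alphablend against black described above is just a per-channel multiply. A minimal Python sketch, assuming 8-bit RGB tuples; `dim_pixel` is a hypothetical helper name (not a MAME function), and no gamma correction is applied, in line with the note that global BFI doesn't need it:

```python
def dim_pixel(rgb, weight):
    """Alpha-blend one 8-bit RGB pixel toward black.

    weight is the BFI brightness: 1.0 = fully visible, 0.0 = black.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0.0 and 1.0")
    # Blending with black is a plain multiply; +0.5 rounds to nearest.
    return tuple(int(channel * weight + 0.5) for channel in rgb)


print(dim_pixel((255, 255, 255), 0.5))  # (128, 128, 128)
```

A real implementation would more likely do this on the GPU (for example as a shader constant), so treat this only as an illustration of the arithmetic.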

I have now edited the original post. Since it's a continual cycle, "0%-0%-25%-100%-25%-0%" is identical to "25%-100%-25%-0%-0%-0%" except the latter is less laggy, since the first refresh cycle contains a visible frame. Yes, the first refresh cycle should be visible, to reduce input latency. Apologies. Thanks for the catch!

For configuration file processing, you can simply numbershift until it hits the first non-black frame. So if a user enters "0,0,100,100,0,0" in a configuration file, it can safely be numbershifted automatically (in the config load logic inside GroovyMAME) until the first number is non-zero, giving "100,100,0,0,0,0". It looks visually identical, but has the same input lag as non-BFI. As long as the first refresh cycle is non-black, the latency will feel the same (except the reduced motion blur will often make it feel less laggy).
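The numbershift amounts to a wraparound left rotation. A short Python sketch of the idea, assuming the sequence has already been parsed into a list of numbers; `numbershift` is an illustrative name, not an existing GroovyMAME routine:

```python
def numbershift(sequence):
    """Rotate a BFI sequence left until its first entry is non-black."""
    values = list(sequence)
    for offset, value in enumerate(values):
        if value > 0:
            # Wraparound rotation keeps the cadence visually identical.
            return values[offset:] + values[:offset]
    return values  # all-black sequence: nothing to rotate to


print(numbershift([0, 0, 100, 100, 0, 0]))   # [100, 100, 0, 0, 0, 0]
print(numbershift([0, 0, 25, 100, 25, 0]))   # [25, 100, 25, 0, 0, 0]
```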

My intention was that the least blur (but most flicker) occurs with such a squarewave cadence: those who want absolute minimum blur from a software-based BFI algorithm want only one visible 100% frame, with the rest 0%. However, there are pros/cons in a phosphor simulation tradeoff.

But I would still like flexibility on percentages -- yes -- so people who want reduced blur but want to soften it with a bit of phosphor decay simulation might actually prefer "100%-50%-25%-0%-0%". Phosphor tends to illuminate darn near instantly, but decay much more slowly after, so it's kinda a triangle sawtooth when you measure a CRT phosphor on an oscilloscope, and stretch the graph from where phosphor hits 100% and decays to 10%. So, it is an attempt to emulate that.

P.S. Optional: Convenient if you auto-fix BFI latency via numbershift routine in MAME config file loader: For anti-burnin, you can also realtime numbershift the BFI sequence internally one cycle forward then one cycle backwards, e.g. "1,0,0,0" becomes "0,1,0,0" cycled back and forth every 5 minutes (like a wraparound rightshift operation, then 5 minutes later a wraparound leftshift operation). The same numbershifting routine that was used during config file loading can also be recycled as an "easy to add" antiburnin feature. The higher the Hz, the smaller the temporary numbershift lag penalty is (e.g. 360Hz is only 2.8 ms penalty for a temporary antiburnin operation). I know you hate the idea of antiburnin logic, but it is an easy-to-add feature once you've already coded the "BFI sequence" system. And every-5-minutes is better than nothing at all.
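As a rough sketch of that anti-burnin toggle in Python: the function name, the `elapsed_seconds` input, and the 300-second default are illustrative assumptions, not an existing option.

```python
def antiburnin_sequence(sequence, elapsed_seconds, period_seconds=300):
    """Alternate between the configured cadence and a copy shifted by one refresh."""
    values = list(sequence)
    # Even periods use the original cadence, odd periods the shifted one.
    shifted_phase = int(elapsed_seconds // period_seconds) % 2 == 1
    if shifted_phase and values:
        # Wraparound right shift: "1,0,0,0" becomes "0,1,0,0".
        return values[-1:] + values[:-1]
    return values


print(antiburnin_sequence([1, 0, 0, 0], elapsed_seconds=0))    # [1, 0, 0, 0]
print(antiburnin_sequence([1, 0, 0, 0], elapsed_seconds=300))  # [0, 1, 0, 0]
```

While the shifted copy is active the visible frame arrives one hardware refresh later, which is where the temporary lag penalty mentioned above comes from.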

Feature request of global BFI Version 2, reposted below in simpler language:

mdrejhon · « **Reply #17 on:** June 06, 2020, 06:54:33 pm »

Quote from: Calamity on June 06, 2020, 05:21:31 am

I'm also assuming we always want to present the new frame as soon as possible and then insert the black frames afterwards, to minimize latency. But by revisiting your posts it's not clear if this is your intention.

Correct.

I am rewriting the feature request into something easier to read, derived from the BFIv2 section of RetroArch feature request (I have 4 different Feature Requests at RetroArch, this is a different one, separate of the Temporal HLSL).

BFI Percentage Terminology:
1 = Fully visible frame
0 = Fully black frame
0.5 = A darkened frame that is 50% brightness

This feature request is rewritten for programmer simplicity & configuration file simplicity.

Feature Request for Improved MAME BFI Implementation
Support Adjustable Software BFI for Higher Hz

Would like to see a MAME implementation to support the following Black Frame Insertion (BFI) sequences:

BFI sequence on 120Hz for 60Hz emulation: ON, OFF
BFI sequence on 180Hz for 60Hz emulation: ON, OFF, OFF
BFI sequence on 240Hz for 60Hz emulation: ON, OFF, OFF, OFF
BFI sequence on 300Hz for 60Hz emulation: ON, OFF, OFF, OFF, OFF
BFI sequence on 360Hz for 60Hz emulation: ON, OFF, OFF, OFF, OFF, OFF
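The list above follows one rule: one visible refresh, then black for the remaining refreshes of each emulated frame. A small Python sketch of that rule, assuming integer refresh rates; `default_bfi_sequence` is a hypothetical helper name:

```python
def default_bfi_sequence(display_hz, emulated_hz=60):
    """One visible refresh followed by black refreshes, e.g. 240Hz -> [1, 0, 0, 0]."""
    if display_hz % emulated_hz != 0:
        raise ValueError("display Hz should be a whole multiple of emulated Hz")
    refreshes_per_frame = display_hz // emulated_hz
    return [1] + [0] * (refreshes_per_frame - 1)


for hz in (120, 180, 240, 300, 360):
    print(hz, default_bfi_sequence(hz))
```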

Best Case Display Motion Blur Reduction by BFI
The easiest way to do so is provide a comma-separated black-frame insertion sequence in a configuration file, to allow customizability. Default strings can be done for common scenarios, but would let advanced users customize BFI. Relative to the original blur of a 60Hz LCD, higher Hz produces more software-BFI-blur-reduction (non-strobed LCD use-case, though software BFI also helps hardware strobing too for these specific numbers, in lower strobe lag + better quality strobing).

120Hz BFI sequence (50% less motion blur): 1 , 0
180Hz BFI sequence (66% less motion blur): 1 , 0 , 0
240Hz BFI sequence (75% less motion blur): 1 , 0 , 0 , 0
360Hz BFI sequence (83% less motion blur): 1 , 0 , 0 , 0 , 0 , 0

Adjustable Motion Blur (Tradeoff Between Flicker + Brightness + Clarity)
Custom sequences can allow you to adjust motion blur, brightness, and flicker tradeoff. Just like TestUFO Variable-Blur BFI Demo. Try this link on a high-Hz LCD with hardware strobing disabled! If you have 240Hz, try configuring 4 or 5 UFOs instead, to see more variable-blur flexibility. Adjustability is a continuum between hardware Hz and emulator Hz. Minimum persistence-based display motion blur is persistence of max Hz (1/360sec visibility = 2.8ms blur). Maximum display motion blur is emulator Hz (1/60sec visibility = 16.7ms blur). Thus the higher the Hz, the more BFI motion blur adjustability.

180Hz bright BFI sequence (33% less motion blur): 1 , 1 , 0
240Hz bright BFI sequence (25% less motion blur): 1 , 1 , 1 , 0
360Hz bright BFI sequence (66% less motion blur): 1 , 1 , 0 , 0 , 0 , 0
360Hz bright BFI sequence (33% less motion blur): 1 , 1 , 1 , 1, 0 , 0
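The percentages in both lists come from the fraction of refreshes that are visible. A Python sketch of that arithmetic, assuming simple on/off sequences (any non-zero entry is counted as fully visible, so alpha-blended values are not modelled); the function names are illustrative:

```python
def blur_reduction(sequence):
    """Fraction of 60Hz sample-and-hold blur removed by an on/off BFI sequence."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    visible = sum(1 for value in sequence if value > 0)
    return 1 - visible / len(sequence)


def persistence_ms(sequence, display_hz):
    """Milliseconds per emulated frame during which the image is visible."""
    visible = sum(1 for value in sequence if value > 0)
    return visible * 1000 / display_hz


print(f"{blur_reduction([1, 0, 0, 0]):.0%}")                  # 75%
print(f"{persistence_ms([1, 0, 0, 0, 0, 0], 360):.1f} ms")    # 2.8 ms
print(f"{persistence_ms([1, 1, 1, 0], 240):.1f} ms")          # 12.5 ms
```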

Basic CRT Phosphor Decay Emulation
In fact, alpha-blended BFI is also desirable, so this could be a percentage setting or floating point setting, to approximate phosphor fade. This makes the flicker feel more closely approximate a CRT tube (as far as refresh granularity permits). And it feels much less harsh than 60Hz squarewave for many.

240Hz alpha-blended BFI slow-rise slow-decay sequence: 0.5 , 1 , 0.5 , 0
360Hz alpha-blended BFI slow-rise slow-decay sequence: 0.5 , 1 , 0.5 , 0 , 0 , 0
360Hz alpha-blended BFI fast-rise, slow-decay sequence: 1 , 0.5 , 0.25 , 0 , 0 , 0
360Hz alpha-blended BFI fast-rise, superslow-decay sequence: 1 , 0.75 , 0.5 , 0.25 , 0.1 , 0
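One possible way to generate a fast-rise, slow-decay sequence rather than typing it by hand is a geometric falloff with a cutoff. This is only a sketch: the geometric model, the `ratio` and `floor` parameters, and the function name are assumptions for illustration, and real phosphor decay curves differ.

```python
def decay_sequence(length, ratio=0.5, floor=0.2):
    """Fast-rise, geometric-decay BFI weights; values below `floor` become black."""
    weights = []
    for index in range(length):
        weight = ratio ** index  # 1.0, then ratio, ratio^2, ...
        weights.append(weight if weight >= floor else 0.0)
    return weights


print(decay_sequence(6))  # [1.0, 0.5, 0.25, 0.0, 0.0, 0.0]
```

With the defaults this reproduces the 360Hz "fast-rise, slow-decay" example above.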

Alternative methods of configuration could be discussed instead.

Hopefully this is a very easy change for a MAME implementation (MAME mainline / GroovyMAME / etc) for the refresh rate race to retina refresh rates. For now, this can be just a simple configuration file string, to help users incubate this. It should be easy to write instruction guides, to help get more users playing with BFI.

Tips
-- This above rewritten suggestion now uses floating-point scale instead of integer percents. (But you can choose)
-- If easier to program, can implement discrete BFI (no percentages) before later adding phosphor-fade support
-- You may choose either numeric approach of integer percents or floating point numbers, whichever you prefer.
-- Comma sequences too long for current actual-hardware refresh rate can simply be truncated for lower Hz. So "1,0,0,0,0,0" can become catchall for all refresh rates
-- Comma sequences too short for current actual-hardware refresh rate can simply pad missing values as 0
-- First number should be non-black (non-zero) for lowest lag. For convenience, config file reader can automatically numbershift to lag-optimize user-defined setting, e.g. turn "0, 1, 0.5, 0" into "1, 0.5, 0, 0". It is visually identical but lower lag.
-- The numbershifting technique can also be used as the anti-retention technique (can be enabled by default for hardware Hz that is an even multiple of emulator Hz, such as 120,240,360, but not for odd multiples such as 180,300, for 60Hz emulators).
-- Future Rolling BFIv3 could actually theoretically use the same BFIv2 strings, simply by using 6 different numbershifted versions of the same strings for 6 slices for the same 360Hz display (360/60 = 6). Or 4 different numbershifted versions of the same strings for 4 slices of the same 240Hz display (240/60 = 4). For some developers, this can theoretically make BFIv3 conceptually simpler to implement (for all phosphor fade speeds), albeit alphablended overlaps will still be needed to eliminate seams/tearing artifacts. You'd simply add a command line argument "rolling = on/off". Although this may not be the most ideal coding path, explaining it this way may be conceptually simpler for a developer to visualize how to turn global BFIv2 into a rolling BFIv3 later on in the future, as a stopgap...
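The truncate/pad tips and the "numbershifted copy per slice" idea can be sketched together in Python. The function names are hypothetical, integer refresh rates are assumed, and the sketch ignores the alphablended overlaps that would still be needed between slices:

```python
def fit_sequence(sequence, display_hz, emulated_hz=60):
    """Truncate or zero-pad a BFI sequence to one entry per hardware refresh."""
    slots = display_hz // emulated_hz
    values = list(sequence)[:slots]
    return values + [0] * (slots - len(values))


def rolling_slices(sequence):
    """One rotated copy of the sequence per horizontal screen slice (top first)."""
    values = list(sequence)
    count = len(values)
    # Slice k lights up k refreshes after the top slice (wraparound right shift).
    return [values[count - k:] + values[:count - k] for k in range(count)]


print(fit_sequence([1, 0, 0, 0, 0, 0], 180))  # [1, 0, 0]
print(fit_sequence([1], 240))                 # [1, 0, 0, 0]
for row in rolling_slices([1, 0, 0, 0]):
    print(row)
```

For "1,0,0,0" at 240Hz this gives four slices that light up on refreshes 0, 1, 2 and 3 respectively, which is the rolling-scan order from top to bottom.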

Once this feature is implemented, I'd be happy to write an article about this to improve emulator BFI awareness among high-Hz monitor users

donluca · « **Reply #18 on:** June 07, 2020, 08:20:50 am »

Man, I have good feelings about this.

Thanks for taking the time to post this very detailed information (and putting your own money in a bounty).

I really hope something will come out of this.

Calamity · « **Reply #19 on:** June 07, 2020, 01:40:14 pm »

I've tweaked the existing -bfi option to take an integer value and made a package for LCD users that's ready to unzip and run, you only need to edit mame.ini to point to your roms folder and your appropriate -black_frame_insertion value:

GroovyMAME 0.221 BFI v2.0

- Option -black_frame_insertion (-bfi) now takes an integer that means the number of black frames to insert after the visible frame:
- For 120 Hz, use -bfi 1
- For 180 Hz, use -bfi 2
- etc.

This is the simplest implementation that's not configurable yet, you can only specify the number of black frames after the visible one, e.g. (1, 0), (1,0,0), (1,0,0,0), etc.

The user is responsible for setting the correct refresh on Windows' desktop (or Linux). The settings in mame.ini are configured for LCD so GM won't attempt to switch the video mode, it will just pick whatever your desktop is using. GM won't attempt any video timings tweaks either, so anything that deviates from your (desktop-refresh / bfi_factor) will impact your emulated game speed (e.g. rtype at 180 Hz will run at 109% with -bfi 2, if you wanted it to run at 100% you'd need to create and set a 165 Hz mode).
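The speed figures above follow from dividing the desktop refresh by the number of refreshes spent on each emulated frame. A quick Python check of that arithmetic; the function names are illustrative, and the 55 Hz native rate used for rtype is an assumption inferred from the 165 Hz figure in the post (165 / 3):

```python
def emulation_speed(desktop_hz, bfi, game_hz):
    """Relative game speed when each emulated frame spans (1 + bfi) refreshes."""
    presented_fps = desktop_hz / (bfi + 1)
    return presented_fps / game_hz


def desktop_hz_for_full_speed(bfi, game_hz):
    """Desktop refresh rate at which the game runs at exactly 100%."""
    return game_hz * (bfi + 1)


print(round(emulation_speed(180, 2, 55) * 100))  # 109
print(desktop_hz_for_full_speed(2, 55))          # 165
```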

cools · « **Reply #20 on:** June 07, 2020, 05:19:08 pm »

Walls of text. Mention of RetroArch. Calamity: "hold my beer"

mdrejhon · « **Reply #21 on:** June 08, 2020, 12:43:24 pm »

Quote from: Calamity on June 07, 2020, 01:40:14 pm

I've tweaked the existing -bfi option to take an integer value and made a package for LCD users that's ready to unzip and run, you only need to edit mame.ini to point to your roms folder and your appropriate -black_frame_insertion value:

GroovyMAME 0.221 BFI v2.0

- Option -black_frame_insertion (-bfi) now takes an integer that means the number of black frames to insert after the visible frame:
- For 120 Hz, use -bfi 1
- For 180 Hz, use -bfi 2
- etc.

This is the simplest implementation that's not configurable yet, you can only specify the number of black frames after the visible one, e.g. (1, 0), (1,0,0), (1,0,0,0), etc.

Fantastic! Thanks for stepping up to be the first emulator to incubate at least a simple, basic >120Hz BFI.

Quick test shows that it eliminates burn-in at 180Hz while still producing correct MAME motion for 60Hz games.
- As seen in other BFI experiments, even the artifact distortions on TN panels (chessboard patterns + color depth loss) disappear at 180Hz too.

Brief tests of 180Hz BFI on a 240Hz monitor:
Nonstrobed 180Hz: There is, as predicted, a correct ~66% reduction of 60Hz motion blur, looks like ~5ms CRT phosphor
Strobed 180Hz: It looks correctly like a 60Hz strobe backlight, but without the burnin/artifacts from nasty interactions with LCD inversion/FRC temporal dithering. There is a minor brightness gradient issue, which is simply the placeholder of strobe crosstalk against the black frame (so no double image, but a slight fadeoff at top or bottom edge). But it is extremely minor and dismissable as a CRT tube artifact too (some are brighter on one edge). It has much more solid looking colors than 120Hz BFI, and no chessboard-artifact texture. Looks like ~1ms CRT phosphor

Image retention (burnin) is completely gone, as predicted.

In both situations, there is a 66% loss of brightness. This is acceptable on bright panels (400+ nits), since most worn arcade CRTs have dimmed to only ~100 nits anyway, especially when displaying bright arcade images. Future 600nit and 1000nit HDR high-Hz panels will certainly help big-time, but at least we're now able to test a real BFI implementation in a real emulator.

Main disadvantage is 66% loss of brightness (regardless of strobed or nonstrobe). However, everything else looks better -- the 60Hz CRT-look-on-LCD has just been upgraded with this modification!
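The brightness cost can be estimated as the average of the BFI weights. A rough Python sketch, assuming brightness scales linearly with the weights and ignoring gamma, strobe duty cycle, and panel behaviour; the helper names are illustrative:

```python
def relative_brightness(sequence):
    """Average light output of a BFI cadence relative to no BFI."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    return sum(sequence) / len(sequence)


def effective_nits(panel_nits, sequence):
    """Approximate on-screen brightness once BFI is applied."""
    return panel_nits * relative_brightness(sequence)


print(f"{1 - relative_brightness([1, 0, 0]):.1%} loss")  # 66.7% loss
print(f"{effective_nits(400, [1, 0, 0]):.0f} nits")      # 133 nits
```

Under these assumptions a 400-nit panel at 180Hz with "1,0,0" still lands above the ~100 nits quoted for a worn arcade CRT.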

Currently, all known 240Hz gaming monitors made in the last 2 years support 180Hz via Custom Resolution. This is a 100% burnin-proof refresh rate with software-based BFI.

Once I do more tests (and hopefully more flexibility is added by then), I'll post an article publicizing the benefits of improved BFI, to keep alive the decade-long discussion of temporal preservation of the CRT look in a display-independent manner. I also started building a TestUFO CRT Emulator (rolling impulse with alphablend overlaps), as a spare-time project.

Instructions for testing new BFI feature for 180Hz+ Displays
For anybody else who has a 240Hz gaming monitor:
1. Go to NVIDIA Control Panel
2. Click "Change Resolution"
3. Click "Customize..."
4. Create a 180Hz refresh rate under default timings
5. Switch to 180Hz
6. Adjust monitor brightness to maximum via monitor's menus
7. Make sure no background software is running that would interfere (slowdowns will cause BFI flicker)
8. Start this modified GroovyMAME with -bfi 2
9. Play a 60Hz arcade game that has fast 60fps motion; fast-scrolling games are a good test

Note: Use -bfi 3 if using full 240Hz; use -bfi 4 if using a 300Hz ASUS laptop. Be aware there can be potential image retention at refresh rates that are even multiples of 60 (such as 240Hz), but it is temporary, as explained above.

The quality is noticeably better than 120Hz BFI for multiple simultaneous reasons, with the sole exception of being slightly less bright than 120Hz BFI on the same screen. BFI at 120Hz has slight image retention on some panels, but 180Hz has absolutely no image retention issues on any panel on the market.

This is a good start because previously it was impossible to do 180Hz BFI or 240Hz BFI with any emulator ....until today.

Calamity · « **Reply #22 on:** June 08, 2020, 04:00:38 pm »

Mark, did you need to tweak the -lcd_range option or did it just work out of the box? I forgot to mention that it might be necessary to extend the default range to something like -lcd_range 59-240

mdrejhon · « **Reply #24 on:** June 08, 2020, 07:19:28 pm »

Quote from: Calamity on June 08, 2020, 04:00:38 pm

Mark, did you need to tweak the -lcd_range option or did it just work out of the box? I forgot to mention that it might be necessary to extend the default range to something like -lcd_range 59-240

It worked out of the box. I didn't need the -lcd_range option.

I did have FreeSync turned off at the moment (to allow me to enable/disable PureXP strobing), so it was 180Hz fixed-Hz.

It did complain with an error message, and it defaults to unfiltered scaling, but the BFI was in perfect cadence at 180Hz (-bfi 2) and 240Hz (-bfi 3).

MAME users who install GroovyMAME to try out BFI: the main non-out-of-box step is making sure the MAME ROMS folder is configured in GroovyMAME's MAME.INI.
I used a fresh unzip of GroovyMAME so that I wouldn't overwrite other MAME installs. If you've been using other copies of MAME / MESS, the MAME.INI config file of the fresh install does need a quick edit to point to the ROMS folder (such as c:\mame\roms), but other than that, all it needed was the -bfi command line option. Note that no CRT filters are enabled by default.

vicosku · « **Reply #25 on:** September 07, 2020, 08:48:56 am »

Hi there. Thanks for all the great work on this. I'm trying to get it to work with a Samsung Odyssey G7 LCD. I can use -bfi 3 at 240hz and it does seem to improve motion. From the posts above, I believe I should be using 180hz and -bfi 2, correct? I've created the 180hz custom mode in the Nvidia control panel according to mdrejhon's instructions, and it is set on the desktop. Unfortunately, Groovymame goes back to whatever refresh rate is set in the monitor's OSD. It ignores the custom resolution. For example, if I set the OSD to 144hz, GM uses 144, 240 uses 240. If I enable Adaptive Sync, it gets set to 240.

Is there something I can set in Mame.ini to work around this?

Calamity · « **Reply #26 on:** September 07, 2020, 12:45:34 pm »

The behaviour you're seeing is correct. GM will pick whatever mode is defined currently for the desktop when -monitor lcd is used (basically it won't switch modes). So you're responsible for setting the desktop to the desired refresh before launching GM.

vicosku · « **Reply #27 on:** September 07, 2020, 02:02:59 pm »

First, thanks for your response in the main update thread. I'll look forward to the next release.

Understood in regard to setting the desktop refresh rate. I have done this according to mdrejhon's instructions. I've attached a photo showing the Nvidia control panel settings and 180hz shown in the OSD. I've also attached a log. 180hz is shown there, but it's shown as "out of range" and eventually -noswitchres is chosen. You can see at the end that the game runs at 133% speed.

*edited after reading the log a bit more closely.

vicosku · « **Reply #28 on:** September 07, 2020, 02:14:01 pm »

Sorry, that log shows 240hz as out of range. Here's the one that shows 180 as out of range. I think the difference between the two was turning off VRR on the monitor. The end result was the same though.

jimmer · « **Reply #29 on:** September 09, 2020, 02:34:07 pm »

This is all very good stuff, it's got me thinking seriously about trying out a 27" monitor.

Apart from the blur reductions, I like to think about lag. Would it be right to estimate that playing Defender on a XG270 will be more responsive than a CRT near the bottom of the screen (where I spend most of my time) ?

I am doing this basic calculation at 90% scanout:
CRT = 0ms lag + 15ms scanout = 15ms
180hzBFI =2ms lag + 5ms scanout + 1ms gtg = 8ms

I might be double accounting the 1ms gtg and the lag figure I read somewhere, and I might need to add some vsyncoffset time or is that not applicable to this type of monitor?
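The estimate above can be written out as a small Python calculation. This only reproduces the arithmetic as posted (display lag + scanout time to a given screen position + GtG); it does not model buffering or frame delay, and the function name and parameters are illustrative:

```python
def bottom_of_screen_latency_ms(refresh_hz, position=0.9, display_lag_ms=0.0, gtg_ms=0.0):
    """Time until a pixel `position` of the way down the screen is shown."""
    scanout_ms = 1000 / refresh_hz  # one full top-to-bottom scanout
    return display_lag_ms + position * scanout_ms + gtg_ms


crt = bottom_of_screen_latency_ms(60)
lcd = bottom_of_screen_latency_ms(180, display_lag_ms=2, gtg_ms=1)
print(f"CRT: {crt:.0f} ms, 180Hz BFI: {lcd:.0f} ms")  # CRT: 15 ms, 180Hz BFI: 8 ms
```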

Calamity · « **Reply #30 on:** September 09, 2020, 04:38:19 pm »

Quote from: vicosku on September 07, 2020, 02:14:01 pm

Sorry, that log shows 240hz as out of range. Here's the one that shows 180 as out of range. I think the difference between the two was turning off VRR on the monitor. The end result was the same though.

@vicosku, check this.

Calamity · « **Reply #31 on:** September 09, 2020, 04:42:23 pm »

@jimmer,

You're assuming BFI + frame delay which is not possible yet (don't think it'll ever be). So the gains you'd get from 15 vs 8 ms scanout would be less than what you lose from normal double buffering (no-frame delay).

jimmer · « **Reply #32 on:** September 10, 2020, 07:51:16 am »

Quote from: Calamity on September 09, 2020, 04:42:23 pm

@jimmer,

You're assuming BFI + frame delay which is not possible yet (don't think it'll ever be). So the gains you'd get from 15 vs 8 ms scanout would be less than what you lose from normal double buffering (no-frame delay).

Oh

(But at least it saves me £400 for the time being)

I'm sad that you don't see it as possible. I didn't expect a connection between frame_delay and BFI, I suppose it's in the details of how MAME/Groovymame works.

Is there a reason why a different emulator couldn't give the result I want?

In my simple model the emulator would work to the beat of the games original frame rate. It would read the inputs (0ms), do the gameplay and screen draw (1ms), squirt the picture/scan out (eg 6ms) followed by 2 black frames (2x6ms). Repeat at eg 59Hz.

PS. Keep up the good work, I'm an appreciative user (even though I haven't used anything new since the 0195 frameslice version)

vicosku · « **Reply #33 on:** September 10, 2020, 04:35:16 pm »

Quote from: Calamity on September 09, 2020, 04:38:19 pm

Quote from: vicosku on September 07, 2020, 02:14:01 pm
Sorry, that log shows 240hz as out of range. Here's the one that shows 180 as out of range. I think the difference between the two was turning off VRR on the monitor. The end result was the same though.

@vicosku, check this.

Thanks! Sorry to make you repeat yourself. Unfortunately, this monitor just seems quirky. I've set my lcd_range to 59-180. Now when I run with -bfi 2, it seems that Groovymame/Switchres is correctly choosing 180hz. However, my monitor still reverts to 240 and the game runs at 133% speed. I'm still in the return period for this monitor and am considering weird behavior like this in my decision to keep it or not. It also has a basically non-functional strobing mode, which is why I was hopeful for this feature.

Here's a new log, for reference. It was done with adaptive sync disabled, 240hz in the OSD, 180hz in Windows.

Calamity · « **Reply #34 on:** September 11, 2020, 03:42:13 am »

Hi vicosku,

I can't understand how you have one refresh rate on Windows and a different refresh rate on your monitor's OSD. Maybe my knowledge is outdated with regards to newer monitors. Only one of those refresh rates is real, and it looks like it's 240 Hz. Is there any way you can set both to 180 Hz?

vicosku · « **Reply #35 on:** September 11, 2020, 10:07:24 am »

Yeah, it's weird. I've attached a photo of the OSD's refresh rate choices. When I change these, Windows automatically changes to match. Regardless of what is chosen in the OSD, I can set a custom refresh rate of 180 in Windows and it works. I can run borderless windowed games like this. However, if I run any game/application in exclusive fullscreen mode, it will go back to whatever refresh rate is set in the monitor's OSD. I assume using something like CRU to edit EDID information under one of the refresh rate settings would be necessary to overcome this.

mdrejhon · « **Reply #36 on:** September 24, 2020, 10:12:04 pm »

Some News, RetroArch Now Also Supports 180Hz and 240Hz BFI

I've also been encouraging other emulators to implement 180Hz+ and 240Hz+ BFI on a wider scale.

Github Tracking Item about BFIv2:
https://github.com/libretro/RetroArch/issues/10754

Accepted Github Pull Request With Comments:
https://github.com/libretro/RetroArch/pull/11342

It doesn't support the comma-separated BFI feature or the BFI profiles yet; however, it is now in the github master / latest development builds.

Quote from: Calamity on September 09, 2020, 04:42:23 pm

You're assuming BFI + frame delay which is not possible yet (don't think it'll ever be).

It should be, though only small amounts of framedelay are useful -- something less than one destination refreshtime's frame delay for the first frame of the BFI cycle.

The higher the Hz, the less useful framedelay becomes. You could still do 2ms framedelay if your emulator module is capable of rendering a frame in 1ms, plus a 1ms safety margin, saving you about 2ms of input lag. That's much less savings than for 60Hz. As soon as you hit 1000Hz, your random halftime to next VSYNC is only 0.5 milliseconds! So might as well ignore framedelay for simplicity, because using 240Hz + waitable swapchains already guarantees you're no more than 1/240sec lag.
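The frame delay budget described above is the refresh period minus render time and safety margin. A Python sketch of that arithmetic, with illustrative function names and the 1ms render / 1ms margin figures taken from the example in the text:

```python
def max_frame_delay_ms(display_hz, render_ms, safety_ms):
    """Longest wait before emulating a frame that still meets the next refresh."""
    refresh_period_ms = 1000 / display_hz
    return max(0.0, refresh_period_ms - render_ms - safety_ms)


def average_wait_to_vsync_ms(display_hz):
    """Expected wait for the next refresh when a frame finishes at a random time."""
    return 0.5 * 1000 / display_hz


print(f"{max_frame_delay_ms(240, render_ms=1, safety_ms=1):.1f} ms")  # 2.2 ms
print(f"{max_frame_delay_ms(60, render_ms=1, safety_ms=1):.1f} ms")   # 14.7 ms
print(f"{average_wait_to_vsync_ms(1000):.1f} ms")                     # 0.5 ms
```

The 60Hz line is included only to show how much larger the available saving is at low refresh rates.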

Moving Present() / glxxSwapBuffers() into a wrapper that hands it off to a separate thread responsible for precision present timing

However, if Calamity wishes to architect the separate-thread technique (to allow software based VSYNC implementations), and pretend the thread (doing its own BFI independently) is just a hardware based VSYNC, then the existing framedelay would work with it with no modifications.

Read about Separate Frame Presenter Thread which replaces your Present() with a present-wrapper that manages it in a separate thread. This is also useful for beam racing & electron gun emulators. This allows you to roll-your-own software based sync technologies and/or virtualized displays or emulating VSYNC ON with VSYNC OFF, and other weird sync technologies in software instead of graphics drivers & display.

Then you can just continue to use your existing frame delay logic. And it provides a good future path for beam racing improvements, and CRT electron gun emulators (Temporal HLSL, BFIv3) via brute Hz too.

Advantages of wrapper that hands it off to a separate thread for timing responsibilities
- Reduces BFI flickers
- Can emulate blocking behaviour of waitable swapchains / VSYNC ON in the wrapper (even while the presenter thread continues executing its own precision timing) for backwards compatibility, with all sync technologies (VSYNC OFF, G-SYNC, etc)
- Improves compatibility of BFI with G-SYNC / Windows DWM compositor
- Improves frameslice beam racing
- Makes it easier to implement CRT electron gun emulators
- Makes it easier to implement your own custom software-based sync technologies
- Makes it easier to keep certain modes compatible with other emulator features (e.g. input delay).

The wrapper and the thread would each have their own independent internal delays / internal busywaits / wait-on-threads, as applicable, where appropriate. The wrapper emulates the blocking behaviour of VSYNC ON, and the thread does its own precision timing for the actual hardware presents (whether it be for G-SYNC, or for BFI, or for triple buffering like Fast Sync, etc). And for beamraced workflows, you can have a separate raster for presenting frameslices, or even single scanlines, with its own return after busywait-to-raster -- e.g. PresentScanLine() while passing the whole buffer and just letting the wrapper decide the frameslice size and when to pass those to the thread. Infinite granularity flexibility!
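As a rough illustration of the wrapper deciding frameslice size and timing, here is a small sketch. The function name and the simplifications are mine: it assumes a linear scanout and ignores the vertical blanking interval, so treat the times as approximate raster arrival offsets from the start of the refresh cycle:

```python
def frameslice_schedule(visible_lines: int, slice_count: int, refresh_hz: float):
    """Split a frame into frameslices and estimate when the raster reaches each.

    Returns a list of (first_line, last_line, raster_arrival_s) tuples, where
    raster_arrival_s is the offset from the start of scanout at which the
    real raster reaches first_line. Hypothetical helper: linear scanout is
    assumed and the blanking interval is ignored.
    """
    period_s = 1.0 / refresh_hz
    schedule = []
    for k in range(slice_count):
        first = k * visible_lines // slice_count
        last = (k + 1) * visible_lines // slice_count - 1
        raster_arrival_s = period_s * first / visible_lines
        schedule.append((first, last, raster_arrival_s))
    return schedule


if __name__ == "__main__":
    for first, last, t in frameslice_schedule(1080, 4, 60):
        print(f"lines {first}-{last}: raster arrives at {t * 1000:.2f}ms")
```

A presenter thread would need to hand each slice to the hardware some margin before its arrival time; how much margin is a tuning decision this sketch leaves out.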

Wrapper essentially behaves as a software based VSYNC ON (for full-frame workflows), and/or rasterwaiter (for beamrace workflows). So calling the wrapper feels just like calling frame present directly on a VSYNC ON system, regardless of what special processing you're doing (BFI, high Hz, low Hz, beam racing, CRT electron gun emulator).
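Here is a minimal sketch of the idea for the full-frame BFI case. Everything in it is hypothetical (the class name, the `hardware_present` callback, the `BLACK` placeholder), and it uses a plain sleep where a real implementation would wait on VSYNC or busywait with a precise clock. The emulator calls `present()` once per emulated frame, and the thread does the real per-refresh presents on its own:

```python
import queue
import threading
import time

BLACK = "BLACK"  # stand-in for an all-black frame


class PresenterThread:
    """Hypothetical present-wrapper with its own presenter thread doing BFI."""

    def __init__(self, hardware_present, refreshes_per_frame, refresh_period_s=0.0):
        self._hardware_present = hardware_present  # the real Present() call
        self._refreshes_per_frame = refreshes_per_frame  # e.g. 3 for 180Hz / 60fps
        self._refresh_period_s = refresh_period_s
        # One-slot queue: present() blocks once a frame is already waiting,
        # which gives the caller a VSYNC ON-like blocking feel.
        self._frames = queue.Queue(maxsize=1)
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def present(self, frame):
        """Called by the emulator in place of the real frame present."""
        self._frames.put(frame)

    def close(self):
        """Flush pending frames, then stop the presenter thread."""
        self._frames.put(None)
        self._worker.join()

    def _run(self):
        while True:
            frame = self._frames.get()
            if frame is None:
                return
            # First refresh shows the frame, the remaining refreshes are black.
            for i in range(self._refreshes_per_frame):
                self._hardware_present(frame if i == 0 else BLACK)
                if self._refresh_period_s > 0:
                    time.sleep(self._refresh_period_s)  # placeholder for a VSYNC wait


if __name__ == "__main__":
    shown = []
    presenter = PresenterThread(shown.append, refreshes_per_frame=3)
    for emulated_frame in ("A", "B"):
        presenter.present(emulated_frame)
    presenter.close()
    print(shown)
```

The emulator's main loop does not need to know how many refresh cycles each emulated frame spans, which is why existing frame delay logic could stay unchanged under this kind of scheme.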

mdrejhon · « **Reply #37 on:** September 24, 2020, 10:24:05 pm »

Quote from: Calamity on September 11, 2020, 03:42:13 am

Hi vicosku,

I can't understand that you have one refresh on Windows and a different refresh on your monitor's OSD. Maybe my knowledge is outdated with regards to newer monitors. Only one of those refresh is real, and it looks like it's 240 Hz. Is there any way you can set both to 180 Hz?

Create a custom resolution of 1920x1079 that only has 180Hz.

Force GroovyMAME to use that resolution. Problem solved, it will use only that refresh rate instead of switching to the wrong refresh rate. Since that custom resolution only has one refresh rate (180Hz).

Calamity · « **Reply #38 on:** September 25, 2020, 06:35:05 am »

Hi Mark,

I need to re-add BFI, the feature got missed accidentally in GM releases after June. I'll read your suggestion calmly. Anyway, multithreaded rendering has always been problematic. I haven't seen a single implementation that doesn't crash under certain stress conditions. We already have a "blitting" thread in GM for the triplebuffer implementation. The roadmap we have goes in the direction of implementing a cross-platform software vsync "interrupt" library, using threading to keep track of vsync while keeping rendering in the main thread, similar to what we discussed in your site. Not sure how BFI and your other suggestions will fit in this scheme.

vicosku · « **Reply #39 on:** September 25, 2020, 04:56:52 pm »

Quote from: mdrejhon on September 24, 2020, 10:24:05 pm

Create a custom resolution of 1920x1079 that only has 180Hz.

Force GroovyMAME to use that resolution. Problem solved, it will use only that refresh rate instead of switching to the wrong refresh rate. Since that custom resolution only has one refresh rate (180Hz).

This works! The result looks fantastic. Thank you so much. My thanks to you as well, Calamity.

