Nope... It's pretty much impossible with current tech... I looked into this ages ago, and there are just problems, to be blunt.
Ok let's say you want to use a camera tracker on a CRT screen to detect the scanline beam... well that won't work. While a light sensor is incredibly fast and could register a hit in a fraction of a second, camera-based tracking requires a chip to analyze the picture it's looking at and detect points of interest. It's laggy detecting what it's supposed to detect (a big old IR LED), so it'd be impossibly slow reading the white as the electron beam sweeps across the screen.
Now the other way around... simulating the electron beam and reading it with an old-school lightgun won't work either. LCDs aren't bright enough and they don't give off the right frequencies of light. Even modified with a special sensor it'd be difficult because again... speed... Getting a piece of hardware/software that could accurately simulate that beam scan to work fast enough and timed reliably would be hard, maybe impossible.
You're mis-reading what I'm proposing. Nowhere am I talking about A: Using a camera to read a CRT screen nor B: using current tech screens and trying to get a CRT scanning gun to work with it. There is no CRT and no CRT gun involved... unless you hack one or the other apart but that's not what I'm talking about either.
What I am proposing is using a modern X/Y light gun with its established sensor/emitter pair with a bit of hardware to simulate the signal timing from the CRT-based light gun. In essence, to "fire" off the appropriate signal when the old CRT light gun would have detected a hit.
At best, the adapter would strictly be part of the assembly between the gun and the console (or hacked into one or the other). At worst, a pigtail on the video out would be required to capture the scanline timing. It goes without saying that every case would need a calibration routine (ATARI can't go without one anyways).
Now on real consoles, the lightgun interface is usually quite simple (with the exception of the PlayStation guns). You've got two raw switches that are usually wired directly into the console for speed... the light sensor switch and the trigger switch. All old guns, even the Zapper, use this... the software waits for the trigger switch to be pulled and then starts whiting the screen. If the light sensor goes off in this cycle, it uses the refresh rate and the time elapsed to calculate the position.
Yes, I believe I mentioned this? Oh, here it is.
This is what I mean. The XEGS LG (XG-1) worked by blanking the screen when the trigger is pulled. Then the scan is started in white (or some bright color) and goes down line by line until the LG sensor picks up the "bright" light. X and Y are latched in the ANTIC light pen registers (PENH/PENV), which are then read by the software.
I assume the SEGA Master System also works on a similar principle, since it is common to hack the SMS gun for use in lieu of the XG-1. I would bank on all the other old school CRT light gun games for home consoles working on very similar principles. I think the NES (or was it the SNES?) just blanked the whole screen and "lit" up the targets.
I didn't bother going into detail about the specific hardware because, hey, I figured that it's pretty simple stuff.
It's as simple as that, but that refresh rate is timing critical, as explained above, which is what makes it practically impossible.
See above.
I don't think it's impossible, it just seems that way. Programmers and fans have been racing the beam for 37 years using nothing more than a 1.19MHz 8-bit CPU coupled to TIA. Racing the beam would become nearly trivial if moving to a 32-bit CPU/MCU running as low as 16MHz, where there is far more than 50% CPU time available. Swap in an FPGA or an MCU/FPGA hybrid or a faster clock and what do we get? Something that might be workable. We don't need a 120+MHz CPU to stay ahead of a 63.5us scan line when all we're really interested in is vsync (in the case of the ATARI/SEGA).
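To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It only multiplies the clock speeds mentioned above by the 63.5us line time; the function name is made up for illustration and real cycle budgets depend on the instruction mix of whatever chip is used.

```python
# Rough per-scanline cycle budget. The 63.5 us line time and the clock
# speeds come from the post; everything else is just arithmetic.
LINE_TIME_US = 63.5


def cycles_per_line(clock_mhz: float) -> float:
    """Clock cycles that elapse during one scanline at the given clock (MHz)."""
    return clock_mhz * LINE_TIME_US


for label, mhz in [("1.19 MHz 8-bit CPU", 1.19),
                   ("16 MHz 32-bit MCU", 16.0),
                   ("120 MHz MCU", 120.0)]:
    print(f"{label}: about {cycles_per_line(mhz):.0f} cycles per scanline")
```

So the old 8-bit part gets roughly 76 cycles per line, while even a 16MHz MCU gets a bit over a thousand.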
To put it simply.
If we tap into Video out, we can get the scan timing. For the light pen, there isn't a need to know what is on the screen. Just when vsync occurs. All the rest is just timing it out within the controller. The NTSC signal timing is very well documented. Pick some reasonable multiple of the NTSC signal for your clock. The NES method is a little trickier with more required to interpret the video signal. However, we're not looking at the actual video the entire time, just the "firing" frames.
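A minimal sketch of the "time it out from vsync" idea, in Python for readability (a real adapter would do this in MCU firmware with a hardware timer). The NTSC figures are approximate nominal values, and `FIRST_VISIBLE_LINE` and the function name are assumptions for illustration; the real offsets would come out of the calibration routine.

```python
# Approximate NTSC line timing (nominal values, to be trimmed by calibration).
LINE_US = 63.5556          # one scanline, 1 / 15734.26 Hz
ACTIVE_START_US = 9.4      # approx. sync pulse + back porch before picture starts
ACTIVE_US = 52.6           # approx. visible part of the line

# ASSUMPTION: number of lines between vsync and the first visible line.
# This is console dependent and would be found during calibration.
FIRST_VISIBLE_LINE = 21


def beam_delay_us(x_frac: float, visible_line: int) -> float:
    """Microseconds after vsync at which the beam would reach the aimed spot.

    x_frac       -- horizontal position, 0.0 (left edge) to 1.0 (right edge)
    visible_line -- scanline index counted from the top of the picture
    """
    if not 0.0 <= x_frac <= 1.0:
        raise ValueError("x_frac must be between 0.0 and 1.0")
    if visible_line < 0:
        raise ValueError("visible_line must not be negative")
    line = FIRST_VISIBLE_LINE + visible_line
    return line * LINE_US + ACTIVE_START_US + x_frac * ACTIVE_US


# Example: aiming at the middle of the 100th visible line.
print(f"fire the 'light seen' signal {beam_delay_us(0.5, 100):.1f} us after vsync")
```

The adapter would start a timer on vsync and pulse the light sensor line once this delay has elapsed.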
Think about it, I'm not proposing to create another TVP5160 or TVP5020 (though a dedicated decoder might make things a little easier). Baby steps, right? No one ever made an accessory that worked with every console known to man in one go.

Not yet anyways.
The gun trigger is stupid simple and we will not discuss it here. If you can't handle a switch, just hush up.
The X/Y position of the receiver/emitter pair is where I think the real work will occur and what will separate the baby MCU from the real work horses. Receiving accurate X/Y positioning and doing the math required to accommodate the different variables present. Position of the emitters and receiver. Actual geometry of the monitor. Timing of the trigger pull and applying the necessary offsets when firing off the "hey! I sensed a light beam!" signal.
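As a rough idea of the math involved, here is a Python sketch of the simplest possible calibration: two shots at opposite screen corners and a straight linear mapping per axis. The `Calibration` class, its field names and the raw sensor numbers are all hypothetical; a real gun would also need to handle tilt, distance and monitor geometry, which this ignores.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Calibration:
    """Raw gun readings captured while aiming at two screen corners.

    All names and units here are hypothetical; raw values are whatever
    the modern gun's sensor reports.
    """
    raw_left: float
    raw_top: float
    raw_right: float
    raw_bottom: float

    def to_screen(self, raw_x: float, raw_y: float) -> Optional[Tuple[float, float]]:
        """Map a raw reading to (x, y) fractions of the screen, 0.0 to 1.0.

        Returns None when the gun is aimed off screen, in which case the
        adapter should simply not fire the "light seen" signal.
        """
        x = (raw_x - self.raw_left) / (self.raw_right - self.raw_left)
        y = (raw_y - self.raw_top) / (self.raw_bottom - self.raw_top)
        if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0:
            return (x, y)
        return None


# Example with made-up raw readings from the two calibration shots.
cal = Calibration(raw_left=100, raw_top=80, raw_right=900, raw_bottom=680)
print(cal.to_screen(500, 380))   # aimed at the centre of the screen
print(cal.to_screen(0, 0))       # aimed off screen
```

The resulting screen fractions are what would feed the timing step that decides when to fire the "hey! I sensed a light beam!" signal.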
I'm not saying it is simple (except for that switch). What I'm saying is that I think it's feasible.