I've been testing a new mod. This is for plain D3D9. In drawd3d.c, just replace this function:
void renderer::end_frame()
{
    window().m_primlist->release_lock();

    // flush any pending polygons
    primitive_flush_pending();

    m_shaders->end_frame();

    // finish the scene
    HRESULT result = (*d3dintf->device.end_scene)(m_device);
    if (result != D3D_OK) osd_printf_verbose("Direct3D: Error %08X during device end_scene call\n", (int)result);

    // sync to VBLANK
    if (window().machine().options().frame_delay() != 0 && ((video_config.triplebuf && window().fullscreen()) || video_config.waitvsync || video_config.syncrefresh))
    {
        D3DRASTER_STATUS raster_status;
        memset (&raster_status, 0, sizeof(D3DRASTER_STATUS));
        osd_printf_verbose("\nwait vblank start\n");

        while (!raster_status.InVBlank)
        {
            if ((*d3dintf->device.get_raster_status)(m_device, &raster_status) != D3D_OK)
                break;

            osd_printf_verbose("current_line: %d\n", raster_status.ScanLine);

            if (raster_status.ScanLine >= m_height)
                break;
        }
    }

    // present the current buffers
    result = (*d3dintf->device.present)(m_device, NULL, NULL, NULL, NULL, 0);
    if (result != D3D_OK) osd_printf_verbose("Direct3D: Error %08X during device present call\n", (int)result);

    // sync to VBLANK
    if (window().machine().options().frame_delay() != 0 && ((video_config.triplebuf && window().fullscreen()) || video_config.waitvsync || video_config.syncrefresh))
    {
        D3DRASTER_STATUS raster_status;
        memset (&raster_status, 0, sizeof(D3DRASTER_STATUS));
        osd_printf_verbose("wait vblank end\n");

        while (!raster_status.InVBlank)
        {
            if ((*d3dintf->device.get_raster_status)(m_device, &raster_status) != D3D_OK)
                break;

            osd_printf_verbose("current_line: %d\n", raster_status.ScanLine);
        }
    }
}
The system I've been testing this on is an i7-4771 3.5GHz, Radeon R9 270, LCD monitor. I don't know how this will behave with older cards/drivers. Here I've been amazed that it can log several hits per scanline. This opens a new horizon for custom vertical synchronization. Beware of the size of the logs if you run this build for too long.
Ideally this build should be very stable even with frame_delay. I don't know how well or badly it will behave with asio (hopefully better than the current implementation).
The value m_height here:
if (raster_status.ScanLine >= m_height)
is the total height of the screen. However, we will usually use a somewhat lower value (subtract some lines) to account for the time it takes the card to render the frame in parallel, so that when it finishes rendering it actually hits vblank. This way I expect to remove static tearing for LCDs too, although a pretty decent video card will be required. The dynamic estimation of how many lines to subtract there is a challenge.
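To make the "subtract some lines" idea concrete, here is a rough sketch of how the break line could be computed. This is not what the build above does. The names present_trigger_line and render_margin_estimator are hypothetical, and the moving average is only one assumed way to approach the dynamic estimation.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper: the scanline at which to stop waiting and call present().
// total_height plays the role of m_height in the function above; render_lines is
// an assumed estimate of how many scanlines pass while the card renders the frame.
int present_trigger_line(int total_height, int render_lines)
{
    assert(total_height >= 0);
    // Clamp the margin so a bad estimate can never yield a negative line.
    int margin = std::min(std::max(render_lines, 0), total_height);
    return total_height - margin;
}

// Hypothetical estimator: smooths measured render times (expressed in scanlines)
// with an exponential moving average. How to measure them is left open here.
struct render_margin_estimator
{
    double average = 0.0;
    bool primed = false;

    void update(int measured_lines, double weight = 0.25)
    {
        if (!primed)
        {
            // The first sample seeds the average directly.
            average = measured_lines;
            primed = true;
        }
        else
        {
            average += weight * (measured_lines - average);
        }
    }

    // Current estimate, rounded to the nearest whole scanline.
    int lines() const { return static_cast<int>(average + 0.5); }
};
```

In the wait loop, the comparison would then become something like raster_status.ScanLine >= present_trigger_line(m_height, estimator.lines()), with the estimator updated once per frame.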
There's something critical I've found out: the frame_delay value is off by 1 unit from what it's supposed to be. So if you use -frame_delay 8, its effect is that of -frame_delay 9. If you use -frame_delay 9, it actually "wraps" to the next frame, so -fd 9 must not be used!
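The off-by-one can be written down as a tiny check. This only models the behaviour described above. Both function names are hypothetical and not part of MAME, and I'm assuming the valid range is 0..9 with 0 meaning frame_delay is disabled.

```cpp
#include <cassert>

// Hypothetical helper: the delay that actually takes effect for a requested
// -frame_delay value, per the off-by-one observed above.
// Assumption: valid values are 0..9 and 0 means "disabled" (no shift applied).
int effective_frame_delay(int requested)
{
    assert(requested >= 0 && requested <= 9);
    return requested == 0 ? 0 : requested + 1;
}

// A requested value is usable only if its effective value still falls
// inside the current frame, i.e. does not exceed 9 under the assumption above.
bool frame_delay_is_usable(int requested)
{
    return effective_frame_delay(requested) <= 9;
}
```

With this model, -fd 8 is the highest usable setting and -fd 9 is rejected because its effective value spills into the next frame.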