Author Topic: Multithreading vs No-Multithreading performance  (Read 15299 times)

Dr.Venom

Multithreading vs No-Multithreading performance
« on: January 11, 2014, 11:10:36 am »
Hi,

I noticed some input delay when using a frame_delay of 7 on some drivers that used to run perfectly, so I investigated a little further and found that -nomt seems to be the cause.

My test case is gradius2. When measuring unthrottled speed I get:

1. gradius2 without multithreading, i.e. "mame gradius2 -nothrottle -nomt -waitvsync -v": 250%
2. gradius2 with multithreading, i.e. "mame gradius2 -nothrottle -mt -waitvsync -v": 1500%

NB. frame_delay 7 is set in mame.ini

When I run cases 1 and 2 throttled ("-throttle"), the normal way of playing games, I experience input delay in case 1 but not in case 2. That would make sense, as a frame_delay of 7 is far too big when unthrottled speed is only 250%.
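
The reasoning about frame_delay and unthrottled speed can be sketched roughly as below. This is only an illustration: it assumes frame_delay N postpones emulation by N/10 of the frame period (that reading of the option is an assumption, not something stated in this thread), and that a frame takes 100/speed of the frame period to emulate. The function name is made up for the example.

```python
import math

def max_safe_frame_delay(unthrottled_speed_percent, steps=10):
    """Largest frame_delay (0..steps-1) that still leaves enough of the
    frame period to emulate one frame.

    Assumption: frame_delay N postpones emulation by N/steps of the
    frame period, and a frame takes 100/speed of the period to emulate.
    """
    emulation_fraction = 100.0 / unthrottled_speed_percent
    slack = 1.0 - emulation_fraction
    if slack <= 0:
        return 0  # emulation already needs the whole frame period
    # Small epsilon guards against float rounding at exact tenths.
    return min(steps - 1, math.floor(slack * steps + 1e-9))

for speed in (250, 1500):
    print(f"{speed}% unthrottled -> frame_delay up to {max_safe_frame_delay(speed)}")
```

Under these assumptions, 250% leaves room for a frame_delay of at most 6, while 1500% leaves room for 9, which is consistent with 7 being too high in the first case only.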

I didn't realise the multithreading option could make such a big difference. Since GM made "-nomt" the default in the latest release, I'm wondering whether that was the right choice, especially as it seems to have such a large impact on the effectiveness of frame_delay.
« Last Edit: January 11, 2014, 11:12:37 am by Dr.Venom »

Calamity

Re: Multithreading vs No-Multithreading performance
« Reply #1 on: January 11, 2014, 11:27:53 am »
Hi Dr.Venom,

Quote
1. gradius2 without multithreading, i.e. "mame gradius2 -nothrottle -nomt -waitvsync -v": 250%
2. gradius2 with multithreading, i.e. "mame gradius2 -nothrottle -mt -waitvsync -v": 1500%

This test is misleading. To get the real performance impact of -mt you need to disable -waitvsync in the command line. This is because when running in a single thread, -waitvsync can't be bypassed if enabled.

Quote
When I run both case 1 and 2 throttled ("-throttle"), the normal way of playing games, then I'm experiencing input delay for case 1 and not for case 2. This would make sense as a frame_delay of 7 is way too big when unthrottled speed is only 250%.

This may be a confirmation that the 3-threads model in GroovyMAME does improve input responsiveness too.
Important note: posts reporting GM issues without a log will be IGNORED.
Steps to create a log:
 - From command line, run: groovymame.exe -v romname >romname.txt
 - Attach resulting romname.txt file to your post, instead of pasting it.

CRT Emudriver, VMMaker & Arcade OSD downloads, documentation and discussion:  Eiusdemmodi

Dr.Venom

Re: Multithreading vs No-Multithreading performance
« Reply #2 on: January 11, 2014, 12:16:07 pm »
Hi Calamity,

Quote
This test is misleading. To get the real performance impact of -mt you need to disable -waitvsync in the command line. This is because when running in a single thread, -waitvsync can't be bypassed if enabled.

If I use "mame gradius2 -nothrottle -nomt -nowaitvsync -v" I still get ~250%, instead of the 1500% with mt. Am I missing something?

Quote
This may be a confirmation that the 3-threads model in GroovyMAME does improve input responsiveness too.

That could be, of course, but before attributing it fully to that I'd rather make sure there isn't something peculiar going on in the case above.

Calamity

Re: Multithreading vs No-Multithreading performance
« Reply #3 on: January 11, 2014, 12:28:27 pm »
Quote
If I use "mame gradius2 -nothrottle -nomt -nowaitvsync -v" I still get ~250%, instead of the 1500% with mt. Am I missing something?

Yeah, I just tested it here. You need to disable both -waitvsync and -syncrefresh.

You'll still get a difference of about 10%, but nothing like as large.

Dr.Venom

Re: Multithreading vs No-Multithreading performance
« Reply #4 on: January 11, 2014, 01:49:01 pm »
OK, that helps. I now get (Ivy Bridge 3770K at 4.6 GHz):

-nomt: 1259%
-mt: 1437%

This is measured by starting gradius2 and letting it run for 30 seconds through the attract sequence. The difference between 1437% and 1259% is still worthwhile, I would say, but it probably doesn't explain the input lag I experienced in this case. That -does- seem to point to the 3-threads model improving responsiveness. Or maybe the extra 180% helps just enough for some frames to be rendered in time (i.e. not missed) at frame_delay 7? Or possibly a combination of both, who knows... In any case, to be on the safe side, I'll be sticking with multithreading as the default.

Thanks for the explanation / help :)

Calamity

Re: Multithreading vs No-Multithreading performance
« Reply #5 on: January 12, 2014, 07:46:49 am »
The marginal difference in performance between running with and without multithreading is due to the game thread being free to go on with the emulation of the next frame while the thread that deals with the video update is still busy drawing the last one. Of course, when running with v-sync enabled, which is the normal case with GroovyMAME, this performance difference is sacrificed, as we force the emulation thread and the video thread to run in sync, so one waits for the other to finish. So, regarding performance, it should be the same as running in a single thread.

However, unlike baseline MAME, GroovyMAME uses separate threads for video rendering and window message processing (input events). This means the window is always "open" to process input events, because the wait for v-sync is performed by the video thread. How this can actually improve input responsiveness has always been a matter of faith, but if you say you can notice it, I believe you. Anyway, the tests I did with high-speed video were all done with multithreading enabled.
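
As a toy illustration of the overlap described above, the unthrottled figures from Reply #4 can be turned into per-frame times. This is a sketch under loud assumptions: a nominal 60 Hz game refresh (the real value for gradius2 may differ), a single-threaded cost of emulation plus blit, and a multithreaded cost of max(emulation, blit) with perfect overlap. None of this is taken from MAME's source.

```python
NOMINAL_HZ = 60.0  # assumed; the actual game refresh may differ

def ms_per_frame(speed_percent, hz=NOMINAL_HZ):
    """Wall-clock milliseconds spent per emulated frame at a given
    unthrottled speed (100% == real time)."""
    return 1000.0 / hz / (speed_percent / 100.0)

single = ms_per_frame(1259)  # -nomt figure from Reply #4
multi = ms_per_frame(1437)   # -mt figure from Reply #4

# Toy model: single = emulation + blit, multi = max(emulation, blit).
# If emulation dominates, multi is close to the emulation time, so the
# difference approximates the blit cost hidden by the second thread.
hidden_blit = single - multi
gain_percent = (1437 / 1259 - 1) * 100

print(f"single-threaded: {single:.3f} ms/frame")
print(f"multithreaded:   {multi:.3f} ms/frame")
print(f"overlap gain:    {hidden_blit:.3f} ms/frame ({gain_percent:.1f}% faster)")
```

In this model the second thread hides roughly 0.16 ms of work per frame, and that saving disappears once both threads are forced to wait on v-sync.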

Haze

Re: Multithreading vs No-Multithreading performance
« Reply #6 on: January 12, 2014, 08:46:00 pm »
-mt (at least in regular MAME) simply isn't safe.

it relates to the final blit only (so drivers actually using more cores are unaffected, they still use threads for offloading complex tasks)

you'll get ugly bugs in many drivers because it completely decouples the emulation and the point at which the screen is actually updated, causing bad palette glitches, VERY noticeable on fades (eg. Rapid Hero level fade-in)

The drivers it affects will depend on your system, but in every case the effect on the emulation quality is a negative one, that's why it was disabled by default, it should probably be removed altogether.

I'd say it will negatively affect at least 50% of MAME when turned on; whether you notice individual cases is down to you, however.

even affects things like mk2
http://mametesters.org/view.php?id=5397

it's a broken feature, do not use it.
« Last Edit: January 12, 2014, 08:49:59 pm by Haze »

sean_sk

Re: Multithreading vs No-Multithreading performance
« Reply #7 on: January 13, 2014, 08:21:25 am »
I have yet to experience the issues you've mentioned with multithreading in MAME or, in my case, GroovyMAME specifically. I run GM with multithreading on, and I've checked out both Rapid Hero and MK2; both run flawlessly. Has a video been posted that shows the glitch in Rapid Hero? I tried a quick search but couldn't find anything.

UPDATE: I ran it on vanilla MAME and saw the glitches you mentioned, but I don't get them on GroovyMAME. Would that be because of the way GM handles multithreading? I don't have waitvsync enabled either.
« Last Edit: January 13, 2014, 09:21:24 am by sean_skroht »

Haze

Re: Multithreading vs No-Multithreading performance
« Reply #8 on: January 13, 2014, 05:42:08 pm »
maybe, or other changes just cause it to manifest in a different way; the glitches are going to be system specific (because it ends up depending on how quickly your processor performs other tasks, and therefore at what point the image actually gets rendered), and any other changes to the general flow of things will impact which games end up affected by the issue.

could be groovymame has legitimate fixes that need submitting tho.  I'm only clarifying why it got disabled by default in MAME.

Calamity

Re: Multithreading vs No-Multithreading performance
« Reply #9 on: January 13, 2014, 07:25:24 pm »
Hi Haze,

Yes, I agree the -mt feature in baseline MAME is flawed. Chris Kennedy (bitbytebit) and I noticed this in 2011, and patches for both SDL and Windows were submitted, but only the SDL patch was accepted. It's no surprise the Windows patch was not accepted, as it somewhat defeated the whole multithreading idea (if decoupling the window and core threads was ever a good idea in the first place): http://forum.arcadecontrols.com/index.php/topic,106405.msg1154882.html#msg1154882

There's an additional problem with the -mt feature of baseline MAME. In certain circumstances it can lead to extremely bizarre input lag. I don't mean the kind of lag that some people say they "feel", but the kind that persists for seconds after you leave the joystick/mouse alone. There's an explanation for this: having the window thread busy with rendering means that it can't process input messages until the video routines return. If you're v-syncing, this can represent a lot of time. But if you happen to be v-syncing to a refresh rate that is lower than the native game refresh, and you have -mt enabled, then you're likely to have some frames that are virtually deaf to input. This is the case when running many vertical games like Arkanoid on horizontal arcade monitors: you usually can't get 256 lines at 60 Hz, so the monitor will refresh at 55-57 Hz while the game tries to keep running at 60 Hz. Combine this with a spinner and its load of input events that need to be processed and you'll see what I mean.
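
Some rough arithmetic for the refresh mismatch described above. This is a simplified sketch, not MAME's actual scheduling: it assumes one v-synced blit per game frame and no frame skipping, and both helper names are invented for the example.

```python
def vsync_backlog(game_hz, monitor_hz, seconds=1.0):
    """Frames the game wants to present beyond what a v-synced display
    can accept in `seconds`. Assumes one blit per refresh, no frameskip."""
    return max(0.0, (game_hz - monitor_hz) * seconds)

def blocked_ms_per_wait(monitor_hz):
    """Worst-case time a thread spends waiting for the next vertical
    blank; in the scenario above, input messages go unprocessed
    while the window thread sits in that wait."""
    return 1000.0 / monitor_hz

for hz in (55, 57, 60):
    print(f"{hz} Hz monitor: {vsync_backlog(60, hz):.0f} surplus frames/s, "
          f"up to {blocked_ms_per_wait(hz):.1f} ms blocked per v-sync wait")
```

With a 60 Hz game on a 55-57 Hz monitor, this model gives 3 to 5 game frames every second that cannot be shown on their own refresh, which gives a feel for how a busy input queue (a spinner, say) could fall behind.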

GroovyMAME benefits from the multithreading "infrastructure" in MAME (so for us it's good if it's kept) but implements things in a slightly different way that fixes the existing problems (at least as far as I understand) and allows some extra features, like truly asynchronous tear-free rendering, in other words genuine triple buffering, not the fake implementation DirectX provides, and (apparently) improved input responsiveness, by having input messages processed alone in their own thread. Unfortunately, I have to say this implementation is not 100% safe either, due in part to Direct3D not being a thread-safe API. This makes it a nightmare to keep all threads synchronized and avoid the program crashing or entering a deadlock upon a simple ALT + TAB. This is the main obstacle I've found to even considering submitting these changes.