| Arcade hardware question |
| Hoopz:
This is NOT a knock on Mamedev or anything like that. I simply don't know much about how the hardware was set up in some newer arcade games, so I wanted to ask.

What was different in the hardware for games that can't run at full speed in MAME now (Gauntlet Legends, Blitz, etc.) compared to the hardware that was available in PCs at the time, or even PCs now? I don't know enough about how these games were made to understand why it will take a 10 GHz CPU to run some of them (MAME 0.120 changes notwithstanding). If someone could explain the difference between how the hard drives connected, what kind of CPUs were used, or _____ (whatever made the big difference), I would appreciate it.

Again, I am not bitching or complaining that something isn't running at 100% speed in MAME. I just don't know enough about it to understand the gap between what they did then compared to now. Thanks. |
| u_rebelscum:
Ever noticed slowdown playing an old 16-bit DOS game on a modern 32/64-bit CPU in WinXP? That's only emulating the software (DOS). Ever run Windows on an Intel Mac? That's virtualization (Windows has access to the real hardware), not emulation like MAME.

You are not playing ports of the games, nor is MAME virtualizing the hardware. MAME is emulating the original hardware that is running the arcade game. This has more steps than the PC running the PC game, so you can't compare MAME to a PC game + PC hardware. See the MameDev FAQ if you haven't read it yet.

arcade game --> arcade hardware --> mame --> PC hardware
PC game ------------------------------------> PC hardware

Notice how the PC game goes directly to the PC hardware. Think of each step as a 30 to 150 pound weight on your back.

Another way to look at it, as a very general rule: it takes 10 to 100 times the speed of the original hardware to emulate said hardware on a modern PC. The range is this wide because emulating an 8-bit CPU is generally easier (read: faster) than emulating a 32-bit CPU, and analog & DSPs are harder (read: slower) to emulate than fixed CPU. And since all the hardware is being emulated on the PC's CPU, the sound, video, I/O, and all other chips also count in addition to the arcade CPU.

If you want info on the hardware, look at MAME's source (the best place, but not easy to read ;) ), or www.system16.com (some boards are more documented than others).

Call GL a total of 250 "MHz" (200 + 16 + 16 + ), times an IMO low factor of 40x, and you get 10 GHz. |
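For anyone who wants to play with the rule of thumb above, here is a minimal sketch of the arithmetic in Python. The function name is made up for illustration, and the numbers are just the post's ballpark figures (250 "MHz" total, 10x to 100x overhead), not measurements of any real board.

```python
# Back-of-the-envelope estimate only. The inputs are the post's rough
# figures, and clock speed is a crude yardstick across architectures.

def required_host_mhz(total_emulated_mhz, overhead_factor):
    """Scale the emulated board's combined clock by an emulation overhead factor."""
    return total_emulated_mhz * overhead_factor

GL_TOTAL_MHZ = 250  # the post's rounded total for Gauntlet Legends

low = required_host_mhz(GL_TOTAL_MHZ, 10)     # optimistic end of the 10x-100x rule
quoted = required_host_mhz(GL_TOTAL_MHZ, 40)  # the "IMO low" factor from the post
high = required_host_mhz(GL_TOTAL_MHZ, 100)   # pessimistic end of the rule

for label, mhz in (("10x", low), ("40x", quoted), ("100x", high)):
    print(f"{label}: {mhz / 1000:.1f} GHz")
```

The 40x row reproduces the 10 GHz figure from the original question; the other two rows show how wide the uncertainty in that estimate really is.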
| MonMotha:
The big culprit in the case of Blitz/Gauntlet Legends is the video hardware. Most of the old games that you can emulate in real time use simple sprite/tile based video architectures. This is pretty straightforward to implement on a general purpose CPU, and modern PCs are so fast that they can keep up. Blitz/Gauntlet Legends (the hardware is very similar) use a 3dfx Voodoo (yes, the same thing you could have bought in ~1996 for your PC). 3d hardware is a different beast altogether, and there's a reason modern PCs now include 3d acceleration capabilities.

In a nutshell, for every clock, a 3d device can complete several more operations, within the narrow set of operations it is designed to perform, than your CPU can. Your CPU can perform about 2-4 operations per clock, but a graphics processor can perform on the order of 8-256, depending on the device. Emulating this on a CPU is entirely possible, but even if you completely discount any "translation" that has to be performed, it'll run 2-128 times slower than a GPU clocked at the same speed. GPU operations are also specifically tailored to graphics usage and are in most cases SIMD type operations (a single instruction operates on more than one set of completely independent data).

The graphics device in Blitz is clocked at (IIRC) 200MHz, and for some reason I'm thinking the Voodoo pipeline is 8 wide (i.e. it can complete 8 ops every clock), so you'd need a "1.6GHz" CPU (or so, using very rough examples here) just to complete operations as fast as the Voodoo on Blitz does. Plus you need to translate Voodoo instructions (which are very graphics specific) into CPU instructions (which are not), and this inevitably requires more than one CPU instruction for every GPU instruction.

THEN, you also need to emulate the CPU (again, often more than one host CPU instruction for every emulated instruction, though the MIPS arch used by Blitz and Gauntlet Legends is pretty straightforward), AND you need to emulate the sound DSP (which again can complete about 4-8 instructions per clock, usually), AND you need to handle mapping the controls (which are very easy to talk to on the arcade hardware) to PC controls (which aren't nearly as easy to talk to, in comparison). PLUS you have a behemoth of an OS running in the background, while the arcade hardware does not.

Now, all this is exacerbated by the fact that MAME is not multi-threaded! Modern CPUs are getting faster because they are more than one CPU in a package. MAME is emulating 2-4 pieces of discrete hardware (or more, in some cases) that all ran on their own clocks, completing instructions all in parallel, but MAME only uses one of your processors (if you have more than one), so your ONE processor has to do the job of ALL the hardware that was on the original board, and your ONE processor isn't even particularly good at most of it.

I guess I do feel it necessary to point out that comparing wildly different architectures using clockspeed as the chief measure of comparison is a horrible thing to do, but with appropriate context it can at least give you an idea of what's going on. Take all numbers with an ocean's worth of salt, but the general principle is what matters. |
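The "1.6GHz" figure above can be reproduced with a few lines of Python. This is only a sketch of the post's reasoning: the 200 MHz clock and 8-wide pipeline are the poster's from-memory numbers, the function name is invented here, and the translation cost of 3 host instructions per graphics op is a purely hypothetical value chosen to show the effect.

```python
# Rough model: how fast must a host CPU be to match a wide graphics pipeline,
# assuming (as the post implicitly does) the host retires one instruction per clock?

def host_mhz_to_match(device_mhz, ops_per_clock, host_instr_per_op=1.0):
    """Host clock in MHz needed to keep pace with the emulated device."""
    return device_mhz * ops_per_clock * host_instr_per_op

# The post's from-memory figures: 200 MHz device, 8 operations per clock.
baseline = host_mhz_to_match(200, 8)

# Hypothetical translation overhead: 3 host instructions per graphics operation.
with_translation = host_mhz_to_match(200, 8, host_instr_per_op=3.0)

print(f"no translation cost:   {baseline / 1000:.1f} GHz")
print(f"with translation cost: {with_translation / 1000:.1f} GHz")
```

Even a modest per-operation translation cost multiplies the requirement, and that is before the main CPU, the sound DSP, and input handling are added to the same host core.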
| Hoopz:
Thanks guys for the answers.

Robin, I was hoping you would answer, as I have always found you to be very helpful and probably the best person on the boards at explaining different concepts re: the MAME code, inputs, etc. While I appreciate the offer to look at the source code, my last programming class was in 1986 and I'm not really able to understand any programming now. I had read the MameDev FAQ, but that was a while ago. I looked through it again tonight. I would say ultimately that my lack of understanding is because I don't have any recent programming experience. While I can understand the analogies, I don't have the frame of reference to understand more than a basic concept. Now you see why my last programming class was in the 80's. ;D You people that can do all that stuff and make it look so easy really have talent. :cheers: Thanks Robin, I appreciate your time with the answer. It does help.

MonMotha, thanks for the help too. I actually understand more about modern CPUs, video cards, etc. The Voodoo information sounds familiar, and I remember choosing the Riva TNT over a Voodoo card in 1997 or so. I probably shouldn't have used Gauntlet Legends as an example because I did know that one was more advanced graphically and used a "real" video card. Since MAME has the CPU handle all those processes instead of having the GPU do it, I understand why that aspect is slower. The example of playing the 16-bit game in XP helps too.

Thanks guys. |
| Avrus:
--- Quote from: MonMotha on October 16, 2007, 07:37:36 pm ---
Now, all this is exacerbated by the fact that MAME is not multi-threaded! Modern CPUs are getting faster because they are more than one CPU in a package. MAME is emulating 2-4 pieces of discrete hardware (or more, in some cases) that all ran on their own clocks, completing instructions all in parallel, but MAME only uses one of your processors (if you have more than one), so your ONE processor has to do the job of ALL the hardware that was on the original board, and your ONE processor isn't even particularly good at most of it.
--- End quote ---

As of 0.119u3 I believe they're working on that.

--- Quote ---
Changed implementation of OSD work queues that are created with the WORK_QUEUE_FLAG_MULTI hint. Such queues now create n-1 threads, where n is the number of logical processors in the system. This allows the main thread to continue accomplishing things while other threads process the work. If the main thread subsequently calls osd_work_queue_wait(), it will then dynamically "jump in" and help the other threads complete all the work items. [Aaron Giles]
--- End quote ---

--- Quote ---
Added support for controlling multithreading behavior through an environment variable OSDPROCESSORS. To override the default behavior, set OSDPROCESSORS equal to the number of logical processors you wish the OSD layer to pretend you have. [Aaron Giles]

Changed the 3dfx Voodoo emulation code to take advantage of the new threading mechanisms above. It now creates a work queue with the WORK_QUEUE_FLAG_MULTI flag set, and uses shared work items to spread rasterization work across multiple processors. Note that this support should be considered experimental; under some circumstances it is known to deadlock. If you encounter problems, set OSDPROCESSORS to 1 to effectively produce the previous behavior. [Aaron Giles]
--- End quote --- |
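To make the changelog notes a bit more concrete, here is a minimal Python sketch of the pattern they describe: a queue that starts n-1 worker threads and lets the calling thread "jump in" when it waits. This is not MAME's code (MAME's OSD layer is written in C). The class and function names are hypothetical, and reading OSDPROCESSORS this way is only an assumption about how such an override could behave.

```python
import os
import queue
import threading


def logical_processors():
    """Honor an OSDPROCESSORS-style override (assumed behavior), else ask the OS."""
    override = os.environ.get("OSDPROCESSORS", "")
    if override.isdigit() and int(override) > 0:
        return int(override)
    return os.cpu_count() or 1


class MultiWorkQueue:
    """Hypothetical work queue: n-1 worker threads plus a helping main thread."""

    def __init__(self, processors=None):
        n = processors or logical_processors()
        self._items = queue.Queue()
        # n-1 workers, so the main thread keeps one processor for itself.
        for _ in range(n - 1):
            threading.Thread(target=self._worker, daemon=True).start()

    def _worker(self):
        while True:
            func, args = self._items.get()  # blocks until work arrives
            try:
                func(*args)
            finally:
                self._items.task_done()

    def submit(self, func, *args):
        self._items.put((func, args))

    def wait(self):
        """Main thread helps drain the queue, then waits for in-flight items."""
        while True:
            try:
                func, args = self._items.get_nowait()
            except queue.Empty:
                break
            try:
                func(*args)
            finally:
                self._items.task_done()
        self._items.join()


# Usage: spread per-scanline "rasterization" (a stand-in computation) across threads.
results = {}
results_lock = threading.Lock()


def render_scanline(y):
    value = sum(x * y for x in range(64))  # placeholder for real pixel work
    with results_lock:
        results[y] = value


work = MultiWorkQueue(processors=4)
for y in range(32):
    work.submit(render_scanline, y)
work.wait()
print(f"rendered {len(results)} scanlines")
```

With `processors=1` no worker threads are created and `wait()` runs every item on the calling thread, which mirrors the "set OSDPROCESSORS to 1" fallback mentioned in the quote. Note that CPython's global interpreter lock means this sketch shows the structure of the approach rather than a real speedup for CPU-bound work.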