There is no benefit to splitting things up. It doesn't inherently make anything faster. Things get slower because they improve, be it the CPU cores or the drivers themselves, which is exactly the same reason you can't just expect a plug-in system to work: many of the fixes are NOT in the drivers, but in the cores.
You're contradicting what Aaron says in his blog. According to him, the slowdown is not caused by improved accuracy alone. He talks about an "optimisation point", which implies that the performance of older games is slowly being sacrificed so that code can be optimised (and simplified) for newer games. That implies to me that splitting up MAME would be beneficial for many games. Of course there's going to be some overlap between different eras, but that sounds to me more like an excuse than a reason. You're not going to convince me that Pac-Man's code has much in common with Dead or Alive 2's code, even though both games are supported by MAME.
You also choose to make what I believe is a false distinction between a "driver" and a "core". I don't get that. Why can't you offer different plug-in cores as well? Wasn't the whole point of rewriting the code in C++ to make it more modular?
Be honest now. The real issue is that you, and presumably most of the other current developers, just don't place any real value on execution speed. You're obviously free to develop the project in any way you see fit (just as others are free to fork it), but I personally think that ignoring performance is a big mistake. However fast processors get, there'll always be a range of different hardware out there, and many people like to re-use old computers for emulation purposes.
The optimization point applies to both the drivers and the core; the optimization point of the core moves to make driver development easier.
Old drivers had a lot of legacy cruft: dirty rectangle marking, manual timer handling, palette marking to squeeze things into 8-bit colours, and code to do rotation manually.
Over time, work like that got removed, made unnecessary, or moved into the MAME core to make driver development easier for more complex systems, but at the same time that moved the 'optimization point'.
When it comes to programming MAME, that makes things a lot easier: you can just get on with things and focus on EMULATING, not on coming up with a million and one ways to compensate for missing core functionality and limitations.
It also means you can't just mix and match old core versions with new driver versions; it wouldn't work. You can't just take the CPS3 driver and shove it into old versions, because it doesn't have a bunch of hacks to do palette marking etc. Not to mention that the SH2 core didn't exist in 0.3x, and you can't just backport the SH2 core to 0.3x because the recompilation engine didn't exist. If you backport the interpreter instead, it will be much slower anyway, and it still won't work, because the 0.3x framework has no timer system (everything had to do that manually back then), while the core relies on the timers. The same is true of everything else. Legacy stuff tends to depend on legacy framework principles and modern stuff on modern principles, but it's not black and white.
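To make the timer point concrete, here is a rough sketch of the two styles. Every name in it (Scheduler, ModernStyleDriver, LegacyStyleDriver, the 1000-cycle period) is made up for illustration; this is not MAME's actual timer API, just the general shape of the dependency.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical framework-side timer system (NOT MAME's real API).
struct Scheduler {
    struct Event {
        uint64_t when;              // absolute time in cycles
        uint64_t seq;               // tie-breaker so equal times fire in order
        std::function<void()> fn;   // callback supplied by the driver/device
    };
    struct Later {
        bool operator()(const Event& a, const Event& b) const {
            return a.when != b.when ? a.when > b.when : a.seq > b.seq;
        }
    };
    std::priority_queue<Event, std::vector<Event>, Later> queue;
    uint64_t now = 0;
    uint64_t next_seq = 0;

    void schedule(uint64_t delay, std::function<void()> fn) {
        queue.push(Event{now + delay, next_seq++, std::move(fn)});
    }

    // Fire every event due at or before 'target', then advance the clock.
    void run_until(uint64_t target) {
        assert(target >= now);
        while (!queue.empty() && queue.top().when <= target) {
            Event ev = queue.top();  // copy first: the callback may push new events
            queue.pop();
            now = ev.when;
            ev.fn();
        }
        now = target;
    }
};

// "Modern" style: the driver just asks the framework for a periodic callback.
struct ModernStyleDriver {
    int irq_count = 0;
    void start(Scheduler& s) { arm(s); }
    void arm(Scheduler& s) {
        s.schedule(1000, [this, &s] { ++irq_count; arm(s); });
    }
};

// "Legacy" style: no framework timers, so the driver counts cycles by hand.
struct LegacyStyleDriver {
    uint64_t cycles_since_irq = 0;
    int irq_count = 0;
    void on_cpu_cycles(uint64_t ran) {
        cycles_since_irq += ran;
        while (cycles_since_irq >= 1000) {
            cycles_since_irq -= 1000;
            ++irq_count;
        }
    }
};
```

Both versions raise the same number of interrupts over the same stretch of time, but ModernStyleDriver cannot even be compiled against a framework that has no Scheduler, and that is the mix-and-match problem in miniature.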
There is the MAME core, or rather framework; then there are CPU cores, sound cores etc., which would be better referred to as devices; and then there are the drivers.
CPU cores, sound cores and video cores tend to be used by a lot of game drivers.
The C++ stuff makes it more modular, yes, but that still doesn't mean you can just mix and match. For example, the 6502 was just rewritten to be cycle accurate. Because most drivers had previously assumed that it wasn't, they too had to be updated in line with the new core, since some things in them were wrong before. You couldn't just mix the old/new 6502 with old/new drivers; it wouldn't work.
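As a toy illustration of what "cycle accurate" changes from a driver's point of view, take a 4-cycle absolute store (opcode fetch, two address fetches, then the write on the fourth cycle). The Bus struct and both functions below are invented for this sketch and are not taken from MAME's 6502 code; the assumption is simply that an instruction-level core does the bus access first and charges the cycles afterwards.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical bus that records when the last write arrived.
struct Bus {
    uint64_t clock = 0;             // cycles elapsed so far
    uint64_t last_write_cycle = 0;  // clock value seen by the written device
    void write() { last_write_cycle = clock; }
};

// Instruction-level core (assumed behaviour): do the write, then charge
// all 4 cycles in one go. The device sees the write at the start.
uint64_t store_instruction_level(Bus& bus) {
    bus.write();
    bus.clock += 4;
    return bus.last_write_cycle;
}

// Cycle-level core: three fetch cycles pass before the write happens
// on the fourth cycle.
uint64_t store_cycle_level(Bus& bus) {
    bus.clock += 3;   // opcode fetch + low/high address byte fetches
    bus.write();
    bus.clock += 1;   // the write cycle itself
    return bus.last_write_cycle;
}

// Both cores agree on the total cost; they disagree on when the write landed.
uint64_t write_timing_gap() {
    Bus old_bus, new_bus;
    uint64_t old_seen = store_instruction_level(old_bus);
    uint64_t new_seen = store_cycle_level(new_bus);
    assert(old_bus.clock == new_bus.clock);
    return new_seen - old_seen;
}
```

A driver written around the first behaviour can have the 3-cycle difference baked into it as a fudge factor somewhere, which is the kind of assumption that has to be fixed when the core underneath changes.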
Sure, Pac-Man might not share much directly with Dead or Alive, but it's still code running on a CPU, and the concepts of memory maps, video updates etc. are just the same. If you understand how one thing works, it's easy to get a grasp on how the other works because of how MAME is coded; if you know how to use the debugger with one, you know how to use it with the other, etc. Splitting them wouldn't give any benefit at all.
Going back to the 6502, that's one case where you are going to see a performance decrease, and yes, only a marginal number of systems actually NEED a cycle-accurate 6502. But it's still progress, and Mamedev aren't going to maintain 2 versions of the CPU core and have to apply bug fixes to both of them. We go with the most accurate one, and that's the only sane way to go.
As I said, we've seen emulators with plug-in systems, where you need an exact balance of various components to run specific games. It's a mess, and it gives people little incentive to actually emulate something properly when they can come up with a quickly hacked pluggable core. The very thought of MAME doing something like that is horrifying.