I just finished my lists. I did mine like so:
First I used Romlister to create a list of only the working ROMs and their descriptions (well, all the non-preliminary ones anyway, with the 'adult' games removed too).
Then I went through the whole 8000-line list by hand and removed the clones (unless the parent was foreign or non-working, in which case I removed the parent and kept the clone) and all the clone hacks except the Pac-Man and Galaga speedups.
I used that list to make a bunch of fake ROMs in a folder, then had Romlister make a merged XML with just those fake ROMs.
Then I edited the merged XML by hand: removed all the 'USA 981222', 'bootleg' crap and such, replaced all Japan, Spanish, Korea, ... with the word Foreign, fixed the name of everything funky, and added 'Pong', 'Dragon's Lair', and 'Space Ace' fake data to the Romlister merged data.
Finally I used Romlister to make the following Mala lists:
4-way Joystick.mlg (also has 2-way in it)
8-way Joystick.mlg (Mag-Stik top-switchables)
All Games.mlg
Only Buttons.mlg
Casino.mlg
Console.mlg (all the VS, SNES, and Mega Drive bootlegs)
Driving.mlg
Dual.mlg
Fighter.mlg
Foreign Games.mlg
Lightgun.mlg
Mahjong.mlg
Maze.mlg
Megatouch.mlg
Multi-Game.mlg
Paddle.mlg
Pinball.mlg
Platform.mlg
PlayersChoice.mlg
Puzzle.mlg
Quiz.mlg
Rhythm.mlg
Shooter.mlg
Spinners.mlg
Sports.mlg
Trackball.mlg
Ultracade.mlg (just games that were available on the Ultracade cabs)
So now I have 3960 unique, mostly working games with perfect names that can easily be filtered to remove the foreign games (and it only took me a week).
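
For anyone wanting to repeat the fake ROMs step, here is a rough Python sketch of one way it could be done. It assumes a fake ROM is just an empty file named after the ROM with a .zip extension, and that the trimmed list is available as one ROM name per line. Both are assumptions, and the function name make_fake_roms is made up for illustration; this is not necessarily how the list above was built.

```python
from pathlib import Path


def make_fake_roms(names, out_dir):
    """Create an empty <name>.zip placeholder for each ROM name.

    Assumption: only the file name matters to the list tool, so the
    placeholders are zero-byte files rather than real zip archives.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    created = []
    for name in names:
        name = name.strip()
        if not name:
            continue  # skip blank lines in the input list
        path = out / f"{name}.zip"
        path.touch()  # creates a zero-byte file if it does not exist
        created.append(path)
    return created
```

It could be called with the lines of a text file, for example make_fake_roms(open("keep.txt"), "fakeroms"), where keep.txt and fakeroms are placeholder names.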

All the ROMs are still there; the lists only show the ones I want, which makes it easier to update sets.
I have compressed the 20 MB Romlister merged XML file down to about 1.9 MB, and all the Mala .mlg lists to 450 KB. If anybody wants them, let me know a site I can upload them to.
Tools used: Romlister, MAME -listxml, MS Excel (I didn't have SED or INTEG available on the computer I was using), Notepad, DOS batch files, and Notepad++ (I used the XML plugin and collapsed the merged XML to show me only the description lines).
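
Part of the description cleanup could probably be scripted instead of done by hand. Below is a minimal Python sketch under some loud assumptions: region and version notes sit in parentheses at the end of the description, a note like 'USA 981222' or anything mentioning 'bootleg' is simply dropped, and any note mentioning a foreign region is replaced wholesale by '(Foreign)'. The region tuple only holds the three names given in the post and would need extending, and clean_description is a made-up name. Fixing the "funky" names would still need a human.

```python
import re

# Only the regions named in the post; the real list is longer.
FOREIGN_WORDS = ("Japan", "Spanish", "Korea")

# Notes to drop entirely: a 'USA' date stamp or anything mentioning 'bootleg'.
DROP_PATTERN = re.compile(r"^USA \d{6}$|bootleg", re.IGNORECASE)


def clean_description(desc):
    """Tidy the parenthesised notes in one game description."""

    def fix(match):
        inner = match.group(1)
        if any(word.lower() in inner.lower() for word in FOREIGN_WORDS):
            return " (Foreign)"  # collapse the whole note to one tag
        if DROP_PATTERN.search(inner):
            return ""  # remove the note and its leading space
        return match.group(0)  # leave any other note untouched

    cleaned = re.sub(r"\s*\(([^()]*)\)", fix, desc)
    return re.sub(r"\s{2,}", " ", cleaned).strip()


for sample in ("Marvel Vs. Capcom (USA 981222)",
               "Street Fighter II (Japan 911210)",
               "Pac-Man (bootleg)"):
    print(clean_description(sample))
```

Foreign regions are checked before the drop rule, so a foreign bootleg stays tagged as Foreign rather than losing its note.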