There are a couple of approaches you can take.
One you can check out is romlister, although I had some issues with its reliability.
I think that's because I was using old versions of MAME, and I found out the hard way that the XML has changed a bit over the years.
Eventually I cobbled together my own AutoIt scripts to rip out the stuff I didn't want.
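For anyone who wants to try the script route, here is a rough Python sketch of the same idea (not my AutoIt scripts, just an illustration). As far as I know, older builds list each set as a `<game>` element in the `-listxml` output and newer ones use `<machine>`, so the sketch accepts both. The sample XML is hand-written, and the keyword list and function names are made up for the example.

```python
import xml.etree.ElementTree as ET

# Tiny hand-written stand-in for a MAME XML listing. A real file is far larger
# and would use only one of <game>/<machine>; both appear here on purpose.
SAMPLE_XML = """<mame>
  <game name="puckman"><description>PuckMan (Japan set 1)</description></game>
  <game name="pacman" cloneof="puckman"><description>Pac-Man (Midway)</description></game>
  <machine name="sf2"><description>Street Fighter II: The World Warrior (World 910522)</description></machine>
  <machine name="sf2b" cloneof="sf2"><description>Street Fighter II: The World Warrior (bootleg)</description></machine>
</mame>"""

# Hypothetical keyword list: anything whose description contains one is dropped.
UNWANTED = ("bootleg", "hack", "prototype")


def load_sets(xml_text):
    """Return one dict per set, whichever element name the listing uses."""
    root = ET.fromstring(xml_text)
    sets = []
    for elem in root:
        if elem.tag not in ("game", "machine"):
            continue
        sets.append({
            "name": elem.get("name"),
            "cloneof": elem.get("cloneof"),  # None for parent sets
            "description": elem.findtext("description", default=""),
        })
    return sets


def drop_unwanted(sets, keywords=UNWANTED):
    """Keep only sets whose description contains none of the keywords."""
    return [s for s in sets
            if not any(k in s["description"].lower() for k in keywords)]


kept = drop_unwanted(load_sets(SAMPLE_XML))
print([s["name"] for s in kept])
```

Matching on the description text is crude, so expect to tune the keyword list against your own listing.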
I did not take it as far as removing clones and non-US games, though, because that's a little more difficult.
Some games do not have US versions, or at least may not have one in the version of MAME you're using (even if one exists in real life, it may not be in MAME yet).
Say you ripped out anything marked World, Japan, Korea, Asia, Oceania, Europe, China and so on:
some of those games are in English. Yes, there are some Japan-marked games that are fully or partly in English.
So if you really want to be sure, you have to test them (assuming they don't have a known English version).
World and Europe sets are safe non-US picks (as in, they'll be in English) if no US set is available.
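The rule of thumb above can be sketched roughly like this in Python. It only looks at region words inside the parentheses of a set's description, and it deliberately does not throw away the Asian-region sets; it just flags them for testing. The word lists and function names are my own assumptions, not anything MAME defines.

```python
import re

# Assumed region vocabulary; extend it to match the descriptions in your listing.
REGION_WORDS = {"us", "usa", "world", "europe", "euro",
                "japan", "korea", "asia", "oceania", "china"}
LIKELY_ENGLISH = {"us", "usa", "world", "europe", "euro"}


def regions(description):
    """Return region words found inside parentheses, in order of appearance."""
    found = []
    for group in re.findall(r"\(([^)]*)\)", description):
        for word in re.findall(r"[a-z]+", group.lower()):
            if word in REGION_WORDS and word not in found:
                found.append(word)
    return found


def language_guess(description):
    """Rough triage: probably English, needs a manual test, or no tag at all."""
    tags = regions(description)
    if not tags:
        return "no region tag"
    if LIKELY_ENGLISH.intersection(tags):
        return "probably English"
    return "test it"


for d in ("Vendetta (Asia 2 Players ver. D)",
          "Street Fighter II: The World Warrior (World 910522)"):
    print(d, "->", language_guess(d))
```

Only the parenthesized part is checked, so a title like "The World Warrior" does not count as a World tag by itself.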
Then you're going to run into situations where you might actually want the non-US set.
It was already touched upon with puckman/pacman: if you ripped out all clones you'd be left with puckman.
US sets are not always the parent.
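One way to deal with this in a script is to pick one set per parent/clone family instead of just deleting clones: prefer a US set, fall back to World or Europe, and otherwise keep the parent, with a manual override list for the cases where you want a specific non-US set. This is only a sketch. The preference order is my assumption, the `region` field is assumed to have been filled in already, and every set name other than puckman/pacman is a placeholder.

```python
# Assumed preference order; anything not listed ranks last.
PREFERENCE = ["us", "world", "europe"]

# Placeholder data: only puckman/pacman are real set names here.
SETS = [
    {"name": "puckman", "cloneof": None, "region": "japan"},
    {"name": "pacman", "cloneof": "puckman", "region": "us"},
    {"name": "vendetta", "cloneof": None, "region": "world"},
    {"name": "vendetta_us", "cloneof": "vendetta", "region": "us"},
    {"name": "vendetta_asia_d", "cloneof": "vendetta", "region": "asia"},
    {"name": "jp_only", "cloneof": None, "region": "japan"},
    {"name": "jp_only_alt", "cloneof": "jp_only", "region": "japan"},
]


def rank(s):
    """Lower is better: preferred region first, then parent before clone."""
    if s["region"] in PREFERENCE:
        region_rank = PREFERENCE.index(s["region"])
    else:
        region_rank = len(PREFERENCE)
    return (region_rank, 1 if s["cloneof"] else 0)


def pick_per_family(sets, overrides=None):
    """Return {parent name: chosen set name}, honouring manual overrides."""
    overrides = overrides or {}
    families = {}
    for s in sets:
        families.setdefault(s["cloneof"] or s["name"], []).append(s)
    return {parent: overrides.get(parent) or min(members, key=rank)["name"]
            for parent, members in families.items()}


picks = pick_per_family(SETS, overrides={"vendetta": "vendetta_asia_d"})
print(picks)
```

With this, the puckman family keeps pacman, a family with no English-region set keeps its parent, and the override keeps the one set you actually want.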
Then, as was said, sometimes the US set is inferior. This is very game specific. Me, I'd rather have the English version than a non-English game with some additions. Rastan Saga is supposed to have extra story in attract mode in the Japan version, but I just run the US version anyway.
One game I really do like that is not a US set, though, is Vendetta (Asia 2 Players ver. D).
I don't think any other version has what this one does. Around what I think is the 3rd stage you're attacked by leather-clad homosexuals, and I'll leave the rest for you to go explore... it's without a doubt the most hilarious thing I've ever seen in a game. (If anyone knows a 4p version with that intact, let me know.)
And it's in English, but you wouldn't know it if you just blindly ripped out any set not marked US.
OK, this is getting long-winded. I recently had to cobble together an older .106 ROM set.
After I filtered out all the useless stuff that did not match my control setup (8-way stick with 6 buttons),
I was still left with a hefty list and basically went through it by hand. Yes, it did take over an hour.
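The control filtering step could look something like the Python sketch below. Big caveat: the attribute layout is from memory and is an assumption on my part. I believe old listings put `control` and `buttons` directly on the `<input>` element, while newer ones nest `<control>` children under it, so check your own listing before trusting this. The set names in the sample are placeholders.

```python
import xml.etree.ElementTree as ET

# Hand-written sample covering both assumed layouts; set names are placeholders.
SAMPLE_XML = """<mame>
  <game name="old_fighter"><input players="2" control="joy8way" buttons="6"/></game>
  <game name="old_spinner"><input players="1" control="dial" buttons="1"/></game>
  <machine name="new_fighter"><input players="2"><control type="joy" buttons="6" ways="8"/></input></machine>
  <machine name="new_gun"><input players="1"><control type="lightgun" buttons="1"/></input></machine>
  <machine name="new_8button"><input players="1"><control type="joy" buttons="8" ways="8"/></input></machine>
</mame>"""


def controls(entry):
    """Yield (control_type, buttons) pairs for either assumed layout."""
    inp = entry.find("input")
    if inp is None:
        return
    if inp.get("control"):  # assumed older layout: attributes on <input>
        yield inp.get("control"), int(inp.get("buttons", "0"))
    for c in inp.findall("control"):  # assumed newer layout: <control> children
        yield c.get("type", ""), int(c.get("buttons", inp.get("buttons", "0")))


def fits_panel(entry, max_buttons=6):
    """True if every listed control is a joystick needing at most max_buttons."""
    found = list(controls(entry))
    return bool(found) and all(t.startswith("joy") and b <= max_buttons
                               for t, b in found)


root = ET.fromstring(SAMPLE_XML)
playable = [e.get("name") for e in root if fits_panel(e)]
print(playable)
```

Sets with no input information at all are dropped here, which is the cautious choice; you may prefer to keep them and check by hand.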
Most ROMs are easy to spot and eliminate. I mean, how many versions of Street Fighter 2 do we need? And that's even after I had ripped out all the bootlegs and hacks.
What really trips you up is when a game has clones with vastly different names:
Darkstalkers and Vampire Hunter or Vampire Savior, for example,
or Vendetta and Crime Fighters 2.
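If you have the parent/clone information from the XML, a script can at least point out these families for you so you know where to look when going through the list by hand. A minimal Python sketch, assuming you already have (name, cloneof, description) for each set; the set names and trimmed descriptions below are illustrative, not copied from a real listing.

```python
from collections import defaultdict

# Illustrative data: (set name, parent set or None, description).
SETS = [
    ("vendetta", None, "Vendetta (World)"),
    ("vendetta_jp", "vendetta", "Crime Fighters 2 (Japan)"),
    ("sf2", None, "Street Fighter II: The World Warrior (World)"),
    ("sf2_us", "sf2", "Street Fighter II: The World Warrior (US)"),
]


def base_title(description):
    """Strip the trailing '(region, version)' part of a description."""
    return description.split(" (")[0].strip()


def renamed_families(sets):
    """Return families whose members carry more than one distinct title."""
    titles = defaultdict(set)
    for name, cloneof, description in sets:
        titles[cloneof or name].add(base_title(description))
    return {parent: sorted(t) for parent, t in titles.items() if len(t) > 1}


print(renamed_families(SETS))
```

This only catches renamed clones that are actually linked to a parent in the XML; games listed as separate parents will still need a human eye.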
Give romlister a go and it should at least get you started.