What would I use to ensure I have *accurate* data since the game listing coming out of Mame has a number of inconsistencies? Can I assume that ROM name, description, date, MFG are mostly accurate from the Mame XML?
The MAME XML is as accurate as MAME itself, apart from a few generalizations made to keep it human-readable.
The control area has the most generalizations. For example, as mentioned before, "players" is really "the highest player number used in the input stack in the game's driver". Likewise, "buttons" is really "the highest button number used in the input stack in the game's driver". If a button number is skipped, it's not noted (AFAIK, no driver skips more than one button). If a button is used as a hack for something, it can affect the button number (if it pushes the button numbers up, like pacman) or the players number (if a different player is used to keep the button count correct).
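To make the "highest number used" point concrete, here's a rough sketch in Python. It is not MAME's actual code. The (player, button) pairs and the summarize_inputs name are made up for illustration, assuming a driver's input stack can be boiled down to that kind of list.

```python
# Hypothetical helper, not part of MAME: mimics how the listing
# generalizes "players" and "buttons" from a driver's inputs.
def summarize_inputs(input_stack):
    """Return (players, buttons) as the highest player number and the
    highest button number seen in a list of (player, button) pairs."""
    players = max((player for player, _ in input_stack), default=0)
    buttons = max((button for _, button in input_stack), default=0)
    return players, buttons


# Illustrative data only: player 1 uses buttons 1 and 3, button 2 is skipped.
stack = [(1, 1), (1, 3)]
print(summarize_inputs(stack))  # (1, 3)
```

Note that the result says 3 buttons even though only 2 are actually wired up in this made-up example. That is the kind of skipped-button case the listing doesn't flag.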
The easier stuff (name, description, etc.) has a much higher accuracy rate, and gets changed when something is found and agreed to be wrong. The accuracy is high mostly because these fields are easy to check, verify, and change.
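If you want to pull just those easy fields out of the XML, something like the sketch below should do. It assumes the listing uses game elements with description, year, and manufacturer children plus an input element carrying a players attribute, so check that against the output of your own MAME version. The sample entry and the read_basic_info name are placeholders, not real data.

```python
import xml.etree.ElementTree as ET

# Placeholder entry with made-up values; the element layout is an
# assumption about the listing format, not copied from real output.
SAMPLE = """<mame>
  <game name="examplegame">
    <description>Example Game</description>
    <year>1980</year>
    <manufacturer>Example Corp</manufacturer>
    <input players="1"/>
  </game>
</mame>"""


def read_basic_info(xml_text):
    """Yield one dict of basic fields per game element."""
    root = ET.fromstring(xml_text)
    for game in root.iter("game"):
        controls = game.find("input")
        yield {
            "name": game.get("name"),
            "description": game.findtext("description"),
            "year": game.findtext("year"),
            "manufacturer": game.findtext("manufacturer"),
            # Left as None when the entry has no input element.
            "players": controls.get("players") if controls is not None else None,
        }


for row in read_basic_info(SAMPLE):
    print(row["name"], row["year"], row["manufacturer"], row["players"])
```

Everything comes back as strings (or None if a field is missing), so convert year and players yourself if you need numbers.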
Oh... and the history on defender says that they released a cocktail version of Defender, which, following the logic presented here, would mean that <input players="1"> is inaccurate. Right?
No, defender doesn't have a cocktail dipswitch, so it has no cocktail mode and the parent ROM only has one set of controls. So defender's entry is accurate.
Looks like one of the following: MAME doesn't have the cocktail version ROM dumped (if it was a different ROM), that part is not emulated, or the history is wrong. I'm guessing the first one.
... Some very basic info might be ok but there simply isn't enough room to show that much data. That is why I always recommend calling johnny 5 from within the FE when you are curious as to controls. Now the other data doesn't take up as much room and I like to put that in a skin sometimes.
Yes, the controls can be too much sometimes, while the basic info is easy. Take the 50 or so different input types (from controls.dat) and the hundreds of different combinations they have been used in, even after MAME's simplification of the inputs to a dozen types, and you've got a tangle to mess with.