
Author Topic: Hypothetical Console rom Standardization.....  (Read 4603 times)

Howard_Casto

  • Idiot Police
  • Trade Count: (+1)
  • Full Member
  • ***
  • Offline Offline
  • Posts: 19434
  • Last login:Today at 06:49:52 pm
  • Your Post's Soul is MINE!!! .......Again??
    • The Dragon King
Hypothetical Console rom Standardization.....
« on: June 28, 2011, 04:04:31 pm »
This happened in a round-about way.... 

I was playing with Google's new search-by-image feature as a way of getting boxart more quickly.... you snap a pic of your cart via webcam and Google uses the image to build a search query.  It's not ready to be driven by pure code yet (i.e. integrated into a front-end) but it works amazingly well!  I can snap a pic of a smb3 cart and Google does a search for "super mario bros. 3 cartridge".  It will be something to look for in the future.

Anyway, while looking over my old cart collection, I noticed that all nintendo games, from nes to wii have a product code printed on the cart and the box.  The format varies from system to system, but it always contains a unique game id, manufacturer id and country code.  Looking into it even more I discovered that ALL nintendo and sega games have this info embedded into the header of the game's roms as well. 

So why in the world don't we have a rom name standardization for console roms?  Should this id be what the rom is called?  A very simple id=Game Name ini file could take care of the real names.  What's more if you have the cart, or pic of the cart getting the official game name is easy.  Type the id into google and the results will feature the game!  It's as simple as that!
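A minimal sketch of the id=Game Name ini idea, assuming a single hypothetical `[games]` section. The two sample entries use IDs that come up later in this thread; the section name and function names are made up for illustration.

```python
import configparser

# Hypothetical lookup table: cart id on the left, real game name on the right.
INI_TEXT = """
[games]
NES-WD-USA = Double Dragon
CX2601 = Combat
"""

def load_names(ini_text):
    parser = configparser.ConfigParser()
    # Keep ids case-sensitive; by default configparser lowercases keys.
    parser.optionxform = str
    parser.read_string(ini_text)
    return dict(parser["games"])

def real_name(game_id, names):
    # Fall back to the raw id when the table has no entry for it.
    return names.get(game_id, game_id)

names = load_names(INI_TEXT)
print(real_name("NES-WD-USA", names))
```

A front-end could keep one such file per system and only fall back to the raw id when a game is missing from the table.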

I know that newer disc-based games, particularly wii games already use this standard for releases.  It seems like it might be worth going back and standardizing some of the classic games. 

The only thing crc's (which I think is what goodtools uses atm) would be good for would be rom hacks... but who cares about those anyway.

I could see a console-centric fe reading the headers directly instead of relying on the rom name and automatically renaming them to the standard!  The headers are actually quite easy to parse, it's just different for each system.

I might whip up a little app to show just what I'm talking about. 

Dazz

  • Trade Count: (0)
  • Full Member
  • ***
  • Offline Offline
  • Posts: 1246
  • Last login:January 11, 2025, 07:43:39 am
  • HyperSpin Team
    • VPUniverse
Re: Hypothetical Console rom Standardization.....
« Reply #1 on: June 28, 2011, 04:16:20 pm »
Hmmm, you've got my attention...  

Being part of a front end team, I can say rom management is a HUGE issue, not only for HyperSpin, but for most FE's.  Right now HyperSpin uses the No-Intro sets, as they make the most sense and the roms included are only the best of the best.  No hacks, no bad dumps, over dumps, under dumps, etc.

However, if we could come up with a nice standard this would make FE's jobs much easier...

I just pulled out a bunch of game carts for several different systems.  It looks like only a couple have codes.  Nintendo has always been good about using their code system; Sony PlayStation, PS2, PS3 and Xbox are the only others that have standard codes that stand out on their carts/boxes.  So, a system like this would work great for the systems that assigned their games codes, but what about those that don't?
« Last Edit: June 28, 2011, 05:32:04 pm by Dazz »



Howard_Casto

Re: Hypothetical Console rom Standardization.....
« Reply #2 on: June 28, 2011, 07:31:42 pm »
I know that sega genesis games have a header system similar to the snes (surprise surprise) so even if it isn't on the box, it is in the rom itself.  Most roms have some sort of header... it would take some research, but I think that we would find that most roms have a unique identifier hidden in their code somewhere.  Reading the headers is trivial as well.  Typically you load up the first few bytes of the rom and parse a few bytes.  There's actually a lot of data in some of them... it's comparable to the info mame prints out in terms of manufacturer, revision number, date, etc...

I've always wanted to do a console-exclusive fe, but standardization is what has held me back.  I can pool some data if others would actually be willing to help me with this.  I think we need some hi res pics of carts from various systems, but like I said even if it isn't on the cart itself, it's probably in the rom. 

Howard_Casto

Re: Hypothetical Console rom Standardization.....
« Reply #3 on: June 29, 2011, 12:47:44 pm »
I worked on this a little last night just to see how possible it is. 

n64: 

Very easy to parse: the header is the first 64 bytes of the game and its content can be read out as ascii.  The game id AND the first 20 characters of the game's name are in there, and the only things you have to deal with are a lookup table for the game's region and the removal of null characters (ascii code 0).  Some of the less popular file extensions have some swapped bytes as well, but it is quite easy/fast to re-arrange them.

I wrote a little program to test this and it works quite well.
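This is not the test program mentioned above, just a minimal sketch of the same idea. It assumes the commonly documented N64 layout (title at 0x20, four-character game code at 0x3B, version at 0x3F) and uses the first four bytes to spot swapped dumps. The region table is deliberately partial and the function names are made up.

```python
# The first four bytes of a dump reveal its byte order.
MAGIC_Z64 = bytes.fromhex("80371240")  # native big-endian
MAGIC_V64 = bytes.fromhex("37804012")  # every 16-bit pair swapped
MAGIC_N64 = bytes.fromhex("40123780")  # every 32-bit word reversed

# Partial region lookup keyed on the last letter of the game code.
REGIONS = {"E": "USA", "J": "Japan", "P": "Europe"}

def to_big_endian(header):
    """Return the 64-byte header in native order, whatever the dump format."""
    magic = header[:4]
    if magic == MAGIC_Z64:
        return header
    if magic == MAGIC_V64:
        swapped = bytearray(header)
        swapped[0::2], swapped[1::2] = header[1::2], header[0::2]
        return bytes(swapped)
    if magic == MAGIC_N64:
        return b"".join(header[i:i + 4][::-1] for i in range(0, len(header), 4))
    raise ValueError("unrecognised N64 byte order")

def parse_n64_header(rom):
    h = to_big_endian(rom[:64])
    # Null padding is turned into spaces, then trimmed.
    title = h[0x20:0x34].replace(b"\x00", b" ").decode("ascii", "replace").strip()
    game_code = h[0x3B:0x3F].decode("ascii", "replace")
    return {
        "title": title,
        "game_code": game_code,
        "region": REGIONS.get(game_code[3], "unknown"),
        "version": h[0x3F],
    }
```

Because the byte order is normalised first, the same parser should work for every common N64 file extension.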

snes:

Ugh....  21 characters are reserved for the game name this time, which is good I guess.  Contains about as much info as the n64 header, BUT I haven't found the game id yet... sometimes the id for a game is encoded though so it might be in there.  The problem is that the snes header is NOT at the beginning but can be at one of 4 places depending upon the type of rom it is and if somebody has inserted an emulator-specific header.  While it's easy to check all four areas, I haven't found a good way to determine if you've actually found the header.  I'm sure there is one, it's just the docs for snes headers are a bit sketchy.

nes:

Easy to parse, like the n64.  There is far less data though... basically you get the game id, some mapper info, and that's about it.  Luckily the id is all we really need.  There are also "iNES" headers for us to contend with, which add supplementary info for the iNES emulator.  On top of that, you can add data to the end of a nes rom without emulators complaining, so there is often a footer that contains the full game name, assuming the set has been manipulated that is.  Fun fact:  The mp3 tag protocol (id3) actually came from headers tacked on to early nes rom dumps for use with emulators!
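For the iNES header mentioned above, here is a minimal sketch of pulling out the mapper and size info. It assumes the commonly documented 16-byte iNES layout (magic bytes, PRG and CHR bank counts, two flag bytes); the function name is made up.

```python
INES_MAGIC = b"NES\x1a"  # the first four bytes of an iNES file

def parse_ines_header(rom):
    """Return basic iNES header fields, or None if the magic is missing."""
    if rom[:4] != INES_MAGIC:
        return None
    flags6, flags7 = rom[6], rom[7]
    return {
        "prg_rom_bytes": rom[4] * 16 * 1024,  # PRG ROM is counted in 16 KB banks
        "chr_rom_bytes": rom[5] * 8 * 1024,   # CHR ROM is counted in 8 KB banks
        # Mapper number: low nibble lives in flags6, high nibble in flags7.
        "mapper": (flags7 & 0xF0) | (flags6 >> 4),
        "has_trainer": bool(flags6 & 0x04),
    }
```

The two size fields also say where the real rom data ends, which helps separate it from any footer tacked on afterwards.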


My guess is I'll find similar results for the sega carts.  I'll also guess that carts earlier than the nes aren't going to have a header, simply due to lack of space. 

But sega and nintendo are the big two, so getting all of their info would be a massive help!

On top of that we know that once games switched to discs, they began having unique ids.  If nothing else we can read the cd/dvd id, which has nothing to do with the game at all and HAS to be on any disc media pressed.

SavannahLion

  • Wiki Contributor
  • Trade Count: (+1)
  • Full Member
  • ***
  • Offline Offline
  • Posts: 5986
  • Last login:December 19, 2015, 02:28:15 am
Re: Hypothetical Console rom Standardization.....
« Reply #4 on: June 29, 2011, 03:55:09 pm »
Did you check multiple versions of the same game?

Howard_Casto

Re: Hypothetical Console rom Standardization.....
« Reply #5 on: June 29, 2011, 07:40:01 pm »
Yeah, it isn't quite as obvious, but it is detectable. 

Most of the headers have a single "version" byte, set to "00" by default.  If a revision of the game comes out, they up the version number. 

Also the region code separates different countries' versions.

Further inspection led me to the following conclusions:

Snes... the id isn't in the header, BUT each header is unique for each game what with the game title, the region and the version number included.  It should be possible to use the header as a unique identifier for a game and then rename it based on some sort of external lookup table.
I also found a method to determine where the header is hidden, so that isn't an issue.
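The post doesn't say which method was found, so the following is only one commonly used heuristic, not necessarily the same one: the internal header stores a checksum and its complement, and the two should XOR to 0xFFFF. The offsets assume the usual LoROM/HiROM locations plus an optional 512-byte copier header, which gives the four candidate places.

```python
HEADER_BASES = (0x7FC0, 0xFFC0)  # LoROM and HiROM locations of the internal header
COPIER_HEADER = 512              # optional extra header added by dump tools

def looks_like_header(rom, base):
    """True if the checksum and its complement at this offset agree."""
    if base + 0x20 > len(rom):
        return False
    complement = rom[base + 0x1C] | (rom[base + 0x1D] << 8)
    checksum = rom[base + 0x1E] | (rom[base + 0x1F] << 8)
    return (checksum ^ complement) == 0xFFFF

def find_snes_header(rom):
    """Check all four candidate offsets; return the first plausible one."""
    for skip in (0, COPIER_HEADER):
        for base in HEADER_BASES:
            if looks_like_header(rom, skip + base):
                return skip + base
    return None

def snes_identity(rom):
    """Title + region + version, usable as a lookup key as described above."""
    offset = find_snes_header(rom)
    if offset is None:
        return None
    title = rom[offset:offset + 21].decode("ascii", "replace").rstrip()
    return (title, rom[offset + 0x19], rom[offset + 0x1B])
```

The tuple returned by `snes_identity` is the kind of key an external lookup table could map to the printed cart id.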

NES... the header is pretty worthless, BUT there is enough data in there for a unique id.  I also found a txt file dating from 2001 with a list of all the games, and their id numbers as printed on the cart.  Nes games also support a 128 byte footer at the end of the rom, so the solution there would be to use the lookup table and add our necessary data to the end of the rom.

I want to get snes and nes games added to my little test application and then I can look into other carts. 

It also appears that many emu-centric websites as well as big brand websites like gamefaqs, have the game id's on a game's info page, so it should be fairly easy to come up with lookup ini files. 

n64 is golden... all the info we need is in there, so there's no mucking about.

SavannahLion

Re: Hypothetical Console rom Standardization.....
« Reply #6 on: June 29, 2011, 08:25:46 pm »
I'm on my phone so I can't look it up but I recall one of the emulator authors trying to do something similar. Why he failed isn't the point but I do recall him saying that most roms out there have headers that were added by whatever tool was used to pull the data or whatever the emulator author put there. Might be worth a read. Let me see if I can find it tonight.

Howard_Casto

Re: Hypothetical Console rom Standardization.....
« Reply #7 on: June 30, 2011, 01:59:13 am »
Well understand that I'm not talking about supplementary headers (those are in there and you have to deal with them) I'm talking about actual headers...... these are part of the original rom image and aren't tacked on by the dump tools.

For example snes roms with the "SFC" extension have nintendo's internal header as well as the SFC header.  We promptly ignore the SFC header and use nintendo's internal one.  Now keep in mind that hacked roms probably have their original headers intact, but again, I have no interest in dealing with hacked roms.

SavannahLion

Re: Hypothetical Console rom Standardization.....
« Reply #8 on: June 30, 2011, 02:11:06 am »
TBH, neither do I. It especially irritates me that rom dumpers or writers feel the need to tack on a bunch of ---steaming pile of meadow muffin--- that wasn't there to begin with. Use ---smurfing--- tables for crying out loud. These consoles have a finite number of games so tables aren't unfeasible.

The fad of writing/hacking games that are emulator specific and can't or won't run on the actual hardware pisses me off too. Building an entire game around a known feature bug that doesn't exist in the original hardware is just asinine.

But that's off topic. I just thought I would point out the original article but for the life of me I can't remember which emulator author lamented about this. He also proposed how to manage the ROMS. I thought it would be a worthwhile read for ideas. The only thing I can recall is that he makes it a point to write his emulator for accuracy and not on the whims of the audience. He drew back from the scene and claims to only write his emulator for his own benefit and thus... his emulator lacks many of the "special" features most other emulators have. I think it was an NES emulator but I could be wrong. SNES maybe? I tried searching but no luck.

SavannahLion

Re: Hypothetical Console rom Standardization.....
« Reply #9 on: June 30, 2011, 02:13:17 am »
Oh duh, now I remember. It was bsnes.  :banghead:

edit

OK, Maybe I'm still wrong. I could have sworn he wrote a detailed article about the ROM files and his ideas for managing them.

Oh well. I give up.

Oh... I like this idea. Forgot to mention that I think.
« Last Edit: June 30, 2011, 02:31:39 am by SavannahLion »

Howard_Casto

Re: Hypothetical Console rom Standardization.....
« Reply #10 on: June 30, 2011, 03:46:24 pm »
I agree completely.. it's stupid.  Even if the authors of these emulators were too lazy to make a huge table, they could have written external cfg files for each individual rom.  They could have wrapped both files in a zip and been done with it.

There are a ton of hurdles to overcome with such a project.... the main one being that even if I manage to write a converter people would actually have to start using it.  That's a hard one.  I tried to standardize visual pinball tables years ago and the vp authors basically told me to "get bent" in a nice way.  And still to this day it's next to impossible to know for sure which variant of a table you have. 


But back to progress. 

Looked into sega master system roms.  They have a header as well, and each game has a unique id.  The only tricky things are that the bytes are flipped (my app already compensated for that) and that the values are written in binary-coded decimal (BCD).  Once I added support for reading individual bits it was easy to add. Also Game Gear roms apparently have the exact same header, so I got two for one! The only problem is that these unique ids never appear anywhere on the cart or box, they are simply internal.
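A minimal sketch of the BCD decoding described above. The header offsets and field positions are assumptions based on the commonly documented "TMR SEGA" header layout, not on the app's actual code: two BCD bytes stored low pair first, then one byte shared between an extra digit and the version.

```python
SMS_HEADER_OFFSETS = (0x7FF0, 0x3FF0, 0x1FF0)  # places the header is usually found
SMS_MAGIC = b"TMR SEGA"

def bcd_byte(value):
    """Decode one packed-BCD byte (two decimal digits) into an int."""
    high, low = value >> 4, value & 0x0F
    if high > 9 or low > 9:
        raise ValueError("not a valid BCD byte: 0x%02X" % value)
    return high * 10 + low

def sms_product_code(rom):
    """Return (product_code, version), or None if no header is found."""
    for base in SMS_HEADER_OFFSETS:
        if rom[base:base + 8] == SMS_MAGIC:
            low_pair = bcd_byte(rom[base + 0x0C])   # last two digits come first
            high_pair = bcd_byte(rom[base + 0x0D])  # then the two digits before them
            extra = rom[base + 0x0E] >> 4           # high nibble: any leading digit
            version = rom[base + 0x0E] & 0x0F       # low nibble: revision
            return extra * 10000 + high_pair * 100 + low_pair, version
    return None
```

With this layout, the stored bytes 0x26 0x70 decode to product code 7026, which is the "flipped" order mentioned in the post.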

From my research I've run into several variants and this is how I propose we deal with them.

1. Cart/Box has a clear id.  Id is also contained in the header:  This is our ideal rom format.  We simply rename based on the internal header. We can also create an ini file/table with id number to real game name lookup.  As you said, console roms have a finite number of games, so this is doable.

2. No id on Cart/Box but unique id is contained in the header:  We can still rename based on internal header, but because the id isn't documented, someone would have to take a good tools version of the whole set and convert it to get the game names.  A little more work, but still doable.

3. Id on the cart/box, but no id in the header.  Header is unique though:  We would need a two stage ini for this one.  First an ini that converts the relevant header bits (like the game name + the region code + the version number in a snes header) to the official cart id.  Then a second ini that shows the game name in relation to the cart id.  This is what will have to be done for the snes, which is a pain in the ass, but still doable.

4. Id on the cart/box, but no id in the header.  Header is NOT unique:  Unfortunately this is how the nes games are.  Fortunately, somebody has already made a table of all known nes games and their corresponding id.  The problem comes with rom identification.  It obviously can't be done with the header, but perhaps a crc check of the rom data AFTER the header would work (because, as you've stated, emu authors and rom hackers tend to monkey with the headers and/or tack extra data on).  Many roms, including nes roms, allow you to tack some data on the end of the file.  Once a rom is crc checked, we might be able to "tag" it with the game id and game name.  I'm not so sure about messing with the actual rom data though.

5. No id:  If there's some sort of unique header or something we could make up id's for all the games.  This would be time consuming though.  If not then these sad examples might have to remain in goodtools land.
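Case 4 above can be sketched as follows: compute a crc over the rom data only, skipping the iNES header and ignoring anything tacked on the end, then run the two-stage lookup. The layout assumed is the standard 16-byte iNES header; both lookup tables are hypothetical and would be supplied by the caller.

```python
import zlib

INES_MAGIC = b"NES\x1a"
INES_HEADER_SIZE = 16
TRAINER_SIZE = 512

def payload_crc32(rom):
    """CRC32 of the PRG+CHR data only, ignoring header and trailing footer."""
    if rom[:4] != INES_MAGIC:
        # No iNES header present: treat the whole file as rom data.
        return zlib.crc32(rom) & 0xFFFFFFFF
    start = INES_HEADER_SIZE
    if rom[6] & 0x04:
        start += TRAINER_SIZE  # optional 512-byte trainer sits before the data
    size = rom[4] * 16 * 1024 + rom[5] * 8 * 1024
    return zlib.crc32(rom[start:start + size]) & 0xFFFFFFFF

def identify(rom, crc_to_id, id_to_name):
    """Two-stage lookup: crc -> cart id -> real game name."""
    game_id = crc_to_id.get(payload_crc32(rom))
    if game_id is None:
        return None
    return game_id, id_to_name.get(game_id, game_id)
```

Since the crc never covers the header or the footer, the same game should identify correctly no matter what a dump tool or emulator has added around it, and the rom data itself never needs to be modified.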

So yeah a ton of work would be involved.  I'm just playing with the idea atm... trying to write a little program that can parse the headers of all the popular consoles.  I would need help with this though if we proceed.  It's not like I'm going to sit down and document hundreds of roms by myself. 

Howard_Casto

Re: Hypothetical Console rom Standardization.....
« Reply #11 on: July 01, 2011, 02:26:27 am »
Just an update:

Looked at the bsnes page....

While I completely and totally agree with everything the author has said, his "barebones" rom format has never caught on.  The most popular snes format by far is the "smc" format, which is pretty much the ONLY snes file format you are going to find these days.  A person has to do some hoop jumping to remove/ignore headers that have been tacked on, BUT the roms themselves and the internal game headers are still the same.  I added support for the smc format to my little test app this morning. 

Since then I've also added support for :

Gameboy/Gameboy Color:  They have a similar header to the snes except it's easier to find.  Unfortunately many of the dumps out there have their headers overwritten with emulator garbage and/or have a header appended at the top of the rom.  Getting a complete, untouched set together would be the real challenge. 

Genesis:  Added support for the ".bin" format only as this is the raw rom dump and it is the most popular genesis format anyway.  The cart id is actually in the header with these, so like the master system and n64, it will be a breeze to convert.

On a side note the reason genesis ports were so inferior might have been all the space wasted on the header.  ;)  Seriously though, it's around 248 bytes!!  The thing contains everything from the name of the console, to a copyright notice, to TWO instances of the game name to even a memo section!  Compare that with the n64 header, which is a trim 64 bytes long.
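To show how much is packed into that header, here is a minimal sketch that reads the main text fields from a raw ".bin" dump. The offsets assume the commonly documented layout starting at 0x100; the field names and function name are made up.

```python
# (offset, length) of the text fields in a raw Genesis dump.
GENESIS_FIELDS = {
    "console": (0x100, 16),
    "copyright": (0x110, 16),
    "domestic_name": (0x120, 48),  # first copy of the game name
    "overseas_name": (0x150, 48),  # second copy of the game name
    "serial": (0x180, 14),         # the cart id
    "memo": (0x1C8, 40),
    "region": (0x1F0, 3),
}

def parse_genesis_header(rom):
    """Return the header fields as text, or None if this isn't a raw dump."""
    if len(rom) < 0x200:
        return None
    info = {
        name: rom[offset:offset + length].decode("ascii", "replace").strip()
        for name, (offset, length) in GENESIS_FIELDS.items()
    }
    if "SEGA" not in info["console"]:
        return None
    # The checksum is a 16-bit big-endian value right after the serial.
    info["checksum"] = int.from_bytes(rom[0x18E:0x190], "big")
    return info
```

The `serial` field is the one a renamer would care about; the rest is extra info a front-end could display.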


So to recap, out of all the classic nintendo and sega carts, I've found identifiable info in every cart except the NES.  Some of these headers don't outright have the cart id in there, but they have a checksum, game name or some other unique identifier that we can use to id the game and convert the file name to the cart id.

Anything newer will be disc-based and thus at least have a cd id # for identification.  Anything older, unfortunately, won't have a header, BUT systems older than the nes have FAR FEWER GAMES than more modern consoles, so it should be easy for a team of people to get ahold of a complete set and assign arbitrary id codes.  I know that companies such as activision actually printed a nes-like cart id on all their releases, so just because it isn't in the header doesn't mean we can't get an "official" id.

So is there still any interest knowing the work that would have to be done?

SavannahLion

Re: Hypothetical Console rom Standardization.....
« Reply #12 on: July 02, 2011, 05:53:46 pm »
Why do arbitrary ID codes for those that don't have them? Just do a checksum and run it off that.

Case in point. Most 2600 carts have an ID usually in the form of a model number. It's not possible to find the ID inside the ROM. But it is possible to create a checksum and cross reference IDs leaving the few that do not have a model number a simple checksum number as their ID. Maybe as CRC-###### to avoid collisions with existing ID codes. 2600 code is extremely tight so I don't believe checksum collisions on a specific console are likely. I know a lot of 400, 800, XE and XL carts are labeled in a similar manner.

SavannahLion

Re: Hypothetical Console rom Standardization.....
« Reply #13 on: July 02, 2011, 05:59:19 pm »
I know it sounds like I'm just parroting you, but you're talking about carts with internal IDs or some other unique identifier. I'm talking about carts that DO NOT have any unique identifier inside the binary.

Howard_Casto

Re: Hypothetical Console rom Standardization.....
« Reply #14 on: July 02, 2011, 09:52:55 pm »
It's mostly for readability's sake.  Even if it's numerical, an id code is generally easy for human eyes to read.  A crc checksum on the other hand isn't.

It's the same reason mame gives each rom a name instead of a crc checksum.


I'm not ruling out the possibility though.

SavannahLion

Re: Hypothetical Console rom Standardization.....
« Reply #15 on: July 04, 2011, 05:30:15 am »
Uh... A name is not the same as an ID. I must be misunderstanding what you're talking about here.

Without Googling it, would you know what game CX2601 is? No one except for a hardcore collector would know that off the top of their head and only a hardcore geek would refer to the game as CX2601.

I'll save you the trouble for those who don't know. It's the number for Combat.

So... knowing this, and if we choose to stay within the four digit limit, then we can use a 16 bit CRC (just as an example here) for 2^16 (65,536) unique numbers. Atari Age has 970 2600 titles cataloged and I'm sure many of these are duplicates with different labels and homebrews. Subtract the titles with an existing ID (such as CX2601) and the likelihood of a checksum collision approaches fairly close to zero.

So we assign an arbitrary two letter code to identify games that do not have a code such as.... I dunno... AC, then give it our CRC16 (which we're not restricting to just numbers but 0-9 and A-F which will help further identify these ID's as generated and not cataloged). So your ID for our example cartridge is now AC4B37.

TADA!

Adjust accordingly for other platforms. 
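A minimal sketch of the generated-ID scheme above, using the 16-bit CRC-CCITT from Python's standard library as a stand-in (the post does not name a specific CRC16 variant). The second function estimates the chance that at least two titles share an ID; with a 16-bit space that chance rises quickly with the number of titles (the birthday effect), so it is worth computing for the real count rather than assuming.

```python
import binascii

def generated_id(rom, prefix="AC"):
    """Build an ID like 'AC4B37': arbitrary prefix plus a 4-digit hex CRC16."""
    crc = binascii.crc_hqx(rom, 0)  # 16-bit CRC-CCITT, initial value 0
    return f"{prefix}{crc:04X}"

def collision_probability(titles, id_space=2 ** 16):
    """Chance that at least two of `titles` random IDs coincide."""
    p_unique = 1.0
    for k in range(titles):
        p_unique *= (id_space - k) / id_space
    return 1.0 - p_unique

# Example: estimate for 100 titles that lack a printed model number.
print(round(collision_probability(100), 3))
```

If the estimate comes out too high for the number of ID-less titles on a platform, a wider CRC keeps the same scheme workable at the cost of a longer ID.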

Howard_Casto

Re: Hypothetical Console rom Standardization.....
« Reply #16 on: July 04, 2011, 07:32:58 pm »
Well not all id's are that obscure... take double dragon for the NES, its official "long form" id is:

NES-WD-USA

Believe it or not, yes, I can tell that's Double Dragon.  NES and USA are obvious, they tell me the system and region.  "W" is the manufacturer, in this case Trade"W"est.  And the next letter refers to the title, "D" for double dragon.

Now nes games don't have their id internalized, so we'll have to do a crc cross-check, but "nes-wd-usa" or even its abbreviated "NWDU" is a lot more readable to me than a crc checksum.
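As a small illustration of how readable these ids are to code as well as to people, the sketch below splits the long form and derives the abbreviated one. The publisher/title split follows the reading given above for this one example and is an assumption for any other id.

```python
def split_long_id(long_id):
    """Split e.g. 'NES-WD-USA' into its parts."""
    system, code, region = long_id.upper().split("-")
    return {
        "system": system,
        "publisher": code[0],  # first letter: manufacturer (assumed)
        "title": code[1:],     # remaining letters: the game itself (assumed)
        "region": region,
    }

def short_id(long_id):
    """Abbreviate 'NES-WD-USA' to 'NWDU'."""
    parts = split_long_id(long_id)
    return parts["system"][0] + parts["publisher"] + parts["title"] + parts["region"][0]

print(short_id("nes-wd-usa"))
```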

Again, I'm not saying that using crc's for some systems isn't the way to go, but you asked me why not just use crc's.  I'm just answering you.  ;)

Howard_Casto

Re: Hypothetical Console rom Standardization.....
« Reply #17 on: July 05, 2011, 05:13:20 pm »
Ok, our petty arguing aside.  ;)

I took a break and worked on this a little more today.  Added support for reading gba headers to my format.  I also looked at ds headers.  They are the same as wii headers pretty much and I am familiar with them from my wii hacking days. ;)

I'll go ahead and add an ini for ds and possibly 3ds games.  That should add support for every major game system released between the nes and n64 era (ie all the sega and nintendo consoles).  The app I wrote uses ini files to understand how to read the various formats, so we can fill in as we go.
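The app's actual ini format isn't shown in the thread, so the following is a purely hypothetical sketch of how ini-driven header reading could work: one section per system, one `name = offset, length` line per field. The two sample sections use commonly documented offsets.

```python
import configparser

# Hypothetical format description: field name = offset, length
FORMATS_INI = """
[n64]
title = 0x20, 20
game_code = 0x3B, 4

[genesis]
serial = 0x180, 14
"""

def load_formats(text):
    parser = configparser.ConfigParser()
    parser.read_string(text)
    formats = {}
    for system in parser.sections():
        fields = {}
        for name, spec in parser[system].items():
            # int(..., 0) accepts both hex (0x20) and decimal (20) notation.
            offset, length = (int(part.strip(), 0) for part in spec.split(","))
            fields[name] = (offset, length)
        formats[system] = fields
    return formats

def read_fields(rom, fields):
    """Slice every described field out of the rom and return it as text."""
    return {
        name: rom[offset:offset + length].decode("ascii", "replace").strip("\x00 ")
        for name, (offset, length) in fields.items()
    }

formats = load_formats(FORMATS_INI)
```

Adding a new system then means adding a section to the ini rather than changing code, which fits the "fill in as we go" approach.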

I'll release it this weekend for everybody to play with. 

Keep in mind though this is NOT a renaming/standardizing app... at least not yet.  I want people to play with it and read various headers.  If and only if there are enough people interested, we can then move on to figuring out how to convert a "clean" goodtools set into the new format and get started on writing info inis for each system.

Howard_Casto

Re: Hypothetical Console rom Standardization.....
« Reply #18 on: July 05, 2011, 09:28:11 pm »
**Update**

Looked into the 32x and sega cd formats... they have IDENTICAL headers to the genesis, so I added support for those as well.  It's a shame sega games weren't as nice as the headers inside them ;)

So with the exception of the dreamcast, gamecube and wii, that pretty much does it for nintendo and sega consoles.  I also know the wii format pretty much and my understanding is that it's based on the gamecube format, which is similar to the ds format, so it shouldn't take much to add those.

It also occurred to me after doing my first "iso" file extension that perhaps adding some console detection code might be necessary.  My guess is that there are several systems that use the ".bin" extension and there are definitely several ".iso" games.
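
One way the detection could work is to ignore the extension and look for a known signature inside the file. The sketch below only uses two signatures: the "NES" + 0x1A magic at the start of an iNES dump, and the "SEGA" system string at offset 0x100 of a Genesis image. The table layout and function name are assumptions, and since 32x and sega cd share the genesis header, telling those apart would need extra checks not shown here.

```python
# (console name, offset, expected bytes). Only two well-known signatures are
# listed; anything else is reported as unknown rather than guessed.
SIGNATURES = [
    ("nes", 0x000, b"NES\x1a"),
    ("genesis", 0x100, b"SEGA"),
]


def detect_console(data):
    """Return the first console whose signature matches, or None."""
    for name, offset, magic in SIGNATURES:
        if data[offset:offset + len(magic)] == magic:
            return name
    return None
```

Usage would be something like `detect_console(open(path, "rb").read(0x200))`, falling back to the file extension when the result is None.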


Howard_Casto

Re: Hypothetical Console rom Standardization.....
« Reply #19 on: July 06, 2011, 12:00:07 am »
**Update 2**

Man I'm having trouble sleeping lately.   :banghead:

Added Virtual Boy to my little app.  On closer inspection it might be the perfect "Test Subject" for this hypothetical name standardization.  The cart has an id printed on the label, the cart id can be found within the header, the headers are official headers, and most importantly, there are just a very few games, and even fewer official games. 

We could set up an artwork pack (which is NOT illegal) and then modify the utility to rename a complete romset that a person could get elsewhere.


I'm thinking the best way to do something like this would be to convert my app into a dll.  I mean we COULD take all of the valuable info contained in the game's headers and externalize them into ini files, but why bother?  It might be nice to simply be able to "read" the game for extra info from within a fe itself.

Howard_Casto

Re: Hypothetical Console rom Standardization.....
« Reply #20 on: July 07, 2011, 01:13:57 am »
Been venturing a little deeper down the rabbit hole on this whole standardization thing. 

There is one aspect that the headers seem to leave out, no matter the system type or quality of the headers, and that is the release region.

Example:

Gyromite (USA)/Robot Gyro (JPN)

They are the same game!  They have the same abbreviated cart id, the same programming code, the same header etc.  The difference is solely in the packaging. 

So how do you handle this?

Well, GoodTools and No-Intro do it in a way that makes it hard to do proper box art: they simply cram a (USA, Japan) or a [J,E] onto the filename.  This is technically correct, but the fact that Gyromite is called Robot Gyro in Japan could mean that a collector could have "Gyromite_[J,E]_[!]" and "Robot Gyro_[J,E]_[!]" in their collection, both being technically correct, never knowing that they are wasting space by having two copies of the same rom. 
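
A quick way to spot this kind of hidden duplicate is to group files by a checksum of their contents instead of by name. This is only a sketch: `find_duplicates` is a made-up helper, the byte strings stand in for real rom data, and two different files could in principle share a CRC32, so a match is a strong hint rather than proof.

```python
import zlib
from collections import defaultdict


def find_duplicates(roms):
    """roms maps filename -> raw bytes; return groups of names sharing a CRC32."""
    by_crc = defaultdict(list)
    for name, data in roms.items():
        by_crc[zlib.crc32(data)].append(name)
    return [sorted(names) for names in by_crc.values() if len(names) > 1]


# Placeholder contents, not real rom data.
collection = {
    "Gyromite_[J,E]_[!]": b"identical program data",
    "Robot Gyro_[J,E]_[!]": b"identical program data",
    "Some Other Game": b"different program data",
}
duplicates = find_duplicates(collection)
```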

This is exactly the sort of thing that made me think it's a bad idea to use the game name as the rom name.  The game name should be looked up, that way we can compensate for these oddities. 

Another example that complicates things even further is Galactic Pinball for the Virtual Boy. 

Both games have the same name in japan and the us this time, but they still have different box art and misc materials.  Here's the rub though.  Most nintendo games have a region code built in BUT this isn't necessarily the only place the game was released.  Here's an example:

CON-C##@-REL -V

That's the typical layout of the extended product code on any nintendo release.  The "CON-C" part is worthless, it just identifies the console.  The "##" is the unique game id, or in mame artwork terms the "parent rom."  Next is where it gets crazy.  The "@" is the region code and typically it is J for Japan, E for America (think "English"), and P for Europe (think "PAL" video).  There isn't a multi-region code though, and this is where we get into trouble.  If a game is multi-region it typically has a J as the last letter (as most carts were physically manufactured in Japan) and it is also released in America because of the common NTSC video standard.  The "-REL" part of the code is the language of the country in which the game actually gets released.  So the difference between Galactic Pinball (USA)'s ID and Galactic Pinball (JAPAN)'s ID is a three-character language code tacked on to the end.  USA's id is "VUE-VGPJ-USA" and Japan's is "VUE-VGPJ-JPN".  The problem is they just pull that three-letter language code out of the ether; it doesn't exist anywhere on the cart or in the code. 

(p.s.  The "-v" is something they tack on when they do a rom revision, with "v" being the incremental update.  This is typically stored in the rom and even if it wasn't the crc of the rom would be different, so no problems there)
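
To make the layout concrete, here is a rough sketch that splits a code of the form CON-C##@-REL (with an optional trailing -V) into its parts. The regular expression and the field names are my own guess at the pattern described above, so codes that don't follow this exact shape (the shorter "nes-wd-usa" style, for example) are simply rejected.

```python
import re

# CON - C ## @ - REL, optionally followed by a "-V" revision suffix.
PRODUCT_CODE = re.compile(
    r"(?P<console>[A-Z0-9]+)-"
    r"(?P<prefix>[A-Z0-9])(?P<game_id>[A-Z0-9]{2})(?P<region>[A-Z])-"
    r"(?P<release>[A-Z]{3})"
    r"(?:\s*-(?P<revision>\w+))?"
)


def parse_product_code(code):
    """Return the named parts of an extended product code, or None."""
    match = PRODUCT_CODE.fullmatch(code.strip().upper())
    return match.groupdict() if match else None


usa = parse_product_code("VUE-VGPJ-USA")
jpn = parse_product_code("VUE-VGPJ-JPN")
```

For the two Galactic Pinball codes, everything except the `release` field comes out the same, which is exactly the problem: the only thing separating them is the part that isn't stored in the rom.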

The solution?  Well maybe a system like this:

In a front end, there should be an option for each console to set the Preferred Language; this tells the fe which title the user prefers. 

Then you look for artwork and/or the official title in the following manner.

1.  Check the extended product id (C##@-REL), which would be the most correct name/art.
2.  If nothing is found check the "Medium" product id (C##@), which is still fairly accurate.
3.  If nothing is found check the game id (##) which could be used on games like Gyromite/Robot Gyro for snapshots as the games are identical.

As for artwork authors, they should always name things as broadly as possible to avoid duplicates.  Snaps, for example, should typically be named "##" for any of these clone games and "C##@" for non-clone games, because the "@" represents the language actually on the cart and thus the games will be identical graphically.

I know this sounds like a very "Nintendo-Centric" problem but I can assure you that it exists on other consoles as well.  Sega is particularly famous for releasing the same cart in all four regions in the genesis days.  And although Sega's product ids are more numerical, they also have a similar "gameID-Region-version" format.
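
The three-step lookup could be sketched like this. The function names and the dictionary of available artwork are hypothetical; the idea is just to try the most specific key first (medium id plus the user's preferred release code), then the medium id, then the bare game id.

```python
def candidate_keys(medium_id, preferred_release):
    """Keys to try for a C##@ id, ordered from most to least specific."""
    game_id = medium_id[1:3]  # the "##" part of C##@
    return [f"{medium_id}-{preferred_release}", medium_id, game_id]


def find_artwork(medium_id, preferred_release, available):
    """Return the first artwork entry that matches, or None."""
    for key in candidate_keys(medium_id, preferred_release):
        if key in available:
            return available[key]
    return None


# Hypothetical artwork pack: one region-specific box and one shared snapshot.
artwork = {"VGPJ-USA": "box_usa.png", "GP": "snap.png"}
usa_hit = find_artwork("VGPJ", "USA", artwork)
jpn_hit = find_artwork("VGPJ", "JPN", artwork)
```

A user who prefers USA gets the USA box, while a user who prefers JPN falls through to the shared "##" entry because nobody has made a JPN-specific one yet.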


It sounds like a lot of work for nothing, but if we ever want to get this whole console artwork thing standardized, this or something similar needs to be implemented.  It sounds like it wouldn't be an issue, as most of the people around here are english speaking, until the guy in the UK looks up his nes copy of "Teenage Mutant Hero Turtles" only to find the US "Teenage Mutant Ninja Turtles" box art or something similar. 

Also, looking at the way GoodTools sets are currently done makes me think that 20-40% of our game collections consist of duplicates.  NES carts in particular pretty much always had a dual JPN/USA cart released in both countries.  If you download a "japanese pack" to add the japan exclusives in, you are also going to get hundreds of other games you already have in the US collection, just named differently.

Howard_Casto

Re: Hypothetical Console rom Standardization.....
« Reply #21 on: July 07, 2011, 01:22:41 am »
Of course when you venture into the European releases things get even more hairy.  Back in the day "P" releases were typically in the english language for the UK but were distributed into dozens of countries in europe, each with different packaging and thus a different "-REL" extension.  And then there were rare instances in which they localized the text, but again, there are only three region codes, so they did a "-REL/REL" where the first "REL" is the country it was released in and the second is the language on the cart.  Luckily these are rare and they would have a different crc value anyway.


Howard_Casto

Re: Hypothetical Console rom Standardization.....
« Reply #22 on: July 08, 2011, 01:32:59 am »
I have no clue why I keep working on this seeing as how there doesn't seem to be much interest but it's strangely addictive to add support for reading a particular game type. 


Added a name lookup table for virtual boy games, essentially making them "conversion ready". 

I also added support for neogeo-cd.  I was going to add support for neogeo aes as well, but then I realized how stupid that would be considering neogeo arcade roms are identical to the aes roms and thus those roms are already standardized.

Also because neogeo collectors are obsessive about details, there was already a master list of all neogeo releases and their id numbers at neo-geo.com and thus I made a name lookup table for neogeo cd games as well.   

I also added support for the Dreamcast. 


So in terms of the era of gaming I've been concentrating on that leaves the portable neo-geo systems, the turbo-grafx systems, the wonderswan and the atari jaguar?

I've been thinking about it and for anything older than the NES would there really be any point to standardizing the game names?  I mean the entire 2600 library, for example is around 8mb.   When you are dealing with rom sizes that small you can simply include the roms with the artwork packs.  Then again, that isn't exactly legal.

Howard_Casto

Re: Hypothetical Console rom Standardization.....
« Reply #23 on: July 08, 2011, 02:59:44 am »
Added PSX support. 

Also, much like the neo-geo, there are complete lists out there that compare the product id to the actual game title, so adding a name lookup list will be quite simple.  The problem is sony fanboys seem to be in constant argument over which list is the best, so I don't know which one to convert. 

Any help on that?

As a side note: 

Man you would think that a console as popular as the playstation would have header lookup info readily available on the net but anytime I searched for "psx header" it linked to people complaining about not being able to pirate a game! 

Howard_Casto

Re: Hypothetical Console rom Standardization.....
« Reply #24 on: July 08, 2011, 04:09:07 pm »
I found the "official" psx id list and converted it into a lookup table.

Playstation IDs are weird though.  They have a 4-letter (sometimes 3, sometimes 5) code which "sort of" indicates the region, plus a release number.  The tricky thing is the release number is the release number for that region, so disc 000001 in japan and disc 000001 in the US are totally different games!  This means no "parent rom" distinction, which also means a butt load of artwork.   But hey, there's nothing you can do. 

I want to clean things up a bit, but I think I want to go ahead and release it tomorrow so everybody can play with it. 

Howard_Casto

Re: Hypothetical Console rom Standardization.....
« Reply #25 on: July 09, 2011, 06:06:53 pm »
I ran across a little site called "nesworld".  It has the most complete list of nintendo games and their ids I've seen and it's regularly updated, so I wrote a little program to convert their html tables to the gamelist format.  It still needs a bit of work, but that will do it for the n64 titles and I have a list for the nes games as well.  The problem with nes games though is that the pid isn't contained in the header, so I'll have to do a crc-to-id list for it as well.  

I can't just do a crc of the file btw, and this is why rom managers like romcenter fail so miserably with console titles.  See, nes roms have an emulator-created header tacked on to them.  Of course this header isn't official and it can vary or might not even exist!  So to do a crc with nes games (like most roms) you need to do a crc of the rom data itself, skipping the header.  So you can't just do a standard file check.
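
A minimal sketch of a headerless crc, assuming the common iNES layout where the file starts with the 4-byte magic "NES" + 0x1A and the header is 16 bytes long. If the magic isn't there the file is treated as already headerless. Other header variants and optional trainer blocks are not handled here.

```python
import zlib

INES_MAGIC = b"NES\x1a"
INES_HEADER_SIZE = 16


def headerless_crc32(data):
    """CRC32 of the rom data, skipping a leading iNES header if present."""
    if data[:4] == INES_MAGIC:
        data = data[INES_HEADER_SIZE:]
    return zlib.crc32(data) & 0xFFFFFFFF


def format_crc(value):
    """Render a CRC the way rom lists usually print it: 8 uppercase hex digits."""
    return f"{value:08X}"


# Placeholder bytes, not a real dump.
payload = b"PRG and CHR data"
with_header = INES_MAGIC + bytes(12) + payload
```

With this, `headerless_crc32(with_header)` and `headerless_crc32(payload)` give the same value, while a plain `zlib.crc32(with_header)` does not, which is the mismatch a standard file check would run into.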


Oddly enough nesworld doesn't seem to have master lists for the snes???
« Last Edit: July 09, 2011, 08:13:03 pm by Howard_Casto »

Howard_Casto

Re: Hypothetical Console rom Standardization.....
« Reply #26 on: July 12, 2011, 08:24:51 pm »
I guess I'm just talking to myself at this point, but I'll make a post anyway just for documentation purposes.

Found another nes site called NesCartDB.  The people over there seem to understand the problem and are working on a similar solution, but unfortunately it's only for nes/famicom games.  This is why such projects never get off the ground, it needs to be universal! 

But the bright side is they have this huge database of useful information, and this info includes the headerless crc32 (remember where I said that was the way to go for nes games specifically?), the product ID and the official game title.   On top of this they have an xml file of the whole database available for parsing!  This means I can write a program to pull the data we need out of the xml and make a gamelist and crc lookup list for nes games!
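
Roughly, the conversion could look like the sketch below. The xml layout here is invented for illustration and is not NesCartDB's real schema, the element and attribute names are assumptions, and the crc value is a placeholder rather than a real checksum.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: one <game> element per release, with made-up attributes.
SAMPLE_XML = """
<database>
  <game name="Double Dragon" code="NES-WD-USA" crc="DEADBEEF"/>
</database>
""".strip()


def build_tables(xml_text):
    """Return (crc -> product id, product id -> official title) lookups."""
    root = ET.fromstring(xml_text)
    crc_to_id = {}
    id_to_name = {}
    for game in root.iter("game"):
        product_id = game.get("code")
        crc_to_id[game.get("crc").upper()] = product_id
        id_to_name[product_id] = game.get("name")
    return crc_to_id, id_to_name


crc_to_id, id_to_name = build_tables(SAMPLE_XML)
```

The first table covers the crc cross-check that nes games need, and the second is the same id=Game Name list the other systems use.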

I added crc32 support to my little app and thus far the crcs of my personal nes collection match up to their crcs. 

So that takes care of NES support!

What's left is those pesky snes games, and a game list for the sega games.  The sega games are particularly troublesome because none of the sega fansites seem to list the product ID (not surprising considering it's internal and seldom printed on the cart).