
Topic: How to scrape media in Windows with Skyscraper (for dummies)


FinnJävel

Sometimes you just want to use your PC to scrape media files for your games. Skyscraper doesn't have any official support for the Windows version and I didn't find any PC-related instructions online, so I scratched my head for quite some time and put together a short(ish) guide for any other dummies out there. There really isn't that much to it; this post may just be over-explaining.

Basically, scraping with Ss involves two stages:
a) scraping the media (images and/or videos) from a website (or your own media files) to Skyscraper’s own resource cache
b) writing the gamelist.xml and copying the media to your preferred folders. (You might not even care about the gamelist, as frontends can often write one themselves.)

So this is how you do the whole thing in four steps:

1) Download Skyscraper for Windows: At the time of writing it's at http://www.muldjord.com/downloads/Skyscraper_3.4.6_unsupported_win_version.zip but if it has moved, go to https://github.com/muldjord/skyscraper and scroll down to "How to install Skyscraper" > Windows.

Unpack the zip somewhere (preferably a short path).

As the readme file says, you'll "need to copy all folders from within the "deploy" folder to the C:\Users\<YOURUSER>\ folder. If you don't IT WON'T RUN!!! You need to copy them so you end up with the following folders:
C:\Users\<YOURUSER>\.skyscraper
C:\Users\<YOURUSER>\RetroPie"
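
If you'd rather not drag the folders over by hand, the copy can be scripted. This is only a rough Python sketch, assuming you have Python 3.8 or newer installed (Skyscraper itself doesn't need it). The function name copy_deploy_folders is made up for this example, and you have to point it at your own "deploy" folder.

```python
import shutil
from pathlib import Path

def copy_deploy_folders(deploy_dir, user_dir):
    """Copy every folder inside deploy_dir into user_dir; return the copied names."""
    copied = []
    for entry in sorted(Path(deploy_dir).iterdir()):
        if entry.is_dir():
            # dirs_exist_ok (Python 3.8+) merges into a folder that already exists
            shutil.copytree(entry, Path(user_dir) / entry.name, dirs_exist_ok=True)
            copied.append(entry.name)
    return copied
```

Calling it with your deploy folder and Path.home() (which is C:\Users\<YOURUSER> on Windows) should leave you with the two folders listed above. Loose files in "deploy" are ignored, only folders are copied.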

2) We'll be using the good ol' Windows command prompt (cmd) for this. I'll assume you can navigate to the folder where you unpacked Skyscraper, so that you can run skyscraper.exe from there.

3) Now we begin the scraping.
The basic command is: skyscraper -<COMMAND> -<COMMAND2> etc. You get a full list of commands with skyscraper -?.

If you just want to get the job done without too much studying, here are my command lines and the breakdown:

Code:
skyscraper -s openretro -p amiga -i "E:\FS-UAE\lha2" --videos
- As I was scraping Amiga, -s gets the media from openretro.org, but other sites work better for other platforms. Check out https://github.com/muldjord/skyscraper/blob/master/docs/SCRAPINGMODULES.md for details. You can scrape several sites (or modules, as they're called in Ss) for the same game folder; by default Skyscraper skips the games it has already scraped.
- The -p platform this time is amiga, but it can be nes, 3do, daphne or whatever. The command skyscraper -? displays the full list.
- I have my games in E:\FS-UAE\lha2, so the -i points there. Edit as needed.
- --videos also scrapes the video files, as this is not enabled by default (note the two dashes).

When you hit ENTER, the scraping begins and can take a while (that’s why I had my games in several folders and did them one at a time).
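
Doing several folders one at a time can be automated. Below is a minimal Python sketch, not anything that ships with Skyscraper: it just builds the same command line as above for each folder and runs them in order. The flags are the ones already explained; the function names are hypothetical helpers of my own, and it assumes you start it from the Skyscraper folder (or have skyscraper.exe on your PATH).

```python
import subprocess

def build_scrape_command(module, platform, input_folder, videos=False):
    """Return the scraping command line from step 3 as an argument list."""
    cmd = ["skyscraper", "-s", module, "-p", platform, "-i", input_folder]
    if videos:
        cmd.append("--videos")  # videos are not scraped by default
    return cmd

def scrape_folders(module, platform, folders, videos=False):
    """Run one scrape per folder, in order; stop at the first failure."""
    for folder in folders:
        # check=True raises CalledProcessError on a non-zero exit code
        subprocess.run(build_scrape_command(module, platform, folder, videos), check=True)
```

For my Amiga example that would be scrape_folders("openretro", "amiga", [r"E:\FS-UAE\lha2"], videos=True), with more folders added to the list as needed. Passing the arguments as a list means you don't have to worry about quoting paths with spaces.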

Full explanation of commands: https://github.com/muldjord/skyscraper/blob/master/docs/CLIHELP.md

Here’s another example of a scraping command line:

Code:
skyscraper -s screenscraper -u <MYUSERNAME:MYPASSWORD> -p msx --nowheels --nomarquees -i "E:\msx\roms"
- Now we'll scrape (-s) screenscraper.fr, which needs an account (it's free), so we'll use -u to specify our login:pass.
- Scraping -p msx now.
- I don't want wheels or marquees for this one, so --nowheels --nomarquees.
- Games are in E:\msx\roms

When a scrape is done, you’ll see a summary of successful/missed scrapes. If you’re happy with this, you can proceed to

4) writing/compiling a gamelist and copying the media neatly into folders.
NOTE: By default, Ss will create a composite image as detailed by artwork.xml in C:\Users\<YOURUSER>\.skyscraper.
If you just want folders with clean media, edit artwork.xml to look like this (file also attached, rename to artwork.xml):
Code:
<artwork>
  <output type="cover" width="512" height="597"/>
  <output type="screenshot" width="640" height="480"/>
  <output type="marquee" width="640" height="480"/>
  <output type="wheel" width="640" height="480"/>
  <output type="video"/>
</artwork>

(Edit the dimensions as you wish, of course, or leave them out for no resizing.) Details: https://github.com/muldjord/skyscraper/blob/master/docs/ARTWORK.md


Then write the gamelist and copy the media with this command:
Code:
skyscraper -p amiga --unattendskip -i "E:\FS-UAE\lha2" -g "E:\FS-UAE\lha2"
- We just leave out the -s, so Skyscraper rummages through its cache and does its thing.
- --unattendskip relieves you from answering on-screen prompts.
- The -i is probably unnecessary here, actually…
- I want the media in E:\FS-UAE\lha2. Edit as you wish. -g will copy it there into a new media folder that Ss creates for you. Screenshots, marquees etc. all get their own folders automatically; you don't have to create anything yourself.

DONE!

You'll find more detailed info on Skyscraper's github, but here are a couple of tips:

- If you are planning to scrape like this a lot, it might be easier to put your most often used options in an .ini file. There is a config.ini in the C:\Users\<YOURUSER>\.skyscraper folder, so you can (take a backup and) play with it. -c config<NAME>.ini will load the named .ini file if you want to bypass the default config.ini. Detailed info on you-know-where….

- -m lets you adjust the accuracy of the scrape search results, as matching is often done by reading the filename of the game and comparing it to the website's list. -m sets the minimum percentage needed to match the filename: lower numbers get more matches but often false ones, while higher numbers get fewer but more accurate matches. The default is 65, but if your scrape doesn't return any/enough hits (check the summary at the end of the scrape), you might want to try a lower number.
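
To get a feel for what a minimum match percentage does, here is a toy Python illustration. Big caveat: this is NOT Skyscraper's real matching code. It uses Python's difflib as a stand-in similarity measure, and the function names and game titles are only examples, so the actual percentages Ss works with will differ.

```python
from difflib import SequenceMatcher

def match_percent(file_title, site_title):
    """Similarity of two titles as a percentage (stand-in metric, not Skyscraper's)."""
    ratio = SequenceMatcher(None, file_title.lower(), site_title.lower()).ratio()
    return round(ratio * 100)

def accepted_matches(file_title, site_titles, min_match=65):
    """Keep only the site titles that reach the minimum match percentage."""
    return [t for t in site_titles if match_percent(file_title, t) >= min_match]
```

The point is only the trade-off: whatever accepted_matches lets through at a high min_match is also let through at a lower one, and lowering the number can only add candidates, including wrong ones.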

- -s import scrapes your own media into the cache. You can have media from several websites (modules) in your cache before you write a gamelist, so you can scrape your own media into the cache first and save time by not downloading stuff you already have. This one can be a bit tricky, though: the name of the media file must match the game filename exactly. No 65-rules here. Also, you'll have to copy your media into the screenshots, covers, wheels, marquees or videos folders under C:\Users\<YOURUSER>\.skyscraper\import. Ss reads them into the cache from there blindly, so if you have cover images in your screenshots folder….
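
Because the import names have to match exactly, it can save you a wasted run to check them first. A hedged Python sketch follows; it is my own helper (unmatched_media is a made-up name), not a Skyscraper feature. It compares file names without their extensions, which is one reading of "must match exactly", so treat that as an assumption.

```python
from pathlib import Path

def unmatched_media(media_folder, games_folder):
    """List media files whose name (minus extension) matches no game file."""
    game_names = {p.stem for p in Path(games_folder).iterdir() if p.is_file()}
    return sorted(
        p.name
        for p in Path(media_folder).iterdir()
        if p.is_file() and p.stem not in game_names
    )
```

Point it at one import folder at a time, for example the screenshots folder and your games folder. Anything it lists needs renaming before you run -s import.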

- --cache report:missing=<cover>,<screenshot> gives you a list of games that were not scraped. The report is created in C:\Users\<YOURUSER>\.skyscraper

Hope that makes sense and is useful to someone somewhere...