I need some help parsing a document

dd900
Posts: 121
Joined: 27 Oct 2013, 16:03

21 Apr 2021, 09:56

I'm planning to create an SQL database to hold metadata for video games, so that I can import and export data in the various formats used by different frontends. Most of the code is fairly easy, since I already know SQL, and XML (used by most frontends) is pretty straightforward. I'm having an issue with one of the frontend formats, though: the Pegasus Frontend metafile format, which uses the Debian control file format.

I know I could hard-code all the keys, but then I would have to check every line against the list of keys. I've also considered regex, but my regex skills are weak at best. Every key/value pair is delimited by ":", but ":" can also appear in the value, and as far as I know there is no escaping; everything in the value is treated as literal. Preferably, I would like to parse each "game" into its own object. Any suggestions?
        mikeyww
        Posts: 26937
        Joined: 09 Sep 2014, 18:38

        Re: I need some help parsing a document

        21 Apr 2021, 10:48

        Code:

        arr := {}
        str =
        (
        game: 005
        file: ./005.zip
        sort-by: 005
        developer: S:E:GA
        publisher: SEGA
        genres:
        )
        For index, line in StrSplit(str, "`n")
         RegExMatch(line, "(.+?): *(.*)", part), arr[part1] := part2
        MsgBox, 64, arr.developer, % arr.developer
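
The lazy `(.+?):` in the pattern above stops at the first colon, so any later colons stay in the value. For anyone who prefers to avoid regex, the same split can be sketched with InStr and SubStr. This is only a minimal sketch: `SplitKeyValue` is a hypothetical helper name, not an AutoHotkey built-in, and it assumes one "key: value" pair per line.

```autohotkey
; Hypothetical helper (not built into AutoHotkey): split a line at the
; FIRST colon only, so colons inside the value are kept as-is.
SplitKeyValue(line, ByRef key, ByRef value) {
    pos := InStr(line, ":")                  ; 1-based position of the first colon, 0 if none
    if (!pos)
        return false                         ; not a "key: value" line
    key   := Trim(SubStr(line, 1, pos - 1))  ; text before the first colon
    value := Trim(SubStr(line, pos + 1))     ; everything after it, later colons included
    return true
}

if (SplitKeyValue("game: 1941: Counter Attack", k, v))
    MsgBox, % k " => " v
```

Because only the first colon is treated as the delimiter, a title such as "1941: Counter Attack" comes back whole in `v`.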
        dd900

        21 Apr 2021, 13:29

        Thanks. That would work when there is one value per line. But almost any key can also be plural, which makes it a list whose values are listed one per line, starting on or after the line the key is on. Values like "description" can span multiple lines and can contain a single "." on a line to mark a paragraph break. This seems like it should be really simple... Writing the document is very easy. I wish parsing it was too.
          mikeyww

          21 Apr 2021, 13:31

          You have described the input but not an example of the intended output.
          dd900

          21 Apr 2021, 13:35

          An SQL database is the final destination for the data, and I've got that part covered. What I need is to turn each "game" entry into an object whose keys match the metafile keys, or to turn the whole file into one object, like JSON.Load(file). The rest of the code is mostly there already.
          mikeyww

          21 Apr 2021, 14:25

          Code:

          game := {}
          str =
          ( %
          game: 1941: Counter Attack
          file: ./1941.zip
          sort-by: 1941: Counter Attack
          developer: Capcom
          publisher: Capcom
          genres:
          	Shooter
          	Flying Vertical
          release: 1990-02-01
          players: 2
          rating: 80%
          description:
          	The goal is to shoot down enemy airplanes and collect weapon power-ups (POW). One is only able to perform three loops per level and a bonus is awarded at the end of the level for unused loops. Player 1 uses a P-38 Lightning and Player 2 uses a Mosquito Mk IV. The game shifts from the original Pacific Front setting with that of the Western Front.
          	.
          	The game consists of six levels.
          	.
          	It was the first shoot 'em up to add +1 to the score when a continue is used.[1]
          assets.boxfront: ./media/mixart/1941.png
          assets.video: ./media/snap/1941.mp4
          assets.logo: ./media/wheel/1941.png
          assets.screenshot: ./media/screenshot/1941.png
          assets.background: ./media/background/1941.jpg
          
          
          game: Gran Turismo 2
          files:
          	./Gran Turismo 2 (USA) (Arcade Mode) (Rev 1).chd
          	./Gran Turismo 2 (USA) (Simulation Mode) (Rev 2).chd
          description:
          	Gran Turismo 2 is fundamentally based on the racing game genre. The player must maneuver an automobile to compete against artificially intelligent drivers on various race tracks. The game uses two different modes: arcade and simulation. In the arcade mode, the player can freely choose the courses and vehicles they wish to use. However, the simulation mode requires the player to earn driver's licenses, pay for vehicles, and earn trophies in order to unlock new courses. Gran Turismo 2 features nearly 650 automobiles and 27 racing tracks.
          developer: Polyphony Digital
          publisher: Polyphony Digital
          genre: Racing
          release: 2000-01-12
          players: 2
          rating: 80%
          assets.background: ./media/background/Gran Turismo 2 (USA) (Arcade Mode) (Rev 1).jpg
          assets.boxfront: ./media/mixart/Gran Turismo 2 (USA) (Arcade Mode) (Rev 1).png
          assets.logo: ./media/wheel/Gran Turismo 2 (USA) (Arcade Mode) (Rev 1).png
          assets.screenshots: ./media/screenshot/Gran Turismo 2 (USA) (Arcade Mode) (Rev 1).png
          assets.videos: ./media/snap/Gran Turismo 2 (USA) (Arcade Mode) (Rev 1).mp4
          
          
          game: Grand Theft Auto
          file: ./Grand Theft Auto (USA).chd
          description:
          	In Grand Theft Auto, you assume the role of a street thug working for an organized syndicate of criminals that specializes in stealing cars. Every time your pager beeps, the boss needs you to steal something for him -- if you don't get to a payphone and answer the message, you'll be in hot water (or is that cold water at the bottom of the East River).
          	After receiving your mission, you'll have to navigate city streets that are bustling with pedestrians, cops (your worst enemies), and traffic jams. Because you're pressed for time, you can't just walk to your destination on foot -- that would take too long. Instead, you'll have to go car jacking, the art of physically removing people out of their vehicles and stealing it.
          	.
          	.
          	.
          	This is the premise behind Grand Theft Auto, an overhead action game of thievery. There are four large cities in which you'll have to steal cars, shoot rival gang members, and complete missions for the boss. Each mission requires you to get something from point "A" to point "B". For example, the head honcho wants to pull off a bank heist -- instead of using regular cars, they want you to steal a couple taxicabs for them. This way, the cops won't be able to find them as easily in traffic.
          developer: DMA Design
          publisher: DMA Design
          genre: Action
          release: 1997-12-01
          players: 1
          rating: 70%
          assets.boxfront: ./media/mixart/Grand Theft Auto (USA).png
          assets.logo: ./media/wheel/Grand Theft Auto (USA).png
          assets.screenshots: ./media/screenshot/Grand Theft Auto (USA).png
          assets.videos: ./media/snap/Grand Theft Auto (USA).mp4
          )
          Loop, Parse, str, `n
          { If ("" = line := Trim(A_LoopField))
             Continue
            If RegExMatch(line, "^[\w.-]+:") { ; Line starts with a key (hyphen allowed so that "sort-by" is recognized)
             RegExMatch(line, "(.+?): *(.*)", part)
             If (part1 = "game") {
              gameName := part2
              Continue
             }
            } Else part2 := line = "." ? "`n"
               : (game[gameName, part1] > "" ? (SubStr(game[gameName, part1], 0) = "`n" ? "" : ", ") : "") line
            game[gameName, part1] .= part2
          }
          MsgBox, 64, game["1941: Counter Attack"].developer, % game["1941: Counter Attack"].developer
          MsgBox, 64, game["1941: Counter Attack"].description, % game["1941: Counter Attack"].description
          MsgBox, 64, game["1941: Counter Attack"].genres, % game["1941: Counter Attack"].genres
          MsgBox, 64, game["Grand Theft Auto"].developer, % game["Grand Theft Auto"].developer
          MsgBox, 64, game["Grand Theft Auto"]["assets.logo"], % game["Grand Theft Auto"]["assets.logo"]
          MsgBox, 64, game["Gran Turismo 2"].genre, % game["Gran Turismo 2"].genre
          MsgBox, 64, game["Gran Turismo 2"].files, % game["Gran Turismo 2"].files
          MsgBox, 64, game["Gran Turismo 2"].description, % game["Gran Turismo 2"].description
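
The same approach can be wrapped in a function that returns one object per game, closer to the JSON.Load(file) style asked for earlier in the thread. The following is a minimal sketch under stated assumptions, not the Pegasus parser: `ParseMetafile` is a hypothetical name, continuation lines are assumed to start with whitespace while key lines do not, every value is stored as an array of lines (a lone "." becomes an empty line), and lines that appear before the first "game:" key are skipped.

```autohotkey
; Hypothetical helper: returns an array of game objects.
; Each game maps key -> array of value lines.
ParseMetafile(text) {
    games := []                      ; one object per "game:" entry
    game := "", key := ""
    Loop, Parse, text, `n, `r
    {
        raw := A_LoopField
        line := Trim(raw)
        if (line = "")
            continue                 ; blank lines carry no data
        if (RegExMatch(raw, "^\s")) {
            ; Indented line: another value line for the current key.
            if (IsObject(game) && key != "")
                game[key].Push(line = "." ? "" : line)
            continue
        }
        ; Unindented line: split at the first colon into key and value.
        if (!RegExMatch(line, "^([^:]+?)\s*:\s*(.*)$", m))
            continue                 ; not a "key: value" line
        key := m1
        if (key = "game") {          ; a "game" key starts a new entry
            game := {}
            games.Push(game)
        }
        if (!IsObject(game))
            continue                 ; assumption: ignore keys before the first game
        game[key] := []
        if (m2 != "")
            game[key].Push(m2)       ; value given on the key's own line
    }
    return games
}

sample := "game: Gran Turismo 2`n"
    . "files:`n"
    . "`t./Gran Turismo 2 (USA) (Arcade Mode) (Rev 1).chd`n"
    . "`t./Gran Turismo 2 (USA) (Simulation Mode) (Rev 2).chd`n"
    . "genre: Racing`n"
    . "`n"
    . "game: Grand Theft Auto`n"
    . "file: ./Grand Theft Auto (USA).chd`n"

games := ParseMetafile(sample)
MsgBox, % games.Length() " games; the first lists " games[1].files.Length() " files"
```

Checking the untrimmed line for leading whitespace, rather than matching a key pattern on the trimmed line, means a description line that happens to begin with a word and a colon is not mistaken for a key. Keeping every value as an array leaves it to the caller to decide whether to join lines with ", " or with newlines before writing to SQL.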
          dd900

          21 Apr 2021, 15:09

          Nice, thank you. I'll run that through some full files when I get a chance. I found the code the frontend uses to parse the file (PegasusMetadata.cpp). It looks like the keys are all hard-coded and regex is used.
