pyIFBabel: a Python implementation of the Treaty of Babel

jakobcreutzfeldt · April 5, 2012, 9:59am

I’ve taken the Treaty of Babel code from Grotesque, cleaned it up greatly, packaged it nicely and made it available separately so other developers can put it to use. This is a pure-Python implementation of the Treaty of Babel, which you can use to identify (i.e. find the IFID of) any story file covered by the Treaty. It also handles metadata and cover art extraction for the story formats (tads2/tads3) and wrappers (blorb) that support it. It is, of course, a bit slower than the official C API, but the code is (hopefully) a bit more Pythonic than bindings to the C API would be. For my purposes, at least, the speed isn’t a problem.

The package includes a script called pyIFBabel which replicates the babel commandline utility distributed on the official Treaty page. It’s mostly for demonstration purposes, though I suppose you could use it if you want to.

This is a beta release, so future releases may make some small API changes.

Version 0.2.2
What works:

all story format and wrapper handlers
IFiction XML parsing and creation
blorb extraction
IFDB IFiction retrieval

What doesn’t work:

IFiction XML verification/lint, sparse file completion
blorb creation
Python 3 support

Download & Install
Source: download
To install: extract the archive and run python setup.py install in the package directory

PyPI: pyIFBabel is available on PyPI so you can install it via pip install pyIFBabel.

Arch Linux: pyIFBabel is available in the AUR.

Development
You can fork my git repository at Gitorious.

Note:
Most of the code was written from scratch but some of it was admittedly translated directly from the original C code so credit must be given to L. Ross Raszewski for that.

jakobcreutzfeldt · April 5, 2012, 10:05am

Example:

Python 2.7.2 (default, Jan 31 2012, 13:19:49) 
[GCC 4.6.2 20120120 (prerelease)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from treatyofbabel import babel
>>> babel.get_ifids("tests/glulx/FerrousRing.ulx")
['GLULX-1-070928-AD26C29D']
>>> babel.deduce_format("tests/glulx/FerrousRing.ulx")
'glulx'
>>> babel.get_ifids("tests/blorb/AnchorheadDemo.gblorb")
[u'F76B2529-27F8-44ED-B643-C5F03492881D']
>>> print babel.get_meta("tests/blorb/AnchorheadDemo.gblorb")
<?xml version="1.0" encoding="UTF-8"?>
<ifindex version="1.0" xmlns="http://babel.ifarchive.org/protocol/iFiction/">
    <story>
        <identification>
            <ifid>F76B2529-27F8-44ED-B643-C5F03492881D</ifid>
            <format>glulx</format>
        </identification>
        <bibliographic>
            <title>Anchorhead: Special Edition Demo</title>
            <author>Michael Gentry</author>
            <headline>prelude to a tale of horror</headline>
            <genre>Horror</genre>
            <firstpublished>2007</firstpublished>
            <description>This is the demo version for Anchorhead: Special Edition, to be released in November 2007. It is not a complete game.</description>
            <language>en</language>
            <group>Inform</group>
        </bibliographic>
        <releases>
            <attached>
                <release>
                    <releasedate>2007-02-02</releasedate>
                    <version>3</version>
                    <compiler>Inform 7</compiler>
                    <compilerversion>4K41</compilerversion>
                </release>
            </attached>
        </releases>
        <colophon>
            <generator>Inform 7</generator>
            <generatorversion>4K41</generatorversion>
            <originated>2007-02-02</originated>
        </colophon>
        <glulx>
            <serial>070202</serial>
            <release>3</release>
            <compiler>Inform 7 build 4K41</compiler>
        </glulx>
    </story>
</ifindex>

>>> babel.deduce_format("tests/blorb/AnchorheadDemo.gblorb")
'blorbed glulx'
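
For anyone curious where the metadata in a blorb comes from, here is a minimal stdlib-only sketch. It is not pyIFBabel's code, and the function names are made up for illustration. It relies only on the blorb layout: an IFF file that starts with FORM, a 4-byte big-endian length and IFRS, followed by chunks, with the IFiction record stored in a chunk of type IFmd.

```python
import io
import struct


def iter_blorb_chunks(stream):
    """Yield (chunk_id, data) for each top-level chunk of a blorb stream."""
    header = stream.read(12)
    # IFF container: b"FORM", 4-byte big-endian length, form type b"IFRS".
    if len(header) < 12 or header[:4] != b"FORM" or header[8:12] != b"IFRS":
        raise ValueError("not a blorb file")
    while True:
        chunk_header = stream.read(8)
        if len(chunk_header) < 8:
            return  # end of file
        chunk_id = chunk_header[:4]
        (length,) = struct.unpack(">I", chunk_header[4:])
        data = stream.read(length)
        if length % 2:
            stream.read(1)  # chunks are padded to an even length
        yield chunk_id, data


def read_ifmd(stream):
    """Return the raw bytes of the IFmd (metadata) chunk, or None."""
    for chunk_id, data in iter_blorb_chunks(stream):
        if chunk_id == b"IFmd":
            return data
    return None
```

Opening a .gblorb in binary mode and passing the handle to read_ifmd should give back the embedded IFiction XML as bytes, or None if the blorb carries no metadata chunk.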

The pyIFBabel commandline tool:

$ ./pyIFBabel --identify tests/tads/Banana.t3 
"The Quest of the Golden Banana" by Eric Eve
IFID: TADS3-C2DAA2AFED843DA41084EA1031BDF250
tads3, 709k, no cover

bcressey · April 5, 2012, 3:00pm

Excellent, thanks for making this!

zarf · April 5, 2012, 5:04pm

Sweet.

jakobcreutzfeldt · May 5, 2012, 12:29pm

I’ve released version 0.2 today (see the original post for updated links).

This release includes some bug fixes in the treatyofbabel.ifiction module, which should work a bit more smoothly now. Additionally, I’ve fixed up the previously non-functional treatyofbabel.ifstory module and the IFStory class that it defines. This module gives you an object-oriented means of working with story files and is the recommended way of working with the library.

I still haven’t written proper documentation, but here is something like a tutorial:

There are two primary ways to use this library. The first way is via the treatyofbabel.babel and treatyofbabel.ifiction modules. The babel module contains functions for extracting metadata from interactive fiction story files. Until proper documentation is written, it is best to use Python’s introspection capabilities to view the available functions (dir(treatyofbabel.babel)). In general, you will pass these functions a string referring to the file’s location. So, for example, to determine the format of the file (e.g. glulx, tads2, etc.), you would do:

>>> babel.deduce_format("path/to/file")

To calculate the IFID(s) of a file, you would do:

>>> babel.get_ifids("path/to/file")
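
As a rough idea of what format deduction involves, here is a stdlib-only sketch that recognises just two cases from the first bytes of a file: Glulx story files, which begin with the magic number Glul, and blorb wrappers, which begin with FORM, a 4-byte length, then IFRS. This is an illustration and not how babel.deduce_format is implemented. The function names are hypothetical, and the real function covers many more formats and also reports what is inside a blorb (e.g. 'blorbed glulx'), which this sketch does not attempt.

```python
def sniff_container(first_bytes):
    """Guess a format from the first 12 bytes of a file, or return None."""
    if first_bytes[:4] == b"FORM" and first_bytes[8:12] == b"IFRS":
        return "blorb"
    if first_bytes[:4] == b"Glul":
        return "glulx"
    return None  # anything else is out of scope for this sketch


def sniff_file(path):
    """Read the first 12 bytes of the file at path and sniff them."""
    with open(path, "rb") as handle:
        return sniff_container(handle.read(12))
```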

The ifiction module contains functions for working with IFiction files. These are simply XML files, so the module essentially just consists of IFiction-specific convenience functions. The functions typically return objects belonging to the built-in Python module xml.dom.minidom, though you generally won’t have to do anything with these other than pass them between ifiction functions. One can build up an IFiction file manually:

>>> ifdom = ifiction.create_ifiction_dom()
>>> story = ifiction.add_story(ifdom)
>>> ifiction.add_identification(ifdom, story, list_of_ifids, story_format, story_bafn)
>>> ifiction.add_bibliographic(ifdom, story, truncate=False, title="My Story",
...     author="Pat Smith")

…and so on. Alternatively, given an IFiction file, you can extract information from it:

>>> ifdom = ifiction.get_ifiction_dom("path/to/file.ifiction")
>>> assert ifiction.is_ifiction(ifdom)
>>> stories = ifiction.get_all_stories(ifdom)
>>> story = stories[0]
>>> ident = ifiction.get_identification(story)

…and so on.
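
Since IFiction files are plain XML, the same kind of extraction can also be sketched with nothing but xml.dom.minidom. The helper names below (_text, summarize_stories) are hypothetical and are not part of treatyofbabel.ifiction. The element names are the ones visible in the sample IFiction output earlier in the thread.

```python
from xml.dom import minidom

SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<ifindex version="1.0" xmlns="http://babel.ifarchive.org/protocol/iFiction/">
    <story>
        <identification>
            <ifid>F76B2529-27F8-44ED-B643-C5F03492881D</ifid>
            <format>glulx</format>
        </identification>
        <bibliographic>
            <title>Anchorhead: Special Edition Demo</title>
            <author>Michael Gentry</author>
        </bibliographic>
    </story>
</ifindex>
"""


def _text(node):
    """Concatenate the direct text children of a DOM element."""
    return "".join(child.data for child in node.childNodes
                   if child.nodeType == child.TEXT_NODE).strip()


def summarize_stories(xml_bytes):
    """Return one dict (ifids, title, author) per <story> element."""
    dom = minidom.parseString(xml_bytes)
    summaries = []
    for story in dom.getElementsByTagName("story"):
        titles = story.getElementsByTagName("title")
        authors = story.getElementsByTagName("author")
        summaries.append({
            "ifids": [_text(n) for n in story.getElementsByTagName("ifid")],
            "title": _text(titles[0]) if titles else None,
            "author": _text(authors[0]) if authors else None,
        })
    return summaries
```

Calling summarize_stories(SAMPLE.encode("utf-8")) gives a one-element list describing the Anchorhead demo. The library's own functions remain the better choice for real use, since they know the full IFiction vocabulary.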

There is another, object-oriented means of doing this, contained in the treatyofbabel.ifstory module. This module defines the IFStory class, which has fields corresponding to the information contained in an IFiction file. There are various ways you can use this class. You can build up a story description manually:

>>> story = ifstory.IFStory()
>>> story.ifid_list = ["ZCODE-88-840726-A129"]
>>> story.bibliographic["title"] = "Zork I"

This is tedious, though, so better options are available. If you have an IFiction file, you can use that to fill in all of the fields:

>>> ifdom = ifiction.get_ifiction_dom("path/to/file.ifiction")
>>> story_node = ifiction.get_all_stories(ifdom)[0]
>>> story = ifstory.IFStory(ific_story_node=story_node)
>>> print story.bibliographic["title"]
"Zork I"

You can use the capabilities provided by the treatyofbabel.babel module to automatically fill in some information (note that for most formats, this will only be able to generate an IFID and determine the story format):

>>> story = ifstory.IFStory(story_file="path/to/storyfile")
>>> story.load_from_story_file()
>>> print story.ifid_list
["ZCODE-88-840726-A129"]

Given an IFID, you may fill in the rest of the fields and optionally fetch cover art by remotely querying the IFDB (ifdb.tads.org) [continuing from the last example]:

>>> story.load_from_ifdb()
>>> print story.bibliographic["title"]
"Zork I"
>>> print story.cover.img_format
"jpg"
>>> with open("ZorkI.jpg", "wb") as img_handle:
...     img_handle.write(story.cover.data)
>>> with open("ZorkI.ifiction", "w") as ific_handle:
...     ific_handle.write(story.to_ifiction())

Once again, until proper documentation has been written, it is recommended to use introspection or to read the (hopefully readable) code to see all of the functions available.
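
To give a feel for the kind of document that ends up in the .ifiction file, here is a stdlib-only sketch that builds a minimal IFiction record with xml.dom.minidom. It is not how IFStory.to_ifiction() is implemented. The build_ifiction name and its parameters are hypothetical, and only the identification and bibliographic blocks seen earlier in the thread are produced.

```python
from xml.dom import minidom

IFICTION_NS = "http://babel.ifarchive.org/protocol/iFiction/"


def build_ifiction(ifids, story_format, title, author):
    """Return a minimal IFiction document as UTF-8 encoded bytes."""
    doc = minidom.Document()
    index = doc.createElement("ifindex")
    index.setAttribute("version", "1.0")
    index.setAttribute("xmlns", IFICTION_NS)
    doc.appendChild(index)
    story = doc.createElement("story")
    index.appendChild(story)

    def add(parent, tag, text=None):
        # Create <tag>text</tag> under parent and return the new element.
        node = doc.createElement(tag)
        if text is not None:
            node.appendChild(doc.createTextNode(text))
        parent.appendChild(node)
        return node

    ident = add(story, "identification")
    for ifid in ifids:
        add(ident, "ifid", ifid)
    add(ident, "format", story_format)
    biblio = add(story, "bibliographic")
    add(biblio, "title", title)
    add(biblio, "author", author)
    return doc.toxml(encoding="UTF-8")
```

The returned bytes can be written to disk with a file opened in "wb" mode.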

Alex · May 5, 2012, 1:56pm

Quest isn’t part of this so-called “treaty”, but if anyone’s interested then since v5.1 each game gets its own GUID.

The .quest file is simply a ZIP file, and the GUID can be found in the contained game.aslx file. This file is XML and the id can be found within the <gameid> tag which appears inside <game>.

There’s other info inside the <game> tag too. Not sure what other info would be needed but if anyone is interested at all then let me know.
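
In Python terms, reading that id could look roughly like the sketch below. It uses only the standard library and assumes what is described here and later in the thread: the archive contains a file named game.aslx, and the id sits in a gameid element under a game element. The function name is hypothetical and this is not pyIFBabel's Quest handler.

```python
import zipfile
import xml.etree.ElementTree as ET


def read_quest_gameid(quest_file):
    """Return the gameid stored in a .quest archive, or None if absent.

    quest_file may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(quest_file) as archive:
        root = ET.fromstring(archive.read("game.aslx"))
    # Look for <gameid> directly under any <game> element.
    for game in root.iter("game"):
        node = game.find("gameid")
        if node is not None and node.text and node.text.strip():
            return node.text.strip()
    return None
```

A missing game.aslx raises KeyError from zipfile, which a real handler would probably want to catch and report as "not a Quest file".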

jakobcreutzfeldt · May 5, 2012, 2:18pm

Well, I can’t speak for anybody from the official Treaty or the original C tool, but I’d certainly like to add a Quest handler to pyIFBabel (selfishly, it will help with my other project Grotesque, so people can also manage their Quest games with it)! I can implement it similarly to the TADS handler, which takes metadata from the embedded gameinfo text file and converts it to use the standard Treaty terminology.

Is the full .aslx XML specification online anywhere?

Thanks!

Alex · May 5, 2012, 2:27pm

Info on the format is here: quest5.net/wiki/ASLX_Elements

Let me know if I can help with anything.

jakobcreutzfeldt · May 5, 2012, 4:27pm

Ok thanks! I think I’ve already got something going for it. I’ll download some story files to test it out.

Is the gameid guaranteed to be defined in the file? If not, is there an algorithm to calculate the gameid for a given file?

I couldn’t find it in the wiki, but are these all the metadata fields that might appear in the aslx file: version, author, gameid, description, category, start?

Alex · May 5, 2012, 11:50pm

Gameid is not guaranteed - it won’t be there for games created with 5.0.x, and also I suppose there’s nothing to stop it from being deleted (though that’s unlikely). In its absence there’s nothing defined that would uniquely identify a game though you could always use some kind of hash.

Those metafields look correct to me (not “start” though - don’t know where that’s come from?). It looks like some of those are indeed missing from the wiki (they were added in 5.1 and looks like I forgot to add them to the wiki).

jakobcreutzfeldt · May 6, 2012, 8:39am

Ok. The standard thing to do in the case of not having a defined means of calculating an ID is to use the MD5 sum, so I’ll just do that.
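
A minimal sketch of that fallback, using hashlib from the standard library. The function name is hypothetical, and the exact form pyIFBabel gives the resulting ID (case, any prefix) is not stated in this thread. The digest is upper-cased here only to match the look of the TADS3 IFID shown earlier.

```python
import hashlib


def md5_of_file(path, block_size=65536):
    """Return the upper-case hex MD5 digest of the file at path."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in blocks so large story files need not fit in memory.
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest().upper()
```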

Scratch the “start” node. I saw it under the “game” node in one of the files that I downloaded but it doesn’t matter anyway since it’s not bibliographic data.

Thanks again for the help!

jakobcreutzfeldt · May 6, 2012, 1:23pm

I’ve just released version 0.2.1, which adds support for Quest games!

$ pyIFBabel --identify tests/quest/Dragon.quest 
"Dragon" by Craig Dutton (c) 2012
IFID: 0fd8a779-cf0e-4d87-b243-237cf4a1fce4
quest, 1945k, no cover

$ pyIFBabel --meta tests/quest/Escape\ from\ Byron\ Bay.quest 
<?xml version="1.0" encoding="UTF-8"?>
<ifindex version="1.0" xmlns="http://babel.ifarchive.org/protocol/iFiction/">
  <!--Bibliographic data translated from Quest ASLX-->
  <story>
    <identification>
      <ifid>5ed0a10a-a86f-4adb-a926-a168cc013e78</ifid>
      <format>quest</format>
    </identification>
    <bibliographic>
      <genre>Puzzle</genre>
      <author>Allen Heard</author>
      <description>A bustling tourist attraction is turned on its head when a medical company makes an error. Bryneli-Med staff flee leaving you to face more than the music. Can you solve the puzzles and escape from Byron Bay.</description>
      <title>Escape from Byron Bay</title>
    </bibliographic>
  </story>
</ifindex>

Alex · May 6, 2012, 10:48pm

Cool. I’m thinking of adding some kind of “cover art” support to v5.3. Is there any other useful data or metadata that could be added?

jakobcreutzfeldt · May 7, 2012, 7:29am

For bibliographic data, the only fields that are missing relative to the Treaty are the year/date that it was first published, the “headline” (subtitle, like the “An Adventure” in “Some Title: An Adventure”), and the “forgiveness” (zarf’s scale of IF difficulty). Of those, I’d say the date would be the most interesting.

Other than that, there’s an area for format-specific information, such as the version of Quest that generated the game, which does not have to be bibliographic. I don’t know how strict other Treaty-compatible software is, or whether it would recognize this info if I put it into an IFiction file, but I don’t think it should be a problem.

Alex · May 7, 2012, 9:57am

Thanks. I’ve logged those fields for Quest 5.3: quest.codeplex.com/workitem/1069

jakobcreutzfeldt · May 7, 2012, 2:25pm

Sounds good. I’ll be keeping an eye out for the release so I can implement it in pyIFBabel.
I forgot to mention that in the Treaty, those are all, of course, optional fields; only the author and title are required.
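
As a tiny illustration of that rule, a check over a dict of bibliographic fields might look like this. The function and constant names are hypothetical and not part of pyIFBabel, which does not do IFiction verification yet.

```python
# The two bibliographic fields the Treaty requires, per the note above.
REQUIRED_BIBLIOGRAPHIC = ("title", "author")


def missing_required(bibliographic):
    """Return the required field names that are absent or blank."""
    return [name for name in REQUIRED_BIBLIOGRAPHIC
            if not (bibliographic.get(name) or "").strip()]
```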

jakobcreutzfeldt · May 23, 2012, 7:34pm

I bumped the version to 0.2.2. It’s a minor update, adding just a couple of functions that make interoperating between the treatyofbabel.ifiction module and the treatyofbabel.ifstory module a bit more convenient.