Transcripts collection proposal

Yeah, the title of this thread made me think “email some text to the author” and that was that.

Surely a literal transcript of each play is trivial to create from the data collected (here’s an example made with my current Parchment hack). But even with a pessimistic estimate, it’s reasonable to assume a good game will get at least 100 plays over some stretch of time, and I doubt any author will go through that many full transcripts by hand searching for error points. That’s where the extra data comes in: if the system flags error points automatically, the author doesn’t have to wade through parts where everything works as expected.

But if the game has changed the status line, that’s exactly what we should send! If scores and turns are irrelevant, don’t send them. If room names aren’t important (a one-room game, etc.), then why send them? The status line might have the current problem being worked on (Violet) or an indication of some other game variables (Hoist sail…). I think it would be very odd to have a game where rooms are important, with a large map, but which doesn’t display the room in the status line. Can you think of any?

Blue Lacuna.

If it’s trivial, great; it’s still more functionality than I have access to right now!

But I think my point is the value that a transcript can add goes way above “errors” - text games in general have very few alpha issues but tons of beta issues: very few crashes, but lots and lots of missing responses, alternate solutions, typos…

cheers
jon

I’ve realised this is really a non-issue… the server can use any kind of internal ID, whereas the session ID should be cryptographically random.
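For what it’s worth, here is a minimal sketch of generating such an ID in Python. The function name and the 128-bit length are my own assumptions, not part of the proposal (the examples in this thread use short integers for readability).

```python
import secrets

def new_session_id():
    """Return a hard-to-guess session ID as a 32-character hex string.

    Hypothetical helper: the proposal does not fix a format or length.
    """
    # secrets draws from the operating system's cryptographic random source.
    return secrets.token_hex(16)
```

The server is free to map this onto whatever internal ID it likes.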

As to your other suggestions for the handshake, I really think we only need one session ID. This is an open and anonymous protocol, so no matter what we do, it won’t be possible to protect it from abuse. I think security and sanity checking really belong in the server software (i.e., unspecified). What I am planning is that game authors will upload versions of their games to the server first, and then can see the transcripts grouped according to game version. (Including the checksum is a good idea, even if only for distinguishing between different releases made on the same day.) Any transcripts submitted for a game that hasn’t been uploaded first will be listed separately, with appropriate warnings to the author.

Juhana, I don’t like the idea of always sending the room title, when it just may not be relevant. The status line is likely to have the state information which the author thinks is most relevant. But we could also have a “status” item with which the game can send additional state data to the transcript server. This could include the room title if the author needs it, or any other internal variables. How does that sound?

joningold, those beta issues are really what this proposal is all about. Sarah Morayati has made a convincing case that authors need to clean up their act with respect to custom library and parser messages. This proposal would make it easier to detect those. That in turn will allow authors to fix the missing responses and consider adding alternative solutions.

(Edit: It was Sarah, not Juhana, who made the blog post about error messages.)

Precisely. As an author, I’m looking for commands and synonyms that the player tries that I didn’t think to implement, missing or miscued responses, typographical and spelling errors, areas with poor gameplay or prose that just doesn’t “read” well. None of which would be flagged by an interpreter.

I have happily read through, okay maybe not hundreds, but at least dozens of transcripts looking for issues like these, and would do so again. It’s a pretty essential tool for beta-testing and THE main reason why I want an easy way for testers to send the transcript to me.

The proposal we’re working on here will make it possible for interpreters to flag some of those things, such as unknown commands and missing responses. Poor gameplay and prose aren’t really something that can be detected automatically… you’ll still need to rely on beta tester comments for that.

Yes.

In fact, poor gameplay and prose are things that even beta-testers can’t always detect. I have to read through their transcripts to find it for myself. Thus my desire for testers to have the ability to easily and instantly send me their transcripts in a readable format.

I’m just thinking, while it is certainly simple to print out the tags, might it be cleaner to print to a buffer array and pass the address instead? What do you think are the advantages and disadvantages of each method?

Another new idea: what about using Unicode control characters as delimiters? The Inform 7 docs say they don’t support characters less than 32, but I’m sure we can work around that.

I’ve been contacted recently by several people expressing interest in the transcript-saving Parchment, so I’d like to start the conversation again. I’d like to actually make a Parchment fork with transcript collection capabilities, which could eventually be merged into the official branch.

I’ve made some changes to Dannii’s original proposal: I’ve dropped everything that would require changes in the Z-machine standard (tagging library messages) and moved some stuff around. The idea is to start with the bare minimum and expand later if needed. The original proposal was meant as a standard for any interpreter that would support it and I’ve kept that in mind whenever possible, but my immediate goal is to implement this in web interpreters.

[size=150]The transcript protocol[/size]

The data is sent as JSON over HTTP POST requests.

The handshake

Firstly, the interpreter sends a message to the server giving the details of the story file. Either an IFID or a URL must be sent, with an IFID being strongly preferred. The other items are all optional, though as much information should be given as is possible.

{ "ifid": "ZCODE-2-080406-A377", "url": "http://mirror.ifarchive.org/if-archive/games/zcode/LostPig.z8", "release": 2, "serial": "080406", "interpreter": "Gargoyle 2009-08-25 (Git 1.2.6)", "session": 7253, "parent-session": 5629 }

(Note that I’ve moved the session variable creation from the server to the interpreter, and added a “parent-session” variable; the motivation for this relates to saving/restoring. More on this later.)

The server replies with HTTP code 200 and content “OK”. Any other HTTP code or content means the server does not support transcript logging and the game should not send them.
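To make the handshake rules concrete, here is a rough sketch in Python. The function names are hypothetical, and tolerating surrounding whitespace in the reply is my own assumption; the proposal only says the content is “OK”.

```python
import json

def build_handshake(session, ifid=None, url=None, optional=None):
    """Build the handshake body. Either an IFID or a URL must be given."""
    if ifid is None and url is None:
        raise ValueError("handshake needs an IFID or a URL")
    message = {"session": session}
    if ifid is not None:
        message["ifid"] = ifid
    if url is not None:
        message["url"] = url
    # Optional items such as "release", "serial", "interpreter"
    # and "parent-session" are passed through unchanged.
    message.update(optional or {})
    return json.dumps(message)

def server_accepts_transcripts(status_code, body):
    """True only for HTTP 200 with content "OK"; anything else means no logging."""
    # Assumption: surrounding whitespace (e.g. a trailing newline) is ignored.
    return status_code == 200 and body.strip() == "OK"
```

An interpreter would POST the result of build_handshake() and only start sending transcripts if server_accepts_transcripts() returns True.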

Sending transcripts

Again, the client sends transcripts over HTTP POST requests, but after the handshake, it does not need to listen to the response.

{ "session": 7253, "log": { "time": "2010-07-18T03:04:12", "turn": 17, "input": "x fsih", "response": "You can't see any such thing." } }

(I’ve made a couple of changes: The library message classifications are removed, turn count is included, and the interpreter sends only one turn at a time. Turn count is required because the log is sent asynchronously and there’s a small but nonzero chance that the server receives the requests in the wrong order. It can also tell the server that some commands have not been received if the turn count doesn’t add up. Note that turn count is more accurately the input count and it’s not the same as the turn count the game keeps track of.)

A client first sends its session ID, followed by the logged commands. Time stamps are sent in the following ISO 8601 format: YYYY-MM-DDTHH:MM:SS (for example, 2010-07-18T03:04:12). If timestamps are not sent then the server should do its best to keep commands in order, and may use the time of the HTTP request instead.
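As an illustration of how a server might use the “turn” field, here is a small Python sketch that puts received log entries back in order and reports input counts that never arrived. The function name is hypothetical, and it assumes each entry is the decoded “log” object from the messages above.

```python
def order_and_find_gaps(entries):
    """Sort log entries by input count and list the counts that are missing.

    `entries` is a list of dicts shaped like the "log" object,
    each with at least a "turn" key.
    """
    ordered = sorted(entries, key=lambda entry: entry["turn"])
    seen = {entry["turn"] for entry in ordered}
    if not seen:
        return ordered, []
    # Any count between the lowest and highest seen that is absent was lost.
    missing = [n for n in range(min(seen), max(seen) + 1) if n not in seen]
    return ordered, missing
```

Note this can only detect gaps between entries that did arrive; a lost final command would go unnoticed until a later one shows up.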

Closing the transcript

(The following is my addition:)

When the interpreter detects that the player has stopped playing (in case of Parchment, the game ends or the player closes the page) it sends the following:

{ "session": 7253, "end": "2010-07-18T07:52:39" }

This lets the server know that it should not expect any more input for this play session and it can more accurately calculate the total play time.
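For example, the server could estimate play time by subtracting the time of the first logged command from the “end” time. A minimal Python sketch, assuming both values use the time stamp format from the examples above (the function name is made up):

```python
from datetime import datetime

def play_time_seconds(first_log_time, end_time):
    """Seconds between the first logged command and the end message."""
    start = datetime.fromisoformat(first_log_time)
    end = datetime.fromisoformat(end_time)
    return (end - start).total_seconds()
```

With the time stamps used in this post, play_time_seconds("2010-07-18T03:04:12", "2010-07-18T07:52:39") comes to 17307 seconds, or a bit under five hours.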

[size=150]Saving and restoring[/size]

This is the part where things could get messy. What happens if the player saves the game, closes the interpreter and restores later? Ideally the interpreter would save the session number and its own turn count and on restore inform the server that it’s continuing from a restore.

This is where the “parent-session” variable in the handshake comes in: the interpreter generates a new session after the restore and tells the server which original session was restored. With this information the server can link sessions together.
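On the server side, linking could be as simple as following the “parent-session” values back to the first session. Here is a sketch in Python; the `parents` mapping (session ID to parent session ID, or None) is a hypothetical storage format, not something the proposal defines.

```python
def session_chain(parents, session):
    """Return the list of linked sessions, oldest first, ending at `session`."""
    chain = [session]
    seen = {session}
    while parents.get(chain[-1]) is not None:
        parent = parents[chain[-1]]
        if parent in seen:
            # Guard against loops in malformed or malicious data.
            break
        chain.append(parent)
        seen.add(parent)
    chain.reverse()
    return chain
```

With parents = {7253: 5629, 5629: None}, session_chain(parents, 7253) gives [5629, 7253], so the two transcripts can be shown as one playthrough.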

(I don’t know how Parchment handles save/restore, but I assume this would be doable.)

[size=150]Changes to Parchment[/size]

There should be an option to opt out of sending the transcripts. This could be done with a URL GET parameter called “feedback”, where the value 0 would mean no transcripts are sent. For example, “http://example.com/parchment.html?story=zork.z8&feedback=0” would disable transcript collection. The author could then provide this link for people who wish to opt out.
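The check itself is tiny. Parchment would do this in JavaScript, but here is the logic sketched in Python for illustration; the function name is hypothetical, and treating a missing parameter as “transcripts on” is my assumption.

```python
from urllib.parse import urlparse, parse_qs

def transcripts_enabled(page_url):
    """False if the page URL carries feedback=0, True otherwise."""
    query = parse_qs(urlparse(page_url).query)
    # parse_qs returns a list of values for each parameter name.
    return "0" not in query.get("feedback", [])
```

So the example URL above would return False, while the same URL without “&feedback=0” would return True.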

Also there should be options in Parchment for the author to define the transcript collecting server’s URL and other stuff, but those are minor details.

So there it is, comments are most welcome.

Should we name the “turn” field something else, so as not to conflict with the game’s turn count? I’d prefer something like “input-count” so we don’t have to keep explaining that “turn” in the transcript protocol is different from the game turn.

Other than that, I think this looks great!

I like the changes, and agree with input-count, or perhaps input-id. I still think that including the server url in the blorb is the way to go… it’s just so easy to extract, and shouldn’t be hard to add either.

I suggest that the session id and input count be included in the QUETZAL so that they can be continued at a later time.

Ah, I missed that part. Yes, url in the blorb would be the best option, but there should also be a Parchment option for games that don’t provide their own url.