Proposal for improved Inform 6 debug file

I’m splitting this out into a new thread.

(I think I got the link right…)

1: Overview

I am just a little antsy at the idea of specifying that text does not use CDATA. If you’re specifying XML, then CDATA is legal. But I get that you want this to be easily parsable by I6 code. Maybe note that fact and say that the debug file should avoid XML fanciness.

2.2: Source Files

The idea is that given-path is what’s handed to the compiler as an Include constant or a command-line argument, and the resolved-path is the result after all the suffix-adding and searching through +source_path, +include_path, etc. Right?

If the resolved-path is supposed to be for loading the file contents, you’ll have to say what it’s relative to. I assume the answer is “the current working directory of the compiler invocation”. (I rather often type lines like “inform +…/i6lib unittests/test.inf”. Barbaric, I know.)

3: The Source File Level

This plan has a set of source files, with definitions nested inside them. Is that what we want? The alternative would be having a list of source files at the top of the file, each with an arbitrary (or sequential) index number, and then definitions could refer back to those.

I’m not opposed to what you’ve got, but it does presume that every definition exists in exactly one source file. This is only sort of true, and you’ve already fudged in a fallback for things not in a file.

(If we’re going to get really picky, an object or function can start in one file and end in a different one. Not that this is a good idea.)

Let me make the problem of file location more complicated!

Recently somebody (don’t remember who, sorry) suggested listing locations of I7-defined objects as the position in the I7 source file, rather than the I6 source file.

In many cases this is meaningless, and even when possible it would require support from the I7 compiler that we don’t have. But it’s worth leaving space in the spec for this.

On the other end of the stick, the current I6 compiler does not store file positions, only line numbers. (Not even column numbers.) So a first iteration of this might only have the tags. Rather than requiring everything, I would rather take an as-much-information-as-is-available approach.

3.2: Named Addresses; Global Variables, Classes, and Objects

In Z-code, objects are identified by a sequential index number, not by an address!

In the current debug format, function addresses are given relative to the function segment rather than the absolute position in the game file. Will that still be true?

Do we want to have declarations for verb grammar lines?

“Not all of the MAP_DBR values are available as system constants under both targets. But is there any reason not to make them so?”

A lot of them are missing from Glulx solely because I was lazy and the game had no runtime use for them. I have no objection to making them available. On the other hand, the list of system constants does not match up perfectly with the list of MAP_DBR fields. (E.g.: “Unicode table” and “alphabets table” are missing, strings_offset and code_offset are divided by scale_factor in Z-code, etc.) You might not want to transplant the quirks of I6 sysconstant generation to the debugger.

vaporware: “Are Glulx variables referenced by index or offset within the local frame? (Is it still legal to have different sized locals and/or gaps in the local frame?)”

As EmacsUser said, different-sized locals are deprecated and unsupported. Gaps are something I’ve never considered… I think it would still be best to refer to local variables by offset, if only because that’s how the opcodes and the VM implementation refer to them.

Thanks, and sorry for the threadjack.

Actually, Ron’s comment made me realize that I don’t gain anything by excluding CDATA (see the other thread), so let’s allow it.

Right. At least, this is what I got from reading the current C code.

Yes, that is the answer at the moment (because that’s what version 0 does), though it’s not clear to me that it’s the right answer, or that there is a right answer. Other alternatives I can see arguments for are (1) to leave things as they are, but also include the working directory in the debug information file, (2) resolve everything to an absolute path, and (3) record the source path, include path, etc., and resolve files at debug time (this is what DWARF does).
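To make alternative (1) concrete, here is a minimal Python sketch of how a debugger might turn a resolved-path into something loadable if the debug file also recorded the compiler's working directory. The function name and both parameter names are hypothetical, and POSIX-style paths are assumed purely to keep the example deterministic.

```python
import posixpath

def resolve_for_loading(resolved_path, compiler_cwd):
    """Interpret resolved_path relative to the recorded compiler working directory.

    Both arguments are assumptions for illustration: the spec has not yet
    decided whether the working directory is stored in the debug file.
    """
    if posixpath.isabs(resolved_path):
        # Already absolute: only tidy it up.
        return posixpath.normpath(resolved_path)
    # Relative: anchor it at the directory the compiler was run from.
    return posixpath.normpath(posixpath.join(compiler_cwd, resolved_path))

example = resolve_for_loading("unittests/test.inf", "/home/user/project")
```

Alternative (2) would amount to doing this join at compile time and storing the result; alternative (3) would defer the whole search to the debugger.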

Now that I think of it, we should also have a way to say “this file is Blorb TEXT resource N”.

How about this? Right now we have elements that hold either zero or one elements. Let’s make that zero or more and throw in children elements for file indices as you describe. So if we begin an object declaration in foo.h, continue it in bar.inf, and finish it in baz.h, it gets three elements, one for each contiguous region. Similarly, a someday I7 compiler might give us one location element for the I7 source, to be supplemented by a different location from the I6 compiler. The two situations are distinguished by adding a type attribute to the elements; in the former the types are all “Inform 6”, but in the latter case one is “Inform 6” and the other “Inform 7”. I guess we’ll also need index attributes so that the debugger can order the snippets for a file-crossing declaration.
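As a rough illustration of how a debugger might consume this, the Python sketch below groups a definition's location elements by type and orders them by index. The element names (object, location, file-index) and the sample data are placeholders invented for the example, not names from the draft; only the type and index attributes come from the proposal above.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: element names and values are illustrative only.
SAMPLE = """
<object>
  <identifier>Lamp</identifier>
  <location type="Inform 6" index="2"><file-index>0</file-index></location>
  <location type="Inform 6" index="0"><file-index>1</file-index></location>
  <location type="Inform 7" index="0"><file-index>3</file-index></location>
  <location type="Inform 6" index="1"><file-index>2</file-index></location>
</object>
"""

def ordered_locations(definition, source_type):
    """Return the file indices of a definition's regions for one source type,
    sorted by the region's index attribute."""
    regions = [loc for loc in definition.findall("location")
               if loc.get("type") == source_type]
    regions.sort(key=lambda loc: int(loc.get("index", "0")))
    return [int(loc.findtext("file-index")) for loc in regions]

definition = ET.fromstring(SAMPLE.strip())
i6_files = ordered_locations(definition, "Inform 6")  # regions in declaration order
i7_files = ordered_locations(definition, "Inform 7")
```

Here the three “Inform 6” regions can appear in any order in the file; the index attribute is what lets the debugger reassemble a file-crossing declaration.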

That’s fair.

Ah, right.

I wasn’t planning on it, because I know lots of situations where the absolute position is useful and none where the offset is. If we have some of the latter, we can change it back.

It wouldn’t hurt. I’ll put them in.

Adding the missing ones does not bother me much, but division by scale_factor is a little more awkward. I guess we’ll include a tag for MAP_DBR entries.

That can be done.

Right, this came up earlier.

The weird requirement here (unlike most Blorb contents) is that the debug information might be used by an interpreter feature (a specialized debugging interpreter) or a game feature (an I6/I7 debugging extension). So it has to be available to both levels of code. My Glulx demo was an interpreter feature, so it was more or less sufficient to specify the debug data location with an interpreter option. But I agree this is not good enough in general.

The simple option would be to stick the debug info in a custom Blorb chunk. But I’d rather read the data in with the glk_stream_open_resource_uni() call, so it needs to be listed in the index as a DATA chunk. The chunk type will be TEXT, as you say, for this XML format (because that fits the UTF-8 encoding requirement). For a version-zero debug file, the chunk type should be BINA.

How to find it? For this, we will use a custom Blorb chunk: chunk ID ‘Dbug’, four bytes, containing the index number of the DATA resource. A debugging interpreter will be able to look for this chunk directly, and then read the data (either using glk_stream_open_resource_uni() or the lower-level Blorb API).
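For a debugging interpreter, the lookup could be sketched roughly as below in Python. It walks the top-level chunks of a Blorb file (an IFF FORM of type IFRS) and returns the number stored in the first ‘Dbug’ chunk. Storing the number as a big-endian 32-bit integer is an assumption that follows the usual Blorb convention, and the function names are made up for the example.

```python
import struct

def find_debug_resource(blorb):
    """Return the DATA resource number held in a 'Dbug' chunk, or None if absent."""
    if len(blorb) < 12 or blorb[0:4] != b"FORM" or blorb[8:12] != b"IFRS":
        return None  # not a Blorb file
    pos = 12
    while pos + 8 <= len(blorb):
        chunk_id = blorb[pos:pos + 4]
        (length,) = struct.unpack(">I", blorb[pos + 4:pos + 8])
        data = blorb[pos + 8:pos + 8 + length]
        # Assumption: the payload is one big-endian 32-bit resource number.
        if chunk_id == b"Dbug" and length == 4 and len(data) == 4:
            return struct.unpack(">I", data)[0]
        # IFF chunks are padded to an even length.
        pos += 8 + length + (length & 1)
    return None

def make_chunk(chunk_id, payload):
    """Build one IFF chunk (helper used only to construct sample data)."""
    pad = b"\0" if len(payload) & 1 else b""
    return chunk_id + struct.pack(">I", len(payload)) + payload + pad

# A tiny synthetic Blorb: an empty resource index followed by a 'Dbug' chunk.
body = (b"IFRS"
        + make_chunk(b"RIdx", struct.pack(">I", 0))
        + make_chunk(b"Dbug", struct.pack(">I", 7)))
sample = b"FORM" + struct.pack(">I", len(body)) + body
debug_resource = find_debug_resource(sample)
```

The interpreter would then look that number up among the DATA entries of the resource index to reach the actual TEXT or BINA chunk.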

To make this data available to a Glulx game extension, I propose adding a Glulx gestalt selector: DebugChunk (12). Query this to get the DATA chunk number, and then call glk_stream_open_resource_uni(). If the chunk number comes back as zero, there is no debug data (or the interpreter doesn’t support this stuff).

(There is no direct equivalent – a Z-code game extension – because Z-code game code can’t read data directly from the Blorb.)

By the way, I was briefly considering adding a general “interpreter options” chunk to the Blorb spec. This bit of information could have been stuffed in there, instead of being its own four-byte chunk. But what does that extra abstraction really buy us? Not much, I decided, so the heck with it.

That sounds reasonable to me. I actually meant something else by the comment—that source code could be blorbed, in which case elements would want a way to say where—but fleshing out this part of the plan is good too.

I’ll post a revised draft soon, though probably not today.

Maybe you should just add a glk function to open all chunks.

Here is another revision. I’ll post future revisions to the same place.

Barring the option to open arbitrary chunks as Dannii suggests, it occurs to me that it would be good if we could distinguish these failure cases. Most authors won’t have an easy way to narrow the problem down if the story code doesn’t help them.