Dividing up a source (revisited)

severedhand · December 18, 2023, 1:08pm

It’s now three and a half years since I asked speculatively about potential ways to divide up a large source. (Splitting the story into multiple files)

I didn’t need them then. The WIP was much smaller and I said ‘I’ll have to keep pining for that newer computer.’

I got the newer computer, and one of its processors alone was maybe 5 times faster at compiling than my whole old computer? Then I found Inform cannot make use of the multiple processors to compile, so no more improvement is coming.

WIP is now 250k words with 900 things, so I expect at least 500k words when done. Compile time’s now 20 seconds. That may not sound bad, but due to size and complexity of game, it’s starting to feel painful given how often I have to rebuild, including attempts that fail, and how much I will have to, before the end.

The source is very readable at the moment. Each chapter is its own volume. I keep the introduction of variables and mechanisms in the source pretty much right at the spot when I first realised I needed them. This gives a logic (prior to the definition, that thing wasn’t needed) and history and organisation.

After all the chapters is a huge action block with more master code for actions, or hacks to those actions for this game. This block references things from all chapters.

For these reasons, it’s hard to break up. My dream would be to keep the full readability but just be able to turn individual chapters off at compile time. I can’t imagine a way that’s possible. I’d have to make sure no chapter referenced anything outside itself, and the huge action block at the end would basically be impossible to keep.

The closest practical way I can think of achieving a division is to move all thing definition lines to one volume, then turn all chapters into extensions. First, the process of doing this feels fraught (possibly bringing in bugs that I might not notice), and second, it would make me grumpy that those thing definitions would not be in the chapters where the things are. I suppose everything after the definition line, which is the bulk of the description of the thing, would still be in the chapter. Readability would go down, that’s for sure.

Anyway, I’m considering trying what I just described, as an experiment. Because the compile time is getting to me, and I also just watched the 30 years of Doom Romero-Carmack Twitch recording, in which Romero mentioned, ‘Anything you can do to get your core iteration time down is worth doing.’

So my questions are – can you think of any other approach or technique? Or observations on what I’ve suggested to myself? My idea will reduce readability, but still maximise it versus, say, putting all of each type of anything in its own block. I think the total division approach guarantees a source that’s not very readable, and useless as a practical history of thought process. This is extra important in a game that’s very novel-like.

Thanks.

-Wade

mathbrush · December 18, 2023, 3:49pm

I had some issues with my recent code (350K) so I can only speak from that experience.

The biggest thing that slowed down my compiling was the skein; whenever I deleted the skein, compiling improved significantly. That may be different for Mac, and you may have already handled that.

Other than that, one thing I had resorted to that may be useful to you was making your command block refer to kinds rather than individual things. You can have a long list of kinds defined, and then base your actions on those kinds, and then the individual sections make things those kinds.

For instance, I had doors that were ‘rifts’, enterable scenery objects that were ‘rifts’, and non-enterable scenery objects that are ‘rifts’. So I made a definition: a thing is 'rifty' if.... And then all my commands referred to ‘rifty’ things (I know this isn’t a kind, but it’s fairly similar and I did use kinds in other situations).

Volume 1 - Main code

A bloody-knife is a kind of thing.

Instead of dropping a bloody-knife:
   say "You don't want to incriminate yourself!";

Extension 1 - Streets of the city

The bloody knife is a bloody-knife. The description of the bloody knife is...
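The adjective variant described above can be sketched the same way. The original 'rifty' definition is elided in the post, so this minimal sketch swaps in an either/or property instead of a Definition; the room, the object, and the touching rule are hypothetical names chosen only for illustration.

```inform7
Volume 1 - Main code

[Assumption: an either/or property stands in for the elided "rifty" definition.]
A thing can be rifty.

[The shared action code refers only to the adjective, never to a specific object.]
Instead of touching a rifty thing:
	say "Your hand tingles and you pull it back."

Volume 2 - Streets of the city

The Alley is a room.

[Each chapter opts its own objects in to the shared behaviour.]
The shimmering tear is scenery in the Alley. The shimmering tear is rifty.
```

Because the rule in the main code names no individual object, a chapter that declares its own rifty things could in principle be removed without breaking that rule.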

zarf · December 18, 2023, 4:17pm

Dividing up the source into extensions can be good for readability. (It was for me.) It does not improve compile time at all, as far as I know.

Your idea of dividing up the source and then omitting certain chapters – that will help, but as you say, it could be an organizational headache. I did something like this in HL where I implemented the rooms in a separate file so I could omit them. That way I could work on the objects and alchemy rules in a single-room game with all the ingredients available!

It is the same on Mac.

gfaregan · December 18, 2023, 5:10pm

Why can’t inform compile using multiple threads?

zarf · December 18, 2023, 5:19pm

Because that’s hard. Compilers are generally badly suited to being multi-threaded. Particularly in a language like Inform 7, where the whole program has to be compiled as a logical unit because (most) any declaration can occur anywhere.

gfaregan · December 18, 2023, 5:32pm

Is this also true for TADS and other parser frameworks?

Draconis · December 18, 2023, 5:58pm

Inform 6 previously had the ability to compile in parallel (to a certain extent), but it was unwieldy enough for the author that nobody ever used it and it was eventually dropped from the language. I’m pretty sure Dialog can also only use one CPU core at a time; no idea about TADS.

zarf · December 18, 2023, 6:29pm

But not for I7-generated games, and not for Glulx compilation from any source. Just in case you were getting excited. :)

severedhand · December 19, 2023, 1:17am

That’s interesting re: the Skein. I’ve been telling people for years that deleting it can speed up play in the IDE, but I didn’t know that may speed up compilation? I’ll try it.

Re: kinds in the command block, another interesting idea. Unfortunately it probably won’t suit my Check-heavy programming style, since once you go to kinds, you can’t use Check rules on them. EDIT: Hang on, you can. There’s something like this somewhere. What am I thinking of?

So far I’ve got two omittable blocks. I’ve programmed the graphics so omitting one extension will compile without the automap (so I can leave that out most of the time). And there’s a 17k-word mechanic used in only one chapter that I was able to put in an omittable extension. That was a big deal on my old computer, but now those 17k words are a drop in the ocean.

In bed last night my brain came up with a slight improvement on my original idea. Instead of gathering up and moving thing definitions, then moving leftovers to chapter extensions, maybe I could leave all the things and rooms in place in the main source, and split mostly just the rules of each chapter off into chapter extensions. I might try this on a small chapter and see what happens.
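A minimal sketch of that layout, assuming two separate files shown here in one block. The story title, author name, extension name, room, and lever are all hypothetical, and whether omitting the extension compiles cleanly depends on nothing else in the source referring to its rules.

```inform7
[File 1: the main source. Things and rooms stay here.]
"Example Story" by Example Author

Volume 1 - Chapter Two

The Signal Box is a room.
The brass lever is in the Signal Box. The brass lever is fixed in place.

[Comment out this line to leave the chapter's rules out of a test build.]
Include Chapter Two Rules by Example Author.

[File 2: a separate extension file holding only the chapter's rules.]
Chapter Two Rules by Example Author begins here.

Instead of pulling the brass lever:
	say "The lever will not budge."

Chapter Two Rules ends here.
```

With the Include line active, the build reads the same text as if the rules were pasted into the main source, so any saving comes only from builds where the line is commented out.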

The lack of multi-processor action is sort of depressing, though.

Thanks all for your input.

-Wade

Draconis · December 19, 2023, 2:59am

You can’t use kinds of actions with Check rules, because while there’s a single Before, Instead, and After rulebook, the Check, Carry Out, and Report rules are actually separate Check Taking, Check Dropping, etc rulebooks. So if a Check rule applies to a kind of action, what rulebook does it go in?
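The distinction can be sketched as follows; the room, objects, and the names 'careless behaviour' and 'fragile item' are hypothetical. A kind of action works with Instead, while a Check rule has to name one action but can still match a kind of thing.

```inform7
The Lab is a room. The beaker is in the Lab.

[A kind of action groups several actions under one name.]
Taking the beaker is careless behaviour.
Pushing the beaker is careless behaviour.

[Instead is a single rulebook shared by every action, so this is accepted.]
Instead of careless behaviour:
	say "Better leave the glassware alone."

[A rule headed "Check careless behaviour" would be rejected, because Check rules live in per-action rulebooks.]

[A Check rule for one named action can still apply to a kind of thing.]
A fragile item is a kind of thing. The flask is a fragile item in the Lab.

Check dropping a fragile item:
	say "It would shatter." instead.
```

So a Check-heavy style is compatible with kinds of things; it is only kinds of action that Check rules cannot be attached to.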

Timewalker · December 19, 2023, 7:21am

TADS allows you to split the source code into multiple files, and thus will only recompile files that have changed.

zarf · December 19, 2023, 4:39pm

It doesn’t affect compilation per se, but it gets into the basic iteration loop of “change the source, compile, run, see the effect of the change.” It’s quite possible for the majority of that loop to be time wasted loading a bloated-up skein file.

I don’t know how much it will help you to clean that up, but it will help some.

HanonO · December 19, 2023, 9:09pm

Most everyone has answered, but yes. Using headers and organizing your source into a readable tidy document is nice for organization, but does not affect compile time (unless things get defined out of order).

From what everyone has described, there is zero difference between including an extension and copy-pasting the extension text into your source. Including it by name removes that extension text visibly from the source, but for compile purposes, it’s just reading the same text in as it goes whether it’s in the source or an external extension file.

Zed · December 20, 2023, 12:15am

The lack of multi-processor action is sort of depressing, though.

I’m afraid linear compilation is kind of fundamental to Inform. Each given line could mean so many different things, so the compiler can’t really start to make decisions until it has ingested the whole thing and then it does like umpteen passes over everything to narrow down its interpretations.

Given that optimizing for speed hasn’t been the project’s greatest priority, I think that when a close analysis of the code with speed in mind is undertaken, some low-hanging fruit will be found. But without Inform becoming very different from the absurdly syntactically flexible thing it is today, compilation will remain, at least primarily, a linear affair. (There are probably some isolated bits that could be parallelized; my guess, though, would be that doing so would introduce complexity way out of proportion to any gains.)

severedhand · December 20, 2023, 10:18am

I started trying to split out the smallest chapter to an extension. Then I realised that if a location’s used in more than one chapter, there are rules referring to it in multiple chapters, with clauses checking which chapter. (You don’t have to remind me I could split it into two rules - sometimes it’s better, sometimes it’s worse. Also, you have to identify them first, which makes the task worse.)

Maybe… if I only split out some chapters whose locations don’t get repeated. Which will ultimately be most. But I made a mistake starting on a chapter whose locations will be.

-Wade

clintmbishop · December 20, 2023, 10:44am

Sounds more and more like you just need to bite the bullet and go with TADS. As popular as Inform is, it’s just not designed to handle large games. Like trying to write a novel in Notepad.

severedhand · December 20, 2023, 11:04am

Does TADS not have the same problem? Let me return to Timewalker’s comment:

TADS allows you to split the source code into multiple files, and thus will only recompile files that have changed.

My assumption was the reason nobody commented was that this feature was not going to get around the issue. It sounds much smoother than Inform’s extension alternative, but cosmetic organisation of the code is different to interdependencies.

Let’s say I split a TADS game into 10 files arbitrarily. Are these files absolutely independent of all the game content? It doesn’t matter which bit is where in any of them? I couldn’t imagine how this could be so. My expectation would be that if I edit something in file 10 that affects what’s in file 1 in the game, but then file 1 doesn’t get recompiled (because I didn’t touch it), the result won’t work. Or that the reference in file 10 would force file 1 to recompile, leading back to the original compile time. That’s the real challenge I’m facing – the interdependency of things spread throughout chapters. Please set me straight if I’ve misunderstood what you’re saying about how TADS works.

-Wade

ArdiMaster · December 20, 2023, 12:05pm

If file 10 happens to be a header file that gets included everywhere, then yes, the entire game would need to be recompiled. For regular source files, there is a final linking step that ties everything together.

If you want to change how an object responds to some action, for example, you would probably use the modify myObj syntax, and the compiler will amend the object definition at link time.

blindHunter · December 22, 2023, 5:23pm

This is why I much prefer TADS over Inform, aside from the fact that its C-style syntax is something I can easily wrap my head around. You can organize the source of your project in any way you like, and as @ArdiMaster said previously, unless it’s a header file that has been modified, you can make any changes to the source and not have to do a full rebuild.