A hobby project of absurd scale: building an IF schema language with AI as the workforce

You know the halting problem all too well, and the advantages of a declarative system; that is essentially the answer to your question: Monte Carlo simulation, dead-end detection, reachability analysis.
I love anything with a node-based user interface, so being able to build those kinds of tools on top would excite me a lot. With a descriptive schema, I hope to be able to take a break and work on some nice-looking UI stuff.
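To make the analysis idea concrete, here is a minimal, purely hypothetical sketch in TypeScript (the world shape and scene names are invented for illustration, not Urd's actual format): given a declarative map of scenes and exits, a breadth-first search finds unreachable scenes and dead ends without ever executing the game.

```typescript
// Hypothetical world: scene IDs mapped to the scenes their exits lead to.
type World = Record<string, string[]>;

const world: World = {
  foyer: ["bar", "cloakroom"],
  cloakroom: ["foyer"],
  bar: [],            // no exits: a dead end
  cellar: ["foyer"],  // nothing leads here: unreachable
};

// Breadth-first search from the start scene.
function reachable(world: World, start: string): Set<string> {
  const seen = new Set<string>([start]);
  const queue = [start];
  while (queue.length > 0) {
    const scene = queue.shift()!;
    for (const next of world[scene] ?? []) {
      if (!seen.has(next)) {
        seen.add(next);
        queue.push(next);
      }
    }
  }
  return seen;
}

const visited = reachable(world, "foyer");
const unreachable = Object.keys(world).filter((s) => !visited.has(s));
const deadEnds = Object.keys(world).filter((s) => (world[s] ?? []).length === 0);

console.log(unreachable); // ["cellar"]
console.log(deadEnds);    // ["bar"]
```

The point is that none of this requires running the game: a descriptive schema turns "can the player ever get here?" into a plain graph question.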

Will I succeed? Honestly, no idea. But Urd gives me a nice playground to work with, as it sits at the intersection of technology and art. That's the place I've always lived in.

Will it degenerate into halting-problem issues because of lambda extensions? For sure. I have not spent too much time thinking about where exactly the lines are going to be drawn. But I am hopeful that AI will unlock something more than a faster code typewriter. I think an AI is uniquely positioned to reason about a world without having to execute it, and thus enable something that a programming language could not.

You would first need to define something to succeed at.

Again, if you direct your attention to automatically testing Inform games, that would be actively useful.

And… there’s another problem, too. A bunch of people have been posting on this forum about the new IF development system they just cooked up (most of them significantly AI-assisted). People post about them like they’ve built a better mousetrap. They get some polite inquiries, but approximately nobody but the authors themselves is developing games on their own platforms (and usually not even the author has written a game in their system, because they developed their IF system while procrastinating on writing a game).

As it stands, even if you accomplish this “hobby project of absurd scale” by whatever standards you hallucinate, literally no one will benefit from it. No one but you will use it, and perhaps not even you. You might use it to direct an AI to make games in it, but, if you do, you’ll have wasted your time building the system, because an AI can already make games in existing systems.

(And then there’s the fact that AI generates slop by design, and this can’t be fixed with more training data or faster GPUs.)

3 Likes

I don’t know, the current IF systems seem to be doing it pretty well, and without the need for AI. I’m not sure if this clarifies anything.

Ehhh… “they’re trained to predict the next word” isn’t the deep insight into LLMs’ capabilities or limitations that many people seem to think it is.

(FWIW, I regularly get (intentionally) funny content out of ChatGPT that makes me and others laugh. Just not by asking it for jokes.)

3 Likes

In recent weeks, AI has been solving and proving math problems that no human could before, at an alarming rate. But that is beside the point.

I don’t want to get into a big to-AI-or-not-to-AI debate; I am sure everyone has seen enough threads go pear-shaped.

I will only quote Amara’s law: “We tend to overestimate the effect of a technology in the short run and underestimate the effect in the long run.”

I am in the camp of people who believe that most of the hype is real, and then some, because LLMs are just the beginning. In the academic world, token prediction is already ancient stuff. The diffusion models that are coming will blow our minds.

I will not attempt to convince anyone who is sceptical. It’s such a polarising subject and time could be better spent having fun. With or without AI.

2 Likes

In recent weeks there’s been a very impressive press release and a preprint.

Reality will take a few months.

2 Likes

3 posts were split to a new topic: “World Schema Specification”

A post was merged into an existing topic: “World Schema Specification”

I would temper this declaration. I played around with ChatGPT and Claude last year to see how well they did with Inform 7 programming. They’re really bad at it. Like, not even functional.

One of the core reasons I switched to TypeScript for Sharpee was that it and Python are the models’ two strongest languages.

So now you get to the thing I’ve been advocating, which is that code generation isn’t a big deal. Heck, Inform 7 used to generate Inform 6. Claude generating TypeScript around Sharpee’s APIs is (to me) no different, as long as the logic, prose, dialogue, and mechanics are defined and designed by the author.

It does make a difference that we’re building these tools with GenAI in mind. It allows code generation to be far more successful and well-tested than anything you could generate in any legacy IF platform.

2 Likes

What a waste of resources and community time…

Quick update for those still following along (and apologies to those who feel they speak for everyone who isn’t).

The Urd v0.1.0 compiler is implemented and passing tests. Five-phase pipeline in Rust (PARSE, IMPORT, LINK, VALIDATE, EMIT), 370 tests all passing, including end-to-end integration tests that compile complete scenes through the full pipeline. ~6,600 lines of Rust, 64 diagnostic codes with structured errors and suggestions, deterministic output, and error recovery throughout.
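For readers curious what "phases with structured diagnostics and error recovery" means in practice, here is a minimal sketch of that general shape. This is not Urd's code (the real compiler is Rust, and every name below is invented); it only illustrates how a phased pipeline can keep accumulating diagnostics instead of dying on the first error.

```typescript
// Hypothetical sketch of a phased pipeline with structured diagnostics.
// Not Urd's actual code: the real compiler is Rust, and these names are invented.

interface Diagnostic {
  code: string;       // a stable diagnostic code, e.g. "E001"
  message: string;
  suggestion?: string;
}

interface PhaseResult<T> {
  value?: T;          // absent when the phase failed hard
  diagnostics: Diagnostic[];
}

type Phase<In, Out> = (input: In) => PhaseResult<Out>;

// Run two phases in order, collecting diagnostics from both; stop only
// if a phase produces no value at all (recoverable errors flow through).
function runPipeline<A, B, C>(
  input: A,
  first: Phase<A, B>,
  second: Phase<B, C>
): PhaseResult<C> {
  const r1 = first(input);
  if (r1.value === undefined) return { diagnostics: r1.diagnostics };
  const r2 = second(r1.value);
  return { value: r2.value, diagnostics: [...r1.diagnostics, ...r2.diagnostics] };
}

// Toy phases standing in for PARSE and VALIDATE.
const parse: Phase<string, string[]> = (src) => ({
  value: src.split(/\s+/).filter(Boolean),
  diagnostics: [],
});

const validate: Phase<string[], string[]> = (tokens) => ({
  value: tokens, // recover: keep going even when a diagnostic is emitted
  diagnostics: tokens.includes("???")
    ? [{ code: "E001", message: "unknown token", suggestion: "remove it" }]
    : [],
});

const result = runPipeline("scene foyer ???", parse, validate);
console.log(result.diagnostics.length); // 1
```

The design point is that diagnostics are data (code, message, suggestion), not strings thrown at the user, which is what makes deterministic output and machine-checkable acceptance criteria possible.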

The six implementation briefs were written first, defining every phase boundary, data structure, and acceptance criterion. Claude Code (4.6) then implemented against those briefs, one phase at a time. The compiler worked on the first pass without requiring redesign cycles!!

This is just a small step, but it feels like a major milestone to me. If anything meaningful was achieved here, it’s the validation of the methodology: front-load design into precise specifications, close ambiguities before implementation starts, then hand execution to AI against those contracts. Having the code printer churn through such a big body of briefs in one night and pass integration tests is, in my experience, not the norm.

More information can be found on the site, including a dashboard. To be 100% clear: you can’t play anything yet. A runtime hasn’t been implemented; that is the next step. However, before I get there, I will be adding more tests and trying to break the Rust compiler to see how it behaves with very large dependency chains and big files.

Happy to answer questions. And as before, if you see fundamental problems, I’d rather hear it now.

Thanks to everyone entertaining me.

If you would rather not have any updates on progress, I am more than happy to stay quiet.

I, for one, love experimental systems. The more the better, I say. Of course most such systems tend to die on the vine and not develop much of a following, but once in a while someone is going to make something that has some traction.

7 Likes

I can see a potential role for an AI as an interpreter rather than a parser. Different players use interpreters for various reasons - some are plain text only, some allow font/color customization and might support multimedia. If someone wanted to use an AI/LLM as part of play optionally, it could be integrated in a specialized Glulx interpreter. The parser would still do its thing, and an AI assistant would basically co-pilot and might jump in and suggest how to help the player phrase a command if it senses they’re having trouble. It might be good for new players as a universal tutorial/help option.
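A purely hypothetical sketch of what that co-pilot hook might look like (every name below is invented; this is not an existing Glulx interpreter API): the parser stays authoritative, and the assistant is only consulted after the player has failed to be understood a few times in a row.

```typescript
// Hypothetical co-pilot hook: the parser stays in charge; an assistant is
// only consulted after repeated parse failures. All names here are invented.

interface ParseOutcome { understood: boolean; response: string; }

type Parser = (command: string) => ParseOutcome;
type Assistant = (recentFailures: string[]) => string;

function withCoPilot(parse: Parser, assist: Assistant, threshold = 3): Parser {
  const failures: string[] = [];
  return (command) => {
    const outcome = parse(command);
    if (outcome.understood) {
      failures.length = 0; // reset the failure streak on success
      return outcome;
    }
    failures.push(command);
    if (failures.length >= threshold) {
      // Append a phrasing hint to the normal parser error.
      return {
        understood: false,
        response: outcome.response + "\n" + assist(failures),
      };
    }
    return outcome;
  };
}

// Toy parser and assistant, for illustration only.
const toyParser: Parser = (cmd) =>
  cmd === "look"
    ? { understood: true, response: "You see a room." }
    : { understood: false, response: "I didn't understand that." };

const toyAssistant: Assistant = () => "Hint: try a simple verb, like LOOK.";

const play = withCoPilot(toyParser, toyAssistant);
play("examine the whatsit");
play("whatsit examine");
const third = play("do the thing");
console.log(third.response.includes("Hint:")); // true
```

Because the assistant only decorates the parser's own error message, the game remains fully playable with the AI layer switched off, which is what makes it viable as an optional interpreter feature.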

The first tactile interaction with the compiler is now live on the site under the “Playground” section.
Every time there is a new successful build of the compiler, the wasm will be updated automatically on the site.

Before I get to a sample runtime or analyser, I will need to tighten up a couple of things and update the writer reference documentation, together with some quality-of-life improvements for editing the definition file and multi-file support.

How do you know the compiler is working without a runtime?

Because it passes a very strict JSON Schema spec file, and I can already visualise the world.
The visualisation is experimental and not on the site.

I am at the moment going through the documentation for the writer and inspecting every supported input by hand. I already had to apply a fix related to integer ranges and how they get stored in the output.

It’s not a compiler in the sense of bytecode. It outputs a schema. The schema specs are published.

Ah, okay. So if I play with the compiler, I can have it generate a human-readable schema?

It’s an intermediate schema. Human readable enough for a technical person, but ready for an interpreter.

A compiler in the sense we would more commonly think of might come later, if that is a direction I take Wyrd in. But the architecture of Wyrd is a whole other thing that would only confuse things now.

Whatever happens, urd.json is the intermediary step. I think a lot of games and systems can just use the JSON and simple world model for state management.

Say you wanted to build a visualiser or a node-based editor (the thing everybody and their dog is working on); you would do that straight on top of the urd.json.
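To illustrate what "straight on top of the urd.json" could mean, here is a hypothetical consumer sketch. The field names below are invented for illustration only; the published schema specs define the real urd.json shape.

```typescript
// Hypothetical example of consuming a compiled world file for a visualiser.
// The field names are invented; see the published schema specs for the
// real urd.json shape.

interface WorldFile {
  scenes: { id: string; exits: string[] }[];
}

// Flatten the world into (from, to) edges, ready for any graph renderer.
function toEdges(world: WorldFile): [string, string][] {
  return world.scenes.flatMap((s) =>
    s.exits.map((e): [string, string] => [s.id, e])
  );
}

const raw = `{
  "scenes": [
    { "id": "foyer", "exits": ["bar", "cloakroom"] },
    { "id": "bar", "exits": [] },
    { "id": "cloakroom", "exits": ["foyer"] }
  ]
}`;

const world = JSON.parse(raw) as WorldFile;
const edges = toEdges(world);
console.log(edges.length); // 3
```

A node-based editor, a Monte Carlo tester, or a runtime would all start from the same step: parse the JSON once, then work on plain data.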

However, I think there is a lot of value in very specific tools just for the writer, with good ‘code’ completion and LSP features. Edit: One thing I am personally very excited about is observability and verification: being able to use ‘symbols’ and see “where the heck that behaviour came from”. I think this is vital for anything that brings value and safety to an AI-assisted step further up the chain. Don’t ask me about the details, it’s very conceptual :smiley:

The process went from spec definition → early compiler build → test engine in less than 7 days.

This is wild!! I need to make more changes to the specs to add support for two missing features, so I can do “Cloak of Darkness” using this pipeline.

1 Like

Here is a brief update:

The heavily AI-assisted, specs-driven development is coming along nicely (what a mouthful).

  1. Exactly 25 days after the first document, “Landscape analysis & gap assessment”, was completed, it looks like this has a real chance of going from complete moonshot to an early alpha release
  2. A particular shoutout goes to Daniel and Tara for the invaluable feedback and challenges to the project, which came at exactly the right time. Thank you!
  3. There are three key artefacts that capture what Urd is, what it will become, and what it is not. These are:
    1. Architectural Boundaries - governance for v1 (article): ( The Document to Read First · Urd )
    2. Architectural Boundaries - governance for v1 (actual document): ( Architectural Boundaries · Urd )
    3. An essay “The world is not the interface”: The World Is Not the Interface · Urd
    4. v1 Completion Gate: The Gate · Urd
    5. The “a quiet introduction” page has also been updated

There is a lot of content here, and many things that emerged as part of the process are now clarified and verbalised. A very good way to get an overview is to ask your favourite LLM to fetch the latest update and give you a tl;dr. Since AI-generated posts are not permitted, I won’t drop the summary here.

The site is heavily optimised towards being indexed by LLMs and AI search engines, so they do a really good job of explaining Urd and Wyrd. Better, in fact, than I ever could when put on the spot.

To say that the latest documents reflect what I always wanted to do from the get-go would be delusional. To say that the process is yielding the kind of forward momentum I was hoping for is true. I think AI has become a lot more than just an incredible code printer.