There have been some interesting experiments in the “AI writing tools” world with summarization loops, where the LLM is asked (behind the scenes) to generate summaries of its own output and keep certain state-tracking documents up to date. These are then appended to subsequent queries to improve self-consistency. This is still not a perfect solution (and I think current LLM development is more focused on the “ideal” approach of improving the LLM’s comprehension of a single unsummarized input stream), but the continued improvement in context-consistency in, e.g., GPT-4o suggests that this is an area in which LLMs can continue to advance.
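To make the summarization loop concrete, here's a rough Python sketch of the idea, not a description of how any particular tool actually does it. Everything named here (`StoryState`, `take_turn`, the prompt wording) is my own invention for illustration, and `llm` is assumed to be any function that takes a prompt string and returns a completion string.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Assumption: an "LLM" is just any function from prompt text to completion text.
LLM = Callable[[str], str]


@dataclass
class StoryState:
    """Hypothetical state-tracking documents kept up to date between turns."""
    summary: str = ""                               # running summary of the story so far
    facts: List[str] = field(default_factory=list)  # author- or bot-maintained notes


def take_turn(llm: LLM, state: StoryState, player_input: str) -> str:
    # 1. Attach the tracked state to the player's input before querying the model.
    prompt = (
        f"Summary so far: {state.summary}\n"
        f"Known facts: {'; '.join(state.facts)}\n"
        f"Player: {player_input}\n"
        "Narrator:"
    )
    output = llm(prompt)

    # 2. Behind the scenes, ask the model to fold its own output into the summary,
    #    so the next turn sees a compact record instead of the full transcript.
    state.summary = llm(
        "Update this summary with the new text.\n"
        f"Old summary: {state.summary}\n"
        f"New text: {output}\n"
        "New summary:"
    )
    return output
```

The point of the second call is that the summary, not the raw transcript, is what carries forward, which is also why errors in the summary compound over time.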
For decades (!) there has been a pattern where newbies will pop up saying, “AI will solve all our parsing problems!”, and then slowly realize why AI does not, in fact, solve all our parsing problems. I still don’t think LLMs have much to offer traditional parser IF, but I’m starting to believe in the feasibility of a wholly different sort of parser IF where the author defines the geography, story, puzzles, characters, etc. in fairly granular detail, but the actual physical interaction and mechanical prose is simulated (hallucinated?) by an LLM.
A few tools (like AI Dungeon) have started to go down this path by maintaining basic human-generated worldbuilding reference books, but without a really rigorous system of state-tracking this still rapidly descends into stream-of-consciousness weirdness. But what if, rather than just trying to provide loose guardrails for creative hallucination, there was a whole system of prompts, state-tracking feedback loops, handwritten prose, and even bespoke summarization and revision bots all laser-focused on keeping the world and story in lockstep with the author’s original intentions? In other words, what if instead of improving the LLM’s creativity, we focused on limiting that creativity so that it acts as directly as possible as a simple linguistic bridge between the author’s intentions and the player’s experience?
(For example, one part of this system might be a secondary revision loop that takes the LLM’s initial draft output and screens it for consistency with the state model, aggressively revising or excising any references to anything - objects, events, whatever - that isn’t explicitly mentioned in the world model and plot databases.)
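Here's a very rough Python sketch of what the "excising" half of that revision pass could look like. This is only an illustration under heavy assumptions: the function names are made up, the world model is reduced to a flat set of allowed entity names, and the mention extractor is a naive word-list match standing in for what would realistically be another LLM call or a proper entity recognizer.

```python
import re
from typing import Callable, Iterable, List, Set


def split_sentences(text: str) -> List[str]:
    # Crude splitter: break after ., ! or ? followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def naive_mentions(
    sentence: str,
    vocabulary: Iterable[str] = ("lamp", "sword", "dragon", "key"),
) -> List[str]:
    # Hypothetical stand-in extractor: report which watch-listed words appear.
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    return [v for v in vocabulary if v in words]


def screen_draft(
    draft: str,
    allowed: Set[str],
    extract_mentions: Callable[[str], Iterable[str]] = naive_mentions,
) -> str:
    """Drop every sentence that mentions something absent from the world model."""
    allowed_lower = {a.lower() for a in allowed}
    kept = []
    for sentence in split_sentences(draft):
        mentions = {m.lower() for m in extract_mentions(sentence)}
        if mentions <= allowed_lower:  # keep only if every mention is sanctioned
            kept.append(sentence)
    return " ".join(kept)


draft = "You pick up the lamp. A dragon swoops overhead! The key glints in the dust."
print(screen_draft(draft, allowed={"lamp", "key"}))
```

With only the lamp and key in the world model, the dragon sentence gets cut and the other two survive. A real version would presumably revise rather than just delete, and would need to catch invented events as well as invented objects, which is the genuinely hard part.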
The types of IF experiences well-suited to this sort of system would naturally be different than those well-suited to traditional parser IF, but I think there are some. For example, LLMs are particularly well-suited to “type anything”-style conversation with NPCs, if given enough guardrails for consistency.