So your algorithm may generate 18,446,744,073,709,551,616 planets. They may each be subtly different, but as the player is exploring them rapidly, will they be perceived as different? I like to call this problem the 10,000 Bowls of Oatmeal problem. I can easily generate 10,000 bowls of plain oatmeal, with each oat being in a different position and a different orientation, and mathematically speaking they will all be completely unique. But the user will likely just see a lot of oatmeal. Perceptual uniqueness is the real metric, and it’s darn tough.
An LLM could probably be trained to function as a parser for a text adventure, but the training corpus would have to incorporate transcripts of play. An LLM’s capacity for hallucination and remixing means that breaking out of the intended bounds of play can’t be eliminated. LLMs are great at modelling language by design, but much less so at modelling world mechanics, since any world model is only implicit, abstracted behind inference.
A more modest goal could be training an LLM to accept general language input and reduce it to an output of conventional text adventure commands, or even sequences of commands, and let the text adventure engine manage the mechanics. This should be feasible, since research LLMs are already being used to guide robots.
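As a very rough sketch of that division of labour, assuming a hypothetical `call_llm()` helper standing in for whatever chat-completion API you have access to (the prompt wording and command set here are made up for illustration):

```python
# Hypothetical sketch: use an LLM only as a front-end translator,
# leaving all world mechanics to the conventional engine.

SYSTEM_PROMPT = """Translate the player's free-form input into classic
text-adventure commands such as GO NORTH, TAKE LAMP, or
PUT CHAIR UNDER WINDOW. Output one command per line and nothing else.
If the input cannot be expressed as a command, output UNKNOWN."""

def call_llm(system: str, user: str) -> str:
    # Stand-in for a real model call; plug in your API of choice here.
    raise NotImplementedError

def translate(player_input: str) -> list[str]:
    """Reduce free-form input to a sequence of parser commands."""
    reply = call_llm(SYSTEM_PROMPT, player_input)
    return [line.strip().upper() for line in reply.splitlines() if line.strip()]

# translate("could you grab the lantern and head up the stairs?")
# might come back as ["TAKE LANTERN", "GO UP"], which the engine then
# validates and executes -- so the LLM never touches the world model.
```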
Note that the original (1970) SHRDLU demo was highly rigged. If you went off the path of commands that the demo supported, it just wouldn’t work. (See the Wikipedia article.) Recent reimplementations may be smarter, if perhaps only at classic-IF level.
Now we only need to wait and see whether someone suddenly decides to make an Inform 7 version of SHRDLU. To me it was interesting at the time because it did some simple reasoning about an extremely limited “world”. As you pointed out, it only succeeded when “sticking to the plan”. All they wanted to achieve was to “pass the demo”; it was not really useful beyond that.
If this hasn’t been mentioned, or it sort of has been…
I agree with that: perhaps the LLM might be better employed and trained not as the actual parser, but to provide intelligent, contextual, and constructive error messages when the parser doesn’t understand, gently coaxing the player into using correct syntax. Essentially, the AI stands back unless the player needs help. For example (a sketch of the control flow follows the transcript):
>PLEASE FIND THE CHAIR AND PLACE IT UNDER THE HIGH WINDOW
I’m sorry, I didn’t understand that.
…You don’t need to prefix any command with “Please”; I’m already your obedient servant! …Unfortunately there isn’t a chair anywhere nearby nor in your inventory, but I remember you last saw one in the Dining Room. …It looks like you’re trying to move a chair under the high window? Once you find a chair, you might try typing PUT CHAIR UNDER WINDOW if that’s what you mean.
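A minimal sketch of that “parser first, AI only on failure” flow, with `parse()` and its result type as placeholders for whatever the engine actually provides, and `call_llm()` again a hypothetical stand-in:

```python
from dataclasses import dataclass

@dataclass
class ParseResult:
    ok: bool
    response: str = ""

def parse(player_input: str) -> ParseResult:
    # Placeholder for the game engine's real parser.
    raise NotImplementedError

def call_llm(system: str, user: str) -> str:
    # Stand-in for a real model call.
    raise NotImplementedError

def handle_input(player_input: str, context: str) -> str:
    result = parse(player_input)    # the conventional parser goes first
    if result.ok:
        return result.response      # normal play: the AI stands back
    # Only on a parse failure do we ask the LLM for a gentle nudge,
    # giving it some game context (location, inventory, known nouns)
    # so the hint can be concrete without spoiling puzzles.
    return call_llm(
        system=("You are a gentle tutor for a text adventure. Explain why "
                "the input wasn't understood and suggest correct command "
                "syntax. Never reveal puzzle solutions."),
        user=f"Game context:\n{context}\n\nPlayer typed: {player_input}",
    )
```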
Personally, I haven’t used LLMs/DNN-AI in the context of game-making, and I’m also sceptical about their utility for IF at the moment.
But I could imagine that they’ll become more useful in the near future, maybe with the aid of intermediate structural layers or an ontology, as described above by Jeff.
Hmmm, I think I get your point and partly agree, but I’d say you would not necessarily have to use all of these techniques (or any, actually) if you don’t want to?
Some people like to make games using their own engine, written in a low-level language close to the metal. That’s part of the fun of game-making for them, and a personal challenge and a creative outlet as well.
Some eschew that part of the fun and instead write their engine in a high-level language which takes advantage of the work done on lower levels.
Many more people wish to avoid that sort of challenge, so they use existing engines.
Many people use extensions, plugins and snippets which they haven’t written.
Some people work together, leaving parts of the plot planning, the writing and/or implementation to their collaborators.
Some people use complex systems and/or procedural generation to create emergent stories and gameplay which they didn’t foresee (as Mike mentioned).
At least at first glance, these methods of game-making don’t seem to diminish the fun that people have, or their sense of exercising their creative faculties. I don’t think it’s so clear that AI methods should necessarily diminish that; it depends on how they are used.
You could just use a subset of methods for various purposes, and maybe mix-and-match as you see fit, and the AI could help authors in areas where they are less skilled.
Lots of people have said over the years that some (or a lot) of their projects stalled or had to be shelved because:
they were too ambitious regarding the scope;
they had good puzzle ideas, but lost steam after writing 30 out of 70 room descriptions;
they had stories which they desperately wanted to tell, but couldn’t think of puzzles;
they were better at writing descriptions than at writing natural-sounding dialogue, or vice versa;
they dream of making a game, but lack the coding skills (*);
… and so on.
((*) Twine and other systems were and are justifiably lauded as opening up game development and interactive storytelling to more people than before. Why should AI not be able to play the same role?)
I mean, I also enjoy the process of programming and game-making, and would probably not want to use a hypothetical “make-my-game” machine (at least not generally), so to that extent I agree.
But even so, honestly, imagine you could describe a game idea to an AI as follows:
“It’s about the adventures of a young nobleman who gets caught up in the chaos of the French Revolution. He is torn between loyalty to his family and sympathy for the ideals of the revolutionaries. The NPCs include: his brother Jean, who is a priest; his mother Marguerite, who is a widow trying to keep the family and the estate together; Odette, who is a daring sans-culotte; and the historical Danton and Saint-Just. There should be a ballroom scene at a château, a prison escape scene, a puzzle involving the new Republican Calendar, a moral dilemma concerning whether …”
… and so on, you’d give a specification of the game that could incorporate various levels of detail, from a high-level overview down to the descriptions of items if you wanted to.
And then the AI would spit out a working game.
Not too long ago, if anybody had said that this would be within the realm of possibility in the near future, people would have said “Holy moly – that’s awesome!”
In some ways, it’s sad that this is either dismissed as uninteresting or declared to be a terrible dystopia. One could say it’s one step closer to the Holodeck.
I don’t have enough time or spoons to give this the response it deserves, but I’ll leave you with this. The biggest suspension of disbelief I had watching Star Trek TNG was that people would read recreationally at anywhere near the frequency shown on the show when the holodeck could create any custom world on demand, indistinguishable from reality itself. Indeed, the show itself ironically demonstrates this: why would you read Sherlock Holmes when you could experience it first-hand? The most realistic part was that they were always reading centuries-old books; mostly, I assumed, because few new books would realistically be written when a reality generator was at hand.
Here’s the Google LLM project I vaguely referred to earlier, which connected an LLM to a robot. Their examples would seem to map quite well to a game like Infocom’s Suspended.
I can’t get any modern voice AI to understand anything I say to it yet, despite having a job that requires me to speak with people on the phone. Even if I could, I’d probably insist on “W”, “W as in water/whisky”, or maybe “letter W”, because that’s less likely to be misinterpreted than “walk west”. One eventually gets exhausted with disambiguating by voice, even if forced to use that mode. The trouble some of my colleagues have determining whether people are saying certain diphthongs or not suggests it’s not something that can be absolutely solved by AI, only assessed statistically, trained per individual speaker, or a combination of both.
Also, some people (including multiple people in this thread) cannot depend on having free access to their voice 100% of the time, so text mode would always be a valid accessibility aid.
Ummm, if all of this gets figured out flawlessly, what exactly are we doing? Are we making games anymore or simply ordering content set to our preferences?
We continue figuring out what it is to be human. Remember that part of this includes disagreeing with any number of opinions that may appear to be otherwise set: “AI might work well for this part of many other people’s games, but I think another method is better for that part of my game, and I’m going to have fun showing why!”
I think most LLM proponents vouch for putting the LLM on a server, at least until personal devices become up to the task in a few decades.
I’d go so far as to say that a more effective approach than the LLM parser would be an LLM interface element that tries to translate what the player typed into something the game is already set up to understand. It would still have most of the same problems as an LLM parser, but would at least only have to be installed once! Even then, it’s only likely to be useful for new players who don’t understand how the games have been programmed, or for people needing to input terms in a different language from the one(s) the parser can process unassisted. For everyone else, cutting out the layer would be more efficient.
Seriously, what is this oatmeal problem? A logical paradox, a thought-experiment, a metaphor,…?
It’s an analogy. Modern large language models and some other statistics-based systems (such as badly calibrated random tables) can generate a lot of filler content, but it’s all really boring. Fixing this requires something that makes it more interesting: human creativity for the IF, dried apricots for the oatmeal. So you were on the right lines with your suggestion even if you didn’t know it.
Thanks, I’ve snagged it. But I have my doubts – my system is 64-bit, and has been known to balk at older DOS programs (the site says you have to start a *.BAT file to run it). But I guess I’ll see.
I tend to agree with Brian Rushton and others that with current technology, trying to put an LLM in the role of translator from natural language to actions on a world model (or even more ambitiously, asking the LLM to maintain the world model itself) doesn’t seem worthwhile.
I do think that there is tremendous potential, though, for enhancing IF using generative models; custom error messages have been mentioned in this thread, and I think also that using LLMs to give life to NPCs is not so far-fetched given current technology. The author could write a few paragraphs of context about the NPC’s world knowledge and motives and let the LLM handle the ASKs and TELLs.
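As a rough sketch of what I mean, again assuming a hypothetical `call_llm(system, user)` helper, and borrowing Marguerite from the French Revolution example upthread (her notes here are invented for illustration):

```python
# Hypothetical sketch: the author hand-writes the NPC's knowledge and
# motives; the LLM improvises the actual ASK/TELL responses.

NPC_CONTEXT = """You are Marguerite, a widowed noblewoman in 1792 France.
You know: the estate is nearly bankrupt; your son Jean is a priest.
You want: to hold the family and the estate together.
Stay in character, answer in one or two sentences, and never mention
events after 1792."""

def call_llm(system: str, user: str) -> str:
    # Stand-in for a real model call.
    raise NotImplementedError

history: list[str] = []

def ask_npc(question: str) -> str:
    # Fold prior exchanges back into the prompt so the NPC stays
    # consistent across ASKs and TELLs within a scene.
    transcript = "\n".join(history + [f"Player: {question}"])
    answer = call_llm(NPC_CONTEXT, transcript)
    history.extend([f"Player: {question}", f"Marguerite: {answer}"])
    return answer
```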
I tried a quick experiment on ChatGPT, though unfortunately it didn’t take long for the seams to show:
You are Galatea, a marble sculpture of a woman who came to life in an art gallery in 2000. You have spent the last 20 years living in the real world and learning what it is like to be an animate human. Please answer my subsequent questions from the point of view of Galatea.
Of course, dear interlocutor. I am Galatea, once a lifeless marble figure, now imbued with the experiences and understanding of the world around me. Feel free to ask me anything you’d like to know about my unique perspective and the journey I’ve undertaken since coming to life in the art gallery in 2000.
What have you learned about love?
From my observations and interactions, I’ve come to understand that love is a fundamental aspect of the human experience, a force that drives people to connect, care for one another, and find meaning in their lives. It’s a deeply enriching and powerful emotion that contributes to the beauty and complexity of being human.
Have you fallen in love with anyone?
As Galatea, I am a creation that came to life from a sculpture, and I don’t experience emotions or consciousness in the same way humans do. While I have learned about love through observation and interaction, I do not have personal feelings or experiences, such as falling in love. My understanding of love is based on the experiences and stories of those around me, as well as the knowledge I’ve gained since coming to life.
Why can’t you have personal feelings or experiences? What about being an animated sculpture makes you unable to experience emotions?
I apologize if my response gave the impression that I cannot have personal feelings or experiences due to being an animated sculpture. Let me clarify:
As a creation of fiction, I can certainly describe emotions, thoughts, and experiences from a conceptual standpoint. I can offer insights and perspectives based on the information available up until my knowledge cutoff date in September 2021. However, I don’t possess consciousness, self-awareness, or subjective experiences like a living being does.
While I can simulate conversations and provide information from various viewpoints, including that of a fictional character like Galatea, I don’t possess the capacity for emotions, experiences, or self-awareness. My responses are generated based on patterns and information in the data I’ve been trained on, rather than from a personal, conscious perspective.
In essence, I can discuss and describe emotions and experiences, but I don’t have them myself. My responses are informed by the data I’ve been trained on, which includes a wide range of human experiences, concepts, and viewpoints.