Yeah, this is interesting. One of the more salient aspects I see all the time in gaming discussions is people want to feel like they’re in a “lived in world” but also one that is responding to what’s going on – not just with what the player does, but also with dynamic events not generated by the player.
Yet people do want a consistency that allows for a coherent story or theme.
Scenario
So let’s say you have a text adventure with a defined beginning and end. And a story that the author wants to convey. But the pacing of the story and the events the player encounters are all based on what they do in going from the beginning to the end, of course.
Traditionally, a lot of the development time goes into accounting for how the player might encounter things and possibly gating them so that they only encounter things in a given order. (This is the same thing “open world” games had to adapt to as they removed a lot of the linearity in pathing, which led to conformity of experience.)
Structure
So let’s assume the story is modeled as an ontology as is the game world (“world model”).
In that case, could an LLM based approach along with a command interpretation layer and perhaps an ontology API having some interesting possibilities here?
I think yes.
The command interpretation layer powered by the AI could allow players to input natural language commands, even if they are not predefined in the game’s parser. They don’t have to be long commands but they could be. So you don’t necessarily worry about how much or how little people want to type. You have a system that adapts. And I agree: this would largely be a classification exercise to an extent.
So, first, an LLM can serve as the tool for generating dynamic narrative content. As players progress through the game, the LLM can create personalized descriptions, dialogues, and events that adapt to the choices and interactions the player has.
However, by using the LLM-generated content with an ontology API, the game can dynamically adjust the pacing of the story based on player decisions. Significant choices could trigger branching storylines, unexpected events, or different character interactions.
An Ontological Mystery
So let’s say we create an ontology for a mystery story where the player’s choices impact the outcomes of various events.
I work with OWL (Web Ontology Language) a lot, so let me give an idea of how I might start doing this:
Ontology: <mystery-story>
Class: ex:Story
Class: ex:Branch
Class: ex:Path
Class: ex:Event
Class: ex:ConvergencePoint
Individual: ex:MainStory
Types: ex:Story,
ex:Branch ;
Facts: ex:hasPath ex:MainPath .
Individual: ex:BranchA
Types: ex:Branch ;
Facts: ex:hasPath ex:PathA .
Individual: ex:BranchB
Types: ex:Branch ;
Facts: ex:hasPath ex:PathB .
Individual: ex:ConvergencePointA
Types: ex:ConvergencePoint ;
Facts: ex:leadsTo ex:MainPath .
Individual: ex:ConvergencePointB
Types: ex:ConvergencePoint ;
Facts: ex:leadsTo ex:PathB .
Individual: ex:MainPath
Types: ex:Path ;
Facts: ex:hasEvent ex:Event1,
ex:Event2 ;
ex:hasConvergence ex:ConvergencePointA .
Individual: ex:PathA
Types: ex:Path ;
Facts: ex:hasEvent ex:Event1,
ex:Event3 ;
ex:hasConvergence ex:ConvergencePointA .
Individual: ex:PathB
Types: ex:Path ;
Facts: ex:hasEvent ex:Event1,
ex:Event4 ;
ex:hasConvergence ex:ConvergencePointB .
Individual: ex:Event1
Types: ex:Event ;
Facts: ex:description "Player starts the story." .
Individual: ex:Event2
Types: ex:Event ;
Facts: ex:description "Player collects evidence." .
Individual: ex:Event3
Types: ex:Event ;
Facts: ex:description "Player confronts a suspect." .
Individual: ex:Event4
Types: ex:Event ;
Facts: ex:description "Player gains detective's trust." .
Granted, this is bare-bones here so my apologies on that. But the ontology, simple as it is, captures the structure of the branching narrative by defining story branches, paths, events, and – crucially – convergence points.
Okay, so then how does this allow for a sort of emergent gameplay where the player can still have a consistent story experience but a very tailored ludic experience?
Each path within the ontology represents a unique sequence of events and choices that the player can take. By following different paths, players can thus have individualized gameplay experiences tailored to their choices.
Yeah … okay. But that’s sort of like what we can do now, right?
So the convergence points in the ontology come into play here. They represent moments where different branches come back together. These points allow for emergent gameplay, as players can take varied paths and still arrive at shared narrative moments, ensuring that consistent story experience.
But … wait? Is this really emergent at all? Let’s ask it this way: would this structure allow for interactions not programmed in assuming the LLM and ontology API layers were operative?
Alright, so we have a murder mystery we’re talking about. In that context, let’s say the player could witness a bit of dialogue between two characters if the player happens to be at the right place at the right time. (Maybe the “right place at the right time” will differ because the two characters are not on set paths, but rather guided by events. Meaning, the dialogue takes place when they happen to meet up in the same location.)
But it’s also the case that the two characters will eventually go their own way. One of those characters, however, has a crucial piece of evidence in their pocket that they get from the other character.
So what can happen here?
Well, the player could intercept the conversation and try to get the evidence on the spot. Or the player could watch the characters behind cover, see the transaction, and then follow the character with the incriminating item. The player could try to pickpocket the item. Or they could confront the character. Or maybe just continue to wait and see what the character does with it, if anything. Or maybe the character eventually hangs up their jacket in a closet.
But … how about this scenario: perhaps a random encounter happens. The character with the evidence bumps into another character and that causes the item to fall out of their pocket.
Can all of this could be modeled with what I’m talking about?
Maybe?
Modeling the Mystery
We could certainly model the dialogue between the two characters as events or interactions within the ontology. The player’s presence or absence at the location can trigger these events. (Note: even their absence can trigger this. This leaves entirely open what else can; that’s where the emergence would come in.)
Equally certainly the characters’ movements and actions can be represented as part of the ontology. The ontology can keep track of their current locations, planned paths, and interactions with the player or other characters.
Planned paths? But what about unplanned paths? Based on characters’ behaviors and the game’s context, the ontology could be set up such that the characters can make dynamic decisions about where to move next. For instance, a character might decide to move to a location where they heard an important conversation is taking place or when they come into possession of something that they believe is important. Or maybe the character has gotten suspicious of the player and actively tries to go only where the player isn’t or where the player can’t go. (These would be modeled as somewhat equivalent to weights and biases in the model.)
The evidence itself can be an entity within the ontology, linked to the character who possesses it. The ontology can define rules for how the evidence can change hands and how it can be interacted with. And not just that piece of evidence, but anything that can be treated as evidential in the story. (These would be like attention masks applied to the entities.)
Here’s where we go to your interpreting command part as well. Player actions such as intercepting the conversation, pickpocketing, or confronting characters can be mapped to specific events or interactions within the ontology. The LLM-based layer can interpret the player’s commands and trigger the corresponding ontology events. But the range of possible interpretations could be interesting. For example:
FOLLOW THE SUSPECT UNTIL THEY STOP
Or:
WHEN THE SUSPECT STOPS, APPROACH THEM AND LOOK AT THEIR POCKET
Or:
TAKE THE EVIDENCE FROM THEIR POCKET WHEN THEY ARE DISTRACTED
I’m probably not conveying this well but the idea is you could open up a whole range of interaction. The LLM-based interpretation layer processes the input and extracts the intent, actions, and relevant entities.
Crucially, the interpretation layer must identify the main intent of the command, which is to take the evidence from a character’s pocket. But the intent is also to do so stealthily or without causing a scene or, perhaps more crucially, without the suspect knowing.
The interpretation layer then recognizes the entities in the command:
- “THE EVIDENCE” as the item to be taken.
- “THEIR POCKET” as the location of the evidence.
- “WHEN THEY ARE DISTRACTED” as a condition.
Key to this is that the ontology stores information about characters, their pockets, and conditions for distraction. (And this is just for pockets! We can imagine many other scenarios here.) The ontology, remember, defines relationships between characters, items, and conditions on a very broad scale.
The interpretation layer generates a query for the ontology based on the extracted intent and entities. The query then seeks to find a way to fulfill the command, just as any AI-based task is handled by a learning model. This is where the inherent prediction-based nature of AI comes in.
The ontology responds to the query by checking if the conditions are met. It assesses, for example, whether the character is indeed distracted and if the evidence is in their pocket. (What if, for example, the character dropped the evidence in the trash when the player wasn’t aware of that because they weren’t in the same location?)
The concept of “being distracted” can encompass a range of possibilities, of course.
So you could create “distractor entities” within the ontology. These could include things like loud noises, sudden events, engaging conversations, unexpected occurrences, and whatever else. Then represent different states that characters can be in, including “distracted.” The ontology can define factors that contribute to a character’s distraction, such as their focus, attention, and emotional state. (Example: character recognize they are being followed by the player and is no longer distracted but very, very focused.)
Then you model events that can serve as distractions in the game world. These events might be associated with specific locations, characters, or conditions. For instance, a loud crash in the next room could be a distraction event. Or the character happening to run into another NPC who stops them to talk. Or the player gets another character to call the suspect. Imagine if you could do something like:
SUSAN, CALL THE SUSPECT AND KEEP THEM TALKING FOR A FEW MINUTES
So our ontology starts to look like this:
Ontology: <distraction>
Class: ex:CharacterState
Individual: ex:Distracted
Types: ex:CharacterState .
Class: ex:DistractorEntity
Class: ex:DistractingEvent
Individual: ex:LoudNoise
Types: ex:DistractorEntity,
ex:DistractingEvent .
Class: ex:InteractionModifier
Individual: ex:PickpocketModifier
Types: ex:InteractionModifier ;
Facts: ex:enhances ex:PickpocketingInteraction .
Class: ex:Interaction
Individual: ex:PickpocketingInteraction
Types: ex:Interaction .
Individual: ex:Event
Types: ex:Interaction ;
Facts: ex:description "A general game event." .
Individual: ex:DistractedByEvent
Types: ex:Interaction ;
Facts: ex:description "Character gets distracted by an event." ;
ex:requires ex:DistractingEvent .
ObjectProperty: ex:enhances
Domain: ex:InteractionModifier
Range: ex:Interaction
ObjectProperty: ex:requires
Domain: ex:Interaction
Range: ex:DistractingEvent
So broadly, but also simplistically, speaking, the ontology provides a structured foundation for tracking and reasoning about all these interactions, while the LLM layer enhances player engagement through intent-based natural language interactions and tailored responses based not only on how the intent was expressed but also on the conditions that might make the intent easier or harder to implement.