Microsoft Copilot is Introducing IF to Wider Audiences

So I’ve just started using Microsoft Bing as a browser on my phone, and I found a feature where I can use AI with simple taps to perform different kinds of actions. One of the categories is “fun”; I opened it and there’s an Interactive Game feature.

This got my attention because I was pretty sure it was going to be an IF game. And yes, I was right. It’s definitely an IF game created spontaneously by Copilot AI, in a choose-your-own-adventure style.

A while ago, I posted a topic about playing IF using ChatGPT, and I talked about how the way people interact with AI nowadays is, surprisingly enough, nearly identical to how people play IF.

Then I found this feature, purposely made by Microsoft for users to play IF. At the time I wrote this post, the feature had already been used 90.5k times. In other words, IF is being introduced to wider audiences in this era of AI. Thanks to Microsoft.

But I’m quite skeptical about the involvement of AI technology in this particular circumstance, and now I’m not sure how I’m supposed to react to this kind of situation.

Here is the screenshot I took.

I’m open to discussion and eager to hear your opinions about this.

3 Likes

How far did you get before it started losing the thread?

I’m not really playing it, lol. I don’t use Copilot as much as ChatGPT. I’ll probably ask the AI to do a parser-based game as a stress test lmao. We’ll see

Most of these “AI who can tell an interactive story” things are generally experimental and tend to be like storytelling improv you’d do with a friend where you say “I was walking down the street and I met a duck” and your friend responds “That’s no duck, that’s my best friend Percy and we decided to go to the grocery store.” “At the grocery store we bought bananas and a loaf of bread for Percy, but the cashier ripped off a mask and was a troll in disguise! Battle ensues!”

These are kind of fun to play with, but they tend to be somewhat absurd: the AI just uses what it knows about the things you tell it about to weave a narrative. There’s very little structure in most cases. Some will barely pay attention to what you’re doing and just keep modifying the scenario; some will play along with what you ask them to do.

This form of AI improv isn’t great at telling a satisfying story, since it’s basically making things up as it goes along. AI tends not to have much of a long-term memory, so it will contradict itself about facts it gave you ten turns ago; most AIs can’t remember everything they say. This is why people run into situations where an AI seems to be “lying” to them.

It doesn’t build up a world model as it goes and doesn’t care if you’re in London one moment and Chicago the next, since it’s just throwing everything at the wall.

For AI to function as a GM, it tends to need a database of game rules and a world model to follow for consistency. We had a thread about a version of ZORK that basically plays as normal, but it can use the AI to improvise appropriate responses to actions that would normally not be understood and would throw a parser error. I was shocked when PARKOUR THROUGH WINDOW worked and LISTEN FOR HOUSE OCCUPANTS actually gave a legit response.

The AI isn’t making up the story, it’s just helping to tell it.
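
If I had to guess at the shape of that hybrid, it’s roughly “try the real parser first, only hand off to the AI when it fails, and feed it the actual world state so it can’t contradict anything.” Here’s a toy sketch of that pattern — every name in it is invented for the example, and it is not the actual implementation from that thread:

```python
# Toy sketch of the "parser first, AI fallback" pattern. All names here
# are invented for illustration; llm_complete() stands in for whatever
# model API the real project calls.

KNOWN_VERBS = {"look", "take", "drop", "open", "go", "inventory"}

def llm_complete(prompt: str) -> str:
    """Placeholder for a real LLM call."""
    return "[improvised in-world flavor text would go here]"

def handle_command(room_description: str, visible_objects: list[str],
                   command: str) -> str:
    words = command.lower().split()
    if words and words[0] in KNOWN_VERBS:
        # The deterministic parser and rules engine stay authoritative.
        return f"(the normal parser handles '{command}')"

    # The parser would normally throw an error here. Instead, let the AI
    # improvise a response, constrained by the real world model and
    # forbidden from changing game state.
    prompt = (
        f"Current room: {room_description}\n"
        f"Visible objects: {', '.join(visible_objects)}\n"
        f"The player typed: {command}\n"
        "Write one brief in-world response. Do not move the player, "
        "create objects, or change any game state."
    )
    return llm_complete(prompt)

print(handle_command("A drafty parlor with a cracked window.",
                     ["window", "rug"], "parkour through window"))
```

The rules engine still decides what’s true; the AI only narrates around the edges.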

5 Likes

I downloaded Bing to check it out. It’s pretty cool. Reminds me of chat.ai or those other character/scenario generators.

How far did you get before it started losing the thread?

I’ve commented in the past about techniques that could improve on this. One of my recent thoughts is whether this is a good use case for AI Agents: have one Agent be the GM and another a HistoryKeeper of sorts, which keeps a memory of everything the player has done, then have the two Agents confer back and forth until they settle on a way forward for the story that doesn’t contradict what has already happened.
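
A minimal sketch of what I mean, assuming nothing about any particular agent framework — ask_llm() is a stand-in for a real model call, and every name here is made up:

```python
# Toy sketch of the GM + HistoryKeeper idea. Everything here is
# hypothetical; ask_llm() stands in for a real LLM API call made with a
# role-specific system prompt.

def ask_llm(role: str, prompt: str) -> str:
    """Placeholder for a real model call."""
    return "OK" if role == "HistoryKeeper" else "[proposed story beat]"

def next_story_beat(history: list[str], player_action: str,
                    max_rounds: int = 3) -> str:
    # The GM drafts a continuation for the player's action.
    draft = ask_llm("GM", f"The player does: {player_action}\nContinue the story.")
    for _ in range(max_rounds):
        # The HistoryKeeper checks the draft against everything that has
        # already happened, and either approves it or lists contradictions.
        verdict = ask_llm(
            "HistoryKeeper",
            "Established facts:\n" + "\n".join(history)
            + f"\n\nProposed continuation:\n{draft}\n"
            + "Reply OK if consistent, otherwise list every contradiction.",
        )
        if verdict.strip() == "OK":
            break
        # The GM revises with the contradictions spelled out.
        draft = ask_llm("GM", f"Revise this beat:\n{draft}\nFix:\n{verdict}")
    history.append(draft)
    return draft
```

It wouldn’t be cheap — every turn costs several model calls — but it attacks exactly the long-term-memory problem described upthread.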

On the other hand, given my experience with AI so far at work, I’m mildly impressed it didn’t outright contradict itself within that first screen or ignore your prompt completely. The signature waffle is present, and I feel a certain loss of coherence towards the end, but AI may finally be learning to mimic the first screen of a text-based, choice-based game.

I am concerned about this, because LLMs are much better at pretending to be creative and hiding their mistakes than at producing useful results.

10 Likes

This. LLMs are extremely stupid and extremely good at masking that fact. Even calling their errors “hallucinations” is generous; a better term is “conflations”, because an LLM is not going to have a single original “thought”. They may be useful in redefining what creativity is, but creative they are not.

Don’t get me wrong. I use LLMs. They are useful tools, perhaps the single most useful tool ever invented.

How is what an LLM does a conflation?

(I don’t voluntarily use LLMs, because the ones I’ve been introduced to have made so many mistakes that more time would be spent correcting them than it would take to start from scratch, and they make me mistrust anyone whose writing shows signs of using the LLMs with which I am familiar… Yes, that includes the LLM my workplace is forcing everyone in my department to use (name of LLM withheld for legal reasons).)

For example, I was doing some research into whether there have been explosions of liquefied natural gas (LNG) carrier ships in harbors. ChatGPT said yes, but it had conflated two events: a ship explosion in a port in Texas (ammunition, I think) and the explosion of a natural gas storage tank near a port in Chicago (if I remember correctly).

1 Like

That does look like conflation.

I’m used to LLMs making blunders like inventing a completely different list of attendees for a meeting, including some who didn’t exist, despite the fact that the file it was supposed to use as its source had a clearly indicated, correct attendee list…

(The LLM had actually been told to convert the minutes into a report of a specific format and style).

I suspect that is conflation as well. The attendees who didn’t exist existed somewhere, somehow, in the LLM’s addled brain, and it didn’t have the basic common sense to restrict its answer to the people on the list.

Or the ability to interpret a prompt which told it to use only information in the minutes and the template. Adventure could do better!