First Full Game available on new LLM Parser -- New Types of Puzzles for IF

Good morning! If anyone has time to give us some feedback, we are using LLMs as an enhanced parser for interactive fiction games, and we finally have a complete game available: Countdown City by thoughtauction

We’re hoping to bring IF to a new generation, and the LLMs enable all sorts of new puzzles and interactions that didn’t exist before. We put 8 of these new (some old – really, the classics still. just. work.) puzzles into a game – each demonstrating a new way the parser works: from NPC interactions, to being able to explore infinite spaces (in this case a scholar’s bookshelf), to crafting devices that adhere to physics and engineering principles. We finally (we think) got the system to transition between rooms and chapters – and produce success and fail states reliably – Would appreciate any feedback!

Game description and Parser descriptions from Itch follow:

Decode. Disarm. Survive.
Bad day to start your new job – Rookie on the Liberty City bomb squad. The world’s gone crazy. Bombs are going off all over the city, and your Sergeant just informed you they are too short-handed to field a team. Which means you’re flying solo. Better learn fast, newbie.

Countdown City features 8 puzzles of increasing difficulty. The first interactive fiction/parser game written for the new story engine, Taleweaver++, brings the world of text adventures to a new generation. Don’t just talk with NPCs - persuade them, threaten them, console them. No more single moves with output from a brittle parser – talk/type with multiple moves and commands – all the while the objects in the story obey the laws of physics and reality. Solve a mystery in a mystery – you are not just fighting the Freedonian separatists – something more sinister is afoot – can you figure it out before all of Liberty City is reduced to rubble?

Notes on the Game
– This is the first game released written for the new AI-enabled interactive fiction parser, Taleweaver++, and each chapter highlights a new feature of the parser.

– Warning: lots of violence (though cartoonish – still violence) including animal cruelty, torture, explosions, scatological humor.

– If you’ve played interactive fiction games before, you will recognize some of these devices/puzzles. However, the new parser enables a whole new set of interactions and games – so enjoy some new experiences, and let us know how to make them better.

– Warning: the final puzzles are a bit difficult, and a couple of them are derived from Oxford/Cambridge admission exams. So if you get all the way through… Congrats, smartie! Consider something more relaxing next, maybe the MENSA exam.

– The AI-parser is pinging a live server, so sometimes there is a lag. Apologies in advance. We are very, very interested in getting your feedback. Our number one goal is to make this game fun. Please help us get there.

How to Play
A note on this type of parser game: This is not your grandfather’s parser game – we’ve tried to harness the full power of the latest LLMs. Rest assured – the story is entirely human-created – the parser/interpreter/translator is a specially trained LLM.

How to Play: Just start talking or typing, depending on how you access the game. If you are new to the genre, welcome. You are in for a metric ton of fun. Just start exploring the room, the objects, the other characters, etc. And feel free to use them how you like – pick them up, throw them, combine them, taste them – whatever strikes you as a way to solve the puzzle. However, sometimes the order in which you do things matters. Proper grammar and spelling, on the other hand, rarely matter. You are welcome to type in complete sentences – or fragments – or even grunts. Just. Start. Talking.

If you have played interactive fiction games/stories before, you will find that this new story engine, Taleweaver++, allows more freedom and unconstrained input. Feel free to type in as many game moves as you like in a single command. Examine six objects all at once. Use as many different verbs, nouns, and adjectives as you want. The world should obey the laws of physics and reality, so many puzzles will have multiple solutions. Your job is to JUST GET THE JOB DONE. Creativity is rewarded.

2 Likes

I had a go, finishing the first puzzle by myself and then the second with the aid of a hint. I don’t think I would ever have finished the second by myself, as the solution doesn’t really make any sense to me as to why it would work. That’s nothing to do with the technology, of course - just a puzzle that doesn’t match my understanding of how the situation described would function.

I’m not going to play any further for now because the delay in processing every input is too aggravating compared to the instantaneous response I’m accustomed to expecting from non-LLM parser games. Interacting with the parser seems to work better than some previous examples I’ve seen of LLM-powered IF - the game rejected my attempts to derail things by going obviously outside the scope of the game (claiming to have superpowers, etc). There’s an awful lot of sarcasm from the parser in the first mission, to the point where it’s difficult to actually focus on the content. I don’t know if that’s an intentional stylistic choice or some kind of artifact of its training/prompt. It got a bit better in the second mission. There are also a few grammatical and punctuation errors; I couldn’t tell whether those were part of human-authored content that had been provided as part of the story, or generated by the LLM. I also spotted one point where what looked like formatting instructions ended up in the text (the literal words “in italics” rather than the text being in italics).

One plus was that where I couldn’t make sense of the initial description of the situation, I was able to ask the game a clarifying question (“are you saying that the red wire comes out of the black box and then goes back into the same box?”) and get a sensible answer.

8 Likes

Thanks Adam – this is very useful feedback. I appreciate it. I can tell you what we were trying to do in the second chapter (I know, if I’m explaining, I didn’t do my job as a writer). The game mechanic we used was HOW you spoke to an NPC – a calm, reassuring tone lowers the heart rate; any sudden movements, questionable actions, or impolitic outbursts raise it. Plus, we used the heart rate as the counter – you’ll find in the other chapters that we could use almost anything as a counter: music tempos, distance from the device, etc. We’ll keep working on improving this – once we get this platform working, we can’t wait to get some real writers in here.
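For readers who want to picture the mechanic, here is a minimal Python sketch of a tone-driven counter. It is not Taleweaver++ code: the tone labels, the numeric values, and the idea that ignoring the NPC raises the rate gradually are all assumptions made for illustration. In the real game an LLM presumably judges the tone of the input, which this sketch does not attempt.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical tone labels and their effect on the heart rate (bpm per turn).
TONE_EFFECT = {"calm": -10, "neutral": 0, "abrupt": 15}


@dataclass
class HeartRateCounter:
    bpm: int = 90        # assumed starting value
    floor: int = 60      # assumed resting rate; bpm never drops below this
    fail_at: int = 140   # assumed threshold for the fail state
    idle_drift: int = 5  # assumed rise per turn in which the NPC is ignored

    def turn(self, tone: Optional[str]) -> int:
        """Advance one turn; `tone` is None when the NPC was not addressed."""
        delta = self.idle_drift if tone is None else TONE_EFFECT[tone]
        self.bpm = max(self.floor, self.bpm + delta)
        return self.bpm

    @property
    def failed(self) -> bool:
        return self.bpm >= self.fail_at
```

With these made-up numbers, the counter moves in small steps in either direction, so the player can see a trend before hitting the fail state.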

Thanks,

M

Needing to consider tone didn’t really come across: it seemed to me like you can keep her heart rate under control by typing TELL HER TO BREATHE SLOWLY every few commands. The one time I failed was because the game decided that it was too many turns since I’d interacted with the NPC, at which point her heart rate instantly jumped from 60bpm to 120bpm.

Adding to this, players generally don’t appreciate when they’re being made fun of while experimenting. It’s a great way to turn players away from actually engaging with the game…just like it did for me.

6 Likes

Thanks Hidnook. Very much appreciate the feedback. Perhaps we could add an option where the player gets to choose the parser tone. But in our beta testing, the sarcastic, sometimes caustic tone was the #1 favorite feature among players who had not played IF before (93% of beta testers). They really grooved on the narration . . . we thought of this, at least for this game, as in the spirit of Hitchhiker’s Guide - channeling our inner Douglas Adams – perhaps more disclosure in the marketing? Or maybe the ability to just tone it down? Lots for us to think on.
Thank you
M

1 Like

There’s a difference between “humorous” and “insulting”, and this LLM model crossed it. Part of the issue may have been that 93% of your beta testers are not IF people, so they don’t know the conventions of the genre. But I would rather have a brief, maybe funny message that I’m doing something wrong rather than a full paragraph about my stupidity. For instance, I typed TRANSCRIPT as my first command, which in many instances would create a transcript of the game. If that’s not an option in this game, I would have been fine with an I didn't understand or You can't do that in this game. Instead, I got this:

Oh, “transcript”? Really? As if I’m some bureaucratic AI keeping minutes for your scrapbook. We’re in a dimly lit room with a ticking bomb, not a courtroom drama. If you meant something else – like examining the room or an object – try being a bit more specific. Otherwise, I’m just going to stand here twiddling my virtual thumbs while the clock runs out.

> no i want a transcript of the game
Ah, persistent, aren’t we? Look, I’m flattered you think I’m some kind of omniscient scribe, but let’s get one thing straight: I’m a bomb-defusing narrator, not your personal stenographer. If I started handing out transcripts, we’d be here all day, and that timer isn’t getting any more patient. Why not focus on, oh I don’t know, something in the room that might actually help you out? Just a thought.

Just to test the LLM, my next command was N, which is universally understood to mean GO NORTH.

> n

Oh, come on now, “n”? Is that your idea of a command? Are we playing some secret code game here, or did your fingers just take a nap midway through typing? If you’re trying to go north, well, buddy, this isn’t that kind of adventure – it’s a one-room wonder with a bomb that’s not exactly waiting for us to explore the great outdoors. Try spelling out what you mean, or better yet, focus on something in the room before we both regret it.

> go north
Oh, heading north? How adventurous! Too bad this isn’t an episode of some fantasy quest where you can just wander off into the wilderness. We’re stuck in this cozy little death trap of a room, remember? No exits, no hidden doors, just you, me, and that cheerfully ticking bomb. If you’re itching to do something useful, maybe try looking around or fiddling with the stuff on the desk instead of playing explorer.

Is it humorous? Sure. Is the idea of having a more adaptable parser, aided by AI, something worth pursuing? Maybe, if it’s done effectively. But the current state of this LLM is taking things too far. If the rest of the game is going to ridicule me for following genre conventions, I’d rather not play.

14 Likes

The biggest problem with the sarcastic tone is that it was constant. Which made the whole thing feel incredibly AI. The worst failing I’ve seen in trying to make AIs write creatively is that they tend to not vary their tone appropriately according to context.

11 Likes

Thanks Hidnook – we most definitely did NOT include IF conventions – that was deliberate. Natural conversation, multiple steps in a command, etc. – we felt that was the more intuitive way to communicate (and to tell you the truth – as someone who has played the old IF games, “transcript” was new to me – never used that one).

Perhaps we add a dictionary with the old commands. This experiment came out of the machine translation space – we initially tried to translate regular English into Infocom parser English (we gave up once we learned we made no one happy).
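A dictionary of the old commands could be as small as a lookup applied before the input reaches the LLM. The Python sketch below is an illustration only: the alias tables and the `expand` function are hypothetical names, not part of Taleweaver++. One-word shortcuts are expanded only when they are the whole input, so a natural sentence that starts with "I" is left alone.

```python
# Hypothetical alias tables: conventional IF shorthand -> plain-English request.
WHOLE_INPUT = {
    "n": "go north",
    "s": "go south",
    "e": "go east",
    "w": "go west",
    "i": "check my inventory",
    "l": "look around",
}
VERB_PREFIX = {"x": "examine"}  # shorthand verbs that take an object


def expand(command: str) -> str:
    """Expand IF shorthand; return any other input unchanged (trimmed)."""
    text = command.strip()
    key = text.lower()
    if key in WHOLE_INPUT:
        return WHOLE_INPUT[key]
    head, _, rest = text.partition(" ")
    if head.lower() in VERB_PREFIX and rest.strip():
        return f"{VERB_PREFIX[head.lower()]} {rest.strip()}"
    return text


print(expand("n"))                    # go north
print(expand("X desk"))               # examine desk
print(expand("I want a transcript"))  # I want a transcript
```

The expanded text would then be handed to the parser like any other sentence, so free-form input keeps working as before.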

What we learned from beta testing was that most players would quit once forced into IF conventions, finding the input too restrictive – they just wanted to talk – or burst into the room, examine every item at once, and shout something silly simultaneously (it’s a fun way to begin a chapter, actually – you get all the info, plus a funny comment from the parser). Our typical player left after a few turns with traditional IF commands.

Let me think on how to tone down the snark, though – that’s not good if you are taking it as an insult.

Thanks again

M

Thanks Phil – good note – I need to think how to get more straight answers – We just added some language that if the same move is repeated three times, the third answer should be short and factual and neutral.
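As a rough illustration of that rule, the Python sketch below tracks how many times in a row the same move has been entered and switches the reply style on the third. The class name, the style labels, and the choice to count only consecutive repeats (ignoring case and spacing) are assumptions; in the game the rule is described as added prompt language, not code.

```python
def normalize(move: str) -> str:
    """Treat moves that differ only in case or spacing as the same move."""
    return " ".join(move.lower().split())


class ToneSelector:
    """Pick a reply style; switch to neutral once a move repeats `limit` times."""

    def __init__(self, limit: int = 3):
        self.limit = limit
        self.last = None   # last normalized move seen
        self.streak = 0    # how many times in a row it has been entered

    def style_for(self, move: str) -> str:
        key = normalize(move)
        self.streak = self.streak + 1 if key == self.last else 1
        self.last = key
        # From the third identical move onward, answer short and factual.
        return "neutral" if self.streak >= self.limit else "snarky"
```

A different move resets the streak, so the narrator's usual voice returns as soon as the player tries something new.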

On the other hand, I am also hoping that these chapters are more puzzles, and less “exploration” as in trad IF. By that I mean, you can use a single sentence to get all the info from the room or the objects, etc. So what would have taken you 10 commands in IF only takes one command here.

e.g. “go north, open door, enter the kitchen, open drawer, open oven, examine drawer, take knife” would become “head to the kitchen and root around to see if there’s anything useful... you see anything?” So you get the exposition and one snarky comment, versus 8 snarky comments with regular commands.

Maybe better explanation of how to interact with the game might be in order.

Another example – in one room that is being remodeled, there are 30 construction objects/tools lying around, and the player’s job is to describe how they would combine SOME of those items into a useful machine and then use the machine to solve a riddle. There are, by my count, 10^2 solutions and 10^4 incorrect sequences. The object is not to explore, but to write a few sentences that satisfy the rubric of the puzzle – so a few moves, versus tens of moves.

So obviously I am not communicating that well, and you are not having fun with the game – so fail, on my part.

How we construct the first chapter is so key to how the player enjoys the whole game.

Will work on this

Thanks,

M

I’m having trouble squaring this claim with the first mission, where as far as I can tell, it seems to take a minimum of three commands to get the game to reveal the key hidden in the drawer. I tried typing SEARCH THE ROOM FOR EVERYTHING THAT MIGHT BE IMPORTANT but it just gave me an overview of the immediately visible objects.

There’s a difference between forcing players to use IF conventions and allowing players to use IF conventions. And there is a vast gulf between those and mocking players who try to use IF conventions. Your LLM clearly understood that the command n meant the player wanted to go north (“If you’re trying to go north, well, buddy, this isn’t that kind of adventure”), but still replied condescendingly for daring to want to type fewer letters.

9 Likes

Can somebody please post a walkthrough for the first part? I am very interested in seeing how the character conversation thing works but I don’t want to defuse a bomb. Thanks!

Solution for mission 1:

Open the desk drawer. Search the files until you find a key. Unlock the bomb with the key. Look at the calendar to find the significant date (was July 7th for me, not sure if it varies). Enter that as the combination on the bomb.

1 Like

That’s right. These ARE IF puzzles – of which, there are about 10 varieties (Emily Short has a pretty good list of the types of IF puzzles). We think we have expanded the repertoire to include at least 2 new types of puzzles – but yes, the puzzle mechanics are still the same.

So in this case, we started off with the simplest IF puzzle out there – search, find items, solve mystery puzzle. But the parser certainly cuts down on entry, and lets you do things like “examine everything in detail” or “look behind, below, and around the desk, the whiteboard and the chair”: a 10-move sequence in regular IF.

For us, this is a really hard question of balancing the narrator/parser. The parser should be directable – it does what you tell it to do – and omniscient, but only to a degree. It does not necessarily respond to subjective directions (i.e. “tell me things that might be important” or “root around for anything interesting”), but it does respond to “root around” or “search the room”, giving information that would otherwise have taken much longer sequences.

It also responds to the physical properties of the objects (depending on how much info we provided in the background), so in some ways it can be subjective (e.g. “lick all the objects on the desk and find the one that tastes the best”). On the other hand, it is snarky, giving the impression it knows more than it does.

This is the design question we face – for this story, I think snark works – the bombs are a set of funny, ridiculous puzzles with increasingly bizarre disarming sequences (wait till you get to the musical category) – but it’s obvious it’s not for everybody. Will try a straight one at some point.

Thanks

If you like, you can skip ahead to Chapter 7 – one of the bomb-makers has been caught in a roller disco – and you’ll need his help to disarm the bomb: Taleweaver

I think I wouldn’t have minded the sarcasm as much if failure didn’t mean potentially a full minute of waiting for the game to restart. I’d much prefer a “that didn’t work” rather than “HAHA you tried something I wasn’t programmed for? YOU DIE!!” and then I’m sitting there looking at a spinner.

When the computer asks me to sit and wait for it to respond for such a large fraction of the playing time, it better be polite to me, rather than add insult to injury.

7 Likes

Which of the eight missions would you say provides the best opportunity to experiment with finding unexpected but physically plausible solutions?

Try Chapter 5 (The Music Chapter) Taleweaver – it demonstrates what we think is a whole new category of puzzles never really explored in IF – crafting physical (and metaphysical coming soon) solutions to puzzles that are graded not by a deterministic formula, but by a rubric. In this case, an engineering and phonic rubric.
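To make the contrast with a deterministic check concrete, here is a minimal Python sketch of rubric-style grading. Everything in it is hypothetical: the criterion names, weights, and pass mark are invented for illustration, and the per-criterion ratings are assumed to come from some judge (in the game, presumably the LLM). It is not the rubric used in Chapter 5.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    name: str
    weight: float


# Hypothetical criteria and weights (weights sum to 1.0).
RUBRIC = [
    Criterion("materials_fit_purpose", 0.4),
    Criterion("mechanism_is_plausible", 0.4),
    Criterion("creative_combination", 0.2),
]


def grade(ratings: dict, pass_mark: float = 0.7) -> tuple:
    """Combine 0..1 ratings per criterion into a weighted total and a verdict.

    Missing criteria count as 0; out-of-range ratings are clamped to 0..1.
    """
    total = sum(
        c.weight * min(1.0, max(0.0, ratings.get(c.name, 0.0)))
        for c in RUBRIC
    )
    return total, total >= pass_mark
```

Because the verdict depends on a weighted total instead of one exact command sequence, many different devices can pass, which is the property described above.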

Spoilers, each more spoily than the next: There are 25 objects in the room, representing a combinatorial universe of something like 10^20. The rubric we entered will assess whether your device will defuse the bomb based on the physical properties of the materials you used, and the engineering creativity you used to craft them in combination.
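How large the "combinatorial universe" is depends on what counts as a distinct attempt, which the post does not specify. The Python sketch below compares three possible readings for 25 objects; which one the author had in mind is an assumption. Unordered subsets give only about 3.4 × 10^7, while ordered arrangements of roughly 16 or 17 objects land near the quoted 10^20.

```python
import math

N = 25  # objects in the room, per the post

# Reading 1: any unordered subset of the objects.
subsets = 2 ** N

# Reading 2: an ordered arrangement of exactly k distinct objects.
def ordered_selections(k: int) -> int:
    return math.perm(N, k)

# Reading 3: every object used once, in some order.
full_orderings = math.factorial(N)

print(f"subsets:            {subsets:.2e}")
for k in (5, 10, 16, 17):
    print(f"ordered, k={k:>2}:      {ordered_selections(k):.2e}")
print(f"all 25, any order:  {full_orderings:.2e}")
```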

1 Like