Audio IF

Like Trouble In Sector 247? (Or, you know…)


You might be interested to know about Kenji Eno’s Real Sound: Kaze no Regret, which works mostly in the way you describe, and which was apparently made for blind users.

However, it didn’t use voice commands (just game pad buttons, I think, because it was for the Sega Saturn and, later, the Dreamcast).

The game has not been adapted to English, but English translations of the script exist.


I was thinking about this. If anyone seriously wants/needs free voice over work, I’m no professional, but I’d be willing to do my best.

The Quick Brown (87.2 KB)

Let me know.


I’m just skeptical that parser IF can be played passively, while concentrating on something else, or that it would be that comprehensible via audio. People say they want it, but are they picturing it clearly?


Good question. I really don’t know. I’ve never listened to an audiobook because I want to read with my eyes, so I’m not in sync with the people who say they want to listen. It’s probable that not every game would translate well-- Hadean Lands was mentioned and I don’t know if a highly puzzly, command-heavy game with a huge map like that would be doable. But certainly some games, especially narrative-driven games, or limited command games, or less puzzly games, might translate very well to such a medium.




I’ve been thinking a lot about (and experimenting with) voice recognition and IF. This thread really interests me!

There are puzzly / adventure IF audio games built for smart home devices like Alexa. Earplay is a studio that makes them, including ones with large budgets and big IP like Jurassic Park. As a developer, I looked into building Alexa games and they’re complex from a programming perspective. (At least, they’re outside of my Inky comfort zone.)

I like what’s being described here: basically, interactive voice-controlled audio podcasts. I’d definitely play games like that in the car or washing the dishes, if they were designed with that application in mind. (Games that are conversational or exploratory and don’t require intense concentration.)

MP3 files already have chapters, so you can skip to particular sections. Maybe this is essentially a podcast player with voice recognition. In the MP3 file description, there could be hidden instructions for the app on what chapter to go to if the user says a specific word. For example…

Ch1 - “East” - Ch17

[At the end of Chapter 1, if the player says “East,” go to Chapter 17.]

Ideally, the app could replicate the voice commands as on-screen buttons as a fail safe and for accessibility. I agree with what folks are saying about keeping commands simple and limited, so there’s a greater likelihood the voice recognition will understand the command.
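To make the chapter idea concrete, here is a minimal sketch of how a player app might read those instructions. It assumes the rules are stored as plain lines in the `Ch1 - “East” - Ch17` shape shown above. The function names and the second "West" rule are made up for illustration, and nothing here reads real MP3 chapter metadata.

```python
import re

# Assumed rule format, taken from the example line: Ch1 - “East” - Ch17
RULE = re.compile(r'^Ch(\d+)\s*-\s*[“"](.+?)[”"]\s*-\s*Ch(\d+)$')


def parse_rules(text):
    """Build {chapter: {command: next_chapter}} from the rule lines."""
    table = {}
    for line in text.splitlines():
        match = RULE.match(line.strip())
        if match is None:
            continue  # skip blank lines and anything that is not a rule
        source, command, target = match.groups()
        table.setdefault(int(source), {})[command.lower()] = int(target)
    return table


def commands_for(table, chapter):
    """Commands valid at the end of a chapter, e.g. to draw on-screen buttons."""
    return sorted(table.get(chapter, {}))


def next_chapter(table, chapter, heard):
    """Return the chapter to jump to, or None if the word is not a command."""
    return table.get(chapter, {}).get(heard.strip().lower())


# The "West" rule is a hypothetical addition so there is more than one choice.
rules = parse_rules("""
Ch1 - “East” - Ch17
Ch1 - “West” - Ch4
""")
print(commands_for(rules, 1))          # ['east', 'west']
print(next_chapter(rules, 1, "East"))  # 17
```

The same table drives both the voice input and the fallback buttons, so the two can't drift apart.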

If all devs needed to produce an IF game was an MP3 with chapter breaks and a list of commands, that’s a pretty low lift IMO that would open up IF development to a lot of people.

You could go beyond IF games, too. Interactive audio documentaries, interviews, and quiz shows come to mind.


This kind of play makes the most sense to test the waters of a fully audio IF. It would essentially be a CYOA book with voice recognition for the available choices. The spoken commands would produce very few interpretation errors, if any, due to only 2 or 3 choices being available at any given time. (It wouldn’t even have to understand the command, just pick the most likely choice based on the spoken words.)
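As a rough sketch of that "most likely choice" idea: assuming the speech recognizer hands back some text, even a wrong transcription, the app can pick whichever available choice is textually closest. This uses Python's standard `difflib`. The function name and the sample choices are hypothetical, and a real app would probably compare sounds rather than spellings.

```python
import difflib


def most_likely_choice(heard, choices, cutoff=0.0):
    """Pick the available choice closest to what the recognizer returned.

    With cutoff=0.0 the best candidate is always returned, however poor the
    match; raise the cutoff to get None (and re-prompt) on very bad matches.
    """
    lowered = {choice.lower(): choice for choice in choices}
    best = difflib.get_close_matches(
        heard.strip().lower(), list(lowered), n=1, cutoff=cutoff
    )
    return lowered[best[0]] if best else None


# A misheard "East" still lands on the right choice.
print(most_likely_choice("eased", ["East", "West", "Wait"]))  # East
```

With only two or three options on offer, the nearest match is usually unambiguous, which is why a short choice list tolerates recognition errors so well.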

The only thing I would add is that on a second playthrough, it would be nice to skip to the choices if you’ve heard it enough and just want to get to the new storyline part.

For me, videogames that were better because of the narrator are The Stanley Parable (Kevan Brighting) and Leisure Suit Larry 6 (Neil Ross).

However, in those games, the narrators are commenting on your choices and actions. They supplement the game. I just mention them because I don’t know of many games with a narrator to glean insight as to what works and what doesn’t in the case of a fully voiced IF.

I wonder if there are some audio books that people swear are better for having the audio than reading the actual book?

A long time ago, when VCRs ruled the landscape and music cassettes were the bee’s knees, I recorded Star Wars to an audio tape and played it in the car as I drove to work. It was surreal to me because the sound effects are actually so good in that movie. The beeps and boops of R2. The roar of the TIE fighters. The hum of the lightsabers. I could see the movie in my head just by listening to the recording. I think quality sound effects would really make an audio IF game far more enjoyable than it has any right to be.

It might be worth listening to some old radio dramas for inspiration too. War of the Worlds and that sort of stuff. Also, I don’t listen to podcasts, but I believe there are podcast dramas being made that might inspire.


Why didn’t I think of it earlier?!

@mulehollandaise published an article last year about the 2-XL: Le 2-XL et les contraintes sur jeux à embranchements (in French)


Almost everything is playable on if you want to give it a try (90% of them are in English), with additional ones on the Internet Archive.

In the article, Hugo talks about a Jurassic Park game that is quite fun to play.


Delight Games has a few of their titles in audio format, with voice input. I played one of them and it was a fun experience. It did require a bit more attention than a regular audiobook.


Unfortunately, this would get close to AI rearrangement of actor voicework, which SAG-AFTRA is currently discussing with a view to going on strike against large computer game companies doing this. (It’s already the subject of strikes in films.) If you ever plan to do any work with SAG-AFTRA, this plan (even for a non-commercial game) would endanger it.

While consumer speech software has improved since the old telephone voice software (which, by the way, quite a few organisations still use), I still can’t get it to detect my voice as a voice. It’s still easily defeated by common and trivial factors (such as having a different accent from the people who provided the corpus).


That’s fair and valid. I was imagining a more manual approach, but I see the distinction is sort of trivial now that you point that out.

Sometimes I approach things too much from a pragmatic, rules-lawyery direction, asking first “how” and not “who gets shortchanged” and “should I.” In intent, this isn’t much different from trying to cut out your art budget with AI art. The actors deserve to see returns for the use of their faces and voices just as much as any artist deserves to see returns for the use of their work.

Consider the idea cremated in my brain.

ETA: Also, thank you for pointing it out, sincerely.


I love the 2-XL! A few years ago, I created a new game for the Tiger 2-XL called “Facts About the Robot Uprising.” It was featured at IndieCade.


Having audio solutions for accessibility in text-based IF is a natural fit. I’m not in any way disparaging that. Here we’re talking about mainstream Audio-IF:

I was thinking a Choice of Games style story might make a good Audio IF, but…

This is a bit of my thought. However…

I don’t think even standard IF is ever an activity conducive to multitasking where your attention is split between real world and narrative world.

IF, by nature, is almost always immersive in some way. The player feels they are in the world, or are roleplaying a character who is not them in 2nd person but still immersed in the world. That is unlike a standard novel, which is 99% of the time in 3rd person (or 1st person with the narrator telling you the story) and so allows the reader to remain separate from the narrative. A book might immerse you in its setting and plot, but the immersion is more akin to following a railed walkway through a diorama in a museum.

I think texting and reading/posting on social media require a different type of attention span, in short bursts. Based on my own waiting-room phone behavior, I’m usually switching between texting one or several people and playing a non-narrative (‘mindless’) game or reading news articles or maybe posting on microblog social media. Often this “waiting room/public transport” behavior is because you cannot get too wrapped up in what you’re doing - you might need to stop abruptly if someone talks to you, your number is called, you approach your bus stop…

Many people do read e-books or long form media on the go, but I think psychologically reading still can be a ‘time-passing’ rest activity since the reader isn’t expecting to switch modes and direct the action. And like Mathbrush and others express, I certainly don’t want to be that person talking to my phone in public. The only other situation I can picture is bluetooth while driving, but unless on a long solo highway road trip, I would find two-way interaction very distracting in normal traffic the same way an extended phone conversation would be.

I think the “people don’t want to read” splits into two camps:

  • People who want to zone out while playing games to pass time instead of thinking or doing something productive. For them, reading a text wall and engaging the “solve a problem/make an important decision” section of their brain feels like exactly what they are trying to avoid.
  • People who do want to read but don’t have time to sit with a book and just read, so they listen to audiobooks while doing other tasks such as cleaning or driving, as above. In this case, making decisions likely demands a third slice of bandwidth that may be counterproductive to resting or getting something else done.

I do think a specific IF created from the ground up as an audio-only experience might be a unique event-style presentation, but likely isn’t a genre that will catch on for every game or foster a burgeoning new genre.

(Like Bandersnatch was a neat experience but I don’t always want my Netflix to do that, and 3D movies are a novel occasional experience, but not every movie benefits from a 3D presentation.)

Me personally, I’m one of those people (I assume there are others like me) who doesn’t ever turn on the TV as “background noise.” If I’m going to engage with narrative media or music, I usually want to engage with it actively instead of it just being sonic wallpaper. I’m listening to a song because I want to listen to it. So for me to play an audio game means I want to sit on my couch with a drink, likely by myself so I’m not sounding like I have Zork-tourettes in public, and just do that. That means it needs to be extra-immersive like a radio show with sounds, positional audio, music, etc. I think it might require a very specific type of narrative like Infocom’s Suspended - or a game where I’m pretending to sleep while eavesdropping on unsuspecting conversations around me, or I’m conducting a seance and listening for spirits - something to justify the presentation.

TL;DR - I think a method to create Audio-IF of normal games is a good thing for accessibility, but creating a game where audio is the “feature” requires specific intention and environment (headphones, eyes closed, ‘take this journey in sound to experience a completely different kind of experience’).


Sort of tangential, but audio dramas are lots of fun, and plenty incorporate the listener. Some just address the listener directly: Wolf 359’s “Dear Listeners” in the earlier episodes, where Doug records an audio log presumably to be heard by personnel on Earth upon their return, or the Listeners in Welcome to Night Vale, who are thought to be tuning into their local community radio station to get updates on weather and emerging oddball news. Others actually drag the listener in as a full-blown character, like the immersion of SAYER, where you, the listener, wake up on a strange space station with an AI talking directly inside of your head thanks to a brain chip, and you have to navigate a terribly perilous journey on which you can die several times over, from flesh-eating cleanup nanobots, air being vacuumed out of a room, poisonous gases, walls crushing you into a pulp, and so on…

Maybe something to look into for inspiration if someone’s thinking about heading in this direction with IF.


:heart_eyes: Is this playable anywhere?


It’s a shame they only tried it once more (AFAIK), but the final episode of Unbreakable Kimmy Schmidt was the perfect way to use it, and a brilliant way to end that particular series!


I read an article about how they spent money to develop the technology to make choices mid-video stream and were trying to hire someone to develop these. I suspect, as I mentioned, it’s like 3D: at first it’s “everything is going to take advantage of this!” and then they find out it’s not a simple matter of flipping a switch to make it work properly. Likewise, writing a novel-length CoG isn’t a matter of slapping choices into an existing story; it requires producing the equivalent of 2-3 books. Bandersnatch required five hours of video content to customize what normally ran about 90 minutes for a viewer, so one fully “choice based movie” likely requires the budget of several.


Yeah, I can’t do that. I have trouble focusing on things even with background music let alone TV. It distracts me terribly.

Edit: I also have trouble engaging with anything lightly, even conversation. I would miss my bus stop, drive past my exit, etc.


Ironically, given how often full rewrites and reshoots are happening in Hollywood right now, there are often a few movies’ worth of unused scenes and dialogue by the time some films make it to cinemas anyway. X-Men: Dark Phoenix and Justice League both come to mind as recent examples, but extensive reshoots go back a long way, with Back to the Future as one example.