My name is Sebastian Kästner. I’m a freelance developer and interactive fiction enthusiast from Germany. I’ve recently built a tool that can generate complete audiobooks from written stories using AI-driven voice synthesis.
The system now supports multiple languages and multiple voice actors, allowing me to create fully voiced, dynamic, and immersive experiences. My current focus is on producing interactive audiobooks — stories where the listener can speak back to make choices, and the narration responds in real time.
I’m looking to collaborate with writers, developers, and anyone interested in bringing interactive fiction to life through sound and speech.
If you’d like to turn your story or book into an audiobook — or even an interactive voice-driven experience — feel free to reach out.
From time to time I check in on the state of TTS. It’s getting better, but I usually find it inadequate for use in most games.
Years ago, before we had AI in the form it’s known today, we had speech markup languages like SSML. Unfortunately, this has been lost with AI. Whenever I try AI-based TTS, I get a random pronunciation. I have basically no control over the intonation. And this really sucks.
I would like to have instructions like:
Say with anger, “I won’t do that!”
or
Say with remorse, “I won’t do that.”
These would give quite different results if done right.
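For illustration, here is a minimal sketch of the kind of control SSML offered. Standard SSML has no emotion tag, so the sketch only approximates a mood through the `prosody` element (rate, pitch, volume). The `STYLE_PRESETS` table and the `build_ssml` helper are hypothetical names made up for this example, and the preset values are guesses, not tuned settings.

```python
import xml.etree.ElementTree as ET

# Hypothetical presets: standard SSML has no emotion element, so each mood
# is only approximated here through prosody attributes (assumed values).
STYLE_PRESETS = {
    "anger": {"rate": "fast", "pitch": "high", "volume": "loud"},
    "remorse": {"rate": "slow", "pitch": "low", "volume": "soft"},
}


def build_ssml(text: str, style: str) -> str:
    """Wrap text in <speak><prosody ...> using the preset for `style`."""
    speak = ET.Element("speak", {"version": "1.1"})
    prosody = ET.SubElement(speak, "prosody", STYLE_PRESETS[style])
    prosody.text = text
    # ElementTree escapes &, < and > in the text for us.
    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(build_ssml("I won't do that!", "anger"))
    print(build_ssml("I won't do that.", "remorse"))
```

The same sentence ends up with two different prosody settings, which is the sort of explicit handle on intonation that is missing when the engine decides everything on its own. Whether a given engine honors these attributes is up to that engine.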
The other big omission is that there appears to be no way to add sounds like a laugh, a cough, etc. Without these, the results sound less natural.
Have you made any progress with these sorts of problems?
Oh yes, these things are absolutely solved. Even I find it hard to recognize the voices as artificial nowadays, because they have these kinds of emotions, natural pauses, giggling, stuttering, and emotional emphasis.
Though, using such highly developed and completely natural-sounding speech synthesis is much more expensive. That’s why I focus on audiobooks, where you don’t need as much voice acting.
But I can implement sophisticated voices with emotions as well.