Wow, I thought everyone moved on from MP3 forever ago. If it’s just a band-aid until Safari joins the modern world, what’s wrong with plain old WAV instead?
File size, mostly; I wouldn’t be opposed to adding more sound types while we’re at it, alongside MP3, but I’m not aware of much demand for WAVs in the IF space.
I actually didn’t know Safari supported Ogg Vorbis until Zarf pointed it out, but that’s barely a year ago now!
Honestly, MP3 seems to be the lich of audio file formats and seems to remain what most people default to for standalone audio files… Also, is anyone even using Internet Explorer in this day and age? Even for the retro gaming folks running a Win98 virtual machine for some old Windows games that are somewhere between a royal pain in the anatomy, requiring a god tier hacker to decompile/disassemble, tweak, and recompile, or outright impossible to run on a modern Windows box, it’s hard to imagine them actually going online within such a VM, and didn’t Microsoft ditch Internet Explorer for Edge with Win 7 or 8? Or has IE come to refer to a different browser I haven’t heard of?
FWIW, AAC is also widely supported and the patents on its most common variants have also expired.
If there’s demand for it, I don’t see any real harm in letting Blorb support more formats; but it also means more cases for programs like blorbtool to check. I started with MP3 because that’s the format I’ve seen people specifically wanting to use, but if interpreter authors want to support AAC, WAV, etc, they could also be added to the spec at the same time.
I’m going to veto wav. AIFF was also a mistake. We should not be distributing uncompressed audio.
(Was AIFF included because that’s how the Infocom audio files were already being shared in that era? Actually, reading the Blorb spec, it seems more because AIFF files were used by the deprecated Song format.)
In the age of 4K video, do a few uncompressed sound files make much difference? Lossy compression is worse than no compression.
As I understand it, the only way for the Z-machine to decide whether a sound should be played on the “sample” or “music” channel is to examine the file type, so the main benefit of keeping AIFF around is to make that possible. WAV seems like a more convenient path to the same goal; all the tools I’ve used default to WAV instead of AIFF.
I believe there’s also some inherent latency in MP3 decoding that makes it a suboptimal format for real-time sound effects (but fine for music). I had to switch sound effects from MP3 to WAV in the .NET version of Rascal because of that; if I backport sounds to the Z-machine version, presumably I’ll have to convert them to AIFF. Not sure how much this problem affects other compressed audio formats, though.
Did Lectrote ever get audio support?
I think WAV and AIFF both wrap basic PCM sound data, so it should be trivial to convert between them.
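To illustrate the point about both formats wrapping PCM, here is a minimal sketch of the sample-level part of such a conversion. It relies on 16-bit PCM being stored little-endian in WAV and big-endian in AIFF, so the sample payload only needs its bytes swapped. The function name `swap_16bit_pcm` is made up for this example, and the sketch deliberately ignores the container headers (RIFF versus IFF chunks), which a real converter would also have to rewrite.

```python
import array


def swap_16bit_pcm(frames: bytes) -> bytes:
    """Swap the byte order of every 16-bit sample in a block of PCM data.

    Hypothetical helper: it touches only the raw sample bytes, not the
    WAV or AIFF headers around them.
    """
    if len(frames) % 2 != 0:
        raise ValueError("16-bit PCM data must contain an even number of bytes")
    samples = array.array("h")   # "h" = signed 16-bit integers
    samples.frombytes(frames)
    samples.byteswap()           # little-endian <-> big-endian, in place
    return samples.tobytes()


# Two samples, each with its two bytes exchanged.
swapped = swap_16bit_pcm(b"\x01\x02\x03\x04")
```

Swapping twice gives back the original bytes, so the same helper works in both directions. Note that 8-bit audio would need different handling, since WAV stores 8-bit samples unsigned and AIFF stores them signed.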
And thanks for the info on the Z-Machine. I’ve never looked into how its sound support works.
MP3 and Ogg/Vorbis can both compress to roughly 10% of the original size. Using WAV would turn the 18MB Kerkerkruip blorb into about a 98MB blorb.
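As a back-of-envelope check on those figures, the sketch below works out how much of the 18MB blorb would have to be compressed audio for the WAV version to reach 98MB. It assumes the ratio is exactly 10% and that everything else in the blorb stays the same size; both are assumptions made for the arithmetic, not measurements, and the function name is invented for this example.

```python
def implied_compressed_audio_mb(blorb_mb: float, wav_blorb_mb: float, ratio: float) -> float:
    """Solve for the compressed audio size implied by the two blorb sizes.

    blorb     = other + audio
    wav_blorb = other + audio / ratio
    Subtracting the first from the second: wav_blorb - blorb = audio * (1/ratio - 1)
    """
    return (wav_blorb_mb - blorb_mb) / (1 / ratio - 1)


# Figures quoted in the post: 18MB compressed, about 98MB with WAV, ratio about 10%.
audio_mb = implied_compressed_audio_mb(18, 98, 0.10)
other_mb = 18 - audio_mb
```

Under those assumptions, a bit under 9MB of the blorb is audio and the rest is everything else, which is consistent with the quoted sizes.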
320kbit MP3s are close to indistinguishable from the uncompressed original. Audiophiles may love FLACs, but I’ve read there’s little evidence that listeners can actually tell the difference.
The short answer is, not very well!
There are two sound channels, but no way for the author to specify which sounds go on which channel, so the current spec bases it on the format of the audio file. The AIFF channel queues (so if you launch five sounds in quick succession, it will play them one after another), while the OGG channel replaces (so if you launch five sounds in quick succession, each one will interrupt the one before it). Also, the volume scaling is different for the two channels.
It’s an utter mess, which is why Dialog doesn’t support sound at all on Z-machine.
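For anyone who finds the difference easier to see in code, here is a minimal sketch of the two policies as this post describes them: one channel that queues and one that replaces. The class and method names are invented for the illustration and are not taken from any spec or interpreter; `finished()` stands in for whatever callback an audio backend would fire when a sound ends.

```python
from collections import deque


class QueueingChannel:
    """Sounds launched while another is playing wait their turn."""

    def __init__(self):
        self.current = None
        self.pending = deque()

    def play(self, sound):
        if self.current is None:
            self.current = sound
        else:
            self.pending.append(sound)

    def finished(self):
        # Called when the current sound ends: start the next one, if any.
        self.current = self.pending.popleft() if self.pending else None


class ReplacingChannel:
    """Each new sound interrupts whatever was playing."""

    def __init__(self):
        self.current = None

    def play(self, sound):
        self.current = sound

    def finished(self):
        self.current = None


# Launch five sounds in quick succession on each kind of channel.
queued, replaced = QueueingChannel(), ReplacingChannel()
for n in range(1, 6):
    queued.play(n)
    replaced.play(n)
```

After the loop, the queueing channel is still on sound 1 with 2 to 5 waiting, while the replacing channel has jumped straight to sound 5.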
OGG doesn’t have this issue.
For similar reasons to what you described, MP3s are also not usable for background sounds which are intended to loop without gaps. Again in this situation, I would recommend OGG if you still want a smaller lossy file.
-Wade
That’s messy. With hindsight, an extra chunk to say which audio resources belong in which channel would have made sense.
I definitely agree! If we’re amending the spec, that might also be worth considering, now that there are more possible audio formats: a field in the sound chunk specifying whether it’s “sound” or “music” (with defaults determined by file type as before, to not change existing behavior).
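A rough sketch of how that lookup could work: an explicit channel field wins, and otherwise the file type decides as before. Everything here is hypothetical, including the function name, the type labels, and the idea that a format with no historical default must state its channel explicitly; none of it is proposed spec wording.

```python
from typing import Optional

# Existing behavior, as described in this thread: the file type decides.
# The keys are illustrative labels, not actual Blorb chunk IDs.
DEFAULT_CHANNEL_BY_TYPE = {
    "aiff": "sound",
    "ogg": "music",
}

VALID_CHANNELS = ("sound", "music")


def choose_channel(file_type: str, override: Optional[str] = None) -> str:
    """Pick a channel: the explicit field if present, else the old default."""
    if override is not None:
        if override not in VALID_CHANNELS:
            raise ValueError(f"unknown channel: {override!r}")
        return override
    if file_type not in DEFAULT_CHANNEL_BY_TYPE:
        # Assumption: newly added formats have no default and need the field.
        raise ValueError(f"no default channel for type {file_type!r}")
    return DEFAULT_CHANNEL_BY_TYPE[file_type]
```

With this shape, `choose_channel("aiff")` still gives "sound" for existing blorbs, while `choose_channel("aiff", "music")` lets an author override it.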
I’m torn between “the Z-machine format has been stable for so long that interpreters are unlikely to see major updates” and “the Z-machine is a widely-used format that’s had sound support since 1987, and this channel information was captured by the informal pre-Blorb resource systems”. But I’m not the maintainer of any Z-machine interpreter, so my voice isn’t the one that really matters there.
(I just gave Dialog the ability to put arbitrary “options” on resource files, and built all the sound processing I wanted into the Å-machine interpreters instead. That spec’s much younger and thus much easier to update.)
The Blorb spec allows MP3, WAV, MIDI and GIF formats for ADRIFT interpreters.
Would it make sense to allow all these formats for all interpreters, and remove the carve-out for ADRIFT?
Those media formats are used only by Adrift storyfiles in the Adrift player. It’s a tightly controlled environment, unlike opening up those media formats for all the other storyfile formats in all the other interpreters (and multi interpreters). Each of them should be considered on their own merits.
MP3s were requested for Inform mainly for web play. Until very recently it was the only reliable all-platform option. (I too didn’t know that Safari recently added support for Ogg Vorbis.) And in particular it’s needed for separated mode, as browser restrictions would mean Parchment couldn’t use its own codecs like it can when loading an audio file out of a Blorb. Adding support to non-web terps shouldn’t be hard.
I don’t know why Adrift supports WAV, but as I said before, I think it’s a mistake. Just like supporting BMP images would be a mistake.
MIDI would be great, but it’s unlikely to be supported by Parchment. (I can’t even support the spec standard of MOD files.)
GIF would be easy for web interpreters, but potentially much trickier for non-web terps, especially if authors expect animated GIFs to be supported.
This is not what the standard says. In article 9.4.2:
“starting any new sample sound effect stops any current sample sound playing.”
When playing sound in a z5+ game, you can use sound interrupts to chain sounds, e.g. play sound 1, then fire code that plays sound 2.
In z3, however, that’s not possible, as it doesn’t have the interrupt concept. Infocom’s Lurking Horror releases would stop certain sounds from overlapping simply by being slow. If you write an interpreter today, for faster computers, and want sounds in Lurking Horror to work well, you need to implement an enqueuing mechanism. We added this to Ozmoo for MEGA65 maybe a year ago, and since then Lurking Horror works fine.
I spent a lot of time working on sound in The Lurking Horror for my interpreter. It carves out a minimum set of behavior to get TLH working properly by essentially putting the interpreter into a wait state until one iteration of the current sound plays if a second sound is ordered to play concurrently, then it switches to the new sound. Because of that, it is impossible to queue up more than one pending sound (which TLH never does), and this avoids needing any kind of real queue.
Additional: Of course the standard 1.1 post-dates TLH by…a lot…and the music channel is unaffected by this mechanism.
Additional Additional: You also have to do the whole same “wait for 1 iteration to complete” thing for stop commands, not just when playing a second sound. In at least one instance TLH will fire a STOP command very soon after a PLAY, and if you don’t deal with that you never hear the sound. Queueing sounds is almost, but not quite, sufficient to get TLH right.
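A rough model of the behavior described in these posts, for anyone implementing something similar. Note the swap: the posts describe blocking the interpreter in a wait state, whereas this sketch defers the request in a single pending slot and applies it from a callback, which gives the same ordering without blocking. The class, its methods, and the `on_iteration_complete` callback are all invented for the illustration; a real interpreter would hook this to its audio backend.

```python
class SampleChannel:
    """Defers a PLAY or STOP until the current sound has played through once."""

    def __init__(self):
        self.current = None          # sound now playing, if any
        self.iteration_done = True   # has the current sound finished one pass?
        self.pending = None          # at most one deferred ("play"/"stop", sound)

    def _start(self, sound):
        self.current = sound
        self.iteration_done = False

    def play(self, sound):
        if self.current is None or self.iteration_done:
            self._start(sound)
        else:
            # Only one slot: a newer request overwrites an older pending one.
            self.pending = ("play", sound)

    def stop(self):
        if self.current is None:
            return
        if self.iteration_done:
            self.current = None
        else:
            # Deferring the STOP is what keeps a PLAY-then-STOP pair audible.
            self.pending = ("stop", None)

    def on_iteration_complete(self):
        # Hypothetical backend callback: one full pass of the sound has played.
        self.iteration_done = True
        if self.pending is not None:
            action, sound = self.pending
            self.pending = None
            if action == "play":
                self._start(sound)
            else:
                self.current = None
```

With this model, a STOP fired immediately after a PLAY leaves the sound playing until `on_iteration_complete()` runs, and a second PLAY only takes over at that same point.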
How do web interpreters handle media? Are they in blorbs or outside of the blorbs? I’m wondering, if they are in a blorb, how the interpreter fetches data from inside the remote blorb?