Help needed with Unix Frotz

I need someone who’s familiar with audio programming in C on Linux to help me out. I have some basic sound-effects code written up for Unix Frotz, but it doesn’t work and I’ve run out of ideas as to why. The best I can get is a faint click. The latest code is at To build, you’ll need libao installed with the development headers. Debian users should do “sudo apt-get install libao-dev”.

I tried building the ao-curses branch, and it looks like there’s a missing file: src/curses/ux_audio.c.

Whoops! That one is a new file that I forgot to git-add. It should be buildable now.

Here’s a test program that should be easier to decipher:

I’m attaching a small patch which aids in the loading of samples from the Blorb file. However, there is more work to be done in order to get sounds working properly. libao is an output-only library, meaning that it does not handle decoding of AIFF (or other) files. You’ll need to get the files decoded somehow, such as with the sndfile library.

However, writing the entire file at once this way will block the UI as long as it is playing. As such, I’d recommend looking at SDL_mixer and SDL_sound, both of which are used by Gargoyle. These libraries will allow sounds to be played in a separate thread so that the UI will not block during playback.
libao.txt (593 Bytes)

Thanks. I’m not worrying about blocking right now. Since libao pulls in pthreads, I plan to use that to accomplish nonblocking audio. I really don’t want to mess with SDL for a variety of reasons – I want to make this as lightweight as possible, and SDL is not lightweight.

For what it’s worth, here’s my attempt at simply playing a file: … with-libao.

The problem with the code at stackoverflow is that libao is expecting PCM data and you’re directly passing an AIFF or WAV.

The good news is that AIFF and WAV files are (typically) just PCM data with a header; but you have to extract the number of channels, sampling rate, bits per sample, and possibly endianness, instead of passing defaults of 2, 44100, 16, and little. It’s pretty easy to parse WAV and AIFF headers manually to get this information which you’d then pass on to libao before handing it the PCM data.
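
To make the header parsing above concrete, here is a minimal sketch for plain (uncompressed) AIFF held in memory. It is not code from Frotz: the names aiff_info, aiff_parse_comm and ext80_to_ulong are made up for illustration, AIFF-C is not handled, and the 80-bit sample-rate field is converted with integer shifts on the assumption that the rate is a positive whole number.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical container for the values the output library needs. */
struct aiff_info {
    int channels;
    int bits;
    unsigned long rate;
    unsigned long frames;
};

static uint32_t be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

static unsigned be16(const unsigned char *p)
{
    return ((unsigned)p[0] << 8) | p[1];
}

/* Convert a 10-byte 80-bit extended float to an integer.
 * Only positive values with exponents 0..63 are handled; the
 * fractional part is dropped. Anything else yields 0. */
static unsigned long ext80_to_ulong(const unsigned char *p)
{
    int exponent = (int)(be16(p) & 0x7FFF) - 16383;
    uint64_t mantissa = ((uint64_t)be32(p + 2) << 32) | be32(p + 6);

    if (exponent < 0 || exponent > 63)
        return 0;
    return (unsigned long)(mantissa >> (63 - exponent));
}

/* Walk the chunks of an in-memory AIFF file and fill *out from COMM.
 * Returns 0 on success, -1 if this is not AIFF or COMM is missing. */
int aiff_parse_comm(const unsigned char *buf, size_t len, struct aiff_info *out)
{
    size_t pos = 12;

    if (len < 12 || memcmp(buf, "FORM", 4) != 0 || memcmp(buf + 8, "AIFF", 4) != 0)
        return -1;

    while (pos + 8 <= len) {
        uint32_t size = be32(buf + pos + 4);
        const unsigned char *data = buf + pos + 8;

        if (size > len - pos - 8)
            return -1;                      /* truncated chunk */
        if (memcmp(buf + pos, "COMM", 4) == 0) {
            if (size < 18)
                return -1;
            out->channels = (int)be16(data);
            out->frames = be32(data + 2);
            out->bits = (int)be16(data + 6);
            out->rate = ext80_to_ulong(data + 8);
            return 0;
        }
        pos += 8 + (size_t)size + (size & 1);   /* chunks are padded to even length */
    }
    return -1;
}
```

With libao, these values would go into the channels, rate and bits fields of ao_sample_format, and since AIFF sample data is big-endian, byte_format would presumably be AO_FMT_BIG. The sample data itself lives in the SSND chunk, which this sketch does not locate.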

The bad news is that Blorb also supports more complex file formats which, if you want to handle them, more or less require a third-party library such as libsndfile.

I suppose I will use libsndfile for AIFF and OGG. MOD chunks will be decoded with libmodplug. I will have to decode AIFF manually for the DOS port and ignore OGG and MOD. I don’t think I could shoehorn OGG or MOD decoders into DOS Frotz and still have it work with 16-bit DOS.

All this time I was under the impression that libao handled AIFF and WAV decoding.

My test cases for playing Blorb sounds seem to work out fine now. I decided to ditch libsndfile because it was causing strange problems. I wrote up some code to make sense of AIFF. I’m using libvorbisfile to handle OGG and libmodplug for MOD. I should have something pushed out to the repo in a few days.

Sounds now work independently. I need to somehow get AIFF clips to play mixed with OGG or MOD music. This is how Sfrotz and Windows Frotz work, and I believe this is how I should do it for Unix Frotz. Test game is at To build Unix Frotz on Debian, one will need to install libao-dev, libsndfile-dev, and libvorbis-dev.

My problem is mixing. For the time being, I am ignoring sample size and rate and concentrating on mixing the C64 music and the noise the red button makes. Both of these effects are 44100 Hz and 16-bit. Once I get this done, I can move on to converting the button noises on the fly. Build Unix Frotz from the latest code in the ao-curses branch. Start the game and fiddle with things one by one. If you turn on the Commodore 64, and then the Amiga, the Amiga will take over. Turning off either of the computers when music is playing will turn off the music no matter which one started it. I believe this behaviour is in agreement with the Z-machine spec regarding “bleeps” playing with “external sounds”.

The test is to start the C64 and then press the red button. This should cause the C64’s music to be mixed with an old-fashioned car horn. In src/curses/ux_audio.c, after passing the awooga horn through the pipe, the resulting mixture is distorted, but recognizable. The C64 music continues to play. If you try any of the other buttons while the C64 is playing, the C64 music will cut out altogether. I assume this is because the mismatched mess is such that ao_play() can’t recover. Another effect of pressing a button while the C64 is playing is that you can no longer stop the music by turning off the C64. I’ve gone through things and found that music_pid is being set to zero when playaiff_pipe() is called. That music_pid variable is what is being checked to see if the OGG or MOD player is active.

So, here are two questions: 1) How can I get audio mixed correctly? 2) Why does playaiff_pipe() cause music_pid to be set to zero? I know there’s a problem in my code about what might happen if the OGG music stops before the AIFF clip is done, but let’s take one thing at a time.

I wrote a mixer for an audio library for decoding, resampling and mixing (hopefully I’ll find time to finish it up and release it someday) and the approach I used is to do floating point mixing.

When decoding, you either instruct the decoder to give you FP samples (libvorbisfile and libmpg123 can do that), or if the decoder doesn’t support that, you convert the integer samples to FP. Nominal range should always be [-1, 1). The code for mixing is rather straightforward: you just add the FP values together:

mix = sample1 + sample2;
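
For the integer-to-FP step, a rough sketch in C could look like the following. It assumes the input is big-endian signed 16-bit PCM, as found in AIFF sample data; the function name s16be_to_float is made up for illustration.

```c
#include <stddef.h>

/* Decode n big-endian signed 16-bit samples from src into floats
 * in the nominal range [-1, 1). src must hold 2 * n bytes. */
void s16be_to_float(const unsigned char *src, float *dst, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        unsigned u = ((unsigned)src[2 * i] << 8) | src[2 * i + 1];
        int s = (u & 0x8000u) ? (int)u - 65536 : (int)u;  /* sign-extend */

        dst[i] = (float)s / 32768.0f;
    }
}
```

Dividing by 32768 rather than 32767 keeps the most negative sample at exactly -1.0 and the most positive one just below 1.0, which matches the [-1, 1) range.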

Before sending the resulting mix to the output, you convert back to integer samples. You should at least do clipping to the [-1, 1) range before the conversion (more fancy approaches would use a limiter instead and also do dithering):

#include <limits>

/* Convert and clip a float sample to an integer sample. This works for
 * all supported integer sample types (8-bit, 16-bit, 32-bit, signed or
 * unsigned.)
 */
template <typename T>
void floatSampleToInt(T& dst, float src)
{
    if (src >= 1.f) {
        // Overflow. Clip to max.
        dst = std::numeric_limits<T>::max();
    } else if (src < -1.f) {
        // Underflow. Clip to min.
        dst = std::numeric_limits<T>::min();
    } else {
        // Scale to half the type's range. The added offset is zero for
        // signed types and the midpoint for unsigned types.
        dst = src * (float)(1UL << (sizeof(T) * 8 - 1))
              + ((float)(1UL << (sizeof(T) * 8 - 1))
                 + (float)std::numeric_limits<T>::min());
    }
}
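
Since Frotz is C, here is roughly how the same idea might look in plain C for the signed 16-bit case only. This is a sketch rather than a port of the template above; the names float_to_s16 and mix_s16 are made up for illustration, and both inputs are assumed to already have the same sample rate and channel layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Clip a float sample to [-1, 1) and convert it to signed 16-bit. */
static int16_t float_to_s16(float x)
{
    if (x >= 1.0f)
        return INT16_MAX;       /* overflow: clip to max */
    if (x < -1.0f)
        return INT16_MIN;       /* underflow: clip to min */
    return (int16_t)(x * 32768.0f);
}

/* Mix n samples of a and b into out: convert to float, add, then
 * clip and convert back. out may alias a or b. */
void mix_s16(const int16_t *a, const int16_t *b, int16_t *out, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        float sum = (float)a[i] / 32768.0f + (float)b[i] / 32768.0f;

        out[i] = float_to_s16(sum);
    }
}
```

This only does hard clipping, with no limiter and no dithering, so loud inputs will still distort.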

In case you want to look up the rest of the code (it supports all formats: MP3, Vorbis, FLAC, WAV, AIFF, MOD, MIDI), I’m attaching it to this post. Again though, it’s C++, but the decoders and resamplers should be straightforward to port to C (or you can introduce C++ code to Frotz.)

The library depends on SDL, so you will need to strip out the relevant code if you don’t want to use SDL.

If you want to use the decoders, they depend on libvorbisfile, libmpg123, libsndfile, libfluidsynth and libmodplug.

The resamplers are using libsamplerate, soxr and the included “speex” resampler from the Opus codec. You only need to use one of them, of course. The “speex” one is a good choice, as it doesn’t need external dependencies.
SDL_Audiolib.tar.bz2 (33.7 KB)

I tried using floats as you suggested, but the mixed audio sounds exactly the same.

This is probably happening because the samples are mostly near the 0dBFS maximum with no headroom in them. Mixing them will result in clipping. The way around this is to attenuate the signals when mixing. To see if that’s really the problem, try multiplying the samples by something like 0.7 (or even lower, until there’s no clipping) and then mix them. The dumb way to do this consistently is to check whether the mixed sample would overflow (>= 1.0) and if yes, attenuate the input samples by just enough to get the result under 1.0. Ideally, you would apply less attenuation to the stronger signal, so that the weaker signal would not change its perceived loudness compared to the stronger one.
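
A quick way to try both ideas on float buffers might look like this. It is only a sketch: mix_attenuated and safe_gain are made-up names, the same gain is applied to both inputs (so it does not favour the stronger signal), and the gain is computed once per block rather than per sample.

```c
#include <stddef.h>

/* Mix two float buffers, attenuating each input by gain before summing. */
void mix_attenuated(const float *a, const float *b, float *out,
                    size_t n, float gain)
{
    size_t i;

    for (i = 0; i < n; i++)
        out[i] = gain * a[i] + gain * b[i];
}

/* Return the largest gain <= 1 that keeps |a[i] + b[i]| at or below
 * ceiling for every sample in the block. */
float safe_gain(const float *a, const float *b, size_t n, float ceiling)
{
    float peak = 0.0f;
    size_t i;

    for (i = 0; i < n; i++) {
        float s = a[i] + b[i];

        if (s < 0.0f)
            s = -s;
        if (s > peak)
            peak = s;
    }
    return (peak > ceiling) ? ceiling / peak : 1.0f;
}
```

Calling mix_attenuated() with a fixed gain of 0.7f is the quick test; passing the result of safe_gain() instead is the "just enough" variant. If the gain jumps from one block to the next, the level change can be audible.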

A smarter solution would implement look-ahead, where you watch the samples that are yet to be played and attenuate linearly over time.
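
One very simple way to sketch the look-ahead idea is to work in blocks: peek at the peak of the next block and ramp the gain linearly across the current one, so the attenuation is already in place when the loud part arrives. This is a toy illustration and not a proper limiter (no attack/release tuning); the names limit_lookahead, block_peak and gain_for are made up.

```c
#include <stddef.h>

/* Peak absolute value of n samples. */
static float block_peak(const float *x, size_t n)
{
    float peak = 0.0f;
    size_t i;

    for (i = 0; i < n; i++) {
        float v = x[i] < 0.0f ? -x[i] : x[i];

        if (v > peak)
            peak = v;
    }
    return peak;
}

/* Largest gain <= 1 that keeps a block with this peak under ceiling. */
static float gain_for(float peak, float ceiling)
{
    return peak > ceiling ? ceiling / peak : 1.0f;
}

/* Limit buf in place. The gain ramps linearly across each block.
 * The gain at each block boundary is the smaller of the safe gains
 * of the two blocks it touches, so the ramp never rises above the
 * safe gain of the block being processed. */
void limit_lookahead(float *buf, size_t n, size_t block, float ceiling)
{
    float start = 1.0f;     /* gain at the start of the current block */
    size_t pos = 0;

    if (block == 0)
        return;

    while (pos < n) {
        size_t len = (n - pos < block) ? n - pos : block;
        float here = gain_for(block_peak(buf + pos, len), ceiling);
        float end = here;
        size_t i;

        if (pos + len < n) {
            /* Look ahead: the next block is still unmodified here. */
            size_t rest = n - pos - len;
            size_t next_len = (rest < block) ? rest : block;
            float next = gain_for(block_peak(buf + pos + len, next_len), ceiling);

            if (next < end)
                end = next;
        }
        if (start > here)
            start = here;   /* only matters for the very first block */

        for (i = 0; i < len; i++) {
            float t = (float)i / (float)len;

            buf[pos + i] *= start + (end - start) * t;
        }
        start = end;
        pos += len;
    }
}
```

In a streaming player this would need one block of buffering (and therefore one block of latency) so that the next block is available to inspect.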

You can research “audio limiter algorithms” on the net if you want to get fancy.

Multiplying the samples before the mix did nothing. To cut problems down to one at a time, I had playogg() copy AIFF data from the pipe to the OGG file’s play buffer. I expected this to cause the OGG file to cut over to the AIFF and back again and sound at least like a file played at the wrong rate with some crunching noises. This doesn’t even work. All I’m getting are screeches and kerchunks roughly matching the cadence of the AIFF file. The tonality doesn’t change with any of the suggestions you’ve made. This is my main problem.