Audio Engine Internals
minjaesong edited this page 2025-11-24 21:24:45 +09:00

Audience: Engine maintainers working on the audio system implementation.

This document describes the internal architecture of Terrarum's audio engine, including the mixer thread, DSP pipeline, spatial audio calculations, and low-level audio processing.

Architecture Overview

The audio engine is built on a threaded mixer architecture with real-time DSP processing:

┌─────────────────┐
│  Game Thread    │
│  (Main Loop)    │
└────────┬────────┘
         │ Control Commands
         ↓
┌─────────────────┐
│  AudioMixer     │  ← Singleton, runs on separate thread
│  (Central Hub)  │
└────────┬────────┘
         │
    ┌────┴────┬──────────┬──────────┐
    ↓         ↓          ↓          ↓
 Track 0   Track 1   Track 2   Track N
    │         │          │          │
    ↓         ↓          ↓          ↓
 Filter   Filter    Filter    Filter
 Chain    Chain     Chain     Chain
    │         │          │          │
    └────┬────┴──────────┴──────────┘
         ↓
    ┌────────────┐
    │   Mixer    │  ← Combines all tracks
    └────┬───────┘
         ↓
    ┌────────────┐
    │  OpenAL    │  ← Hardware audio output
    │  Output    │
    └────────────┘

Threading Model

AudioManagerRunnable

The audio system runs on a dedicated thread:

class AudioManagerRunnable(
    val audioMixer: AudioMixer
) : Runnable {

    @Volatile var running = true
    private val updateInterval = 1000 / 60  // 60 Hz update rate

    override fun run() {
        while (running) {
            val startTime = System.currentTimeMillis()

            // Update all tracks
            audioMixer.update()

            // Sleep to maintain update rate
            val elapsed = System.currentTimeMillis() - startTime
            val sleepTime = maxOf(0L, updateInterval - elapsed)
            Thread.sleep(sleepTime)
        }
    }
}

Thread Safety

All audio operations are thread-safe:

import java.util.concurrent.locks.ReentrantLock
import kotlin.concurrent.withLock

@Volatile var trackVolume: TrackVolume  // Atomic updates
private val lock = ReentrantLock()      // Protects critical sections

fun setMusic(music: MusicContainer) {
    lock.withLock {
        currentMusic?.stop()
        currentMusic = music
        music.play()
    }
}

Update Rate

Audio updates at 60 Hz (every ~16.67ms) to match game frame rate:

  • Position updates for spatial audio
  • Volume envelope processing
  • Filter parameter interpolation
  • Crossfade calculations
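
The interpolation step can be sketched as a rate-limited move toward a target value, applied once per 60 Hz tick (names here are illustrative, not the engine's actual API):

```kotlin
// Move a filter parameter toward its target by at most ratePerSecond * delta,
// so parameter changes glide over several ticks instead of jumping.
fun interpolateParam(current: Float, target: Float, ratePerSecond: Float, deltaSeconds: Float): Float {
    val maxStep = ratePerSecond * deltaSeconds
    val diff = target - current
    return when {
        diff > maxStep  -> current + maxStep
        diff < -maxStep -> current - maxStep
        else            -> target
    }
}
```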

TerrarumAudioMixerTrack

Track Structure

Each track is an independent audio channel:

class TerrarumAudioMixerTrack(val index: Int) {
    var currentMusic: MusicContainer? = null
    var trackVolume: TrackVolume = TrackVolume(1.0f, 1.0f)
    var filters: Array<AudioFilter> = arrayOf(NullFilter, NullFilter)
    var trackingTarget: ActorWithBody? = null
    var state: TrackState = TrackState.STOPPED
    var startTime: Long = 0  // Set when playback begins; used for oldest-track eviction

    val processor: StreamProcessor
}

Track State Machine

enum class TrackState {
    STOPPED,    // Not playing
    PLAYING,    // Currently playing
    PAUSED,     // Paused, can resume
    STOPPING,   // Fading out before stop
}

State transitions:

  • STOPPED → PLAYING via play()
  • PLAYING → PAUSED via pause()
  • PAUSED → PLAYING via resume()
  • PLAYING or PAUSED → STOPPING via stop() (fade-out begins)
  • STOPPING → STOPPED once the fade-out completes

Stream Processor

Handles low-level audio streaming:

class StreamProcessor(val track: TerrarumAudioMixerTrack) {
    var streamBuf: AudioDevice?  // OpenAL audio device
    private val buffer = FloatArray(BUFFER_SIZE)

    fun fillBuffer(): FloatArray {
        val source = track.currentMusic ?: return emptyBuffer()

        // Read raw PCM data from source
        val samples = source.readSamples(BUFFER_SIZE)

        // Apply volume
        applyStereoVolume(samples, track.trackVolume)

        // Process through filter chain
        val filtered = applyFilters(samples, track.filters)

        // Apply spatial positioning
        if (track.trackingTarget != null) {
            applySpatialAudio(filtered, track.trackingTarget!!)
        }

        return filtered
    }
}

Buffer Size

Buffer size is user-configurable (from MixerTrackProcessor.kt and OpenALBufferedAudioDevice.kt):

class MixerTrackProcessor(bufferSize: Int, val rate: Int, ...)
class OpenALBufferedAudioDevice(..., val bufferSize: Int, ...)

Range: 128 to 4096 samples per buffer
Sample rate: 48000 Hz (constant, from TerrarumAudioMixerTrack.SAMPLING_RATE)

Example buffer duration calculations:

  • 128 samples: 128 / 48000 ≈ 2.7ms latency
  • 4096 samples: 4096 / 48000 ≈ 85ms latency

Trade-off:

  • Smaller buffers → Lower latency, higher CPU usage, risk of audio dropouts
  • Larger buffers → Higher latency, lower CPU usage, more stable playback
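
The latency figures above follow directly from bufferSize / sampleRate:

```kotlin
// Latency contributed by one buffer, in milliseconds, at the mixer's
// fixed 48 kHz sampling rate.
const val SAMPLING_RATE = 48000

fun bufferLatencyMs(bufferSize: Int, sampleRate: Int = SAMPLING_RATE): Double =
    bufferSize * 1000.0 / sampleRate
```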

Audio Streaming

MusicContainer

Wraps audio files for streaming:

class MusicContainer(
    val file: FileHandle,
    val format: AudioFormat,
    var looping: Boolean = false
) {
    private var decoder: AudioDecoder? = null
    private var position: Long = 0  // Sample position

    fun readSamples(count: Int): FloatArray {
        if (decoder == null) {
            decoder = createDecoder(file, format)
        }

        val samples = FloatArray(count)
        val read = decoder!!.read(samples, 0, count)

        if (read < count) {
            // Reached end of file
            if (looping) {
                seek(0)
                decoder!!.read(samples, read, count - read)
            } else {
                // Pad with silence
                samples.fill(0f, read, count)
            }
        }

        position += read
        return samples
    }
}

Audio Decoders

Support multiple formats:

interface AudioDecoder {
    fun read(buffer: FloatArray, offset: Int, length: Int): Int
    fun seek(samplePosition: Long)
    fun close()
}

class OggVorbisDecoder : AudioDecoder { ... }
class WavDecoder : AudioDecoder { ... }
class Mp3Decoder : AudioDecoder { ... }

Streaming vs. Loading

  • Streaming — Large files (music): read chunks on-demand
  • Fully loaded — Small files (SFX): load entire file into memory

val isTooLargeToLoad = file.length() > 5 * 1024 * 1024  // 5 MB threshold

val container = if (isTooLargeToLoad) {
    StreamingMusicContainer(file)
} else {
    LoadedMusicContainer(file)
}

DSP Filter Pipeline

Filter Interface

interface AudioFilter {
    fun process(samples: FloatArray, sampleRate: Int)
    fun reset()
}

Filter Chain

Each track has two filter slots:

fun applyFilters(samples: FloatArray, filters: Array<AudioFilter>): FloatArray {
    var current = samples.copyOf()

    for (filter in filters) {
        if (filter !is NullFilter) {
            filter.process(current, SAMPLE_RATE)
        }
    }

    return current
}

Built-in Filters

LowPassFilter

Reduces high frequencies (muffled sound):

class LowPassFilter(var cutoffFreq: Float) : AudioFilter {
    private var prevSample = 0f

    override fun process(samples: FloatArray, sampleRate: Int) {
        val rc = 1.0f / (cutoffFreq * 2f * PI.toFloat())
        val dt = 1.0f / sampleRate
        val alpha = dt / (rc + dt)

        for (i in samples.indices) {
            prevSample = prevSample + alpha * (samples[i] - prevSample)
            samples[i] = prevSample
        }
    }
}
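
The smoothing factor alpha comes from the analogue RC low-pass model: RC = 1 / (2π · cutoff) and alpha = dt / (RC + dt). For a 1 kHz cutoff at 48 kHz this works out to alpha ≈ 0.116:

```kotlin
import kotlin.math.PI

// alpha for a one-pole RC low-pass: dt / (RC + dt), where RC = 1 / (2π f_c).
fun lowPassAlpha(cutoffFreq: Float, sampleRate: Int): Float {
    val rc = 1.0f / (cutoffFreq * 2f * PI.toFloat())
    val dt = 1.0f / sampleRate
    return dt / (rc + dt)
}
```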

HighPassFilter

Reduces low frequencies (tinny sound):

class HighPassFilter(var cutoffFreq: Float) : AudioFilter {
    private var prevInput = 0f
    private var prevOutput = 0f

    override fun process(samples: FloatArray, sampleRate: Int) {
        val rc = 1.0f / (cutoffFreq * 2f * PI.toFloat())
        val dt = 1.0f / sampleRate
        val alpha = rc / (rc + dt)

        for (i in samples.indices) {
            prevOutput = alpha * (prevOutput + samples[i] - prevInput)
            prevInput = samples[i]
            samples[i] = prevOutput
        }
    }
}

Convolv

Convolutional reverb using impulse response (IR) files and FFT-based fast convolution:

class Convolv(
    irModule: String,        // Module containing IR file
    irPath: String,          // Path to IR file
    val crossfeed: Float,    // Stereo crossfeed amount (0.0-1.0)
    gain: Float = 1f / 256f  // Output gain
) : TerrarumAudioFilter() {

    private val fftLen: Int
    private val convFFT: Array<ComplexArray>
    private val sumbuf: Array<ComplexArray>

    init {
        convFFT = AudioHelper.getIR(irModule, irPath)
        fftLen = convFFT[0].size
        sumbuf = Array(2) { ComplexArray(FloatArray(fftLen * 2)) }
    }

    override fun thru(inbuf: List<FloatArray>, outbuf: List<FloatArray>) {
        // Fast convolution using overlap-add method
        pushSum(gain, inbuf[0], inbuf[1], sumbuf)

        convolve(sumbuf[0], convFFT[0], fftOutL)
        convolve(sumbuf[1], convFFT[1], fftOutR)

        // Extract final output from FFT results
        for (i in 0 until App.audioBufferSize) {
            outbuf[0][i] = fftOutL[fftLen - App.audioBufferSize + i]
            outbuf[1][i] = fftOutR[fftLen - App.audioBufferSize + i]
        }
    }

    private fun convolve(x: ComplexArray, h: ComplexArray, output: FloatArray) {
        FFT.fftInto(x, fftIn)
        fftIn.mult(h, fftMult)
        FFT.ifftAndGetReal(fftMult, output)
    }
}

Impulse Response (IR) Files:

  • Binary files containing a two-channel impulse response (one IR per stereo channel)
  • Loaded via AudioHelper.getIR(module, path)
  • FFT length determined by IR file size
  • Supports arbitrary room acoustics via recorded/synthesised IRs
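
Fast convolution works because pointwise multiplication of spectra equals convolution in the time domain. A naive time-domain convolution, useful as a correctness reference for the FFT path (O(N·M) versus roughly O(N log N)):

```kotlin
// Direct convolution of a signal with an impulse response; the FFT-based
// path must agree with this (up to rounding) while being far cheaper
// for long IRs.
fun convolveDirect(signal: FloatArray, ir: FloatArray): FloatArray {
    val out = FloatArray(signal.size + ir.size - 1)
    for (i in signal.indices)
        for (j in ir.indices)
            out[i + j] += signal[i] * ir[j]
    return out
}
```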

BinoPan

Binaural panning for stereo positioning:

class BinoPan(var pan: Float) : AudioFilter {
    // pan: -1.0 (full left) to 1.0 (full right)

    override fun process(samples: FloatArray, sampleRate: Int) {
        val leftGain = sqrt((1.0f - pan) / 2.0f)
        val rightGain = sqrt((1.0f + pan) / 2.0f)

        for (i in samples.indices step 2) {
            val mono = (samples[i] + samples[i + 1]) / 2
            samples[i] = mono * leftGain       // Left channel
            samples[i + 1] = mono * rightGain  // Right channel
        }
    }
}
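
The square-root gains implement constant-power panning: leftGain² + rightGain² = 1 for any pan in [-1, 1], so perceived loudness stays roughly constant as a sound moves across the stereo field. A quick check:

```kotlin
import kotlin.math.sqrt

// Gains as computed in BinoPan.process(); their squared sum is always 1.
fun panGains(pan: Float): Pair<Float, Float> =
    Pair(sqrt((1.0f - pan) / 2.0f), sqrt((1.0f + pan) / 2.0f))
```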

Volume Envelope Processing

Fade In/Out

class VolumeEnvelope {
    var targetVolume: Float = 1.0f
    var currentVolume: Float = 0.0f
    var fadeSpeed: Float = 1.0f  // Volume units per second

    fun update(delta: Float) {
        val change = fadeSpeed * delta

        currentVolume = when {
            currentVolume < targetVolume -> minOf(currentVolume + change, targetVolume)
            currentVolume > targetVolume -> maxOf(currentVolume - change, targetVolume)
            else -> currentVolume
        }
    }
}
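
A crossfade between two tracks can be built from two envelopes moving in opposite directions (an illustrative sketch using the class above, not the engine's actual crossfade code):

```kotlin
class VolumeEnvelope(
    var targetVolume: Float = 1.0f,
    var currentVolume: Float = 0.0f,
    var fadeSpeed: Float = 1.0f  // volume units per second
) {
    fun update(delta: Float) {
        val change = fadeSpeed * delta
        currentVolume = when {
            currentVolume < targetVolume -> minOf(currentVolume + change, targetVolume)
            currentVolume > targetVolume -> maxOf(currentVolume - change, targetVolume)
            else -> currentVolume
        }
    }
}

// Fade the outgoing track down while the incoming track rises.
val outgoing = VolumeEnvelope(targetVolume = 0f, currentVolume = 1f, fadeSpeed = 0.5f)
val incoming = VolumeEnvelope(targetVolume = 1f, currentVolume = 0f, fadeSpeed = 0.5f)
repeat(150) {  // 2.5 s of 60 Hz ticks, more than the 2 s a full fade needs
    outgoing.update(1f / 60f)
    incoming.update(1f / 60f)
}
```

Because update() clamps at the target, both envelopes land exactly on 0 and 1 once the fade completes.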

Track Allocation

Dynamic Track Pool

class AudioMixer {
    val dynamicTracks = Array(MAX_TRACKS) { TerrarumAudioMixerTrack(it) }

    fun getAvailableTrack(): TerrarumAudioMixerTrack? {
        // Find stopped track
        dynamicTracks.firstOrNull { it.state == TrackState.STOPPED }?.let { return it }

        // Find oldest playing track
        return dynamicTracks.minByOrNull { it.startTime }
    }
}

Common Issues

Buffer Underrun

Symptoms: Audio stuttering, crackling

Causes:

  • Update thread lagging
  • Heavy CPU load
  • Inefficient filters

Solutions:

  • Increase buffer size
  • Optimise filter algorithms
  • Reduce track count

Clipping

Symptoms: Distorted, harsh sound

Causes:

  • Excessive volume
  • Too many overlapping sounds
  • Improper mixing

Solutions:

  • Normalise audio files
  • Implement limiter/compressor
  • Reduce master volume
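
The simplest protection against clipping is a hard limiter at the mix stage (a sketch; a production limiter would smooth gain with attack/release times instead of clipping per sample):

```kotlin
import kotlin.math.abs

// Clamp every sample to ±ceiling before handing the buffer to OpenAL.
fun hardLimit(samples: FloatArray, ceiling: Float = 0.98f) {
    for (i in samples.indices) {
        if (abs(samples[i]) > ceiling)
            samples[i] = if (samples[i] > 0f) ceiling else -ceiling
    }
}
```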

Latency

Symptoms: Delayed audio response

Causes:

  • Large buffer size
  • Slow update rate

Solutions:

  • Reduce buffer size
  • Increase update rate
  • Use hardware audio acceleration

Future Enhancements

  1. Hardware-accelerated DSP — Use OpenAL EFX effects
  2. Audio compression — Reduce memory usage with codecs
  3. Dynamic range compression — Automatic volume normalisation
  4. 3D HRTF — Head-related transfer functions for realistic 3D
  5. Ambisonics — Full spherical surround sound
  6. Audio streaming from network — Multiplayer voice chat
  7. Procedural audio — Generate sounds algorithmically

See Also