libtavenc - TAV Video Encoder Library

libtavenc is a high-performance video encoding library implementing the TSVM Advanced Video (TAV) codec. It provides a clean C API for encoding RGB24 video frames using a discrete wavelet transform (DWT) with perceptual quantization and GOP-based temporal compression.

Features

Multiple Wavelet Types: CDF 5/3, CDF 9/7, CDF 13/7, DD-4, Haar
3D DWT GOP Encoding: Temporal + spatial wavelet compression
Perceptual Quantization: HVS-optimized coefficient scaling
EZBC Entropy Coding: Efficient coefficient compression with Zstd
Multi-threading: Internal thread pool that parallelizes GOP encoding across worker threads
Color Spaces: YCoCg-R (default) and ICtCp (for HDR)
Quality Levels: 0-5 (0=lowest/smallest, 5=highest/largest)

Building

# Build static library
make lib/libtavenc.a

# Build with encoder CLI
make encoder_tav

# Install library and headers
make install-libs PREFIX=/usr/local

Quick Start

Basic Encoding

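#include "tav_encoder_lib.h"
#include <stdint.h>
#include <stdio.h>

// Note: num_frames and outfile (a FILE * opened for binary writing) are
// placeholders that the surrounding application is expected to provide.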

int main() {
    // Initialize encoder parameters
    tav_encoder_params_t params;
    tav_encoder_params_init(&params, 1920, 1080);

    // Configure encoding options
    params.fps_num = 60;
    params.fps_den = 1;
    params.wavelet_type = 1;        // CDF 9/7 (default)
    params.quality_y = 3;            // Quality level 3
    params.quality_co = 3;
    params.quality_cg = 3;
    params.enable_temporal_dwt = 1;  // Enable 3D GOP encoding
    params.gop_size = 0;             // Auto-calculate (typically 16-24)
    params.num_threads = 4;          // 4 worker threads

    // Create encoder context
    tav_encoder_context_t *ctx = tav_encoder_create(&params);
    if (!ctx) {
        fprintf(stderr, "Failed to create encoder\n");
        return -1;
    }

    // Get actual parameters (with auto-calculated values)
    tav_encoder_get_params(ctx, &params);
    printf("GOP size: %d frames\n", params.gop_size);

    // Encode frames
    uint8_t *rgb_frame = /* ... load RGB24 frame ... */;
    tav_encoder_packet_t *packet;

    for (int i = 0; i < num_frames; i++) {
        int result = tav_encoder_encode_frame(ctx, rgb_frame, i, &packet);

        if (result == 1) {
            // Packet ready (GOP completed)
            fwrite(packet->data, 1, packet->size, outfile);
            tav_encoder_free_packet(packet);
        }
        else if (result == 0) {
            // Frame buffered, waiting for GOP to fill
        }
        else {
            // Error
            fprintf(stderr, "Encoding error: %s\n", tav_encoder_get_error(ctx));
            break;
        }
    }

    // Flush remaining frames
    while (tav_encoder_flush(ctx, &packet) == 1) {
        fwrite(packet->data, 1, packet->size, outfile);
        tav_encoder_free_packet(packet);
    }

    // Cleanup
    tav_encoder_free(ctx);
    return 0;
}
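
The example above leaves frame loading to the application. As a minimal sketch, assuming the input is a headerless raw file whose frames are already stored in the layout the encoder expects, a loader could look like the following. `read_rgb24_frame` is a hypothetical helper for illustration and is not part of libtavenc.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper (not part of libtavenc): reads one frame of
 * width * height * 3 bytes from a raw file into buf.
 * Returns 1 on success, 0 on a clean end of file, -1 on a short read
 * or I/O error. */
static int read_rgb24_frame(FILE *in, uint8_t *buf, int width, int height)
{
    size_t frame_bytes = (size_t)width * (size_t)height * 3u;
    size_t got = fread(buf, 1, frame_bytes, in);

    if (got == frame_bytes) {
        return 1;               /* full frame read */
    }
    if (got == 0 && feof(in)) {
        return 0;               /* no more frames */
    }
    return -1;                  /* truncated frame or read error */
}
```

A caller would allocate `width * height * 3` bytes once, then loop on this helper and pass the buffer to tav_encoder_encode_frame() for as long as it returns 1.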

Stateless GOP Encoding (Multi-threaded)

The library provides tav_encoder_encode_gop() for stateless GOP encoding. This suits multi-threaded applications in which each thread encodes whole GOPs using its own encoder context:

#include "tav_encoder_lib.h"
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>

typedef struct {
    tav_encoder_params_t params;
    uint8_t **rgb_frames;
    int num_frames;
    int *frame_numbers;
    tav_encoder_packet_t *output_packet;
} gop_encode_job_t;

void *encode_gop_thread(void *arg) {
    gop_encode_job_t *job = (gop_encode_job_t *)arg;

    // Start with no packet so the caller can detect a failed job
    job->output_packet = NULL;

    // Create thread-local encoder context
    tav_encoder_context_t *ctx = tav_encoder_create(&job->params);
    if (!ctx) {
        return NULL;
    }

    // Encode entire GOP at once (stateless, thread-safe)
    int result = tav_encoder_encode_gop(ctx,
                                        (const uint8_t **)job->rgb_frames,
                                        job->num_frames,
                                        job->frame_numbers,
                                        &job->output_packet);
    if (result < 0) {
        // Encoding failed: make sure the caller does not see a packet
        job->output_packet = NULL;
    }

    tav_encoder_free(ctx);
    return NULL;
}

int main() {
    // Setup parameters
    tav_encoder_params_t params;
    tav_encoder_params_init(&params, 1920, 1080);
    params.enable_temporal_dwt = 1;
    params.gop_size = 24;

    // Create worker threads
    pthread_t threads[4];
    gop_encode_job_t jobs[4];

    for (int i = 0; i < 4; i++) {
        jobs[i].params = params;
        jobs[i].rgb_frames = /* ... load GOP frames ... */;
        jobs[i].num_frames = 24;
        jobs[i].frame_numbers = /* ... frame indices ... */;

        pthread_create(&threads[i], NULL, encode_gop_thread, &jobs[i]);
    }

    // Wait for completion
    for (int i = 0; i < 4; i++) {
        pthread_join(threads[i], NULL);

        // Write output packet
        if (jobs[i].output_packet) {
            fwrite(jobs[i].output_packet->data, 1,
                   jobs[i].output_packet->size, outfile);
            tav_encoder_free_packet(jobs[i].output_packet);
        }
    }

    return 0;
}

API Reference

Context Management

`tav_encoder_create()`

Creates an encoder context with the specified parameters. Allocates internal buffers and initializes the thread pool if multi-threading is enabled.

Returns: Encoder context or NULL on failure

`tav_encoder_free()`

Frees the encoder context and all associated resources. Frames still buffered in an unflushed GOP are lost, so call tav_encoder_flush() first.

`tav_encoder_get_error()`

Returns the last error message string.

`tav_encoder_get_params()`

Gets encoder parameters with calculated values (e.g., auto-calculated GOP size, decomposition levels).

Frame Encoding

`tav_encoder_encode_frame()`

Encodes a single RGB24 frame. Frames are buffered until the GOP is full, at which point a packet is returned.

Parameters:

rgb_frame: RGB24 planar format [R...][G...][B...], width×height×3 bytes
frame_pts: Presentation timestamp (frame number or time)
packet: Output packet pointer (NULL if the GOP is not ready yet)

Returns:

1: Packet ready (GOP completed)
0: Frame buffered, waiting for more frames
-1: Error
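
Many image and capture libraries deliver interleaved RGBRGB... pixels, while the rgb_frame parameter above is described as planar [R...][G...][B...]. The following sketch shows one way to convert between the two. `rgb24_packed_to_planar` is a hypothetical helper for illustration and is not part of libtavenc.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of libtavenc): converts interleaved
 * RGBRGB... pixels into the planar [R...][G...][B...] layout.
 * dst must hold width * height * 3 bytes and must not overlap src. */
static void rgb24_packed_to_planar(const uint8_t *src, uint8_t *dst,
                                   int width, int height)
{
    size_t pixels = (size_t)width * (size_t)height;
    uint8_t *r = dst;               /* first plane: all red samples   */
    uint8_t *g = dst + pixels;      /* second plane: all green samples */
    uint8_t *b = dst + 2 * pixels;  /* third plane: all blue samples  */

    for (size_t i = 0; i < pixels; i++) {
        r[i] = src[3 * i + 0];
        g[i] = src[3 * i + 1];
        b[i] = src[3 * i + 2];
    }
}
```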

`tav_encoder_flush()`

Flushes the remaining buffered frames and encodes the final GOP. Call it at the end of the stream, repeating until it returns 0.

Returns:

1: Packet ready
0: No more packets
-1: Error

`tav_encoder_encode_gop()`

Stateless GOP encoding. Thread-safe with separate contexts.

Parameters:

rgb_frames: Array of RGB24 frames [frame][width×height×3]
num_frames: Number of frames in GOP (1-24)
frame_numbers: Frame indices for timecodes (can be NULL)
packet: Output packet pointer

Returns: 1 on success, -1 on error
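
When feeding tav_encoder_encode_gop() from several threads, the application has to decide which frames belong to which GOP. A minimal sketch of that bookkeeping, assuming fixed-size GOPs with a shorter final one, is shown below. `gop_range_t` and `split_into_gops` are hypothetical names for illustration and are not part of libtavenc.

```c
#include <stddef.h>

/* Hypothetical job descriptor: a contiguous run of frames for one GOP. */
typedef struct {
    int first_frame;  /* index of the first frame in the stream */
    int num_frames;   /* frames in this GOP (the last GOP may be shorter) */
} gop_range_t;

/* Fills out[] with at most max_jobs ranges that cover total_frames frames
 * in chunks of gop_size. Returns the number of ranges written. */
static int split_into_gops(int total_frames, int gop_size,
                           gop_range_t *out, int max_jobs)
{
    int count = 0;

    if (total_frames <= 0 || gop_size <= 0) {
        return 0;
    }
    for (int first = 0; first < total_frames && count < max_jobs;
         first += gop_size) {
        int remaining = total_frames - first;
        out[count].first_frame = first;
        out[count].num_frames = remaining < gop_size ? remaining : gop_size;
        count++;
    }
    return count;
}
```

Each range can then be turned into one job whose frame_numbers run from first_frame to first_frame + num_frames - 1. Packets should be written in range order regardless of which thread finishes first.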

Packet Management

`tav_encoder_free_packet()`

Frees packet returned by encoding functions.

Encoder Parameters

Video Dimensions

width, height: Frame dimensions (must be even)
fps_num, fps_den: Framerate (e.g., 60/1 for 60fps)

Wavelet Configuration

wavelet_type: Spatial wavelet
- 0: CDF 5/3 (reversible, lossless-capable)
- 1: CDF 9/7 (default, best compression)
- 2: CDF 13/7 (experimental)
- 16: DD-4 (four-point interpolating)
- 255: Haar (demonstration)
temporal_wavelet: Temporal wavelet for 3D DWT
- 0: Haar (default for sports/high motion)
- 1: CDF 5/3 (smooth motion)
decomp_levels: Spatial DWT levels (0=auto, typically 6)
temporal_levels: Temporal DWT levels (0=auto, typically 2 for 8-frame GOPs)
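
To illustrate what one temporal Haar level does to a pair of frames, here is a generic integer lifting sketch. It is a textbook reversible Haar step, not the library's implementation, which may use different arithmetic and scaling. All function names are hypothetical.

```c
#include <stddef.h>

/* floor(x / 2) for any int, without relying on how >> treats negatives. */
static int floor_half(int x)
{
    return x >= 0 ? x / 2 : -((-x + 1) / 2);
}

/* One Haar lifting step over two frames of n samples each, in place.
 * Afterwards a[] holds the low-pass band (roughly the average of the two
 * frames) and b[] holds the high-pass band (their difference). */
static void haar_pair_forward(int *a, int *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int d = b[i] - a[i];
        int s = a[i] + floor_half(d);
        a[i] = s;
        b[i] = d;
    }
}

/* Exact inverse of haar_pair_forward. */
static void haar_pair_inverse(int *a, int *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int d = b[i];
        int first = a[i] - floor_half(d);
        a[i] = first;
        b[i] = first + d;
    }
}
```

For static content the difference band is all zeros, which is why temporal decomposition compresses well when consecutive frames are similar.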

Color Space

channel_layout:
- 0: YCoCg-R (default, efficient chroma)
- 1: ICtCp (for HDR/BT.2100 sources)
perceptual_tuning: 1=enable HVS perceptual quantization (default), 0=uniform
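
For readers unfamiliar with YCoCg-R, the sketch below shows the standard reversible lifting form of the transform for a single pixel. It illustrates the color space only and is not taken from the library source, which may differ in details such as scaling or sample range. The function names are hypothetical.

```c
/* floor(x / 2) for any int, without relying on how >> treats negatives. */
static int floor_half(int x)
{
    return x >= 0 ? x / 2 : -((-x + 1) / 2);
}

/* Forward YCoCg-R: integer lifting steps, exactly invertible. */
static void rgb_to_ycocg_r(int r, int g, int b, int *y, int *co, int *cg)
{
    int t;
    *co = r - b;
    t   = b + floor_half(*co);
    *cg = g - t;
    *y  = t + floor_half(*cg);
}

/* Inverse YCoCg-R: undoes the lifting steps in reverse order. */
static void ycocg_r_to_rgb(int y, int co, int cg, int *r, int *g, int *b)
{
    int t = y - floor_half(cg);
    *g = cg + t;
    *b = t - floor_half(co);
    *r = *b + co;
}
```

Note that Co and Cg need one more bit of range than the 8-bit inputs, because they can be negative.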

GOP Configuration

enable_temporal_dwt: 1=enable 3D DWT GOP encoding (default), 0=intra-only I-frames
gop_size: Frames per GOP (8, 16, or 24; 0=auto based on framerate)
enable_two_pass: 1=enable two-pass with scene change detection (default), 0=single-pass

Quality Control

quality_y: Luma quality (0-5, default: 3)
quality_co: Orange chrominance quality (0-5, default: 3)
quality_cg: Green chrominance quality (0-5, default: 3)
dead_zone_threshold: Dead-zone quantization (0=disabled, 1-10 typical)

Entropy Coding

entropy_coder:
- 0: Twobitmap (default, fast)
- 1: EZBC (better compression for high-quality)
zstd_level: Zstd compression level (3-22, default: 7)

Multi-threading

num_threads: Worker threads
- 0: Single-threaded (default for CLI)
- -1: Auto-detect CPU cores
- 1-16: Explicit thread count

Encoder Presets

encoder_preset: Preset flags
- 0x01: Sports mode (finer temporal quantization)
- 0x02: Anime mode (disable grain)
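
The preset values are listed as flags, which suggests they can be combined with bitwise OR. Whether the encoder accepts both at once is not stated here, so treat the following as a sketch. The constant names are hypothetical; the library header may define its own.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for the documented flag values. */
enum {
    PRESET_SPORTS = 0x01,  /* finer temporal quantization */
    PRESET_ANIME  = 0x02   /* disable grain */
};

/* Returns true if the given flag bit is set in preset. */
static bool preset_has(uint32_t preset, uint32_t flag)
{
    return (preset & flag) != 0;
}
```

With these names, `params.encoder_preset = PRESET_SPORTS;` would select sports mode.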

TAV Packet Types

Each output packet has a type field indicating its content:

0x10: I-frame (intra-only, single frame)
0x11: P-frame (delta from previous)
0x12: GOP unified (3D DWT, multiple frames)
0x24: TAD audio (DWT-based audio codec)
0xF0: Loop point start
0xFC: GOP sync (frame count marker)
0xFD: Timecode metadata
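
When debugging a muxer or inspecting a stream, it helps to print packet types by name. The lookup below simply mirrors the table above. `tav_packet_type_name` is a hypothetical helper and is not part of libtavenc.

```c
#include <stdint.h>

/* Hypothetical helper: maps a packet type byte to a readable label,
 * using the values documented in this README. */
static const char *tav_packet_type_name(uint8_t type)
{
    switch (type) {
    case 0x10: return "I-frame";
    case 0x11: return "P-frame";
    case 0x12: return "GOP unified";
    case 0x24: return "TAD audio";
    case 0xF0: return "Loop point start";
    case 0xFC: return "GOP sync";
    case 0xFD: return "Timecode metadata";
    default:   return "unknown";
    }
}
```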

Performance Notes

Threading Model

The library manages an internal thread pool when num_threads is non-zero (-1 for auto-detect, or an explicit count)
GOP encoding is parallelized across worker threads
For CLI tools: use num_threads=0 (single-threaded) to avoid double-threading when parallelism is already handled externally
For library integration: use num_threads=-1 or an explicit count for best throughput

Memory Usage

Each encoder context allocates:
- GOP buffer: gop_size × width × height × 3 bytes (RGB frames)
- DWT coefficients: ~width × height × 12 bytes per channel
- Thread pool: num_threads × (GOP buffer + workspace)
Typical 1920×1080 encoder with GOP=24: ~180 MB per context
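
The first two terms above are plain arithmetic and can be estimated before creating a context. The sketch below evaluates only those two formulas. It does not model the thread pool or workspace, so it is not expected to reproduce the quoted per-context figure exactly. The function names are hypothetical.

```c
#include <stdint.h>

/* GOP buffer term: gop_size x width x height x 3 bytes of RGB input. */
static uint64_t gop_buffer_bytes(int gop_size, int width, int height)
{
    return (uint64_t)gop_size * (uint64_t)width * (uint64_t)height * 3u;
}

/* DWT coefficient term: roughly width x height x 12 bytes per channel. */
static uint64_t dwt_coeff_bytes_per_channel(int width, int height)
{
    return (uint64_t)width * (uint64_t)height * 12u;
}
```

For 1920×1080 with GOP=24, the GOP buffer term comes to 149,299,200 bytes (about 149 MB) and the coefficient term to 24,883,200 bytes (about 25 MB) per channel.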

Encoding Speed

Single-threaded: 10-15 fps (1920×1080 on modern CPU)
Multi-threaded (4 threads): 30-40 fps
GOP size affects latency: larger GOP = higher latency, better compression
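
Because frames are buffered until a GOP is full, the GOP size sets a lower bound on latency: no packet can be produced until a whole GOP of input has arrived. A rough estimate of that buffering delay, ignoring encode time, is sketched below. `gop_buffer_seconds` is a hypothetical helper.

```c
/* Seconds of video that must be submitted before a full GOP is available,
 * given the frame rate as fps_num / fps_den. Encode time is not included. */
static double gop_buffer_seconds(int gop_size, int fps_num, int fps_den)
{
    return (double)gop_size * (double)fps_den / (double)fps_num;
}
```

A 24-frame GOP at 60/1 fps therefore implies at least 0.4 seconds of buffering.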

Integration with TAD Audio

TAV files typically include TAD-compressed audio. Link with both libraries:

#include "tav_encoder_lib.h"
#include "encoder_tad.h"

// Encode video frame
tav_encoder_encode_frame(video_ctx, rgb_frame, pts, &video_packet);

// Encode audio chunk (32kHz stereo, float samples)
tad32_encode_chunk(audio_ctx, pcm_samples, num_samples, &audio_data, &audio_size);

// Mux both into TAV file (interleave by frame PTS)

Error Handling

Encoding functions return a negative value on error, and tav_encoder_create() returns NULL. After a failed encoding call, the error message is available via tav_encoder_get_error():

if (tav_encoder_encode_frame(ctx, frame, pts, &packet) < 0) {
    fprintf(stderr, "Encoding failed: %s\n", tav_encoder_get_error(ctx));
    // Handle error
}

Limitations

Maximum resolution: 8192×8192
GOP size: 1-48 frames
Single-tile encoding only (no spatial tiling)
Requires even width and height
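
The resolution limits above can be checked before calling tav_encoder_params_init(), which gives the application a chance to report a clear message. `tav_dimensions_ok` is a hypothetical helper and is not part of libtavenc.

```c
#include <stdbool.h>

/* Hypothetical pre-flight check based on the limits listed in this README:
 * both dimensions positive, even, and no larger than 8192. */
static bool tav_dimensions_ok(int width, int height)
{
    return width > 0 && height > 0 &&
           width <= 8192 && height <= 8192 &&
           width % 2 == 0 && height % 2 == 0;
}
```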

License

Part of the TSVM project.
