mirror of
https://github.com/curioustorvald/tsvm.git
synced 2026-03-07 19:51:51 +09:00
TAD documentation update
This commit is contained in:
53
CLAUDE.md
53
CLAUDE.md
@@ -316,20 +316,23 @@ Implemented on 2025-10-15 for improved temporal compression through group-of-pic
|
|||||||
- **Adaptive GOPs**: Scene change detection ensures optimal GOP boundaries
|
- **Adaptive GOPs**: Scene change detection ensures optimal GOP boundaries
|
||||||
|
|
||||||
#### TAD Format (TSVM Advanced Audio)
|
#### TAD Format (TSVM Advanced Audio)
|
||||||
- **Perceptual audio codec** for TSVM using DWT with 4-tap interpolating Deslauriers-Dubuc wavelets
|
- **Perceptual audio codec** for TSVM using CDF 9/7 biorthogonal wavelets
|
||||||
- **C Encoder**: `video_encoder/encoder_tad.c` - Core Encoder library; `video_encoder/encoder_tad_standalone.c` - Standalone encoder with FFmpeg integration
|
- **C Encoder**: `video_encoder/encoder_tad.c` - Core Encoder library; `video_encoder/encoder_tad_standalone.c` - Standalone encoder with FFmpeg integration
|
||||||
- How to build: `make tad`
|
- How to build: `make tad`
|
||||||
- **Quality Levels**: 0-5 (0=lowest quality/smallest, 5=highest quality/largest; designed to be in sync with TAV encoder)
|
- **Quality Levels**: 0-5 (0=lowest quality/smallest, 5=highest quality/largest; designed to be in sync with TAV encoder)
|
||||||
- **C Decoder**: `video_encoder/decoder_tad.c` - Standalone decoder for TAD format
|
- **C Decoder**: `video_encoder/decoder_tad.c` - Standalone decoder for TAD format
|
||||||
|
- **Kotlin Decoder**: `AudioAdapter.kt` - Hardware-accelerated TAD decoder for TSVM runtime
|
||||||
- **Features**:
|
- **Features**:
|
||||||
- **32 KHz stereo**: TSVM audio hardware native format
|
- **32 KHz stereo**: TSVM audio hardware native format
|
||||||
- **Variable chunk sizes**: 1024-32768+ samples, enables flexible TAV integration
|
- **Variable chunk sizes**: Any size ≥1024 samples, including non-power-of-2 (e.g., 32016 for TAV 1-second GOPs)
|
||||||
- **M/S stereo decorrelation**: Exploits stereo correlation for better compression
|
- **M/S stereo decorrelation**: Exploits stereo correlation for better compression
|
||||||
- **PCM16→PCM8 conversion**: Error-diffusion dithering to minimize quantization noise
|
- **Gamma compression**: Dynamic range compression (γ=0.707) before quantization
|
||||||
- **Variable-level DD-4 DWT**: Dynamic levels (log2(chunk_size) - 2) for frequency domain analysis
|
- **9-level CDF 9/7 DWT**: Fixed 9 decomposition levels for all chunk sizes
|
||||||
- **Perceptual quantization**: Frequency-dependent weights preserving critical 2-4 KHz range
|
- **Perceptual quantization**: Frequency-dependent weights with lambda companding
|
||||||
- **2-bit twobitmap significance map**: Efficient encoding of sparse coefficients
|
- **Raw int8 storage**: Direct coefficient storage (no significance map, better Zstd compression)
|
||||||
- **Optional Zstd compression**: Level 7 for additional compression
|
- **Coefficient-domain dithering**: Light TPDF dithering to reduce banding
|
||||||
|
- **Zstd compression**: Level 7 for additional compression
|
||||||
|
- **Non-power-of-2 support**: Fixed 2025-10-30 to handle arbitrary chunk sizes correctly
|
||||||
- **Usage Examples**:
|
- **Usage Examples**:
|
||||||
```bash
|
```bash
|
||||||
# Encode with default quality (Q3)
|
# Encode with default quality (Q3)
|
||||||
@@ -348,7 +351,7 @@ Implemented on 2025-10-15 for improved temporal compression through group-of-pic
|
|||||||
decoder_tad -i input.tad -o output.pcm
|
decoder_tad -i input.tad -o output.pcm
|
||||||
```
|
```
|
||||||
- **Format documentation**: `terranmon.txt` (search for "TSVM Advanced Audio (TAD) Format")
|
- **Format documentation**: `terranmon.txt` (search for "TSVM Advanced Audio (TAD) Format")
|
||||||
- **Version**: 1 (2-bit twobitmap significance map)
|
- **Version**: 1.1 (raw int8 storage with non-power-of-2 support, updated 2025-10-30)
|
||||||
|
|
||||||
**TAD Compression Performance**:
|
**TAD Compression Performance**:
|
||||||
- **Target Compression**: 2:1 against PCMu8 baseline (4:1 against PCM16LE input)
|
- **Target Compression**: 2:1 against PCMu8 baseline (4:1 against PCM16LE input)
|
||||||
@@ -358,15 +361,16 @@ Implemented on 2025-10-15 for improved temporal compression through group-of-pic
|
|||||||
|
|
||||||
**TAD Encoding Pipeline**:
|
**TAD Encoding Pipeline**:
|
||||||
1. **FFmpeg Two-Pass Extraction**: High-quality SoXR resampling to 32 KHz with 16 Hz highpass filter
|
1. **FFmpeg Two-Pass Extraction**: High-quality SoXR resampling to 32 KHz with 16 Hz highpass filter
|
||||||
2. **PCM16→PCM8 with Dithering**: Error-diffusion dithering minimizes quantization noise
|
2. **Gamma Compression**: Dynamic range compression (γ=0.707) for perceptual uniformity
|
||||||
3. **M/S Stereo Decorrelation**: Transforms Left/Right to Mid/Side for better compression
|
3. **M/S Stereo Decorrelation**: Transforms Left/Right to Mid/Side for better compression
|
||||||
4. **Variable-Level DD-4 DWT**: Deslauriers-Dubuc 4-tap interpolating wavelets with dynamic levels
|
4. **9-Level CDF 9/7 DWT**: biorthogonal wavelets with fixed 9 levels
|
||||||
- Default 32768 samples → 13 DWT levels
|
- All chunk sizes use 9 levels (sufficient for ≥512 samples after 9 halvings)
|
||||||
- Minimum 1024 samples → 8 DWT levels
|
- Supports non-power-of-2 sizes through proper length tracking
|
||||||
5. **Frequency-Dependent Quantization**: Perceptual weights favor 2-4 KHz (speech intelligibility)
|
5. **Frequency-Dependent Quantization**: Perceptual weights with lambda companding
|
||||||
6. **Dead Zone Quantization**: Zeros high-frequency noise (highest band)
|
6. **Dead Zone Quantization**: Zeros high-frequency noise (highest band)
|
||||||
7. **2-bit Twobitmap Encoding**: Maps coefficients to 00=0, 01=+1, 10=-1, 11=other
|
7. **Coefficient-Domain Dithering**: Light TPDF dithering (±0.5 quantization steps)
|
||||||
8. **Optional Zstd Compression**: Level 7 compression on concatenated Mid+Side data
|
8. **Raw Int8 Storage**: Direct coefficient storage as signed int8 values
|
||||||
|
9. **Optional Zstd Compression**: Level 7 compression on concatenated Mid+Side data
|
||||||
|
|
||||||
**TAD Integration with TAV**:
|
**TAD Integration with TAV**:
|
||||||
TAD is designed as an includable API for TAV video encoder integration. The variable chunk size
|
TAD is designed as an includable API for TAV video encoder integration. The variable chunk size
|
||||||
@@ -375,7 +379,20 @@ TAV embeds TAD-compressed audio using packet type 0x24 with Zstd compression.
|
|||||||
|
|
||||||
**TAD Hardware Acceleration**:
|
**TAD Hardware Acceleration**:
|
||||||
TSVM accelerates TAD decoding with AudioAdapter.kt (backend) and AudioJSR223Delegate.kt (API):
|
TSVM accelerates TAD decoding with AudioAdapter.kt (backend) and AudioJSR223Delegate.kt (API):
|
||||||
- Backend decoder in AudioAdapter.kt with variable chunk size support
|
- Backend decoder in AudioAdapter.kt with non-power-of-2 chunk size support (fixed 2025-10-30)
|
||||||
- API functions in AudioJSR223Delegate.kt for JavaScript access
|
- API functions in AudioJSR223Delegate.kt for JavaScript access
|
||||||
- Supports chunk sizes from 1024 to 32768+ samples
|
- Supports chunk sizes from 1024 to 32768+ samples (any size ≥1024)
|
||||||
- Dynamic DWT level calculation for optimal performance
|
- Fixed 9-level CDF 9/7 inverse DWT with correct length tracking for non-power-of-2 sizes
|
||||||
|
|
||||||
|
**Critical Implementation Note (Fixed 2025-10-30)**:
|
||||||
|
Multi-level inverse DWT must pre-calculate the exact sequence of lengths from forward transform:
|
||||||
|
```kotlin
|
||||||
|
val lengths = IntArray(levels + 1)
|
||||||
|
lengths[0] = chunk_size
|
||||||
|
for (i in 1..levels) {
|
||||||
|
lengths[i] = (lengths[i - 1] + 1) / 2
|
||||||
|
}
|
||||||
|
// Apply inverse DWT using lengths[level] for each level
|
||||||
|
```
|
||||||
|
Using simple doubling (`length *= 2`) is incorrect for non-power-of-2 sizes and causes
|
||||||
|
mirrored subband artifacts.
|
||||||
|
|||||||
144
terranmon.txt
144
terranmon.txt
@@ -1523,11 +1523,12 @@ Number|Index
|
|||||||
|
|
||||||
TSVM Advanced Audio (TAD) Format
|
TSVM Advanced Audio (TAD) Format
|
||||||
Created by CuriousTorvald and Claude on 2025-10-23
|
Created by CuriousTorvald and Claude on 2025-10-23
|
||||||
|
Updated: 2025-10-30 (fixed non-power-of-2 sample count support)
|
||||||
|
|
||||||
TAD is a perceptual audio codec for TSVM utilizing Discrete Wavelet Transform (DWT)
|
TAD is a perceptual audio codec for TSVM utilizing Discrete Wavelet Transform (DWT)
|
||||||
with 4-tap interpolating Deslauriers-Dubuc wavelets, providing efficient compression
|
with CDF 9/7 biorthogonal wavelets, providing efficient compression through M/S stereo
|
||||||
through M/S stereo decorrelation, frequency-dependent quantization, and significance
|
decorrelation, frequency-dependent quantization, and raw int8 coefficient storage.
|
||||||
map encoding. Designed as an includable API for integration with TAV video encoder.
|
Designed as an includable API for integration with TAV video encoder.
|
||||||
|
|
||||||
When used inside of a video codec, only zstd-compressed payload is stored, chunk length
|
When used inside of a video codec, only zstd-compressed payload is stored, chunk length
|
||||||
is stored separately and quality index is shared with that of the video.
|
is stored separately and quality index is shared with that of the video.
|
||||||
@@ -1556,20 +1557,21 @@ is stored separately and quality index is shared with that of the video.
|
|||||||
- **Channels**: 2 (stereo)
|
- **Channels**: 2 (stereo)
|
||||||
- **Input Format**: PCM32fLE (32-bit float little-endian PCM)
|
- **Input Format**: PCM32fLE (32-bit float little-endian PCM)
|
||||||
- **Preprocessing**: 16 Hz highpass filter applied during extraction
|
- **Preprocessing**: 16 Hz highpass filter applied during extraction
|
||||||
- **Internal Representation**: Signed PCM8 with error-diffusion dithering
|
- **Internal Representation**: Float32 throughout encoding, PCM8 conversion only at decoder
|
||||||
- **Chunk Size**: Variable (1024-32768+ samples per channel, must be power of 2)
|
- **Chunk Size**: Variable (1024-32768+ samples per channel, any size ≥1024 supported)
|
||||||
- Default: 32768 samples (1.024 seconds at 32 kHz)
|
- Default: 32768 samples (1.024 seconds at 32 kHz) for standalone files
|
||||||
|
- TAV integration: Uses exact GOP sample count (e.g., 32016 for 1 second at 32 kHz)
|
||||||
- Minimum: 1024 samples (32 ms at 32 kHz)
|
- Minimum: 1024 samples (32 ms at 32 kHz)
|
||||||
- DWT levels calculated dynamically: log2(chunk_size) - 1
|
- DWT levels: Fixed at 9 levels for all chunk sizes
|
||||||
- **Target Compression**: 2:1 against PCMu8 baseline
|
- **Target Compression**: 2:1 against PCMu8 baseline
|
||||||
|
- **Wavelet**: CDF 9/7 biorthogonal
|
||||||
|
|
||||||
## Chunk Structure
|
## Chunk Structure
|
||||||
Each chunk encodes a variable number of stereo samples (power of 2, minimum 1024).
|
Each chunk encodes a variable number of stereo samples (minimum 1024, any size supported).
|
||||||
Default is 32768 samples (65536 total samples, 1.024 seconds).
|
Default is 32768 samples (65536 total samples, 1.024 seconds) for standalone files.
|
||||||
If the audio duration doesn't align to chunk boundaries, the final chunk can use
|
TAV integration uses exact GOP sample counts (e.g., 32016 samples for 1 second at 32 kHz).
|
||||||
a smaller power-of-2 size or be zero-padded.
|
|
||||||
|
|
||||||
uint16 Sample Count: number of samples per channel (must be power of 2, min 1024)
|
uint16 Sample Count: number of samples per channel (min 1024, any size ≥1024)
|
||||||
uint8 Max quantisation index: this number * 2 + 1 is the total steps of quantisation
|
uint8 Max quantisation index: this number * 2 + 1 is the total steps of quantisation
|
||||||
uint32 Chunk Payload Size: size of following payload in bytes
|
uint32 Chunk Payload Size: size of following payload in bytes
|
||||||
* Chunk Payload: encoded M/S stereo data (Zstd compressed if flag set)
|
* Chunk Payload: encoded M/S stereo data (Zstd compressed if flag set)
|
||||||
@@ -1580,11 +1582,12 @@ a smaller power-of-2 size or be zero-padded.
|
|||||||
|
|
||||||
## Encoding Pipeline
|
## Encoding Pipeline
|
||||||
|
|
||||||
### Step 1: PCM32f to PCM8 Conversion with Error-Diffusion Dithering
|
### Step 1: Dynamic Range Compression (Gamma Compression)
|
||||||
Input stereo PCM32fLE is converted to signed PCM8 using second-order noise-shaped
|
Input stereo PCM32fLE undergoes gamma compression for perceptual uniformity:
|
||||||
error-diffusion dithering to minimize quantization noise.
|
|
||||||
|
|
||||||
Error is propagated to the next sample (alternating between left/right channels).
|
encode(x) = sign(x) * |x|^γ where γ=0.707 (1/√2)
|
||||||
|
|
||||||
|
This compresses dynamic range before quantization, improving perceptual quality.
|
||||||
|
|
||||||
### Step 2: M/S Stereo Decorrelation
|
### Step 2: M/S Stereo Decorrelation
|
||||||
Mid-Side transformation exploits stereo correlation:
|
Mid-Side transformation exploits stereo correlation:
|
||||||
@@ -1595,16 +1598,18 @@ Mid-Side transformation exploits stereo correlation:
|
|||||||
This typically concentrates energy in the Mid channel while the Side channel
|
This typically concentrates energy in the Mid channel while the Side channel
|
||||||
contains mostly small values, improving compression efficiency.
|
contains mostly small values, improving compression efficiency.
|
||||||
|
|
||||||
### Step 3: Variable-Level DD-4 DWT
|
### Step 3: 9-Level CDF 9/7 DWT
|
||||||
Each channel (Mid and Side) undergoes Deslauriers-Dubuc 4-tap interpolating wavelet
|
Each channel (Mid and Side) undergoes CDF 9/7 biorthogonal wavelet decomposition. The codec uses a fixed 9 decomposition levels for all chunk sizes:
|
||||||
decomposition. The number of DWT levels is calculated dynamically based on chunk size:
|
|
||||||
|
|
||||||
DWT Levels = log2(chunk_size) - 1
|
DWT Levels = 9 (fixed)
|
||||||
|
|
||||||
For the default 32768-sample chunks, this produces 14 levels with frequency subbands:
|
For 32768-sample chunks:
|
||||||
|
- After 9 levels: 64 LL coefficients
|
||||||
|
- Frequency subbands: LL + 9 H bands (L9 to L1)
|
||||||
|
|
||||||
Level 0-13: High to low frequency coefficients
|
For 32016-sample chunks (TAV 1-second GOP):
|
||||||
DC band: Low-frequency approximation coefficients
|
- After 9 levels: 63 LL coefficients
|
||||||
|
- Supports non-power-of-2 sizes through proper length tracking (fixed 2025-10-30)
|
||||||
|
|
||||||
Sideband boundaries are calculated dynamically:
|
Sideband boundaries are calculated dynamically:
|
||||||
first_band_size = chunk_size >> dwt_levels
|
first_band_size = chunk_size >> dwt_levels
|
||||||
@@ -1612,8 +1617,12 @@ Sideband boundaries are calculated dynamically:
|
|||||||
sideband[1] = first_band_size
|
sideband[1] = first_band_size
|
||||||
sideband[i+1] = sideband[i] + (first_band_size << (i-1))
|
sideband[i+1] = sideband[i] + (first_band_size << (i-1))
|
||||||
|
|
||||||
For 32768 samples with 14 levels: boundaries at 0, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192, 16384, 32768
|
CDF 9/7 lifting coefficients:
|
||||||
For 1024 samples with 9 levels: boundaries at 0, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024
|
α = -1.586134342
|
||||||
|
β = -0.052980118
|
||||||
|
γ = 0.882911076
|
||||||
|
δ = 0.443506852
|
||||||
|
K = 1.230174105
|
||||||
|
|
||||||
### Step 4: Frequency-Dependent Quantization
|
### Step 4: Frequency-Dependent Quantization
|
||||||
DWT coefficients are quantized using perceptually-tuned frequency-dependent weights.
|
DWT coefficients are quantized using perceptually-tuned frequency-dependent weights.
|
||||||
@@ -1630,32 +1639,53 @@ where coefficients smaller than half the quantization step are zeroed:
|
|||||||
This aggressively removes high-frequency noise while preserving important
|
This aggressively removes high-frequency noise while preserving important
|
||||||
mid-frequency content (2-4 KHz critical for speech intelligibility).
|
mid-frequency content (2-4 KHz critical for speech intelligibility).
|
||||||
|
|
||||||
### Step 5: 2-bit Significance Map Encoding
|
### Step 5: Raw Int8 Coefficient Storage
|
||||||
Quantized coefficients are encoded using the 2-bit twobitmap method (see above).
|
Quantized coefficients are stored directly as signed int8 values (no significance map, better Zstd compression).
|
||||||
|
Concatenated format: [Mid_channel_data][Side_channel_data]
|
||||||
|
|
||||||
### Step 6: Optional Zstd Compression
|
### Step 6: Coefficient-Domain Dithering (Encoder)
|
||||||
If enabled (default), the concatenated Mid+Side encoded data is compressed
|
Light triangular dithering (±0.5 quantization steps) added to coefficients before
|
||||||
using Zstd level 3 for additional compression without significant CPU overhead.
|
quantization to reduce banding artifacts.
|
||||||
|
|
||||||
|
### Step 7: Zstd Compression
|
||||||
|
The concatenated Mid+Side encoded data is compressed
|
||||||
|
using Zstd level 7 for additional compression without significant CPU overhead.
|
||||||
|
|
||||||
## Decoding Pipeline
|
## Decoding Pipeline
|
||||||
|
|
||||||
### Step 1: Chunk Extraction
|
### Step 1: Chunk Extraction and Decompression
|
||||||
Read chunk header to determine significance map method and compression status.
|
Read chunk header (sample_count, max_index, payload_size).
|
||||||
If compressed, decompress payload using Zstd.
|
If compressed (default), decompress payload using Zstd.
|
||||||
|
|
||||||
### Step 2: Decode Significance Maps
|
### Step 2: Coefficient Extraction
|
||||||
Decode Mid and Side channel data using 2-bit twobitmap decoder:
|
Extract Mid and Side channel int8 data from concatenated payload:
|
||||||
- Read 2-bit codes from significance map
|
- Mid channel: bytes [0..sample_count-1]
|
||||||
- Reconstruct coefficients: 0, +1, -1, or read from Other Values array
|
- Side channel: bytes [sample_count..2*sample_count-1]
|
||||||
|
|
||||||
### Step 3: Dequantization
|
### Step 3: Dequantization with Lambda Decompanding
|
||||||
Multiply quantized coefficients by frequency-dependent quantization steps
|
Convert quantized int8 values back to float coefficients using:
|
||||||
(same weights as encoder).
|
1. Lambda decompanding (inverse of Laplacian CDF compression)
|
||||||
|
2. Multiply by frequency-dependent quantization steps
|
||||||
|
3. Apply coefficient-domain dithering (TPDF, ~-60 dBFS)
|
||||||
|
|
||||||
### Step 4: Variable-Level Inverse DD-4 DWT
|
### Step 4: 9-Level Inverse CDF 9/7 DWT
|
||||||
Reconstruct PCM8 audio from DWT coefficients using inverse DD-4 transform,
|
Reconstruct Float32 audio from DWT coefficients using inverse CDF 9/7 transform.
|
||||||
progressively doubling length from the deepest level to chunk_size samples.
|
|
||||||
The number of inverse DWT levels matches the forward transform (log2(chunk_size) - 1).
|
**Critical Implementation (Fixed 2025-10-30)**:
|
||||||
|
The multi-level inverse DWT must use the EXACT sequence of lengths from forward
|
||||||
|
transform, in reverse order. Using simple doubling (length *= 2) is INCORRECT
|
||||||
|
for non-power-of-2 sizes.
|
||||||
|
|
||||||
|
Correct approach:
|
||||||
|
1. Pre-calculate all forward transform lengths:
|
||||||
|
lengths[0] = chunk_size
|
||||||
|
lengths[i] = (lengths[i-1] + 1) / 2 for i=1..9
|
||||||
|
2. Apply inverse DWT in reverse order:
|
||||||
|
for level from 8 down to 0:
|
||||||
|
apply inverse_dwt(data, lengths[level])
|
||||||
|
|
||||||
|
This ensures correct reconstruction for all chunk sizes including non-power-of-2
|
||||||
|
values (e.g., 32016 samples for TAV 1-second GOPs).
|
||||||
|
|
||||||
### Step 5: M/S to L/R Conversion
|
### Step 5: M/S to L/R Conversion
|
||||||
Convert Mid/Side back to Left/Right stereo:
|
Convert Mid/Side back to Left/Right stereo:
|
||||||
@@ -1663,6 +1693,16 @@ Convert Mid/Side back to Left/Right stereo:
|
|||||||
Left = Mid + Side
|
Left = Mid + Side
|
||||||
Right = Mid - Side
|
Right = Mid - Side
|
||||||
|
|
||||||
|
### Step 6: Gamma Expansion
|
||||||
|
Expand dynamic range (inverse of encoder's gamma compression):
|
||||||
|
|
||||||
|
decode(y) = sign(y) * |y|^(1/γ) where γ=0.707, so 1/γ=√2≈1.414
|
||||||
|
|
||||||
|
### Step 7: PCM32f to PCM8 Conversion with Noise-Shaped Dithering
|
||||||
|
Convert Float32 samples to unsigned PCM8 (PCMu8) using second-order error-diffusion
|
||||||
|
dithering with reduced amplitude (0.2× TPDF) to coordinate with coefficient-domain
|
||||||
|
dithering.
|
||||||
|
|
||||||
## Compression Performance
|
## Compression Performance
|
||||||
- **Target Ratio**: 2:1 against PCMu8
|
- **Target Ratio**: 2:1 against PCMu8
|
||||||
- **Achieved Ratio**: 2.51:1 against PCMu8 at quality level 3
|
- **Achieved Ratio**: 2.51:1 against PCMu8 at quality level 3
|
||||||
@@ -1699,10 +1739,18 @@ TAD encoder uses two-pass FFmpeg extraction for optimal quality:
|
|||||||
This ensures resampling happens after extraction with optimal quality parameters.
|
This ensures resampling happens after extraction with optimal quality parameters.
|
||||||
|
|
||||||
## Hardware Acceleration API
|
## Hardware Acceleration API
|
||||||
TAD decoder may be accelerated using hardware functions in GraphicsJSR223Delegate:
|
TAD decoder is accelerated through AudioAdapter.kt peripheral (backend) and
|
||||||
- tadDecode(): Main decoding function (chunk-based)
|
AudioJSR223Delegate.kt (JavaScript API):
|
||||||
- tadHaarIDWT(): Fast inverse Haar DWT
|
|
||||||
- tadDequantize(): Frequency-dependent dequantization
|
Backend (AudioAdapter.kt):
|
||||||
|
- decodeTad(): Main decoding function (chunk-based, reads from tadInputBin)
|
||||||
|
- dwt97Inverse1d(): Single-level inverse CDF 9/7 DWT
|
||||||
|
- dwt97InverseMultilevel(): 9-level inverse DWT with non-power-of-2 support
|
||||||
|
|
||||||
|
JavaScript API (audio.* functions):
|
||||||
|
- audio.tadDecode(): Trigger TAD decoding from peripheral input buffer
|
||||||
|
- audio.tadUploadDecoded(offset, count): Upload decoded PCMu8 to playback buffer
|
||||||
|
- audio.getMemAddr(): Get peripheral memory base address for buffer access
|
||||||
|
|
||||||
## Usage Examples
|
## Usage Examples
|
||||||
# Encode with default quality (Q3)
|
# Encode with default quality (Q3)
|
||||||
|
|||||||
Reference in New Issue
Block a user