tsvm/terranmon.txt
2025-10-23 09:26:58 +09:00

1 byte = 2 pixels
560x448@4bpp = 125 440 bytes
560x448@8bpp = 250 880 bytes
-> 262144 bytes (256 kB)
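The figures above follow directly from the pixel count and the bit depth. A quick check, as a minimal sketch (the helper name is illustrative and not part of TSVM; no row padding is assumed):

```python
def framebuffer_bytes(width: int, height: int, bits_per_pixel: int) -> int:
    """Bytes needed for a tightly packed framebuffer (assumes no row padding)."""
    return width * height * bits_per_pixel // 8

VRAM_BANK_BYTES = 256 * 1024  # one 256 kB bank = 262144 bytes

size_4bpp = framebuffer_bytes(560, 448, 4)  # 1 byte = 2 pixels
size_8bpp = framebuffer_bytes(560, 448, 8)  # 1 byte = 1 pixel
spare = VRAM_BANK_BYTES - size_8bpp         # room left in the bank beside the framebuffer
```

At 8bpp the framebuffer leaves 11264 bytes of the 256 kB bank for everything else.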
[USER AREA | HW AREA]
Number of peripherals = 8, of which the computer itself is considered
a peripheral.
HW AREA = [Peripherals | MMIO | INTVEC]
User area: 8 MB, hardware area: 8 MB
8192 kB
User Space
1024 kB
Peripheral #7
1024 kB
Peripheral #6
...
1024 kB (where Peripheral #0 would be)
MMIO and Interrupt Vectors
128 kB
MMIO for Peri #7
128 kB
MMIO for Peri #6
...
128 kB (where Peripheral #0 would be)
MMIO for the computer
Certain memory mappers may allow an extra 4 MB of User Space in exchange for Peripheral slots #4 through #7.
--------------------------------------------------------------------------------
IO Device
Endianness: little
Note: Always takes up the peripheral slot of zero
Latching: latching "locks" fluctuating values at the moment you request them, so that subsequent reads
return a consistent snapshot. This matters most for multibyte values, where another byte could
change after you have read one byte, e.g. System uptime in nanoseconds
MMIO
0..31 RO: Raw Keyboard Buffer read. Won't shift the key buffer
32..33 RO: Mouse X pos
34..35 RO: Mouse Y pos
36 RO: Mouse down? (1 for TRUE, 0 for FALSE)
37 RW: Read/Write single key input. Key buffer will be shifted. Manual writing is
usually unnecessary as such action must be automatically managed via LibGDX
input processing.
Stores ASCII code representing the character, plus:
(1..26: Ctrl+[alph])
3 : Ctrl+C
4 : Ctrl+D
8 : Backspace
(13: Return)
19: Up arrow
20: Down arrow
21: Left arrow
22: Right arrow
38 RW: Request keyboard input be read (TTY Function). Write nonzero value to enable, write zero to
close it. Keyboard buffer will be cleared whenever request is received, so
MAKE SURE YOU REQUEST THE KEY INPUT ONLY ONCE!
39 WO: Latch Key/Mouse Input (Raw Input function). Write nonzero value to latch.
Stores LibGDX Key code
40..47 RO: Key Press buffer
stores keys that are held down. Can accommodate 8-key rollover (in keyboard geeks' terms)
0x0 is written for the empty area; numbers are always sorted
48..51 RO: System flags
48: 0b rq00 000t
t: STOP button (should raise SIGTERM)
r: RESET button (hypervisor should reset the system)
q: SysRq button (hypervisor should respond to it)
49: set to 1 if a key has been pushed into the key buffer (or, if the system has a key press to pull) via MMIO 38; otherwise 0
64..67 RO: User area memory size in bytes
68 WO: Counter latch
0b 0000 00ba
a: System uptime
b: RTC
72..79 RO: System uptime in nanoseconds
80..87 RO: RTC in microseconds
88 RW: Rom mapping
write 0xFF to NOT map any rom
write 0x00 to map BIOS
write 0x01 to map first "extra ROM"
89 RW: BMS flags
0b P000 b0ca
a: 1 if charging (accepting power from the AC adapter)
c: 1 if battery is detected
b: 1 if the device is battery-operated
P: 1 if CPU halted (so that the "smart" power supply can shut itself down)
note: only the high nybbles are writable!
if the device is battery-operated but currently running off of an AC adapter and there is no battery inserted,
the flag would be 0000 1001
90 RO: BMS calculated battery percentage where 255 is 100%
91 RO: BMS battery voltage multiplied by 10 (127 = "12.7 V")
92 RW: Memory Mapping
0: 8 MB Core, 8 MB Hardware-reserved, 7 card slots
1: 12 MB Core, 4 MB Hardware-reserved, 3 card slots (HW addr 131072..1048575 cannot be reclaimed though)
1024..2047 RW: Reserved for integrated peripherals (e.g. built-in status display)
2048..4075 RW: Used by the hypervisor
2048..3071 RW: Interrupt vectors (0-255), 32-bit address. Used regardless of the existence of the hypervisor.
If hypervisor is installed, the interrupt calls are handled using the hypervisor
If no hypervisors are installed, the interrupt call is performed by the "hardware"
Interrupt Vector Table:
0x00 - Initial Stack Pointer (currently unused)
0x01 - Reset
0x02 - NMI
0x03 - Out of Memory
0x0C - IRQ_COM1
0x0D - IRQ_COM2
0x0E - IRQ_COM3
0x0F - IRQ_COM4
0x10 - Core Memory Access Violation
0x11 - Card 1 Access Violation
0x12 - Card 2 Access Violation
0x13 - Card 3 Access Violation
0x14 - Card 4 Access Violation
0x15 - Card 5 Access Violation
0x16 - Card 6 Access Violation
0x17 - Card 7 Access Violation
0x20 - IRQ_Core
0x21 - IRQ_CARD1
0x22 - IRQ_CARD2
0x23 - IRQ_CARD3
0x24 - IRQ_CARD4
0x25 - IRQ_CARD5
0x26 - IRQ_CARD6
0x27 - IRQ_CARD7
3072..3075 RW: Status flags
4076..4079 RW: 8-bit status code for the port
4080..4083 RO: 8-bit status code for connected device
4084..4091 RO: Block transfer status
0b nnnnnnnn a00z mmmm
n-read: size of the block from the other device, LSB (4096-full block size is zero)
m-read: size of the block from the other device, MSB (4096-full block size is zero)
a-read: if the other device hasNext (doYouHaveNext), false if device not present
z-read: set if the size is actually 0 instead of 4096 (overrides n and m parameters)
n-write: size of the block I'm sending, LSB (4096-full block size is zero)
m-write: size of the block I'm sending, MSB (4096-full block size is zero)
a-write: if there's more to send (hasNext)
z-write: set if the size is actually 0 instead of 4096 (overrides n and m parameters)
4092..4095 RW: Block transfer control for Port 1 through 4
0b 00ms abcd
m-readonly: device in master setup
s-readonly: device in slave setup
a: 1 for send, 0 for receive
b-write: 1 to start sending if a-bit is set; if a-bit is unset, make the other device start sending
b-read: if this bit is set, you're currently receiving something (aka busy)
c-write: I'm ready to receive
c-read: Are you ready to receive?
d-read: Are you there? (if the other device's recipient is myself)
NOTE: not ready AND not busy (bits b and d set when read) means the device is not connected to the port
4096..8191 RW: Buffer for block transfer lane #1
8192..12287 RW: Buffer for block transfer lane #2
12288..16383 RW: Buffer for block transfer lane #3
16384..20479 RW: Buffer for block transfer lane #4
65536..131071 RO: Mapped to ROM
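The bit layouts in the MMIO list above can be decoded with a few masks. The following is a minimal sketch, not the emulator's own code: all function names are hypothetical, and for the block transfer status it is assumed that the first byte holds nnnnnnnn and the second byte holds a00z mmmm.

```python
def decode_bms_flags(flags: int) -> dict:
    """Split the BMS flags byte (MMIO 89, layout 0b P000 b0ca)."""
    return {
        "charging": bool(flags & 0x01),          # a
        "battery_detected": bool(flags & 0x02),  # c
        "battery_operated": bool(flags & 0x08),  # b
        "cpu_halted": bool(flags & 0x80),        # P
    }

def battery_percent(raw: int) -> float:
    """MMIO 90: 255 means 100 %."""
    return raw * 100.0 / 255.0

def battery_volts(raw: int) -> float:
    """MMIO 91: voltage multiplied by 10."""
    return raw / 10.0

def decode_block_status(first: int, second: int) -> dict:
    """Decode one lane's block transfer status.

    Assumption: first byte = nnnnnnnn (size LSB), second byte = a00z mmmm.
    """
    size = ((second & 0x0F) << 8) | first
    if second & 0x10:      # z: size is really 0, overrides n and m
        size = 0
    elif size == 0:        # all-zero n and m denote a full 4096-byte block
        size = 4096
    return {"size": size, "has_next": bool(second & 0x80)}

def read_u64_le(mmio: bytes, offset: int) -> int:
    """Read a latched 8-byte little-endian counter, e.g. uptime at offset 72."""
    return int.from_bytes(mmio[offset:offset + 8], "little")
```

For the counters, the intended order is to write the latch bits to MMIO 68 first and only then read the eight bytes, so that all of them belong to the same snapshot.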
--------------------------------------------------------------------------------
VRAM Bank 0 (256 kB)
Endianness: little
Memory Space
250880 bytes
Framebuffer
3 bytes
Initial background (and the border) colour RGB, 8 bits per channel
1 byte
command (writing to this memory address changes the status)
1: reset palette to default
2: fill framebuffer with given colour (arg1)
3: do '1' then do '2' (with arg1) then do '4' (with arg2)
4: fill framebuffer2 with given colour (arg1)
16: copy Low Font ROM (char 0-127) to mapping area
17: copy High Font ROM (char 128-255) to mapping area
18: write contents of the font ROM mapping area to the Low Font ROM
19: write contents of the font ROM mapping area to the High Font ROM
20: reset Low Font ROM to default
21: reset High Font ROM to default
12 bytes
argument for "command" (arg1: Byte, arg2: Byte)
write to this address FIRST and then write to "command" to execute the command
1134 bytes
unused
2 bytes
Cursor position in: (y*80 + x)
2560 bytes
Text foreground colours
2560 bytes
Text background colours
2560 bytes
Text buffer of 80x32 (7x14 character size, and yes: actual character data is on the bottom)
512 bytes
Palette stored in following pattern: 0b rrrr gggg, 0b bbbb aaaa, ....
Palette number 255 is always fully transparent (bits being all zero)
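The palette pattern and the cursor formula above can be expressed as a few lines of bit arithmetic. This is a sketch only; the function names are hypothetical.

```python
def pack_palette_entry(r: int, g: int, b: int, a: int) -> bytes:
    """Pack four 4-bit channels as 0b rrrr gggg, 0b bbbb aaaa."""
    for channel in (r, g, b, a):
        if not 0 <= channel <= 15:
            raise ValueError("each channel must fit in 4 bits")
    return bytes([(r << 4) | g, (b << 4) | a])

def unpack_palette_entry(entry: bytes) -> tuple:
    """Inverse of pack_palette_entry: returns (r, g, b, a)."""
    rg, ba = entry[0], entry[1]
    return (rg >> 4, rg & 0x0F, ba >> 4, ba & 0x0F)

def cursor_position(x: int, y: int, columns: int = 80) -> int:
    """Cursor position as stored in the 2-byte field: y*80 + x."""
    return y * columns + x
```

With 256 entries of two bytes each, the palette occupies the 512 bytes listed above.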
(DRAFT) Optional Sprite Card (VRAM Bank 1 (256 kB))
250880 bytes
One of:
Secondary layer
Other 8-bit of the primary framebuffer (4K colour mode)
SPRITE FORMAT DRAFT 1
533 bytes: Sprite attribute table
(41 sprites total, of which 1 is GUI cursor)
12 bytes - signed fixed point
X-position
Y-position
Transform matrix A..D
1 byte
0b 0000 00vp
(p: 0 for above-all, 1 for below-text, v: show/hide)
10496 bytes: Sprite table
256 bytes
16x16 texture for the sprite
235 bytes:
unused
SPRITE FORMAT DRAFT 2
DMA Sprite Area - 18 bytes each, total of ??? sprites
1 byte
Sprite width
1 byte
Sprite height
12 bytes - signed fixed point
Affine transformation A,B,C,D,X,Y
1 byte
Attributes
0b 0000 00vp
(p: 0 for above-all, 1 for below-text, v: show/hide)
3 bytes
Pointer to raw pixmap data in Core Memory
MMIO
0..1 RO
Framebuffer width in pixels
2..3 RO
Framebuffer height in pixels
4 RO
Text mode columns
5 RO
Text mode rows
6 RW
Text-mode attributes
0b 0000 00rc (r: TTY Raw mode, c: Cursor blink)
7 RW
Graphics-mode attributes
0b 0000 rrrr (r: Resolution/colour depth)
8 RO
Last used colour (set by poking at the framebuffer)
9 RW
current TTY foreground colour (useful for print() function)
10 RW
current TTY background colour (useful for print() function)
11 RO
Number of Banks, or VRAM size (1 = 256 kB, max 4)
12 RW
Graphics Mode
0: 560x448, 256 Colours, 1 layer
1: 280x224, 256 Colours, 4 layers
2: 280x224, 4096 Colours, 2 layers
3: 560x448, 256 Colours, 2 layers (if bank 2 is not installed, will fall back to mode 0)
4: 560x448, 4096 Colours, 1 layer (if bank 2 is not installed, will fall back to mode 0)
4096 is also known as "direct colour mode" (4096 colours * 16 transparency -> 65536 colours)
Two layers are grouped to make a frame, "low layer" contains RG colours and "high layer" has BA colours,
Red and Blue occupy MSBs
13 RW
Layer Arrangement
If 4 layers are used:
Num LO<->HI
0 1234
1 1243
2 1324
3 1342
4 1423
5 1432
6 2134
7 2143
8 2314
9 2341
10 2413
11 2431
12 3124
13 3142
14 3214
15 3241
16 3412
17 3421
18 4123
19 4132
20 4213
21 4231
22 4312
23 4321
If 2 layers are used:
Num LO<->HI
0 12
1 12
2 12
3 12
4 12
5 12
6 12
7 21
8 21
9 21
10 21
11 21
12 12
13 12
14 21
15 21
16 12
17 21
18 12
19 12
20 21
21 21
22 12
23 21
If 1 layer is used, this field will do nothing and always fall back to 0
14..15 RW
framebuffer scroll X
16..17 RW
framebuffer scroll Y
18 RO
Busy flags
1: Codec in-use
2: Draw Instructions being decoded
19 WO
Write non-zero value to initiate the Draw Instruction decoding
20..21 RO
Program Counter for the Draw Instruction decoding
1024..2047 RW
horizontal scroll offset for scanlines
2048..4095 RW
!!NEW!! Font ROM Mapping Area
Format is always 8x16 pixels, 1bpp ROM format (so that it would be YY_CHR-Compatible)
(designer's note: it's still useful to divide the char rom to two halves, lower half being characters ROM and upper half being symbols ROM)
65536..131071 RW
Draw Instructions
Text-mode-font-ROM is immutable and does not belong to VRAM
Even in text mode, the framebuffer is still drawn onto the screen, and the text is drawn on top of it
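The 24 four-layer arrangements listed under MMIO 13 are the permutations of 1234 in lexicographic order, so the table can be generated instead of stored. A minimal sketch (names are illustrative):

```python
from itertools import permutations

# permutations() yields tuples in lexicographic order when its input is sorted,
# which matches the "Num -> LO<->HI" table for four layers.
FOUR_LAYER_ORDERS = ["".join(p) for p in permutations("1234")]

def layer_order(num: int) -> str:
    """Return the LO<->HI layer order for a Layer Arrangement number (0..23)."""
    return FOUR_LAYER_ORDERS[num]
```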
Copper Commands (suggestion withdrawn)
WAITFOR 3,32
80·03 46 00 (0x004603: offset on the framebuffer)
SCROLLX 569
A0·39 02 00
SCROLLY 321
B0·41 01 00
SETPAL 5 (15 2 8 15)
C0·05·F2 8F (0x05: Palette number, 0xF28F: RGBA colour)
SETBG (15 2 8 15)
D0·00·F2 8F (0xF28F: RGBA colour)
END (pseudocommand of WAITFOR)
80·FF FF FF
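The byte sequences in the examples above can be reproduced with a small encoder. This is a sketch of a withdrawn suggestion, so treat it as illustration only: the function names are hypothetical, and the WAITFOR offset is assumed to be y*560 + x, which is what the example value 0x004603 implies for (3,32).

```python
FRAMEBUFFER_WIDTH = 560  # assumed from the WAITFOR example: 32*560 + 3 = 0x4603

def _u24_le(value: int) -> bytes:
    """Three-byte little-endian operand."""
    return value.to_bytes(3, "little")

def _rgba4(r: int, g: int, b: int, a: int) -> bytes:
    """Two bytes, 0b rrrr gggg then 0b bbbb aaaa."""
    return bytes([(r << 4) | g, (b << 4) | a])

def waitfor(x: int, y: int) -> bytes:
    return bytes([0x80]) + _u24_le(y * FRAMEBUFFER_WIDTH + x)

def scrollx(value: int) -> bytes:
    return bytes([0xA0]) + _u24_le(value)

def scrolly(value: int) -> bytes:
    return bytes([0xB0]) + _u24_le(value)

def setpal(index: int, r: int, g: int, b: int, a: int) -> bytes:
    return bytes([0xC0, index]) + _rgba4(r, g, b, a)

def setbg(r: int, g: int, b: int, a: int) -> bytes:
    return bytes([0xD0, 0x00]) + _rgba4(r, g, b, a)

END = bytes([0x80, 0xFF, 0xFF, 0xFF])  # pseudocommand of WAITFOR
```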
--------------------------------------------------------------------------------
TSVM MOV file format
Endianness: Little
\x1F T S V M M O V
[METADATA]
[PACKET 0]
[PACKET 1]
[PACKET 2]
...
where:
METADATA -
uint16 WIDTH
uint16 HEIGHT
uint16 FPS (0: play as fast as can)
uint32 NUMBER OF FRAMES
uint16 UNUSED (fill with 255,0)
uint16 AUDIO QUEUE INFO
when read as little endian:
0b nnnn bbbb bbbb bbbb
[byte 21] [byte 20]
n: size of the queue (number of entries). Allocate at least 1 more entry than the number specified!
b: size of each entry in bytes DIVIDED BY FOUR (all zero = 16384; always 0x240 for MP2 because MP2-VBR is not supported)
n=0 indicates the video audio must be decoded on-the-fly instead of being queued, or has no audio packets
byte[10] RESERVED
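The magic and METADATA fields above add up to 32 bytes and can be read in one pass. A minimal sketch, assuming the fields are tightly packed in the listed order (function and key names are hypothetical):

```python
import struct

MOV_MAGIC = b"\x1fTSVMMOV"
# magic, width, height, fps, frame count, unused, audio queue info, reserved
MOV_HEADER = struct.Struct("<8sHHHIHH10s")  # 32 bytes, little-endian

def parse_mov_header(data: bytes) -> dict:
    (magic, width, height, fps, frames,
     _unused, audio_queue, _reserved) = MOV_HEADER.unpack_from(data)
    if magic != MOV_MAGIC:
        raise ValueError("not a TSVM MOV file")
    entries = audio_queue >> 12        # n: top 4 bits
    quads = audio_queue & 0x0FFF       # b: entry size divided by four
    entry_bytes = 16384 if quads == 0 else quads * 4
    return {
        "width": width,
        "height": height,
        "fps": fps,                    # 0 means play as fast as possible
        "frames": frames,
        "queue_entries": entries,      # 0: decode on the fly, or no audio
        "entry_bytes": entry_bytes,
        "allocate_entries": entries + 1 if entries else 0,
    }
```

For MP2 the b field is always 0x240, which gives entries of 2304 bytes.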
Packet Types -
<video>
0,0: 256-Colour frame
1,0: 256-Colour frame with palette data
2,0: 4096-Colour frame (stored as two byte-planes)
4,t: iPF no-alpha indicator (see iPF Type Numbers for details)
5,t: iPF with alpha indicator (see iPF Type Numbers for details)
16,0: Series of JPEGs
18,0: Series of PNGs
20,0: Series of TGAs
21,0: Series of TGA/GZs
<audio>
0,16: Raw PCM Stereo
1,16: Raw PCM Mono
p,17: MP2, 32 kHz (see MP2 Format Details section for p-value)
q,18: ADPCM, 32 kHz (q = 2 * log_2(frameSize) + (1 if mono, 0 if stereo))
<special>
255,255: sync packet (wait until the next frame)
254,255: background colour packet
31,84 : prohibited
Packet Type High Byte (iPF Type Numbers)
0..7: iPF Type 1..8
- MP2 Format Details
Rate | 2ch | 1ch
32 | 0 | 1
48 | 2 | 3
56 | 4 | 5
64 | 6 | 7 (libtwolame does not allow bitrate lower than this on 32 kHz stereo)
80 | 8 | 9
96 | 10 | 11
112 | 12 | 13
128 | 14 | 15
160 | 16 | 17
192 | 18 | 19
224 | 20 | 21
256 | 22 | 23
320 | 24 | 25
384 | 26 | 27
Add 128 to the resulting number if the frame has a padding bit (should not happen on 32kHz sampling rate)
Special value of 255 may indicate some errors
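The p-value table above is regular: each bitrate row contributes two consecutive numbers, even for stereo and odd for mono. A minimal sketch of the lookup (the function name is hypothetical):

```python
MP2_BITRATES_KBPS = [32, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320, 384]

def mp2_p_value(bitrate_kbps: int, mono: bool = False, padding: bool = False) -> int:
    """p-value for an MP2 audio packet type (p,17)."""
    p = 2 * MP2_BITRATES_KBPS.index(bitrate_kbps) + (1 if mono else 0)
    if padding:
        p += 128  # frame has a padding bit
    return p
```

An unsupported bitrate raises ValueError from list.index, which a caller could map to the special value 255.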
To encode an audio to compliant format, use ffmpeg: ffmpeg -i <your_music> -acodec libtwolame -psymodel 4 -b:a <rate>k -ar 32000 <output.mp2>
Rationale:
-acodec libtwolame : ffmpeg has two mp2 encoders, and libtwolame produces vastly higher quality audio
-psymodel 4 : use alternative psychoacoustic model -- the default model (3) tends to insert "clunk" sounds throughout the audio
-b:a : 256k is recommended for high quality audio (trust me, you don't need 384k)
-ar 32000 : resample the audio to 32kHz, the sampling rate of the TSVM soundcard
TYPE 0 Packet -
uint32 SIZE OF COMPRESSED FRAMEDATA
* COMPRESSED FRAMEDATA
TYPE 1 Packet -
byte[512] Palette Data
uint32 SIZE OF COMPRESSED FRAMEDATA
* COMPRESSED FRAMEDATA
TYPE 2 Packet -
uint32 SIZE OF COMPRESSED FRAMEDATA BYTE-PLANE 1
* COMPRESSED FRAMEDATA
uint32 SIZE OF COMPRESSED FRAMEDATA BYTE-PLANE 2
* COMPRESSED FRAMEDATA
iPF Packet -
uint32 SIZE OF COMPRESSED FRAMEDATA
* COMPRESSED FRAMEDATA // only the actual gzip (and no UNCOMPRESSED SIZE) of the "Blocks.gz" is stored
TYPE 3 Packet (Patch-encoded iPF 1 Packet) -
uint32 SIZE OF COMPRESSED PATCHES
* COMPRESSED PATCHES
PATCHES are a bunch of PATCHes concatenated
where each PATCH is encoded as:
uint8 X-coord of the patch (pixel position divided by four)
uint8 Y-coord of the patch (pixel position divided by four)
uint8 width of the patch (size divided by four)
uint8 height of the patch (size divided by four)
(calculating uncompressed size)
(iPF1 no alpha: width * height * 12)
(iPF1 with alpha: width * height * 20)
(iPF2 no alpha: width * height * 16)
(iPF2 with alpha: width * height * 24)
* UN-COMPRESSED PATCHDATA
TYPE 16+ Packet -
uint32 SIZE OF COMPRESSED FRAMEDATA BYTE-PLANE 1
* FRAMEDATA (COMPRESSED for TGA/GZ)
MP2 Packet & ADPCM Packet -
uint16 TYPE OF PACKET // follows the Metadata Packet Type scheme
* MP2 FRAME/ADPCM BLOCK
Sync Packet (subset of GLOBAL TYPE 255 Packet) -
uint16 0xFFFF (type of packet for Global Type 255)
Background Colour Packet -
uint16 0xFEFF
uint8 Red (0-255)
uint8 Green (0-255)
uint8 Blue (0-255)
uint8 0x00 (pad byte)
Frame Timing
If the global type is not 255, each packet is interpreted as a single full frame, and then will wait for the next
frame time; For type 255 however, the assumption no longer holds and each frame can have multiple packets, and thus
needs explicit "sync" packet for proper frame timing.
Compression Method
Old standard used Gzip, new standard is Zstd.
tsvm will read the zip header and will use appropriate decompression method, so that the old Gzipped
files remain compatible.
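Choosing the decompressor from the payload header can be done by checking the leading magic bytes. This is a sketch of the idea and not tsvm's actual routine; the function names are hypothetical. The Zstd branch is left unimplemented here because a Zstandard decoder is not assumed to be available in the Python standard library.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"          # gzip member header (RFC 1952)
ZSTD_MAGIC = b"\x28\xb5\x2f\xfd"  # Zstandard frame magic 0xFD2FB528, little-endian

def detect_compression(payload: bytes) -> str:
    """Identify the compression method from the first bytes of a payload."""
    if payload.startswith(ZSTD_MAGIC):
        return "zstd"
    if payload.startswith(GZIP_MAGIC):
        return "gzip"
    return "unknown"

def decompress(payload: bytes) -> bytes:
    kind = detect_compression(payload)
    if kind == "gzip":
        return gzip.decompress(payload)
    if kind == "zstd":
        # Plug in whichever Zstandard decoder is available in your environment.
        raise NotImplementedError("Zstd decoding is not included in this sketch")
    raise ValueError("unrecognised compression header")
```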
NOTE FROM DEVELOPER
In the future, the global packet type will be deprecated.
--------------------------------------------------------------------------------
TSVM Interchangeable Picture Format (aka iPF Type 1/2)
Image is divided into 4x4 blocks and each block is serialised, then the entire iPF blocks are gzipped
# File Structure
\x1F T S V M i P F
[HEADER]
[Blocks]
- Header
uint16 WIDTH
uint16 HEIGHT
uint8 Flags
0b p00z 000a
- a: has alpha
- z: gzipped (p flag always sets this flag)
- p: progressive ordering (Adam7)
uint8 iPF Type/Colour Mode
0: Type 1 (4:2:0 chroma subsampling; 2048 colours?)
1: Type 2 (4:2:2 chroma subsampling; 2048 colours?)
byte[10] RESERVED
uint32 UNCOMPRESSED SIZE (somewhat redundant but included for convenience)
- Chroma Subsampled Blocks
Gzipped unless the z-flag is not set.
4x4 pixels are sampled, then divided into YCoCg planes.
CoCg planes are "chroma subsampled" by 4:2:0, then quantised to 4 bits (8 bits for CoCg combined)
Y plane is quantised to 4 bits
By doing so, CoCg planes will reduce to 4 pixels
For the description of packing, pixels in Y/Cx plane will be numbered as:
Y0 Y1 Y2 Y3 || Cx1 Cx2 | Cx1 Cx2
Y4 Y5 Y6 Y7 || (iPF 1) | Cx3 Cx4
Y8 Y9 YA YB || Cx3 Cx4 | Cx5 Cx6
YC YD YE YF || (iPF 1) | Cx7 Cx8
Bits are packed like so:
iPF1:
uint16 [Co4 | Co3 | Co2 | Co1]
uint16 [Cg4 | Cg3 | Cg2 | Cg1]
uint16 [Y1 | Y0 | Y5 | Y4]
uint16 [Y3 | Y2 | Y7 | Y6]
uint16 [Y9 | Y8 | YD | YC]
uint16 [YB | YA | YF | YE]
(total: 12 bytes)
iPF2:
uint32 [Co8 | Co7 | Co6 | Co5 | Co4 | Co3 | Co2 | Co1]
uint32 [Cg8 | Cg7 | Cg6 | Cg5 | Cg4 | Cg3 | Cg2 | Cg1]
uint16 [Y1 | Y0 | Y5 | Y4]
uint16 [Y3 | Y2 | Y7 | Y6]
uint16 [Y9 | Y8 | YD | YC]
uint16 [YB | YA | YF | YE]
(total: 16 bytes)
If has alpha, append following bytes for alpha values
uint16 [a1 | a0 | a5 | a4]
uint16 [a3 | a2 | a7 | a6]
uint16 [a9 | a8 | aD | aC]
uint16 [aB | aA | aF | aE]
(total: 20/24 bytes)
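The iPF1 packing above can be undone with shifts and masks. The sketch below makes two assumptions that the text does not state outright: each uint16 is little-endian, and the leftmost name inside the brackets occupies the most significant nibble. The function name is hypothetical.

```python
import struct

# Pixel indices held by each Y word, most significant nibble first.
Y_WORD_LAYOUT = [(1, 0, 5, 4), (3, 2, 7, 6), (9, 8, 13, 12), (11, 10, 15, 14)]

def _nibbles_msn_first(word: int) -> list:
    """Split a 16-bit word into four nibbles, most significant first."""
    return [(word >> shift) & 0xF for shift in (12, 8, 4, 0)]

def unpack_ipf1_block(block: bytes):
    """Return (y, co, cg) for one 12-byte iPF1 block without alpha."""
    if len(block) != 12:
        raise ValueError("an iPF1 block without alpha is 12 bytes")
    co_word, cg_word, *y_words = struct.unpack("<6H", block)
    co = _nibbles_msn_first(co_word)[::-1]  # reorder to Co1..Co4
    cg = _nibbles_msn_first(cg_word)[::-1]  # reorder to Cg1..Cg4
    y = [0] * 16
    for word, layout in zip(y_words, Y_WORD_LAYOUT):
        for pixel, value in zip(layout, _nibbles_msn_first(word)):
            y[pixel] = value
    return y, co, cg
```

The alpha words use the same layout as the Y words, so the same table could be reused for the extra 8 bytes.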
Subsampling mask:
Least significant byte for top-left, most significant for bottom-right
For example, this default pattern
00 00 01 01
00 00 01 01
10 10 11 11
10 10 11 11
turns into:
01010000 -> 0x50
01010000 -> 0x50
11111010 -> 0xFA
11111010 -> 0xFA
which packs into: [ 50 | 50 | FA | FA ] (because little endian)
iPF1-delta (for video encoding):
Delta encoded frames contain "instructions" for patch-encoding the existing frame.
Or, a collection of [StateChangeCode] [Optional VarInts] [Payload...] pairs
States:
0x00 SKIP [varint skipCount]
0x01 PATCH [varint blockCount] [12x blockCount bytes]
0x02 REPEAT [varint repeatCount] [a block]
0xFF END
Sample stream:
[SKIP 10] [PATCH A] [REPEAT 3] [SKIP 5] [PATCH B] [END]
Delta block format:
Each PATCH delta payload is still:
8 bytes of Luma (4-bit deltas for 16 pixels)
2 bytes of Co deltas (4× 4-bit deltas)
2 bytes of Cg deltas (4× 4-bit deltas)
Total: 12 bytes per PATCH.
These are always relative to the same-position block in the previous frame.
- Progressive Blocks
Ordered string of words (word size varies by the colour mode) are stored here.
If progressive mode is enabled, words are stored in the order that accommodates it.
--------------------------------------------------------------------------------
TSVM Enhanced Video (TEV) Format
Created by CuriousTorvald and Claude on 2025-08-17
TEV is a modern video codec optimized for TSVM's 4096-color hardware, featuring
DCT-based compression, motion compensation, and efficient temporal coding.
## Version History
- Version 2.0: YCoCg-R 4:2:0 with 16x16/8x8 DCT blocks
- Version 2.1: Added Rate Control Factor to all video packets (breaking change)
* Enables bitrate-constrained encoding alongside quality modes
* All video frames now include 4-byte rate control factor after payload size
- Version 3.0: Additional support of ICtCp Colour space
# File Structure
\x1F T S V M T E V (if video), \x1F T S V M T E P (if still picture)
[HEADER]
[PACKET 0]
[PACKET 1]
[PACKET 2]
...
## Header (24 bytes)
uint8 Magic[8]: "\x1F TSVM TEV" or "\x1F TSVM TEP"
uint8 Version: 2 (YCoCg-R) or 3 (ICtCp)
uint16 Width: video width in pixels
uint16 Height: video height in pixels
uint8 FPS: frames per second
uint32 Total Frames: number of video frames
uint8 Quality Index for Y channel (0-99; 100 denotes that all quantisers are 1)
uint8 Quality Index for Co channel (0-99; 100 denotes that all quantisers are 1)
uint8 Quality Index for Cg channel (0-99; 100 denotes that all quantisers are 1)
uint8 Extra Feature Flags
- bit 0 = has audio
- bit 1 = has subtitle
- bit 2 = infinite loop (must be ignored when File Role is 1)
- bit 7 = has no actual packets, this file is header-only without an Intro Movie
uint8 Video Flags
- bit 0 = is interlaced (should be default for most non-archival TEV videos)
- bit 1 = is NTSC framerate (repeat every 1000th frame)
uint8 File Role
- 0 = generic
- 1 = this file is header-only, and a UCF payload will follow (used by seekable movie file)
When a header-only file contains video packets, they should be presented as an Intro Movie
before the user-interactable selector (served by the UCF payload)
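The 24-byte header above maps onto a single struct layout. A minimal sketch, assuming little-endian fields packed without padding in the listed order (the function and key names are hypothetical):

```python
import struct

TEV_MAGICS = (b"\x1fTSVMTEV", b"\x1fTSVMTEP")
# magic, version, width, height, fps, total frames,
# quality Y/Co/Cg, extra feature flags, video flags, file role
TEV_HEADER = struct.Struct("<8sBHHBIBBBBBB")  # 24 bytes

def parse_tev_header(data: bytes) -> dict:
    (magic, version, width, height, fps, total_frames,
     q_y, q_co, q_cg, extra, video, role) = TEV_HEADER.unpack_from(data)
    if magic not in TEV_MAGICS:
        raise ValueError("not a TEV/TEP file")
    return {
        "still_picture": magic.endswith(b"P"),
        "version": version,
        "width": width,
        "height": height,
        "fps": fps,
        "total_frames": total_frames,
        "quality": (q_y, q_co, q_cg),
        "has_audio": bool(extra & 0x01),
        "has_subtitle": bool(extra & 0x02),
        "infinite_loop": bool(extra & 0x04) and role != 1,  # ignored when File Role is 1
        "no_packets": bool(extra & 0x80),
        "interlaced": bool(video & 0x01),
        "ntsc_framerate": bool(video & 0x02),
        "file_role": role,
    }
```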
## Packet Types
0x10: I-frame (intra-coded frame)
0x11: P-frame (predicted frame)
0x1F: prohibited
0x20: MP2 audio packet
0x30: Subtitle in "Simple" format
0x31: Subtitle in "Karaoke" format
0xE0: EXIF packet
0xE1: ID3v1 packet
0xE2: ID3v2 packet
0xE3: Vorbis Comment packet
0xE4: CD-text packet
0xFF: sync packet
## Standard metadata payload packet structure
uint8 0xE0/0xE1/0xE2/.../0xEF (see Packet Types section)
uint32 Length of the payload
* Standard payload
note: metadata packets must precede any non-metadata packets
## Video Packet Structure
uint8 Packet Type
uint32 Compressed Size
* Zstd-compressed Block Data
## Block Data (per 16x16 block)
uint8 Mode: encoding mode
0x00 = SKIP (copy from previous frame)
0x01 = INTRA (DCT-coded, no prediction)
0x02 = INTER (DCT-coded with motion compensation) -- currently unused due to bugs
0x03 = MOTION (motion vector only)
int16 Motion Vector X ("capable of" 1/4 pixel precision, integer precision for now)
int16 Motion Vector Y ("capable of" 1/4 pixel precision, integer precision for now)
float32 Rate Control Factor (4 bytes, little-endian)
uint16 Coded Block Pattern (which 8x8 have non-zero coeffs)
int16[256] DCT Coefficients Y
int16[64] DCT Coefficients Co (subsampled by two)
int16[64] DCT Coefficients Cg (subsampled by two, aggressively quantised)
For SKIP and MOTION mode, DCT coefficients are filled with zero
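Summing the fields above gives the size of one decompressed block record. The sketch below assumes the fields are stored back to back with no padding and that frame dimensions are rounded up to whole 16x16 blocks; neither is stated in the text, and the names are hypothetical.

```python
import struct

# mode, motion vector X, motion vector Y, rate control factor,
# coded block pattern, Y[256], Co[64], Cg[64]
TEV_BLOCK = struct.Struct("<BhhfH256h64h64h")

def blocks_per_frame(width: int, height: int, block: int = 16) -> int:
    """Number of 16x16 blocks covering the frame (assumes rounding up)."""
    across = (width + block - 1) // block
    down = (height + block - 1) // block
    return across * down

def uncompressed_frame_bytes(width: int, height: int) -> int:
    """Size of the block data for one frame before Zstd compression."""
    return blocks_per_frame(width, height) * TEV_BLOCK.size
```

Under these assumptions a 560x448 frame holds 980 blocks of 779 bytes each.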
## DCT Quantisation and Rate Control
TEV uses 5 quality levels (0=lowest, 4=highest) with progressive quantisation
tables optimized for perceptual quality. DC coefficients are encoded losslessly,
while AC coefficients are quantised according to quality tables.
### Rate Control Factor
Each block includes a Rate Control Factor that modifies quality level for that specific block.
This feature allows more efficient coding by allowing higher quality for complex blocks and lower quality for
flat blocks.
## Motion Compensation
- Search range: ±8 pixels
- Sub-pixel precision: 1/4 pixel (again, integer precision for now)
- Block size: 16x16 pixels
- Uses Sum of Absolute Differences (SAD) for motion estimation
- Bilinear interpolation for sub-pixel motion vectors
## Colour Space
TEV operates in 8-bit colour mode; colour space conversion is required
## Compression Features
- 16x16 DCT blocks (vs 4x4 in iPF)
- Temporal prediction with motion compensation
- Rate-distortion optimized mode selection
- Hardware-accelerated encoding/decoding functions
## Performance Comparison
TEV achieves 60-80% better compression than iPF formats while maintaining
equivalent visual quality, with significantly faster decode performance due
to larger block sizes and hardware acceleration.
## Audio Support
Reuses existing MP2 audio infrastructure from TSVM MOV format for seamless
compatibility with existing audio processing pipeline.
## NTSC Framerate handling
The encoder encodes the frames as-is. The decoder must duplicate every 1000th frame to keep the decoding
in-sync.
--------------------------------------------------------------------------------
Simple Subtitle Format (SSF)
SSF is a simple subtitle format intended to display text using the text buffer.
The format is designed to be compatible with SubRip and SAMI (without markups) and interoperable with
TEV and TAV formats.
When SSF is interleaved with MP2 audio, the payload must be inserted in-between MP2 frames.
## Packet Structure
uint8 0x30 (packet type)
uint32 Packet Size
* SSF Payload (see below)
## SSF Packet Structure
uint24 index (used to specify target subtitle object)
uint8 opcode
0x00 = <argument terminator>, is NOP when used here
0x01 = show (arguments: UTF-8 text)
0x02 = hide (arguments: none)
0x03 = move to different nonant (arguments: 0x00-bottom centre; 0x01-bottom left; 0x02-centre left; 0x03-top left; 0x04-top centre; 0x05-top right; 0x06-centre right; 0x07-bottom right; 0x08-centre)
0x10..0x2F = show in alternative languages (arguments: char[5] language code, UTF-8 text)
0x80 = upload to low font rom (arguments: uint16 payload length, var bytes)
0x81 = upload to high font rom (arguments: uint16 payload length, var bytes)
note: changing the font rom will change the appearance of every subtitle currently being displayed
* arguments separated AND terminated by 0x00
text argument may be terminated by 0x00 BEFORE the entire arguments being terminated by 0x00,
leaving extra 0x00 on the byte stream. A decoder must be able to handle the extra zeros.
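A decoder for the show and hide opcodes has to tolerate the extra zero bytes described above. The sketch below assumes the uint24 index is little-endian, which the text does not state, and only handles opcodes 0x01 and 0x02. The function names are hypothetical.

```python
SSF_SHOW = 0x01
SSF_HIDE = 0x02

def build_ssf_show(index: int, text: str) -> bytes:
    """Build an SSF payload that shows `text` on subtitle object `index`."""
    return index.to_bytes(3, "little") + bytes([SSF_SHOW]) + text.encode("utf-8") + b"\x00"

def parse_ssf_payload(payload: bytes):
    """Return (index, opcode, text); text is None for opcodes without a text argument."""
    index = int.from_bytes(payload[:3], "little")  # assumed little-endian
    opcode = payload[3]
    arguments = payload[4:]
    if opcode == SSF_SHOW:
        # Cut at the first 0x00, so any trailing extra zeros are ignored.
        text = arguments.split(b"\x00", 1)[0].decode("utf-8")
        return index, opcode, text
    return index, opcode, None
```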
--------------------------------------------------------------------------------
Karaoke Subtitle Format (KSF)
KSF is a frame-synced subtitle format intended for Karaoke-style subtitles.
The format is designed to be interoperable with TEV and TAV formats.
For non-karaoke style synced lyrics, use SSF.
When KSF is interleaved with MP2 audio, the payload must be inserted in-between MP2 frames.
## Packet Structure
uint8 0x31 (packet type)
* KSF Payload (see below)
### KSF Packet Structure
KSF is line-based: you define an unrevealed line, then subsequent commands reveal words/syllables
on appropriate timings.
uint24 index (used to specify target subtitle object)
uint8 opcode
<definition opcodes>
0x00 = <argument terminator>, is NOP when used here
0x01 = define line (arguments: UTF-8 text. Players will also show it in grey)
0x02 = delete line (arguments: none)
0x03 = move to different nonant (arguments: 0x00-bottom centre; 0x01-bottom left; 0x02-centre left; 0x03-top left; 0x04-top centre; 0x05-top right; 0x06-centre right; 0x07-bottom right; 0x08-centre)
<reveal opcodes>
0x30 = reveal text normally (arguments: UTF-8 text. The reveal text must contain spaces when required)
0x31 = reveal text slowly (arguments: UTF-8 text. The effect is implementation-dependent)
0x40 = reveal text normally with emphasis (arguments: UTF-8 text. On TEV/TAV player, the text will be white; otherwise, implementation-dependent)
0x41 = reveal text slowly with emphasis (arguments: UTF-8 text)
0x50 = reveal text normally with target colour (arguments: uint8 target colour; UTF-8 text)
0x51 = reveal text slowly with target colour (arguments: uint8 target colour; UTF-8 text)
<hardware control opcodes>
0x80 = upload to low font rom (arguments: uint16 payload length, var bytes)
0x81 = upload to high font rom (arguments: uint16 payload length, var bytes)
note: changing the font rom will change the appearance of every subtitle currently being displayed
* arguments separated AND terminated by 0x00
text argument may be terminated by 0x00 BEFORE the entire arguments being terminated by 0x00,
leaving extra 0x00 on the byte stream. A decoder must be able to handle the extra zeros.
--------------------------------------------------------------------------------
TSVM Advanced Video (TAV) Format
Created by CuriousTorvald and Claude on 2025-09-13
TAV is a next-generation video codec for TSVM utilizing Discrete Wavelet Transform (DWT)
similar to JPEG2000, providing superior compression efficiency and scalability compared
to DCT-based codecs like TEV. Features include multi-resolution encoding, progressive
transmission capability, and region-of-interest coding.
# File Structure
\x1F T S V M T A V (if video), \x1F T S V M T A P (if still picture)
[HEADER]
[PACKET 0]
[PACKET 1]
[PACKET 2]
...
## Header (32 bytes)
uint8 Magic[8]: "\x1F TSVM TAV" or "\x1F TSVM TAP"
uint8 Version: 3 (YCoCg-R uniform), 4 (ICtCp uniform), 5 (YCoCg-R perceptual), 6 (ICtCp perceptual)
uint16 Width: video width in pixels
uint16 Height: video height in pixels
uint8 FPS: frames per second. Use 0x00 for still images
uint32 Total Frames: number of video frames. Use 0xFFFFFFFF to denote still image (.im3 file)
- frame count of 0 is used to denote not-finalised video stream
uint8 Wavelet Filter Type:
- 0 = 5/3 reversible (LGT 5/3, JPEG 2000 standard)
- 1 = 9/7 irreversible (CDF 9/7, slight modification of JPEG 2000, default choice)
- 2 = CDF 13/7 (experimental)
- 16 = DD-4 (Four-point interpolating Deslauriers-Dubuc; experimental)
- 255 = Haar (demonstration purpose only)
uint8 Decomposition Levels: number of DWT levels (1-6+)
uint8 Quantiser Index for Y channel (uses exponential numeric system; 0: lossless, 255: potato)
uint8 Quantiser Index for Co channel (uses exponential numeric system; 0: lossless, 255: potato)
uint8 Quantiser Index for Cg channel (uses exponential numeric system; 0: lossless, 255: potato)
uint8 Extra Feature Flags (must be ignored for still images)
- bit 0 = has audio
- bit 1 = has subtitle
- bit 2 = infinite loop (must be ignored when File Role is 1)
- bit 7 = has no actual packets, this file is header-only without an Intro Movie
uint8 Video Flags
- bit 0 = interlaced
- bit 1 = is NTSC framerate
- bit 2 = is lossless mode
(shorthand for `-q 6 -Q0,0,0 -w 0 --intra-only --no-perceptual-tuning --arate 384`)
- bit 3 = has region-of-interest coding (for still images only)
uint8 Encoder quality level (stored with bias of 1 (q0=1); used to derive anisotropy value)
uint8 Channel layout (bit-field: bit 0=has alpha, bit 1=has chroma inverted, bit 2=has luma inverted)
- 0 = Y-Co-Cg/I-Ct-Cp (000: no alpha, has chroma, has luma)
- 1 = Y-Co-Cg-A/I-Ct-Cp-A (001: has alpha, has chroma, has luma)
- 2 = Y/I only (010: no alpha, no chroma, has luma)
- 3 = Y-A/I-A (011: has alpha, no chroma, has luma)
- 4 = Co-Cg/Ct-Cp (100: no alpha, has chroma, no luma)
- 5 = Co-Cg-A/Ct-Cp-A (101: has alpha, has chroma, no luma)
- 6-7 = Reserved/invalid (would indicate no luma and no chroma)
uint8 Entropy Coder
- 0 = Twobit-plane significance map
- 1 = Embedded Zero Block Coding (EZBC, experimental)
uint8 Reserved[2]: fill with zeros
uint8 Device Orientation
- 0 = No rotation
- 1 = Clockwise 90 deg
- 2 = 180 deg
- 3 = Clockwise 270 deg
- 4 = Mirrored, No rotation
- 5 = Mirrored, Clockwise 90 deg
- 6 = Mirrored, 180 deg
- 7 = Mirrored, Clockwise 270 deg
uint8 File Role
- 0 = generic
- 1 = this file is header-only, and a UCF payload will follow (used by seekable movie file)
When a header-only file contains video packets, they should be presented as an Intro Movie
before the user-interactable selector (served by the UCF payload)
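The 32-byte header above can be read the same way as the other headers in this document. A minimal sketch, assuming little-endian fields packed without padding in the listed order (function and key names are hypothetical):

```python
import struct

TAV_MAGICS = (b"\x1fTSVMTAV", b"\x1fTSVMTAP")
# magic, version, width, height, fps, total frames, then ten single-byte
# fields, two reserved bytes, device orientation, file role
TAV_HEADER = struct.Struct("<8sBHHBI10B2sBB")  # 32 bytes

def parse_tav_header(data: bytes) -> dict:
    (magic, version, width, height, fps, total_frames,
     wavelet, levels, q_y, q_co, q_cg, extra, video,
     stored_quality, layout, entropy,
     _reserved, orientation, role) = TAV_HEADER.unpack_from(data)
    if magic not in TAV_MAGICS:
        raise ValueError("not a TAV/TAP file")
    return {
        "version": version,
        "width": width,
        "height": height,
        "fps": fps,
        "still_image": total_frames == 0xFFFFFFFF,
        "finalised": total_frames != 0,          # 0 denotes a not-finalised stream
        "wavelet": wavelet,
        "levels": levels,
        "quantisers": (q_y, q_co, q_cg),
        "has_audio": bool(extra & 0x01),
        "has_subtitle": bool(extra & 0x02),
        "interlaced": bool(video & 0x01),
        "ntsc_framerate": bool(video & 0x02),
        "lossless": bool(video & 0x04),
        "encoder_quality": stored_quality - 1,   # stored with a bias of 1
        "has_alpha": bool(layout & 0x01),
        "has_chroma": not (layout & 0x02),       # bit 1 is inverted
        "has_luma": not (layout & 0x04),         # bit 2 is inverted
        "entropy_coder": entropy,
        "orientation": orientation,
        "file_role": role,
    }
```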
## Packet Structure (some special packets have no payload. See Packet Types for details)
uint8 Packet Type
uint32 Payload Size
* Payload
## Packet Types
<video packets>
0x10: I-frame (intra-coded frame)
0x11: P-frame (delta/skip frame)
0x12: GOP Unified (temporal 3D DWT with unified preprocessing)
0x1F: (prohibited)
<audio packets>
0x20: MP2 audio packet
0x21: Zstd-compressed 8-bit PCM (32 KHz, audio hardware's native format)
0x22: Zstd-compressed 16-bit PCM (32 KHz, little endian)
0x23: Zstd-compressed ADPCM
<subtitles>
0x30: Subtitle in "Simple" format
0x31: Subtitle in "Karaoke" format
<synchronised tracks>
0x40: MP2 audio track
0x41: Zstd-compressed 8-bit PCM (32 KHz, audio hardware's native format)
0x42: Zstd-compressed 16-bit PCM (32 KHz, little endian)
0x43: Zstd-compressed ADPCM
<multiplexed video>
0x70/71: Video channel 2 I/P-frame
0x72/73: Video channel 3 I/P-frame
0x74/75: Video channel 4 I/P-frame
0x76/77: Video channel 5 I/P-frame
0x78/79: Video channel 6 I/P-frame
0x7A/7B: Video channel 7 I/P-frame
0x7C/7D: Video channel 8 I/P-frame
0x7E/7F: Video channel 9 I/P-frame
<Standard metadata payloads>
(it's called "standard" because you're expected to just copy-paste the metadata bytes verbatim)
0xE0: EXIF packet
0xE1: ID3v1 packet
0xE2: ID3v2 packet
0xE3: Vorbis Comment packet
0xE4: CD-text packet
<Special packets>
0x00: No-op (no payload)
0xEF: TAV Extended Header
0xF0: Loop point start (insert right AFTER the TC packet; no payload)
0xF1: Loop point end (insert right AFTER the TC packet; no payload)
0xFC: GOP Sync packet (indicates N frames decoded from GOP block)
0xFD: Timecode (TC) Packet [for frame 0, insert at the beginning; otherwise, insert right AFTER the sync]
0xFE: NTSC sync packet (used by player to calculate exact framerate-wise performance; no payload)
0xFF: Sync packet (no payload)
### Packet Precedence
Before the first frame group:
1. TAV Extended header (if any)
2. Standard metadata payloads (if any)
Frame group:
1. TC Packet (0xFD) or Next TAV File (0x1F) [mutually exclusive!]
2. Loop point packets
3. Audio packets
4. Subtitle packets
5. Main video packets (0x10-0x1E)
6. Multiplexed video packets (0x70-7F)
After a frame group:
1. Sync packet
## TAV Extended Header Specification and Structure
uint8 0xEF
uint16 Number of Key-Value pairs
* Key-Value pairs
### Key-Value Pair
uint8 Key[4]
uint8 Value Type
- 0x00: (U)Int16
- 0x01: (U)Int24
- 0x02: (U)Int32
- 0x03: (U)Int48
- 0x04: (U)Int64
- 0x10: Bytes
<if Value Type is Bytes>
uint16 Length of bytes
* Bytes
<otherwise>
type_t Value
### List of Keys
- Uint64 BGNT: Video begin time in nanoseconds (must be equal to the value of the first Timecode packet)
- Uint64 ENDT: Video end time in nanoseconds (must be equal to the value of the last Timecode packet)
- Uint64 CDAT: Creation time in microseconds since UNIX Epoch (must be in UTC timezone)
- Bytes VNDR: Name and version of the encoder (for Reference encoder: "Encoder-TAV 20251014 (list,of,features)")
- Bytes FMPG: FFmpeg version (typically "ffmpeg version 8.0 Copyright (c) 2000-2025 the FFmpeg developers"; the first line of text FFmpeg emits)
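The Key-Value layout above can be walked with a short parser. This is a minimal sketch, not the reference decoder: the function name parse_extended_header is hypothetical, and it ASSUMES that multi-byte fields are little-endian, that integer values are unsigned, and that keys are ASCII.

```python
import struct

# Byte widths of the integer value types listed under "Key-Value Pair".
INT_WIDTHS = {0x00: 2, 0x01: 3, 0x02: 4, 0x03: 6, 0x04: 8}
TYPE_BYTES = 0x10


def parse_extended_header(buf, pos=0):
    """Parse a TAV Extended Header packet that starts at buf[pos].

    Returns (dict of key -> value, position just past the packet).
    ASSUMPTION: little-endian fields, unsigned integers, ASCII keys.
    """
    if buf[pos] != 0xEF:
        raise ValueError("not a TAV Extended Header packet")
    (count,) = struct.unpack_from("<H", buf, pos + 1)
    pos += 3
    pairs = {}
    for _ in range(count):
        key = buf[pos:pos + 4].decode("ascii")
        vtype = buf[pos + 4]
        pos += 5
        if vtype == TYPE_BYTES:
            # Bytes values carry their own uint16 length prefix.
            (length,) = struct.unpack_from("<H", buf, pos)
            pos += 2
            value = bytes(buf[pos:pos + length])
            pos += length
        elif vtype in INT_WIDTHS:
            width = INT_WIDTHS[vtype]
            value = int.from_bytes(buf[pos:pos + width], "little")
            pos += width
        else:
            raise ValueError(f"unknown value type 0x{vtype:02X}")
        pairs[key] = value
    return pairs, pos
```

The returned position lets a caller continue reading the next packet directly after the header.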
## Standard Metadata Payload Packet Structure
uint8 0xE0/0xE1/0xE2/.../0xEE (see Packet Types section)
uint32 Length of the payload
* Standard payload
Notes:
- metadata packets must precede any non-metadata packets
- when multiple metadata packets are present (e.g. ID3v2 and Vorbis Comment both present),
which gets precedence is implementation-dependent. ONE EXCEPTION is ID3v1 and ID3v2 where ID3v2 gets
precedence.
## Timecode Packet Structure
uint8 Packet Type (0xFD)
uint64 Time since stream start in nanoseconds (this may NOT start from zero if the video is coming from a livestream)
## Video Packet Structure (0x10, 0x11)
uint8 Packet Type
uint32 Compressed Size
* Zstd-compressed Block Data
## GOP Unified Packet Structure (0x12)
Implemented on 2025-10-15 for temporal 3D DWT with unified preprocessing.
This packet contains multiple frames encoded as a single spacetime block for optimal
temporal compression.
uint8 Packet Type (0x12/0x13)
uint8 GOP Size (number of frames in this GOP, typically 16)
<if packet type is 0x13>
uint32 Compressed Size
* Zstd-compressed Motion Data
<endif>
uint32 Compressed Size
* Zstd-compressed Unified Block Data
### Unified Block Data Format
The entire GOP (width×height×N_frames×3_channels) is preprocessed as a single block:
<if significance maps are used>
uint8 Y Significance Maps[(width*height + 7) / 8 * GOP Size] // All Y frames concatenated
uint8 Co Significance Maps[(width*height + 7) / 8 * GOP Size] // All Co frames concatenated
uint8 Cg Significance Maps[(width*height + 7) / 8 * GOP Size] // All Cg frames concatenated
int16 Y Non-zero Values[variable length] // All Y non-zero coefficients
int16 Co Non-zero Values[variable length] // All Co non-zero coefficients
int16 Cg Non-zero Values[variable length] // All Cg non-zero coefficients
<if EZBC is used>
uint32 EZBC Size for Y
* EZBC Structure for Y
uint32 EZBC Size for Co
* EZBC Structure for Co
uint32 EZBC Size for Cg
* EZBC Structure for Cg
This layout enables Zstd to find patterns across both spatial and temporal dimensions,
resulting in superior compression compared to per-frame encoding.
### Motion Vectors
- Stored in 1/16-pixel units (divide by 16.0 for pixel displacement)
- Used for global motion compensation (camera movement, scene translation)
- Computed using FFT-based phase correlation for accurate frame alignment
- Cumulative relative to frame 0 (not frame-to-frame deltas)
- First frame (frame 0) always has motion vector (0, 0)
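As an illustration of the units and the cumulative convention above, the sketch below converts stored vectors to pixel displacements and recovers frame-to-frame deltas. The function names and the (dx, dy) tuple representation are assumptions made for this example only; the on-disk layout of the motion data is not described here.

```python
def motion_to_pixels(vectors_16th):
    """Convert cumulative (dx, dy) vectors from 1/16-pixel units to pixels."""
    return [(dx / 16.0, dy / 16.0) for dx, dy in vectors_16th]


def frame_to_frame_deltas(vectors_16th):
    """Recover per-frame deltas (still in 1/16-pixel units).

    The stored vectors are cumulative relative to frame 0, so the delta of
    frame k is vector[k] - vector[k - 1]; frame 0 has delta (0, 0).
    """
    if not vectors_16th:
        return []
    deltas = [(0, 0)]
    for (px, py), (cx, cy) in zip(vectors_16th, vectors_16th[1:]):
        deltas.append((cx - px, cy - py))
    return deltas
```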
### Temporal 3D DWT Process
1. Detect inter-frame motion using phase correlation
2. Align frames and expand canvas to preserve all original pixels
3. Apply 1D DWT across temporal axis (GOP frames) on expanded canvas
4. Apply 2D DWT on each spatial slice of temporal subbands
5. Perceptual quantization with temporal-spatial awareness
6. Unified significance map preprocessing across all frames/channels
7. Single Zstd compression of entire GOP block
## GOP Sync Packet Structure (0xFC)
Indicates that N frames were decoded from a GOP Unified block.
Decoders must track this to maintain proper frame count and synchronization.
uint8 Packet Type (0xFC)
uint8 Frame Count (number of frames that were decoded from preceding GOP block)
Note: GOP Sync packets have no payload size field (fixed 2-byte packet).
## Block Data (per frame)
uint8 Mode: encoding mode
0x00 = SKIP (just use frame data from previous frame)
0x01 = INTRA (DWT-coded)
0x02 = DELTA (DWT delta)
- 0x02: DWT level 1
- 0x12: DWT level 2
- 0x22: DWT level 3
...
- 0xF2: DWT Level 16
uint8 Quantiser override Y (uses exponential numeric system; stored with index bias of 1 (127->252, 255->4032); use 0 to disable overriding)
uint8 Quantiser override Co (uses exponential numeric system; stored with index bias of 1 (127->252, 255->4032); use 0 to disable overriding)
uint8 Quantiser override Cg (uses exponential numeric system; stored with index bias of 1 (127->252, 255->4032); use 0 to disable overriding)
- note: quantiser overrides are always present regardless of the channel layout
* Tile data (one compressed payload per tile)
### Coefficient Storage Format (Significance Map Compression)
Starting with encoder version 2025-09-29, DWT coefficients are stored using
significance map compression with concatenated maps layout for optimal efficiency:
#### Concatenated Maps Format
All channels are processed together to maximize Zstd compression:
uint8 Y Significance Map[(coeff_count + 7) / 8] // 1 bit per Y coefficient
uint8 Co Significance Map[(coeff_count + 7) / 8] // 1 bit per Co coefficient
uint8 Cg Significance Map[(coeff_count + 7) / 8] // 1 bit per Cg coefficient
uint8 A Significance Map[(coeff_count + 7) / 8] // 1 bit per A coefficient (if alpha present)
int16 Y Non-zero Values[variable length] // Only non-zero Y coefficients
int16 Co Non-zero Values[variable length] // Only non-zero Co coefficients
int16 Cg Non-zero Values[variable length] // Only non-zero Cg coefficients
int16 A Non-zero Values[variable length] // Only non-zero A coefficients (if alpha present)
#### Significance Map Encoding
Each significance map uses 1 bit per coefficient position:
- Bit = 1: coefficient is non-zero, read value from corresponding Non-zero Values array
- Bit = 0: coefficient is zero
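The encoding above can be sketched for a single channel as follows. The helper names are hypothetical, and the bit order inside each map byte is an ASSUMPTION (coefficient i is taken to live in byte i / 8 at bit i mod 8, least significant bit first); the text above does not state it.

```python
def encode_channel(coeffs):
    """Split a dense coefficient list into (significance map, non-zero values).

    ASSUMPTION: coefficient i maps to bit (i % 8) of byte (i // 8), LSB first.
    """
    sig_map = bytearray((len(coeffs) + 7) // 8)
    values = []
    for i, c in enumerate(coeffs):
        if c != 0:
            sig_map[i >> 3] |= 1 << (i & 7)
            values.append(c)
    return bytes(sig_map), values


def decode_channel(sig_map, values, coeff_count):
    """Expand a significance map and its non-zero values into a dense list."""
    it = iter(values)
    coeffs = [0] * coeff_count
    for i in range(coeff_count):
        if (sig_map[i >> 3] >> (i & 7)) & 1:
            # A set bit consumes the next entry of the Non-zero Values array.
            coeffs[i] = next(it)
    return coeffs
```

In the concatenated layout, a decoder would first slice out the maps of all channels, count the set bits of each map to learn how many int16 values belong to that channel, and then call the expansion once per channel.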
#### Compression Benefits
- **Sparsity exploitation**: Typically 85-95% zeros in quantised DWT coefficients
- **Cross-channel patterns**: Concatenated maps allow Zstd to find patterns across similar significance maps
- **Overall improvement**: 16-18% compression improvement before Zstd compression
## DWT Implementation Details
### Wavelet Filters
- 5/3 Reversible Filter (lossless capable):
* Analysis: Low-pass [1/2, 1, 1/2], High-pass [-1/8, -1/4, 3/4, -1/4, -1/8]
* Synthesis: Low-pass [1/4, 1/2, 1/4], High-pass [-1/16, -1/8, 3/8, -1/8, -1/16]
- 9/7 Irreversible Filter (higher compression):
* Analysis: CDF 9/7 coefficients optimized for image compression
* Provides better energy compaction than 5/3 but lossy reconstruction
### Quantisation Strategy
#### Uniform Quantisation (Versions 3-4)
Traditional approach using same quantisation factor for all DWT subbands within each channel.
#### Perceptual Quantisation (Versions 5-6, Default)
TAV versions 5 and 6 implement Human Visual System (HVS) optimized quantisation with
frequency-aware subband weighting for superior visual quality:
Anisotropic quantisation is applied for both Luma and Chroma channels to preserve horizontal details.
Anisotropic quantisation is an upgrade to the traditional field-interlacing and
chroma subsampling.
This perceptual approach allocates more bits to visually important low-frequency
details while aggressively quantising high-frequency noise, resulting in superior
visual quality at equivalent bitrates.
#### Grain Synthesis
The decoder must synthesise film grain on non-LL subbands at an amplitude of half the quantisation level.
The encoder may synthesise the exact same grain with reversed sign on encoding (not recommended for practical reasons).
The base noise function must be triangular noise in range [-1.0, 1.0].
## Colour Space
TAV supports two colour spaces:
**YCoCg-R (Versions 3, 5):**
- Y: Luma channel (full resolution)
- Co: Orange-Cyan chroma (full resolution)
- Cg: Green-Magenta chroma (full resolution)
**ICtCp (Versions 4, 6):**
- I: Intensity (similar to luma)
- Ct: Chroma tritanopia
- Cp: Chroma protanopia
Perceptual versions (5-6) apply HVS-optimized quantisation weights per channel,
while uniform versions (3-4) use consistent quantisation across all subbands.
The encoder expects linear alpha.
## Compression Features
- Single DWT tiles vs 16x16 DCT blocks in TEV
- Multi-resolution representation enables scalable decoding
- Better frequency localization than DCT
- Reduced blocking artifacts due to overlapping basis functions
## Hardware Acceleration Functions
TAV decoder requires new GraphicsJSR223Delegate functions:
- tavDecode(): Main DWT decoding function
- tavDWT2D(): 2D DWT/IDWT transforms
- tavQuantise(): Multi-band quantisation
## Audio Support
Reuses existing MP2 audio infrastructure from TEV/MOV formats for compatibility.
## Subtitle Support
Uses same Simple Subtitle Format (SSF) as TEV for text overlay functionality.
## NTSC Framerate handling
Unlike the TEV format, the TAV encoder emits an extra sync packet every 1000th frame. The decoder can just play the video without any special treatment.
## Exponential Numeric System
This system maps [0..255] to [1..4096]
Number|Index
------+-----
1|0
2|1
3|2
4|3
5|4
6|5
7|6
8|7
9|8
10|9
11|10
12|11
13|12
14|13
15|14
16|15
17|16
18|17
19|18
20|19
21|20
22|21
23|22
24|23
25|24
26|25
27|26
28|27
29|28
30|29
31|30
32|31
33|32
34|33
35|34
36|35
37|36
38|37
39|38
40|39
41|40
42|41
43|42
44|43
45|44
46|45
47|46
48|47
49|48
50|49
51|50
52|51
53|52
54|53
55|54
56|55
57|56
58|57
59|58
60|59
61|60
62|61
63|62
64|63
66|64
68|65
70|66
72|67
74|68
76|69
78|70
80|71
82|72
84|73
86|74
88|75
90|76
92|77
94|78
96|79
98|80
100|81
102|82
104|83
106|84
108|85
110|86
112|87
114|88
116|89
118|90
120|91
122|92
124|93
126|94
128|95
132|96
136|97
140|98
144|99
148|100
152|101
156|102
160|103
164|104
168|105
172|106
176|107
180|108
184|109
188|110
192|111
196|112
200|113
204|114
208|115
212|116
216|117
220|118
224|119
228|120
232|121
236|122
240|123
244|124
248|125
252|126
256|127
264|128
272|129
280|130
288|131
296|132
304|133
312|134
320|135
328|136
336|137
344|138
352|139
360|140
368|141
376|142
384|143
392|144
400|145
408|146
416|147
424|148
432|149
440|150
448|151
456|152
464|153
472|154
480|155
488|156
496|157
504|158
512|159
528|160
544|161
560|162
576|163
592|164
608|165
624|166
640|167
656|168
672|169
688|170
704|171
720|172
736|173
752|174
768|175
784|176
800|177
816|178
832|179
848|180
864|181
880|182
896|183
912|184
928|185
944|186
960|187
976|188
992|189
1008|190
1024|191
1056|192
1088|193
1120|194
1152|195
1184|196
1216|197
1248|198
1280|199
1312|200
1344|201
1376|202
1408|203
1440|204
1472|205
1504|206
1536|207
1568|208
1600|209
1632|210
1664|211
1696|212
1728|213
1760|214
1792|215
1824|216
1856|217
1888|218
1920|219
1952|220
1984|221
2016|222
2048|223
2112|224
2176|225
2240|226
2304|227
2368|228
2432|229
2496|230
2560|231
2624|232
2688|233
2752|234
2816|235
2880|236
2944|237
3008|238
3072|239
3136|240
3200|241
3264|242
3328|243
3392|244
3456|245
3520|246
3584|247
3648|248
3712|249
3776|250
3840|251
3904|252
3968|253
4032|254
4096|255
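The table above follows a regular pattern: indices 0..63 step by 1, and every following group of 32 indices doubles the step (2, 4, 8, 16, 32, 64). The sketch below reproduces the table from that observation. The closed form is inferred from the table, not taken from the reference implementation, and the function names are hypothetical. The second function applies the "index bias of 1" described for the quantiser override fields in Block Data.

```python
def exp_index_to_number(index):
    """Map an index in [0, 255] to its number in [1, 4096] per the table."""
    if not 0 <= index <= 255:
        raise ValueError("index out of range")
    if index < 64:
        return index + 1
    # Six segments of 32 entries each; segment k starts after 64 << k
    # and advances in steps of 2 << k.
    segment, offset = divmod(index - 64, 32)
    return (64 << segment) + (2 << segment) * (offset + 1)


def quantiser_override(stored):
    """Decode a quantiser override byte; None means overriding is disabled."""
    if stored == 0:
        return None
    return exp_index_to_number(stored - 1)
```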
--------------------------------------------------------------------------------
TSVM Universal Cue format
Created by CuriousTorvald on 2025-09-22
A universal, simple cue format designed to work both as a playlist to cue up external files and as a lookup table for internal bytes.
# File Structure
\x1F T S V M U C F
[HEADER]
[CUE ELEMENT 0]
[CUE ELEMENT 1]
[CUE ELEMENT 2]
...
## Header (16 bytes)
uint8 Magic[8]: "\x1F TSVM UCF"
uint8 Version: 1
uint16 Number of cue elements
uint32 (Optional) Size of the cue file, useful for allocating fixed length for future expansion; 0 when not used
uint8 Reserved
## Cue Element
uint8 Addressing Mode (low nybble) and Role Flags (high nybble)
- 0x01: External
- 0x02: Internal
- 0x10: Intended for machine interaction (GOP indices, frame indices, etc.)
- 0x20: Intended for human interaction (playlist, chapter markers, etc.)
- 0x30: Intended for both machine and human interaction
Role flags must be unset to assign no roles
uint16 String Length for name
* Name of the element in UTF-8
<if external addressing mode>
uint16 String Length for relative path
* Relative path
<if internal addressing mode>
uint48 Offset to the file
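A reader for the header and cue elements above might look like the following sketch. The function name parse_ucf and the returned dictionary shape are hypothetical. It ASSUMES little-endian multi-byte fields, UTF-8 relative paths, and a magic of byte 0x1F followed by the seven ASCII letters "TSVMUCF" (8 bytes in total); none of these is stated explicitly above.

```python
import struct

UCF_MAGIC = b"\x1FTSVMUCF"  # ASSUMPTION: 0x1F plus seven letters, no spaces


def parse_ucf(buf):
    """Parse a UCF blob into its version, declared size and cue elements.

    ASSUMPTION: little-endian fields and UTF-8 relative paths.
    """
    if buf[:8] != UCF_MAGIC:
        raise ValueError("bad UCF magic")
    # Version (uint8), element count (uint16), cue file size (uint32);
    # the final header byte at offset 15 is reserved.
    version, count, cue_size = struct.unpack_from("<BHI", buf, 8)
    pos = 16
    elements = []
    for _ in range(count):
        flags = buf[pos]
        (name_len,) = struct.unpack_from("<H", buf, pos + 1)
        pos += 3
        name = buf[pos:pos + name_len].decode("utf-8")
        pos += name_len
        entry = {"name": name, "roles": flags & 0xF0, "mode": flags & 0x0F}
        if entry["mode"] == 0x01:  # external: relative path follows
            (path_len,) = struct.unpack_from("<H", buf, pos)
            pos += 2
            entry["path"] = buf[pos:pos + path_len].decode("utf-8")
            pos += path_len
        elif entry["mode"] == 0x02:  # internal: uint48 offset follows
            entry["offset"] = int.from_bytes(buf[pos:pos + 6], "little")
            pos += 6
        else:
            raise ValueError(f"unknown addressing mode {entry['mode']}")
        elements.append(entry)
    return {"version": version, "cue_size": cue_size, "elements": elements}
```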
--------------------------------------------------------------------------------
Sound Adapter
Endianness: little
Memory Space
0..114687 RW: Sample bin
114688..131071 RW: Instrument bin (256 instruments, 64 bytes each)
131072..196607 RW: Play data 1
196608..262143 RW: Play data 2
Sample bin: just raw sample data thrown in there. You need to keep track of starting point for each sample
Instrument bin: Registry for 256 instruments, formatted as:
Uint16 Sample Pointer
Uint16 Sample length
Uint16 Sampling rate at C3
Uint16 Play Start (usually 0 but not always)
Uint16 Loop Start (can be smaller than Play Start)
Uint16 Loop End
Bit32 Flags
0b h000 00pp
h: sample pointer high bit
pp: loop mode. 0-no loop, 1-loop, 2-backandforth, 3-oneshot (ignores note length unless overridden by other notes)
Bit16x24 Volume envelopes
Byte 1: Volume
Byte 2: Second offset from the prev point, in 3.5 Unsigned Minifloat
Play Data: play data are series of tracker-like instructions, visualised as:
rr||NOTE|Ins|E.Vol|E.Pan|EE.ff|
63||FFFF|255|3+ 64|3+ 64|16 FF| (8 bytes per line, 512 bytes per pattern, 256 patterns on 128 kB block)
notes are tuned as 4096 Tone-Equal Temperament. Tuning is set per-sample using their Sampling rate value.
Sound Adapter MMIO
0..1 RW: Play head #1 position
2..3 RW: Play head #1 length param
4 RW: Play head #1 master volume
5 RW: Play head #1 master pan
6..9 RW: Play head #1 flags
10..11 RW: Play head #2 position
12..13 RW: Play head #2 length param
14 RW: Play head #2 master volume
15 RW: Play head #2 master pan
16..19 RW: Play head #2 flags
... auto-fill to Play head #4
40 WO: Media Decoder Control
Write 16 to initialise the MP2 context (call this before the decoding of NEW music)
Write 1 to decode the frame as MP2
Write 17 to initialise the context and then decode the frame in one step
41 RO: Media Decoder Status
Non-zero value indicates the decoder is busy
64..2367 RW: MP2 Decoded Samples (unsigned 8-bit stereo)
2368..4095 RW: MP2 Frame to be decoded
4096..4097 RO: MP2 Frame guard bytes; always return 0 on read
Sound Hardware Info
- Sampling rate: 32000 Hz
- Bit depth: 8 bits/sample, unsigned
- Always operate in stereo (mono samples must be expanded to stereo before uploading)
Play Head Position
- Tracker mode: Cuesheet Counter
- PCM mode: Number of buffers uploaded and received by the adapter
Length Param
PCM Mode: length of the samples to upload to the speaker
Tracker mode: unused
Play Head Flags
Byte 1
- 0b mrqp ssss
m: mode (0 for Tracker, 1 for PCM)
r: reset parameters; always 0 when read
resetting will:
set position to 0,
set length param to 0,
set queue capacity to 8 samples,
unset play bit
q: purge queues (likely do nothing if not PCM); always 0 when read
p: play (0 if not -- mute all output)
ssss: PCM Mode set PCM Queue Size
0 - 4 samples
1 - 6 samples
2 - 8 samples (the default size)
3 - 12 samples
4 - 16 samples
5 - 24 samples
6 - 32 samples
7 - 48 samples
8 - 64 samples
9 - 96 samples
10 - 128 samples
11 - 192 samples
12 - 256 samples
13 - 384 samples
14 - 512 samples
15 - 768 samples
NOTE: changing from PCM mode to Tracker mode or vice versa will also reset the parameters as described above
Byte 2
- PCM Mode: Write non-zero value to start uploading; always 0 when read
Byte 3 (Tracker Mode)
- BPM (24 to 280. Play Data will change this register)
Byte 4 (Tracker Mode)
- Tick Rate (Play Data will change this register)
Uploaded PCM data will be stored onto the queue before being consumed by hardware.
If the queue is full, any more uploads will be silently discarded.
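The bit layout of flags Byte 1 and the queue size list can be decoded as sketched below. The names are hypothetical, and the closed form for the queue size (4 or 6, doubled once per two steps of ssss) is inferred from the sixteen listed values rather than taken from the hardware implementation.

```python
def pcm_queue_size(ssss):
    """Queue size in samples for the 4-bit ssss field (0..15)."""
    if not 0 <= ssss <= 15:
        raise ValueError("ssss must fit in 4 bits")
    base = 6 if ssss & 1 else 4
    return base << (ssss >> 1)


def decode_flags_byte1(value):
    """Split Play Head Flags byte 1 (0b mrqp ssss) into named fields."""
    return {
        "pcm_mode": bool(value & 0x80),   # m: 0 for Tracker, 1 for PCM
        "reset": bool(value & 0x40),      # r: reset parameters
        "purge": bool(value & 0x20),      # q: purge queues
        "play": bool(value & 0x10),       # p: play
        "queue_size": pcm_queue_size(value & 0x0F),
    }
```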
32768..65535 RW: Cue Sheet (2048 cues)
Byte 1..15: pattern number for voice 1..15
Byte 16: instruction
1 xxxxxxx - Go back (128, 1-127) patterns to form a loop
01 xxxxxx -
001 xxxxx -
0001 xxxx - Skip (16, 1-15) patterns
00001 xxx -
000001 xx -
0000001 x -
0000000 1 -
0000000 0 - No operation
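A decoder for the instruction byte could be sketched as follows. It ASSUMES that "(128, 1-127)" means an operand of 0 stands for 128 and operands 1..127 stand for themselves, and likewise that an operand of 0 stands for 16 in the Skip instruction. Bit patterns that the list leaves blank are reported as undefined. The function name and the tuple return shape are hypothetical.

```python
def decode_cue_instruction(byte):
    """Decode byte 16 of a cue into an (operation, operand) tuple.

    ASSUMPTION: an operand of 0 encodes the maximum count (128 or 16).
    """
    if byte & 0x80:  # 1 xxxxxxx: go back to form a loop
        n = byte & 0x7F
        return ("go_back", n if n else 128)
    if (byte & 0xF0) == 0x10:  # 0001 xxxx: skip patterns
        n = byte & 0x0F
        return ("skip", n if n else 16)
    if byte == 0:
        return ("nop", 0)
    return ("undefined", byte)  # patterns with no assigned meaning
```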
65536..131071 RW: PCM Sample buffer
Table of 3.5 Minifloat values (CSV)
,000,001,010,011,100,101,110,111,MSB
00000,0,1,2,4,8,16,32,64
00001,0.03125,1.03125,2.0625,4.125,8.25,16.5,33,66
00010,0.0625,1.0625,2.125,4.25,8.5,17,34,68
00011,0.09375,1.09375,2.1875,4.375,8.75,17.5,35,70
00100,0.125,1.125,2.25,4.5,9,18,36,72
00101,0.15625,1.15625,2.3125,4.625,9.25,18.5,37,74
00110,0.1875,1.1875,2.375,4.75,9.5,19,38,76
00111,0.21875,1.21875,2.4375,4.875,9.75,19.5,39,78
01000,0.25,1.25,2.5,5,10,20,40,80
01001,0.28125,1.28125,2.5625,5.125,10.25,20.5,41,82
01010,0.3125,1.3125,2.625,5.25,10.5,21,42,84
01011,0.34375,1.34375,2.6875,5.375,10.75,21.5,43,86
01100,0.375,1.375,2.75,5.5,11,22,44,88
01101,0.40625,1.40625,2.8125,5.625,11.25,22.5,45,90
01110,0.4375,1.4375,2.875,5.75,11.5,23,46,92
01111,0.46875,1.46875,2.9375,5.875,11.75,23.5,47,94
10000,0.5,1.5,3,6,12,24,48,96
10001,0.53125,1.53125,3.0625,6.125,12.25,24.5,49,98
10010,0.5625,1.5625,3.125,6.25,12.5,25,50,100
10011,0.59375,1.59375,3.1875,6.375,12.75,25.5,51,102
10100,0.625,1.625,3.25,6.5,13,26,52,104
10101,0.65625,1.65625,3.3125,6.625,13.25,26.5,53,106
10110,0.6875,1.6875,3.375,6.75,13.5,27,54,108
10111,0.71875,1.71875,3.4375,6.875,13.75,27.5,55,110
11000,0.75,1.75,3.5,7,14,28,56,112
11001,0.78125,1.78125,3.5625,7.125,14.25,28.5,57,114
11010,0.8125,1.8125,3.625,7.25,14.5,29,58,116
11011,0.84375,1.84375,3.6875,7.375,14.75,29.5,59,118
11100,0.875,1.875,3.75,7.5,15,30,60,120
11101,0.90625,1.90625,3.8125,7.625,15.25,30.5,61,122
11110,0.9375,1.9375,3.875,7.75,15.5,31,62,124
11111,0.96875,1.96875,3.9375,7.875,15.75,31.5,63,126
LSB
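The table above can be generated from a simple rule: the top 3 bits act as an exponent and the low 5 bits as a mantissa, with exponent 0 giving plain fractions of 1/32. The sketch below is derived from the table values; the byte layout (eee mmmmm, exponent in the most significant bits) follows the MSB/LSB labels of the table and is otherwise an assumption, as is the function name.

```python
def minifloat35_to_float(byte):
    """Decode a 3.5 unsigned minifloat byte (layout assumed: eee mmmmm)."""
    exponent = (byte >> 5) & 0x07
    mantissa = byte & 0x1F
    if exponent == 0:
        # First column of the table: 0, 1/32, 2/32, ... 31/32
        return mantissa / 32.0
    # Remaining columns: 2^(e-1) * (1 + mantissa/32)
    return (1 << (exponent - 1)) * (1.0 + mantissa / 32.0)
```

For example, row 10000 under column 010 is the byte 0b01010000, which decodes to 3.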
--------------------------------------------------------------------------------
RomBank / RamBank
Endianness: Little
MMIO
0 RW : Bank number for the first 512 kbytes
1 RW : Bank number for the last 512 kbytes
16..23 RW : DMA Control for Lane 1..8
Write 0x01: copy from Core to Peripheral
Write 0x02: copy from Peripheral to Core
* NOTE: after the transfer, the bank numbers will revert to the values they had before the operation
24..31 RW : DMA Control reserved
32..34 RW : DMA Lane 1 -- Addr on the Core Memory
35..37 RW : DMA Lane 1 -- Addr on the Peripheral's Memory (addr can be across-the-bank)
38..40 RW : DMA Lane 1 -- Transfer Length
41..42 RW : DMA Lane 1 -- First/Last Bank Number
43 RW : DMA Lane 1 -- (reserved)
44..55 RW : DMA Lane 2 Props
56..67 RW : DMA Lane 3 Props
68..79 RW : DMA Lane 4 Props
80..91 RW : DMA Lane 5 Props
92..103 RW : DMA Lane 6 Props
104..115 RW : DMA Lane 7 Props
116..127 RW : DMA Lane 8 Props
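Each lane occupies 12 bytes of MMIO starting at offset 32, so register offsets for any lane can be computed instead of looked up. The sketch below ASSUMES that lanes 2 through 8 repeat the field layout spelled out for lane 1; the function name and dictionary keys are hypothetical.

```python
def dma_lane_registers(lane):
    """Return MMIO offsets for DMA lane 1..8 as inclusive (first, last) ranges.

    ASSUMPTION: every lane uses the same 12-byte layout as lane 1.
    """
    if not 1 <= lane <= 8:
        raise ValueError("lane must be 1..8")
    base = 32 + 12 * (lane - 1)
    return {
        "control": 16 + (lane - 1),
        "core_addr": (base, base + 2),
        "peripheral_addr": (base + 3, base + 5),
        "length": (base + 6, base + 8),
        "banks": (base + 9, base + 10),
        "reserved": base + 11,
    }
```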
--------------------------------------------------------------------------------
High Speed Disk Peripheral Adapter (HSDPA)
An interface card for sequentially reading and writing a single large disk that has no filesystem on it.
Endianness: Little
MMIO
0..2 RW: Block transfer status for Disk 1
0b nnnn nnnn, nnnn nnnn , a00z mmmm
n-read: size of the block from the other device, LSB (1048576-full block size is zero)
m-read: size of the block from the other device, MSB (1048576-full block size is zero)
a-read: if the other device hasNext (doYouHaveNext), false if device not present
z-read: set if the size is actually 0 instead of 1048576 (overrides n and m parameters)
n-write: size of the block I'm sending, LSB (1048576-full block size is zero)
m-write: size of the block I'm sending, MSB (1048576-full block size is zero)
a-write: if there's more to send (hasNext)
z-write: set if the size is actually 0 instead of 1048576 (overrides n and m parameters)
3..5 RW: Block transfer status for Disk 2
6..8 RW: Block transfer status for Disk 3
9..11 RW: Block transfer status for Disk 4
12..15 RW: Block transfer control for Disk 1 through 4
0b 0000 abcd
a: 1 for send, 0 for receive
b-write: 1 to start sending if a-bit is set; if a-bit is unset, make the other device start sending
b-read: if this bit is set, you're currently receiving something (aka busy)
c-write: I'm ready to receive
c-read: Are you ready to receive?
d-read: Are you there? (if the other device's recipient is myself)
NOTE: not ready AND not busy (bits b and d set when read) means the device is not connected to the port
16..19 RW: 8-bit status code for the disk
20 RW: Currently active disk (0: deselect all disk, 1: select disk #1, ...)
Selecting a disk will automatically unset and hold down "I'm ready to receive" flags of the other disks,
however, the target disk will NOT have its "I'm ready to receive" flag automatically set.
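The three-byte block transfer status packs a 20-bit size together with the hasNext and zero-size flags. A sketch of the packing and unpacking follows. It ASSUMES that the first byte holds the lowest 8 bits of the size, the second byte the next 8 bits, and the low nybble of the third byte the top 4 bits; the function names are hypothetical.

```python
FULL_BLOCK = 1048576


def decode_block_status(b0, b1, b2):
    """Return (block size in bytes, has_next) from the 3 status bytes.

    ASSUMPTION: b0 = size bits 0..7, b1 = bits 8..15, b2 low nybble = bits 16..19.
    """
    raw = b0 | (b1 << 8) | ((b2 & 0x0F) << 16)
    has_next = bool(b2 & 0x80)      # a bit
    if b2 & 0x10:                   # z bit: size really is zero
        size = 0
    elif raw == 0:                  # zero encodes a full 1048576-byte block
        size = FULL_BLOCK
    else:
        size = raw
    return size, has_next


def encode_block_status(size, has_next):
    """Inverse of decode_block_status; returns the bytes as a (b0, b1, b2) tuple."""
    if not 0 <= size <= FULL_BLOCK:
        raise ValueError("size out of range")
    b2 = 0x80 if has_next else 0
    if size == 0:
        return 0, 0, b2 | 0x10
    raw = 0 if size == FULL_BLOCK else size
    return raw & 0xFF, (raw >> 8) & 0xFF, b2 | ((raw >> 16) & 0x0F)
```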
-- SEQUENTIAL IO SUPPORT MODULE --
NOTE: Sequential I/O will clobber the peripheral memory space.
256..257 RW: Sequential I/O control flags
258 RW: Opcode. Writing a value to this memory will execute the operation
0x00 - No operation
0x01 - Skip (arg 1) bytes
0x02 - Read (arg 1) bytes and store to core memory pointer (arg 2)
0x03 - Write (arg 1) bytes using data from the core memory from pointer (arg 2)
0xF0 - Rewind the file to the starting point
0xFF - Terminate sequential I/O session and free up the memory space
259..261 RW: Argument #1
262..264 RW: Argument #2
265..267 RW: Argument #3
268..270 RW: Argument #4
Memory Space
0..1048575 RW: Buffer for the block transfer lane
IMPLEMENTATION RECOMMENDATION: split the memory space into two 512K blocks, and when the sequential
reading reaches the second block, prepare the next bytes in the first block, so that when the read
cursor reaches 1048576, it wraps to 0 and continues reading the content of the disk as if nothing happened.