Terrarum-sans-bitmap/OTFbuild/CLAUDE.md
minjaesong a567b9f7fc CLAUDE.md
2026-02-25 03:13:06 +09:00

OTFbuild

Python toolchain that builds an OpenType (CFF) font from the TGA sprite sheets used by the bitmap font engine.

Building

pip install fonttools
python3 OTFbuild/build_font.py src/assets -o OTFbuild/TerrarumSansBitmap.otf

Options:

  • --no-bitmap — skip EBDT/EBLC bitmap strike (faster builds for iteration)
  • --no-features — skip GSUB/GPOS OpenType features

Debugging with HarfBuzz

Install uharfbuzz for shaping tests:

pip install uharfbuzz

Shape text and inspect glyph substitutions, advances, and positioning:

import uharfbuzz as hb
from fontTools.ttLib import TTFont

with open('OTFbuild/TerrarumSansBitmap.otf', 'rb') as f:
    font_data = f.read()

blob = hb.Blob(font_data)
face = hb.Face(blob)
font = hb.Font(face)

text = "ऐतिहासिक"
buf = hb.Buffer()
buf.add_str(text)
buf.guess_segment_properties()
hb.shape(font, buf)

ttfont = TTFont('OTFbuild/TerrarumSansBitmap.otf')
glyph_order = ttfont.getGlyphOrder()

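# After shaping, info.codepoint holds a glyph ID (not a Unicode code point),
# so it can be used to index the font's glyph order
for info, pos in zip(buf.glyph_infos, buf.glyph_positions):
    name = glyph_order[info.codepoint]
    print(f"  {name} advance=({pos.x_advance},{pos.y_advance}) cluster={info.cluster}")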

Key things to check:

  • advance=(0,0) on a visible character means the glyph is zero-width (likely missing outline or failed GSUB substitution)
  • glyph name starts with uF0 means GSUB substituted to an internal PUA form (expected for Devanagari consonants, Hangul jamo variants, etc.)
  • cluster groups glyphs that originated from the same input character(s)
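
The checks above can be automated once the shaped output has been collected into plain tuples. The following is a minimal sketch, not part of the toolchain: flag_shaping_issues is a hypothetical helper, and the advance values in the usage example are made up. A zero advance is only suspicious on a visible base character, so the caller still has to decide which flagged entries matter.

```python
def flag_shaping_issues(shaped):
    """Flag zero-advance glyphs and internal PUA forms in shaped output.

    shaped: iterable of (glyph_name, x_advance, y_advance, cluster) tuples,
    e.g. collected from buf.glyph_infos and buf.glyph_positions.
    Returns a list of (cluster, glyph_name, notes) for flagged glyphs only.
    """
    report = []
    for name, x_advance, y_advance, cluster in shaped:
        notes = []
        if x_advance == 0 and y_advance == 0:
            notes.append("zero advance")
        if name.startswith("uF0"):
            notes.append("internal PUA form")
        if notes:
            report.append((cluster, name, notes))
    return report


# Hypothetical shaped output; the advances are illustrative only
sample = [
    ("uF0140", 600, 0, 0),
    ("uni093F", 0, 0, 0),
    ("uni0041", 500, 0, 1),
]
for cluster, name, notes in flag_shaping_issues(sample):
    print(f"  cluster {cluster}: {name}: {', '.join(notes)}")
```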

Inspecting GSUB tables

from fontTools.ttLib import TTFont

font = TTFont('OTFbuild/TerrarumSansBitmap.otf')
gsub = font['GSUB']

# List scripts and their features
for sr in gsub.table.ScriptList.ScriptRecord:
    tag = sr.ScriptTag
    if sr.Script.DefaultLangSys:
        for idx in sr.Script.DefaultLangSys.FeatureIndex:
            fr = gsub.table.FeatureList.FeatureRecord[idx]
            print(f"  {tag}/{fr.FeatureTag}: lookups={fr.Feature.LookupListIndex}")

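# Inspect a specific lookup's substitution mappings
# (18 is an example index, e.g. DevaConsonantMap)
lookup = gsub.table.LookupList.Lookup[18]
for st in lookup.SubTable:
    # Extension lookups wrap the real subtable in ExtSubTable
    st = getattr(st, 'ExtSubTable', st)
    # Single and multiple substitutions expose .mapping; other lookup
    # types (e.g. ligatures) use different attributes and are skipped here
    for src, dst in getattr(st, 'mapping', {}).items():
        print(f"  {src} -> {dst}")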

Checking glyph outlines and metrics

from fontTools.ttLib import TTFont

font = TTFont('OTFbuild/TerrarumSansBitmap.otf')
hmtx = font['hmtx']
cff = font['CFF ']

name = 'uni0915'  # Devanagari KA
w, lsb = hmtx[name]
cs = cff.cff.topDictIndex[0].CharStrings[name]
cs.decompile()
has_outlines = len(cs.program) > 2  # more than just width + endchar
print(f"{name}: advance={w}, has_outlines={has_outlines}")

Architecture

Build pipeline (font_builder.py)

  1. Parse sheets — glyph_parser.py reads each TGA sprite sheet, extracts per-glyph bitmaps and tag-column metadata (width, alignment, diacritics anchors, kerning data, directives)
  2. Compose Hangul — hangul.py assembles 11,172 precomposed Hangul syllables from jamo components and stores jamo variants in PUA for GSUB
  3. Populate Devanagari — consonants U+0915-0939 have width=0 in the sprite sheet (the Kotlin engine normalises them to PUA forms); the builder copies PUA glyph data back to the Unicode positions so they render without GSUB
  4. Expand replacewith — glyphs with the replacewith directive (opcode 0x80-0x87) are collected for GSUB multiple substitution (e.g. U+0910 -> U+090F U+0947)
  5. Build glyph order and cmap — PUA internal forms (0xF0000-0xF0FFF) get glyphs but no cmap entries
  6. Trace bitmaps — bitmap_tracer.py converts 1-bit bitmaps to CFF rectangle contours (50 units/pixel)
  7. Set metrics — hmtx, hhea, OS/2, head, name, post tables
  8. OpenType features — opentype_features.py generates feaLib code, compiled via fontTools.feaLib
  9. Bitmap strike — optional EBDT/EBLC at 20ppem via TTX import
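
The tracing step (step 6) can be illustrated with a small sketch. This is not the code in bitmap_tracer.py, whose merging strategy is not described here; it is one simple way to cover a 1-bit bitmap with rectangles at 50 units per pixel. The function names are hypothetical, and placing the bottom row at y=0 is an assumption (a real builder would also account for the baseline).

```python
UNITS_PER_PIXEL = 50  # scale stated in the pipeline description


def find_runs(row):
    """Return (start, end) column spans of consecutive set pixels in one row."""
    runs, start = [], None
    for col, pixel in enumerate(row):
        if pixel and start is None:
            start = col
        elif not pixel and start is not None:
            runs.append((start, col))
            start = None
    if start is not None:
        runs.append((start, len(row)))
    return runs


def trace_rectangles(bitmap, units=UNITS_PER_PIXEL):
    """Cover the set pixels with (x0, y0, x1, y1) rectangles in font units.

    bitmap: list of rows, top row first, each a list of 0/1 values.
    A horizontal run is merged with the rectangle directly above it when
    both span exactly the same columns. Assumption: bottom row sits at y=0.
    """
    height = len(bitmap)
    open_rects = {}  # column span -> [first_row, end_row (exclusive)]
    closed = []
    for row_idx, row in enumerate(bitmap):
        runs = set(find_runs(row))
        # Rectangles whose span does not continue on this row are finished
        for span in [s for s in open_rects if s not in runs]:
            closed.append((span, open_rects.pop(span)))
        for span in runs:
            if span in open_rects:
                open_rects[span][1] = row_idx + 1  # extend downwards
            else:
                open_rects[span] = [row_idx, row_idx + 1]
    closed.extend(open_rects.items())
    # Scale to font units and flip y (bitmap rows grow down, font y grows up)
    return sorted(
        (c0 * units, (height - r1) * units, c1 * units, (height - r0) * units)
        for (c0, c1), (r0, r1) in closed
    )


glyph = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
]
print(trace_rectangles(glyph))
```

Merging identical runs across rows keeps the contour count down compared with emitting one square per pixel, which matters for a font with many thousands of glyphs.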

Module overview

| Module | Purpose |
|---|---|
| build_font.py | CLI entry point |
| font_builder.py | Orchestrates the build pipeline |
| sheet_config.py | Sheet indices, code ranges, index functions, metric constants, Hangul/Devanagari/Tamil/Sundanese constants |
| glyph_parser.py | TGA sprite sheet parsing; extracts bitmaps and tag-column properties |
| tga_reader.py | Low-level TGA image reader |
| bitmap_tracer.py | Converts 1-bit bitmaps to CFF outlines (rectangle merging) |
| opentype_features.py | Generates GSUB/GPOS feature code for feaLib |
| keming_machine.py | Generates kerning pairs from glyph kern masks |
| hangul.py | Hangul syllable composition and jamo GSUB data |
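
For orientation, the 11,172 syllables that hangul.py composes follow the standard Unicode arithmetic: 19 initial consonants, 21 vowels and 28 finals (including "no final"), starting at U+AC00. The sketch below shows only that index arithmetic. The function names are hypothetical, and how hangul.py picks the jamo variant bitmaps for each combination is not covered.

```python
HANGUL_BASE = 0xAC00
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28  # initials, vowels, finals (0 = none)
SYLLABLE_COUNT = L_COUNT * V_COUNT * T_COUNT  # 11,172


def decompose_syllable(cp):
    """Split a precomposed syllable code point into (initial, vowel, final) indices."""
    index = cp - HANGUL_BASE
    if not 0 <= index < SYLLABLE_COUNT:
        raise ValueError(f"U+{cp:04X} is not a precomposed Hangul syllable")
    initial, rest = divmod(index, V_COUNT * T_COUNT)
    vowel, final = divmod(rest, T_COUNT)
    return initial, vowel, final


def compose_syllable(initial, vowel, final=0):
    """Inverse of decompose_syllable."""
    return HANGUL_BASE + (initial * V_COUNT + vowel) * T_COUNT + final


print(decompose_syllable(ord("한")))
print(chr(compose_syllable(18, 0, 4)))
```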

OpenType features generated (opentype_features.py)

  • ccmp — replacewith expansions (DFLT); consonant-to-PUA mapping + vowel decompositions (dev2)
  • kern — pair positioning from keming_machine.py
  • liga — Latin ligatures (ff, fi, fl, ffi, ffl, st) and Armenian ligatures
  • locl — Bulgarian/Serbian Cyrillic alternates
  • nukt, akhn, half, vatu, pres, blws, rphf — Devanagari complex script shaping (all under script dev2)
  • pres (tml2) — Tamil consonant+vowel ligatures
  • pres (sund) — Sundanese diacritic combinations
  • ljmo, vjmo, tjmo — Hangul jamo positional variants
  • mark — GPOS mark-to-base diacritics positioning
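
As an illustration of what "generates feaLib code" means for ccmp, the sketch below turns a replacewith-style mapping into feature-file text. It is not the code in opentype_features.py: ccmp_feature and glyph_name are hypothetical names, and the glyph naming (uniXXXX in the BMP, uXXXXX above it) is an assumption based on the names shown elsewhere in this document.

```python
def glyph_name(cp):
    """Assumed naming scheme: uni0915 for BMP code points, uF0140 above the BMP."""
    return f"uni{cp:04X}" if cp <= 0xFFFF else f"u{cp:X}"


def ccmp_feature(expansions, script):
    """Build a ccmp feature block with one multiple substitution per entry.

    expansions: {source code point: [replacement code points]}
    script: OpenType script tag the rules are registered under.
    """
    lines = ["feature ccmp {", f"    script {script};"]
    for src, parts in sorted(expansions.items()):
        targets = " ".join(glyph_name(p) for p in parts)
        lines.append(f"    sub {glyph_name(src)} by {targets};")
    lines.append("} ccmp;")
    return "\n".join(lines)


# U+0910 -> U+090F U+0947, the example from the build pipeline
print(ccmp_feature({0x0910: [0x090F, 0x0947]}, script="dev2"))
```

Text of this shape can be compiled into a font with fontTools.feaLib.builder.addOpenTypeFeaturesFromString, provided every glyph it names exists in the font.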

Devanagari PUA mapping

The bitmap font engine normalises Devanagari consonants to internal PUA forms before rendering. The OTF builder mirrors this:

| Unicode range | PUA range | Purpose |
|---|---|---|
| U+0915-0939 | 0xF0140-0xF0164 | Base consonants |
| U+0915-0939 +48 | 0xF0170-0xF0194 | Nukta forms (consonant + U+093C) |
| U+0915-0939 +240 | 0xF0230-0xF0254 | Half forms (consonant + virama) |
| U+0915-0939 +480 | 0xF0320-0xF0404 | RA-appended forms (consonant + virama + RA) |

Mapping formula: to_deva_internal(c) = c - 0x0915 + 0xF0140 for U+0915-0939.
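
The formula and the offsets from the table translate directly into code. This is a sketch only: the form parameter and the FORM_OFFSETS table are illustrative names, not taken from sheet_config.py, and the offsets are the +48, +240 and +480 listed above.

```python
DEVA_FIRST, DEVA_LAST = 0x0915, 0x0939  # Devanagari KA..HA
PUA_BASE = 0xF0140

# Offsets from the table above; the key names are illustrative
FORM_OFFSETS = {"base": 0, "nukta": 48, "half": 240, "ra_appended": 480}


def to_deva_internal(cp, form="base"):
    """Map a Devanagari consonant to its internal PUA code point."""
    if not DEVA_FIRST <= cp <= DEVA_LAST:
        raise ValueError(f"U+{cp:04X} is outside U+0915-0939")
    return cp - DEVA_FIRST + PUA_BASE + FORM_OFFSETS[form]


for form in FORM_OFFSETS:
    print(f"KA {form}: 0x{to_deva_internal(0x0915, form):05X}")
```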

Script tag gotcha

When a script-specific feature exists in GSUB (e.g. ccmp under dev2), HarfBuzz uses only the script-specific lookups and does not fall back to the DFLT script's lookups for that feature. Any substitutions needed for a specific script must be registered under that script's tag.
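
A quick way to spot this problem is to compare the lookups registered under DFLT with those under a specific script, for each feature that both define. The helper below is a hypothetical sketch that applies the rule as stated above to a plain dictionary, which could be filled from the script and feature listing shown in "Inspecting GSUB tables". The lookup indices in the example are made up.

```python
def shadowed_default_lookups(features_by_script, script):
    """Find DFLT lookups that a script-specific feature does not repeat.

    features_by_script: {script tag: {feature tag: [lookup indices]}}
    Returns {feature tag: [DFLT lookup indices missing under `script`]}.
    """
    default = features_by_script.get("DFLT", {})
    specific = features_by_script.get(script, {})
    missing = {}
    for feature, lookups in default.items():
        if feature in specific:
            lost = [i for i in lookups if i not in specific[feature]]
            if lost:
                missing[feature] = lost
    return missing


# Illustrative data only; real indices come from the compiled font
features = {
    "DFLT": {"ccmp": [0, 1], "liga": [5]},
    "dev2": {"ccmp": [1, 18]},
}
print(shadowed_default_lookups(features, "dev2"))
```

Any lookup reported here would need to be registered under the script's own tag as well for text in that script to receive it.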