somewhat working in CoreText, not at all in DirectWrite

minjaesong
2026-03-02 21:02:55 +09:00
parent 73c2b6986d
commit 3e79181aa3
4 changed files with 108 additions and 21 deletions


@@ -118,7 +118,7 @@ print(f"{name}: advance={w}, has_outlines={has_outlines}")
### OpenType features generated (`opentype_features.py`)
- **ccmp** — replacewith expansions (DFLT); consonant-to-PUA mapping + vowel decompositions (dev2)
- **ccmp** — replacewith expansions (DFLT); consonant-to-PUA mapping + vowel decompositions + anusvara upper (dev2); vowel decompositions (tml2)
- **kern** — pair positioning from `keming_machine.py`
- **liga** — Latin ligatures (ff, fi, fl, ffi, ffl, st) and Armenian ligatures
- **locl** — Bulgarian/Serbian Cyrillic alternates
@@ -127,6 +127,7 @@ print(f"{name}: advance={w}, has_outlines={has_outlines}")
- **pres** (sund) — Sundanese diacritic combinations
- **ljmo, vjmo, tjmo** — Hangul jamo positional variants
- **mark** — GPOS mark-to-base diacritics positioning
- **mkmk** — GPOS mark-to-mark diacritics stacking (successive marks shift by H_DIACRITICS)
### Devanagari PUA mapping
@@ -145,3 +146,58 @@ Mapping formula: `to_deva_internal(c)` = `c - 0x0915 + 0xF0140` for U+0915-0939.
### Script tag gotcha
When a script-specific feature exists in GSUB (e.g. `ccmp` under `dev2`), HarfBuzz uses **only** the script-specific lookups and does **not** fall back to the DFLT script's lookups for that feature. Any substitutions needed for a specific script must be registered under that script's tag.
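The rule can be illustrated with a toy model. This is only a sketch of the selection behaviour as described in this section, not HarfBuzz's actual code; the `GSUB_TOY` table, the lookup names, and `lookups_for` are all hypothetical.

```python
# Toy model of the script-tag rule described above (hypothetical data,
# not HarfBuzz's implementation): script tag -> feature tag -> lookup names.
GSUB_TOY = {
    "DFLT": {"ccmp": ["ReplacewithExpansions"]},
    "dev2": {"ccmp": ["DevaConsonantToPUA"]},
}


def lookups_for(table, script, feature):
    """Return the lookups applied for `feature` when shaping `script`.

    Mirrors the description in this section: if the script registers the
    feature itself, only those lookups are used, and DFLT is consulted
    only when the script has no entry for the feature.
    """
    script_features = table.get(script, {})
    if feature in script_features:
        return list(script_features[feature])
    return list(table.get("DFLT", {}).get(feature, []))


# dev2 has its own ccmp, so the DFLT expansions are never applied to it.
print(lookups_for(GSUB_TOY, "dev2", "ccmp"))
# A script with no ccmp of its own still gets the DFLT lookups in this model.
print(lookups_for(GSUB_TOY, "latn", "ccmp"))
```

In this model, the DFLT expansions would have to be listed again under `dev2` before Devanagari text could receive them.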
### languagesystem and language records
The `languagesystem` declarations in the preamble control which script/language records are created in the font tables. Key rules:
- `languagesystem` declarations must be at the **top level** of the feature file, not inside any `feature` block. Putting them inside `feature aalt { }` is invalid feaLib syntax and causes silent compilation failure.
- When a language-specific record exists (e.g. `dev2/MAR` from `languagesystem dev2 MAR;`), features registered under `script dev2;` only populate `dev2/dflt` — they are **not** automatically copied to `dev2/MAR`. The language record inherits only from DFLT, resulting in incomplete feature sets.
- Only declare language-specific records when you have `locl` or other language-differentiated features. Otherwise, use only `languagesystem <script> dflt;` to avoid partial feature inheritance that breaks DirectWrite and CoreText.
### Inspecting feature registration per script
To verify that features are correctly registered under each script:
```python
from fontTools.ttLib import TTFont
font = TTFont('OTFbuild/TerrarumSansBitmap.otf')
gsub = font['GSUB']
for sr in gsub.table.ScriptList.ScriptRecord:
    tag = sr.ScriptTag
    if sr.Script.DefaultLangSys:
        feats = []
        for idx in sr.Script.DefaultLangSys.FeatureIndex:
            fr = gsub.table.FeatureList.FeatureRecord[idx]
            feats.append(fr.FeatureTag)
        print(f"{tag}/dflt: {' '.join(sorted(set(feats)))}")
    for lsr in (sr.Script.LangSysRecord or []):
        feats = []
        for idx in lsr.LangSys.FeatureIndex:
            fr = gsub.table.FeatureList.FeatureRecord[idx]
            feats.append(fr.FeatureTag)
        print(f"{tag}/{lsr.LangSysTag}: {' '.join(sorted(set(feats)))}")
```
Expected output for dev2: `dev2/dflt: abvs akhn blwf blws calt ccmp cjct half liga nukt pres psts rphf`. If language-specific records (e.g. `dev2/MAR`) appear with only `ccmp liga`, the language records have incomplete feature inheritance — remove the corresponding `languagesystem` declaration.
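The comparison against `dflt` can also be automated. The following is a minimal sketch, assuming the per-record feature sets have already been collected into a dict keyed by `script/lang`; the helper name `incomplete_language_records` and the sample data are hypothetical.

```python
def incomplete_language_records(records):
    """Find language records that lack features their script's dflt has.

    `records` maps "script/lang" labels to iterables of feature tags.
    Returns a dict of "script/lang" -> sorted list of missing tags.
    """
    problems = {}
    for key, feats in records.items():
        script, lang = key.split("/")
        if lang == "dflt":
            continue
        default_feats = set(records.get(f"{script}/dflt", ()))
        missing = sorted(default_feats - set(feats))
        if missing:
            problems[key] = missing
    return problems


# Hypothetical sample modelled on the output described above.
sample = {
    "dev2/dflt": ("abvs akhn blwf blws calt ccmp cjct half "
                  "liga nukt pres psts rphf").split(),
    "dev2/MAR": ["ccmp", "liga"],
}
for key, missing in incomplete_language_records(sample).items():
    print(f"{key} is missing: {' '.join(missing)}")
```

Any record this reports is a candidate for having its `languagesystem` declaration removed.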
### Debugging feature compilation failures
The build writes `debugout_features.fea` with the raw feature code before compilation. When compilation fails, inspect this file to find syntax errors. Common issues:
- **`languagesystem` inside a feature block** — must be at the top level
- **Named lookup defined inside a feature block** — applies unconditionally to all input. Define the lookup outside the feature block and reference it via contextual rules inside.
- **Glyph not in font** — a substitution references a glyph name that doesn't exist in the font's glyph order (e.g. a control character was removed)
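The first issue in the list can be caught mechanically before compiling. The following is a rough sketch using only the standard library; `nested_languagesystem_lines` is a hypothetical helper, and it is a simplified line scanner, not a real feature-file parser.

```python
import re


def nested_languagesystem_lines(fea_text):
    """Return 1-based line numbers of `languagesystem` statements that
    start a line while a `{ }` block is still open.

    Simplified scanner: it tracks brace depth line by line and strips
    `#` comments. It does not handle a `languagesystem` that follows an
    opening brace on the same line.
    """
    depth = 0
    hits = []
    for lineno, raw in enumerate(fea_text.splitlines(), start=1):
        line = raw.split("#", 1)[0]
        if depth > 0 and re.match(r"\s*languagesystem\b", line):
            hits.append(lineno)
        depth = max(depth + line.count("{") - line.count("}"), 0)
    return hits


bad = "feature aalt {\nlanguagesystem DFLT dflt;\nlanguagesystem latn dflt;\n} aalt;\n"
good = "languagesystem DFLT dflt;\nfeature liga {\n    sub f i by f_i;\n} liga;\n"
print(nested_languagesystem_lines(bad))
print(nested_languagesystem_lines(good))
```

To check a real build, the text of `debugout_features.fea` can be read and passed to the same function.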
### HarfBuzz Indic shaper (dev2) feature order
Understanding feature application order is critical for Devanagari debugging:
1. **Pre-reordering** (Unicode order): `ccmp`
2. **Reordering**: HarfBuzz reorders pre-base matras (e.g. I-matra U+093F moves before the consonant)
3. **Post-reordering**: `nukt` → `akhn` → `rphf` → `half` → `blwf` → `cjct` → `pres` → `abvs` → `blws` → `psts` → `haln` → `calt`
4. **GPOS**: `kern` → `mark`/`abvm` → `mkmk`
Implication: GSUB rules that need to match pre-base matras adjacent to post-base marks (e.g. anusvara substitution triggered by I-matra) must go in `ccmp`, not `psts`, because reordering separates them.
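The staging can be written down as data to decide where a rule belongs. This is a small sketch that copies the order listed in this section rather than querying HarfBuzz; `DEV2_STAGES`, `stage_of`, and `sees_unicode_order` are hypothetical names.

```python
# Stage order as listed in this section (copied from the notes, not
# obtained from HarfBuzz at runtime).
DEV2_STAGES = [
    ("pre-reordering", ["ccmp"]),
    ("post-reordering", ["nukt", "akhn", "rphf", "half", "blwf", "cjct",
                         "pres", "abvs", "blws", "psts", "haln", "calt"]),
    ("gpos", ["kern", "mark", "abvm", "mkmk"]),
]


def stage_of(feature):
    """Return the stage name in which `feature` is applied, or None."""
    for stage, features in DEV2_STAGES:
        if feature in features:
            return stage
    return None


def sees_unicode_order(feature):
    """True if the feature runs before pre-base matras are reordered."""
    return stage_of(feature) == "pre-reordering"


# A rule that needs I-matra next to a following anusvara must run while
# the glyphs are still in Unicode order.
print(sees_unicode_order("ccmp"))
print(sees_unicode_order("psts"))
```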


@@ -401,11 +401,12 @@ def build_font(assets_dir, output_path, no_bitmap=False, no_features=False):
if fea_code.strip():
print(" Compiling features with feaLib...")
try:
fea_stream = io.StringIO(fea_code)
addOpenTypeFeatures(font, fea_stream)
# Obtain raw .fea text for debugging
with open("debugout_features.fea", "w") as text_file:
text_file.write(fea_code)
fea_stream = io.StringIO(fea_code)
addOpenTypeFeatures(font, fea_stream)
print(" Features compiled successfully")
except Exception as e:
print(f" [WARNING] Feature compilation failed: {e}")


@@ -48,7 +48,7 @@ def generate_features(glyphs, kern_pairs, font_glyph_set,
def has(cp):
return glyph_name(cp) in font_glyph_set
preamble = """feature aalt {
preamble = """\
languagesystem DFLT dflt;
languagesystem latn dflt;
languagesystem cyrl dflt;
@@ -57,13 +57,9 @@ languagesystem hang KOR ;
languagesystem hang KOH ;
languagesystem cyrl SRB ;
languagesystem cyrl BGR ;
languagesystem dev2 MAR ;
languagesystem dev2 NEP ;
languagesystem dev2 SAN ;
languagesystem dev2 SAT ;
languagesystem tml2 TAM ;
languagesystem sund SUN ;
} aalt;
languagesystem dev2 dflt;
languagesystem tml2 dflt;
languagesystem sund dflt;
"""
if preamble:
parts.append(preamble)
@@ -99,7 +95,7 @@ languagesystem sund SUN ;
parts.append(deva_code)
# Tamil features
tamil_code = _generate_tamil(glyphs, has)
tamil_code = _generate_tamil(glyphs, has, replacewith_subs or [])
if tamil_code:
parts.append(tamil_code)
@@ -127,12 +123,26 @@ languagesystem sund SUN ;
def _generate_ccmp(replacewith_subs, has):
"""Generate ccmp feature for replacewith directives (multiple substitution)."""
"""Generate ccmp feature for replacewith directives (multiple substitution).
Devanagari (0x0900-097F) and Tamil (0x0B80-0BFF) source codepoints are
excluded here because their ccmp lookups must live under the script-
specific tags (dev2, tml2). DirectWrite and CoreText do not fall back
from a script-specific ccmp to DFLT.
"""
if not replacewith_subs:
return ""
# Ranges handled by script-specific ccmp features
_SCRIPT_RANGES = (
range(0x0900, 0x0980), # Devanagari → dev2 ccmp
range(0x0B80, 0x0C00), # Tamil → tml2 ccmp
)
subs = []
for src_cp, target_cps in replacewith_subs:
if any(src_cp in r for r in _SCRIPT_RANGES):
continue
if not has(src_cp):
continue
if not all(has(t) for t in target_cps):
@@ -1399,8 +1409,28 @@ def _generate_psts_open_ya(glyphs, has):
return lookups, body
def _generate_tamil(glyphs, has):
"""Generate Tamil GSUB features."""
def _generate_tamil(glyphs, has, replacewith_subs=None):
"""Generate Tamil GSUB features (ccmp + pres under tml2)."""
features = []
# --- tml2 ccmp: Tamil replacewith decompositions ---
# Must be under tml2 so DirectWrite/CoreText see them.
if replacewith_subs:
tamil_ccmp = []
for src_cp, target_cps in replacewith_subs:
if not (0x0B80 <= src_cp <= 0x0BFF):
continue
if not has(src_cp) or not all(has(t) for t in target_cps):
continue
src = glyph_name(src_cp)
targets = ' '.join(glyph_name(t) for t in target_cps)
tamil_ccmp.append(f" sub {src} by {targets};")
if tamil_ccmp:
features.append("feature ccmp {\n script tml2;\n"
" lookup TamilDecomp {\n"
+ '\n'.join(tamil_ccmp)
+ "\n } TamilDecomp;\n} ccmp;")
subs = []
_tamil_i_rules = [
@@ -1436,13 +1466,13 @@ def _generate_tamil(glyphs, has):
if has(0x0BB8) and has(0x0BCD) and has(0x0BB0) and has(0x0BC0) and has(SC.TAMIL_SHRII):
subs.append(f" sub {glyph_name(0x0BB8)} {glyph_name(0x0BCD)} {glyph_name(0x0BB0)} {glyph_name(0x0BC0)} by {glyph_name(SC.TAMIL_SHRII)}; # SHRII (sa)")
if not subs:
return ""
if subs:
lines = ["feature pres {", " script tml2;"]
lines.extend(subs)
lines.append("} pres;")
features.append('\n'.join(lines))
lines = ["feature pres {", " script tml2;"]
lines.extend(subs)
lines.append("} pres;")
return '\n'.join(lines)
return '\n\n'.join(features) if features else ""
def _generate_sundanese(glyphs, has):

demo.PNG
Binary file not shown (167 KiB before, 167 KiB after).