47 Commits

| Author | SHA1 | Message | Date |
|--------|------|---------|------|
| minjaesong | af334ad20b | Enclosed alphanumerics (issue #18) | 2026-05-17 17:21:39 +09:00 |
| minjaesong | 55ad8ee943 | minor fixes | 2026-05-06 22:13:27 +09:00 |
| minjaesong | e9fcf6bbce | Wenquanyi error fix | 2026-04-26 00:38:58 +09:00 |
| minjaesong | 45d5b758e3 | emoji shiftdown as they should | 2026-04-14 23:14:38 +09:00 |
| minjaesong | d3ae868723 | emoji wip | 2026-03-28 17:36:01 +09:00 |
| minjaesong | 372ae9b354 | support for Miscellaneous Technical | 2026-03-27 21:16:06 +09:00 |
| minjaesong | 45027be83c | Miscellaneous Technical wip | 2026-03-27 01:31:19 +09:00 |
| minjaesong | 6bc365fc57 | maths ops | 2026-03-21 22:15:39 +09:00 |
| minjaesong | 69f868c3e8 | fix for github issue #15 | 2026-03-20 22:57:24 +09:00 |
| minjaesong | a1147c8611 | update to Unicode 18 | 2026-03-20 20:12:04 +09:00 |
| minjaesong | 4bd8febfdf | fix: some old hangul getting wrong component | 2026-03-20 18:48:50 +09:00 |
| minjaesong | 04fbddb200 | OTF hangul base addr remap | 2026-03-20 18:30:57 +09:00 |
| minjaesong | fb935ab28f | cyrillic ext d | 2026-03-19 16:18:21 +09:00 |
| minjaesong | 6c65cfc0a1 | demo text update | 2026-03-18 00:44:22 +09:00 |
| minjaesong | c5c2ae4022 | full support of CJK Unified Ideographs and ExtA | 2026-03-16 22:33:44 +09:00 |
| minjaesong | 2279873222 | wenquanyi update | 2026-03-16 16:09:16 +09:00 |
| minjaesong | 4dbcbb7df5 | working on Han char sheet | 2026-03-15 22:11:24 +09:00 |
| minjaesong | ad4a044ace | ascii fractions to use char decompo | 2026-03-15 01:18:43 +09:00 |
| minjaesong | 9ca8117b5f | coptic | 2026-03-15 01:00:23 +09:00 |
| minjaesong | 76f223aee8 | ogham | 2026-03-14 19:32:15 +09:00 |
| minjaesong | 662dc5b093 | glyph texture atlas (2) | 2026-03-14 16:06:31 +09:00 |
| minjaesong | f4e1db5846 | glyph texture atlas | 2026-03-14 14:24:23 +09:00 |
| minjaesong | 1c7471ccf3 | added missing glyphs in Greek and Coptic block | 2026-03-14 11:14:18 +09:00 |
| minjaesong | 5fca96a861 | fix: number forms using wrong slash | 2026-03-13 21:17:25 +09:00 |
| minjaesong | 539a2c9f46 | autokem: more filtering | 2026-03-13 20:08:56 +09:00 |
| CuriousTorvald | d57707b210 | Add funding options for GitHub and PayPal. Updated funding options to include GitHub Sponsors and a PayPal link. | 2026-03-13 19:16:55 +09:00 |
| minjaesong | 7c4ab9d4be | reshaping Tironian et | 2026-03-13 19:14:58 +09:00 |
| minjaesong | 175fe4edfb | demo text update | 2026-03-13 14:13:28 +09:00 |
| minjaesong | 4d7aa79740 | fixed some mislabeling | 2026-03-13 13:59:58 +09:00 |
| minjaesong | 9d9efce9d4 | Latin Ext F and G | 2026-03-13 13:29:43 +09:00 |
| minjaesong | 8daa968d80 | cyrillic extb relabelling | 2026-03-11 09:11:28 +09:00 |
| minjaesong | 199519fe1c | latin ext e | 2026-03-10 22:51:44 +09:00 |
| minjaesong | 194cda93fe | latin ext-e | 2026-03-10 04:02:03 +09:00 |
| minjaesong | 71370a82a7 | revised autokem model | 2026-03-09 23:54:51 +09:00 |
| minjaesong | 268610a8b3 | revised autokem model | 2026-03-09 23:46:28 +09:00 |
| minjaesong | 244371aa9d | revised autokem model | 2026-03-08 20:34:45 +09:00 |
| minjaesong | 39603d897b | more fixes | 2026-03-08 03:12:40 +09:00 |
| minjaesong | 2b1f9a866f | more fixes | 2026-03-08 02:38:03 +09:00 |
| minjaesong | af1d720ec2 | anchor fixes | 2026-03-08 01:44:17 +09:00 |
| minjaesong | 8a52fcfb91 | tfw part of your code was full of hacks | 2026-03-08 00:32:42 +09:00 |
| minjaesong | b82dbecc30 | keming machine dot removal directive for OTF | 2026-03-07 22:47:01 +09:00 |
| minjaesong | 2008bbf6dd | keming machine dot removal directive | 2026-03-07 22:41:24 +09:00 |
| minjaesong | 163e3d7b3e | Arrow and Number Forms unicode block | 2026-03-07 22:13:22 +09:00 |
| minjaesong | f26998b641 | Cyrillic Ext-C | 2026-03-07 21:34:40 +09:00 |
| minjaesong | 4121648dfc | minor fixes (2) | 2026-03-07 21:08:44 +09:00 |
| minjaesong | 30de279a08 | minor fixes | 2026-03-07 21:03:10 +09:00 |
| minjaesong | 956599b83f | Full Cyrillic, CyrlExtA/B support (enables church slavonic) | 2026-03-07 20:55:47 +09:00 |
105 changed files with 2733 additions and 545 deletions


@@ -0,0 +1,30 @@
{
  "permissions": {
    "allow": [
      "WebFetch(domain:github.com)",
      "WebSearch",
      "Bash(head:*)",
      "WebFetch(domain:gitlab.com)",
      "Bash(java:*)",
      "Bash(ls:*)",
      "Bash(jar tf:*)",
      "Bash(chmod +x:*)",
      "WebFetch(domain:fontforge.org)",
      "WebFetch(domain:fonttools.readthedocs.io)",
      "Bash(grep:*)",
      "Bash(tail:*)",
      "Bash(python3:*)",
      "Bash(make:*)",
      "Bash(git stash:*)",
      "Bash(./autokem*)",
      "Bash(cmp:*)",
      "Bash(wc:*)",
      "Bash(pip3:*)",
      "Bash(pip install:*)",
      "Bash(.venv/bin/python3:*)",
      "Bash(.venv/bin/python:*)",
      "Bash(find:*)",
      "Skill(update-config)"
    ]
  }
}


@@ -0,0 +1,93 @@
---
name: add-unicode-block
description: Add a new Unicode script/block to the Terrarum Sans Bitmap font engine.
---
# Add Unicode Block
## Required inputs
The user must supply:
- **Script name** — human-readable name used in constant/function names (e.g. `Ogham`, `LatinExtE`)
- **TGA filename** — the sprite sheet filename without path (e.g. `ogham_variable.tga`)
- **Unicode range** — start and end codepoints inclusive (e.g. `U+1680..U+169F`)
If any of these are missing, ask for them before proceeding. Extra directions can be given after the Unicode range.
## Step 1 — Determine the next sheet index
Read the sheet index constants from both files to find the current highest index (excluding `SHEET_UNKNOWN = 254`):
- Kotlin: `src/net/torvald/terrarumsansbitmap/gdx/TerrarumSansBitmap.kt` — grep for `internal const val SHEET_`
- Python: `OTFbuild/sheet_config.py` — grep for `^SHEET_`
The new index = highest existing index + 1.
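The rule above can be sketched in a few lines of Python. This is only an illustration: it assumes the constants appear one per line as `SHEET_<NAME> = <n>`, and the helper name `next_sheet_index`, the sample text, and the index values in it are hypothetical rather than taken from the repository.

```python
import re


def next_sheet_index(source_text, unknown_index=254):
    """Return the highest SHEET_* index plus one, ignoring the SHEET_UNKNOWN sentinel."""
    pattern = re.compile(r"^SHEET_\w+\s*=\s*(\d+)", re.MULTILINE)
    indices = [int(m.group(1)) for m in pattern.finditer(source_text)]
    # SHEET_UNKNOWN = 254 is a sentinel, not a real sheet, so it is excluded.
    indices = [i for i in indices if i != unknown_index]
    if not indices:
        raise ValueError("no SHEET_* constants found")
    return max(indices) + 1


# Made-up excerpt in the style of OTFbuild/sheet_config.py (values are illustrative).
sample = """\
SHEET_ASCII_VARW = 0
SHEET_HANGUL = 1
SHEET_OGHAM_VARW = 2
SHEET_UNKNOWN = 254
"""

print(next_sheet_index(sample))
```

The same scan would have to be run against the Kotlin file as well (with a pattern matching `internal const val SHEET_`), since the two files must agree on the index.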
## Step 2 — Derive identifiers
From the script name, derive:
- **Kotlin constant**: `SHEET_<UPPER_SNAKE>_VARW` (e.g. `SHEET_OGHAM_VARW`)
- **Kotlin indexY function**: `<camelCase>IndexY` (e.g. `oghamIndexY`)
- **Python constant**: same as Kotlin constant
- **Range start hex**: the lower bound codepoint as a `0x`-prefixed Kotlin/Python literal
## Step 3 — Edit both files
Make all 9 edits, (a) through (i). Read each section before editing.
### Kotlin: `src/net/torvald/terrarumsansbitmap/gdx/TerrarumSansBitmap.kt`
**a) Sheet index constant** — find the block of `internal const val SHEET_*` constants (just before `SHEET_UNKNOWN = 254`) and append:
```kotlin
internal const val SHEET_<NAME>_VARW = <INDEX>
```
**b) fileList entry** — find `internal val fileList` array and append before the closing `)`:
```kotlin
"<tga_filename>",
```
**c) codeRange entry** — find `internal val codeRange` array and append before the closing `)`:
```kotlin
0x<START>..0x<END>, // SHEET_<NAME>_VARW
```
Use `+` to combine non-contiguous ranges if needed.
**d) getSheetwisePosition when-branch** — find the `when` block that dispatches to indexY functions (just before `else -> ch / 16`) and append:
```kotlin
SHEET_<NAME>_VARW -> <camelCase>IndexY(ch)
```
**e) indexY function** — find the block of private `*IndexY` functions near the bottom of the companion object and append:
```kotlin
private fun <camelCase>IndexY(c: CodePoint) = (c - 0x<START>) / 16
```
### Python: `OTFbuild/sheet_config.py`
**f) Sheet index constant** — find the block of `SHEET_* = <n>` constants (just before `SHEET_UNKNOWN = 254`) and append:
```python
SHEET_<NAME>_VARW = <INDEX>
```
**g) FILE_LIST entry** — find `FILE_LIST = [` array and append before the closing `]`:
```python
"<tga_filename>",
```
**h) CODE_RANGE entry** — find `CODE_RANGE = [` array and append before the closing `]`:
```python
list(range(0x<START>, 0x<END+1>)), # <INDEX>: <ScriptName>
```
**i) index_y lambda** — find the dict in `get_index_y(sheet_index, c)` (just before `SHEET_HANGUL: lambda: 0`) and append:
```python
SHEET_<NAME>_VARW: lambda: (c - 0x<START>) // 16,
```
## Step 4 — Verify
After all edits, confirm:
1. The Kotlin constant, fileList, codeRange, when-branch, and indexY function are all present and consistent.
2. The Python constant, FILE_LIST, CODE_RANGE, and index_y lambda are all present and consistent.
3. The indices in both files match.
4. The range end in `CODE_RANGE` is `end + 1`, because Python's `range` excludes its stop value.
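As a worked example of checks 3 and 4, here is a small Python sketch using the Ogham range quoted in the required inputs. The helper names `code_range` and `index_y` are hypothetical stand-ins for the `CODE_RANGE` entry and the `index_y` lambda; they are not functions from the repository.

```python
def code_range(start, end_inclusive):
    """Build the list of codepoints; range() excludes its stop, hence the + 1."""
    return list(range(start, end_inclusive + 1))


def index_y(c, start):
    """Row of codepoint c in a sheet laid out 16 glyphs per row."""
    return (c - start) // 16


OGHAM_START, OGHAM_END = 0x1680, 0x169F

ogham = code_range(OGHAM_START, OGHAM_END)
print(len(ogham), hex(ogham[0]), hex(ogham[-1]))
print(index_y(0x1680, OGHAM_START), index_y(0x1690, OGHAM_START))
```

Forgetting the `+ 1` would silently drop the last codepoint of the block, which is why the verification step calls it out separately.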

.github/FUNDING.yml (vendored, normal file; 2 lines added)

@@ -0,0 +1,2 @@
github: [curioustorvald]
custom: ["https://paypal.me/curioustorvald"]


@@ -25,20 +25,32 @@ make clean
 - `apply` creates `.bak` backup, runs inference per cell, writes Y+5 (lowheight) and Y+6 (kern data) pixels. Skips cells with width=0, writeOnTop, or compiler directives
 - Model file `autokem.safetensors` must be in the working directory
+### PyTorch training (faster prototyping)
+```bash
+cd Autokem
+.venv/bin/python train_torch.py # train with defaults
+.venv/bin/python train_torch.py --epochs 300 # override max epochs
+.venv/bin/python train_torch.py --lr 0.0005 # override learning rate
+.venv/bin/python train_torch.py --load model.safetensors # resume from weights
+```
+- Drop-in replacement for `./autokem train` — reads the same sheets, produces the same safetensors format
+- The exported `autokem.safetensors` is directly loadable by the C inference code (`./autokem apply`)
+- Requires: `pip install torch numpy` (venv at `.venv/`)
 ## Architecture
 ### Neural network
 ```
 Input: 15x20x1 binary (300 values, alpha >= 0x80 → 1.0)
-Conv2D(1→12, 3x3, same) → LeakyReLU(0.01)
-Conv2D(12→16, 3x3, same) → LeakyReLU(0.01)
-Flatten → 4800
-Dense(4800→24) → LeakyReLU(0.01)
-├── Dense(24→10) → sigmoid (shape bits A-H, J, K)
-├── Dense(24→1) → sigmoid (Y-type)
-└── Dense(24→1) → sigmoid (lowheight)
-Total: ~117,388 params (~460 KB float32)
+Conv2D(1→32, 7x7, pad=1) → SiLU
+Conv2D(32→64, 7x7, pad=1) → SiLU
+Global Average Pool → [batch, 64]
+Dense(64→256) → SiLU
+Dense(256→12) → sigmoid (10 shape bits + 1 ytype + 1 lowheight)
+Total: ~121,740 params (~476 KB float32)
 ```
 Training: Adam (lr=0.001, beta1=0.9, beta2=0.999), BCE loss, batch size 32, early stopping patience 10.
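The parameter totals quoted for the old and new architectures can be checked by hand. The sketch below assumes every Conv2D and Dense layer carries a bias vector, which is consistent with the 8 named tensors (4 layers, weight plus bias) mentioned for the new model; the helper names are illustrative only.

```python
def conv2d_params(in_ch, out_ch, kh, kw):
    """Weights (out_ch * in_ch * kh * kw) plus one bias per output channel."""
    return out_ch * in_ch * kh * kw + out_ch


def dense_params(n_in, n_out):
    """Weights (n_in * n_out) plus one bias per output unit."""
    return n_in * n_out + n_out


# New model: two 7x7 convolutions, global average pool, two dense layers.
new_total = (
    conv2d_params(1, 32, 7, 7)
    + conv2d_params(32, 64, 7, 7)
    + dense_params(64, 256)
    + dense_params(256, 12)
)

# Old model: two 3x3 convolutions, flatten to 4800, one dense layer, three heads.
old_total = (
    conv2d_params(1, 12, 3, 3)
    + conv2d_params(12, 16, 3, 3)
    + dense_params(4800, 24)
    + dense_params(24, 10)
    + 2 * dense_params(24, 1)
)

print(new_total, old_total)
```

Most of the old model's weights sat in the `Dense(4800→24)` layer; replacing the flatten with global average pooling moves the bulk of the parameters into the second convolution instead.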
@@ -49,8 +61,9 @@ Training: Adam (lr=0.001, beta1=0.9, beta2=0.999), BCE loss, batch size 32, earl
 |------|---------|
 | `main.c` | CLI dispatch |
 | `tga.h/tga.c` | TGA reader/writer — BGRA↔RGBA8888, row-order handling, per-pixel write-in-place |
-| `nn.h/nn.c` | Tensor, Conv2D (same padding), Dense, LeakyReLU, sigmoid, Adam, He init |
-| `safetensor.h/safetensor.c` | `.safetensors` serialisation — 12 named tensors + JSON metadata |
+| `nn.h/nn.c` | Tensor, Conv2D (configurable padding), Dense, SiLU, sigmoid, global avg pool, Adam, He init |
+| `safetensor.h/safetensor.c` | `.safetensors` serialisation — 8 named tensors + JSON metadata |
+| `train_torch.py` | PyTorch training script — same data pipeline and architecture, exports C-compatible safetensors |
 | `train.h/train.c` | Data collection from sheets, training loop, validation, label distribution |
 | `apply.h/apply.c` | Backup, eligibility checks, inference, pixel composition |


@@ -2,6 +2,7 @@
 #include "tga.h"
 #include "nn.h"
 #include "safetensor.h"
+#include "unicode_filter.h"
 #include <stdio.h>
 #include <stdlib.h>
@@ -75,7 +76,8 @@ int apply_model(const char *tga_path) {
     int rows = img->height / cell_h;
     int total_cells = cols * rows;
-    int processed = 0, updated = 0, skipped = 0;
+    int start_code = sheet_start_code(basename);
+    int processed = 0, updated = 0, skipped = 0, fixed_lm = 0;
     for (int index = 0; index < total_cells; index++) {
         int cell_x, cell_y;
@@ -107,6 +109,21 @@ int apply_model(const char *tga_path) {
         int opcode = (int)((dir_pixel >> 24) & 0xFF);
         if (opcode != 0) { skipped++; continue; }
+        /* Modifier letters: fixed kern pixel, skip inference */
+        if (start_code >= 0 && is_modifier_letter(start_code + index)) {
+            if (is_subscript_modifier(start_code + index)) {
+                /* Subscript: CDEFGHJK(B), lowheight=1 */
+                tga_write_pixel(tga_path, img, tag_x, tag_y + 5, 0xFFFFFFFF);
+                tga_write_pixel(tga_path, img, tag_x, tag_y + 6, 0x00C03FFF);
+            } else {
+                /* Superscript: ABCDEF(B), lowheight=0 */
+                tga_write_pixel(tga_path, img, tag_x, tag_y + 5, 0x00000000);
+                tga_write_pixel(tga_path, img, tag_x, tag_y + 6, 0x0000FCFF);
+            }
+            processed++; updated++; fixed_lm++;
+            continue;
+        }
         /* Extract 15x20 binary input */
         float input[300];
         for (int gy = 0; gy < 20; gy++) {
@@ -155,8 +172,8 @@ int apply_model(const char *tga_path) {
         updated++;
     }
-    printf("Processed: %d cells, Updated: %d, Skipped: %d (of %d total)\n",
-           processed, updated, skipped, total_cells);
+    printf("Processed: %d cells, Updated: %d, Skipped: %d, Fixed Lm: %d (of %d total)\n",
+           processed, updated, skipped, fixed_lm, total_cells);
     tga_free(img);
     network_free(net);

Binary file not shown.

Autokem/eval.sh (executable file; 68 lines added)

@@ -0,0 +1,68 @@
#!/usr/bin/env bash
# Run train_torch.py N times and report mean ± stddev of per-bit and overall accuracy.
# Usage: ./eval.sh [runs] [extra train_torch.py args...]
# e.g. ./eval.sh 10
# ./eval.sh 5 --epochs 300 --lr 0.0005
set -euo pipefail
cd "$(dirname "$0")"
RUNS="${1:-42}"
shift 2>/dev/null || true
EXTRA_ARGS="$*"
PYTHON="${PYTHON:-.venv/bin/python3}"
RESULTS_FILE=$(mktemp)
trap 'rm -f "$RESULTS_FILE"' EXIT
echo "=== Autokem evaluation: $RUNS runs ==="
[ -n "$EXTRA_ARGS" ] && echo "Extra args: $EXTRA_ARGS"
echo
for i in $(seq 1 "$RUNS"); do
    echo "--- Run $i/$RUNS ---"
    OUT=$("$PYTHON" train_torch.py --save /dev/null $EXTRA_ARGS 2>&1)
    # Extract per-bit line (the one after "Per-bit accuracy"): A:53.9% B:46.7% ...
    PERBIT=$(echo "$OUT" | grep -A1 'Per-bit accuracy' | tail -1)
    # Extract overall line: Overall: 5267/6660 (79.08%)
    OVERALL=$(echo "$OUT" | grep -oP 'Overall:.*\(\K[0-9.]+')
    # Extract val_loss
    VALLOSS=$(echo "$OUT" | grep -oP 'val_loss: \K[0-9.]+' | tail -1)
    # Parse per-bit percentages into a tab-separated line
    BITS=$(echo "$PERBIT" | grep -oP '[0-9.]+(?=%)' | tr '\n' '\t')
    # $BITS already ends with a tab; separate the last two fields with an explicit tab
    printf '%s%s\t%s\n' "$BITS" "$OVERALL" "$VALLOSS" >> "$RESULTS_FILE"
    echo " val_loss=$VALLOSS overall=$OVERALL%"
done
echo
echo "=== Results ($RUNS runs) ==="
"$PYTHON" - "$RESULTS_FILE" <<'PYEOF'
import sys
import numpy as np
names = ['A','B','C','D','E','F','G','H','J','K','Ytype','LowH','Overall','ValLoss']
data = []
with open(sys.argv[1]) as f:
    for line in f:
        vals = line.strip().split('\t')
        if len(vals) >= len(names):
            data.append([float(v) for v in vals[:len(names)]])
if not data:
    print("No data collected!")
    sys.exit(1)
arr = np.array(data)
means = arr.mean(axis=0)
stds = arr.std(axis=0)
print(f"{'Metric':<10s} {'Mean':>8s} {'StdDev':>8s}")
print("-" * 28)
for i, name in enumerate(names):
    unit = '' if name == 'ValLoss' else '%'
    print(f"{name:<10s} {means[i]:>7.2f}{unit} {stds[i]:>7.2f}{unit}")
PYEOF
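The `grep -oP` extractions in the loop above have direct Python equivalents, which can be handy for checking the patterns without running a training job. The sketch below uses the sample lines quoted in the script's own comments; it is an illustration of the parsing step only, not part of `eval.sh`, and the `val_loss` value in it is made up.

```python
import re

# Sample lines taken from the comments in eval.sh.
perbit_line = "A:53.9% B:46.7%"
overall_line = "Overall: 5267/6660 (79.08%)"

# Equivalent of: grep -oP '[0-9.]+(?=%)'
bits = re.findall(r"[0-9.]+(?=%)", perbit_line)

# Equivalent of: grep -oP 'Overall:.*\(\K[0-9.]+'
overall = re.search(r"Overall:.*\(([0-9.]+)", overall_line).group(1)

# One result row: tab-separated fields, as the heredoc parser expects.
row = "\t".join(bits + [overall, "0.4512"])  # val_loss value is hypothetical

print(bits, overall)
print(row.split("\t"))
```

With the full 12 per-bit values the row has 14 tab-separated fields, matching the `names` list in the summary script; a row with fewer fields is skipped by its `len(vals) >= len(names)` check.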


@@ -71,14 +71,6 @@ static void he_init(Tensor *w, int fan_in) {
 /* ---- Activations ---- */
-static inline float leaky_relu(float x) {
-    return x >= 0.0f ? x : 0.01f * x;
-}
-static inline float leaky_relu_grad(float x) {
-    return x >= 0.0f ? 1.0f : 0.01f;
-}
 static inline float sigmoid_f(float x) {
     if (x >= 0.0f) {
         float ez = expf(-x);
@@ -89,13 +81,24 @@ static inline float sigmoid_f(float x) {
     }
 }
+static inline float silu_f(float x) {
+    return x * sigmoid_f(x);
+}
+static inline float silu_grad(float x) {
+    float s = sigmoid_f(x);
+    return s * (1.0f + x * (1.0f - s));
+}
 /* ---- Conv2D forward/backward ---- */
-static void conv2d_init(Conv2D *c, int in_ch, int out_ch, int kh, int kw) {
+static void conv2d_init(Conv2D *c, int in_ch, int out_ch, int kh, int kw, int pad) {
     c->in_ch = in_ch;
     c->out_ch = out_ch;
     c->kh = kh;
     c->kw = kw;
+    c->pad_h = pad;
+    c->pad_w = pad;
     int wshape[] = {out_ch, in_ch, kh, kw};
     int bshape[] = {out_ch};
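The `silu_grad` expression added in this hunk, `s * (1 + x * (1 - s))` with `s = sigmoid(x)`, follows from differentiating `x * sigmoid(x)` with the product rule. The Python sketch below checks it against a central finite difference. The negative branch of `sigmoid` here is a standard numerically stable form and is an assumption, since that part of `sigmoid_f` is not shown in the diff.

```python
import math


def sigmoid(x):
    # Split by sign so exp() never receives a large positive argument.
    if x >= 0.0:
        ez = math.exp(-x)
        return 1.0 / (1.0 + ez)
    ez = math.exp(x)
    return ez / (1.0 + ez)


def silu(x):
    return x * sigmoid(x)


def silu_grad(x):
    s = sigmoid(x)
    return s * (1.0 + x * (1.0 - s))


def numeric_grad(f, x, h=1e-5):
    """Central finite difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


points = [-4.0, -1.0, 0.0, 0.5, 3.0]
max_err = max(abs(silu_grad(x) - numeric_grad(silu, x)) for x in points)
print(max_err)
```

Unlike LeakyReLU, SiLU's derivative depends on the pre-activation value through the sigmoid, which is why the backward helpers keep receiving the pre-activation tensor.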
@@ -125,13 +128,13 @@ static void conv2d_free(Conv2D *c) {
     tensor_free(c->input_cache);
 }
-/* Forward: input [batch, in_ch, H, W] -> output [batch, out_ch, H, W] (same padding) */
+/* Forward: input [batch, in_ch, H, W] -> output [batch, out_ch, oH, oW] */
 static Tensor *conv2d_forward(Conv2D *c, Tensor *input, int training) {
     int batch = input->shape[0];
     int in_ch = c->in_ch, out_ch = c->out_ch;
     int H = input->shape[2], W = input->shape[3];
     int kh = c->kh, kw = c->kw;
-    int ph = kh / 2, pw = kw / 2;
+    int ph = c->pad_h, pw = c->pad_w;
     if (training) {
         tensor_free(c->input_cache);
@@ -139,13 +142,15 @@ static Tensor *conv2d_forward(Conv2D *c, Tensor *input, int training) {
         memcpy(c->input_cache->data, input->data, (size_t)input->size * sizeof(float));
     }
-    int oshape[] = {batch, out_ch, H, W};
+    int oH = H + 2 * ph - kh + 1;
+    int oW = W + 2 * pw - kw + 1;
+    int oshape[] = {batch, out_ch, oH, oW};
     Tensor *out = tensor_alloc(4, oshape);
     for (int b = 0; b < batch; b++) {
         for (int oc = 0; oc < out_ch; oc++) {
-            for (int oh = 0; oh < H; oh++) {
-                for (int ow = 0; ow < W; ow++) {
+            for (int oh = 0; oh < oH; oh++) {
+                for (int ow = 0; ow < oW; ow++) {
                     float sum = c->bias->data[oc];
                     for (int ic = 0; ic < in_ch; ic++) {
                         for (int fh = 0; fh < kh; fh++) {
@@ -160,7 +165,7 @@ static Tensor *conv2d_forward(Conv2D *c, Tensor *input, int training) {
                             }
                         }
                     }
-                    out->data[((b * out_ch + oc) * H + oh) * W + ow] = sum;
+                    out->data[((b * out_ch + oc) * oH + oh) * oW + ow] = sum;
                 }
             }
         }
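With padding no longer tied to the kernel size, the output of each convolution shrinks. The sketch below applies the `oH = H + 2 * ph - kh + 1` formula from the hunk above (stride 1) to the 20x15 input through both 7x7, pad=1 layers. The helper name `conv_out` is illustrative, and the H=20, W=15 orientation is taken from the old flatten comment `[batch, 16, 20, 15]`.

```python
def conv_out(size, kernel, pad):
    """Output length of a stride-1 convolution along one axis."""
    return size + 2 * pad - kernel + 1


h, w = 20, 15
shapes = [(h, w)]
for _ in range(2):  # conv1, then conv2: both 7x7 with pad=1
    h, w = conv_out(h, 7, 1), conv_out(w, 7, 1)
    shapes.append((h, w))

print(shapes)

# The old 3x3 kernels with pad = kh / 2 = 1 preserved the size ("same" padding).
print(conv_out(20, 3, 1), conv_out(15, 3, 1))
```

Each 7x7, pad=1 layer trims 4 from both axes, so the feature map reaching the global average pool is 12x7 under these assumptions.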
@@ -168,22 +173,23 @@ static Tensor *conv2d_forward(Conv2D *c, Tensor *input, int training) {
     return out;
 }
-/* Backward: grad_output [batch, out_ch, H, W] -> grad_input [batch, in_ch, H, W] */
+/* Backward: grad_output [batch, out_ch, oH, oW] -> grad_input [batch, in_ch, H, W] */
 static Tensor *conv2d_backward(Conv2D *c, Tensor *grad_output) {
     Tensor *input = c->input_cache;
     int batch = input->shape[0];
     int in_ch = c->in_ch, out_ch = c->out_ch;
     int H = input->shape[2], W = input->shape[3];
     int kh = c->kh, kw = c->kw;
-    int ph = kh / 2, pw = kw / 2;
+    int ph = c->pad_h, pw = c->pad_w;
+    int oH = grad_output->shape[2], oW = grad_output->shape[3];
     Tensor *grad_input = tensor_zeros(input->ndim, input->shape);
     for (int b = 0; b < batch; b++) {
         for (int oc = 0; oc < out_ch; oc++) {
-            for (int oh = 0; oh < H; oh++) {
-                for (int ow = 0; ow < W; ow++) {
-                    float go = grad_output->data[((b * out_ch + oc) * H + oh) * W + ow];
+            for (int oh = 0; oh < oH; oh++) {
+                for (int ow = 0; ow < oW; ow++) {
+                    float go = grad_output->data[((b * out_ch + oc) * oH + oh) * oW + ow];
                     c->grad_bias->data[oc] += go;
                     for (int ic = 0; ic < in_ch; ic++) {
                         for (int fh = 0; fh < kh; fh++) {
@@ -288,22 +294,68 @@ static Tensor *dense_backward(Dense *d, Tensor *grad_output) {
     return grad_input;
 }
-/* ---- LeakyReLU helpers on tensors ---- */
-static Tensor *apply_leaky_relu(Tensor *input) {
+/* ---- SiLU helpers on tensors ---- */
+static Tensor *apply_silu(Tensor *input) {
     Tensor *out = tensor_alloc(input->ndim, input->shape);
     for (int i = 0; i < input->size; i++)
-        out->data[i] = leaky_relu(input->data[i]);
+        out->data[i] = silu_f(input->data[i]);
     return out;
 }
-static Tensor *apply_leaky_relu_backward(Tensor *grad_output, Tensor *pre_activation) {
+static Tensor *apply_silu_backward(Tensor *grad_output, Tensor *pre_activation) {
     Tensor *grad = tensor_alloc(grad_output->ndim, grad_output->shape);
     for (int i = 0; i < grad_output->size; i++)
-        grad->data[i] = grad_output->data[i] * leaky_relu_grad(pre_activation->data[i]);
+        grad->data[i] = grad_output->data[i] * silu_grad(pre_activation->data[i]);
     return grad;
 }
+/* ---- Global Average Pooling ---- */
+/* Forward: input [batch, C, H, W] -> output [batch, C] */
+static Tensor *global_avg_pool_forward(Tensor *input) {
+    int batch = input->shape[0];
+    int C = input->shape[1];
+    int H = input->shape[2];
+    int W = input->shape[3];
+    int hw = H * W;
+    int oshape[] = {batch, C};
+    Tensor *out = tensor_alloc(2, oshape);
+    for (int b = 0; b < batch; b++) {
+        for (int c = 0; c < C; c++) {
+            float sum = 0.0f;
+            int base = (b * C + c) * hw;
+            for (int i = 0; i < hw; i++)
+                sum += input->data[base + i];
+            out->data[b * C + c] = sum / (float)hw;
+        }
+    }
+    return out;
+}
+/* Backward: grad_output [batch, C] -> grad_input [batch, C, H, W] */
+static Tensor *global_avg_pool_backward(Tensor *grad_output, int H, int W) {
+    int batch = grad_output->shape[0];
+    int C = grad_output->shape[1];
+    int hw = H * W;
+    float scale = 1.0f / (float)hw;
+    int ishape[] = {batch, C, H, W};
+    Tensor *grad_input = tensor_alloc(4, ishape);
+    for (int b = 0; b < batch; b++) {
+        for (int c = 0; c < C; c++) {
+            float go = grad_output->data[b * C + c] * scale;
+            int base = (b * C + c) * hw;
+            for (int i = 0; i < hw; i++)
+                grad_input->data[base + i] = go;
+        }
+    }
+    return grad_input;
+}
 /* ---- Sigmoid on tensor ---- */
 static Tensor *apply_sigmoid(Tensor *input) {
@@ -335,12 +387,10 @@ Network *network_create(void) {
     rng_seed((uint64_t)time(NULL) ^ 0xDEADBEEF);
     Network *net = calloc(1, sizeof(Network));
-    conv2d_init(&net->conv1, 1, 12, 3, 3);
-    conv2d_init(&net->conv2, 12, 16, 3, 3);
-    dense_init(&net->fc1, 4800, 24);
-    dense_init(&net->head_shape, 24, 10);
-    dense_init(&net->head_ytype, 24, 1);
-    dense_init(&net->head_lowheight, 24, 1);
+    conv2d_init(&net->conv1, 1, 32, 7, 7, 1);
+    conv2d_init(&net->conv2, 32, 64, 7, 7, 1);
+    dense_init(&net->fc1, 64, 256);
+    dense_init(&net->output, 256, 12);
     return net;
 }
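The global average pooling pair added above is small enough to restate on nested lists. This Python sketch handles a single sample (no batch axis) and is meant only to show the forward mean and the backward `1 / (H * W)` spread; the function names are illustrative, not the C API.

```python
def gap_forward(x):
    """x: [C][H][W] nested lists -> list of C per-channel means."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in x
    ]


def gap_backward(grad_out, height, width):
    """grad_out: list of C values -> [C][H][W], each gradient spread evenly."""
    scale = 1.0 / (height * width)
    return [
        [[g * scale for _ in range(width)] for _ in range(height)]
        for g in grad_out
    ]


x = [
    [[1.0, 2.0], [3.0, 4.0]],  # channel 0
    [[0.0, 0.0], [0.0, 8.0]],  # channel 1
]
pooled = gap_forward(x)
grads = gap_backward([1.0, 4.0], 2, 2)
print(pooled)
print(grads)
```

Because every position in a channel receives the same gradient, the pooled network learns position-independent channel statistics, which is what lets `fc1` shrink from 4800 inputs to 64.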
@@ -349,133 +399,92 @@ void network_free(Network *net) {
     conv2d_free(&net->conv1);
     conv2d_free(&net->conv2);
     dense_free(&net->fc1);
-    dense_free(&net->head_shape);
-    dense_free(&net->head_ytype);
-    dense_free(&net->head_lowheight);
+    dense_free(&net->output);
     tensor_free(net->act_conv1);
-    tensor_free(net->act_relu1);
+    tensor_free(net->act_silu1);
     tensor_free(net->act_conv2);
-    tensor_free(net->act_relu2);
-    tensor_free(net->act_flat);
+    tensor_free(net->act_silu2);
+    tensor_free(net->act_pool);
     tensor_free(net->act_fc1);
-    tensor_free(net->act_relu3);
-    tensor_free(net->out_shape);
-    tensor_free(net->out_ytype);
-    tensor_free(net->out_lowheight);
+    tensor_free(net->act_silu3);
+    tensor_free(net->act_logits);
+    tensor_free(net->out_all);
     free(net);
 }
 static void free_activations(Network *net) {
     tensor_free(net->act_conv1); net->act_conv1 = NULL;
-    tensor_free(net->act_relu1); net->act_relu1 = NULL;
+    tensor_free(net->act_silu1); net->act_silu1 = NULL;
     tensor_free(net->act_conv2); net->act_conv2 = NULL;
-    tensor_free(net->act_relu2); net->act_relu2 = NULL;
-    tensor_free(net->act_flat); net->act_flat = NULL;
+    tensor_free(net->act_silu2); net->act_silu2 = NULL;
+    tensor_free(net->act_pool); net->act_pool = NULL;
     tensor_free(net->act_fc1); net->act_fc1 = NULL;
-    tensor_free(net->act_relu3); net->act_relu3 = NULL;
-    tensor_free(net->out_shape); net->out_shape = NULL;
-    tensor_free(net->out_ytype); net->out_ytype = NULL;
-    tensor_free(net->out_lowheight); net->out_lowheight = NULL;
+    tensor_free(net->act_silu3); net->act_silu3 = NULL;
+    tensor_free(net->act_logits); net->act_logits = NULL;
+    tensor_free(net->out_all); net->out_all = NULL;
 }
 void network_forward(Network *net, Tensor *input, int training) {
     free_activations(net);
-    /* Conv1 -> LeakyReLU */
+    /* Conv1 -> SiLU */
     net->act_conv1 = conv2d_forward(&net->conv1, input, training);
-    net->act_relu1 = apply_leaky_relu(net->act_conv1);
-    /* Conv2 -> LeakyReLU */
-    net->act_conv2 = conv2d_forward(&net->conv2, net->act_relu1, training);
-    net->act_relu2 = apply_leaky_relu(net->act_conv2);
-    /* Flatten: [batch, 16, 20, 15] -> [batch, 4800] */
-    int batch = net->act_relu2->shape[0];
-    int flat_size = net->act_relu2->size / batch;
-    int fshape[] = {batch, flat_size};
-    net->act_flat = tensor_alloc(2, fshape);
-    memcpy(net->act_flat->data, net->act_relu2->data, (size_t)net->act_relu2->size * sizeof(float));
-    /* FC1 -> LeakyReLU */
-    net->act_fc1 = dense_forward(&net->fc1, net->act_flat, training);
-    net->act_relu3 = apply_leaky_relu(net->act_fc1);
-    /* Three heads with sigmoid */
-    Tensor *logit_shape = dense_forward(&net->head_shape, net->act_relu3, training);
-    Tensor *logit_ytype = dense_forward(&net->head_ytype, net->act_relu3, training);
-    Tensor *logit_lowheight = dense_forward(&net->head_lowheight, net->act_relu3, training);
-    net->out_shape = apply_sigmoid(logit_shape);
-    net->out_ytype = apply_sigmoid(logit_ytype);
-    net->out_lowheight = apply_sigmoid(logit_lowheight);
-    tensor_free(logit_shape);
-    tensor_free(logit_ytype);
-    tensor_free(logit_lowheight);
+    net->act_silu1 = apply_silu(net->act_conv1);
+    /* Conv2 -> SiLU */
+    net->act_conv2 = conv2d_forward(&net->conv2, net->act_silu1, training);
+    net->act_silu2 = apply_silu(net->act_conv2);
+    /* Global Average Pool */
+    net->act_pool = global_avg_pool_forward(net->act_silu2);
+    /* FC1 -> SiLU */
+    net->act_fc1 = dense_forward(&net->fc1, net->act_pool, training);
+    net->act_silu3 = apply_silu(net->act_fc1);
+    /* Output -> Sigmoid */
+    net->act_logits = dense_forward(&net->output, net->act_silu3, training);
+    net->out_all = apply_sigmoid(net->act_logits);
 }
-void network_backward(Network *net, Tensor *target_shape, Tensor *target_ytype, Tensor *target_lowheight) {
-    int batch = net->out_shape->shape[0];
-    /* BCE gradient at sigmoid: d_logit = pred - target */
-    /* Head: shape (10 outputs) */
-    int gs[] = {batch, 10};
-    Tensor *grad_logit_shape = tensor_alloc(2, gs);
-    for (int i = 0; i < batch * 10; i++)
-        grad_logit_shape->data[i] = (net->out_shape->data[i] - target_shape->data[i]) / (float)batch;
-    int gy[] = {batch, 1};
-    Tensor *grad_logit_ytype = tensor_alloc(2, gy);
-    for (int i = 0; i < batch; i++)
-        grad_logit_ytype->data[i] = (net->out_ytype->data[i] - target_ytype->data[i]) / (float)batch;
-    Tensor *grad_logit_lh = tensor_alloc(2, gy);
-    for (int i = 0; i < batch; i++)
-        grad_logit_lh->data[i] = (net->out_lowheight->data[i] - target_lowheight->data[i]) / (float)batch;
-    /* Backward through heads */
-    Tensor *grad_relu3_s = dense_backward(&net->head_shape, grad_logit_shape);
-    Tensor *grad_relu3_y = dense_backward(&net->head_ytype, grad_logit_ytype);
-    Tensor *grad_relu3_l = dense_backward(&net->head_lowheight, grad_logit_lh);
-    /* Sum gradients from three heads */
-    int r3shape[] = {batch, 24};
-    Tensor *grad_relu3 = tensor_zeros(2, r3shape);
-    for (int i = 0; i < batch * 24; i++)
-        grad_relu3->data[i] = grad_relu3_s->data[i] + grad_relu3_y->data[i] + grad_relu3_l->data[i];
-    tensor_free(grad_logit_shape);
-    tensor_free(grad_logit_ytype);
-    tensor_free(grad_logit_lh);
-    tensor_free(grad_relu3_s);
-    tensor_free(grad_relu3_y);
-    tensor_free(grad_relu3_l);
-    /* LeakyReLU backward (fc1 output) */
-    Tensor *grad_fc1_out = apply_leaky_relu_backward(grad_relu3, net->act_fc1);
-    tensor_free(grad_relu3);
-    /* Dense fc1 backward */
-    Tensor *grad_flat = dense_backward(&net->fc1, grad_fc1_out);
+void network_backward(Network *net, Tensor *target) {
+    int batch = net->out_all->shape[0];
+    int n_out = 12;
+    /* BCE gradient at sigmoid: d_logit = (pred - target) / batch */
+    int gs[] = {batch, n_out};
+    Tensor *grad_logits = tensor_alloc(2, gs);
+    for (int i = 0; i < batch * n_out; i++)
+        grad_logits->data[i] = (net->out_all->data[i] - target->data[i]) / (float)batch;
+    /* Output layer backward */
+    Tensor *grad_silu3 = dense_backward(&net->output, grad_logits);
+    tensor_free(grad_logits);
+    /* SiLU backward (fc1) */
+    Tensor *grad_fc1_out = apply_silu_backward(grad_silu3, net->act_fc1);
+    tensor_free(grad_silu3);
+    /* FC1 backward */
+    Tensor *grad_pool = dense_backward(&net->fc1, grad_fc1_out);
     tensor_free(grad_fc1_out);
-    /* Unflatten: [batch, 4800] -> [batch, 16, 20, 15] */
-    int ushape[] = {batch, 16, 20, 15};
-    Tensor *grad_relu2 = tensor_alloc(4, ushape);
-    memcpy(grad_relu2->data, grad_flat->data, (size_t)grad_flat->size * sizeof(float));
-    tensor_free(grad_flat);
-    /* LeakyReLU backward (conv2 output) */
-    Tensor *grad_conv2_out = apply_leaky_relu_backward(grad_relu2, net->act_conv2);
-    tensor_free(grad_relu2);
+    /* Global Average Pool backward */
+    int H = net->act_silu2->shape[2], W = net->act_silu2->shape[3];
+    Tensor *grad_silu2 = global_avg_pool_backward(grad_pool, H, W);
+    tensor_free(grad_pool);
+    /* SiLU backward (conv2) */
+    Tensor *grad_conv2_out = apply_silu_backward(grad_silu2, net->act_conv2);
+    tensor_free(grad_silu2);
     /* Conv2 backward */
-    Tensor *grad_relu1 = conv2d_backward(&net->conv2, grad_conv2_out);
+    Tensor *grad_silu1 = conv2d_backward(&net->conv2, grad_conv2_out);
     tensor_free(grad_conv2_out);
-    /* LeakyReLU backward (conv1 output) */
-    Tensor *grad_conv1_out = apply_leaky_relu_backward(grad_relu1, net->act_conv1);
-    tensor_free(grad_relu1);
+    /* SiLU backward (conv1) */
+    Tensor *grad_conv1_out = apply_silu_backward(grad_silu1, net->act_conv1);
+    tensor_free(grad_silu1);
     /* Conv1 backward */
     Tensor *grad_input = conv2d_backward(&net->conv1, grad_conv1_out);
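The backward pass starts from `d_logit = (pred - target) / batch`. The `pred - target` part is the derivative of binary cross-entropy with respect to the logit when the prediction is `sigmoid(logit)`. The division by `batch` would correspond to a loss averaged over the batch, which is an assumption here since the loss function itself is not shown in the diff. The Python sketch below checks the per-element identity numerically.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def bce_from_logit(z, y):
    """Binary cross-entropy of target y against prediction sigmoid(z)."""
    p = sigmoid(z)
    return -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))


def analytic_grad(z, y):
    """d(BCE)/dz for a sigmoid output: pred - target."""
    return sigmoid(z) - y


def numeric_grad(z, y, h=1e-5):
    """Central finite difference in the logit."""
    return (bce_from_logit(z + h, y) - bce_from_logit(z - h, y)) / (2.0 * h)


cases = [(z, y) for z in (-2.0, 0.0, 1.5) for y in (0.0, 1.0)]
max_err = max(abs(analytic_grad(z, y) - numeric_grad(z, y)) for z, y in cases)
print(max_err)
```

Since all 12 outputs now share one sigmoid layer, a single `(pred - target)` tensor replaces the three per-head gradients that the old code had to compute and sum.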
@@ -490,12 +499,8 @@ void network_adam_step(Network *net, float lr, float beta1, float beta2, float e
adam_update(net->conv2.bias, net->conv2.grad_bias, net->conv2.m_bias, net->conv2.v_bias, lr, beta1, beta2, eps, t); adam_update(net->conv2.bias, net->conv2.grad_bias, net->conv2.m_bias, net->conv2.v_bias, lr, beta1, beta2, eps, t);
adam_update(net->fc1.weight, net->fc1.grad_weight, net->fc1.m_weight, net->fc1.v_weight, lr, beta1, beta2, eps, t); adam_update(net->fc1.weight, net->fc1.grad_weight, net->fc1.m_weight, net->fc1.v_weight, lr, beta1, beta2, eps, t);
adam_update(net->fc1.bias, net->fc1.grad_bias, net->fc1.m_bias, net->fc1.v_bias, lr, beta1, beta2, eps, t); adam_update(net->fc1.bias, net->fc1.grad_bias, net->fc1.m_bias, net->fc1.v_bias, lr, beta1, beta2, eps, t);
adam_update(net->head_shape.weight, net->head_shape.grad_weight, net->head_shape.m_weight, net->head_shape.v_weight, lr, beta1, beta2, eps, t); adam_update(net->output.weight, net->output.grad_weight, net->output.m_weight, net->output.v_weight, lr, beta1, beta2, eps, t);
adam_update(net->head_shape.bias, net->head_shape.grad_bias, net->head_shape.m_bias, net->head_shape.v_bias, lr, beta1, beta2, eps, t); adam_update(net->output.bias, net->output.grad_bias, net->output.m_bias, net->output.v_bias, lr, beta1, beta2, eps, t);
adam_update(net->head_ytype.weight, net->head_ytype.grad_weight, net->head_ytype.m_weight, net->head_ytype.v_weight, lr, beta1, beta2, eps, t);
adam_update(net->head_ytype.bias, net->head_ytype.grad_bias, net->head_ytype.m_bias, net->head_ytype.v_bias, lr, beta1, beta2, eps, t);
adam_update(net->head_lowheight.weight, net->head_lowheight.grad_weight, net->head_lowheight.m_weight, net->head_lowheight.v_weight, lr, beta1, beta2, eps, t);
adam_update(net->head_lowheight.bias, net->head_lowheight.grad_bias, net->head_lowheight.m_bias, net->head_lowheight.v_bias, lr, beta1, beta2, eps, t);
} }
void network_zero_grad(Network *net) { void network_zero_grad(Network *net) {
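
Note on `adam_update`: the argument list `(param, grad, m, v, lr, beta1, beta2, eps, t)` matches the standard Adam optimiser with bias correction. The sketch below shows that standard update on flat Python lists. It is an assumption that the C `adam_update` follows this form exactly; its body is not part of this diff.

```python
import math


def adam_update(param, grad, m, v, lr, beta1, beta2, eps, t):
    """One in-place Adam step on flat lists; t is the 1-based step counter."""
    bias_corr1 = 1.0 - beta1 ** t
    bias_corr2 = 1.0 - beta2 ** t
    for i in range(len(param)):
        # Exponential moving averages of the gradient and its square.
        m[i] = beta1 * m[i] + (1.0 - beta1) * grad[i]
        v[i] = beta2 * v[i] + (1.0 - beta2) * grad[i] * grad[i]
        # Bias-corrected estimates.
        m_hat = m[i] / bias_corr1
        v_hat = v[i] / bias_corr2
        param[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)
```

With bias correction the step counter must start at 1, otherwise `1 - beta1 ** t` is zero on the first step. The training loop in this commit increments `adam_t` before calling `network_adam_step`, which is consistent with that.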
@@ -505,34 +510,18 @@ void network_zero_grad(Network *net) {
memset(net->conv2.grad_bias->data, 0, (size_t)net->conv2.grad_bias->size * sizeof(float)); memset(net->conv2.grad_bias->data, 0, (size_t)net->conv2.grad_bias->size * sizeof(float));
memset(net->fc1.grad_weight->data, 0, (size_t)net->fc1.grad_weight->size * sizeof(float)); memset(net->fc1.grad_weight->data, 0, (size_t)net->fc1.grad_weight->size * sizeof(float));
memset(net->fc1.grad_bias->data, 0, (size_t)net->fc1.grad_bias->size * sizeof(float)); memset(net->fc1.grad_bias->data, 0, (size_t)net->fc1.grad_bias->size * sizeof(float));
memset(net->head_shape.grad_weight->data, 0, (size_t)net->head_shape.grad_weight->size * sizeof(float)); memset(net->output.grad_weight->data, 0, (size_t)net->output.grad_weight->size * sizeof(float));
memset(net->head_shape.grad_bias->data, 0, (size_t)net->head_shape.grad_bias->size * sizeof(float)); memset(net->output.grad_bias->data, 0, (size_t)net->output.grad_bias->size * sizeof(float));
memset(net->head_ytype.grad_weight->data, 0, (size_t)net->head_ytype.grad_weight->size * sizeof(float));
memset(net->head_ytype.grad_bias->data, 0, (size_t)net->head_ytype.grad_bias->size * sizeof(float));
memset(net->head_lowheight.grad_weight->data, 0, (size_t)net->head_lowheight.grad_weight->size * sizeof(float));
memset(net->head_lowheight.grad_bias->data, 0, (size_t)net->head_lowheight.grad_bias->size * sizeof(float));
} }
float network_bce_loss(Network *net, Tensor *target_shape, Tensor *target_ytype, Tensor *target_lowheight) { float network_bce_loss(Network *net, Tensor *target) {
float loss = 0.0f; float loss = 0.0f;
int batch = net->out_shape->shape[0]; int batch = net->out_all->shape[0];
int n = batch * 12;
for (int i = 0; i < batch * 10; i++) { for (int i = 0; i < n; i++) {
float p = net->out_shape->data[i]; float p = fmaxf(1e-7f, fminf(1.0f - 1e-7f, net->out_all->data[i]));
float t = target_shape->data[i]; float t = target->data[i];
p = fmaxf(1e-7f, fminf(1.0f - 1e-7f, p));
loss -= t * logf(p) + (1.0f - t) * logf(1.0f - p);
}
for (int i = 0; i < batch; i++) {
float p = net->out_ytype->data[i];
float t = target_ytype->data[i];
p = fmaxf(1e-7f, fminf(1.0f - 1e-7f, p));
loss -= t * logf(p) + (1.0f - t) * logf(1.0f - p);
}
for (int i = 0; i < batch; i++) {
float p = net->out_lowheight->data[i];
float t = target_lowheight->data[i];
p = fmaxf(1e-7f, fminf(1.0f - 1e-7f, p));
loss -= t * logf(p) + (1.0f - t) * logf(1.0f - p); loss -= t * logf(p) + (1.0f - t) * logf(1.0f - p);
} }
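
Note on the merged loss loop: with a single `[batch, 12]` output, the three per-head loops collapse into one pass over `batch * 12` values. A minimal Python sketch of the summation shown above follows. The function name `bce_loss_sum` is an assumption, and whatever normalisation follows the loop in `network_bce_loss` lies outside this hunk and is not reproduced.

```python
import math


def bce_loss_sum(probs, targets, eps=1e-7):
    """Sum of binary cross-entropy terms over flat lists of equal length.

    Each probability is clamped to [eps, 1 - eps] so that log() never
    receives 0, mirroring the fmaxf/fminf clamp in the C code.
    """
    loss = 0.0
    for p, t in zip(probs, targets):
        p = max(eps, min(1.0 - eps, p))
        loss -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return loss
```

The clamp bounds the contribution of a single confidently wrong output at `-log(1e-7)`, roughly 16.1, instead of letting it become infinite.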
@@ -546,11 +535,8 @@ void network_infer(Network *net, const float *input300, float *output12) {
network_forward(net, input, 0); network_forward(net, input, 0);
/* output order: A,B,C,D,E,F,G,H,J,K, ytype, lowheight */ for (int i = 0; i < 12; i++)
for (int i = 0; i < 10; i++) output12[i] = net->out_all->data[i];
output12[i] = net->out_shape->data[i];
output12[10] = net->out_ytype->data[0];
output12[11] = net->out_lowheight->data[0];
tensor_free(input); tensor_free(input);
} }


@@ -20,6 +20,7 @@ void tensor_free(Tensor *t);
typedef struct { typedef struct {
int in_ch, out_ch, kh, kw; int in_ch, out_ch, kh, kw;
int pad_h, pad_w;
Tensor *weight; /* [out_ch, in_ch, kh, kw] */ Tensor *weight; /* [out_ch, in_ch, kh, kw] */
Tensor *bias; /* [out_ch] */ Tensor *bias; /* [out_ch] */
Tensor *grad_weight; Tensor *grad_weight;
@@ -45,35 +46,32 @@ typedef struct {
/* ---- Network ---- */ /* ---- Network ---- */
typedef struct { typedef struct {
Conv2D conv1; /* 1->12, 3x3 */ Conv2D conv1; /* 1->32, 7x7, pad=1 */
Conv2D conv2; /* 12->16, 3x3 */ Conv2D conv2; /* 32->64, 7x7, pad=1 */
Dense fc1; /* 4800->24 */ Dense fc1; /* 64->256 */
Dense head_shape; /* 24->10 (bits A-H, J, K) */ Dense output; /* 256->12 (10 shape + 1 ytype + 1 lowheight) */
Dense head_ytype; /* 24->1 */
Dense head_lowheight;/* 24->1 */
/* activation caches (allocated per forward) */ /* activation caches (allocated per forward) */
Tensor *act_conv1; Tensor *act_conv1;
Tensor *act_relu1; Tensor *act_silu1;
Tensor *act_conv2; Tensor *act_conv2;
Tensor *act_relu2; Tensor *act_silu2;
Tensor *act_flat; Tensor *act_pool; /* global average pool output */
Tensor *act_fc1; Tensor *act_fc1;
Tensor *act_relu3; Tensor *act_silu3;
Tensor *out_shape; Tensor *act_logits; /* pre-sigmoid */
Tensor *out_ytype; Tensor *out_all; /* sigmoid output [batch, 12] */
Tensor *out_lowheight;
} Network; } Network;
/* Init / free */ /* Init / free */
Network *network_create(void); Network *network_create(void);
void network_free(Network *net); void network_free(Network *net);
/* Forward pass. input: [batch, 1, 20, 15]. Outputs stored in net->out_* */ /* Forward pass. input: [batch, 1, 20, 15]. Output stored in net->out_all */
void network_forward(Network *net, Tensor *input, int training); void network_forward(Network *net, Tensor *input, int training);
/* Backward pass. targets: shape[batch,10], ytype[batch,1], lowheight[batch,1] */ /* Backward pass. target: [batch, 12] */
void network_backward(Network *net, Tensor *target_shape, Tensor *target_ytype, Tensor *target_lowheight); void network_backward(Network *net, Tensor *target);
/* Adam update step */ /* Adam update step */
void network_adam_step(Network *net, float lr, float beta1, float beta2, float eps, int t); void network_adam_step(Network *net, float lr, float beta1, float beta2, float eps, int t);
@@ -81,8 +79,8 @@ void network_adam_step(Network *net, float lr, float beta1, float beta2, float e
/* Zero all gradients */ /* Zero all gradients */
void network_zero_grad(Network *net); void network_zero_grad(Network *net);
/* Compute BCE loss (sum of all heads) */ /* Compute BCE loss */
float network_bce_loss(Network *net, Tensor *target_shape, Tensor *target_ytype, Tensor *target_lowheight); float network_bce_loss(Network *net, Tensor *target);
/* Single-sample inference: input float[300], output float[12] (A-H,J,K,ytype,lowheight) */ /* Single-sample inference: input float[300], output float[12] (A-H,J,K,ytype,lowheight) */
void network_infer(Network *net, const float *input300, float *output12); void network_infer(Network *net, const float *input300, float *output12);
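
Note on the new layer sizes: the comments in the `Network` struct are enough to work out feature-map sizes and the parameter count. The sketch below does that arithmetic. It assumes stride 1 for both convolutions and no pooling between them (neither is stated in this header), and the helper names are hypothetical.

```python
def conv_out(size, kernel, pad, stride=1):
    """Output length of a convolution along one axis."""
    return (size + 2 * pad - kernel) // stride + 1


def conv_params(in_ch, out_ch, kh, kw):
    """Weights [out_ch, in_ch, kh, kw] plus one bias per output channel."""
    return out_ch * in_ch * kh * kw + out_ch


def dense_params(n_in, n_out):
    """Weights [n_out, n_in] plus one bias per output."""
    return n_in * n_out + n_out


def describe_network():
    h, w = 20, 15                                    # input: [batch, 1, 20, 15]
    h1, w1 = conv_out(h, 7, 1), conv_out(w, 7, 1)    # conv1: 1->32, 7x7, pad=1
    h2, w2 = conv_out(h1, 7, 1), conv_out(w1, 7, 1)  # conv2: 32->64, 7x7, pad=1
    params = {
        'conv1': conv_params(1, 32, 7, 7),
        'conv2': conv_params(32, 64, 7, 7),
        'fc1': dense_params(64, 256),     # fed by the global average pool (64 values)
        'output': dense_params(256, 12),  # 10 shape + 1 ytype + 1 lowheight
    }
    return {
        'conv1_out': (h1, w1),
        'conv2_out': (h2, w2),
        'params': params,
        'total_params': sum(params.values()),
    }
```

Under those assumptions the feature maps are 16x11 after conv1 and 12x7 after conv2, and the network has 121,740 parameters in total, most of them in conv2. Because of the global average pool, `fc1` sees 64 inputs regardless of the spatial size.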


@@ -31,18 +31,14 @@ static void collect_tensors(Network *net, TensorEntry *entries, int *count) {
ADD("conv2.bias", conv2, bias); ADD("conv2.bias", conv2, bias);
ADD("fc1.weight", fc1, weight); ADD("fc1.weight", fc1, weight);
ADD("fc1.bias", fc1, bias); ADD("fc1.bias", fc1, bias);
ADD("head_shape.weight", head_shape, weight); ADD("output.weight", output, weight);
ADD("head_shape.bias", head_shape, bias); ADD("output.bias", output, bias);
ADD("head_ytype.weight", head_ytype, weight);
ADD("head_ytype.bias", head_ytype, bias);
ADD("head_lowheight.weight", head_lowheight, weight);
ADD("head_lowheight.bias", head_lowheight, bias);
#undef ADD #undef ADD
*count = n; *count = n;
} }
int safetensor_save(const char *path, Network *net, int total_samples, int epochs, float val_loss) { int safetensor_save(const char *path, Network *net, int total_samples, int epochs, float val_loss) {
TensorEntry entries[12]; TensorEntry entries[8];
int count; int count;
collect_tensors(net, entries, &count); collect_tensors(net, entries, &count);
@@ -149,7 +145,7 @@ int safetensor_load(const char *path, Network *net) {
long data_start = 8 + (long)header_len; long data_start = 8 + (long)header_len;
TensorEntry entries[12]; TensorEntry entries[8];
int count; int count;
collect_tensors(net, entries, &count); collect_tensors(net, entries, &count);
@@ -234,14 +230,12 @@ int safetensor_stats(const char *path) {
const char *tensor_names[] = { const char *tensor_names[] = {
"conv1.weight", "conv1.bias", "conv2.weight", "conv2.bias", "conv1.weight", "conv1.bias", "conv2.weight", "conv2.bias",
"fc1.weight", "fc1.bias", "fc1.weight", "fc1.bias",
"head_shape.weight", "head_shape.bias", "output.weight", "output.bias"
"head_ytype.weight", "head_ytype.bias",
"head_lowheight.weight", "head_lowheight.bias"
}; };
int total_params = 0; int total_params = 0;
printf("\nTensors:\n"); printf("\nTensors:\n");
for (int i = 0; i < 12; i++) { for (int i = 0; i < 8; i++) {
size_t off_start, off_end; size_t off_start, off_end;
if (find_tensor_offsets(json, (size_t)header_len, tensor_names[i], &off_start, &off_end) == 0) { if (find_tensor_offsets(json, (size_t)header_len, tensor_names[i], &off_start, &off_end) == 0) {
int params = (int)(off_end - off_start) / 4; int params = (int)(off_end - off_start) / 4;
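
Note on the file layout used here: a safetensors file starts with an 8-byte little-endian header length, followed by a JSON header, followed by the raw tensor bytes, which is why `data_start` is `8 + header_len` and why the parameter count is the byte span divided by 4 for float32. The sketch below builds a tiny in-memory file and reads the counts back. The function names are hypothetical and only the standard library is used; it is not the parser from `safetensor.c`.

```python
import json
import struct


def build_safetensors(tensors):
    """tensors: dict name -> (shape, list of floats). Returns the file bytes."""
    header = {}
    payload = b''
    for name, (shape, values) in tensors.items():
        raw = struct.pack(f'<{len(values)}f', *values)
        # Offsets are relative to the start of the data section.
        header[name] = {'dtype': 'F32', 'shape': shape,
                        'data_offsets': [len(payload), len(payload) + len(raw)]}
        payload += raw
    header_bytes = json.dumps(header).encode('utf-8')
    return struct.pack('<Q', len(header_bytes)) + header_bytes + payload


def tensor_param_counts(blob):
    """Return (data_start, {tensor name: number of float32 parameters})."""
    (header_len,) = struct.unpack_from('<Q', blob, 0)
    header = json.loads(blob[8:8 + header_len].decode('utf-8'))
    counts = {}
    for name, info in header.items():
        if name == '__metadata__':  # optional free-form string metadata
            continue
        begin, end = info['data_offsets']
        counts[name] = (end - begin) // 4
    return 8 + header_len, counts
```

With the heads merged, the file holds 8 tensors instead of 12, which is what the `entries[8]` and `i < 8` changes above reflect.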

Autokem/sheet_stats.py

@@ -0,0 +1,631 @@
#!/usr/bin/env python3
"""
Spritesheet statistics generator for TerrarumSansBitmap.
Scans all *_variable.tga sheets and reports:
- Width distribution
- Compiler directives (replaceWith breakdown)
- Kerning shape distribution
- Lowheight count
- Diacritics (anchors, writeOnTop, stacking)
- Glyphs missing kerning data
- Dot removal directives
- Nudge usage
- Alignment modes
- Per-sheet summary
Usage:
python sheet_stats.py [assets_dir]
python sheet_stats.py ../src/assets
"""
import os
import struct
import sys
from collections import Counter, defaultdict
# ---- TGA reader ----
class TgaImage:
    __slots__ = ('width', 'height', 'pixels')

    def __init__(self, width, height, pixels):
        self.width = width
        self.height = height
        self.pixels = pixels

    def get_pixel(self, x, y):
        if x < 0 or x >= self.width or y < 0 or y >= self.height:
            return 0
        return self.pixels[y * self.width + x]


def read_tga(path):
    with open(path, 'rb') as f:
        data = f.read()
    pos = 0
    id_length = data[pos]; pos += 1
    pos += 1  # colour_map_type
    image_type = data[pos]; pos += 1
    pos += 5
    pos += 4  # x/y origin
    width = struct.unpack_from('<H', data, pos)[0]; pos += 2
    height = struct.unpack_from('<H', data, pos)[0]; pos += 2
    bits_per_pixel = data[pos]; pos += 1
    descriptor = data[pos]; pos += 1
    top_to_bottom = (descriptor & 0x20) != 0
    bpp = bits_per_pixel // 8
    pos += id_length
    if image_type != 2 or bpp not in (3, 4):
        raise ValueError(f"Unsupported TGA: type={image_type}, bpp={bits_per_pixel}")
    pixels = [0] * (width * height)
    for row in range(height):
        y = row if top_to_bottom else (height - 1 - row)
        for x in range(width):
            b = data[pos]; g = data[pos+1]; r = data[pos+2]
            a = data[pos+3] if bpp == 4 else 0xFF
            pos += bpp
            pixels[y * width + x] = (r << 24) | (g << 16) | (b << 8) | a
    return TgaImage(width, height, pixels)


def tagify(pixel):
    return 0 if (pixel & 0xFF) == 0 else pixel


def signed_byte(val):
    return val - 256 if val >= 128 else val


# ---- Unicode range classification ----
# Ranges to EXCLUDE from "missing kern" report
EXCLUDE_KERN_RANGES = [
(0x3400, 0xA000, 'CJK Unified Ideographs'),
(0x1100, 0x1200, 'Hangul Jamo'),
(0xA960, 0xA980, 'Hangul Jamo Extended-A'),
(0xD7B0, 0xD800, 'Hangul Jamo Extended-B'),
(0x3130, 0x3190, 'Hangul Compatibility Jamo'),
(0xAC00, 0xD7A4, 'Hangul Syllables'),
(0xE000, 0xE100, 'Custom Symbols (PUA)'),
(0xF0000, 0xF0600, 'Internal PUA'),
(0xFFE00, 0x100000, 'Internal control/PUA'),
(0x2800, 0x2900, 'Braille'),
(0x1FB00, 0x1FC00, 'Legacy Computing Symbols'),
(0x2400, 0x2440, 'Control Pictures'),
(0x3000, 0x3040, 'CJK Punctuation'),
(0x3040, 0x3100, 'Hiragana/Katakana'),
(0x31F0, 0x3200, 'Katakana Phonetic Ext'),
(0xFF00, 0x10000, 'Halfwidth/Fullwidth'),
(0x16A0, 0x1700, 'Runic'),
(0x300, 0x370, 'Combining Diacritical Marks'),
(0x1B000, 0x1B170, 'Hentaigana'),
]
def is_excluded_from_kern(cp):
    for lo, hi, _ in EXCLUDE_KERN_RANGES:
        if lo <= cp < hi:
            return True
    return False


def unicode_block_name(cp):
    """Rough Unicode block classification for display."""
    blocks = [
        (0x0000, 0x0080, 'Basic Latin'),
        (0x0080, 0x0100, 'Latin-1 Supplement'),
        (0x0100, 0x0180, 'Latin Extended-A'),
        (0x0180, 0x0250, 'Latin Extended-B'),
        (0x0250, 0x02B0, 'IPA Extensions'),
        (0x02B0, 0x0300, 'Spacing Modifier Letters'),
        (0x0300, 0x0370, 'Combining Diacritical Marks'),
        (0x0370, 0x0400, 'Greek and Coptic'),
        (0x0400, 0x0530, 'Cyrillic'),
        (0x0530, 0x0590, 'Armenian'),
        (0x0900, 0x0980, 'Devanagari'),
        (0x0980, 0x0A00, 'Bengali'),
        (0x0B80, 0x0C00, 'Tamil'),
        (0x0E00, 0x0E80, 'Thai'),
        (0x10D0, 0x1100, 'Georgian'),
        (0x1100, 0x1200, 'Hangul Jamo'),
        (0x13A0, 0x13F6, 'Cherokee'),
        (0x1B80, 0x1BC0, 'Sundanese'),
        (0x1C80, 0x1CC0, 'Cyrillic Extended'),
        (0x1D00, 0x1DC0, 'Phonetic Extensions'),
        (0x1E00, 0x1F00, 'Latin Extended Additional'),
        (0x1F00, 0x2000, 'Greek Extended'),
        (0x2000, 0x2070, 'General Punctuation'),
        (0x20A0, 0x20D0, 'Currency Symbols'),
        (0x2100, 0x2200, 'Letterlike Symbols'),
        (0x2C60, 0x2C80, 'Latin Extended-C'),
        (0x2DE0, 0x2E00, 'Cyrillic Extended-A'),
        (0xA640, 0xA6A0, 'Cyrillic Extended-B'),
        (0xA720, 0xA800, 'Latin Extended-D'),
        (0xFB00, 0xFB50, 'Alphabetic Presentation Forms'),
        (0x1F100, 0x1F200, 'Enclosed Alphanumeric Supplement'),
        (0xF0000, 0xF0060, 'PUA Bulgarian'),
        (0xF0060, 0xF00C0, 'PUA Serbian'),
        (0xF0100, 0xF0500, 'PUA Devanagari Internal'),
        (0xF0500, 0xF0600, 'PUA Sundanese/Codestyle'),
    ]
    for lo, hi, name in blocks:
        if lo <= cp < hi:
            return name
    return f'U+{cp:04X}'


# ---- Code ranges (from sheet_config.py) ----
CODE_RANGE = [
list(range(0x00, 0x100)),
list(range(0x1100, 0x1200)) + list(range(0xA960, 0xA980)) + list(range(0xD7B0, 0xD800)),
list(range(0x100, 0x180)),
list(range(0x180, 0x250)),
list(range(0x3040, 0x3100)) + list(range(0x31F0, 0x3200)),
list(range(0x3000, 0x3040)),
list(range(0x3400, 0xA000)),
list(range(0x400, 0x530)),
list(range(0xFF00, 0x10000)),
list(range(0x2000, 0x20A0)),
list(range(0x370, 0x3CF)),
list(range(0xE00, 0xE60)),
list(range(0x530, 0x590)),
list(range(0x10D0, 0x1100)),
list(range(0x250, 0x300)),
list(range(0x16A0, 0x1700)),
list(range(0x1E00, 0x1F00)),
list(range(0xE000, 0xE100)),
list(range(0xF0000, 0xF0060)),
list(range(0xF0060, 0xF00C0)),
list(range(0x13A0, 0x13F6)),
list(range(0x1D00, 0x1DC0)),
list(range(0x900, 0x980)) + list(range(0xF0100, 0xF0500)),
list(range(0x1C90, 0x1CC0)),
list(range(0x300, 0x370)),
list(range(0x1F00, 0x2000)),
list(range(0x2C60, 0x2C80)),
list(range(0xA720, 0xA800)),
list(range(0x20A0, 0x20D0)),
list(range(0xFFE00, 0xFFFA0)),
list(range(0x2100, 0x2200)),
list(range(0x1F100, 0x1F200)),
list(range(0x0B80, 0x0C00)) + list(range(0xF00C0, 0xF0100)),
list(range(0x980, 0xA00)),
list(range(0x2800, 0x2900)),
list(range(0x1B80, 0x1BC0)) + list(range(0x1CC0, 0x1CD0)) + list(range(0xF0500, 0xF0510)),
list(range(0xF0110, 0xF0130)),
list(range(0xF0520, 0xF0580)),
list(range(0xFB00, 0xFB18)),
list(range(0x1B000, 0x1B170)),
list(range(0x2400, 0x2440)),
list(range(0x1FB00, 0x1FC00)),
list(range(0xA640, 0xA6A0)),
list(range(0x2DE0, 0x2E00)),
list(range(0x1C80, 0x1C8F)),
]
FILE_LIST = [
"ascii_variable.tga",
"hangul_johab.tga",
"latinExtA_variable.tga",
"latinExtB_variable.tga",
"kana_variable.tga",
"cjkpunct_variable.tga",
"wenquanyi.tga",
"cyrilic_variable.tga",
"halfwidth_fullwidth_variable.tga",
"unipunct_variable.tga",
"greek_variable.tga",
"thai_variable.tga",
"hayeren_variable.tga",
"kartuli_variable.tga",
"ipa_ext_variable.tga",
"futhark.tga",
"latinExt_additional_variable.tga",
"puae000-e0ff.tga",
"cyrilic_bulgarian_variable.tga",
"cyrilic_serbian_variable.tga",
"tsalagi_variable.tga",
"phonetic_extensions_variable.tga",
"devanagari_variable.tga",
"kartuli_allcaps_variable.tga",
"diacritical_marks_variable.tga",
"greek_polytonic_xyswap_variable.tga",
"latinExtC_variable.tga",
"latinExtD_variable.tga",
"currencies_variable.tga",
"internal_variable.tga",
"letterlike_symbols_variable.tga",
"enclosed_alphanumeric_supplement_variable.tga",
"tamil_extrawide_variable.tga",
"bengali_variable.tga",
"braille_variable.tga",
"sundanese_variable.tga",
"devanagari_internal_extrawide_variable.tga",
"pua_codestyle_ascii_variable.tga",
"alphabetic_presentation_forms_extrawide_variable.tga",
"hentaigana_variable.tga",
"control_pictures_variable.tga",
"symbols_for_legacy_computing_variable.tga",
"cyrilic_extB_variable.tga",
"cyrilic_extA_variable.tga",
"cyrilic_extC_variable.tga",
]
def is_variable(fn):
    return fn.endswith('_variable.tga')


def is_extra_wide(fn):
    return 'extrawide' in fn.lower()


def is_xyswap(fn):
    return 'xyswap' in fn.lower()


# ---- Shape tag formatting ----
SHAPE_CHARS = 'ABCDEFGHJK'


def format_shape(mask, is_ytype):
    """Format kerning mask + ytype as keming_machine tag, e.g. 'ABCDEFGH(B)'."""
    bits = []
    for i, ch in enumerate(SHAPE_CHARS):
        bit_pos = [7, 6, 5, 4, 3, 2, 1, 0, 15, 14][i]
        if (mask >> bit_pos) & 1:
            bits.append(ch)
    chars = ''.join(bits) if bits else '(empty)'
    mode = '(Y)' if is_ytype else '(B)'
    return f'{chars}{mode}'


# ---- Parsing ----
def parse_diacritics_anchors(img, tag_x, tag_y):
    """Return number of defined diacritics anchors (0-6)."""
    count = 0
    for i in range(6):
        y_pos = 13 - (i // 3) * 2
        shift = (3 - (i % 3)) * 8
        y_pixel = tagify(img.get_pixel(tag_x, tag_y + y_pos))
        x_pixel = tagify(img.get_pixel(tag_x, tag_y + y_pos + 1))
        y_used = ((y_pixel >> shift) & 128) != 0
        x_used = ((x_pixel >> shift) & 128) != 0
        if y_used or x_used:
            count += 1
    return count


def parse_variable_sheet(path, code_range, is_xy, is_ew):
    """Parse a variable-width sheet and yield per-glyph stats dicts."""
    img = read_tga(path)
    cell_w = 32 if is_ew else 16
    cell_h = 20
    cols = img.width // cell_w

    for index, code in enumerate(code_range):
        if is_xy:
            cell_x = (index // cols) * cell_w
            cell_y = (index % cols) * cell_h
        else:
            cell_x = (index % cols) * cell_w
            cell_y = (index // cols) * cell_h
        tag_x = cell_x + (cell_w - 1)
        tag_y = cell_y

        # Width
        width = 0
        for y in range(5):
            if img.get_pixel(tag_x, tag_y + y) & 0xFF:
                width |= (1 << y)
        if width == 0:
            continue  # empty cell

        # Lowheight
        is_low_height = (img.get_pixel(tag_x, tag_y + 5) & 0xFF) != 0

        # Kerning data
        kern_pixel = tagify(img.get_pixel(tag_x, tag_y + 6))
        has_kern = (kern_pixel & 0xFF) != 0
        is_ytype = (kern_pixel & 0x80000000) != 0 if has_kern else False
        kern_mask = ((kern_pixel >> 8) & 0xFFFFFF) if has_kern else 0

        # Dot removal (Y+7)
        dot_pixel = tagify(img.get_pixel(tag_x, tag_y + 7))
        has_dot_removal = dot_pixel != 0

        # Compiler directive (Y+9)
        dir_pixel = tagify(img.get_pixel(tag_x, tag_y + 9))
        opcode = (dir_pixel >> 24) & 0xFF
        arg1 = (dir_pixel >> 16) & 0xFF
        arg2 = (dir_pixel >> 8) & 0xFF

        # Nudge (Y+10)
        nudge_pixel = tagify(img.get_pixel(tag_x, tag_y + 10))
        nudge_x = signed_byte((nudge_pixel >> 24) & 0xFF) if nudge_pixel else 0
        nudge_y = signed_byte((nudge_pixel >> 16) & 0xFF) if nudge_pixel else 0
        has_nudge = nudge_x != 0 or nudge_y != 0

        # Diacritics anchors (Y+11..Y+14)
        n_anchors = parse_diacritics_anchors(img, tag_x, tag_y)

        # Alignment (Y+15..Y+16)
        align = 0
        for y in range(2):
            if img.get_pixel(tag_x, tag_y + 15 + y) & 0xFF:
                align |= (1 << y)

        # WriteOnTop (Y+17)
        wot_raw = img.get_pixel(tag_x, tag_y + 17)
        has_write_on_top = (wot_raw & 0xFF) != 0

        # Stack (Y+18..Y+19)
        s0 = tagify(img.get_pixel(tag_x, tag_y + 18))
        s1 = tagify(img.get_pixel(tag_x, tag_y + 19))
        if s0 == 0x00FF00FF and s1 == 0x00FF00FF:
            stack_where = 4  # STACK_DONT
        else:
            stack_where = 0
            for y in range(2):
                if img.get_pixel(tag_x, tag_y + 18 + y) & 0xFF:
                    stack_where |= (1 << y)

        yield {
            'code': code,
            'width': width,
            'lowheight': is_low_height,
            'has_kern': has_kern,
            'is_ytype': is_ytype,
            'kern_mask': kern_mask,
            'has_dot_removal': has_dot_removal,
            'opcode': opcode,
            'opcode_arg1': arg1,
            'opcode_arg2': arg2,
            'has_nudge': has_nudge,
            'nudge_x': nudge_x,
            'nudge_y': nudge_y,
            'n_anchors': n_anchors,
            'align': align,
            'has_write_on_top': has_write_on_top,
            'stack_where': stack_where,
        }


# ---- Main ----
def main():
    assets_dir = sys.argv[1] if len(sys.argv) > 1 else '../src/assets'

    # Accumulators
    all_glyphs = []
    per_sheet = defaultdict(lambda: {'total': 0, 'kern': 0, 'lowh': 0, 'directives': 0})
    sheets_scanned = 0

    print(f"Scanning {assets_dir}...\n")

    for sheet_idx, filename in enumerate(FILE_LIST):
        if not is_variable(filename):
            continue
        if sheet_idx >= len(CODE_RANGE):
            continue
        path = os.path.join(assets_dir, filename)
        if not os.path.exists(path):
            continue

        is_xy = is_xyswap(filename)
        is_ew = is_extra_wide(filename)
        code_range = CODE_RANGE[sheet_idx]

        count = 0
        for g in parse_variable_sheet(path, code_range, is_xy, is_ew):
            g['sheet'] = filename
            all_glyphs.append(g)
            s = per_sheet[filename]
            s['total'] += 1
            if g['has_kern']:
                s['kern'] += 1
            if g['lowheight']:
                s['lowh'] += 1
            if g['opcode'] != 0:
                s['directives'] += 1
            count += 1
        sheets_scanned += 1

    total = len(all_glyphs)
    if total == 0:
        print("No glyphs found!")
        return 1

    print(f"Scanned {sheets_scanned} variable sheets, {total} glyphs with width > 0\n")

    # ---- 1. Width distribution ----
    width_counter = Counter(g['width'] for g in all_glyphs)
    print("=" * 60)
    print("WIDTH DISTRIBUTION")
    print("=" * 60)
    for w in sorted(width_counter):
        c = width_counter[w]
        bar = '#' * (c * 40 // max(width_counter.values()))
        print(f" w={w:2d}: {c:5d} ({100*c/total:5.1f}%) {bar}")
    print(f" Total: {total}")

    # ---- 2. Compiler directives ----
    dir_glyphs = [g for g in all_glyphs if g['opcode'] != 0]
    print(f"\n{'=' * 60}")
    print("COMPILER DIRECTIVES")
    print("=" * 60)
    print(f" Total glyphs with directives: {len(dir_glyphs)}/{total} ({100*len(dir_glyphs)/total:.1f}%)")

    opcode_counter = Counter()
    replace_counts = Counter()
    illegal_count = 0
    for g in dir_glyphs:
        op = g['opcode']
        opcode_counter[op] += 1
        if 0x80 <= op <= 0x87:
            n_replace = op & 0x07
            replace_counts[n_replace] += 1
        if op == 255:
            illegal_count += 1

    if opcode_counter:
        print(f"\n By opcode:")
        for op in sorted(opcode_counter):
            c = opcode_counter[op]
            if 0x80 <= op <= 0x87:
                label = f'replaceWith (n={op & 0x07})'
            elif op == 255:
                label = 'ILLEGAL (0xFF)'
            else:
                label = f'unknown'
            print(f" 0x{op:02X} ({label}): {c}")

    if replace_counts:
        print(f"\n replaceWith breakdown:")
        for n in sorted(replace_counts):
            print(f" {n} replacement char(s): {replace_counts[n]}")

    if illegal_count:
        print(f" Illegal glyphs: {illegal_count}")

    # ---- 3. Kerning shapes ----
    kern_glyphs = [g for g in all_glyphs if g['has_kern']]
    print(f"\n{'=' * 60}")
    print("KERNING SHAPES")
    print("=" * 60)
    print(f" Glyphs with kern data: {len(kern_glyphs)}/{total} ({100*len(kern_glyphs)/total:.1f}%)")

    shape_counter = Counter()
    for g in kern_glyphs:
        tag = format_shape(g['kern_mask'], g['is_ytype'])
        shape_counter[tag] += 1

    n_unique = len(shape_counter)
    n_kern = len(kern_glyphs)
    # Denominator for the percentages below; avoids ZeroDivisionError
    # when no scanned glyph carries kern data.
    kern_denom = max(n_kern, 1)
    ytype_count = sum(1 for g in kern_glyphs if g['is_ytype'])
    btype_count = n_kern - ytype_count
    print(f" Unique shapes: {n_unique}")
    print(f" B-type: {btype_count} ({100*btype_count/kern_denom:.1f}%)")
    print(f" Y-type: {ytype_count} ({100*ytype_count/kern_denom:.1f}%)")

    # Per-bit occurrences
    bit_names = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'J', 'K']
    bit_positions = [7, 6, 5, 4, 3, 2, 1, 0, 15, 14]
    print(f"\n Per-bit occurrences ({n_kern} glyphs with kern):")
    for name, pos in zip(bit_names, bit_positions):
        c = sum(1 for g in kern_glyphs if (g['kern_mask'] >> pos) & 1)
        bar = '#' * (c * 30 // kern_denom)
        print(f" {name}: {c:5d}/{n_kern} ({100*c/kern_denom:5.1f}%) {bar}")

    print(f"\n Top shapes (of {n_unique} unique):")
    for tag, c in shape_counter.most_common(30):
        bar = '#' * (c * 30 // shape_counter.most_common(1)[0][1])
        print(f" {tag:<22s} {c:4d} ({100*c/len(kern_glyphs):5.1f}%) {bar}")
    if n_unique > 30:
        remaining = sum(c for _, c in shape_counter.most_common()[30:])
        print(f" ... {n_unique - 30} more shapes: {remaining} glyphs")

    # ---- 4. Lowheight ----
    lowh_glyphs = [g for g in all_glyphs if g['lowheight']]
    print(f"\n{'=' * 60}")
    print("LOWHEIGHT")
    print("=" * 60)
    print(f" Lowheight glyphs: {len(lowh_glyphs)}/{total} ({100*len(lowh_glyphs)/total:.1f}%)")

    # ---- 5. Diacritics / stacking ----
    anchor_glyphs = [g for g in all_glyphs if g['n_anchors'] > 0]
    wot_glyphs = [g for g in all_glyphs if g['has_write_on_top']]
    stack_names = {0: 'STACK_UP', 1: 'STACK_DOWN', 2: 'STACK_BEFORE_N_AFTER',
                   3: 'STACK_UP_N_DOWN', 4: 'STACK_DONT'}
    stack_counter = Counter(g['stack_where'] for g in all_glyphs if g['stack_where'] != 0)
    print(f"\n{'=' * 60}")
    print("DIACRITICS & STACKING")
    print("=" * 60)
    print(f" Glyphs with diacritics anchors: {len(anchor_glyphs)}/{total} ({100*len(anchor_glyphs)/total:.1f}%)")
    anchor_count_dist = Counter(g['n_anchors'] for g in anchor_glyphs)
    for n in sorted(anchor_count_dist):
        print(f" {n} anchor(s): {anchor_count_dist[n]}")
    print(f" Glyphs with writeOnTop: {len(wot_glyphs)}")
    if stack_counter:
        print(f" Stack modes:")
        for sw, c in stack_counter.most_common():
            print(f" {stack_names.get(sw, f'?{sw}')}: {c}")

    # ---- 6. Dot removal ----
    dot_glyphs = [g for g in all_glyphs if g['has_dot_removal']]
    print(f"\n{'=' * 60}")
    print("DOT REMOVAL")
    print("=" * 60)
    print(f" Glyphs with dot removal directive: {len(dot_glyphs)}/{total} ({100*len(dot_glyphs)/total:.1f}%)")

    # ---- 7. Nudge ----
    nudge_glyphs = [g for g in all_glyphs if g['has_nudge']]
    print(f"\n{'=' * 60}")
    print("NUDGE")
    print("=" * 60)
    print(f" Glyphs with nudge: {len(nudge_glyphs)}/{total} ({100*len(nudge_glyphs)/total:.1f}%)")
    if nudge_glyphs:
        nudge_x_vals = Counter(g['nudge_x'] for g in nudge_glyphs if g['nudge_x'] != 0)
        nudge_y_vals = Counter(g['nudge_y'] for g in nudge_glyphs if g['nudge_y'] != 0)
        if nudge_x_vals:
            print(f" X nudge values: {dict(sorted(nudge_x_vals.items()))}")
        if nudge_y_vals:
            print(f" Y nudge values: {dict(sorted(nudge_y_vals.items()))}")

    # ---- 8. Alignment ----
    align_names = {0: 'LEFT', 1: 'RIGHT', 2: 'CENTRE', 3: 'BEFORE'}
    align_counter = Counter(g['align'] for g in all_glyphs if g['align'] != 0)
    print(f"\n{'=' * 60}")
    print("ALIGNMENT")
    print("=" * 60)
    if align_counter:
        for a, c in align_counter.most_common():
            print(f" {align_names.get(a, f'?{a}')}: {c}")
    else:
        print(" All glyphs use default (LEFT) alignment")

    # ---- 9. Missing kern data ----
    missing = [g for g in all_glyphs
               if not g['has_kern']
               and g['opcode'] == 0
               and not is_excluded_from_kern(g['code'])]
    print(f"\n{'=' * 60}")
    print("MISSING KERNING DATA")
    print("=" * 60)
    print(f" Glyphs without kern (excl. CJK/Hangul/symbols/diacriticals): "
          f"{len(missing)}/{total} ({100*len(missing)/total:.1f}%)")
    if missing:
        by_block = defaultdict(list)
        for g in missing:
            by_block[unicode_block_name(g['code'])].append(g['code'])
        print(f"\n By block:")
        for block in sorted(by_block, key=lambda b: by_block[b][0]):
            cps = by_block[block]
            sample = ', '.join(f'U+{c:04X}' for c in cps[:8])
            more = f' ... +{len(cps)-8}' if len(cps) > 8 else ''
            print(f" {block}: {len(cps)} ({sample}{more})")

    # ---- 10. Per-sheet summary ----
    print(f"\n{'=' * 60}")
    print("PER-SHEET SUMMARY")
    print("=" * 60)
    print(f" {'Sheet':<52s} {'Total':>5s} {'Kern':>5s} {'LowH':>5s} {'Dir':>4s}")
    print(f" {'-'*52} {'-'*5} {'-'*5} {'-'*5} {'-'*4}")
    for fn in sorted(per_sheet):
        s = per_sheet[fn]
        print(f" {fn:<52s} {s['total']:5d} {s['kern']:5d} {s['lowh']:5d} {s['directives']:4d}")

    return 0


if __name__ == '__main__':
    sys.exit(main())
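
Note on the kerning tag pixel: the script packs each pixel as `0xRRGGBBAA`, so the Y-type flag is the top bit of the red channel, shape bits A to H are the blue channel, and J and K are the top two bits of the green channel. The standalone sketch below decodes one such pixel into the same tag string that `format_shape` prints. The function name `decode_kern_pixel` is hypothetical and the sample pixel values are made up for illustration.

```python
SHAPE_CHARS = 'ABCDEFGHJK'
BIT_POSITIONS = [7, 6, 5, 4, 3, 2, 1, 0, 15, 14]


def decode_kern_pixel(pixel):
    """Decode a packed 0xRRGGBBAA kerning pixel into a tag such as 'AC(B)'.

    Returns None when alpha is 0, i.e. the glyph carries no kerning data.
    """
    if (pixel & 0xFF) == 0:
        return None
    is_ytype = (pixel & 0x80000000) != 0
    mask = (pixel >> 8) & 0xFFFFFF  # drop alpha, keep RRGGBB
    letters = ''.join(ch for ch, pos in zip(SHAPE_CHARS, BIT_POSITIONS)
                      if (mask >> pos) & 1)
    return (letters or '(empty)') + ('(Y)' if is_ytype else '(B)')


print(decode_kern_pixel(0x80C0FFFF))  # all ten shape bits set, Y-type
print(decode_kern_pixel(0x0000A0FF))  # blue = 0xA0: bits 7 and 5, B-type
```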


@@ -2,6 +2,7 @@
#include "tga.h" #include "tga.h"
#include "nn.h" #include "nn.h"
#include "safetensor.h" #include "safetensor.h"
#include "unicode_filter.h"
#include <stdio.h> #include <stdio.h>
#include <stdlib.h> #include <stdlib.h>
@@ -42,7 +43,8 @@ static void extract_shape_bits(int kerning_mask, float *shape) {
/* ---- Collect samples from one TGA ---- */ /* ---- Collect samples from one TGA ---- */
static int collect_from_sheet(const char *path, int is_xyswap, Sample *samples, int max_samples) { static int collect_from_sheet(const char *path, int is_xyswap, int start_code,
Sample *samples, int max_samples) {
TgaImage *img = tga_read(path); TgaImage *img = tga_read(path);
if (!img) { if (!img) {
fprintf(stderr, "Warning: cannot read %s\n", path); fprintf(stderr, "Warning: cannot read %s\n", path);
@@ -76,6 +78,10 @@ static int collect_from_sheet(const char *path, int is_xyswap, Sample *samples,
} }
if (width == 0) continue; if (width == 0) continue;
/* Skip modifier letters, symbols, punctuation */
if (start_code >= 0 && is_excluded_from_training(start_code + index))
continue;
/* Read kerning data pixel at Y+6 */ /* Read kerning data pixel at Y+6 */
uint32_t kern_pixel = tagify(tga_get_pixel(img, tag_x, tag_y + 6)); uint32_t kern_pixel = tagify(tga_get_pixel(img, tag_x, tag_y + 6));
if ((kern_pixel & 0xFF) == 0) continue; /* no kern data */ if ((kern_pixel & 0xFF) == 0) continue; /* no kern data */
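
Note on the new filter: the check maps a cell index to a code point as `start_code + index` and skips the cell when that code point is excluded. The contents of `unicode_filter.h` are not part of this diff, so the sketch below is only an illustration: the two ranges are hypothetical stand-ins (Spacing Modifier Letters and General Punctuation), and the half-open range test is borrowed from `is_excluded_from_kern` in `sheet_stats.py`.

```python
# Hypothetical exclusion ranges, half-open [lo, hi). The real list lives in
# unicode_filter.h, which this diff does not show.
EXCLUDED_RANGES = [
    (0x02B0, 0x0300),  # Spacing Modifier Letters
    (0x2000, 0x2070),  # General Punctuation
]


def is_excluded_from_training(cp, ranges=EXCLUDED_RANGES):
    return any(lo <= cp < hi for lo, hi in ranges)


def kept_indices(start_code, n_cells, ranges=EXCLUDED_RANGES):
    """Cell indices that survive the filter for a sheet starting at start_code.

    A negative start_code disables filtering, as in the C condition
    `start_code >= 0 && is_excluded_from_training(...)`.
    """
    if start_code < 0:
        return list(range(n_cells))
    return [i for i in range(n_cells)
            if not is_excluded_from_training(start_code + i, ranges)]
```

The `start_code + index` mapping only holds for sheets whose code points are contiguous. Several sheets listed in `sheet_stats.py` concatenate disjoint ranges, so those would need either a negative `start_code` or a per-sheet lookup table.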
@@ -128,12 +134,8 @@ static void save_weights(Network *net, Network *best) {
copy_tensor_data(best->conv2.bias, net->conv2.bias); copy_tensor_data(best->conv2.bias, net->conv2.bias);
copy_tensor_data(best->fc1.weight, net->fc1.weight); copy_tensor_data(best->fc1.weight, net->fc1.weight);
copy_tensor_data(best->fc1.bias, net->fc1.bias); copy_tensor_data(best->fc1.bias, net->fc1.bias);
copy_tensor_data(best->head_shape.weight, net->head_shape.weight); copy_tensor_data(best->output.weight, net->output.weight);
copy_tensor_data(best->head_shape.bias, net->head_shape.bias); copy_tensor_data(best->output.bias, net->output.bias);
copy_tensor_data(best->head_ytype.weight, net->head_ytype.weight);
copy_tensor_data(best->head_ytype.bias, net->head_ytype.bias);
copy_tensor_data(best->head_lowheight.weight, net->head_lowheight.weight);
copy_tensor_data(best->head_lowheight.bias, net->head_lowheight.bias);
} }
/* ---- Training ---- */ /* ---- Training ---- */
@@ -174,7 +176,9 @@ int train_model(void) {
char fullpath[512]; char fullpath[512];
  snprintf(fullpath, sizeof(fullpath), "%s/%s", assets_dir, name);
- int got = collect_from_sheet(fullpath, is_xyswap, all_samples + total, max_total - total);
+ int start_code = sheet_start_code(name);
+ int got = collect_from_sheet(fullpath, is_xyswap, start_code,
+     all_samples + total, max_total - total);
  if (got > 0) {
      printf(" %s: %d samples\n", name, got);
      total += got;
@@ -247,19 +251,15 @@ int train_model(void) {
  int ishape[] = {bs, 1, 20, 15};
  Tensor *input = tensor_alloc(4, ishape);
- int sshape[] = {bs, 10};
- Tensor *tgt_shape = tensor_alloc(2, sshape);
- int yshape[] = {bs, 1};
- Tensor *tgt_ytype = tensor_alloc(2, yshape);
- Tensor *tgt_lh = tensor_alloc(2, yshape);
+ int tshape[] = {bs, 12};
+ Tensor *target = tensor_alloc(2, tshape);
  for (int i = 0; i < bs; i++) {
      Sample *s = &all_samples[indices[start + i]];
      memcpy(input->data + i * 300, s->input, 300 * sizeof(float));
-     memcpy(tgt_shape->data + i * 10, s->shape, 10 * sizeof(float));
-     tgt_ytype->data[i] = s->ytype;
-     tgt_lh->data[i] = s->lowheight;
+     memcpy(target->data + i * 12, s->shape, 10 * sizeof(float));
+     target->data[i * 12 + 10] = s->ytype;
+     target->data[i * 12 + 11] = s->lowheight;
  }
  /* Forward */
@@ -267,21 +267,19 @@ int train_model(void) {
  network_forward(net, input, 1);
  /* Loss */
- float loss = network_bce_loss(net, tgt_shape, tgt_ytype, tgt_lh);
+ float loss = network_bce_loss(net, target);
  train_loss += loss;
  n_batches++;
  /* Backward */
- network_backward(net, tgt_shape, tgt_ytype, tgt_lh);
+ network_backward(net, target);
  /* Adam step */
  adam_t++;
  network_adam_step(net, lr, beta1, beta2, eps, adam_t);
  tensor_free(input);
- tensor_free(tgt_shape);
- tensor_free(tgt_ytype);
- tensor_free(tgt_lh);
+ tensor_free(target);
  }
  train_loss /= (float)n_batches;
@@ -295,29 +293,23 @@ int train_model(void) {
  int ishape[] = {bs, 1, 20, 15};
  Tensor *input = tensor_alloc(4, ishape);
- int sshape[] = {bs, 10};
- Tensor *tgt_shape = tensor_alloc(2, sshape);
- int yshape[] = {bs, 1};
- Tensor *tgt_ytype = tensor_alloc(2, yshape);
- Tensor *tgt_lh = tensor_alloc(2, yshape);
+ int tshape[] = {bs, 12};
+ Tensor *target = tensor_alloc(2, tshape);
  for (int i = 0; i < bs; i++) {
      Sample *s = &all_samples[indices[n_train + start + i]];
      memcpy(input->data + i * 300, s->input, 300 * sizeof(float));
-     memcpy(tgt_shape->data + i * 10, s->shape, 10 * sizeof(float));
-     tgt_ytype->data[i] = s->ytype;
-     tgt_lh->data[i] = s->lowheight;
+     memcpy(target->data + i * 12, s->shape, 10 * sizeof(float));
+     target->data[i * 12 + 10] = s->ytype;
+     target->data[i * 12 + 11] = s->lowheight;
  }
  network_forward(net, input, 0);
- val_loss += network_bce_loss(net, tgt_shape, tgt_ytype, tgt_lh);
+ val_loss += network_bce_loss(net, target);
  val_batches++;
  tensor_free(input);
- tensor_free(tgt_shape);
- tensor_free(tgt_ytype);
- tensor_free(tgt_lh);
+ tensor_free(target);
  }
  val_loss /= (float)val_batches;
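
Note on the hunks above: the three per-batch target tensors are merged into one row-major `[bs, 12]` tensor, with the 10 shape bits first, `ytype` at column 10 and `lowheight` at column 11. The following is a minimal Python sketch of that layout only. The helper names `pack_target` and `pack_batch` are hypothetical and do not exist in the repository.

```python
def pack_target(shape_bits, ytype, lowheight):
    """Return one 12-float row: 10 shape bits, then ytype, then lowheight."""
    if len(shape_bits) != 10:
        raise ValueError("expected 10 shape bits (A..H, J, K)")
    row = [float(b) for b in shape_bits]
    row.append(float(ytype))      # column 10
    row.append(float(lowheight))  # column 11
    return row


def pack_batch(samples):
    """Flatten (shape_bits, ytype, lowheight) tuples into a row-major list.

    Element [i * 12 + k] is column k of sample i, mirroring the C indexing
    target->data[i * 12 + k].
    """
    flat = []
    for shape_bits, ytype, lowheight in samples:
        flat.extend(pack_target(shape_bits, ytype, lowheight))
    return flat
```

With this layout a single loss call can walk all 12 outputs of a sample contiguously, which is presumably why the C signatures shrink to `(net, target)`.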

717
Autokem/train_torch.py Normal file

@@ -0,0 +1,717 @@
#!/usr/bin/env python3
"""
PyTorch training script for Autokem — drop-in replacement for `autokem train`.
Reads the same *_variable.tga sprite sheets, trains the same architecture,
and saves weights in safetensors format loadable by the C inference code.
Usage:
python train_torch.py # train with defaults
python train_torch.py --epochs 300 # override max epochs
python train_torch.py --lr 0.0005 # override learning rate
python train_torch.py --save model.safetensors
Requirements:
pip install torch numpy
"""
import argparse
import json
import os
import struct
import sys
import unicodedata
from pathlib import Path
import numpy as np
# ---- Sheet code ranges (imported from OTFbuild/sheet_config.py) ----
_otfbuild = os.path.join(os.path.dirname(os.path.abspath(__file__)), '..', 'OTFbuild')
try:
sys.path.insert(0, _otfbuild)
from sheet_config import FILE_LIST as _FILE_LIST, CODE_RANGE as _CODE_RANGE
sys.path.pop(0)
_CODE_RANGE_MAP = {}
for _i, _fn in enumerate(_FILE_LIST):
if _i < len(_CODE_RANGE):
_CODE_RANGE_MAP[_fn] = _CODE_RANGE[_i]
except ImportError:
_CODE_RANGE_MAP = {}
# ---- TGA reader (matches OTFbuild/tga_reader.py and Autokem/tga.c) ----
class TgaImage:
__slots__ = ('width', 'height', 'pixels')
def __init__(self, width, height, pixels):
self.width = width
self.height = height
self.pixels = pixels # flat list of RGBA8888 ints
def get_pixel(self, x, y):
if x < 0 or x >= self.width or y < 0 or y >= self.height:
return 0
return self.pixels[y * self.width + x]
def read_tga(path):
with open(path, 'rb') as f:
data = f.read()
pos = 0
id_length = data[pos]; pos += 1
_colour_map_type = data[pos]; pos += 1
image_type = data[pos]; pos += 1
pos += 5 # colour map spec
pos += 2 # x_origin
pos += 2 # y_origin
width = struct.unpack_from('<H', data, pos)[0]; pos += 2
height = struct.unpack_from('<H', data, pos)[0]; pos += 2
bits_per_pixel = data[pos]; pos += 1
descriptor = data[pos]; pos += 1
top_to_bottom = (descriptor & 0x20) != 0
bpp = bits_per_pixel // 8
pos += id_length
if image_type != 2 or bpp not in (3, 4):
raise ValueError(f"Unsupported TGA: type={image_type}, bpp={bits_per_pixel}")
pixels = [0] * (width * height)
for row in range(height):
y = row if top_to_bottom else (height - 1 - row)
for x in range(width):
b = data[pos]; g = data[pos+1]; r = data[pos+2]
a = data[pos+3] if bpp == 4 else 0xFF
pos += bpp
pixels[y * width + x] = (r << 24) | (g << 16) | (b << 8) | a
return TgaImage(width, height, pixels)
def tagify(pixel):
return 0 if (pixel & 0xFF) == 0 else pixel
# ---- Data collection (matches Autokem/train.c) ----
def collect_from_sheet(path, is_xyswap, code_range=None):
"""Extract labelled samples from a single TGA sheet."""
img = read_tga(path)
cell_w, cell_h = 16, 20
cols = img.width // cell_w
rows = img.height // cell_h
total_cells = cols * rows
inputs = []
labels = []
skipped_lm = 0
for index in range(total_cells):
if is_xyswap:
cell_x = (index // cols) * cell_w
cell_y = (index % cols) * cell_h
else:
cell_x = (index % cols) * cell_w
cell_y = (index // cols) * cell_h
tag_x = cell_x + (cell_w - 1)
tag_y = cell_y
# Width (5-bit)
width = 0
for y in range(5):
if img.get_pixel(tag_x, tag_y + y) & 0xFF:
width |= (1 << y)
if width == 0:
continue
# Skip modifier letters, symbols, punctuation
if code_range is not None and index < len(code_range):
cp = code_range[index]
try:
cat = unicodedata.category(chr(cp))
if cat == 'Lm' or cat[0] in ('S', 'P'):
skipped_lm += 1
continue
except (ValueError, OverflowError):
pass
# Kern data pixel at Y+6
kern_pixel = tagify(img.get_pixel(tag_x, tag_y + 6))
if (kern_pixel & 0xFF) == 0:
continue # no kern data
# Extract labels
is_kern_ytype = 1.0 if (kern_pixel & 0x80000000) != 0 else 0.0
kerning_mask = (kern_pixel >> 8) & 0xFFFFFF
is_low_height = 1.0 if (img.get_pixel(tag_x, tag_y + 5) & 0xFF) != 0 else 0.0
# Shape bits: A(7) B(6) C(5) D(4) E(3) F(2) G(1) H(0) J(15) K(14)
shape = [
float((kerning_mask >> 7) & 1), # A
float((kerning_mask >> 6) & 1), # B
float((kerning_mask >> 5) & 1), # C
float((kerning_mask >> 4) & 1), # D
float((kerning_mask >> 3) & 1), # E
float((kerning_mask >> 2) & 1), # F
float((kerning_mask >> 1) & 1), # G
float((kerning_mask >> 0) & 1), # H
float((kerning_mask >> 15) & 1), # J
float((kerning_mask >> 14) & 1), # K
]
# 15x20 binary input
inp = np.zeros((20, 15), dtype=np.float32)
for gy in range(20):
for gx in range(15):
p = img.get_pixel(cell_x + gx, cell_y + gy)
if (p & 0x80) != 0:
inp[gy, gx] = 1.0
inputs.append(inp)
labels.append(shape + [is_kern_ytype, is_low_height])
return inputs, labels, skipped_lm
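
Note on `collect_from_sheet`: the kern pixel is an RGBA8888 integer as produced by `read_tga`, so the 24-bit `kerning_mask` is simply R, G, B with the alpha byte shifted out. A minimal standalone sketch of the same bit extraction follows. The names `SHAPE_BIT_POSITIONS` and `decode_kern_pixel` are hypothetical and only illustrate the decoding.

```python
# Bit positions inside the 24-bit RGB mask, in label order A B C D E F G H J K.
SHAPE_BIT_POSITIONS = (7, 6, 5, 4, 3, 2, 1, 0, 15, 14)


def decode_kern_pixel(rgba):
    """Split an RGBA8888 kern pixel into (shape_bits, is_ytype).

    Returns None when alpha is zero, which the script treats as
    "no kern data" for that glyph.
    """
    if (rgba & 0xFF) == 0:
        return None
    mask = (rgba >> 8) & 0xFFFFFF          # drop alpha, keep R:G:B
    shape = [(mask >> pos) & 1 for pos in SHAPE_BIT_POSITIONS]
    is_ytype = (rgba & 0x80000000) != 0    # top bit of the red channel
    return shape, is_ytype
```

For example, a pixel with R=0x80, G=0x40, B=0x81 and full alpha decodes to shape bits A, H and K set, with Y-type enabled.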
def collect_all_samples(assets_dir):
"""Scan assets_dir for *_variable.tga, collect all labelled samples."""
all_inputs = []
all_labels = []
file_count = 0
total_skipped_lm = 0
for name in sorted(os.listdir(assets_dir)):
if not name.endswith('_variable.tga'):
continue
if 'extrawide' in name:
continue
is_xyswap = 'xyswap' in name
code_range = _CODE_RANGE_MAP.get(name, None)
path = os.path.join(assets_dir, name)
inputs, labels, skipped_lm = collect_from_sheet(path, is_xyswap, code_range)
total_skipped_lm += skipped_lm
if inputs:
suffix = f" (skipped {skipped_lm})" if skipped_lm else ""
print(f" {name}: {len(inputs)} samples{suffix}")
all_inputs.extend(inputs)
all_labels.extend(labels)
file_count += 1
if total_skipped_lm:
print(f" Filtered (Lm/S/P): {total_skipped_lm}")
return np.array(all_inputs), np.array(all_labels, dtype=np.float32), file_count
# ---- Model (matches Autokem/nn.c architecture) ----
def build_model():
"""
Conv2D(1->32, 7x7, padding=1) -> SiLU
Conv2D(32->64, 7x7, padding=1) -> SiLU
GlobalAveragePooling2D -> [64]
Dense(256) -> SiLU
Dense(12) -> sigmoid
"""
import torch
import torch.nn as nn
class Keminet(nn.Module):
def __init__(self):
super().__init__()
self.conv1 = nn.Conv2d(1, 32, 7, padding=1)
self.conv2 = nn.Conv2d(32, 64, 7, padding=1)
self.fc1 = nn.Linear(64, 256)
# self.fc2 = nn.Linear(256, 128)
self.output = nn.Linear(256, 12)
self.tf = nn.SiLU()
# He init
for m in self.modules():
if isinstance(m, (nn.Conv2d, nn.Linear)):
nn.init.kaiming_normal_(m.weight, a=0.01, nonlinearity='leaky_relu')
if m.bias is not None:
nn.init.zeros_(m.bias)
def forward(self, x):
x = self.tf(self.conv1(x))
x = self.tf(self.conv2(x))
x = x.mean(dim=(2, 3)) # global average pool
x = self.tf(self.fc1(x))
# x = self.tf(self.fc2(x))
x = torch.sigmoid(self.output(x))
return x
return Keminet()
# ---- Safetensors export (matches Autokem/safetensor.c layout) ----
def export_safetensors(model, path, total_samples, epochs, val_loss):
"""
Save model weights in safetensors format compatible with the C code.
C code expects these tensor names with these shapes:
conv1.weight [out_ch, in_ch, kh, kw] — PyTorch matches this layout
conv1.bias [out_ch]
conv2.weight [out_ch, in_ch, kh, kw]
conv2.bias [out_ch]
fc1.weight [out_features, in_features] — PyTorch matches this layout
fc1.bias [out_features]
fc2.weight [out_features, in_features] (not exported while fc2 is commented out)
fc2.bias [out_features] (not exported while fc2 is commented out)
output.weight [out_features, in_features]
output.bias [out_features]
"""
tensor_names = [
'conv1.weight', 'conv1.bias',
'conv2.weight', 'conv2.bias',
'fc1.weight', 'fc1.bias',
# 'fc2.weight', 'fc2.bias',
'output.weight', 'output.bias',
]
state = model.state_dict()
header = {}
header['__metadata__'] = {
'samples': str(total_samples),
'epochs': str(epochs),
'val_loss': f'{val_loss:.6f}',
}
data_parts = []
offset = 0
for name in tensor_names:
arr = state[name].detach().cpu().numpy().astype(np.float32)
raw = arr.tobytes()
header[name] = {
'dtype': 'F32',
'shape': list(arr.shape),
'data_offsets': [offset, offset + len(raw)],
}
data_parts.append(raw)
offset += len(raw)
header_json = json.dumps(header, separators=(',', ':')).encode('utf-8')
padded_len = (len(header_json) + 7) & ~7
header_json = header_json + b' ' * (padded_len - len(header_json))
with open(path, 'wb') as f:
f.write(struct.pack('<Q', len(header_json)))
f.write(header_json)
for part in data_parts:
f.write(part)
total_bytes = 8 + len(header_json) + offset
print(f"Saved model to {path} ({total_bytes} bytes)")
def load_safetensors(model, path):
"""Load weights from safetensors file into the PyTorch model."""
import torch
with open(path, 'rb') as f:
header_len = struct.unpack('<Q', f.read(8))[0]
header_json = f.read(header_len)
header = json.loads(header_json)
data_start = 8 + header_len
state = model.state_dict()
for name in state:
if name not in header:
print(f" Warning: tensor '{name}' not in safetensors")
continue
entry = header[name]
off_start, off_end = entry['data_offsets']
f.seek(data_start + off_start)
raw = f.read(off_end - off_start)
arr = np.frombuffer(raw, dtype=np.float32).reshape(entry['shape'])
state[name] = torch.from_numpy(arr.copy())
model.load_state_dict(state)
print(f"Loaded weights from {path}")
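
Note on the container layout used by `export_safetensors` and `load_safetensors`: an 8-byte little-endian header length, a JSON header space-padded to a multiple of 8 bytes, then the raw tensor bytes addressed by `data_offsets` relative to the end of the header. The sketch below round-trips that layout using only the standard library. `write_container` and `read_tensor_bytes` are hypothetical helper names, not functions from the repository or from the safetensors package.

```python
import io
import json
import struct


def write_container(stream, header, payload):
    """Write length prefix, padded JSON header, then the raw payload."""
    blob = json.dumps(header, separators=(',', ':')).encode('utf-8')
    blob += b' ' * (-len(blob) % 8)          # pad header to a multiple of 8
    stream.write(struct.pack('<Q', len(blob)))
    stream.write(blob)
    stream.write(payload)


def read_tensor_bytes(stream, name):
    """Return (shape, raw_bytes) for one named tensor."""
    (header_len,) = struct.unpack('<Q', stream.read(8))
    header = json.loads(stream.read(header_len))   # trailing spaces are valid JSON whitespace
    start, end = header[name]['data_offsets']
    stream.seek(8 + header_len + start)            # offsets are relative to the data section
    return header[name]['shape'], stream.read(end - start)
```

A quick check is to write three floats under one tensor name into an `io.BytesIO`, rewind, and read them back with `struct.unpack('<3f', raw)`.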
# ---- Pretty-print helpers ----
BIT_NAMES = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'J', 'K', 'Ytype', 'LowH']
SHAPE_CHARS = 'ABCDEFGHJK'
MIRROR_PAIRS = [(0, 1), (2, 3), (4, 5), (6, 7), (8, 9)] # A↔B, C↔D, E↔F, G↔H, J↔K
def format_tag(bits_12):
"""Format 12 binary bits as keming_machine tag string, e.g. 'ABCDEFGH(B)'."""
chars = ''.join(SHAPE_CHARS[i] for i in range(10) if bits_12[i])
if not chars:
chars = '(empty)'
mode = '(Y)' if bits_12[10] else '(B)'
low = ' low' if bits_12[11] else ''
return f'{chars}{mode}{low}'
def print_label_distribution(labels, total):
counts = labels.sum(axis=0).astype(int)
parts = [f'{BIT_NAMES[b]}:{counts[b]}({100*counts[b]/total:.0f}%)' for b in range(12)]
print(f"Label distribution:\n {' '.join(parts)}")
def print_examples_and_accuracy(model, X_val, y_val, max_examples=8):
"""Print example predictions and per-bit accuracy on validation set."""
import torch
model.eval()
with torch.no_grad():
preds = model(X_val).cpu().numpy()
y_np = y_val.cpu().numpy() if hasattr(y_val, 'cpu') else y_val
pred_bits = (preds >= 0.5).astype(int)
tgt_bits = y_np.astype(int)
n_val = len(y_np)
n_examples = 0
print("\nGlyph Tags — validation predictions:")
for i in range(n_val):
mismatch = not np.array_equal(pred_bits[i], tgt_bits[i])
if n_examples < max_examples and (mismatch or i < 4):
actual_tag = format_tag(tgt_bits[i])
pred_tag = format_tag(pred_bits[i])
status = 'MISMATCH' if mismatch else 'ok'
print(f" actual={actual_tag:<20s} pred={pred_tag:<20s} {status}")
n_examples += 1
correct = (pred_bits == tgt_bits)
per_bit = correct.sum(axis=0)
total_correct = correct.sum()
print(f"\nPer-bit accuracy ({n_val} val samples):")
parts = [f'{BIT_NAMES[b]}:{100*per_bit[b]/n_val:.1f}%' for b in range(12)]
print(f" {' '.join(parts)}")
print(f" Overall: {total_correct}/{n_val*12} ({100*total_correct/(n_val*12):.2f}%)")
# ---- Data augmentation ----
def _shape_key(label):
"""10-bit shape tuple from label (A through K)."""
return tuple(int(label[i]) for i in range(10))
def _mirror_shape(key):
"""Swap mirror pairs: A↔B, C↔D, E↔F, G↔H, J↔K."""
m = list(key)
for a, b in MIRROR_PAIRS:
m[a], m[b] = m[b], m[a]
return tuple(m)
def _mirror_label(label):
"""Mirror shape bits in label, keep ytype and lowheight."""
m = label.copy()
for a, b in MIRROR_PAIRS:
m[a], m[b] = m[b], m[a]
return m
def _shift_image(img, dx, dy):
"""Shift 2D image by (dx, dy), fill with 0."""
h, w = img.shape
shifted = np.zeros_like(img)
sx0, sx1 = max(0, -dx), min(w, w - dx)
sy0, sy1 = max(0, -dy), min(h, h - dy)
dx0, dx1 = max(0, dx), min(w, w + dx)
dy0, dy1 = max(0, dy), min(h, h + dy)
shifted[dy0:dy1, dx0:dx1] = img[sy0:sy1, sx0:sx1]
return shifted
def _augment_one(img, label, rng):
"""One augmented copy: random 1px shift (the 1% pixel dropout is currently commented out)."""
dx = rng.integers(-1, 2) # -1, 0, or 1
dy = rng.integers(-1, 2) # -1, 0, or 1
aug = _shift_image(img, dx, dy)
# mask = rng.random(aug.shape) > 0.01
# aug = aug * mask
return aug, label.copy()
def _do_mirror_augmentation(X, y, rng):
"""For each mirror pair (S, mirror(S)), fill deficit from the common side."""
shape_counts = {}
shape_indices = {}
for i in range(len(y)):
key = _shape_key(y[i])
shape_counts[key] = shape_counts.get(key, 0) + 1
shape_indices.setdefault(key, []).append(i)
new_X, new_y = [], []
done = set() # avoid processing both directions
for key, count in shape_counts.items():
if key in done:
continue
mkey = _mirror_shape(key)
done.add(key)
done.add(mkey)
if mkey == key:
continue # symmetric shape
mcount = shape_counts.get(mkey, 0)
if count == mcount:
continue
# Mirror from the larger side to fill the smaller side
if count > mcount:
src_key, deficit = key, count - mcount
else:
src_key, deficit = mkey, mcount - count
indices = shape_indices.get(src_key, [])
if not indices:
continue
chosen = rng.choice(indices, size=deficit, replace=True)
for idx in chosen:
new_X.append(np.fliplr(X[idx]).copy())
new_y.append(_mirror_label(y[idx]))
if new_X:
X = np.concatenate([X, np.array(new_X)])
y = np.concatenate([y, np.array(new_y)])
return X, y
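
Note on `_do_mirror_augmentation`: for every shape key whose mirrored twin is rarer, the function draws `count - mcount` samples from the common side and flips them. The sketch below isolates just the deficit bookkeeping with plain tuples. `mirror_key` and `mirror_deficits` are hypothetical names used for illustration, and the pair table is copied from `MIRROR_PAIRS` in the script.

```python
from collections import Counter

MIRROR_PAIRS = [(0, 1), (2, 3), (4, 5), (6, 7), (8, 9)]  # A-B, C-D, E-F, G-H, J-K


def mirror_key(key):
    """Swap each left/right pair of shape bits."""
    m = list(key)
    for a, b in MIRROR_PAIRS:
        m[a], m[b] = m[b], m[a]
    return tuple(m)


def mirror_deficits(keys):
    """Map each over-represented shape key to the number of mirrored copies needed.

    Symmetric shapes (key == mirror_key(key)) and balanced pairs have a gap
    of zero and are left out.
    """
    counts = Counter(keys)
    deficits = {}
    for key, n in counts.items():
        gap = n - counts.get(mirror_key(key), 0)
        if gap > 0:
            deficits[key] = gap
    return deficits
```

With five samples of a shape and two of its mirror image, the result asks for three flipped copies of the common shape, which matches the `deficit` computed in the function above.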
def _compute_rarity_weights(y):
"""Per-sample weight: sum of inverse bit frequencies for all 12 bits.
Samples with rare bit values (e.g. J=1 at 13%, C=0 at 8%) get higher weight.
"""
bit_freq = y.mean(axis=0) # [12], P(bit=1)
weights = np.zeros(len(y))
for i in range(len(y)):
w = 0.0
for b in range(12):
p = bit_freq[b] if y[i, b] > 0.5 else (1.0 - bit_freq[b])
w += 1.0 / max(p, 0.01)
weights[i] = w
return weights
def _do_rarity_augmentation(X, y, rng, target_new):
"""Create target_new augmented samples, drawn proportionally to rarity weight."""
if target_new <= 0:
return X, y
weights = _compute_rarity_weights(y)
weights /= weights.sum()
chosen = rng.choice(len(X), size=target_new, replace=True, p=weights)
new_X, new_y = [], []
for idx in chosen:
aug_img, aug_label = _augment_one(X[idx], y[idx], rng)
new_X.append(aug_img)
new_y.append(aug_label)
X = np.concatenate([X, np.array(new_X)])
y = np.concatenate([y, np.array(new_y)])
return X, y
def _print_bit_freq(y, label):
"""Print per-bit frequencies for diagnostics."""
freq = y.mean(axis=0)
names = BIT_NAMES
parts = [f'{names[b]}:{freq[b]*100:.0f}%' for b in range(12)]
print(f" {label}: {' '.join(parts)}")
def augment_training_data(X_train, y_train, rng, aug_factor=3.0):
"""
Three-phase data augmentation:
1. Mirror augmentation — fill deficit between mirror-paired shapes
2. Rarity-weighted — samples with rare bit values get more copies (shift+dropout)
3. Y-type boost — repeat phases 1-2 scoped to Y-type samples only
"""
n0 = len(X_train)
_print_bit_freq(y_train, 'Before')
# Phase 1: Mirror augmentation
X_train, y_train = _do_mirror_augmentation(X_train, y_train, rng)
n1 = len(X_train)
# Phase 2: Rarity-weighted augmentation — target aug_factor × original size
target_new = int(n0 * aug_factor) - n1
X_train, y_train = _do_rarity_augmentation(X_train, y_train, rng, target_new)
n2 = len(X_train)
# Phase 3: Y-type boost — same pipeline for Y-type subset only
ytype_mask = y_train[:, 10] > 0.5
n_ytype_existing = int(ytype_mask.sum())
if n_ytype_existing > 0:
X_yt = X_train[ytype_mask]
y_yt = y_train[ytype_mask]
X_yt, y_yt = _do_mirror_augmentation(X_yt, y_yt, rng)
# Double the Y-type subset via rarity augmentation
yt_new = n_ytype_existing
X_yt, y_yt = _do_rarity_augmentation(X_yt, y_yt, rng, yt_new)
if len(X_yt) > n_ytype_existing:
X_train = np.concatenate([X_train, X_yt[n_ytype_existing:]])
y_train = np.concatenate([y_train, y_yt[n_ytype_existing:]])
n3 = len(X_train)
_print_bit_freq(y_train, 'After ')
print(f"Data augmentation: {n0} -> {n3} samples ({n3/n0:.1f}×)")
print(f" Mirror: +{n1 - n0}, Rarity: +{n2 - n1}, Y-type boost: +{n3 - n2}")
return X_train, y_train
# ---- Main ----
def main():
parser = argparse.ArgumentParser(description='Train Autokem model (PyTorch)')
parser.add_argument('--assets', default='../src/assets',
help='Path to assets directory (default: ../src/assets)')
parser.add_argument('--save', default='autokem.safetensors',
help='Output safetensors path (default: autokem.safetensors)')
parser.add_argument('--load', default=None,
help='Load weights from safetensors before training')
parser.add_argument('--epochs', type=int, default=200, help='Max epochs (default: 200)')
parser.add_argument('--batch-size', type=int, default=32, help='Batch size (default: 32)')
parser.add_argument('--lr', type=float, default=0.001, help='Learning rate (default: 0.001)')
parser.add_argument('--patience', type=int, default=10,
help='Early stopping patience (default: 10)')
parser.add_argument('--val-split', type=float, default=0.2,
help='Validation split (default: 0.2)')
parser.add_argument('--no-augment', action='store_true',
help='Disable data augmentation')
parser.add_argument('--aug-factor', type=float, default=3.0,
help='Augmentation target multiplier (default: 3.0)')
args = parser.parse_args()
import torch
import torch.nn as nn
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f"Device: {device}")
# Collect data
print("Collecting samples...")
X, y, file_count = collect_all_samples(args.assets)
if len(X) < 10:
print(f"Error: too few samples ({len(X)})", file=sys.stderr)
return 1
total = len(X)
print(f"Collected {total} samples from {file_count} sheets")
print_label_distribution(y, total)
nonzero = np.any(X.reshape(total, -1) > 0.5, axis=1).sum()
print(f" Non-empty inputs: {nonzero}/{total}\n")
# Shuffle and split
rng = np.random.default_rng(42)
perm = rng.permutation(total)
X, y = X[perm], y[perm]
n_val = int(total * args.val_split)
n_train = total - n_val
X_train, X_val = X[:n_train], X[n_train:]
y_train, y_val = y[:n_train], y[n_train:]
print(f"Train: {n_train}, Validation: {n_val}")
# Data augmentation (training set only)
if not args.no_augment:
X_train, y_train = augment_training_data(X_train, y_train, rng, args.aug_factor)
n_train = len(X_train)
print()
# Convert to tensors — PyTorch conv expects [N, C, H, W]
X_train_t = torch.from_numpy(X_train[:, np.newaxis, :, :]).to(device) # [N,1,20,15]
y_train_t = torch.from_numpy(y_train).to(device)
X_val_t = torch.from_numpy(X_val[:, np.newaxis, :, :]).to(device)
y_val_t = torch.from_numpy(y_val).to(device)
# Build model
model = build_model().to(device)
if args.load:
load_safetensors(model, args.load)
total_params = sum(p.numel() for p in model.parameters())
print(f"Model parameters: {total_params} ({total_params * 4 / 1024:.1f} KB)\n")
optimizer = torch.optim.Adam(model.parameters(), lr=args.lr)
loss_fn = nn.BCELoss()
best_val_loss = float('inf')
best_epoch = 0
patience_counter = 0
best_state = None
for epoch in range(1, args.epochs + 1):
# Training
model.train()
perm_train = torch.randperm(n_train, device=device)
train_loss = 0.0
n_batches = 0
for start in range(0, n_train, args.batch_size):
end = min(start + args.batch_size, n_train)
idx = perm_train[start:end]
optimizer.zero_grad()
pred = model(X_train_t[idx])
loss = loss_fn(pred, y_train_t[idx])
loss.backward()
optimizer.step()
train_loss += loss.item()
n_batches += 1
train_loss /= n_batches
# Validation
model.eval()
with torch.no_grad():
val_pred = model(X_val_t)
val_loss = loss_fn(val_pred, y_val_t).item()
marker = ''
if val_loss < best_val_loss:
best_val_loss = val_loss
best_epoch = epoch
patience_counter = 0
best_state = {k: v.clone() for k, v in model.state_dict().items()}
marker = ' *best*'
else:
patience_counter += 1
print(f"Epoch {epoch:3d}: train_loss={train_loss:.4f} val_loss={val_loss:.4f}{marker}")
if patience_counter >= args.patience:
print(f"\nEarly stopping at epoch {epoch} (best epoch: {best_epoch})")
break
# Restore best weights
if best_state is not None:
model.load_state_dict(best_state)
print(f"\nBest epoch: {best_epoch}, val_loss: {best_val_loss:.6f}")
# Print accuracy
model.eval()
print_examples_and_accuracy(model, X_val_t, y_val, max_examples=8)
# Save
export_safetensors(model, args.save, total, best_epoch, best_val_loss)
return 0
if __name__ == '__main__':
sys.exit(main())

191
Autokem/unicode_filter.h Normal file

@@ -0,0 +1,191 @@
#ifndef UNICODE_FILTER_H
#define UNICODE_FILTER_H
#include <string.h>
/*
* Unicode category filters for training/apply.
* Generated from Python unicodedata (Unicode 16.0).
*
* is_modifier_letter(cp) — category Lm
* is_subscript_modifier(cp) — Lm with <sub> decomposition
* is_symbol_or_punctuation(cp) — categories S* or P*
* is_excluded_from_training(cp) — Lm or S* or P*
*/
/* ---- Lm (modifier letter) ---- */
static inline int is_modifier_letter(int cp) {
if (cp >= 0x02B0 && cp <= 0x02C1) return 1;
if (cp >= 0x02C6 && cp <= 0x02D1) return 1;
if (cp >= 0x02E0 && cp <= 0x02E4) return 1;
if (cp == 0x02EC) return 1;
if (cp == 0x02EE) return 1;
if (cp == 0x0374) return 1;
if (cp == 0x037A) return 1;
if (cp == 0x0559) return 1;
if (cp == 0x0640) return 1;
if (cp >= 0x06E5 && cp <= 0x06E6) return 1;
if (cp >= 0x07F4 && cp <= 0x07F5) return 1;
if (cp == 0x07FA) return 1;
if (cp == 0x081A) return 1;
if (cp == 0x0824) return 1;
if (cp == 0x0828) return 1;
if (cp == 0x08C9) return 1;
if (cp == 0x0971) return 1;
if (cp == 0x0E46) return 1;
if (cp == 0x0EC6) return 1;
if (cp == 0x10FC) return 1;
if (cp == 0x17D7) return 1;
if (cp == 0x1843) return 1;
if (cp == 0x1AA7) return 1;
if (cp >= 0x1C78 && cp <= 0x1C7D) return 1;
if (cp >= 0x1D2C && cp <= 0x1D6A) return 1;
if (cp == 0x1D78) return 1;
if (cp >= 0x1D9B && cp <= 0x1DBF) return 1;
if (cp == 0x2071) return 1;
if (cp == 0x207F) return 1;
if (cp >= 0x2090 && cp <= 0x209C) return 1;
if (cp >= 0x2C7C && cp <= 0x2C7D) return 1;
if (cp == 0x2D6F) return 1;
if (cp == 0x2E2F) return 1;
if (cp == 0x3005) return 1;
if (cp >= 0x3031 && cp <= 0x3035) return 1;
if (cp == 0x303B) return 1;
if (cp >= 0x309D && cp <= 0x309E) return 1;
if (cp >= 0x30FC && cp <= 0x30FE) return 1;
if (cp == 0xA015) return 1;
if (cp >= 0xA4F8 && cp <= 0xA4FD) return 1;
if (cp == 0xA60C) return 1;
if (cp == 0xA67F) return 1;
if (cp >= 0xA69C && cp <= 0xA69D) return 1;
if (cp >= 0xA717 && cp <= 0xA71F) return 1;
if (cp == 0xA770) return 1;
if (cp == 0xA788) return 1;
if (cp >= 0xA7F2 && cp <= 0xA7F4) return 1;
if (cp >= 0xA7F8 && cp <= 0xA7F9) return 1;
if (cp == 0xA9CF) return 1;
if (cp == 0xA9E6) return 1;
if (cp == 0xAA70) return 1;
if (cp == 0xAADD) return 1;
if (cp >= 0xAAF3 && cp <= 0xAAF4) return 1;
if (cp >= 0xAB5C && cp <= 0xAB5F) return 1;
if (cp == 0xAB69) return 1;
if (cp == 0xFF70) return 1;
if (cp >= 0xFF9E && cp <= 0xFF9F) return 1;
if (cp >= 0x10780 && cp <= 0x10785) return 1;
if (cp >= 0x10787 && cp <= 0x107B0) return 1;
if (cp >= 0x107B2 && cp <= 0x107BA) return 1;
if (cp >= 0x16B40 && cp <= 0x16B43) return 1;
if (cp >= 0x16F93 && cp <= 0x16F9F) return 1;
if (cp >= 0x16FE0 && cp <= 0x16FE1) return 1;
if (cp == 0x16FE3) return 1;
if (cp >= 0x1AFF0 && cp <= 0x1AFF3) return 1;
if (cp >= 0x1AFF5 && cp <= 0x1AFFB) return 1;
if (cp >= 0x1AFFD && cp <= 0x1AFFE) return 1;
if (cp >= 0x1E030 && cp <= 0x1E06D) return 1;
if (cp >= 0x1E137 && cp <= 0x1E13D) return 1;
if (cp == 0x1E4EB) return 1;
if (cp == 0x1E94B) return 1;
return 0;
}
static inline int is_subscript_modifier(int cp) {
if (cp >= 0x1D62 && cp <= 0x1D6A) return 1;
if (cp >= 0x2090 && cp <= 0x209C) return 1;
if (cp == 0x2C7C) return 1;
if (cp >= 0x1E051 && cp <= 0x1E06A) return 1;
return 0;
}
/* ---- S* (Symbol) and P* (Punctuation) ---- */
/* Table of {start, end} ranges for S/P codepoints in font sheets */
static const int sp_ranges[][2] = {
{0x00021, 0x0002F}, {0x0003A, 0x00040}, {0x0005B, 0x00060},
{0x0007B, 0x0007E}, {0x000A1, 0x000A9}, {0x000AB, 0x000AC},
{0x000AE, 0x000B1}, {0x000B4, 0x000B4}, {0x000B6, 0x000B8},
{0x000BB, 0x000BB}, {0x000BF, 0x000BF}, {0x000D7, 0x000D7},
{0x000F7, 0x000F7}, {0x002C2, 0x002C5}, {0x002D2, 0x002DF},
{0x002E5, 0x002EB}, {0x002ED, 0x002ED}, {0x002EF, 0x002FF},
{0x00375, 0x00375}, {0x0037E, 0x0037E}, {0x00384, 0x00385},
{0x00387, 0x00387}, {0x00482, 0x00482}, {0x0055A, 0x0055F},
{0x00589, 0x0058A}, {0x0058D, 0x0058F}, {0x00964, 0x00965},
{0x00970, 0x00970}, {0x009F2, 0x009F3}, {0x009FA, 0x009FB},
{0x009FD, 0x009FD}, {0x00BF3, 0x00BFA}, {0x00E3F, 0x00E3F},
{0x00E4F, 0x00E4F}, {0x00E5A, 0x00E5B}, {0x010FB, 0x010FB},
{0x016EB, 0x016ED}, {0x01CC0, 0x01CC7}, {0x01FBD, 0x01FBD},
{0x01FBF, 0x01FC1}, {0x01FCD, 0x01FCF}, {0x01FDD, 0x01FDF},
{0x01FED, 0x01FEF}, {0x01FFD, 0x01FFE}, {0x02010, 0x02027},
{0x02030, 0x0205E}, {0x0207A, 0x0207E}, {0x0208A, 0x0208E},
{0x020A0, 0x020C0}, {0x02100, 0x02101}, {0x02103, 0x02106},
{0x02108, 0x02109}, {0x02114, 0x02114}, {0x02116, 0x02118},
{0x0211E, 0x02123}, {0x02125, 0x02125}, {0x02127, 0x02127},
{0x02129, 0x02129}, {0x0212E, 0x0212E}, {0x0213A, 0x0213B},
{0x02140, 0x02144}, {0x0214A, 0x0214D}, {0x0214F, 0x0214F},
{0x0218A, 0x0218B}, {0x02190, 0x021FF}, {0x02400, 0x02426},
{0x02800, 0x028FF}, {0x03001, 0x03004}, {0x03008, 0x03020},
{0x03030, 0x03030}, {0x03036, 0x03037}, {0x0303D, 0x0303F},
{0x0309B, 0x0309C}, {0x030A0, 0x030A0}, {0x030FB, 0x030FB},
{0x04DC0, 0x04DFF}, {0x0A673, 0x0A673}, {0x0A67E, 0x0A67E},
{0x0A720, 0x0A721}, {0x0A789, 0x0A78A}, {0x0AB5B, 0x0AB5B},
{0x0AB6A, 0x0AB6B}, {0x0FF01, 0x0FF0F}, {0x0FF1A, 0x0FF20},
{0x0FF3B, 0x0FF40}, {0x0FF5B, 0x0FF65}, {0x0FFE0, 0x0FFE6},
{0x0FFE8, 0x0FFEE}, {0x0FFFC, 0x0FFFD}, {0x1F10D, 0x1F1AD},
{0x1F1E6, 0x1F1FF}, {0x1FB00, 0x1FB92}, {0x1FB94, 0x1FBCA},
};
static inline int is_symbol_or_punctuation(int cp) {
int n = (int)(sizeof(sp_ranges) / sizeof(sp_ranges[0]));
for (int i = 0; i < n; i++) {
if (cp >= sp_ranges[i][0] && cp <= sp_ranges[i][1])
return 1;
}
return 0;
}
/* ---- Combined filter for training exclusion ---- */
static inline int is_excluded_from_training(int cp) {
return is_modifier_letter(cp) || is_symbol_or_punctuation(cp);
}
/* ---- Sheet filename → start codepoint ---- */
static int sheet_start_code(const char *basename) {
if (strstr(basename, "ascii_variable")) return 0x00;
if (strstr(basename, "latinExtA_variable")) return 0x100;
if (strstr(basename, "latinExtB_variable")) return 0x180;
if (strstr(basename, "cyrilic_extC_variable")) return 0x1C80;
if (strstr(basename, "cyrilic_extB_variable")) return 0xA640;
if (strstr(basename, "cyrilic_bulgarian_variable")) return 0xF0000;
if (strstr(basename, "cyrilic_serbian_variable")) return 0xF0060;
if (strstr(basename, "cyrilic_variable")) return 0x400;
if (strstr(basename, "halfwidth_fullwidth_variable")) return 0xFF00;
if (strstr(basename, "unipunct_variable")) return 0x2000;
if (strstr(basename, "greek_polytonic")) return 0x1F00;
if (strstr(basename, "greek_variable")) return 0x370;
if (strstr(basename, "thai_variable")) return 0xE00;
if (strstr(basename, "hayeren_variable")) return 0x530;
if (strstr(basename, "kartuli_allcaps_variable")) return 0x1C90;
if (strstr(basename, "kartuli_variable")) return 0x10D0;
if (strstr(basename, "ipa_ext_variable")) return 0x250;
if (strstr(basename, "latinExt_additional_variable")) return 0x1E00;
if (strstr(basename, "tsalagi_variable")) return 0x13A0;
if (strstr(basename, "phonetic_extensions_variable")) return 0x1D00;
if (strstr(basename, "latinExtC_variable")) return 0x2C60;
if (strstr(basename, "latinExtD_variable")) return 0xA720;
if (strstr(basename, "internal_variable")) return 0xFFE00;
if (strstr(basename, "letterlike_symbols_variable")) return 0x2100;
if (strstr(basename, "enclosed_alphanumeric")) return 0x1F100;
if (strstr(basename, "sundanese_variable")) return 0x1B80;
if (strstr(basename, "control_pictures_variable")) return 0x2400;
if (strstr(basename, "latinExtE_variable")) return 0xAB30;
if (strstr(basename, "latinExtF_variable")) return 0x10780;
if (strstr(basename, "latinExtG_variable")) return 0x1DF00;
if (strstr(basename, "devanagari") && !strstr(basename, "internal"))
return 0x900;
return -1;
}
#endif /* UNICODE_FILTER_H */
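
Note on `unicode_filter.h`: the header states that its tables were generated from Python's `unicodedata`. The generator itself is not part of this diff, so the following is only a sketch of how such `{start, end}` ranges could be derived. `category_ranges` is a hypothetical helper, and results depend on the Unicode version bundled with the running Python.

```python
import unicodedata


def category_ranges(predicate, first, last):
    """Coalesce consecutive code points whose category satisfies predicate.

    Returns a list of inclusive (start, end) tuples within [first, last].
    """
    ranges = []
    start = None
    for cp in range(first, last + 1):
        if predicate(unicodedata.category(chr(cp))):
            if start is None:
                start = cp                     # open a new run
        elif start is not None:
            ranges.append((start, cp - 1))     # close the current run
            start = None
    if start is not None:
        ranges.append((start, last))
    return ranges
```

Calling it with `lambda c: c == 'Lm'` over U+02B0..U+02D1 yields the first two `is_modifier_letter` ranges, and `lambda c: c[0] in 'SP'` over U+0021..U+0040 yields the first two `sp_ranges` entries. Note that `sp_ranges` is restricted to code points present in the font sheets, so a full-range scan would produce more entries than the header lists.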


@@ -46,8 +46,8 @@ Rightmost vertical column (should be 20 px tall) contains the tags. Tags are def
  W |= Width of the character
  W |
  W -'
+ m --Is this character lowheight?
  K -,
- K |
  K |= Tags used by the "Keming Machine"
  K -'
  Q ---Compiler Directive (see below)
@@ -77,29 +77,32 @@ Up&Down:
  <MSB,Red> SXXXXXXX SYYYYYYY 00000000 <LSB,Blue>
- Each X and Y numbers are Signed 8-Bit Integer.
+ Each X and Y numbers are TWO'S COMPLEMENT Signed 8-Bit Integer.
  X-positive: nudges towards left
- Y-positive: nudges towards up
+ Y-positive: nudges towards down
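
Note on the revised nudge encoding: the red byte carries X and the green byte carries Y, each as a two's complement signed 8-bit integer. A minimal decoding sketch follows, assuming the channels are already available as integers in 0..255. `to_signed8` and `decode_nudge` are hypothetical names, not functions from the font program.

```python
def to_signed8(byte):
    """Interpret an unsigned byte (0..255) as two's complement (-128..127)."""
    return byte - 256 if byte & 0x80 else byte


def decode_nudge(red, green):
    """Return (nudge_x, nudge_y) from the red and green channel bytes.

    Per the revised text, positive X nudges towards the left and
    positive Y nudges towards down.
    """
    return to_signed8(red), to_signed8(green)
```

Under this reading a red byte of 0xFF means X = -1 (one pixel to the right), and 0x80 is the most negative value, -128.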
  #### Diacritics Anchor Point Encoding
  4 Pixels are further divided as follows:
  | LSB | | Red | Green | Blue |
- | ------------ | ------------ | ------------ | ------------ | ------------ |
+ | ------------ | ------------ | ------------ | ----------- | ------------ |
  | Y | Anchor point Y for: | undefined | undefined | undefined |
  | X | Anchor point X for: | undefined | undefined | undefined |
- | Y | Anchor point Y for: | (unused) | (unused) | (unused) |
+ | Y | Anchor point Y for: | Type-0 | Type-1 | Type-2 |
  | X | Anchor point X for: | Type-0 | Type-1 | Type-2 |
  | **MSB** | | | | |
- <MSB,Red> 1Y1Y1Y1Y 1Y2Y2Y2Y 1Y3Y3Y3Y <LSB,Blue>
- <MSB,Red> 1X1X1X1X 1X2X2X2X 1X3X3X3X <LSB,Blue>
+ <MSB,Red> 1Y1Y1Y1Y 2Y2Y2Y2Y 3Y3Y3Y3Y <LSB,Blue>
+ <MSB,Red> 1X1X1X1X 2X2X2X2X 3X3X3X3X <LSB,Blue>
  where Red is first, Green is second, Blue is the third diacritics.
- MSB for each word must be set so that the pixel would appear brighter on the image editor.
- (the font program will only read low 7 bits for each RGB channel)
+ Each X and Y numbers are SIGN AND MAGNITUDE 8-Bit Integer.
+ X-positive: nudges towards left
+ Y-positive: nudges towards down
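
Note on the revised anchor encoding: each channel byte is now a sign-and-magnitude value rather than a 7-bit value with a forced high bit. The sketch below assumes the usual convention that the top bit set means negative and the low 7 bits hold the magnitude. `from_sign_magnitude8` and `decode_anchor_pixel` are hypothetical names.

```python
def from_sign_magnitude8(byte):
    """Decode an 8-bit sign-and-magnitude value (range -127..127).

    Assumption: bit 7 set means negative. Both 0x00 and 0x80 decode to 0.
    """
    magnitude = byte & 0x7F
    return -magnitude if byte & 0x80 else magnitude


def decode_anchor_pixel(red, green, blue):
    """Return one coordinate for Type-0, Type-1 and Type-2 diacritics."""
    return tuple(from_sign_magnitude8(c) for c in (red, green, blue))
```

The difference from the two's complement nudge bytes matters in practice: 0xFF is -127 here, whereas it would be -1 under two's complement.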
  #### Diacritics Type Bit Encoding

Binary file not shown.


@@ -28,7 +28,7 @@ from keming_machine import generate_kerning_pairs
from opentype_features import generate_features, glyph_name from opentype_features import generate_features, glyph_name
import sheet_config as SC import sheet_config as SC
FONT_VERSION = "1.15" FONT_VERSION = "1.16"
# Codepoints that get cmap entries (user-visible) # Codepoints that get cmap entries (user-visible)
# PUA forms used internally by GSUB get glyphs but NO cmap entries # PUA forms used internally by GSUB get glyphs but NO cmap entries
@@ -332,6 +332,7 @@ def build_font(assets_dir, output_path, no_bitmap=False, no_features=False):
charstrings[".notdef"] = pen.getCharString() charstrings[".notdef"] = pen.getCharString()
_unihan_cps = set(SC.CODE_RANGE[SC.SHEET_UNIHAN]) _unihan_cps = set(SC.CODE_RANGE[SC.SHEET_UNIHAN])
_emoji1_cps = set(SC.CODE_RANGE[SC.SHEET_EMOJI1])
_base_offsets = {} # glyph_name -> (x_offset, y_offset) for COLR layers _base_offsets = {} # glyph_name -> (x_offset, y_offset) for COLR layers
traced_count = 0 traced_count = 0
@@ -370,10 +371,9 @@ def build_font(assets_dir, output_path, no_bitmap=False, no_features=False):
x_offset = 0 x_offset = 0
x_offset -= g.props.nudge_x * SCALE x_offset -= g.props.nudge_x * SCALE
# For STACK_DOWN marks (below-base diacritics), negative nudge_y # For marks (write_on_top >= 0), positive nudge_y means shift UP
# means "shift content down to below baseline". The sign convention # in the bitmap engine (opposite to non-marks where positive = down).
# is opposite to non-marks where positive nudge_y means shift down. if g.props.write_on_top >= 0:
if g.props.stack_where == SC.STACK_DOWN and g.props.write_on_top >= 0:
y_offset = g.props.nudge_y * SCALE y_offset = g.props.nudge_y * SCALE
else: else:
y_offset = -g.props.nudge_y * SCALE y_offset = -g.props.nudge_y * SCALE
@@ -383,6 +383,10 @@ def build_font(assets_dir, output_path, no_bitmap=False, no_features=False):
if cp in _unihan_cps: if cp in _unihan_cps:
y_offset -= ((SC.H - SC.H_UNIHAN) // 2) * SCALE y_offset -= ((SC.H - SC.H_UNIHAN) // 2) * SCALE
# Emoji1 glyphs are 16px tall in a 20px cell; same 2px top/bottom padding.
if cp in _emoji1_cps:
y_offset -= ((SC.H - SC.H_EMOJI1) // 2) * SCALE
# Hangul jungseong/jongseong PUA variants (rows 15-18) have zero # Hangul jungseong/jongseong PUA variants (rows 15-18) have zero
# advance and overlay the preceding choseong. Shift their outlines # advance and overlay the preceding choseong. Shift their outlines
# left by one syllable cell width so they render at the same position. # left by one syllable cell width so they render at the same position.
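
As a rough illustration of the vertical offset logic in the hunks above, the sketch below combines the mark sign convention with the short-cell centering. `H` and `H_EMOJI1` are the values shown in the sheet configuration; `SCALE = 50` and the function name are assumptions made only for this example.

```python
H = 20          # full cell height, as in sheet_config
H_EMOJI1 = 16   # emoji cell height, as in sheet_config
SCALE = 50      # hypothetical units-per-pixel factor, for illustration only


def glyph_y_offset(nudge_y, is_mark, in_short_cell, cell_h=H_EMOJI1):
    """Return the outline Y offset for one glyph.

    Marks keep the sign of nudge_y (positive shifts up); other glyphs flip it.
    Glyphs drawn in a shorter cell are shifted by half the height difference.
    """
    y = nudge_y * SCALE if is_mark else -nudge_y * SCALE
    if in_short_cell:
        y -= ((H - cell_h) // 2) * SCALE
    return y


print(glyph_y_offset(2, is_mark=True, in_short_cell=False))   # 100
print(glyph_y_offset(2, is_mark=False, in_short_cell=False))  # -100
print(glyph_y_offset(0, is_mark=False, in_short_cell=True))   # -100
```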

View File

@@ -38,6 +38,7 @@ class GlyphProps:
has_kern_data: bool = False has_kern_data: bool = False
is_kern_y_type: bool = False is_kern_y_type: bool = False
kerning_mask: int = 255 kerning_mask: int = 255
dot_removal: Optional[int] = None # codepoint to replace with when followed by a STACK_UP mark
directive_opcode: int = 0 directive_opcode: int = 0
directive_arg1: int = 0 directive_arg1: int = 0
directive_arg2: int = 0 directive_arg2: int = 0
@@ -131,7 +132,8 @@ def parse_variable_sheet(image, sheet_index, cell_w, cell_h, cols, is_xy_swapped
# Kerning data # Kerning data
kerning_bit1 = _tagify(image.get_pixel(code_start_x, code_start_y + 6)) kerning_bit1 = _tagify(image.get_pixel(code_start_x, code_start_y + 6))
# kerning_bit2 and kerning_bit3 are reserved kerning_bit2 = _tagify(image.get_pixel(code_start_x, code_start_y + 7))
dot_removal = None if kerning_bit2 == 0 else (kerning_bit2 >> 8)
is_kern_y_type = (kerning_bit1 & 0x80000000) != 0 is_kern_y_type = (kerning_bit1 & 0x80000000) != 0
kerning_mask = (kerning_bit1 >> 8) & 0xFFFFFF kerning_mask = (kerning_bit1 >> 8) & 0xFFFFFF
has_kern_data = (kerning_bit1 & 0xFF) != 0 has_kern_data = (kerning_bit1 & 0xFF) != 0
@@ -188,7 +190,7 @@ def parse_variable_sheet(image, sheet_index, cell_w, cell_h, cols, is_xy_swapped
align_where=align_where, write_on_top=write_on_top, align_where=align_where, write_on_top=write_on_top,
stack_where=stack_where, ext_info=ext_info, stack_where=stack_where, ext_info=ext_info,
has_kern_data=has_kern_data, is_kern_y_type=is_kern_y_type, has_kern_data=has_kern_data, is_kern_y_type=is_kern_y_type,
kerning_mask=kerning_mask, kerning_mask=kerning_mask, dot_removal=dot_removal,
directive_opcode=directive_opcode, directive_arg1=directive_arg1, directive_opcode=directive_opcode, directive_arg1=directive_arg1,
directive_arg2=directive_arg2, directive_arg2=directive_arg2,
) )
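
The `dot_removal` decoding in the hunk above can be tried in isolation. This is a minimal sketch that assumes the tagged pixel is a 32-bit RGBA integer whose upper 24 bits hold the replacement code point; the exact behaviour of `_tagify` is not shown in the diff, so the sample pixel value is hypothetical.

```python
def decode_dot_removal(tagged_pixel):
    """Return the replacement code point, or None when the pixel is empty."""
    if tagged_pixel == 0:
        return None
    # Drop the low 8 bits (assumed here to be the alpha channel).
    return tagged_pixel >> 8


# Hypothetical pixel for 'i': RGB encodes U+0131 (dotless i), alpha is 0xFF.
pixel = (0x0131 << 8) | 0xFF
print(hex(decode_dot_removal(pixel)))  # 0x131
print(decode_dot_removal(0))           # None
```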

View File

@@ -16,8 +16,8 @@ import sheet_config as SC
# PUA range for Hangul jamo variant storage. # PUA range for Hangul jamo variant storage.
# We need space for: max_col * max_row variants. # We need space for: max_col * max_row variants.
# Using 0xF0600-0xF1E7F # Using 0x100000-0x10187F
HANGUL_PUA_BASE = 0xF0600 HANGUL_PUA_BASE = 0x100000
def _compose_bitmaps(a, b, w, h): def _compose_bitmaps(a, b, w, h):

View File

@@ -70,6 +70,11 @@ languagesystem sund dflt;
if ccmp_code: if ccmp_code:
parts.append(ccmp_code) parts.append(ccmp_code)
# ccmp: dot removal (e.g. i→ı, j→ȷ when followed by STACK_UP marks)
dot_removal_code = _generate_dot_removal(glyphs, has)
if dot_removal_code:
parts.append(dot_removal_code)
# Hangul jamo GSUB assembly # Hangul jamo GSUB assembly
hangul_code = _generate_hangul_gsub(glyphs, has, jamo_data) hangul_code = _generate_hangul_gsub(glyphs, has, jamo_data)
if hangul_code: if hangul_code:
@@ -162,6 +167,54 @@ def _generate_ccmp(replacewith_subs, has):
return '\n'.join(lines) return '\n'.join(lines)
def _generate_dot_removal(glyphs, has):
"""Generate ccmp contextual substitution for dot removal.
When a base glyph tagged with dot_removal (kerning bit 2, pixel Y+7) is
followed by a STACK_UP mark, substitute the base with its dotless form.
Matches the Kotlin engine's dotRemoval logic.
"""
# Collect all STACK_UP marks
stack_up_marks = []
for cp, g in glyphs.items():
if g.props.write_on_top >= 0 and g.props.stack_where == SC.STACK_UP and has(cp):
stack_up_marks.append(cp)
if not stack_up_marks:
return ""
# Collect all base glyphs with dot_removal
dot_removal_subs = []
for cp, g in glyphs.items():
if g.props.dot_removal is not None and has(cp) and has(g.props.dot_removal):
dot_removal_subs.append((cp, g.props.dot_removal))
if not dot_removal_subs:
return ""
lines = []
# Define the STACK_UP marks class
mark_names = ' '.join(glyph_name(cp) for cp in sorted(stack_up_marks))
lines.append(f"@stackUpMarks = [{mark_names}];")
lines.append("")
# Single substitution lookup for the replacements
lines.append("lookup DotRemoval {")
for src_cp, dst_cp in sorted(dot_removal_subs):
lines.append(f" sub {glyph_name(src_cp)} by {glyph_name(dst_cp)};")
lines.append("} DotRemoval;")
lines.append("")
# Contextual rules in ccmp
lines.append("feature ccmp {")
for src_cp, _ in sorted(dot_removal_subs):
lines.append(f" sub {glyph_name(src_cp)}' lookup DotRemoval @stackUpMarks;")
lines.append("} ccmp;")
return '\n'.join(lines)
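
To see what kind of feature code the dot-removal generator above produces, here is a self-contained sketch that mirrors its structure on a tiny input. The local `glyph_name` is a hypothetical stand-in for the imported helper (BMP code points only), and the indentation of the emitted rules is an assumption.

```python
def glyph_name(cp):
    # Hypothetical stand-in for opentype_features.glyph_name (BMP only).
    return f"uni{cp:04X}"


def dot_removal_fea(subs, stack_up_marks):
    """Build a mark class, a single-substitution lookup, and contextual ccmp rules."""
    names = " ".join(glyph_name(cp) for cp in sorted(stack_up_marks))
    lines = [f"@stackUpMarks = [{names}];", "", "lookup DotRemoval {"]
    lines += [f"    sub {glyph_name(s)} by {glyph_name(d)};" for s, d in sorted(subs)]
    lines += ["} DotRemoval;", "", "feature ccmp {"]
    lines += [f"    sub {glyph_name(s)}' lookup DotRemoval @stackUpMarks;"
              for s, _ in sorted(subs)]
    lines.append("} ccmp;")
    return "\n".join(lines)


# i -> dotless i, j -> dotless j, with grave and acute as the stack-up marks
fea = dot_removal_fea([(0x69, 0x131), (0x6A, 0x237)], [0x301, 0x300])
print(fea)
```

The contextual rule only fires when the base is immediately followed by a glyph in `@stackUpMarks`, which is why the plain substitution lives in its own lookup.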
def _generate_hangul_gsub(glyphs, has, jamo_data): def _generate_hangul_gsub(glyphs, has, jamo_data):
""" """
Generate Hangul jamo GSUB lookups for syllable assembly. Generate Hangul jamo GSUB lookups for syllable assembly.
@@ -220,7 +273,7 @@ def _generate_hangul_gsub(glyphs, has, jamo_data):
continue continue
for f in [0, 1]: for f in [0, 1]:
try: try:
row_ng = SC.get_han_initial_row(1, idx, f) row_ng = SC.get_han_initial_row(2, idx, f)
except (ValueError, KeyError): except (ValueError, KeyError):
continue continue
jung_groups_general.setdefault((row_ng, f), []).append(jcp) jung_groups_general.setdefault((row_ng, f), []).append(jcp)
@@ -1750,7 +1803,7 @@ def _generate_mark(glyphs, has):
mark_groups = {} # (mark_type, align, is_dia, stack_cat) -> [(cp, g), ...] mark_groups = {} # (mark_type, align, is_dia, stack_cat) -> [(cp, g), ...]
for cp, g in marks.items(): for cp, g in marks.items():
is_dia = (0x0300 <= cp <= 0x036F) is_dia = True # all marks (write_on_top >= 0) are diacritics; Kotlin applies lowheight shiftdown unconditionally
sc = _stack_cat(g.props.stack_where) sc = _stack_cat(g.props.stack_where)
key = (g.props.write_on_top, g.props.align_where, is_dia, sc) key = (g.props.write_on_top, g.props.align_where, is_dia, sc)
mark_groups.setdefault(key, []).append((cp, g)) mark_groups.setdefault(key, []).append((cp, g))
@@ -1846,7 +1899,9 @@ def _generate_mark(glyphs, has):
# Lowheight adjustment for combining diacritical marks: # Lowheight adjustment for combining diacritical marks:
# shift base anchor Y down so diacritics sit closer to # shift base anchor Y down so diacritics sit closer to
# the shorter base glyph. # the shorter base glyph.
if is_dia and g.props.is_low_height: # Only applies to 'up' marks (and overlay), not 'dn',
# matching Kotlin which only adjusts in STACK_UP/STACK_UP_N_DOWN.
if is_dia and g.props.is_low_height and scat != 'dn':
if mark_type == 2: # overlay if mark_type == 2: # overlay
ay -= SC.H_OVERLAY_LOWERCASE_SHIFTDOWN * SC.SCALE ay -= SC.H_OVERLAY_LOWERCASE_SHIFTDOWN * SC.SCALE
else: # above (type 0) else: # above (type 0)
@@ -1878,12 +1933,13 @@ def _generate_mark(glyphs, has):
lines.append(f"lookup {mkmk_name} {{") lines.append(f"lookup {mkmk_name} {{")
if scat == 'up': if scat == 'up':
m2y = SC.ASCENT + SC.H_DIACRITICS * SC.SCALE m2y_base = SC.ASCENT + SC.H_DIACRITICS * SC.SCALE
else: # 'dn' else: # 'dn'
m2y = SC.ASCENT - SC.H_DIACRITICS * SC.SCALE m2y_base = SC.ASCENT - SC.H_DIACRITICS * SC.SCALE
for cp, g in mark_list: for cp, g in mark_list:
mx = mark_anchors.get(cp, 0) mx = mark_anchors.get(cp, 0)
m2y = m2y_base
lines.append( lines.append(
f" pos mark {glyph_name(cp)}" f" pos mark {glyph_name(cp)}"
f" <anchor {mx} {m2y}> mark {class_name};" f" <anchor {mx} {m2y}> mark {class_name};"
@@ -1893,6 +1949,46 @@ def _generate_mark(glyphs, has):
lines.append("") lines.append("")
mkmk_lookup_names.append(mkmk_name) mkmk_lookup_names.append(mkmk_name)
# --- Nudge-Y gap correction for MarkToMark ---
# Without cascade, the gap between consecutive 'up' marks is
# H_DIACRITICS + nudge_y_2 - nudge_y_1
# which is less than H_DIACRITICS when nudge_y_1 > nudge_y_2
# (e.g. Cyrillic uni2DED nudge=2 followed by uni0487 nudge=0).
# Add contextual positioning to compensate: shift mark2 up by
# (nudge_y_1 - nudge_y_2) * SCALE for each such pair.
# This keeps Thai correct (same nudge on both marks → no correction)
# while fixing Cyrillic (different nudge → correction applied).
nudge_groups = {} # nudge_y -> [glyph_name, ...]
for (mark_type, align, is_dia, scat), mark_list in sorted(mark_groups.items()):
if scat != 'up':
continue
for cp, g in mark_list:
ny = g.props.nudge_y
nudge_groups.setdefault(ny, []).append(glyph_name(cp))
distinct_nudges = sorted(nudge_groups.keys())
correction_pairs = []
for n1 in distinct_nudges:
for n2 in distinct_nudges:
if n1 > n2:
correction_pairs.append((n1, n2, (n1 - n2) * SC.SCALE))
if correction_pairs:
for ny, glyphs in sorted(nudge_groups.items()):
lines.append(f"@up_nudge_{ny} = [{' '.join(sorted(glyphs))}];")
lines.append("")
mkmk_corr_name = "mkmk_nudge_correct"
lines.append(f"lookup {mkmk_corr_name} {{")
for n1, n2, val in correction_pairs:
lines.append(
f" pos @up_nudge_{n1} <0 0 0 0>"
f" @up_nudge_{n2} <0 {val} 0 0>;"
)
lines.append(f"}} {mkmk_corr_name};")
lines.append("")
mkmk_lookup_names.append(mkmk_corr_name)
# Register MarkToBase lookups under mark. # Register MarkToBase lookups under mark.
# dev2 is excluded: HarfBuzz/DirectWrite use abvm for Devanagari marks. # dev2 is excluded: HarfBuzz/DirectWrite use abvm for Devanagari marks.
# deva is INCLUDED: CoreText's old-Indic shaper may need mark/mkmk # deva is INCLUDED: CoreText's old-Indic shaper may need mark/mkmk
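
The nudge-Y gap correction added in this file boils down to enumerating ordered pairs of distinct nudge values. The sketch below reproduces only that arithmetic; `SCALE = 50` and the function name are assumptions for the example.

```python
SCALE = 50  # hypothetical units-per-pixel factor, for illustration only


def nudge_corrections(nudges, scale=SCALE):
    """List (nudge_1, nudge_2, shift) for every pair where the first mark is nudged more."""
    distinct = sorted(set(nudges))
    return [(n1, n2, (n1 - n2) * scale)
            for n1 in distinct
            for n2 in distinct
            if n1 > n2]


# One mark with nudge 2 and others with nudge 0: a single correction is needed.
print(nudge_corrections([2, 0, 0]))  # [(2, 0, 100)]
# All marks share the same nudge: no correction.
print(nudge_corrections([1, 1]))     # []
```

The second call corresponds to the Thai case described in the comments, where equal nudges on both marks mean the gap is already correct.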

View File

@@ -6,8 +6,10 @@ Ported from TerrarumSansBitmap.kt companion object and SheetConfig.kt.
# Font metrics # Font metrics
H = 20 H = 20
H_UNIHAN = 16 H_UNIHAN = 16
H_EMOJI1 = 16
W_HANGUL_BASE = 13 W_HANGUL_BASE = 13
W_UNIHAN = 16 W_UNIHAN = 16
W_EMOJI1 = 17
W_LATIN_WIDE = 9 W_LATIN_WIDE = 9
W_VAR_INIT = 15 W_VAR_INIT = 15
W_WIDEVAR_INIT = 31 W_WIDEVAR_INIT = 31
@@ -72,6 +74,18 @@ SHEET_ALPHABETIC_PRESENTATION_FORMS = 38
SHEET_HENTAIGANA_VARW = 39 SHEET_HENTAIGANA_VARW = 39
SHEET_CONTROL_PICTURES_VARW = 40 SHEET_CONTROL_PICTURES_VARW = 40
SHEET_LEGACY_COMPUTING_VARW = 41 SHEET_LEGACY_COMPUTING_VARW = 41
SHEET_CYRILIC_EXTB_VARW = 42
SHEET_CYRILIC_EXTA_VARW = 43
SHEET_CYRILIC_EXTC_VARW = 44
SHEET_LATIN_EXTE_VARW = 45
SHEET_LATIN_EXTF_VARW = 46
SHEET_LATIN_EXTG_VARW = 47
SHEET_OGHAM_VARW = 48
SHEET_COPTIC_VARW = 49
SHEET_CYRILIC_EXTD_VARW = 50
SHEET_MATHS1_VARW = 51
SHEET_EMOJI1 = 52
SHEET_ENCLOSED_ALPHNUM_VARW = 53
SHEET_UNKNOWN = 254 SHEET_UNKNOWN = 254
@@ -118,6 +132,18 @@ FILE_LIST = [
"hentaigana_variable.tga", "hentaigana_variable.tga",
"control_pictures_variable.tga", "control_pictures_variable.tga",
"symbols_for_legacy_computing_variable.tga", "symbols_for_legacy_computing_variable.tga",
"cyrilic_extB_variable.tga",
"cyrilic_extA_variable.tga",
"cyrilic_extC_variable.tga",
"latinExtE_variable.tga",
"latinExtF_variable.tga",
"latinExtG_variable.tga",
"ogham_variable.tga",
"coptic_variable.tga",
"cyrilic_extD_variable.tga",
"maths1_extrawide_variable.tga",
"emoji1.tga",
"enclosed_alphanumeric_variable.tga",
] ]
CODE_RANGE = [ CODE_RANGE = [
@@ -131,7 +157,7 @@ CODE_RANGE = [
list(range(0x400, 0x530)), # 7: Cyrillic list(range(0x400, 0x530)), # 7: Cyrillic
list(range(0xFF00, 0x10000)), # 8: Halfwidth/Fullwidth list(range(0xFF00, 0x10000)), # 8: Halfwidth/Fullwidth
list(range(0x2000, 0x20A0)), # 9: Uni Punct list(range(0x2000, 0x20A0)), # 9: Uni Punct
list(range(0x370, 0x3CF)), # 10: Greek list(range(0x370, 0x400)), # 10: Greek
list(range(0xE00, 0xE60)), # 11: Thai list(range(0xE00, 0xE60)), # 11: Thai
list(range(0x530, 0x590)), # 12: Armenian list(range(0x530, 0x590)), # 12: Armenian
list(range(0x10D0, 0x1100)), # 13: Georgian list(range(0x10D0, 0x1100)), # 13: Georgian
@@ -151,7 +177,7 @@ CODE_RANGE = [
list(range(0xA720, 0xA800)), # 27: Latin Ext D list(range(0xA720, 0xA800)), # 27: Latin Ext D
list(range(0x20A0, 0x20D0)), # 28: Currencies list(range(0x20A0, 0x20D0)), # 28: Currencies
list(range(0xFFE00, 0xFFFA0)), # 29: Internal list(range(0xFFE00, 0xFFFA0)), # 29: Internal
list(range(0x2100, 0x2150)), # 30: Letterlike list(range(0x2100, 0x2200)), # 30: Letterlike
list(range(0x1F100, 0x1F200)), # 31: Enclosed Alphanum Supl list(range(0x1F100, 0x1F200)), # 31: Enclosed Alphanum Supl
list(range(0x0B80, 0x0C00)) + list(range(0xF00C0, 0xF0100)), # 32: Tamil list(range(0x0B80, 0x0C00)) + list(range(0xF00C0, 0xF0100)), # 32: Tamil
list(range(0x980, 0xA00)), # 33: Bengali list(range(0x980, 0xA00)), # 33: Bengali
@@ -161,8 +187,20 @@ CODE_RANGE = [
list(range(0xF0520, 0xF0580)), # 37: Codestyle ASCII list(range(0xF0520, 0xF0580)), # 37: Codestyle ASCII
list(range(0xFB00, 0xFB18)), # 38: Alphabetic Presentation list(range(0xFB00, 0xFB18)), # 38: Alphabetic Presentation
list(range(0x1B000, 0x1B170)), # 39: Hentaigana list(range(0x1B000, 0x1B170)), # 39: Hentaigana
list(range(0x2400, 0x2440)), # 40: Control Pictures list(range(0x2400, 0x2450)), # 40: Control Pictures
list(range(0x1FB00, 0x1FC00)), # 41: Legacy Computing list(range(0x1FB00, 0x1FC00)), # 41: Legacy Computing
list(range(0xA640, 0xA6A0)), # 42: Cyrillic Ext B
list(range(0x2DE0, 0x2E00)), # 43: Cyrillic Ext A
list(range(0x1C80, 0x1C8F)), # 44: Cyrillic Ext C
list(range(0xAB30, 0xAB70)), # 45: Latin Ext E
list(range(0x10780, 0x107C0)), # 46: Latin Ext F
list(range(0x1DF00, 0x1E000)), # 47: Latin Ext G
list(range(0x1680, 0x16A0)), # 48: Ogham
list(range(0x2C80, 0x2D00)), # 49: Coptic
list(range(0x1E030, 0x1E090)), # 50: Cyrillic Ext D
list(range(0x2200, 0x2400)), # 51: Maths1
list(range(0x1F600, 0x1F650)), # 52: Emoji1
list(range(0x2460, 0x2500)), # 53: Enclosed Alphanum
] ]
CODE_RANGE_HANGUL_COMPAT = range(0x3130, 0x3190) CODE_RANGE_HANGUL_COMPAT = range(0x3130, 0x3190)
@@ -244,6 +282,8 @@ def get_cell_width(sheet_index):
return W_VAR_INIT + HGAP_VAR # 16 return W_VAR_INIT + HGAP_VAR # 16
if sheet_index == SHEET_UNIHAN: if sheet_index == SHEET_UNIHAN:
return W_UNIHAN return W_UNIHAN
if sheet_index == SHEET_EMOJI1:
return W_EMOJI1
if sheet_index == SHEET_HANGUL: if sheet_index == SHEET_HANGUL:
return W_HANGUL_BASE return W_HANGUL_BASE
if sheet_index == SHEET_CUSTOM_SYM: if sheet_index == SHEET_CUSTOM_SYM:
@@ -256,6 +296,8 @@ def get_cell_width(sheet_index):
def get_cell_height(sheet_index): def get_cell_height(sheet_index):
if sheet_index == SHEET_UNIHAN: if sheet_index == SHEET_UNIHAN:
return H_UNIHAN return H_UNIHAN
if sheet_index == SHEET_EMOJI1:
return H_EMOJI1
if sheet_index == SHEET_CUSTOM_SYM: if sheet_index == SHEET_CUSTOM_SYM:
return SIZE_CUSTOM_SYM return SIZE_CUSTOM_SYM
return H return H
@@ -539,5 +581,17 @@ def index_y(sheet_index, c):
SHEET_HENTAIGANA_VARW: lambda: (c - 0x1B000) // 16, SHEET_HENTAIGANA_VARW: lambda: (c - 0x1B000) // 16,
SHEET_CONTROL_PICTURES_VARW: lambda: (c - 0x2400) // 16, SHEET_CONTROL_PICTURES_VARW: lambda: (c - 0x2400) // 16,
SHEET_LEGACY_COMPUTING_VARW: lambda: (c - 0x1FB00) // 16, SHEET_LEGACY_COMPUTING_VARW: lambda: (c - 0x1FB00) // 16,
SHEET_CYRILIC_EXTB_VARW: lambda: (c - 0xA640) // 16,
SHEET_CYRILIC_EXTA_VARW: lambda: (c - 0x2DE0) // 16,
SHEET_CYRILIC_EXTC_VARW: lambda: (c - 0x1C80) // 16,
SHEET_LATIN_EXTE_VARW: lambda: (c - 0xAB30) // 16,
SHEET_LATIN_EXTF_VARW: lambda: (c - 0x10780) // 16,
SHEET_LATIN_EXTG_VARW: lambda: (c - 0x1DF00) // 16,
SHEET_OGHAM_VARW: lambda: (c - 0x1680) // 16,
SHEET_COPTIC_VARW: lambda: (c - 0x2C80) // 16,
SHEET_CYRILIC_EXTD_VARW: lambda: (c - 0x1E030) // 16,
SHEET_MATHS1_VARW: lambda: (c - 0x2200) // 16,
SHEET_EMOJI1: lambda: (c - 0x1F600) // 16,
SHEET_ENCLOSED_ALPHNUM_VARW: lambda: (c - 0x2460) // 16,
SHEET_HANGUL: lambda: 0, SHEET_HANGUL: lambda: 0,
}.get(sheet_index, lambda: c // 16)() }.get(sheet_index, lambda: c // 16)()
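
The row lookups added above all follow the same pattern: subtract the block's first code point and divide by 16. A minimal sketch of that mapping follows; the column formula (`offset % 16`) is an assumption, since the corresponding `index_x` code is not part of this diff.

```python
def cell_position(c, block_start, cols=16):
    """Return (column, row) of code point c in a sheet that starts at block_start."""
    offset = c - block_start
    return offset % cols, offset // cols


# U+2C9F in the Coptic sheet (starts at U+2C80): second row, last column.
print(cell_position(0x2C9F, 0x2C80))    # (15, 1)
# U+1F600 is the first cell of the Emoji1 sheet.
print(cell_position(0x1F600, 0x1F600))  # (0, 0)
```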

Binary file not shown.

BIN
demo.PNG

Binary file not shown.

Before: 168 KiB. After: 180 KiB.

View File

@@ -25,21 +25,23 @@ How multilingual? Real multilingual!
􏻬আমি কাঁচ খেতে পারি, তাতে আমার কোনো ক্ষতি হয় না। 􀀀 􏻬আমি কাঁচ খেতে পারি, তাতে আমার কোনো ক্ষতি হয় না। 􀀀
􏻬󿿁Под южно дърво, цъфтящо в синьо, бягаше малко пухкаво зайче󿿀􀀀 􏻬󿿁Под южно дърво, цъфтящо в синьо, бягаше малко пухкаво зайче󿿀􀀀
􏻬ᎠᏍᎦᏯᎡᎦᎢᎾᎨᎢᎣᏍᏓᎤᎩᏍᏗᎥᎴᏓᎯᎲᎢᏔᎵᏕᎦᏟᏗᏖᎸᎳᏗᏗᎧᎵᎢᏘᎴᎩ ᏙᏱᏗᏜᏫᏗᏣᏚᎦᏫᏛᏄᏓᎦᏝᏃᎠᎾᏗᎭᏞᎦᎯᎦᏘᏓᏠᎨᏏᏕᏡᎬᏢᏓᏥᏩᏝᎡᎢᎪᎢ ᎠᎦᏂᏗᎮᎢᎫᎩᎬᏩᎴᎢᎠᏆᏅᏛᎫᏊᎾᎥᎠᏁᏙᎲᏐᏈᎵᎤᎩᎸᏓᏭᎷᏤᎢᏏᏉᏯᏌᏊ ᎤᏂᏋᎢᏡᎬᎢᎰᏩᎬᏤᎵᏍᏗᏱᎩᎱᎱᎤᎩᎴᎢᏦᎢᎠᏂᏧᏣᏨᎦᏥᎪᎥᏌᏊᎤᎶᏒᎢᎢᏡᎬᎢ ᎹᎦᎺᎵᏥᎻᎼᏏᎽᏗᏩᏂᎦᏘᎾᎿᎠᏁᎬᎢᏅᎩᎾᏂᎡᎢᏌᎶᎵᏎᎷᎠᏑᏍᏗᏪᎩ ᎠᎴ ᏬᏗᏲᏭᎾᏓᏍᏓᏴᏁᎢᎤᎦᏅᏮᏰᎵᏳᏂᎨᎢ􀀀 􏻬ᎠᏍᎦᏯᎡᎦᎢᎾᎨᎢᎣᏍᏓᎤᎩᏍᏗᎥᎴᏓᎯᎲᎢᏔᎵᏕᎦᏟᏗᏖᎸᎳᏗᏗᎧᎵᎢᏘᎴᎩ ᏙᏱᏗᏜᏫᏗᏣᏚᎦᏫᏛᏄᏓᎦᏝᏃᎠᎾᏗᎭᏞᎦᎯᎦᏘᏓᏠᎨᏏᏕᏡᎬᏢᏓᏥᏩᏝᎡᎢᎪᎢ ᎠᎦᏂᏗᎮᎢᎫᎩᎬᏩᎴᎢᎠᏆᏅᏛᎫᏊᎾᎥᎠᏁᏙᎲᏐᏈᎵᎤᎩᎸᏓᏭᎷᏤᎢᏏᏉᏯᏌᏊ ᎤᏂᏋᎢᏡᎬᎢᎰᏩᎬᏤᎵᏍᏗᏱᎩᎱᎱᎤᎩᎴᎢᏦᎢᎠᏂᏧᏣᏨᎦᏥᎪᎥᏌᏊᎤᎶᏒᎢᎢᏡᎬᎢ ᎹᎦᎺᎵᏥᎻᎼᏏᎽᏗᏩᏂᎦᏘᎾᎿᎠᏁᎬᎢᏅᎩᎾᏂᎡᎢᏌᎶᎵᏎᎷᎠᏑᏍᏗᏪᎩ ᎠᎴ ᏬᏗᏲᏭᎾᏓᏍᏓᏴᏁᎢᎤᎦᏅᏮᏰᎵᏳᏂᎨᎢ􀀀
􏻬Ѳеѡфа́нъ и҆ Алеѯі́й, ѕѣлѡ̀ возлюби́вше ѱалти́рь, воспѣ́ша при свѣ́тѣ ѕвѣ́здъ, помазꙋ́юще сщ҃е́нное мѵ́ро; серафими мн̑оꙮ҆читїи̑, ꙗ҆́кѡ ѻ҆́гнь, ѡ҆крꙋжа́хꙋ прⷭ҇то́лъ Бж҃їй, и҆ всѧ̀ землѧ̀ и҆спо́лнисѧ свѣ́та, ꙗ҆́кѡ ѕмі́й попра́нъ є҆́сть􀀀
􏻬ⲡⲓⲝⲉⲛⲟⲥ ⲅⲁⲣ ⲁϥϫⲉⲙ ⲟⲩⲫⲱⲥ ϧⲉⲛ ⲡⲓⲍⲏⲗⲟⲥ ⲛⲧⲉ ϯⲯⲩⲭⲏ· ⲁϥϣⲱⲡⲓ ⲇⲉ ⲕⲁⲧⲁ ⲡⲓⲑⲉⲗⲏⲙⲁ· ⲁϥϭⲓ ⲛϩⲱⲃ ⲛⲓⲃⲉⲛ ⲟⲩⲟϩ ⲁϥϯⲙⲟⲧ􀀀
􏻬Příliš žluťoučký kůň úpěl ďábelské ódy􀀀 􏻬Příliš žluťoučký kůň úpěl ďábelské ódy􀀀
􏻬Quizdeltagerne spiste jordbær med fløde, mens cirkusklovnen Walther spillede på xylofon􀀀 􏻬Quizdeltagerne spiste jordbær med fløde, mens cirkusklovnen Walther spillede på xylofon􀀀
􏻬PACK MY BOX WITH FIVE DOZEN LIQUOR JUGS􀀀 􏻬Sphinx of black quartz, judge my vow􀀀
􏻬hƿæt ƿe ᵹardena inᵹear ꝺaᵹum þeoꝺ cynninᵹa þꞃym ᵹeꝼꞃumon􀀀 􏻬hƿæt ƿe ᵹardena inᵹear ꝺaᵹum þeoꝺ cynninᵹa þꞃym ᵹeꝼꞃumon􀀀
􏻬Victor jagt zwölf Boxkämpfer quer über den großen Sylter Deich GROẞEN GROẞE􀀀 􏻬Victor jagt zwölf Boxkämpfer quer über den GROẞEN Sylter Deich􀀀
􏻬ζαφείρι δέξου πάγκαλο, βαθῶν ψυχῆς τὸ σῆμα􀀀 􏻬ζαφείρι δέξου πάγκαλο, βαθῶν ψυχῆς τὸ σῆμα􀀀
􏻬ΔΙΑΦΥΛΆΞΤΕ ΓΕΝΙΚΆ ΤΗ ΖΩΉ ΣΑΣ ΑΠΌ ΒΑΘΕΙΆ ΨΥΧΙΚΆ ΤΡΑΎΜΑΤΑ􀀀
􏻬სწრაფი ყავისფერი მელა გადაახტა ზარმაც ძაღლს ᲘᲜᲢᲔᲚ ᲞᲔᲜᲢᲘᲣᲛᲘ ᲛᲘᲙᲠᲝᲞᲠᲝᲪᲔᲡᲝᲠᲘ􀀀 􏻬სწრაფი ყავისფერი მელა გადაახტა ზარმაც ძაღლს ᲘᲜᲢᲔᲚ ᲞᲔᲜᲢᲘᲣᲛᲘ ᲛᲘᲙᲠᲝᲞᲠᲝᲪᲔᲡᲝᲠᲘ􀀀
􏻬ऋषियों को सताने वाले दुष्ट राक्षसों के राजा रावण का सर्वनाश करने वाले विष्णुवतार भगवान श्रीराम अयोध्या के महाराज दशरथ के􀀀 􏻬ऋषियों को सताने वाले दुष्ट राक्षसों के राजा रावण का सर्वनाश करने वाले विष्णुवतार भगवान श्रीराम अयोध्या के महाराज दशरथ के􀀀
􏻬Kæmi ný öxi hér, ykist þjófum nú bæði víl og ádrepa􀀀 􏻬Kæmi ný öxi hér, ykist þjófum nú bæði víl og ádrepa􀀀
􏻬Ċuaiġ bé ṁórṡáċ le dlúṫspád fíoꝛḟinn trí hata mo ḋea-ṗoꝛcáin ḃig􀀀 􏻬Scríoḃ Fergus ⁊ a ṁáṫaır dán le peann úr􀀀
􏻬あめつちほしそら やまかはみねたに くもきりむろこけ ひといぬうへすゑ ゆわさるおふせよ えの𛀁をなれゐて􀀀 􏻬あめつちほしそら やまかはみねたに くもきりむろこけ ひといぬうへすゑ ゆわさるおふせよ えの𛀁をなれゐて􀀀
􏻬トリナクコヱス ユメサマセ ミヨアケワタル ヒンカシヲ ソライロハエテ オキツヘニ ホフネムレヰヌ モヤノウチ􀀀 􏻬トリナクコヱス ユメサマセ ミヨアケワタル ヒンカシヲ ソライロハエテ オキツヘニ ホフネムレヰヌ モヤノウチ􀀀
􏻬田居に出で 菜摘むわれをぞ 君召すと 求食り追ひゆく 山城の 打酔へる子ら 藻葉干せよ え舟繋けぬ􀀀 􏻬田居に出で 菜摘むわれをぞ 君召すと 求食り追ひゆく 山城の 打酔へる子ら 藻葉干せよ え舟繋けぬ􀀀
􏻬정 참판 양반댁 규수 큰 교자 타고 혼례 치른 날  찦차를 타고 온 펲시맨과 쑛다리 똠방각하􀀀 􏻬콩고물과 우유가 들어간 빙수는 차게 먹어야 특별한 맛이 잘 표현된다 유쾌했던 땃쥐 토끼풀 쫓기 바쁨􀀀
􏻬찦차를 타고 온 펲시맨과 쑛다리 똠방각하 왜날뷁􀀀
􏻬쾅 ᄒᆞ는 소리 헨 “아이구 베락 털어져ᇝ인가?” 영 걷어진 쥥은 몰르곡 경헨 ᄇᆞᆰ도록 ᄌᆞᆷ ᄒᆞᆫᄌᆞᆷ들 안 잣수다􀀀 􏻬쾅 ᄒᆞ는 소리 헨 “아이구 베락 털어져ᇝ인가?” 영 걷어진 쥥은 몰르곡 경헨 ᄇᆞᆰ도록 ᄌᆞᆷ ᄒᆞᆫᄌᆞᆷ들 안 잣수다􀀀
􏻬Četri psihi faķīri vēlu vakarā zāģēja guļbūvei durvis, fonā šņācot mežam􀀀 􏻬Četri psihi faķīri vēlu vakarā zāģēja guļbūvei durvis, fonā šņācot mežam􀀀
􏻬Įlinkdama fechtuotojo špaga sublykčiojusi pragręžė apvalų arbūzą􀀀 􏻬Įlinkdama fechtuotojo špaga sublykčiojusi pragręžė apvalų arbūzą􀀀
@@ -57,7 +59,7 @@ How multilingual? Real multilingual!
􏻬Pijamalı hasta yağız şoföre çabucak güvendi􀀀 􏻬Pijamalı hasta yağız şoföre çabucak güvendi􀀀
􏻬Жебракують філософи при ґанку церкви в Гадячі, ще й шатро їхнє п’яне знаємо􀀀 􏻬Жебракують філософи при ґанку церкви в Гадячі, ще й шатро їхнє п’яне знаємо􀀀
􏻬Do bạch kim rất quý nên sẽ dùng để lắp vô xương􀀀 􏻬Do bạch kim rất quý nên sẽ dùng để lắp vô xương􀀀
􏻬日堀油告観観藤村抄海評業庁経賃室弁市。太撮収改売週法所何都慣次現。価紙一無三洋日話転手治稿載末替付致治。􀀀 􏻬日堀油告観観藤村抄海評業庁経賃室弁市。太撮収改売週法所何都慣次現。􀀀
􏻬[pʰnɣɬɥi.m͡ŋχɫʍɨnaɸ.cθʊɫɯ.ɹɨɫʏ͡ɛx.ɯ͡ɣaxɲaɣɫ.ɸtʰɑɣɴ]􀀀 􏻬[pʰnɣɬɥi.m͡ŋχɫʍɨnaɸ.cθʊɫɯ.ɹɨɫʏ͡ɛx.ɯ͡ɣaxɲaɣɫ.ɸtʰɑɣɴ]􀀀
􏻬⠑⠥⠊⠵⠀⠟⠫⠒⠵⠀⠓⠗⠎⠉⠂⠀⠠⠊⠗⠘⠍⠓⠎⠀⠨⠣⠩⠐⠥⠍⠑⠱⠀⠈⠪⠀⠨⠷⠎⠢⠈⠧⠀⠈⠏⠒⠐⠕⠝⠀⠕⠌⠎⠀⠊⠿⠊⠪⠶⠚⠊􀀀 􏻬⠑⠥⠊⠵⠀⠟⠫⠒⠵⠀⠓⠗⠎⠉⠂⠀⠠⠊⠗⠘⠍⠓⠎⠀⠨⠣⠩⠐⠥⠍⠑⠱⠀⠈⠪⠀⠨⠷⠎⠢⠈⠧⠀⠈⠏⠒⠐⠕⠝⠀⠕⠌⠎⠀⠊⠿⠊⠪⠶⠚⠊􀀀
@@ -96,7 +98,7 @@ How multilingual? Real multilingual!
􎳌‣ Unicode fractions, also known as super/subscripts􀀀 􎳌‣ Unicode fractions, also known as super/subscripts􀀀
􏻬ᄀᆞᄅᆞᇝ ᄀᆞᅀᅢ 자거늘 밀므리 사ᄋᆞ리로ᄃᆡ 나거ᅀᅡ ᄌᆞᄆᆞ니ᅌᅵ다 셤 안해 자시ᇙ 제 한비 사ᄋᆞ리로ᄃᆡ 뷔어ᅀᅡ ᄌᆞᄆᆞ니ᅌᅵ다􀀀 􏻬ᄀᆞᄅᆞᇝ ᄀᆞᅀᅢ 자거늘 밀므리 사ᄋᆞ리로ᄃᆡ 나거ᅀᅡ ᄌᆞᄆᆞ니ᅌᅵ다 셤 안해 자시ᇙ 제 한비 사ᄋᆞ리로ᄃᆡ 뷔어ᅀᅡ ᄌᆞᄆᆞ니ᅌᅵ다􀀀
􏻬쾅 ᄒᆞ는 소리 헨 “아이구, 베락 털어져ᇝ인가?” 영 걷어진 쥥은 몰르곡 경헨 나왕 보고들랑 영헤연 ᄇᆞᆰ도록 ᄌᆞᆷ ᄒᆞᆫᄌᆞᆷ들 안 잣수다 이 시간 동네 사람들.􀀀 􏻬쾅 ᄒᆞ는 소리 헨 “아이구, 베락 털어져ᇝ인가?” 영 걷어진 쥥은 몰르곡 경헨 나왕 보고들랑 영헤연 ᄇᆞᆰ도록 ᄌᆞᆷ ᄒᆞᆫᄌᆞᆷ들 안 잣수다􀀀
􎳌‣ Full support for Old Korean/Jeju dialect orthography􀀀 􎳌‣ Full support for Old Korean/Jeju dialect orthography􀀀
@@ -104,30 +106,38 @@ How multilingual? Real multilingual!
􎳌‣ Full support for Archaic Kana/Hentaigana􀀀 􎳌‣ Full support for Archaic Kana/Hentaigana􀀀
􏻬серафими мн̑оꙮ҆читїи̑, ꙗ҆́кѡ ѻ҆́гнь, ѡ҆крꙋжа́хꙋ прⷭ҇то́лъ Бж҃їй, и҆ всѧ̀ землѧ̀ и҆спо́лнисѧ свѣ́та􀀀
􎳌‣ Fan of Church Slavonic? Weve got you!􀀀
􏃯Supported Unicode Blocks:􀀀 􏃯Supported Unicode Blocks:􀀀
Basic Latin Basic Latin
Latin-1 Supplement Latin-1 Supplement
Latin Extended Additional Latin Extended Additional
Latin Extended-A/B/C/D Latin Extended-A/B/C/D/E/F/G
Armenian Armenian
Arrows
Bengali􏿆ᶠⁱ􀀀 Bengali􏿆ᶠⁱ􀀀
Braille Patterns Braille Patterns
Cherokee􏿆􀀀 Cherokee􏿆􀀀
CJK Symbols and Punctuation CJK Symbols and Punctuation
CJK Unified Ideographs􏿆⁶􀀀 CJK Unified Ideographs
CJK Unified Ideographs Extension A􏿆¹²·¹􀀀 CJK Unified Ideographs Extension A
Combining Diacritical Marks Combining Diacritical Marks
Control Pictures Control Pictures
Coptic
Currency Symbols Currency Symbols
Cyrillic􏿆ᴭ􀀀 Cyrillic
Cyrillic Supplement􏿆ᴭ􀀀 Cyrillic Supplement
Cyrillic Extended-A/B/C/D
Devanagari Devanagari
Enclosed Alphanumerics
Enclosed Alphanumeric Supplement Enclosed Alphanumeric Supplement
General Punctuations General Punctuations
Georgian􏿆ჼ􀀀 Georgian􏿆ჼ􀀀
Georgian Extended Georgian Extended
Greek and Coptic􏿆ᴱ􀀀 Greek and Coptic
Greek Extended Greek Extended
Halfwidth and Fullwidth Forms Halfwidth and Fullwidth Forms
Hangul Compatibility Jamo Hangul Compatibility Jamo
@@ -142,6 +152,11 @@ How multilingual? Real multilingual!
Kana Extended-A Kana Extended-A
Small Kana Extension Small Kana Extension
Letterlike Symbols Letterlike Symbols
Mathematical Operators
Miscellaneous Technical
Number Forms
Ogham
Optical Character Recognition
Phonetic Extensions Phonetic Extensions
Phonetic Extensions Supplement Phonetic Extensions Supplement
Runic Runic
@@ -152,9 +167,9 @@ How multilingual? Real multilingual!
Symbols for Legacy Computing Symbols for Legacy Computing
Tamil Tamil
Thai Thai
Yijing Hexagram Symbols
􏿆􀀀 No support for archæic letters  􏿆ᴱ􀀀 No support for Coptic 􏿆ᶠⁱ􀀀 No support for ligatures
􏿆ᶠⁱ􀀀 No support for ligatures  􏿆ჼ􀀀 Mkhedruli only 􏿆ᴬ􀀀 Uppercase only 􏿆ჼ􀀀 Mkhedruli only
􏿆⁶􀀀 􏿆⁷􀀀 􏿆⁹􀀀 􏿆¹²·¹􀀀 Up to the specified Unicode version
GitHubs issue page is open! You can report any 􏽕errors􀀀, or leave 􏽕suggestions􀀀. You can help this font to be more versatile. (for more languages, more frameworks) 􏽕Clone􀀀 this repo, make changes, and make a 􏽕pull request􀀀! I appreciate any and all supports. GitHubs issue page is open! You can report any 􏽕errors􀀀, or leave 􏽕suggestions􀀀. You can help this font to be more versatile. (for more languages, more frameworks) 􏽕Clone􀀀 this repo, make changes, and make a 􏽕pull request􀀀! I appreciate any and all supports.

View File

@@ -579,15 +579,15 @@ const EXAMPLES = [
{ zones: 'BDFGH', wye: false, chars: '\u027A', desc: 'BDFGH' }, { zones: 'BDFGH', wye: false, chars: '\u027A', desc: 'BDFGH' },
{ zones: 'BG', wye: true, chars: '/', desc: 'BG(Y)' }, { zones: 'BG', wye: true, chars: '/', desc: 'BG(Y)' },
{ zones: 'CD', wye: false, chars: '\u10B5', desc: 'CD' }, { zones: 'CD', wye: false, chars: '\u10B5', desc: 'CD' },
{ zones: 'CDEF', wye: true, chars: '\u03A6', desc: 'CDEF(Y)' }, { zones: 'CDEF', wye: true, chars: '\u03A6,v', desc: 'CDEF(Y)' },
{ zones: 'CDEFGH', wye: false, chars: 'a,c,e', desc: 'CDEFGH' }, { zones: 'CDEFGH', wye: false, chars: 'a,e', desc: 'CDEFGH' },
{ zones: 'CDEFGHJK', wye: false, chars: 'g', desc: 'CDEFGHJK' }, { zones: 'CDEFGHJK', wye: false, chars: 'g', desc: 'CDEFGHJK' },
{ zones: 'CDEFGHK', wye: false, chars: '\u019E', desc: 'CDEFGHK' }, { zones: 'CDEFGHK', wye: false, chars: '\u019E', desc: 'CDEFGHK' },
{ zones: 'CDEFGH', wye: true, chars: 'A', desc: 'CDEFGH(Y)' },
{ zones: 'CDEGH', wye: false, chars: 'c', desc: 'CDEGH' },
{ zones: 'AB', wye: true, chars: 'Y', desc: 'AB(Y)' }, { zones: 'AB', wye: true, chars: 'Y', desc: 'AB(Y)' },
{ zones: 'ABCD', wye: true, chars: 'V', desc: 'ABCD(Y)' }, { zones: 'ABCD', wye: true, chars: 'V', desc: 'ABCD(Y)' },
{ zones: 'CDEF', wye: true, chars: 'v', desc: 'CDEF(Y)' },
{ zones: 'EFGH', wye: true, chars: '\u028C', desc: 'EFGH(Y)' }, { zones: 'EFGH', wye: true, chars: '\u028C', desc: 'EFGH(Y)' },
{ zones: 'CDEFGH', wye: true, chars: 'A', desc: 'CDEFGH(Y)' },
]; ];
function buildExamples() { function buildExamples() {

View File

@@ -1,7 +1,7 @@
--- Pixel 0 --- Pixel 0
- Lowheight bit - Lowheight bit
- encoding: has pixel - it's low height - encoding: has pixel - it's low height
- used by the diacritics system to quickly look up if the character is low height without parsing the Pixel 1 - bit must be set if above-diacritics should be lowered (e.g. lowercase b, which has 'A' shape bit but considered lowheight)
### Legends ### Legends
# #
@@ -106,3 +106,5 @@ dot removal for diacritics:
- encoding: - encoding:
- <MSB> RRRRRRRR GGGGGGGG BBBBBBBB <LSB> - <MSB> RRRRRRRR GGGGGGGG BBBBBBBB <LSB>
--- Pixel 3
Unused for now.

Binary file not shown.


BIN
src/assets/coptic_variable.tga LFS Normal file

Binary file not shown.


BIN
src/assets/emoji1.tga LFS Normal file

Binary file not shown.


BIN
src/assets/latinExtE_variable.tga LFS Normal file

Binary file not shown.

BIN
src/assets/latinExtF_variable.tga LFS Normal file

Binary file not shown.

BIN
src/assets/latinExtG_variable.tga LFS Normal file

Binary file not shown.


BIN
src/assets/ogham_variable.tga LFS Normal file

Binary file not shown.


View File

@@ -1,9 +1,13 @@
package net.torvald.terrarumsansbitmap package net.torvald.terrarumsansbitmap
import net.torvald.terrarumsansbitmap.gdx.CodePoint
/** /**
* Created by minjaesong on 2021-11-25. * Created by minjaesong on 2021-11-25.
*/ */
data class DiacriticsAnchor(val type: Int, val x: Int, val y: Int, val xUsed: Boolean, val yUsed: Boolean) data class DiacriticsAnchor(val type: Int, val x: Int, val y: Int) {
val isZero = (x == 0 && y == 0)
}
/** /**
* Created by minjaesong on 2018-08-07. * Created by minjaesong on 2018-08-07.
*/ */
@@ -15,7 +19,7 @@ data class GlyphProps(
val nudgeX: Int = 0, val nudgeX: Int = 0,
val nudgeY: Int = 0, val nudgeY: Int = 0,
val diacriticsAnchors: Array<DiacriticsAnchor> = Array(6) { DiacriticsAnchor(it, 0, 0, false, false) }, val diacriticsAnchors: Array<DiacriticsAnchor> = Array(6) { DiacriticsAnchor(it, 0, 0) },
val alignWhere: Int = 0, // ALIGN_LEFT..ALIGN_BEFORE val alignWhere: Int = 0, // ALIGN_LEFT..ALIGN_BEFORE
@@ -29,6 +33,8 @@ data class GlyphProps(
val isKernYtype: Boolean = false, val isKernYtype: Boolean = false,
val kerningMask: Int = 255, val kerningMask: Int = 255,
val dotRemoval: CodePoint? = null,
val directiveOpcode: Int = 0, // 8-bits wide val directiveOpcode: Int = 0, // 8-bits wide
val directiveArg1: Int = 0, // 8-bits wide val directiveArg1: Int = 0, // 8-bits wide
val directiveArg2: Int = 0, // 8-bits wide val directiveArg2: Int = 0, // 8-bits wide
@@ -95,10 +101,6 @@ data class GlyphProps(
diacriticsAnchors.forEach { diacriticsAnchors.forEach {
hash = hash xor it.type hash = hash xor it.type
hash = hash * 16777619 hash = hash * 16777619
hash = hash xor (it.x or (if (it.xUsed) 128 else 0))
hash = hash * 16777619
hash = hash xor (it.y or (if (it.yUsed) 128 else 0))
hash = hash * 16777619
} }
hash = hash xor tags hash = hash xor tags

View File

@@ -0,0 +1,134 @@
package net.torvald.terrarumsansbitmap.gdx
import com.badlogic.gdx.graphics.Pixmap
data class AtlasRegion(
val atlasX: Int,
val atlasY: Int,
val width: Int,
val height: Int,
val offsetX: Int = 0,
val offsetY: Int = 0
)
class GlyphAtlas(val atlasWidth: Int, val atlasHeight: Int) {
val pixmap = Pixmap(atlasWidth, atlasHeight, Pixmap.Format.RGBA8888).also { it.blending = Pixmap.Blending.SourceOver }
private val regions = HashMap<Long, AtlasRegion>()
private var cursorX = 0
private var cursorY = 0
private var shelfHeight = 0
private val pendingCells = ArrayList<PendingCell>()
private class PendingCell(
val sheetID: Int,
val cellX: Int,
val cellY: Int,
val cropped: Pixmap,
val offsetX: Int,
val offsetY: Int
)
private fun atlasKey(sheetID: Int, cellX: Int, cellY: Int): Long =
sheetID.toLong().shl(32) or cellX.toLong().shl(16) or cellY.toLong()
/** Scans the cell for its non-transparent bounding box, crops, and queues for deferred packing. */
fun queueCell(sheetID: Int, cellX: Int, cellY: Int, cellPixmap: Pixmap) {
var minX = cellPixmap.width
var minY = cellPixmap.height
var maxX = -1
var maxY = -1
for (y in 0 until cellPixmap.height) {
for (x in 0 until cellPixmap.width) {
if (cellPixmap.getPixel(x, y) and 0xFF != 0) {
if (x < minX) minX = x
if (y < minY) minY = y
if (x > maxX) maxX = x
if (y > maxY) maxY = y
}
}
}
if (maxX < 0) return // entirely transparent, skip
val cropW = maxX - minX + 1
val cropH = maxY - minY + 1
val cropped = Pixmap(cropW, cropH, Pixmap.Format.RGBA8888)
cropped.drawPixmap(cellPixmap, 0, 0, minX, minY, cropW, cropH)
pendingCells.add(PendingCell(sheetID, cellX, cellY, cropped, minX, minY))
}
    /** Sorts queued cells by height desc then width desc, and packs into shelves. */
    fun packAllQueued() {
        pendingCells.sortWith(
            compareByDescending<PendingCell> { it.cropped.height }
                .thenByDescending { it.cropped.width }
        )

        for (cell in pendingCells) {
            val w = cell.cropped.width
            val h = cell.cropped.height

            // start new shelf if cell doesn't fit horizontally
            // (note: nothing here checks cursorY + h against atlasHeight)
            if (cursorX + w > atlasWidth) {
                cursorX = 0
                cursorY += shelfHeight
                shelfHeight = 0
            }

            pixmap.drawPixmap(cell.cropped, cursorX, cursorY)
            regions[atlasKey(cell.sheetID, cell.cellX, cell.cellY)] =
                AtlasRegion(cursorX, cursorY, w, h, cell.offsetX, cell.offsetY)

            cursorX += w
            if (h > shelfHeight) shelfHeight = h

            cell.cropped.dispose()
        }

        pendingCells.clear()
    }

    /**
     * Copies a whole sheet into the atlas uncropped, starting on a fresh shelf at x = 0,
     * and registers every cell of the cols x rows grid as a region.
     */
    fun blitSheet(sheetID: Int, sheetPixmap: Pixmap, cellW: Int, cellH: Int, cols: Int, rows: Int) {
        // close the current shelf, if anything has been placed on it
        if (cursorX > 0) {
            cursorX = 0
            cursorY += shelfHeight
            shelfHeight = 0
        }

        val baseY = cursorY
        pixmap.drawPixmap(sheetPixmap, 0, baseY)

        for (cy in 0 until rows) {
            for (cx in 0 until cols) {
                regions[atlasKey(sheetID, cx, cy)] = AtlasRegion(cx * cellW, baseY + cy * cellH, cellW, cellH)
            }
        }

        cursorY = baseY + sheetPixmap.height
        cursorX = 0
        shelfHeight = 0
    }

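    fun getRegion(sheetID: Int, cellX: Int, cellY: Int): AtlasRegion? =
        regions[atlasKey(sheetID, cellX, cellY)]

    fun clearRegion(region: AtlasRegion) {
        // with SourceOver blending, drawing a fully transparent pixel leaves the destination
        // unchanged, so blending is switched off while clearing
        val oldBlending = pixmap.blending
        pixmap.blending = Pixmap.Blending.None
        for (y in 0 until region.height) {
            for (x in 0 until region.width) {
                pixmap.drawPixel(region.atlasX + x, region.atlasY + y, 0)
            }
        }
        pixmap.blending = oldBlending
    }

    fun dispose() {
        pixmap.dispose()
    }
}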
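
The packing scheme used by GlyphAtlas can be tried in isolation with the minimal sketch below. It is not part of the commit and uses no libGDX types: `Rect`, `Placed`, `key`, and `shelfPack` are hypothetical names introduced for illustration, and the overflow checks are an addition that `packAllQueued` does not have.

```kotlin
// Hypothetical stand-ins for illustration; not part of the commit.
data class Rect(val id: Long, val w: Int, val h: Int)
data class Placed(val x: Int, val y: Int, val w: Int, val h: Int)

// Same bit layout as GlyphAtlas.atlasKey:
// sheetID from bit 32 up, cellX in bits 16..31, cellY in bits 0..15.
fun key(sheetID: Int, cellX: Int, cellY: Int): Long =
    sheetID.toLong().shl(32) or cellX.toLong().shl(16) or cellY.toLong()

/**
 * Shelf packing: sort by height (then width) descending, fill a row left to right,
 * and open a new shelf when the next rectangle would cross the right edge.
 * Unlike the commit's packAllQueued, this sketch fails loudly when space runs out.
 */
fun shelfPack(rects: List<Rect>, atlasW: Int, atlasH: Int): Map<Long, Placed> {
    val out = HashMap<Long, Placed>()
    var cursorX = 0
    var cursorY = 0
    var shelfHeight = 0

    val sorted = rects.sortedWith(
        compareByDescending<Rect> { it.h }.thenByDescending { it.w }
    )

    for (r in sorted) {
        require(r.w <= atlasW) { "rect ${r.id} is wider than the atlas" }

        // start a new shelf if the rect doesn't fit horizontally
        if (cursorX + r.w > atlasW) {
            cursorX = 0
            cursorY += shelfHeight
            shelfHeight = 0
        }
        check(cursorY + r.h <= atlasH) { "atlas is full" }

        out[r.id] = Placed(cursorX, cursorY, r.w, r.h)
        cursorX += r.w
        if (r.h > shelfHeight) shelfHeight = r.h
    }
    return out
}

fun main() {
    val rects = listOf(
        Rect(key(0, 0, 0), 5, 10),
        Rect(key(0, 1, 0), 6, 20),
        Rect(key(0, 2, 0), 7, 20),
        Rect(key(1, 0, 3), 4, 8)
    )
    val placed = shelfPack(rects, atlasW = 16, atlasH = 64)
    for ((k, p) in placed) println("key=$k -> $p")
}
```

Sorting tallest first keeps each shelf's height close to that of its first entry, so short glyphs do not leave tall empty strips beside them. With the sample input, the two 20-pixel-tall rectangles share the first shelf and the remaining two start a second shelf at y = 20.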


@@ -318,7 +318,7 @@ class TerrarumSansBitmap(
/** Props of all printable Unicode points. */ /** Props of all printable Unicode points. */
private val glyphProps = HashMap<CodePoint, GlyphProps>() private val glyphProps = HashMap<CodePoint, GlyphProps>()
private val textReplaces = HashMap<CodePoint, CodePoint>() private val textReplaces = HashMap<CodePoint, CodePoint>()
private val sheets: Array<PixmapRegionPack> private lateinit var atlas: GlyphAtlas
// private var charsetOverride = 0 // private var charsetOverride = 0
@@ -326,9 +326,11 @@ class TerrarumSansBitmap(
// private val tempFiles = ArrayList<String>() // private val tempFiles = ArrayList<String>()
init { init {
val sheetsPack = ArrayList<PixmapRegionPack>() atlas = GlyphAtlas(4096, 4096)
var unihanPixmap: Pixmap? = null
var emoji1Pixmap: Pixmap? = null
// first we create pixmap to read pixels, then make texture using pixmap // first we create pixmap to read pixels, then pack into atlas
fileList.forEachIndexed { index, it -> fileList.forEachIndexed { index, it ->
val isVariable = it.endsWith("_variable.tga") val isVariable = it.endsWith("_variable.tga")
val isXYSwapped = it.contains("xyswap", true) val isXYSwapped = it.contains("xyswap", true)
@@ -346,32 +348,6 @@ class TerrarumSansBitmap(
else else
dbgprn("loading texture [STATIC] $it") dbgprn("loading texture [STATIC] $it")
// unpack gz if applicable
/*if (it.endsWith(".gz")) {
val tmpFilePath = tempDir + "/tmp_${it.dropLast(7)}.tga"
try {
val gzi = GZIPInputStream(Gdx.files.classpath(fontParentDir + it).read(8192))
val wholeFile = gzi.readBytes()
gzi.close()
val fos = BufferedOutputStream(FileOutputStream(tmpFilePath))
fos.write(wholeFile)
fos.flush()
fos.close()
pixmap = Pixmap(Gdx.files.absolute(tmpFilePath))
// tempFiles.add(tmpFilePath)
}
catch (e: GdxRuntimeException) {
//e.printStackTrace()
dbgprn("said texture not found, skipping...")
pixmap = Pixmap(1, 1, Pixmap.Format.RGBA8888)
}
//File(tmpFileName).delete()
}
else {*/
try { try {
pixmap = Pixmap(Gdx.files.classpath("assets/$it")) pixmap = Pixmap(Gdx.files.classpath("assets/$it"))
} }
@@ -387,7 +363,6 @@ class TerrarumSansBitmap(
System.exit(1) System.exit(1)
} }
} }
//}
if (isVariable) buildWidthTable(pixmap, codeRange[index], if (isExtraWide) 32 else 16) if (isVariable) buildWidthTable(pixmap, codeRange[index], if (isExtraWide) 32 else 16)
buildWidthTableFixed() buildWidthTableFixed()
@@ -395,40 +370,75 @@ class TerrarumSansBitmap(
setupDynamicTextReplacer() setupDynamicTextReplacer()
/*if (!noShadow) { if (index == SHEET_UNIHAN) {
makeShadowForSheet(pixmap) // defer wenquanyi packing to after all other sheets
}*/ unihanPixmap = pixmap
}
else if (index == SHEET_EMOJI1) {
// defer emoji1 packing to after all other sheets
emoji1Pixmap = pixmap
}
else {
val texRegPack = if (isExtraWide)
PixmapRegionPack(pixmap, W_WIDEVAR_INIT, H, HGAP_VAR, 0, xySwapped = isXYSwapped)
else if (isVariable)
PixmapRegionPack(pixmap, W_VAR_INIT, H, HGAP_VAR, 0, xySwapped = isXYSwapped)
else if (index == SHEET_HANGUL)
PixmapRegionPack(pixmap, W_HANGUL_BASE, H)
else if (index == SHEET_CUSTOM_SYM)
PixmapRegionPack(pixmap, SIZE_CUSTOM_SYM, SIZE_CUSTOM_SYM)
else if (index == SHEET_RUNIC)
PixmapRegionPack(pixmap, W_LATIN_WIDE, H)
else throw IllegalArgumentException("Unknown sheet index: $index")
// this code causes initial deva chars to be skipped from rendering
// val illegalCells = HashSet<Long>()
// for (code in codeRange[index]) {
// if (glyphProps[code]?.isIllegal == true) {
// val pos = getSheetwisePosition(0, code)
// illegalCells.add(pos[0].toLong().shl(16) or pos[1].toLong())
// }
// }
//
for (cy in 0 until texRegPack.verticalCount) {
for (cx in 0 until texRegPack.horizontalCount) {
// if (cx.toLong().shl(16) or cy.toLong() in illegalCells) continue
atlas.queueCell(index, cx, cy, texRegPack.get(cx, cy))
}
}
//val texture = Texture(pixmap) texRegPack.dispose()
val texRegPack = if (isExtraWide) pixmap.dispose()
PixmapRegionPack(pixmap, W_WIDEVAR_INIT, H, HGAP_VAR, 0, xySwapped = isXYSwapped) }
else if (isVariable)
PixmapRegionPack(pixmap, W_VAR_INIT, H, HGAP_VAR, 0, xySwapped = isXYSwapped)
else if (index == SHEET_UNIHAN)
PixmapRegionPack(pixmap, W_UNIHAN, H_UNIHAN) // the only exception that is height is 16
// below they all have height of 20 'H'
else if (index == SHEET_HANGUL)
PixmapRegionPack(pixmap, W_HANGUL_BASE, H)
else if (index == SHEET_CUSTOM_SYM)
PixmapRegionPack(pixmap, SIZE_CUSTOM_SYM, SIZE_CUSTOM_SYM) // TODO variable
else if (index == SHEET_RUNIC)
PixmapRegionPack(pixmap, W_LATIN_WIDE, H)
else throw IllegalArgumentException("Unknown sheet index: $index")
//texRegPack.texture.setFilter(minFilter, magFilter)
sheetsPack.add(texRegPack)
pixmap.dispose() // you are terminated
} }
sheets = sheetsPack.toTypedArray() // sort and pack all queued cells (tight-cropped, sorted by height then width)
atlas.packAllQueued()
// pack wenquanyi (SHEET_UNIHAN) last as a contiguous blit
unihanPixmap?.let {
val cols = it.width / W_UNIHAN
val rows = it.height / H_UNIHAN
atlas.blitSheet(SHEET_UNIHAN, it, W_UNIHAN, H_UNIHAN, cols, rows)
it.dispose()
}
// pack emoji1 as a contiguous blit (fixed 17x16 cells, 2px top/bottom padding)
emoji1Pixmap?.let {
val cols = it.width / W_EMOJI1
val rows = it.height / H_EMOJI1
atlas.blitSheet(SHEET_EMOJI1, it, W_EMOJI1, H_EMOJI1, cols, rows)
it.dispose()
}
// make sure null char is actually null (draws nothing and has zero width) // make sure null char is actually null (draws nothing and has zero width)
sheets[SHEET_ASCII_VARW].regions[0].setColor(0) atlas.getRegion(SHEET_ASCII_VARW, 0, 0)?.let { atlas.clearRegion(it) }
sheets[SHEET_ASCII_VARW].regions[0].fill()
glyphProps[0] = GlyphProps(0) glyphProps[0] = GlyphProps(0)
if (debug) {
com.badlogic.gdx.graphics.PixmapIO.writePNG(Gdx.files.absolute("$tempDir/glyph_atlas_dump.png"), atlas.pixmap)
dbgprn("atlas dumped to $tempDir/glyph_atlas_dump.png")
}
} }
override fun getLineHeight(): Float = LINE_HEIGHT.toFloat() * scale override fun getLineHeight(): Float = LINE_HEIGHT.toFloat() * scale
@@ -449,6 +459,7 @@ class TerrarumSansBitmap(
} }
private val offsetUnihan = (H - H_UNIHAN) / 2 private val offsetUnihan = (H - H_UNIHAN) / 2
private val offsetEmoji1 = (H - H_EMOJI1) / 2
private val offsetCustomSym = (H - SIZE_CUSTOM_SYM) / 2 private val offsetCustomSym = (H - SIZE_CUSTOM_SYM) / 2
private var flagFirstRun = true private var flagFirstRun = true
@@ -582,6 +593,9 @@ class TerrarumSansBitmap(
renderCol = getColour(c) renderCol = getColour(c)
} }
} }
else if (isNoDrawChar(c) || glyphProps[c]?.isIllegal == true) {
// whitespace/control/internal/invalid — no visible glyph, just advance position
}
else if (sheetID == SHEET_HANGUL) { else if (sheetID == SHEET_HANGUL) {
// Flookahead for {I, P, F} // Flookahead for {I, P, F}
@@ -599,39 +613,33 @@ class TerrarumSansBitmap(
val (indexCho, indexJung, indexJong) = indices val (indexCho, indexJung, indexJong) = indices
val (choRow, jungRow, jongRow) = rows val (choRow, jungRow, jongRow) = rows
val hangulSheet = sheets[SHEET_HANGUL] atlas.getRegion(SHEET_HANGUL, indexCho, choRow)?.let {
linotypePixmap.drawFromAtlas(atlas.pixmap, it, posmap.x[index] + linotypePaddingX, linotypePaddingY, renderCol)
}
atlas.getRegion(SHEET_HANGUL, indexJung, jungRow)?.let {
val choTex = hangulSheet.get(indexCho, choRow) linotypePixmap.drawFromAtlas(atlas.pixmap, it, posmap.x[index] + linotypePaddingX, linotypePaddingY, renderCol)
val jungTex = hangulSheet.get(indexJung, jungRow) }
val jongTex = hangulSheet.get(indexJong, jongRow) atlas.getRegion(SHEET_HANGUL, indexJong, jongRow)?.let {
linotypePixmap.drawFromAtlas(atlas.pixmap, it, posmap.x[index] + linotypePaddingX, linotypePaddingY, renderCol)
linotypePixmap.drawPixmap(choTex, posmap.x[index] + linotypePaddingX, linotypePaddingY, renderCol) }
linotypePixmap.drawPixmap(jungTex, posmap.x[index] + linotypePaddingX, linotypePaddingY, renderCol)
linotypePixmap.drawPixmap(jongTex, posmap.x[index] + linotypePaddingX, linotypePaddingY, renderCol)
index += hangulLength - 1 index += hangulLength - 1
} }
else { else {
try { val posY = posmap.y[index].flipY() +
val posY = posmap.y[index].flipY() + if (sheetID == SHEET_UNIHAN) // evil exceptions
if (sheetID == SHEET_UNIHAN) // evil exceptions offsetUnihan
offsetUnihan else if (sheetID == SHEET_EMOJI1)
else if (sheetID == SHEET_CUSTOM_SYM) offsetEmoji1
offsetCustomSym else if (sheetID == SHEET_CUSTOM_SYM)
else 0 offsetCustomSym
else 0
val posX = posmap.x[index] val posX = posmap.x[index]
val texture = sheets[sheetID].get(sheetX, sheetY) atlas.getRegion(sheetID, sheetX, sheetY)?.let {
linotypePixmap.drawFromAtlas(atlas.pixmap, it, posX + linotypePaddingX, posY + linotypePaddingY, renderCol)
linotypePixmap.drawPixmap(texture, posX + linotypePaddingX, posY + linotypePaddingY, renderCol)
}
catch (noSuchGlyph: ArrayIndexOutOfBoundsException) {
} }
} }
@@ -698,6 +706,9 @@ class TerrarumSansBitmap(
renderCol = getColour(c) renderCol = getColour(c)
} }
} }
else if (isNoDrawChar(c) || glyphProps[c]?.isIllegal == true) {
// whitespace/control/internal/invalid — no visible glyph, just advance position
}
else if (sheetID == SHEET_HANGUL) { else if (sheetID == SHEET_HANGUL) {
// Flookahead for {I, P, F} // Flookahead for {I, P, F}
@@ -715,39 +726,33 @@ class TerrarumSansBitmap(
val (indexCho, indexJung, indexJong) = indices val (indexCho, indexJung, indexJong) = indices
val (choRow, jungRow, jongRow) = rows val (choRow, jungRow, jongRow) = rows
val hangulSheet = sheets[SHEET_HANGUL] atlas.getRegion(SHEET_HANGUL, indexCho, choRow)?.let {
linotypePixmap.drawFromAtlas(atlas.pixmap, it, posmap.x[index] + linotypePaddingX, linotypePaddingY, renderCol)
}
atlas.getRegion(SHEET_HANGUL, indexJung, jungRow)?.let {
val choTex = hangulSheet.get(indexCho, choRow) linotypePixmap.drawFromAtlas(atlas.pixmap, it, posmap.x[index] + linotypePaddingX, linotypePaddingY, renderCol)
val jungTex = hangulSheet.get(indexJung, jungRow) }
val jongTex = hangulSheet.get(indexJong, jongRow) atlas.getRegion(SHEET_HANGUL, indexJong, jongRow)?.let {
linotypePixmap.drawFromAtlas(atlas.pixmap, it, posmap.x[index] + linotypePaddingX, linotypePaddingY, renderCol)
linotypePixmap.drawPixmap(choTex, posmap.x[index] + linotypePaddingX, linotypePaddingY, renderCol) }
linotypePixmap.drawPixmap(jungTex, posmap.x[index] + linotypePaddingX, linotypePaddingY, renderCol)
linotypePixmap.drawPixmap(jongTex, posmap.x[index] + linotypePaddingX, linotypePaddingY, renderCol)
index += hangulLength - 1 index += hangulLength - 1
} }
else { else {
try { val posY = posmap.y[index].flipY() +
val posY = posmap.y[index].flipY() + if (sheetID == SHEET_UNIHAN) // evil exceptions
if (sheetID == SHEET_UNIHAN) // evil exceptions offsetUnihan
offsetUnihan else if (sheetID == SHEET_EMOJI1)
else if (sheetID == SHEET_CUSTOM_SYM) offsetEmoji1
offsetCustomSym else if (sheetID == SHEET_CUSTOM_SYM)
else 0 offsetCustomSym
else 0
val posX = posmap.x[index] val posX = posmap.x[index]
val texture = sheets[sheetID].get(sheetX, sheetY) atlas.getRegion(sheetID, sheetX, sheetY)?.let {
linotypePixmap.drawFromAtlas(atlas.pixmap, it, posX + linotypePaddingX, posY + linotypePaddingY, renderCol)
linotypePixmap.drawPixmap(texture, posX + linotypePaddingX, posY + linotypePaddingY, renderCol)
}
catch (noSuchGlyph: ArrayIndexOutOfBoundsException) {
} }
} }
@@ -826,7 +831,7 @@ class TerrarumSansBitmap(
override fun dispose() { override fun dispose() {
super.dispose() super.dispose()
textCache.values.forEach { it.dispose() } textCache.values.forEach { it.dispose() }
sheets.forEach { it.dispose() } atlas.dispose()
} }
fun getSheetType(c: CodePoint): Int { fun getSheetType(c: CodePoint): Int {
@@ -888,6 +893,18 @@ class TerrarumSansBitmap(
SHEET_HENTAIGANA_VARW -> hentaiganaIndexY(ch) SHEET_HENTAIGANA_VARW -> hentaiganaIndexY(ch)
SHEET_CONTROL_PICTURES_VARW -> controlPicturesIndexY(ch) SHEET_CONTROL_PICTURES_VARW -> controlPicturesIndexY(ch)
SHEET_LEGACY_COMPUTING_VARW -> legacyComputingIndexY(ch) SHEET_LEGACY_COMPUTING_VARW -> legacyComputingIndexY(ch)
SHEET_CYRILIC_EXTB_VARW -> cyrilicExtBIndexY(ch)
SHEET_CYRILIC_EXTA_VARW -> cyrilicExtAIndexY(ch)
SHEET_CYRILIC_EXTC_VARW -> cyrilicExtCIndexY(ch)
SHEET_LATIN_EXTE_VARW -> latinExtEIndexY(ch)
SHEET_LATIN_EXTF_VARW -> latinExtFIndexY(ch)
SHEET_LATIN_EXTG_VARW -> latinExtGIndexY(ch)
SHEET_OGHAM_VARW -> oghamIndexY(ch)
SHEET_COPTIC_VARW -> copticIndexY(ch)
SHEET_CYRILIC_EXTD_VARW -> cyrilicExtDIndexY(ch)
SHEET_MATHS1_VARW -> maths1IndexY(ch)
SHEET_EMOJI1 -> emoji1IndexY(ch)
SHEET_ENCLOSED_ALPHNUM_VARW -> enclosedAlphnumIndexY(ch)
else -> ch / 16 else -> ch / 16
} }
@@ -918,9 +935,9 @@ class TerrarumSansBitmap(
val isLowHeight = (pixmap.getPixel(codeStartX, codeStartY + 5).and(255) != 0) val isLowHeight = (pixmap.getPixel(codeStartX, codeStartY + 5).and(255) != 0)
// Keming machine parameters // Keming machine parameters
val kerningBit1 = pixmap.getPixel(codeStartX, codeStartY + 6).tagify() val kerningBit1 = pixmap.getPixel(codeStartX, codeStartY + 6).tagify() // glyph shape
val kerningBit2 = pixmap.getPixel(codeStartX, codeStartY + 7).tagify() val kerningBit2 = pixmap.getPixel(codeStartX, codeStartY + 7).tagify() // dot removal
val kerningBit3 = pixmap.getPixel(codeStartX, codeStartY + 8).tagify() val kerningBit3 = pixmap.getPixel(codeStartX, codeStartY + 8).tagify() // unused
var isKernYtype = ((kerningBit1 and 0x80000000.toInt()) != 0) var isKernYtype = ((kerningBit1 and 0x80000000.toInt()) != 0)
var kerningMask = kerningBit1.ushr(8).and(0xFFFFFF) var kerningMask = kerningBit1.ushr(8).and(0xFFFFFF)
val hasKernData = kerningBit1 and 255 != 0//(kerningBit1 and 255 != 0 && kerningMask != 0xFFFF) val hasKernData = kerningBit1 and 255 != 0//(kerningBit1 and 255 != 0 && kerningMask != 0xFFFF)
@@ -944,12 +961,12 @@ class TerrarumSansBitmap(
val shift = (3 - (it % 3)) * 8 val shift = (3 - (it % 3)) * 8
val yPixel = pixmap.getPixel(codeStartX, codeStartY + yPos).tagify() val yPixel = pixmap.getPixel(codeStartX, codeStartY + yPos).tagify()
val xPixel = pixmap.getPixel(codeStartX, codeStartY + yPos + 1).tagify() val xPixel = pixmap.getPixel(codeStartX, codeStartY + yPos + 1).tagify()
val yUsed = (yPixel ushr shift) and 128 != 0 val ySgn = ((yPixel ushr shift) and 128).let { if (it == 0) -1 else 1 }
val xUsed = (xPixel ushr shift) and 128 != 0 val xSgn = ((xPixel ushr shift) and 128).let { if (it == 0) -1 else 1 }
val y = if (yUsed) (yPixel ushr shift) and 127 else 0 val y = ((yPixel ushr shift) and 127) * ySgn
val x = if (xUsed) (xPixel ushr shift) and 127 else 0 val x = ((xPixel ushr shift) and 127) * xSgn
DiacriticsAnchor(it, x, y, xUsed, yUsed) DiacriticsAnchor(it, x, y)
}.toTypedArray() }.toTypedArray()
val alignWhere = (0..1).fold(0) { acc, y -> acc or ((pixmap.getPixel(codeStartX, codeStartY + y + 15).and(255) != 0).toInt() shl y) } val alignWhere = (0..1).fold(0) { acc, y -> acc or ((pixmap.getPixel(codeStartX, codeStartY + y + 15).and(255) != 0).toInt() shl y) }
@@ -968,7 +985,9 @@ class TerrarumSansBitmap(
GlyphProps.STACK_DONT GlyphProps.STACK_DONT
else (0..1).fold(0) { acc, y -> acc or ((pixmap.getPixel(codeStartX, codeStartY + y + 18).and(255) != 0).toInt() shl y) } else (0..1).fold(0) { acc, y -> acc or ((pixmap.getPixel(codeStartX, codeStartY + y + 18).and(255) != 0).toInt() shl y) }
glyphProps[code] = GlyphProps(width, isLowHeight, nudgeX, nudgeY, diacriticsAnchors, alignWhere, writeOnTop, stackWhere, IntArray(15), hasKernData, isKernYtype, kerningMask, directiveOpcode, directiveArg1, directiveArg2) val dotRemoval = if (kerningBit2 == 0) null else kerningBit2.ushr(8)
glyphProps[code] = GlyphProps(width, isLowHeight, nudgeX, nudgeY, diacriticsAnchors, alignWhere, writeOnTop, stackWhere, IntArray(15), hasKernData, isKernYtype, kerningMask, dotRemoval, directiveOpcode, directiveArg1, directiveArg2)
// extra info // extra info
val extCount = glyphProps[code]?.requiredExtInfoCount() ?: 0 val extCount = glyphProps[code]?.requiredExtInfoCount() ?: 0
@@ -1012,20 +1031,10 @@ class TerrarumSansBitmap(
codeRangeHangulCompat.forEach { glyphProps[it] = GlyphProps(W_HANGUL_BASE) } codeRangeHangulCompat.forEach { glyphProps[it] = GlyphProps(W_HANGUL_BASE) }
codeRange[SHEET_RUNIC].forEach { glyphProps[it] = GlyphProps(9) } codeRange[SHEET_RUNIC].forEach { glyphProps[it] = GlyphProps(9) }
codeRange[SHEET_UNIHAN].forEach { glyphProps[it] = GlyphProps(W_UNIHAN) } codeRange[SHEET_UNIHAN].forEach { glyphProps[it] = GlyphProps(W_UNIHAN) }
codeRange[SHEET_EMOJI1].forEach { glyphProps[it] = GlyphProps(W_EMOJI1) }
(0xD800..0xDFFF).forEach { glyphProps[it] = GlyphProps(0) } (0xD800..0xDFFF).forEach { glyphProps[it] = GlyphProps(0) }
(0x100000..0x10FFFF).forEach { glyphProps[it] = GlyphProps(0) } (0x100000..0x10FFFF).forEach { glyphProps[it] = GlyphProps(0) }
(0xFFFA0..0xFFFFF).forEach { glyphProps[it] = GlyphProps(0) } (0xFFFA0..0xFFFFF).forEach { glyphProps[it] = GlyphProps(0) }
// manually add width of one orphan insular letter
// WARNING: glyphs in 0xA770..0xA778 has invalid data, further care is required
glyphProps[0x1D79] = GlyphProps(9)
// U+007F is DEL originally, but this font stores bitmap of Replacement Character (U+FFFD)
// to this position. String replacer will replace U+FFFD into U+007F.
glyphProps[0x7F] = GlyphProps(15)
} }
private fun Int.halveWidth() = this / 2 + 1 private fun Int.halveWidth() = this / 2 + 1
@@ -1120,6 +1129,7 @@ class TerrarumSansBitmap(
var nonDiacriticCounter = 0 // index of last instance of non-diacritic char var nonDiacriticCounter = 0 // index of last instance of non-diacritic char
var stackUpwardCounter = 0 // TODO separate stack counter for centre- and right aligned var stackUpwardCounter = 0 // TODO separate stack counter for centre- and right aligned
var stackDownwardCounter = 0 var stackDownwardCounter = 0
var nudgeUpHighWater = 0 // tracks max nudgeY seen in current stack, so subsequent marks with lower nudge still clear the previous one
val HALF_VAR_INIT = W_VAR_INIT.minus(1).div(2) val HALF_VAR_INIT = W_VAR_INIT.minus(1).div(2)
@@ -1198,6 +1208,7 @@ class TerrarumSansBitmap(
stackUpwardCounter = 0 stackUpwardCounter = 0
stackDownwardCounter = 0 stackDownwardCounter = 0
nudgeUpHighWater = 0
} }
// FIXME HACK: using 0th diacritics' X-anchor pos as a type selector // FIXME HACK: using 0th diacritics' X-anchor pos as a type selector
/*else if (thisProp.writeOnTop && thisProp.diacriticsAnchors[0].x == GlyphProps.DIA_JOINER) { /*else if (thisProp.writeOnTop && thisProp.diacriticsAnchors[0].x == GlyphProps.DIA_JOINER) {
@@ -1217,30 +1228,32 @@ class TerrarumSansBitmap(
// set X pos according to alignment information // set X pos according to alignment information
posXbuffer[charIndex] = -thisProp.nudgeX + posXbuffer[charIndex] = -thisProp.nudgeX +
when (thisProp.alignWhere) { when (thisProp.alignWhere) {
GlyphProps.ALIGN_LEFT, GlyphProps.ALIGN_BEFORE -> posXbuffer[nonDiacriticCounter] GlyphProps.ALIGN_LEFT, GlyphProps.ALIGN_BEFORE -> {
val anchorPointX = if (itsProp.diacriticsAnchors[diacriticsType].isZero) itsProp.width else itsProp.diacriticsAnchors[diacriticsType].x
posXbuffer[nonDiacriticCounter] + anchorPointX
}
GlyphProps.ALIGN_RIGHT -> { GlyphProps.ALIGN_RIGHT -> {
// println("thisprop alignright $kerning, $extraWidth") // println("thisprop alignright $kerning, $extraWidth")
val anchorPoint = val anchorPointX = if (itsProp.diacriticsAnchors[diacriticsType].isZero) itsProp.width else itsProp.diacriticsAnchors[diacriticsType].x
if (!itsProp.diacriticsAnchors[diacriticsType].xUsed) itsProp.width else itsProp.diacriticsAnchors[diacriticsType].x
extraWidth += thisProp.width extraWidth += thisProp.width
posXbuffer[nonDiacriticCounter] + anchorPoint - W_VAR_INIT + kerning + extraWidth posXbuffer[nonDiacriticCounter] + anchorPointX - W_VAR_INIT + kerning + extraWidth
} }
GlyphProps.ALIGN_CENTRE -> { GlyphProps.ALIGN_CENTRE -> {
val anchorPoint = val anchorPointX = if (itsProp.diacriticsAnchors[diacriticsType].isZero) itsProp.width.div(2) else itsProp.diacriticsAnchors[diacriticsType].x
if (!itsProp.diacriticsAnchors[diacriticsType].xUsed) itsProp.width.div(2) else itsProp.diacriticsAnchors[diacriticsType].x
if (itsProp.alignWhere == GlyphProps.ALIGN_RIGHT) { if (itsProp.alignWhere == GlyphProps.ALIGN_RIGHT) {
if (thisChar in 0x900..0x902) if (thisChar in 0x900..0x902)
posXbuffer[nonDiacriticCounter] + anchorPoint + (itsProp.width - 1).div(2) posXbuffer[nonDiacriticCounter] + anchorPointX + (itsProp.width - 1).div(2)
else else
posXbuffer[nonDiacriticCounter] + anchorPoint + (itsProp.width + 1).div(2) posXbuffer[nonDiacriticCounter] + anchorPointX + (itsProp.width + 1).div(2)
} else { } else {
if (thisChar in 0x900..0x902) if (thisChar in 0x900..0x902)
posXbuffer[nonDiacriticCounter] + anchorPoint - (W_VAR_INIT + 1) / 2 posXbuffer[nonDiacriticCounter] + anchorPointX - (W_VAR_INIT + 1) / 2
else else
posXbuffer[nonDiacriticCounter] + anchorPoint - HALF_VAR_INIT posXbuffer[nonDiacriticCounter] + anchorPointX - HALF_VAR_INIT
} }
} }
else -> throw InternalError("Unsupported alignment: ${thisProp.alignWhere}") else -> throw InternalError("Unsupported alignment: ${thisProp.alignWhere}")
@@ -1277,14 +1290,14 @@ class TerrarumSansBitmap(
// set Y pos according to diacritics position // set Y pos according to diacritics position
when (thisProp.stackWhere) { when (thisProp.stackWhere) {
GlyphProps.STACK_DOWN -> { GlyphProps.STACK_DOWN -> {
posYbuffer[charIndex] = (-thisProp.nudgeY + H_DIACRITICS * stackDownwardCounter) * flipY.toSign() posYbuffer[charIndex] = (-thisProp.nudgeY + H_DIACRITICS * stackDownwardCounter) * flipY.toSign() - thisProp.nudgeY
stackDownwardCounter++ stackDownwardCounter++
} }
GlyphProps.STACK_UP -> { GlyphProps.STACK_UP -> {
posYbuffer[charIndex] = -thisProp.nudgeY + (-H_DIACRITICS * stackUpwardCounter + -thisProp.nudgeY) * flipY.toSign() val effectiveNudge = maxOf(thisProp.nudgeY, nudgeUpHighWater)
posYbuffer[charIndex] = -effectiveNudge + (-H_DIACRITICS * stackUpwardCounter + -effectiveNudge) * flipY.toSign() + effectiveNudge
// shift down on lowercase if applicable // shift down on lowercase if applicable
if (getSheetType(thisChar) in autoShiftDownOnLowercase && if (lastNonDiacriticChar.isLowHeight()) {
lastNonDiacriticChar.isLowHeight()) {
//dbgprn("AAARRRRHHHH for character ${thisChar.toHex()}") //dbgprn("AAARRRRHHHH for character ${thisChar.toHex()}")
//dbgprn("lastNonDiacriticChar: ${lastNonDiacriticChar.toHex()}") //dbgprn("lastNonDiacriticChar: ${lastNonDiacriticChar.toHex()}")
//dbgprn("cond: ${thisProp.alignXPos == GlyphProps.DIA_OVERLAY}, charIndex: $charIndex") //dbgprn("cond: ${thisProp.alignXPos == GlyphProps.DIA_OVERLAY}, charIndex: $charIndex")
@@ -1294,6 +1307,7 @@ class TerrarumSansBitmap(
posYbuffer[charIndex] += H_STACKUP_LOWERCASE_SHIFTDOWN * flipY.toSign() // if minus-assign doesn't work, try plus-assign posYbuffer[charIndex] += H_STACKUP_LOWERCASE_SHIFTDOWN * flipY.toSign() // if minus-assign doesn't work, try plus-assign
} }
nudgeUpHighWater = effectiveNudge
stackUpwardCounter++ stackUpwardCounter++
// dbgprn("lastNonDiacriticChar: ${lastNonDiacriticChar.charInfo()}; stack counter: $stackUpwardCounter") // dbgprn("lastNonDiacriticChar: ${lastNonDiacriticChar.charInfo()}; stack counter: $stackUpwardCounter")
@@ -1302,22 +1316,25 @@ class TerrarumSansBitmap(
posYbuffer[charIndex] = (-thisProp.nudgeY + H_DIACRITICS * stackDownwardCounter) * flipY.toSign() posYbuffer[charIndex] = (-thisProp.nudgeY + H_DIACRITICS * stackDownwardCounter) * flipY.toSign()
stackDownwardCounter++ stackDownwardCounter++
val effectiveNudge = maxOf(thisProp.nudgeY, nudgeUpHighWater)
posYbuffer[charIndex] = (-thisProp.nudgeY + -H_DIACRITICS * stackUpwardCounter) * flipY.toSign() posYbuffer[charIndex] = (-effectiveNudge + -H_DIACRITICS * stackUpwardCounter) * flipY.toSign()
// shift down on lowercase if applicable // shift down on lowercase if applicable
if (getSheetType(thisChar) in autoShiftDownOnLowercase && if (lastNonDiacriticChar.isLowHeight()) {
lastNonDiacriticChar.isLowHeight()) {
if (diacriticsType == GlyphProps.DIA_OVERLAY) if (diacriticsType == GlyphProps.DIA_OVERLAY)
posYbuffer[charIndex] += H_OVERLAY_LOWERCASE_SHIFTDOWN * flipY.toSign() // if minus-assign doesn't work, try plus-assign posYbuffer[charIndex] += H_OVERLAY_LOWERCASE_SHIFTDOWN * flipY.toSign() // if minus-assign doesn't work, try plus-assign
else else
posYbuffer[charIndex] += H_STACKUP_LOWERCASE_SHIFTDOWN * flipY.toSign() // if minus-assign doesn't work, try plus-assign posYbuffer[charIndex] += H_STACKUP_LOWERCASE_SHIFTDOWN * flipY.toSign() // if minus-assign doesn't work, try plus-assign
} }
nudgeUpHighWater = effectiveNudge
stackUpwardCounter++ stackUpwardCounter++
} }
// for BEFORE_N_AFTER, do nothing in here // for BEFORE_N_AFTER, do nothing in here
} }
// nudge Y pos according to anchor position
posYbuffer[charIndex] -= itsProp.diacriticsAnchors[diacriticsType].y
// Don't reset extraWidth here! // Don't reset extraWidth here!
} }
} }
@@ -1327,7 +1344,7 @@ class TerrarumSansBitmap(
if (str.isNotEmpty()) { if (str.isNotEmpty()) {
val lastCharProp = glyphProps[str.last()] val lastCharProp = glyphProps[str.last()]
val penultCharProp = glyphProps[str[nonDiacriticCounter]] ?: val penultCharProp = glyphProps[str[nonDiacriticCounter]] ?:
(if (errorOnUnknownChar) throw throw InternalError("No GlyphProps for char '${str[nonDiacriticCounter]}' " + (if (errorOnUnknownChar) throw InternalError("No GlyphProps for char '${str[nonDiacriticCounter]}' " +
"(${str[nonDiacriticCounter].charInfo()})") else nullProp) "(${str[nonDiacriticCounter].charInfo()})") else nullProp)
posXbuffer[posXbuffer.lastIndex] = posXbuffer[posXbuffer.lastIndex - 1] + // DON'T add 1 to house the shadow, it totally breaks stuffs posXbuffer[posXbuffer.lastIndex] = posXbuffer[posXbuffer.lastIndex - 1] + // DON'T add 1 to house the shadow, it totally breaks stuffs
if (lastCharProp != null && lastCharProp.writeOnTop >= 0) { if (lastCharProp != null && lastCharProp.writeOnTop >= 0) {
@@ -1523,8 +1540,8 @@ class TerrarumSansBitmap(
} }
// for lowercase i and j, if cNext is a diacritic that goes on top, remove the dots // for lowercase i and j, if cNext is a diacritic that goes on top, remove the dots
else if (diacriticDotRemoval.containsKey(c) && (glyphProps[cNext]?.writeOnTop ?: -1) >= 0 && glyphProps[cNext]?.stackWhere == GlyphProps.STACK_UP) { else if (glyphProps[c]!!.dotRemoval != null && (glyphProps[cNext]?.writeOnTop ?: -1) >= 0 && glyphProps[cNext]?.stackWhere == GlyphProps.STACK_UP) {
seq.add(diacriticDotRemoval[c]!!) seq.add(glyphProps[c]!!.dotRemoval!!)
} }
// BEGIN of tamil subsystem implementation // BEGIN of tamil subsystem implementation
@@ -2158,6 +2175,16 @@ class TerrarumSansBitmap(
} }
} }
private fun Pixmap.drawFromAtlas(atlas: Pixmap, region: AtlasRegion, xPos: Int, yPos: Int, col: Int) {
for (y in 0 until region.height) {
for (x in 0 until region.width) {
val pixel = atlas.getPixel(region.atlasX + x, region.atlasY + y)
val newPixel = pixel colorTimes col
this.drawPixel(xPos + x + region.offsetX, yPos + y + region.offsetY, newPixel)
}
}
}
private fun Color.toRGBA8888() = private fun Color.toRGBA8888() =
(this.r * 255f).toInt().shl(24) or (this.r * 255f).toInt().shl(24) or
(this.g * 255f).toInt().shl(16) or (this.g * 255f).toInt().shl(16) or
@@ -2554,6 +2581,8 @@ class TerrarumSansBitmap(
internal const val H = 20 internal const val H = 20
internal const val H_UNIHAN = 16 internal const val H_UNIHAN = 16
internal const val W_EMOJI1 = 17
internal const val H_EMOJI1 = 16
internal const val H_DIACRITICS = 3 internal const val H_DIACRITICS = 3
@@ -2604,6 +2633,18 @@ class TerrarumSansBitmap(
internal const val SHEET_HENTAIGANA_VARW = 39 internal const val SHEET_HENTAIGANA_VARW = 39
internal const val SHEET_CONTROL_PICTURES_VARW = 40 internal const val SHEET_CONTROL_PICTURES_VARW = 40
internal const val SHEET_LEGACY_COMPUTING_VARW = 41 internal const val SHEET_LEGACY_COMPUTING_VARW = 41
internal const val SHEET_CYRILIC_EXTB_VARW = 42
internal const val SHEET_CYRILIC_EXTA_VARW = 43
internal const val SHEET_CYRILIC_EXTC_VARW = 44
internal const val SHEET_LATIN_EXTE_VARW = 45
internal const val SHEET_LATIN_EXTF_VARW = 46
internal const val SHEET_LATIN_EXTG_VARW = 47
internal const val SHEET_OGHAM_VARW = 48
internal const val SHEET_COPTIC_VARW = 49
internal const val SHEET_CYRILIC_EXTD_VARW = 50
internal const val SHEET_MATHS1_VARW = 51
internal const val SHEET_EMOJI1 = 52
internal const val SHEET_ENCLOSED_ALPHNUM_VARW = 53
internal const val SHEET_UNKNOWN = 254 internal const val SHEET_UNKNOWN = 254
@@ -2625,10 +2666,6 @@ class TerrarumSansBitmap(
const val MOVABLE_BLOCK_1 = 0xFFFF0 const val MOVABLE_BLOCK_1 = 0xFFFF0
private val autoShiftDownOnLowercase = arrayOf(
SHEET_DIACRITICAL_MARKS_VARW
)
private val fileList = arrayOf( // MUST BE MATCHING WITH SHEET INDICES!! private val fileList = arrayOf( // MUST BE MATCHING WITH SHEET INDICES!!
"ascii_variable.tga", "ascii_variable.tga",
"hangul_johab.tga", "hangul_johab.tga",
@@ -2672,6 +2709,18 @@ class TerrarumSansBitmap(
"hentaigana_variable.tga", "hentaigana_variable.tga",
"control_pictures_variable.tga", "control_pictures_variable.tga",
"symbols_for_legacy_computing_variable.tga", "symbols_for_legacy_computing_variable.tga",
"cyrilic_extB_variable.tga",
"cyrilic_extA_variable.tga",
"cyrilic_extC_variable.tga",
"latinExtE_variable.tga",
"latinExtF_variable.tga",
"latinExtG_variable.tga",
"ogham_variable.tga",
"coptic_variable.tga",
"cyrilic_extD_variable.tga",
"maths1_extrawide_variable.tga",
"emoji1.tga",
"enclosed_alphanumeric_variable.tga",
) )
internal val codeRange = arrayOf( // MUST BE MATCHING WITH SHEET INDICES!! internal val codeRange = arrayOf( // MUST BE MATCHING WITH SHEET INDICES!!
0..0xFF, // SHEET_ASCII_VARW 0..0xFF, // SHEET_ASCII_VARW
@@ -2684,7 +2733,7 @@ class TerrarumSansBitmap(
0x400..0x52F, // SHEET_CYRILIC_VARW 0x400..0x52F, // SHEET_CYRILIC_VARW
0xFF00..0xFFFF, // SHEET_HALFWIDTH_FULLWIDTH_VARW 0xFF00..0xFFFF, // SHEET_HALFWIDTH_FULLWIDTH_VARW
0x2000..0x209F, // SHEET_UNI_PUNCT_VARW 0x2000..0x209F, // SHEET_UNI_PUNCT_VARW
0x370..0x3CE, // SHEET_GREEK_VARW 0x370..0x3FF, // SHEET_GREEK_VARW
0xE00..0xE5F, // SHEET_THAI_VARW 0xE00..0xE5F, // SHEET_THAI_VARW
0x530..0x58F, // SHEET_HAYEREN_VARW 0x530..0x58F, // SHEET_HAYEREN_VARW
0x10D0..0x10FF, // SHEET_KARTULI_VARW 0x10D0..0x10FF, // SHEET_KARTULI_VARW
@@ -2704,7 +2753,7 @@ class TerrarumSansBitmap(
0xA720..0xA7FF, // SHEET_EXTD_VARW 0xA720..0xA7FF, // SHEET_EXTD_VARW
0x20A0..0x20CF, // SHEET_CURRENCIES_VARW 0x20A0..0x20CF, // SHEET_CURRENCIES_VARW
0xFFE00..0xFFF9F, // SHEET_INTERNAL_VARW 0xFFE00..0xFFF9F, // SHEET_INTERNAL_VARW
0x2100..0x214F, // SHEET_LETTERLIKE_MATHS_VARW 0x2100..0x21FF, // SHEET_LETTERLIKE_MATHS_VARW
0x1F100..0x1F1FF, // SHEET_ENCLOSED_ALPHNUM_SUPL_VARW 0x1F100..0x1F1FF, // SHEET_ENCLOSED_ALPHNUM_SUPL_VARW
(0x0B80..0x0BFF) + (0xF00C0..0xF00FF), // SHEET_TAMIL_VARW (0x0B80..0x0BFF) + (0xF00C0..0xF00FF), // SHEET_TAMIL_VARW
0x980..0x9FF, // SHEET_BENGALI_VARW 0x980..0x9FF, // SHEET_BENGALI_VARW
@@ -2714,8 +2763,20 @@ class TerrarumSansBitmap(
  0xF0520..0xF057F, // SHEET_CODESTYLE_ASCII_VARW
  0xFB00..0xFB17, // SHEET_ALPHABETIC_PRESENTATION_FORMS
  0x1B000..0x1B16F, // SHEET_HENTAIGANA_VARW
- 0x2400..0x243F, // SHEET_CONTROL_PICTURES_VARW
+ 0x2400..0x244F, // SHEET_CONTROL_PICTURES_VARW
  0x1FB00..0x1FBFF, // SHEET_LEGACY_COMPUTING_VARW
+ 0xA640..0xA69F, // SHEET_CYRILIC_EXTB_VARW
+ 0x2DE0..0x2DFF, // SHEET_CYRILIC_EXTA_VARW
+ 0x1C80..0x1C8F, // SHEET_CYRILIC_EXTC_VARW
+ 0xAB30..0xAB6F, // SHEET_LATIN_EXTE_VARW
+ 0x10780..0x107BF, // SHEET_LATIN_EXTF_VARW
+ 0x1DF00..0x1DFFF, // SHEET_LATIN_EXTG_VARW
+ 0x1680..0x169F, // SHEET_OGHAM_VARW
+ 0x2C80..0x2CFF, // SHEET_COPTIC_VARW
+ 0x1E030..0x1E08F, // SHEET_CYRILIC_EXTD_VARW
+ 0x2200..0x23FF, // SHEET_MATHS1_VARW
+ 0x1F600..0x1F64F, // SHEET_EMOJI1
+ 0x2460..0x24FF, // SHEET_ENCLOSED_ALPHNUM_VARW
  )
  private val codeRangeHangulCompat = 0x3130..0x318F
@@ -2733,11 +2794,6 @@ class TerrarumSansBitmap(
  0x20..0x7F,
  )
- private val diacriticDotRemoval = hashMapOf(
- 'i'.toInt() to 0x131,
- 'j'.toInt() to 0x237
- )
  internal fun Int.charInfo() = "U+${this.toString(16).padStart(4, '0').toUpperCase()}: ${Character.getName(this)}"
  const val NQSP = 0x2000
@@ -3016,6 +3072,13 @@ class TerrarumSansBitmap(
  private fun isBulgarian(c: CodePoint) = c in 0xF0000..0xF005F
  private fun isSerbian(c: CodePoint) = c in 0xF0060..0xF00BF
  fun isColourCode(c: CodePoint) = c == 0x100000 || c in 0x10F000..0x10FFFF
+ private fun isNoDrawChar(c: CodePoint): Boolean =
+ c <= 0x20 || c == NBSP || c == SHY || c == OBJ ||
+ c in 0x2000..0x200D ||
+ c in 0xD800..0xDFFF ||
+ c in 0xF800..0xF8FF ||
+ c in 0xFFF70..0xFFF9F ||
+ c >= 0xFFFA0
  private fun isCharsetOverride(c: CodePoint) = c in 0xFFFC0..0xFFFCF
  private fun isDevanagari(c: CodePoint) = c in codeRange[SHEET_DEVANAGARI_VARW]
  private fun isHangulCompat(c: CodePoint) = c in codeRangeHangulCompat
@@ -3066,6 +3129,18 @@ class TerrarumSansBitmap(
  private fun hentaiganaIndexY(c: CodePoint) = (c - 0x1B000) / 16
  private fun controlPicturesIndexY(c: CodePoint) = (c - 0x2400) / 16
  private fun legacyComputingIndexY(c: CodePoint) = (c - 0x1FB00) / 16
+ private fun cyrilicExtBIndexY(c: CodePoint) = (c - 0xA640) / 16
+ private fun cyrilicExtAIndexY(c: CodePoint) = (c - 0x2DE0) / 16
+ private fun cyrilicExtCIndexY(c: CodePoint) = (c - 0x1C80) / 16
+ private fun latinExtEIndexY(c: CodePoint) = (c - 0xAB30) / 16
+ private fun latinExtFIndexY(c: CodePoint) = (c - 0x10780) / 16
+ private fun latinExtGIndexY(c: CodePoint) = (c - 0x1DF00) / 16
+ private fun oghamIndexY(c: CodePoint) = (c - 0x1680) / 16
+ private fun copticIndexY(c: CodePoint) = (c - 0x2C80) / 16
+ private fun cyrilicExtDIndexY(c: CodePoint) = (c - 0x1E030) / 16
+ private fun maths1IndexY(c: CodePoint) = (c - 0x2200) / 16
+ private fun emoji1IndexY(c: CodePoint) = (c - 0x1F600) / 16
+ private fun enclosedAlphnumIndexY(c: CodePoint) = (c - 0x2460) / 16
  val charsetOverrideDefault = Character.toChars(CHARSET_OVERRIDE_DEFAULT).toSurrogatedString()
  val charsetOverrideBulgarian = Character.toChars(CHARSET_OVERRIDE_BG_BG).toSurrogatedString()


@@ -229,16 +229,16 @@ class TerrarumTypewriterBitmap(
  val nudgeY = nudgingBits.ushr(16).toByte().toInt() // signed 8-bit int
  val diacriticsAnchors = (0..5).map {
- val yPos = 11 + (it / 3) * 2
+ val yPos = 13 - (it / 3) * 2
  val shift = (3 - (it % 3)) * 8
  val yPixel = pixmap.getPixel(codeStartX, codeStartY + yPos).tagify()
  val xPixel = pixmap.getPixel(codeStartX, codeStartY + yPos + 1).tagify()
- val y = (yPixel ushr shift) and 127
- val x = (xPixel ushr shift) and 127
- val yUsed = (yPixel ushr shift) >= 128
- val xUsed = (yPixel ushr shift) >= 128
- DiacriticsAnchor(it, x, y, xUsed, yUsed)
+ val ySgn = ((yPixel ushr shift) and 128).let { if (it == 0) -1 else 1 }
+ val xSgn = ((xPixel ushr shift) and 128).let { if (it == 0) -1 else 1 }
+ val y = ((yPixel ushr shift) and 127) * ySgn
+ val x = ((xPixel ushr shift) and 127) * xSgn
+ DiacriticsAnchor(it, x, y)
  }.toTypedArray()
  val alignWhere = (0..1).fold(0) { acc, y -> acc or ((pixmap.getPixel(codeStartX, codeStartY + y + 15).and(255) != 0).toInt() shl y) }
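
The new side of this hunk appears to read each anchor coordinate as a sign-and-magnitude byte: bit 7 selects the sign (set means positive, clear means negative) and the low 7 bits carry the magnitude, replacing the old `xUsed`/`yUsed` flags. A small standalone sketch of that decoding, assuming the pixel value has already been through `tagify()`; `decodeAnchorByte` and `anchorSlot` are hypothetical names introduced only for illustration.

```kotlin
// Hypothetical illustration of the decoding on the new side of the hunk.
// Returns the tag-column row offset and bit shift used for anchor index 0..5.
fun anchorSlot(index: Int): Pair<Int, Int> {
    val yPos = 13 - (index / 3) * 2      // anchors 0..2 read row 13, anchors 3..5 read row 11
    val shift = (3 - (index % 3)) * 8    // 24, 16 or 8: selects one byte of the pixel
    return Pair(yPos, shift)
}

// Decodes one byte of a pixel as sign (bit 7) and magnitude (bits 0..6).
fun decodeAnchorByte(pixel: Int, shift: Int): Int {
    val raw = (pixel ushr shift) and 0xFF
    val sign = if ((raw and 128) == 0) -1 else 1   // a clear sign bit means negative, as in the diff
    return (raw and 127) * sign
}
```

With this encoding a byte of `0x85` decodes to 5 and `0x05` decodes to -5; a byte of zero decodes to 0 either way.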

Binary file not shown.

BIN
work_files/coptic_variable.kra LFS Normal file

Binary file not shown.

BIN
work_files/emoji1.kra LFS Normal file

Binary file not shown.

BIN
work_files/latinExtE_variable.kra LFS Normal file

Binary file not shown.

BIN
work_files/latinExtF_variable.kra LFS Normal file

Binary file not shown.

BIN
work_files/latinExtG_variable.kra LFS Normal file

Binary file not shown.

BIN
work_files/ogham_variable.kra LFS Normal file

Binary file not shown.

Some files were not shown because too many files have changed in this diff.