20 Commits

Author SHA1 Message Date
minjaesong 39603d897b more fixes 2026-03-08 03:12:40 +09:00
minjaesong 2b1f9a866f more fixes 2026-03-08 02:38:03 +09:00
minjaesong af1d720ec2 anchor fixes 2026-03-08 01:44:17 +09:00
minjaesong 8a52fcfb91 tfw part of your code was full of hacks 2026-03-08 00:32:42 +09:00
minjaesong b82dbecc30 keming machine dot removal directive for OTF 2026-03-07 22:47:01 +09:00
minjaesong 2008bbf6dd keming machine dot removal directive 2026-03-07 22:41:24 +09:00
minjaesong 163e3d7b3e Arrow and Number Forms unicode block 2026-03-07 22:13:22 +09:00
minjaesong f26998b641 Cyrillic Ext-C 2026-03-07 21:34:40 +09:00
minjaesong 4121648dfc minor fixes (2) 2026-03-07 21:08:44 +09:00
minjaesong 30de279a08 minor fixes 2026-03-07 21:03:10 +09:00
minjaesong 956599b83f Full Cyrillic, CyrlExtA/B support (enables church slavonic) 2026-03-07 20:55:47 +09:00
minjaesong 71ea63b48e calculators update 2026-03-07 18:03:36 +09:00
minjaesong 3a4aa165f6 more readme (2) 2026-03-07 12:59:40 +09:00
minjaesong f402563e6c more readme 2026-03-07 12:57:16 +09:00
minjaesong 5a251ad381 readme update 2026-03-07 12:53:45 +09:00
minjaesong 7c90766394 optimised contour generation 2026-03-06 23:27:57 +09:00
minjaesong bc827be492 minor changes 2026-03-06 16:34:28 +09:00
minjaesong 9c4c8d3153 gitignore updates 2026-03-06 15:51:52 +09:00
minjaesong 0c99a27ffe Autokem: CNN-based glyph labeller for Keming Machine 2026-03-06 15:43:47 +09:00
minjaesong adab8fa0ef fix: some glyphs having rgb of 254,254,254 2026-03-06 11:15:23 +09:00
58 changed files with 2343 additions and 212 deletions

1
.gitattributes vendored

@@ -8,6 +8,7 @@
*.kra filter=lfs diff=lfs merge=lfs -text
*.png filter=lfs diff=lfs merge=lfs -text
*.wav filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
*.tga binary diff=hex
*.kra binary diff=hex

14
.gitignore vendored

@@ -14,10 +14,22 @@ tmp_*
*-autosave.kra
.directory
*/__pycache__
# from OTF build
**/__pycache__
OTFbuild/*.ttf
OTFbuild/*.otf
OTFbuild/*.woff
OTFbuild/*.woff2
OTFbuild/*.md
*.fea
*.xdp-*
# from Autokem build
Autokem/*.o
Autokem/autokem
Autokem/*.md
# exceptions
!**/CLAUDE.md
!CLAUDE.md

121
Autokem/CLAUDE.md Normal file

@@ -0,0 +1,121 @@
# Autokem
CNN-based tool that predicts kerning tag bits for font sprite sheets.
Trains on manually-tagged `*_variable.tga` sheets (~2650 samples across 24 sheets), then applies learned predictions to new or untagged sheets.
## Building
```bash
cd Autokem
make # optimised build (-Ofast)
make debug # ASan + UBSan, no optimisation
make clean
```
## Usage
```bash
./autokem train # train on ../src/assets/*_variable.tga
./autokem apply ../src/assets/foo_variable.tga # apply model to a sheet
./autokem stats # print model tensor shapes + metadata
./autokem help
```
- `train` scans `../src/assets/` for `*_variable.tga` (skips `*extrawide*`), collects labelled samples, trains with 80/20 split + early stopping, saves `autokem.safetensors`
- `apply` creates `.bak` backup, runs inference per cell, writes Y+5 (lowheight) and Y+6 (kern data) pixels. Skips cells with width=0, writeOnTop, or compiler directives
- Model file `autokem.safetensors` must be in the working directory
## Architecture
### Neural network
```
Input: 15x20x1 binary (300 values, alpha >= 0x80 → 1.0)
Conv2D(1→12, 3x3, same) → LeakyReLU(0.01)
Conv2D(12→16, 3x3, same) → LeakyReLU(0.01)
Flatten → 4800
Dense(4800→24) → LeakyReLU(0.01)
├── Dense(24→10) → sigmoid (shape bits A-H, J, K)
├── Dense(24→1) → sigmoid (Y-type)
└── Dense(24→1) → sigmoid (lowheight)
Total: ~117,388 params (~460 KB float32)
```
Training: Adam (lr=0.001, beta1=0.9, beta2=0.999), BCE loss, batch size 32, early stopping patience 10.
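
The parameter total quoted above can be re-derived from the layer shapes. The following is a minimal sketch, not part of the Autokem sources; the helper names `conv2d_params`, `dense_params`, and `autokem_param_count` are hypothetical and exist only for this check.

```c
#include <assert.h>
#include <stddef.h>

/* Conv2D: one kh x kw kernel per (input, output) channel pair, plus one bias per output channel. */
static size_t conv2d_params(size_t in_ch, size_t out_ch, size_t kh, size_t kw) {
    return in_ch * out_ch * kh * kw + out_ch;
}

/* Dense: a full in_f x out_f weight matrix plus one bias per output. */
static size_t dense_params(size_t in_f, size_t out_f) {
    return in_f * out_f + out_f;
}

/* Sum over the six trainable layers listed in the architecture block. */
size_t autokem_param_count(void) {
    size_t flat = 16 * 20 * 15;          /* channels x height x width after conv2 */
    assert(flat == 4800);
    return conv2d_params(1, 12, 3, 3)    /* 120     */
         + conv2d_params(12, 16, 3, 3)   /* 1,744   */
         + dense_params(flat, 24)        /* 115,224 */
         + dense_params(24, 10)          /* 250     */
         + dense_params(24, 1)           /* 25      */
         + dense_params(24, 1);          /* 25      */
}
```

The first dense layer accounts for about 98% of the weights, which is why the flattened 4800-value feature map dominates the model size.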
### File layout
| File | Purpose |
|------|---------|
| `main.c` | CLI dispatch |
| `tga.h/tga.c` | TGA reader/writer — BGRA↔RGBA8888, row-order handling, per-pixel write-in-place |
| `nn.h/nn.c` | Tensor, Conv2D (same padding), Dense, LeakyReLU, sigmoid, Adam, He init |
| `safetensor.h/safetensor.c` | `.safetensors` serialisation — 12 named tensors + JSON metadata |
| `train.h/train.c` | Data collection from sheets, training loop, validation, label distribution |
| `apply.h/apply.c` | Backup, eligibility checks, inference, pixel composition |
## Pixel format
All pixels are RGBA8888: `(R<<24) | (G<<16) | (B<<8) | A`. TGA files store bytes as BGRA — the reader/writer swaps B↔R.
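
As an illustration of the byte swap, here is a small sketch that assumes a 32-bit TGA pixel stored as four bytes in B, G, R, A order. The function names `pack_from_bgra` and `unpack_to_bgra` are hypothetical; the actual reader/writer lives in `tga.c`, which may be organised differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pack four bytes in TGA file order (B, G, R, A) into one RGBA8888 word. */
uint32_t pack_from_bgra(const uint8_t bgra[4]) {
    assert(bgra != NULL);
    return ((uint32_t)bgra[2] << 24)   /* R */
         | ((uint32_t)bgra[1] << 16)   /* G */
         | ((uint32_t)bgra[0] << 8)    /* B */
         |  (uint32_t)bgra[3];         /* A */
}

/* Inverse: split an RGBA8888 word back into TGA file order. */
void unpack_to_bgra(uint32_t rgba, uint8_t bgra[4]) {
    assert(bgra != NULL);
    bgra[0] = (uint8_t)(rgba >> 8);    /* B */
    bgra[1] = (uint8_t)(rgba >> 16);   /* G */
    bgra[2] = (uint8_t)(rgba >> 24);   /* R */
    bgra[3] = (uint8_t)rgba;           /* A */
}
```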
### Tag column (rightmost pixel column of each 16x20 cell)
| Row | Field | Encoding |
|-----|-------|----------|
| Y+0..Y+4 | Width | 5-bit binary, alpha != 0 → bit set |
| Y+5 | lowheight | alpha=0xFF → lowheight, alpha=0 → not |
| Y+6 | Kern data | See below |
| Y+9 | Compiler directive | opcode in R byte; skip cell if != 0 |
| Y+17 | writeOnTop | alpha != 0 → skip cell |
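
The table can be read as a small decoding routine. The sketch below assumes the 20 pixels of one tag column have already been loaded into an array of RGBA8888 words, and that Y+0 is the least significant width bit, as in the loop in `apply.c` from this changeset. `decode_width` and `cell_is_eligible` are hypothetical names, and `tagify` is re-implemented here from its one-line description rather than copied from the sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A fully transparent pixel carries no tag; otherwise the whole pixel is the tag. */
static uint32_t tagify(uint32_t pixel) {
    return (pixel & 0xFFu) == 0 ? 0u : pixel;
}

/* Rows Y+0..Y+4: one width bit per row, set when alpha is non-zero. */
int decode_width(const uint32_t tag_column[20]) {
    assert(tag_column != NULL);
    int width = 0;
    for (int y = 0; y < 5; y++)
        if ((tag_column[y] & 0xFFu) != 0)
            width |= 1 << y;
    return width;
}

/* Returns 1 if the cell should be labelled, 0 if it is skipped. */
int cell_is_eligible(const uint32_t tag_column[20]) {
    if (decode_width(tag_column) == 0) return 0;                  /* empty cell      */
    if ((tag_column[17] & 0xFFu) != 0) return 0;                  /* writeOnTop set  */
    if (((tagify(tag_column[9]) >> 24) & 0xFFu) != 0) return 0;   /* directive opcode */
    return 1;
}
```

Note that a directive pixel with alpha 0 is ignored even if its R byte is non-zero, because `tagify` zeroes it first.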
### Y+6 kern data pixel
```
R byte: Y0000000 (Y-type flag in MSB, bit 31)
G byte: JK000000 (J = bit 23, K = bit 22)
B byte: ABCDEFGH (A = bit 15, ..., H = bit 8)
A byte: 0xFF (hasKernData flag — must be 0xFF, not 0x01)
```
`tagify(pixel)`: returns 0 if alpha == 0, else full pixel value.
`kerningMask = (pixel >> 8) & 0xFFFFFF` then extract individual bits.
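
To make the bit positions concrete, the following sketch decodes a Y+6 pixel using the two steps just described. The `KernTag` struct and `decode_kern_pixel` are hypothetical names introduced for illustration; the font engine's own parsing is in the Kotlin file listed under Reference files.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    int     has_kern_data;  /* alpha != 0                        */
    int     ytype;          /* R byte MSB (pixel bit 31)         */
    int     j, k;           /* G byte top two bits (bits 23, 22) */
    uint8_t abcdefgh;       /* B byte: A is bit 7, H is bit 0    */
} KernTag;

/* A fully transparent pixel carries no tag; otherwise the whole pixel is the tag. */
static uint32_t tagify(uint32_t pixel) {
    return (pixel & 0xFFu) == 0 ? 0u : pixel;
}

void decode_kern_pixel(uint32_t pixel, KernTag *out) {
    assert(out != NULL);
    uint32_t tag  = tagify(pixel);
    uint32_t mask = (tag >> 8) & 0xFFFFFFu;   /* drop alpha: R,G,B now occupy bits 23..0 */
    out->has_kern_data = tag != 0;
    out->ytype    = (int)((mask >> 23) & 1u);
    out->j        = (int)((mask >> 15) & 1u);
    out->k        = (int)((mask >> 14) & 1u);
    out->abcdefgh = (uint8_t)(mask & 0xFFu);
}
```

After the shift by 8, every bit index from the layout above drops by 8, so pixel bit 31 becomes mask bit 23 and pixel bit 8 becomes mask bit 0.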
### Shape bit layout
```
A-B top (unset for lowheight minuscules like e)
|-|
C-D middle hole for majuscules (like C)
E-F middle hole for minuscules (like c)
G-H
--- baseline
|-|
J-K descender
```
## Key pitfalls
- **Alpha must be 0xFF, not 0x01.** All manually-tagged sheets use alpha=255 for kern/lowheight pixels. The font engine accepts alpha=1 (its check is `(pixel & 0xFF) != 0`), but the resulting pixel is visually transparent, so the sheet looks as if nothing was written.
- **TGA byte order**: file stores BGRA, memory is RGBA8888. Must swap B↔R on both read and write.
- **Row order**: check TGA descriptor bit 5 (`top_to_bottom`) for both read and write paths.
- **XY-swap**: `*_xyswap_variable.tga` sheets use column-major cell enumeration. Both train and apply detect `xyswap` in the filename.
- **Overfitting**: 117K params vs ~2650 samples — early stopping is essential. The model will memorise training data almost perfectly.
- **Sigmoid stability**: two-branch form (`x >= 0` vs `x < 0`) to avoid `exp()` overflow.
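
For the overfitting point, patience-based early stopping can be sketched as follows. `train.c` is not shown in this part of the diff, so this is an assumption about the general technique rather than a description of Autokem's training loop; `early_stop_epoch` is a hypothetical helper that replays a recorded list of validation losses.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the 0-based epoch at which training would stop.
 * *best_epoch receives the epoch with the lowest validation loss seen,
 * i.e. the checkpoint one would keep. */
int early_stop_epoch(const float *val_loss, int n_epochs, int patience, int *best_epoch) {
    assert(val_loss != NULL && best_epoch != NULL);
    assert(n_epochs > 0 && patience > 0);
    float best = val_loss[0];
    int since_best = 0;
    *best_epoch = 0;
    for (int e = 1; e < n_epochs; e++) {
        if (val_loss[e] < best) {
            best = val_loss[e];      /* new best: reset the patience counter */
            *best_epoch = e;
            since_best = 0;
        } else if (++since_best >= patience) {
            return e;                /* no improvement for `patience` epochs */
        }
    }
    return n_epochs - 1;             /* ran out of epochs before patience did */
}
```

With a patience of 10, as stated under Architecture, training continues for ten epochs past the best validation loss before giving up, and the weights from the best epoch are the ones worth saving.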
## Reference files
| File | What to check |
|------|---------------|
| `TerrarumSansBitmap.kt:917-930` | Tag parsing (Y+5, Y+6, tagify) |
| `TerrarumSansBitmap.kt:3082-3134` | Keming rules, kemingBitMask, rule matching |
| `OTFbuild/tga_reader.py` | TGA BGRA→RGBA conversion (reference impl) |
| `OTFbuild/glyph_parser.py:107-194` | Sheet parsing, eligibility, xyswap |
| `keming_machine.txt` | Bit encoding spec, shape examples, rule definitions |
## Verification
1. `make && ./autokem train` — should find ~2650 samples, label distribution should show A~55%, C~92%, etc.
2. `./autokem stats` — prints tensor shapes, training metadata
3. `./autokem apply ../src/assets/currencies_variable.tga` — creates `.bak`, writes kern bits
4. Check applied pixels with Python: `from tga_reader import read_tga; img.get_pixel(tag_x, tag_y+6)` — alpha should be 0xFF, not 0x00
5. `java -jar FontDemoGDX.jar` with modified sheet to visually verify kerning

22
Autokem/Makefile Normal file

@@ -0,0 +1,22 @@
CC = gcc
CFLAGS = -Ofast -Wall -Wextra -std=c11
LDFLAGS = -lm
SRC = main.c tga.c nn.c safetensor.c train.c apply.c
OBJ = $(SRC:.c=.o)

all: autokem

autokem: $(OBJ)
	$(CC) $(CFLAGS) -o $@ $^ $(LDFLAGS)

%.o: %.c
	$(CC) $(CFLAGS) -c $< -o $@

debug: CFLAGS = -g -Wall -Wextra -std=c11 -fsanitize=address,undefined
debug: LDFLAGS += -fsanitize=address,undefined
debug: clean autokem

clean:
	rm -f *.o autokem

.PHONY: all debug clean

164
Autokem/apply.c Normal file

@@ -0,0 +1,164 @@
#include "apply.h"
#include "tga.h"
#include "nn.h"
#include "safetensor.h"
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
/* Copy file for backup */
static int copy_file(const char *src, const char *dst) {
FILE *in = fopen(src, "rb");
if (!in) return -1;
FILE *out = fopen(dst, "wb");
if (!out) { fclose(in); return -1; }
char buf[4096];
size_t n;
while ((n = fread(buf, 1, sizeof(buf), in)) > 0) {
if (fwrite(buf, 1, n, out) != n) {
fclose(in); fclose(out);
return -1;
}
}
fclose(in);
fclose(out);
return 0;
}
int apply_model(const char *tga_path) {
/* Validate filename */
const char *basename = strrchr(tga_path, '/');
basename = basename ? basename + 1 : tga_path;
if (strstr(basename, "variable") == NULL) {
fprintf(stderr, "Error: %s does not appear to be a variable sheet\n", tga_path);
return 1;
}
if (strstr(basename, "extrawide") != NULL) {
fprintf(stderr, "Error: extrawide sheets are not supported\n");
return 1;
}
int is_xyswap = (strstr(basename, "xyswap") != NULL);
/* Create backup */
char bakpath[512];
snprintf(bakpath, sizeof(bakpath), "%s.bak", tga_path);
if (copy_file(tga_path, bakpath) != 0) {
fprintf(stderr, "Error: failed to create backup %s\n", bakpath);
return 1;
}
printf("Backup: %s\n", bakpath);
/* Load model */
Network *net = network_create();
if (safetensor_load("autokem.safetensors", net) != 0) {
fprintf(stderr, "Error: failed to load model\n");
network_free(net);
return 1;
}
/* Load TGA */
TgaImage *img = tga_read(tga_path);
if (!img) {
fprintf(stderr, "Error: cannot read %s\n", tga_path);
network_free(net);
return 1;
}
int cell_w = 16, cell_h = 20;
int cols = img->width / cell_w;
int rows = img->height / cell_h;
int total_cells = cols * rows;
int processed = 0, updated = 0, skipped = 0;
for (int index = 0; index < total_cells; index++) {
int cell_x, cell_y;
if (is_xyswap) {
cell_x = (index / cols) * cell_w;
cell_y = (index % cols) * cell_h;
} else {
cell_x = (index % cols) * cell_w;
cell_y = (index / cols) * cell_h;
}
int tag_x = cell_x + (cell_w - 1);
int tag_y = cell_y;
/* Read width */
int width = 0;
for (int y = 0; y < 5; y++) {
if (tga_get_pixel(img, tag_x, tag_y + y) & 0xFF)
width |= (1 << y);
}
if (width == 0) { skipped++; continue; }
/* Check writeOnTop at Y+17 — skip if defined */
uint32_t wot = tga_get_pixel(img, tag_x, tag_y + 17);
if ((wot & 0xFF) != 0) { skipped++; continue; }
/* Check compiler directive at Y+9 — skip if opcode != 0 */
uint32_t dir_pixel = tagify(tga_get_pixel(img, tag_x, tag_y + 9));
int opcode = (int)((dir_pixel >> 24) & 0xFF);
if (opcode != 0) { skipped++; continue; }
/* Extract 15x20 binary input */
float input[300];
for (int gy = 0; gy < 20; gy++) {
for (int gx = 0; gx < 15; gx++) {
uint32_t p = tga_get_pixel(img, cell_x + gx, cell_y + gy);
input[gy * 15 + gx] = ((p & 0x80) != 0) ? 1.0f : 0.0f;
}
}
/* Inference */
float output[12];
network_infer(net, input, output);
/* Threshold at 0.5 */
int A = output[0] >= 0.5f;
int B = output[1] >= 0.5f;
int C = output[2] >= 0.5f;
int D = output[3] >= 0.5f;
int E = output[4] >= 0.5f;
int F = output[5] >= 0.5f;
int G = output[6] >= 0.5f;
int H = output[7] >= 0.5f;
int J = output[8] >= 0.5f;
int K = output[9] >= 0.5f;
int ytype = output[10] >= 0.5f;
int lowheight = output[11] >= 0.5f;
/* Compose Y+5 pixel: lowheight (alpha=0xFF when set) */
uint32_t lh_pixel = lowheight ? 0xFFFFFFFF : 0x00000000;
tga_write_pixel(tga_path, img, tag_x, tag_y + 5, lh_pixel);
/* Compose Y+6 pixel:
* Red byte: Y0000000 -> bit 31
* Green byte: JK000000 -> bits 23,22
* Blue byte: ABCDEFGH -> bits 15-8
* Alpha: 0xFF = hasKernData */
uint32_t pixel = 0;
pixel |= (uint32_t)(ytype ? 0x80 : 0) << 24;
pixel |= (uint32_t)((J ? 0x80 : 0) | (K ? 0x40 : 0)) << 16;
pixel |= (uint32_t)(A<<7 | B<<6 | C<<5 | D<<4 | E<<3 | F<<2 | G<<1 | H) << 8;
pixel |= 0xFF;
tga_write_pixel(tga_path, img, tag_x, tag_y + 6, pixel);
processed++;
updated++;
}
printf("Processed: %d cells, Updated: %d, Skipped: %d (of %d total)\n",
processed, updated, skipped, total_cells);
tga_free(img);
network_free(net);
return 0;
}

8
Autokem/apply.h Normal file

@@ -0,0 +1,8 @@
#ifndef APPLY_H
#define APPLY_H
/* Apply trained model to a spritesheet.
Creates .bak backup, then writes predicted kerning bits. */
int apply_model(const char *tga_path);
#endif

BIN
Autokem/autokem.safetensors LFS Normal file

Binary file not shown.

40
Autokem/main.c Normal file

@@ -0,0 +1,40 @@
#include <stdio.h>
#include <string.h>
#include "train.h"
#include "apply.h"
#include "safetensor.h"
static void print_usage(void) {
printf("Usage: autokem <command> [args]\n");
printf("Commands:\n");
printf(" train Train model on existing spritesheets\n");
printf(" apply <file.tga> Apply trained model to a spritesheet\n");
printf(" stats Print model statistics\n");
printf(" help Print this message\n");
}
int main(int argc, char **argv) {
if (argc < 2) {
print_usage();
return 1;
}
if (strcmp(argv[1], "train") == 0) {
return train_model();
} else if (strcmp(argv[1], "apply") == 0) {
if (argc < 3) {
fprintf(stderr, "Error: apply requires a TGA file path\n");
return 1;
}
return apply_model(argv[2]);
} else if (strcmp(argv[1], "stats") == 0) {
return safetensor_stats("autokem.safetensors");
} else if (strcmp(argv[1], "help") == 0) {
print_usage();
return 0;
} else {
fprintf(stderr, "Unknown command: %s\n", argv[1]);
print_usage();
return 1;
}
}

556
Autokem/nn.c Normal file

@@ -0,0 +1,556 @@
#define _GNU_SOURCE
#include "nn.h"
#include <stdlib.h>
#include <string.h>
#include <math.h>
#include <time.h>
#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif
/* ---- Tensor ---- */
Tensor *tensor_alloc(int ndim, const int *shape) {
Tensor *t = malloc(sizeof(Tensor));
t->ndim = ndim;
t->size = 1;
for (int i = 0; i < ndim; i++) {
t->shape[i] = shape[i];
t->size *= shape[i];
}
for (int i = ndim; i < 4; i++) t->shape[i] = 0;
t->data = malloc((size_t)t->size * sizeof(float));
return t;
}
Tensor *tensor_zeros(int ndim, const int *shape) {
Tensor *t = tensor_alloc(ndim, shape);
memset(t->data, 0, (size_t)t->size * sizeof(float));
return t;
}
void tensor_free(Tensor *t) {
if (!t) return;
free(t->data);
free(t);
}
/* ---- RNG (Box-Muller) ---- */
static uint64_t rng_state = 0;
static void rng_seed(uint64_t s) { rng_state = s; }
static uint64_t xorshift64(void) {
uint64_t x = rng_state;
x ^= x << 13;
x ^= x >> 7;
x ^= x << 17;
rng_state = x;
return x;
}
static float rand_uniform(void) {
return (float)(xorshift64() & 0x7FFFFFFF) / (float)0x7FFFFFFF;
}
static float rand_normal(void) {
float u1, u2;
do { u1 = rand_uniform(); } while (u1 < 1e-10f);
u2 = rand_uniform();
return sqrtf(-2.0f * logf(u1)) * cosf(2.0f * (float)M_PI * u2);
}
/* He init: std = sqrt(2/fan_in) */
static void he_init(Tensor *w, int fan_in) {
float std = sqrtf(2.0f / (float)fan_in);
for (int i = 0; i < w->size; i++)
w->data[i] = rand_normal() * std;
}
/* ---- Activations ---- */
static inline float leaky_relu(float x) {
return x >= 0.0f ? x : 0.01f * x;
}
static inline float leaky_relu_grad(float x) {
return x >= 0.0f ? 1.0f : 0.01f;
}
static inline float sigmoid_f(float x) {
if (x >= 0.0f) {
float ez = expf(-x);
return 1.0f / (1.0f + ez);
} else {
float ez = expf(x);
return ez / (1.0f + ez);
}
}
/* ---- Conv2D forward/backward ---- */
static void conv2d_init(Conv2D *c, int in_ch, int out_ch, int kh, int kw) {
c->in_ch = in_ch;
c->out_ch = out_ch;
c->kh = kh;
c->kw = kw;
int wshape[] = {out_ch, in_ch, kh, kw};
int bshape[] = {out_ch};
c->weight = tensor_alloc(4, wshape);
c->bias = tensor_zeros(1, bshape);
c->grad_weight = tensor_zeros(4, wshape);
c->grad_bias = tensor_zeros(1, bshape);
c->m_weight = tensor_zeros(4, wshape);
c->v_weight = tensor_zeros(4, wshape);
c->m_bias = tensor_zeros(1, bshape);
c->v_bias = tensor_zeros(1, bshape);
c->input_cache = NULL;
he_init(c->weight, in_ch * kh * kw);
}
static void conv2d_free(Conv2D *c) {
tensor_free(c->weight);
tensor_free(c->bias);
tensor_free(c->grad_weight);
tensor_free(c->grad_bias);
tensor_free(c->m_weight);
tensor_free(c->v_weight);
tensor_free(c->m_bias);
tensor_free(c->v_bias);
tensor_free(c->input_cache);
}
/* Forward: input [batch, in_ch, H, W] -> output [batch, out_ch, H, W] (same padding) */
static Tensor *conv2d_forward(Conv2D *c, Tensor *input, int training) {
int batch = input->shape[0];
int in_ch = c->in_ch, out_ch = c->out_ch;
int H = input->shape[2], W = input->shape[3];
int kh = c->kh, kw = c->kw;
int ph = kh / 2, pw = kw / 2;
if (training) {
tensor_free(c->input_cache);
c->input_cache = tensor_alloc(input->ndim, input->shape);
memcpy(c->input_cache->data, input->data, (size_t)input->size * sizeof(float));
}
int oshape[] = {batch, out_ch, H, W};
Tensor *out = tensor_alloc(4, oshape);
for (int b = 0; b < batch; b++) {
for (int oc = 0; oc < out_ch; oc++) {
for (int oh = 0; oh < H; oh++) {
for (int ow = 0; ow < W; ow++) {
float sum = c->bias->data[oc];
for (int ic = 0; ic < in_ch; ic++) {
for (int fh = 0; fh < kh; fh++) {
for (int fw = 0; fw < kw; fw++) {
int ih = oh + fh - ph;
int iw = ow + fw - pw;
if (ih >= 0 && ih < H && iw >= 0 && iw < W) {
float inp = input->data[((b * in_ch + ic) * H + ih) * W + iw];
float wt = c->weight->data[((oc * in_ch + ic) * kh + fh) * kw + fw];
sum += inp * wt;
}
}
}
}
out->data[((b * out_ch + oc) * H + oh) * W + ow] = sum;
}
}
}
}
return out;
}
/* Backward: grad_output [batch, out_ch, H, W] -> grad_input [batch, in_ch, H, W] */
static Tensor *conv2d_backward(Conv2D *c, Tensor *grad_output) {
Tensor *input = c->input_cache;
int batch = input->shape[0];
int in_ch = c->in_ch, out_ch = c->out_ch;
int H = input->shape[2], W = input->shape[3];
int kh = c->kh, kw = c->kw;
int ph = kh / 2, pw = kw / 2;
Tensor *grad_input = tensor_zeros(input->ndim, input->shape);
for (int b = 0; b < batch; b++) {
for (int oc = 0; oc < out_ch; oc++) {
for (int oh = 0; oh < H; oh++) {
for (int ow = 0; ow < W; ow++) {
float go = grad_output->data[((b * out_ch + oc) * H + oh) * W + ow];
c->grad_bias->data[oc] += go;
for (int ic = 0; ic < in_ch; ic++) {
for (int fh = 0; fh < kh; fh++) {
for (int fw = 0; fw < kw; fw++) {
int ih = oh + fh - ph;
int iw = ow + fw - pw;
if (ih >= 0 && ih < H && iw >= 0 && iw < W) {
float inp = input->data[((b * in_ch + ic) * H + ih) * W + iw];
c->grad_weight->data[((oc * in_ch + ic) * kh + fh) * kw + fw] += go * inp;
grad_input->data[((b * in_ch + ic) * H + ih) * W + iw] +=
go * c->weight->data[((oc * in_ch + ic) * kh + fh) * kw + fw];
}
}
}
}
}
}
}
}
return grad_input;
}
/* ---- Dense forward/backward ---- */
static void dense_init(Dense *d, int in_f, int out_f) {
d->in_features = in_f;
d->out_features = out_f;
int wshape[] = {out_f, in_f};
int bshape[] = {out_f};
d->weight = tensor_alloc(2, wshape);
d->bias = tensor_zeros(1, bshape);
d->grad_weight = tensor_zeros(2, wshape);
d->grad_bias = tensor_zeros(1, bshape);
d->m_weight = tensor_zeros(2, wshape);
d->v_weight = tensor_zeros(2, wshape);
d->m_bias = tensor_zeros(1, bshape);
d->v_bias = tensor_zeros(1, bshape);
d->input_cache = NULL;
he_init(d->weight, in_f);
}
static void dense_free(Dense *d) {
tensor_free(d->weight);
tensor_free(d->bias);
tensor_free(d->grad_weight);
tensor_free(d->grad_bias);
tensor_free(d->m_weight);
tensor_free(d->v_weight);
tensor_free(d->m_bias);
tensor_free(d->v_bias);
tensor_free(d->input_cache);
}
/* Forward: input [batch, in_f] -> output [batch, out_f] */
static Tensor *dense_forward(Dense *d, Tensor *input, int training) {
int batch = input->shape[0];
int in_f = d->in_features, out_f = d->out_features;
if (training) {
tensor_free(d->input_cache);
d->input_cache = tensor_alloc(input->ndim, input->shape);
memcpy(d->input_cache->data, input->data, (size_t)input->size * sizeof(float));
}
int oshape[] = {batch, out_f};
Tensor *out = tensor_alloc(2, oshape);
for (int b = 0; b < batch; b++) {
for (int o = 0; o < out_f; o++) {
float sum = d->bias->data[o];
for (int i = 0; i < in_f; i++) {
sum += input->data[b * in_f + i] * d->weight->data[o * in_f + i];
}
out->data[b * out_f + o] = sum;
}
}
return out;
}
/* Backward: grad_output [batch, out_f] -> grad_input [batch, in_f] */
static Tensor *dense_backward(Dense *d, Tensor *grad_output) {
Tensor *input = d->input_cache;
int batch = input->shape[0];
int in_f = d->in_features, out_f = d->out_features;
int gshape[] = {batch, in_f};
Tensor *grad_input = tensor_zeros(2, gshape);
for (int b = 0; b < batch; b++) {
for (int o = 0; o < out_f; o++) {
float go = grad_output->data[b * out_f + o];
d->grad_bias->data[o] += go;
for (int i = 0; i < in_f; i++) {
d->grad_weight->data[o * in_f + i] += go * input->data[b * in_f + i];
grad_input->data[b * in_f + i] += go * d->weight->data[o * in_f + i];
}
}
}
return grad_input;
}
/* ---- LeakyReLU helpers on tensors ---- */
static Tensor *apply_leaky_relu(Tensor *input) {
Tensor *out = tensor_alloc(input->ndim, input->shape);
for (int i = 0; i < input->size; i++)
out->data[i] = leaky_relu(input->data[i]);
return out;
}
static Tensor *apply_leaky_relu_backward(Tensor *grad_output, Tensor *pre_activation) {
Tensor *grad = tensor_alloc(grad_output->ndim, grad_output->shape);
for (int i = 0; i < grad_output->size; i++)
grad->data[i] = grad_output->data[i] * leaky_relu_grad(pre_activation->data[i]);
return grad;
}
/* ---- Sigmoid on tensor ---- */
static Tensor *apply_sigmoid(Tensor *input) {
Tensor *out = tensor_alloc(input->ndim, input->shape);
for (int i = 0; i < input->size; i++)
out->data[i] = sigmoid_f(input->data[i]);
return out;
}
/* ---- Adam step for a single parameter tensor ---- */
static void adam_update(Tensor *param, Tensor *grad, Tensor *m, Tensor *v,
float lr, float beta1, float beta2, float eps, int t) {
float bc1 = 1.0f - powf(beta1, (float)t);
float bc2 = 1.0f - powf(beta2, (float)t);
for (int i = 0; i < param->size; i++) {
m->data[i] = beta1 * m->data[i] + (1.0f - beta1) * grad->data[i];
v->data[i] = beta2 * v->data[i] + (1.0f - beta2) * grad->data[i] * grad->data[i];
float m_hat = m->data[i] / bc1;
float v_hat = v->data[i] / bc2;
param->data[i] -= lr * m_hat / (sqrtf(v_hat) + eps);
}
}
/* ---- Network ---- */
Network *network_create(void) {
rng_seed((uint64_t)time(NULL) ^ 0xDEADBEEF);
Network *net = calloc(1, sizeof(Network));
conv2d_init(&net->conv1, 1, 12, 3, 3);
conv2d_init(&net->conv2, 12, 16, 3, 3);
dense_init(&net->fc1, 4800, 24);
dense_init(&net->head_shape, 24, 10);
dense_init(&net->head_ytype, 24, 1);
dense_init(&net->head_lowheight, 24, 1);
return net;
}
void network_free(Network *net) {
if (!net) return;
conv2d_free(&net->conv1);
conv2d_free(&net->conv2);
dense_free(&net->fc1);
dense_free(&net->head_shape);
dense_free(&net->head_ytype);
dense_free(&net->head_lowheight);
tensor_free(net->act_conv1);
tensor_free(net->act_relu1);
tensor_free(net->act_conv2);
tensor_free(net->act_relu2);
tensor_free(net->act_flat);
tensor_free(net->act_fc1);
tensor_free(net->act_relu3);
tensor_free(net->out_shape);
tensor_free(net->out_ytype);
tensor_free(net->out_lowheight);
free(net);
}
static void free_activations(Network *net) {
tensor_free(net->act_conv1); net->act_conv1 = NULL;
tensor_free(net->act_relu1); net->act_relu1 = NULL;
tensor_free(net->act_conv2); net->act_conv2 = NULL;
tensor_free(net->act_relu2); net->act_relu2 = NULL;
tensor_free(net->act_flat); net->act_flat = NULL;
tensor_free(net->act_fc1); net->act_fc1 = NULL;
tensor_free(net->act_relu3); net->act_relu3 = NULL;
tensor_free(net->out_shape); net->out_shape = NULL;
tensor_free(net->out_ytype); net->out_ytype = NULL;
tensor_free(net->out_lowheight); net->out_lowheight = NULL;
}
void network_forward(Network *net, Tensor *input, int training) {
free_activations(net);
/* Conv1 -> LeakyReLU */
net->act_conv1 = conv2d_forward(&net->conv1, input, training);
net->act_relu1 = apply_leaky_relu(net->act_conv1);
/* Conv2 -> LeakyReLU */
net->act_conv2 = conv2d_forward(&net->conv2, net->act_relu1, training);
net->act_relu2 = apply_leaky_relu(net->act_conv2);
/* Flatten: [batch, 16, 20, 15] -> [batch, 4800] */
int batch = net->act_relu2->shape[0];
int flat_size = net->act_relu2->size / batch;
int fshape[] = {batch, flat_size};
net->act_flat = tensor_alloc(2, fshape);
memcpy(net->act_flat->data, net->act_relu2->data, (size_t)net->act_relu2->size * sizeof(float));
/* FC1 -> LeakyReLU */
net->act_fc1 = dense_forward(&net->fc1, net->act_flat, training);
net->act_relu3 = apply_leaky_relu(net->act_fc1);
/* Three heads with sigmoid */
Tensor *logit_shape = dense_forward(&net->head_shape, net->act_relu3, training);
Tensor *logit_ytype = dense_forward(&net->head_ytype, net->act_relu3, training);
Tensor *logit_lowheight = dense_forward(&net->head_lowheight, net->act_relu3, training);
net->out_shape = apply_sigmoid(logit_shape);
net->out_ytype = apply_sigmoid(logit_ytype);
net->out_lowheight = apply_sigmoid(logit_lowheight);
tensor_free(logit_shape);
tensor_free(logit_ytype);
tensor_free(logit_lowheight);
}
void network_backward(Network *net, Tensor *target_shape, Tensor *target_ytype, Tensor *target_lowheight) {
int batch = net->out_shape->shape[0];
/* BCE gradient at sigmoid: d_logit = pred - target */
/* Head: shape (10 outputs) */
int gs[] = {batch, 10};
Tensor *grad_logit_shape = tensor_alloc(2, gs);
for (int i = 0; i < batch * 10; i++)
grad_logit_shape->data[i] = (net->out_shape->data[i] - target_shape->data[i]) / (float)batch;
int gy[] = {batch, 1};
Tensor *grad_logit_ytype = tensor_alloc(2, gy);
for (int i = 0; i < batch; i++)
grad_logit_ytype->data[i] = (net->out_ytype->data[i] - target_ytype->data[i]) / (float)batch;
Tensor *grad_logit_lh = tensor_alloc(2, gy);
for (int i = 0; i < batch; i++)
grad_logit_lh->data[i] = (net->out_lowheight->data[i] - target_lowheight->data[i]) / (float)batch;
/* Backward through heads */
Tensor *grad_relu3_s = dense_backward(&net->head_shape, grad_logit_shape);
Tensor *grad_relu3_y = dense_backward(&net->head_ytype, grad_logit_ytype);
Tensor *grad_relu3_l = dense_backward(&net->head_lowheight, grad_logit_lh);
/* Sum gradients from three heads */
int r3shape[] = {batch, 24};
Tensor *grad_relu3 = tensor_zeros(2, r3shape);
for (int i = 0; i < batch * 24; i++)
grad_relu3->data[i] = grad_relu3_s->data[i] + grad_relu3_y->data[i] + grad_relu3_l->data[i];
tensor_free(grad_logit_shape);
tensor_free(grad_logit_ytype);
tensor_free(grad_logit_lh);
tensor_free(grad_relu3_s);
tensor_free(grad_relu3_y);
tensor_free(grad_relu3_l);
/* LeakyReLU backward (fc1 output) */
Tensor *grad_fc1_out = apply_leaky_relu_backward(grad_relu3, net->act_fc1);
tensor_free(grad_relu3);
/* Dense fc1 backward */
Tensor *grad_flat = dense_backward(&net->fc1, grad_fc1_out);
tensor_free(grad_fc1_out);
/* Unflatten: [batch, 4800] -> [batch, 16, 20, 15] */
int ushape[] = {batch, 16, 20, 15};
Tensor *grad_relu2 = tensor_alloc(4, ushape);
memcpy(grad_relu2->data, grad_flat->data, (size_t)grad_flat->size * sizeof(float));
tensor_free(grad_flat);
/* LeakyReLU backward (conv2 output) */
Tensor *grad_conv2_out = apply_leaky_relu_backward(grad_relu2, net->act_conv2);
tensor_free(grad_relu2);
/* Conv2 backward */
Tensor *grad_relu1 = conv2d_backward(&net->conv2, grad_conv2_out);
tensor_free(grad_conv2_out);
/* LeakyReLU backward (conv1 output) */
Tensor *grad_conv1_out = apply_leaky_relu_backward(grad_relu1, net->act_conv1);
tensor_free(grad_relu1);
/* Conv1 backward */
Tensor *grad_input = conv2d_backward(&net->conv1, grad_conv1_out);
tensor_free(grad_conv1_out);
tensor_free(grad_input);
}
void network_adam_step(Network *net, float lr, float beta1, float beta2, float eps, int t) {
adam_update(net->conv1.weight, net->conv1.grad_weight, net->conv1.m_weight, net->conv1.v_weight, lr, beta1, beta2, eps, t);
adam_update(net->conv1.bias, net->conv1.grad_bias, net->conv1.m_bias, net->conv1.v_bias, lr, beta1, beta2, eps, t);
adam_update(net->conv2.weight, net->conv2.grad_weight, net->conv2.m_weight, net->conv2.v_weight, lr, beta1, beta2, eps, t);
adam_update(net->conv2.bias, net->conv2.grad_bias, net->conv2.m_bias, net->conv2.v_bias, lr, beta1, beta2, eps, t);
adam_update(net->fc1.weight, net->fc1.grad_weight, net->fc1.m_weight, net->fc1.v_weight, lr, beta1, beta2, eps, t);
adam_update(net->fc1.bias, net->fc1.grad_bias, net->fc1.m_bias, net->fc1.v_bias, lr, beta1, beta2, eps, t);
adam_update(net->head_shape.weight, net->head_shape.grad_weight, net->head_shape.m_weight, net->head_shape.v_weight, lr, beta1, beta2, eps, t);
adam_update(net->head_shape.bias, net->head_shape.grad_bias, net->head_shape.m_bias, net->head_shape.v_bias, lr, beta1, beta2, eps, t);
adam_update(net->head_ytype.weight, net->head_ytype.grad_weight, net->head_ytype.m_weight, net->head_ytype.v_weight, lr, beta1, beta2, eps, t);
adam_update(net->head_ytype.bias, net->head_ytype.grad_bias, net->head_ytype.m_bias, net->head_ytype.v_bias, lr, beta1, beta2, eps, t);
adam_update(net->head_lowheight.weight, net->head_lowheight.grad_weight, net->head_lowheight.m_weight, net->head_lowheight.v_weight, lr, beta1, beta2, eps, t);
adam_update(net->head_lowheight.bias, net->head_lowheight.grad_bias, net->head_lowheight.m_bias, net->head_lowheight.v_bias, lr, beta1, beta2, eps, t);
}
void network_zero_grad(Network *net) {
memset(net->conv1.grad_weight->data, 0, (size_t)net->conv1.grad_weight->size * sizeof(float));
memset(net->conv1.grad_bias->data, 0, (size_t)net->conv1.grad_bias->size * sizeof(float));
memset(net->conv2.grad_weight->data, 0, (size_t)net->conv2.grad_weight->size * sizeof(float));
memset(net->conv2.grad_bias->data, 0, (size_t)net->conv2.grad_bias->size * sizeof(float));
memset(net->fc1.grad_weight->data, 0, (size_t)net->fc1.grad_weight->size * sizeof(float));
memset(net->fc1.grad_bias->data, 0, (size_t)net->fc1.grad_bias->size * sizeof(float));
memset(net->head_shape.grad_weight->data, 0, (size_t)net->head_shape.grad_weight->size * sizeof(float));
memset(net->head_shape.grad_bias->data, 0, (size_t)net->head_shape.grad_bias->size * sizeof(float));
memset(net->head_ytype.grad_weight->data, 0, (size_t)net->head_ytype.grad_weight->size * sizeof(float));
memset(net->head_ytype.grad_bias->data, 0, (size_t)net->head_ytype.grad_bias->size * sizeof(float));
memset(net->head_lowheight.grad_weight->data, 0, (size_t)net->head_lowheight.grad_weight->size * sizeof(float));
memset(net->head_lowheight.grad_bias->data, 0, (size_t)net->head_lowheight.grad_bias->size * sizeof(float));
}
float network_bce_loss(Network *net, Tensor *target_shape, Tensor *target_ytype, Tensor *target_lowheight) {
float loss = 0.0f;
int batch = net->out_shape->shape[0];
for (int i = 0; i < batch * 10; i++) {
float p = net->out_shape->data[i];
float t = target_shape->data[i];
p = fmaxf(1e-7f, fminf(1.0f - 1e-7f, p));
loss -= t * logf(p) + (1.0f - t) * logf(1.0f - p);
}
for (int i = 0; i < batch; i++) {
float p = net->out_ytype->data[i];
float t = target_ytype->data[i];
p = fmaxf(1e-7f, fminf(1.0f - 1e-7f, p));
loss -= t * logf(p) + (1.0f - t) * logf(1.0f - p);
}
for (int i = 0; i < batch; i++) {
float p = net->out_lowheight->data[i];
float t = target_lowheight->data[i];
p = fmaxf(1e-7f, fminf(1.0f - 1e-7f, p));
loss -= t * logf(p) + (1.0f - t) * logf(1.0f - p);
}
return loss / (float)batch;
}
void network_infer(Network *net, const float *input300, float *output12) {
int ishape[] = {1, 1, 20, 15};
Tensor *input = tensor_alloc(4, ishape);
memcpy(input->data, input300, 300 * sizeof(float));
network_forward(net, input, 0);
/* output order: A,B,C,D,E,F,G,H,J,K, ytype, lowheight */
for (int i = 0; i < 10; i++)
output12[i] = net->out_shape->data[i];
output12[10] = net->out_ytype->data[0];
output12[11] = net->out_lowheight->data[0];
tensor_free(input);
}

90
Autokem/nn.h Normal file

@@ -0,0 +1,90 @@
#ifndef NN_H
#define NN_H
#include <stdint.h>
/* ---- Tensor ---- */
typedef struct {
float *data;
int shape[4]; /* up to 4 dims */
int ndim;
int size; /* total number of elements */
} Tensor;
Tensor *tensor_alloc(int ndim, const int *shape);
Tensor *tensor_zeros(int ndim, const int *shape);
void tensor_free(Tensor *t);
/* ---- Layers ---- */
typedef struct {
int in_ch, out_ch, kh, kw;
Tensor *weight; /* [out_ch, in_ch, kh, kw] */
Tensor *bias; /* [out_ch] */
Tensor *grad_weight;
Tensor *grad_bias;
/* Adam moments */
Tensor *m_weight, *v_weight;
Tensor *m_bias, *v_bias;
/* cached input for backward */
Tensor *input_cache;
} Conv2D;
typedef struct {
int in_features, out_features;
Tensor *weight; /* [out_features, in_features] */
Tensor *bias; /* [out_features] */
Tensor *grad_weight;
Tensor *grad_bias;
Tensor *m_weight, *v_weight;
Tensor *m_bias, *v_bias;
Tensor *input_cache;
} Dense;
/* ---- Network ---- */
typedef struct {
Conv2D conv1; /* 1->12, 3x3 */
Conv2D conv2; /* 12->16, 3x3 */
Dense fc1; /* 4800->24 */
Dense head_shape; /* 24->10 (bits A-H, J, K) */
Dense head_ytype; /* 24->1 */
Dense head_lowheight;/* 24->1 */
/* activation caches (allocated per forward) */
Tensor *act_conv1;
Tensor *act_relu1;
Tensor *act_conv2;
Tensor *act_relu2;
Tensor *act_flat;
Tensor *act_fc1;
Tensor *act_relu3;
Tensor *out_shape;
Tensor *out_ytype;
Tensor *out_lowheight;
} Network;
/* Init / free */
Network *network_create(void);
void network_free(Network *net);
/* Forward pass. input: [batch, 1, 20, 15]. Outputs stored in net->out_* */
void network_forward(Network *net, Tensor *input, int training);
/* Backward pass. targets: shape[batch,10], ytype[batch,1], lowheight[batch,1] */
void network_backward(Network *net, Tensor *target_shape, Tensor *target_ytype, Tensor *target_lowheight);
/* Adam update step */
void network_adam_step(Network *net, float lr, float beta1, float beta2, float eps, int t);
/* Zero all gradients */
void network_zero_grad(Network *net);
/* Compute BCE loss (sum of all heads) */
float network_bce_loss(Network *net, Tensor *target_shape, Tensor *target_ytype, Tensor *target_lowheight);
/* Single-sample inference: input float[300], output float[12] (A-H,J,K,ytype,lowheight) */
void network_infer(Network *net, const float *input300, float *output12);
#endif
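
The layer comments in `Network` are enough to work out the model size. The sketch below assumes the usual parameter counts for convolution and dense layers (weights plus one bias per output). It also reads the 4800 inputs of `fc1` as 16 channels times 20 by 15 pixels, which would mean both convolutions preserve the 20x15 grid; that padding detail is inferred from the comments and is not shown in this header.

```python
def conv_params(in_ch, out_ch, kh, kw):
    """Weights [out_ch, in_ch, kh, kw] plus one bias per output channel."""
    return out_ch * in_ch * kh * kw + out_ch


def dense_params(in_features, out_features):
    """Weights [out_features, in_features] plus one bias per output."""
    return out_features * in_features + out_features


layers = {
    "conv1": conv_params(1, 12, 3, 3),
    "conv2": conv_params(12, 16, 3, 3),
    "fc1": dense_params(16 * 20 * 15, 24),  # 4800 inputs (assumed layout)
    "head_shape": dense_params(24, 10),
    "head_ytype": dense_params(24, 1),
    "head_lowheight": dense_params(24, 1),
}

total = sum(layers.values())
size_kb = total * 4 / 1024  # float32 storage

for name, count in layers.items():
    print(f"{name:16s} {count:7d}")
print(f"total            {total:7d}  ({size_kb:.1f} KB)")
```

Under these assumptions the total comes to 117,388 parameters, about 459 KB as float32, nearly all of it in `fc1`.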

269
Autokem/safetensor.c Normal file

@@ -0,0 +1,269 @@
#include "safetensor.h"
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
/* Tensor registry entry */
typedef struct {
const char *name;
float *data;
int size;
int ndim;
int shape[4];
} TensorEntry;
static void collect_tensors(Network *net, TensorEntry *entries, int *count) {
int n = 0;
#define ADD(nm, layer, field) do { \
entries[n].name = nm; \
entries[n].data = net->layer.field->data; \
entries[n].size = net->layer.field->size; \
entries[n].ndim = net->layer.field->ndim; \
for (int i = 0; i < net->layer.field->ndim; i++) \
entries[n].shape[i] = net->layer.field->shape[i]; \
n++; \
} while(0)
ADD("conv1.weight", conv1, weight);
ADD("conv1.bias", conv1, bias);
ADD("conv2.weight", conv2, weight);
ADD("conv2.bias", conv2, bias);
ADD("fc1.weight", fc1, weight);
ADD("fc1.bias", fc1, bias);
ADD("head_shape.weight", head_shape, weight);
ADD("head_shape.bias", head_shape, bias);
ADD("head_ytype.weight", head_ytype, weight);
ADD("head_ytype.bias", head_ytype, bias);
ADD("head_lowheight.weight", head_lowheight, weight);
ADD("head_lowheight.bias", head_lowheight, bias);
#undef ADD
*count = n;
}
int safetensor_save(const char *path, Network *net, int total_samples, int epochs, float val_loss) {
TensorEntry entries[12];
int count;
collect_tensors(net, entries, &count);
/* Build JSON header */
char header[8192];
int pos = 0;
pos += snprintf(header + pos, sizeof(header) - (size_t)pos, "{");
/* metadata */
pos += snprintf(header + pos, sizeof(header) - (size_t)pos,
"\"__metadata__\":{\"samples\":\"%d\",\"epochs\":\"%d\",\"val_loss\":\"%.6f\"},",
total_samples, epochs, (double)val_loss);
/* tensor entries */
size_t data_offset = 0;
for (int i = 0; i < count; i++) {
size_t byte_size = (size_t)entries[i].size * sizeof(float);
pos += snprintf(header + pos, sizeof(header) - (size_t)pos,
"\"%s\":{\"dtype\":\"F32\",\"shape\":[", entries[i].name);
for (int d = 0; d < entries[i].ndim; d++) {
if (d > 0) pos += snprintf(header + pos, sizeof(header) - (size_t)pos, ",");
pos += snprintf(header + pos, sizeof(header) - (size_t)pos, "%d", entries[i].shape[d]);
}
pos += snprintf(header + pos, sizeof(header) - (size_t)pos,
"],\"data_offsets\":[%zu,%zu]}", data_offset, data_offset + byte_size);
if (i < count - 1)
pos += snprintf(header + pos, sizeof(header) - (size_t)pos, ",");
data_offset += byte_size;
}
pos += snprintf(header + pos, sizeof(header) - (size_t)pos, "}");
/* Pad header to 8-byte alignment */
size_t header_len = (size_t)pos;
size_t padded = (header_len + 7) & ~(size_t)7;
while (header_len < padded) {
header[header_len++] = ' ';
}
FILE *f = fopen(path, "wb");
if (!f) {
fprintf(stderr, "Error: cannot open %s for writing\n", path);
return -1;
}
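/* 8-byte header length. fwrite emits host byte order, so this is
   little-endian (as the format requires) only on little-endian hosts. */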
uint64_t hlen = (uint64_t)header_len;
fwrite(&hlen, 8, 1, f);
/* JSON header */
fwrite(header, 1, header_len, f);
/* Raw tensor data */
for (int i = 0; i < count; i++) {
fwrite(entries[i].data, sizeof(float), (size_t)entries[i].size, f);
}
fclose(f);
printf("Saved model to %s (%zu bytes)\n", path, 8 + header_len + data_offset);
return 0;
}
/* Minimal JSON parser: find tensor by name, extract data_offsets */
static int find_tensor_offsets(const char *json, size_t json_len, const char *name,
size_t *off_start, size_t *off_end) {
/* Search for "name": */
size_t nlen = strlen(name);
for (size_t i = 0; i + nlen + 3 < json_len; i++) {
if (json[i] == '"' && strncmp(json + i + 1, name, nlen) == 0 && json[i + 1 + nlen] == '"') {
/* Found the key, now find data_offsets */
const char *doff = strstr(json + i, "\"data_offsets\"");
if (!doff || (size_t)(doff - json) > json_len) return -1;
const char *bracket = strchr(doff, '[');
if (!bracket) return -1;
if (sscanf(bracket, "[%zu,%zu]", off_start, off_end) != 2) return -1;
return 0;
}
}
return -1;
}
int safetensor_load(const char *path, Network *net) {
FILE *f = fopen(path, "rb");
if (!f) {
fprintf(stderr, "Error: cannot open %s\n", path);
return -1;
}
uint64_t header_len;
if (fread(&header_len, 8, 1, f) != 1) { fclose(f); return -1; }
char *json = malloc((size_t)header_len + 1);
if (!json) { fclose(f); return -1; } /* allocation failed or header_len is bogus */
if (fread(json, 1, (size_t)header_len, f) != (size_t)header_len) {
free(json);
fclose(f);
return -1;
}
json[header_len] = '\0';
long data_start = 8 + (long)header_len;
TensorEntry entries[12];
int count;
collect_tensors(net, entries, &count);
for (int i = 0; i < count; i++) {
size_t off_start, off_end;
if (find_tensor_offsets(json, (size_t)header_len, entries[i].name, &off_start, &off_end) != 0) {
fprintf(stderr, "Error: tensor '%s' not found in %s\n", entries[i].name, path);
free(json);
fclose(f);
return -1;
}
size_t byte_size = off_end - off_start;
if (byte_size != (size_t)entries[i].size * sizeof(float)) {
fprintf(stderr, "Error: size mismatch for '%s': expected %zu, got %zu\n",
entries[i].name, (size_t)entries[i].size * sizeof(float), byte_size);
free(json);
fclose(f);
return -1;
}
fseek(f, data_start + (long)off_start, SEEK_SET);
if (fread(entries[i].data, 1, byte_size, f) != byte_size) {
fprintf(stderr, "Error: failed to read tensor '%s'\n", entries[i].name);
free(json);
fclose(f);
return -1;
}
}
free(json);
fclose(f);
return 0;
}
int safetensor_stats(const char *path) {
FILE *f = fopen(path, "rb");
if (!f) {
fprintf(stderr, "Error: cannot open %s\n", path);
return -1;
}
uint64_t header_len;
if (fread(&header_len, 8, 1, f) != 1) { fclose(f); return -1; }
char *json = malloc((size_t)header_len + 1);
if (!json) { fclose(f); return -1; } /* allocation failed or header_len is bogus */
if (fread(json, 1, (size_t)header_len, f) != (size_t)header_len) {
free(json);
fclose(f);
return -1;
}
json[header_len] = '\0';
fclose(f);
printf("Model: %s\n", path);
printf("Header length: %lu bytes\n", (unsigned long)header_len);
/* Print metadata: for each key, find "key":"value" inside the
   __metadata__ block and print the value */
const char *meta = strstr(json, "\"__metadata__\"");
if (meta) {
const char *keys[] = {"samples", "epochs", "val_loss"};
const char *labels[] = {"Training samples", "Epochs", "Validation loss"};
for (int k = 0; k < 3; k++) {
char search[64];
snprintf(search, sizeof(search), "\"%s\"", keys[k]);
const char *found = strstr(meta, search);
if (!found) continue;
/* skip past key and colon to opening quote of value */
const char *colon = strchr(found + strlen(search), ':');
if (!colon) continue;
const char *vstart = strchr(colon, '"');
if (!vstart) continue;
vstart++;
const char *vend = strchr(vstart, '"');
if (!vend) continue;
printf("%s: %.*s\n", labels[k], (int)(vend - vstart), vstart);
}
}
/* List tensors */
const char *tensor_names[] = {
"conv1.weight", "conv1.bias", "conv2.weight", "conv2.bias",
"fc1.weight", "fc1.bias",
"head_shape.weight", "head_shape.bias",
"head_ytype.weight", "head_ytype.bias",
"head_lowheight.weight", "head_lowheight.bias"
};
int total_params = 0;
printf("\nTensors:\n");
for (int i = 0; i < 12; i++) {
size_t off_start, off_end;
if (find_tensor_offsets(json, (size_t)header_len, tensor_names[i], &off_start, &off_end) == 0) {
int params = (int)(off_end - off_start) / 4;
total_params += params;
/* Extract shape */
const char *key = strstr(json, tensor_names[i]);
if (key) {
const char *shp = strstr(key, "\"shape\"");
if (shp) {
const char *br = strchr(shp, '[');
const char *bre = strchr(shp, ']');
if (br && bre) {
printf(" %-28s shape=[%.*s] params=%d\n",
tensor_names[i], (int)(bre - br - 1), br + 1, params);
}
}
}
}
}
printf("\nTotal parameters: %d (%.1f KB as float32)\n", total_params, (float)total_params * 4.0f / 1024.0f);
free(json);
return 0;
}
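
The file layout written by `safetensor_save` can be checked from Python with only the standard library. The sketch below builds a tiny in-memory file with the same structure (8-byte length, space-padded JSON header, raw float32 data) and reads it back. The tensor name `demo.bias` and the helper names are made up for the example, and the length is read as little-endian, which matches the C writer only on little-endian hosts.

```python
import json
import struct


def build_example():
    """Assemble a minimal file image: length prefix, padded header, data."""
    payload = struct.pack("<2f", 1.0, 2.0)
    header = {
        "__metadata__": {"epochs": "3"},
        "demo.bias": {
            "dtype": "F32",
            "shape": [2],
            "data_offsets": [0, len(payload)],
        },
    }
    text = json.dumps(header, separators=(",", ":"))
    text += " " * (-len(text) % 8)  # pad header to 8-byte alignment
    raw = text.encode("utf-8")
    return struct.pack("<Q", len(raw)) + raw + payload


def read_header(blob):
    """Return the parsed JSON header and the offset where tensor data starts."""
    (header_len,) = struct.unpack_from("<Q", blob, 0)
    header = json.loads(blob[8:8 + header_len].decode("utf-8"))
    return header, 8 + header_len


blob = build_example()
header, data_start = read_header(blob)
start, end = header["demo.bias"]["data_offsets"]
values = struct.unpack_from("<2f", blob, data_start + start)
```

The `data_offsets` are relative to the end of the header, which is why the C loader adds `data_start` before seeking.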

16
Autokem/safetensor.h Normal file

@@ -0,0 +1,16 @@
#ifndef SAFETENSOR_H
#define SAFETENSOR_H
#include "nn.h"
/* Save network weights to .safetensors format.
total_samples, epochs and val_loss are stored as strings in __metadata__. */
int safetensor_save(const char *path, Network *net, int total_samples, int epochs, float val_loss);
/* Load network weights from .safetensors file. */
int safetensor_load(const char *path, Network *net);
/* Print model stats from .safetensors file. */
int safetensor_stats(const char *path);
#endif

105
Autokem/tga.c Normal file

@@ -0,0 +1,105 @@
#include "tga.h"
#include <stdlib.h>
#include <string.h>
TgaImage *tga_read(const char *path) {
FILE *f = fopen(path, "rb");
if (!f) return NULL;
uint8_t header[18];
if (fread(header, 1, 18, f) != 18) { fclose(f); return NULL; }
uint8_t id_length = header[0];
uint8_t colour_map_type = header[1];
uint8_t image_type = header[2];
/* skip colour map spec (bytes 3-7) */
/* image spec starts at byte 8 */
uint16_t width = header[12] | (header[13] << 8);
uint16_t height = header[14] | (header[15] << 8);
uint8_t bpp = header[16];
uint8_t descriptor = header[17];
if (colour_map_type != 0 || image_type != 2 || bpp != 32) {
fclose(f);
return NULL;
}
int top_to_bottom = (descriptor & 0x20) != 0;
/* skip image ID */
if (id_length > 0) fseek(f, id_length, SEEK_CUR);
long pixel_data_offset = 18 + id_length;
TgaImage *img = malloc(sizeof(TgaImage));
if (!img) { fclose(f); return NULL; }
img->width = width;
img->height = height;
img->pixel_data_offset = pixel_data_offset;
img->top_to_bottom = top_to_bottom;
img->pixels = malloc((size_t)width * height * sizeof(uint32_t));
if (!img->pixels) { free(img); fclose(f); return NULL; }
for (int row = 0; row < height; row++) {
int y = top_to_bottom ? row : (height - 1 - row);
for (int x = 0; x < width; x++) {
uint8_t bgra[4];
if (fread(bgra, 1, 4, f) != 4) {
free(img->pixels); free(img); fclose(f);
return NULL;
}
/* TGA stores BGRA, convert to RGBA8888 */
uint32_t r = bgra[2], g = bgra[1], b = bgra[0], a = bgra[3];
img->pixels[y * width + x] = (r << 24) | (g << 16) | (b << 8) | a;
}
}
fclose(f);
return img;
}
uint32_t tga_get_pixel(const TgaImage *img, int x, int y) {
if (x < 0 || x >= img->width || y < 0 || y >= img->height) return 0;
return img->pixels[y * img->width + x];
}
int tga_write_pixel(const char *path, TgaImage *img, int x, int y, uint32_t rgba) {
if (x < 0 || x >= img->width || y < 0 || y >= img->height) return -1;
/* compute file row: reverse the mapping used during read */
int file_row;
if (img->top_to_bottom) {
file_row = y;
} else {
file_row = img->height - 1 - y;
}
long offset = img->pixel_data_offset + ((long)file_row * img->width + x) * 4;
FILE *f = fopen(path, "r+b");
if (!f) return -1;
fseek(f, offset, SEEK_SET);
/* convert RGBA8888 to TGA BGRA */
uint8_t bgra[4];
bgra[2] = (rgba >> 24) & 0xFF; /* R */
bgra[1] = (rgba >> 16) & 0xFF; /* G */
bgra[0] = (rgba >> 8) & 0xFF; /* B */
bgra[3] = rgba & 0xFF; /* A */
size_t written = fwrite(bgra, 1, 4, f);
fclose(f);
/* also update in-memory pixel array */
img->pixels[y * img->width + x] = rgba;
return (written == 4) ? 0 : -1;
}
void tga_free(TgaImage *img) {
if (!img) return;
free(img->pixels);
free(img);
}
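
Two small conversions carry most of the logic in this file: packing a BGRA byte quadruple from disk into an RGBA8888 word, and mapping an image coordinate back to a byte offset for in-place writes. A Python sketch of both follows, with hypothetical function names, assuming the 32-bit uncompressed layout that `tga_read` accepts.

```python
def pack_rgba(bgra):
    """Convert 4 file bytes (B, G, R, A) into R<<24 | G<<16 | B<<8 | A."""
    b, g, r, a = bgra
    return (r << 24) | (g << 16) | (b << 8) | a


def unpack_to_bgra(rgba):
    """Inverse of pack_rgba: RGBA8888 word back to file byte order."""
    return bytes([
        (rgba >> 8) & 0xFF,   # B
        (rgba >> 16) & 0xFF,  # G
        (rgba >> 24) & 0xFF,  # R
        rgba & 0xFF,          # A
    ])


def pixel_file_offset(pixel_data_offset, width, height, x, y, top_to_bottom):
    """Byte offset of image pixel (x, y), undoing the row flip used on read."""
    file_row = y if top_to_bottom else height - 1 - y
    return pixel_data_offset + (file_row * width + x) * 4


word = pack_rgba(bytes([0x11, 0x22, 0x33, 0xFF]))
offset = pixel_file_offset(18, 16, 20, x=0, y=19, top_to_bottom=False)
```

In a bottom-up file the last image row is stored first, so the pixel at `y = 19` of a 20-row image sits directly after the 18-byte header.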

33
Autokem/tga.h Normal file

@@ -0,0 +1,33 @@
#ifndef TGA_H
#define TGA_H
#include <stdint.h>
#include <stdio.h>
typedef struct {
int width;
int height;
uint32_t *pixels; /* RGBA8888: R<<24 | G<<16 | B<<8 | A */
long pixel_data_offset; /* byte offset of pixel data in file */
int top_to_bottom;
} TgaImage;
/* Read an uncompressed 32-bit TGA file. Returns NULL on error. */
TgaImage *tga_read(const char *path);
/* Get pixel at (x,y) as RGBA8888. Returns 0 for out-of-bounds. */
uint32_t tga_get_pixel(const TgaImage *img, int x, int y);
/* Write a single pixel (RGBA8888) to TGA file on disk at (x,y).
Opens/closes the file internally. */
int tga_write_pixel(const char *path, TgaImage *img, int x, int y, uint32_t rgba);
/* Free a TgaImage. */
void tga_free(TgaImage *img);
/* tagify: returns 0 if alpha==0, else full pixel value */
static inline uint32_t tagify(uint32_t pixel) {
return (pixel & 0xFF) == 0 ? 0 : pixel;
}
#endif

423
Autokem/train.c Normal file

@@ -0,0 +1,423 @@
#include "train.h"
#include "tga.h"
#include "nn.h"
#include "safetensor.h"
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <math.h>
#include <time.h>
#include <dirent.h>
/* ---- Data sample ---- */
typedef struct {
float input[300]; /* 15x20 binary */
float shape[10]; /* A,B,C,D,E,F,G,H,J,K */
float ytype;
float lowheight;
} Sample;
/* ---- Bit extraction from kerning mask ---- */
/* kerningMask = pixel >> 8 & 0xFFFFFF
* Layout: Red=Y0000000, Green=JK000000, Blue=ABCDEFGH
* After >> 8: bits 23-16 = Red[7:0], bits 15-8 = Green[7:0], bits 7-0 = Blue[7:0]
* Y = bit 23 (already extracted separately as isKernYtype)
* J = bit 15, K = bit 14
* A = bit 7, B = bit 6, ..., H = bit 0
*/
static void extract_shape_bits(int kerning_mask, float *shape) {
shape[0] = (float)((kerning_mask >> 7) & 1); /* A */
shape[1] = (float)((kerning_mask >> 6) & 1); /* B */
shape[2] = (float)((kerning_mask >> 5) & 1); /* C */
shape[3] = (float)((kerning_mask >> 4) & 1); /* D */
shape[4] = (float)((kerning_mask >> 3) & 1); /* E */
shape[5] = (float)((kerning_mask >> 2) & 1); /* F */
shape[6] = (float)((kerning_mask >> 1) & 1); /* G */
shape[7] = (float)((kerning_mask >> 0) & 1); /* H */
shape[8] = (float)((kerning_mask >> 15) & 1); /* J */
shape[9] = (float)((kerning_mask >> 14) & 1); /* K */
}
/* ---- Collect samples from one TGA ---- */
static int collect_from_sheet(const char *path, int is_xyswap, Sample *samples, int max_samples) {
TgaImage *img = tga_read(path);
if (!img) {
fprintf(stderr, "Warning: cannot read %s\n", path);
return 0;
}
int cell_w = 16, cell_h = 20;
int cols = img->width / cell_w;
int rows = img->height / cell_h;
int total_cells = cols * rows;
int count = 0;
for (int index = 0; index < total_cells && count < max_samples; index++) {
int cell_x, cell_y;
if (is_xyswap) {
cell_x = (index / cols) * cell_w;
cell_y = (index % cols) * cell_h;
} else {
cell_x = (index % cols) * cell_w;
cell_y = (index / cols) * cell_h;
}
int tag_x = cell_x + (cell_w - 1); /* rightmost column */
int tag_y = cell_y;
/* Read width (5-bit binary from Y+0..Y+4) */
int width = 0;
for (int y = 0; y < 5; y++) {
if (tga_get_pixel(img, tag_x, tag_y + y) & 0xFF)
width |= (1 << y);
}
if (width == 0) continue;
/* Read kerning data pixel at Y+6 */
uint32_t kern_pixel = tagify(tga_get_pixel(img, tag_x, tag_y + 6));
if ((kern_pixel & 0xFF) == 0) continue; /* no kern data */
/* Extract labels */
int is_kern_ytype = (kern_pixel & 0x80000000u) != 0;
int kerning_mask = (int)((kern_pixel >> 8) & 0xFFFFFF);
int is_low_height = (tga_get_pixel(img, tag_x, tag_y + 5) & 0xFF) != 0;
Sample *s = &samples[count];
extract_shape_bits(kerning_mask, s->shape);
s->ytype = (float)is_kern_ytype;
s->lowheight = (float)is_low_height;
/* Extract 15x20 binary input from the glyph area */
for (int gy = 0; gy < 20; gy++) {
for (int gx = 0; gx < 15; gx++) {
uint32_t p = tga_get_pixel(img, cell_x + gx, cell_y + gy);
s->input[gy * 15 + gx] = ((p & 0x80) != 0) ? 1.0f : 0.0f;
}
}
count++;
}
tga_free(img);
return count;
}
/* ---- Fisher-Yates shuffle ---- */
static void shuffle_indices(int *arr, int n) {
for (int i = n - 1; i > 0; i--) {
int j = rand() % (i + 1);
int tmp = arr[i]; arr[i] = arr[j]; arr[j] = tmp;
}
}
/* ---- Copy network weights ---- */
static void copy_tensor_data(Tensor *dst, Tensor *src) {
memcpy(dst->data, src->data, (size_t)src->size * sizeof(float));
}
static void save_weights(Network *net, Network *best) {
copy_tensor_data(best->conv1.weight, net->conv1.weight);
copy_tensor_data(best->conv1.bias, net->conv1.bias);
copy_tensor_data(best->conv2.weight, net->conv2.weight);
copy_tensor_data(best->conv2.bias, net->conv2.bias);
copy_tensor_data(best->fc1.weight, net->fc1.weight);
copy_tensor_data(best->fc1.bias, net->fc1.bias);
copy_tensor_data(best->head_shape.weight, net->head_shape.weight);
copy_tensor_data(best->head_shape.bias, net->head_shape.bias);
copy_tensor_data(best->head_ytype.weight, net->head_ytype.weight);
copy_tensor_data(best->head_ytype.bias, net->head_ytype.bias);
copy_tensor_data(best->head_lowheight.weight, net->head_lowheight.weight);
copy_tensor_data(best->head_lowheight.bias, net->head_lowheight.bias);
}
/* ---- Training ---- */
int train_model(void) {
const char *assets_dir = "../src/assets";
const int max_total = 16384;
Sample *all_samples = calloc((size_t)max_total, sizeof(Sample));
if (!all_samples) { fprintf(stderr, "Error: out of memory\n"); return 1; }
int total = 0;
/* Scan for *_variable.tga files */
DIR *dir = opendir(assets_dir);
if (!dir) {
fprintf(stderr, "Error: cannot open %s\n", assets_dir);
free(all_samples);
return 1;
}
struct dirent *ent;
int file_count = 0;
while ((ent = readdir(dir)) != NULL) {
const char *name = ent->d_name;
size_t len = strlen(name);
/* Must end with _variable.tga */
if (len < 14) continue;
if (strcmp(name + len - 13, "_variable.tga") != 0) continue;
/* Skip extrawide */
if (strstr(name, "extrawide") != NULL) continue;
/* Check for xyswap */
int is_xyswap = (strstr(name, "xyswap") != NULL);
char fullpath[512];
snprintf(fullpath, sizeof(fullpath), "%s/%s", assets_dir, name);
int got = collect_from_sheet(fullpath, is_xyswap, all_samples + total, max_total - total);
if (got > 0) {
printf(" %s: %d samples\n", name, got);
total += got;
file_count++;
}
}
closedir(dir);
printf("Collected %d samples from %d sheets\n", total, file_count);
if (total < 10) {
fprintf(stderr, "Error: too few samples to train\n");
free(all_samples);
return 1;
}
/* Print label distribution */
{
const char *bit_names[] = {"A","B","C","D","E","F","G","H","J","K","Ytype","LowH"};
int counts[12] = {0};
int nonzero_input = 0;
for (int i = 0; i < total; i++) {
for (int b = 0; b < 10; b++)
counts[b] += (int)all_samples[i].shape[b];
counts[10] += (int)all_samples[i].ytype;
counts[11] += (int)all_samples[i].lowheight;
for (int p = 0; p < 300; p++)
if (all_samples[i].input[p] > 0.5f) { nonzero_input++; break; }
}
printf("Label distribution:\n ");
for (int b = 0; b < 12; b++)
printf("%s:%d(%.0f%%) ", bit_names[b], counts[b], 100.0 * counts[b] / total);
printf("\n Non-empty inputs: %d/%d\n\n", nonzero_input, total);
}
/* Shuffle and split 80/20 */
srand((unsigned)time(NULL));
int *indices = malloc((size_t)total * sizeof(int));
if (!indices) { fprintf(stderr, "Error: out of memory\n"); free(all_samples); return 1; }
for (int i = 0; i < total; i++) indices[i] = i;
shuffle_indices(indices, total);
int n_train = (int)(total * 0.8);
int n_val = total - n_train;
printf("Train: %d, Validation: %d\n\n", n_train, n_val);
/* Create network */
Network *net = network_create();
Network *best_net = network_create();
int batch_size = 32;
float lr = 0.001f, beta1 = 0.9f, beta2 = 0.999f, eps = 1e-8f;
int max_epochs = 200;
int patience = 10;
float best_val_loss = 1e30f;
int patience_counter = 0;
int best_epoch = 0;
int adam_t = 0;
for (int epoch = 0; epoch < max_epochs; epoch++) {
/* Shuffle training indices */
shuffle_indices(indices, n_train);
float train_loss = 0.0f;
int n_batches = 0;
/* Training loop */
for (int start = 0; start < n_train; start += batch_size) {
int bs = (start + batch_size <= n_train) ? batch_size : (n_train - start);
/* Build batch tensors */
int ishape[] = {bs, 1, 20, 15};
Tensor *input = tensor_alloc(4, ishape);
int sshape[] = {bs, 10};
Tensor *tgt_shape = tensor_alloc(2, sshape);
int yshape[] = {bs, 1};
Tensor *tgt_ytype = tensor_alloc(2, yshape);
Tensor *tgt_lh = tensor_alloc(2, yshape);
for (int i = 0; i < bs; i++) {
Sample *s = &all_samples[indices[start + i]];
memcpy(input->data + i * 300, s->input, 300 * sizeof(float));
memcpy(tgt_shape->data + i * 10, s->shape, 10 * sizeof(float));
tgt_ytype->data[i] = s->ytype;
tgt_lh->data[i] = s->lowheight;
}
/* Forward */
network_zero_grad(net);
network_forward(net, input, 1);
/* Loss */
float loss = network_bce_loss(net, tgt_shape, tgt_ytype, tgt_lh);
train_loss += loss;
n_batches++;
/* Backward */
network_backward(net, tgt_shape, tgt_ytype, tgt_lh);
/* Adam step */
adam_t++;
network_adam_step(net, lr, beta1, beta2, eps, adam_t);
tensor_free(input);
tensor_free(tgt_shape);
tensor_free(tgt_ytype);
tensor_free(tgt_lh);
}
train_loss /= (float)n_batches;
/* Validation */
float val_loss = 0.0f;
int val_batches = 0;
for (int start = 0; start < n_val; start += batch_size) {
int bs = (start + batch_size <= n_val) ? batch_size : (n_val - start);
int ishape[] = {bs, 1, 20, 15};
Tensor *input = tensor_alloc(4, ishape);
int sshape[] = {bs, 10};
Tensor *tgt_shape = tensor_alloc(2, sshape);
int yshape[] = {bs, 1};
Tensor *tgt_ytype = tensor_alloc(2, yshape);
Tensor *tgt_lh = tensor_alloc(2, yshape);
for (int i = 0; i < bs; i++) {
Sample *s = &all_samples[indices[n_train + start + i]];
memcpy(input->data + i * 300, s->input, 300 * sizeof(float));
memcpy(tgt_shape->data + i * 10, s->shape, 10 * sizeof(float));
tgt_ytype->data[i] = s->ytype;
tgt_lh->data[i] = s->lowheight;
}
network_forward(net, input, 0);
val_loss += network_bce_loss(net, tgt_shape, tgt_ytype, tgt_lh);
val_batches++;
tensor_free(input);
tensor_free(tgt_shape);
tensor_free(tgt_ytype);
tensor_free(tgt_lh);
}
val_loss /= (float)val_batches;
printf("Epoch %3d: train_loss=%.4f val_loss=%.4f", epoch + 1, (double)train_loss, (double)val_loss);
if (val_loss < best_val_loss) {
best_val_loss = val_loss;
best_epoch = epoch + 1;
patience_counter = 0;
save_weights(net, best_net);
printf(" *best*");
} else {
patience_counter++;
}
printf("\n");
if (patience_counter >= patience) {
printf("\nEarly stopping at epoch %d (best epoch: %d)\n", epoch + 1, best_epoch);
break;
}
}
/* Restore best weights and save */
save_weights(best_net, net);
safetensor_save("autokem.safetensors", net, total, best_epoch, best_val_loss);
/* Compute final per-bit accuracy on validation set */
{
const char *bit_names[] = {"A","B","C","D","E","F","G","H","J","K","Ytype","LowH"};
int correct_per_bit[12] = {0};
int total_per_bit = n_val;
int n_examples = 0;
const int max_examples = 8;
printf("\nGlyph Tags — validation predictions:\n");
for (int i = 0; i < n_val; i++) {
Sample *s = &all_samples[indices[n_train + i]];
float output[12];
network_infer(net, s->input, output);
int pred_bits[12], tgt_bits[12];
int any_mismatch = 0;
for (int b = 0; b < 10; b++) {
pred_bits[b] = output[b] >= 0.5f ? 1 : 0;
tgt_bits[b] = (int)s->shape[b];
if (pred_bits[b] == tgt_bits[b]) correct_per_bit[b]++;
else any_mismatch = 1;
}
pred_bits[10] = output[10] >= 0.5f ? 1 : 0;
tgt_bits[10] = (int)s->ytype;
if (pred_bits[10] == tgt_bits[10]) correct_per_bit[10]++;
else any_mismatch = 1;
pred_bits[11] = output[11] >= 0.5f ? 1 : 0;
tgt_bits[11] = (int)s->lowheight;
if (pred_bits[11] == tgt_bits[11]) correct_per_bit[11]++;
else any_mismatch = 1;
/* Print a few examples (mix of correct and mismatched) */
if (n_examples < max_examples && (any_mismatch || i < 4)) {
/* Build tag string: e.g. "ABCDEFGH(B)" or "AB(Y)" */
char actual[32] = "", predicted[32] = "";
int ap = 0, pp = 0;
const char shape_chars[] = "ABCDEFGHJK";
for (int b = 0; b < 10; b++) {
if (tgt_bits[b]) actual[ap++] = shape_chars[b];
if (pred_bits[b]) predicted[pp++] = shape_chars[b];
}
actual[ap] = '\0'; predicted[pp] = '\0';
char actual_tag[48], pred_tag[48];
snprintf(actual_tag, sizeof(actual_tag), "%s%s%s",
ap > 0 ? actual : "(empty)",
tgt_bits[10] ? "(Y)" : "(B)",
tgt_bits[11] ? " low" : "");
snprintf(pred_tag, sizeof(pred_tag), "%s%s%s",
pp > 0 ? predicted : "(empty)",
pred_bits[10] ? "(Y)" : "(B)",
pred_bits[11] ? " low" : "");
printf(" actual=%-20s pred=%-20s %s\n", actual_tag, pred_tag,
any_mismatch ? "MISMATCH" : "ok");
n_examples++;
}
}
printf("\nPer-bit accuracy (%d val samples):\n ", n_val);
int total_correct = 0;
for (int b = 0; b < 12; b++) {
printf("%s:%.1f%% ", bit_names[b], 100.0 * correct_per_bit[b] / total_per_bit);
total_correct += correct_per_bit[b];
}
printf("\n Overall: %d/%d (%.2f%%)\n",
total_correct, n_val * 12, 100.0 * total_correct / (n_val * 12));
}
network_free(net);
network_free(best_net);
free(all_samples);
free(indices);
return 0;
}
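
The label extraction in `collect_from_sheet` and `extract_shape_bits` can be condensed into a few lines, which is handy for spot-checking a tag pixel picked in an image editor. This Python sketch follows the bit layout documented in the file (Red carries Y in its top bit, Green carries J and K, Blue carries A to H). `decode_kern_pixel` is an illustrative name; it takes the RGBA8888 word that `tga_get_pixel` would return.

```python
# (tag letter, bit position in the 24-bit kerning mask)
SHAPE_BITS = [
    ("A", 7), ("B", 6), ("C", 5), ("D", 4),
    ("E", 3), ("F", 2), ("G", 1), ("H", 0),
    ("J", 15), ("K", 14),
]


def decode_kern_pixel(rgba):
    """Return (shape letters, is_ytype), or None when alpha is zero."""
    if (rgba & 0xFF) == 0:
        return None  # transparent pixel: no kerning data
    mask = (rgba >> 8) & 0xFFFFFF
    shape = "".join(name for name, bit in SHAPE_BITS if (mask >> bit) & 1)
    is_ytype = bool(rgba & 0x80000000)
    return shape, is_ytype


# R=0x80 (Y set), G=0xC0 (J and K), B=0x81 (A and H), A=0xFF
example = decode_kern_pixel(0x80C081FF)
```

The letters come out in the same order as the training targets (A to H, then J, K), so the string can be compared directly with the `actual=` column printed after training.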

8
Autokem/train.h Normal file

@@ -0,0 +1,8 @@
#ifndef TRAIN_H
#define TRAIN_H
/* Train model on existing spritesheets in ../src/assets/
Saves to autokem.safetensors */
int train_model(void);
#endif

View File

@@ -46,8 +46,8 @@ Rightmost vertical column (should be 20 px tall) contains the tags. Tags are def
W |= Width of the character
W |
W -'
m --Is this character lowheight?
K -,
K |
K |= Tags used by the "Keming Machine"
K -'
Q ---Compiler Directive (see below)
@@ -77,29 +77,32 @@ Up&Down:
<MSB,Red> SXXXXXXX SYYYYYYY 00000000 <LSB,Blue>
Each X and Y numbers are Signed 8-Bit Integer.
Each X and Y numbers are TWO'S COMPLEMENT Signed 8-Bit Integer.
X-positive: nudges towards left
Y-positive: nudges towards up
Y-positive: nudges towards down
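
As a worked example of the nudge encoding, the Red and Green channels each hold one two's-complement byte, so a channel value of 0xFF means -1 and 0x02 means +2. A minimal Python sketch follows; `to_signed8` and `decode_nudge` are hypothetical helpers, not functions from the font program.

```python
def to_signed8(value):
    """Interpret the low 8 bits of value as a two's-complement integer."""
    value &= 0xFF
    return value - 256 if value >= 128 else value


def decode_nudge(red, green):
    """Red holds the X nudge, Green holds the Y nudge; Blue is unused."""
    return to_signed8(red), to_signed8(green)


nudge_x, nudge_y = decode_nudge(0xFF, 0x02)
```

With X-positive meaning left, an X nudge of -1 moves the glyph one pixel to the right; the Y nudge of +2 moves it two pixels in the Y-positive direction.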
#### Diacritics Anchor Point Encoding
4 Pixels are further divided as follows:
| LSB | | Red | Green | Blue |
| ------------ | ------------ | ------------ | ------------ | ------------ |
| ------------ | ------------ | ------------ | ----------- | ------------ |
| Y | Anchor point Y for: | undefined | undefined | undefined |
| X | Anchor point X for: | undefined | undefined | undefined |
| Y | Anchor point Y for: | (unused) | (unused) | (unused) |
| Y | Anchor point Y for: | Type-0 | Type-1 | Type-2 |
| X | Anchor point X for: | Type-0 | Type-1 | Type-2 |
| **MSB** | | | | |
<MSB,Red> 1Y1Y1Y1Y 1Y2Y2Y2Y 1Y3Y3Y3Y <LSB,Blue>
<MSB,Red> 1X1X1X1X 1X2X2X2X 1X3X3X3X <LSB,Blue>
<MSB,Red> 1Y1Y1Y1Y 2Y2Y2Y2Y 3Y3Y3Y3Y <LSB,Blue>
<MSB,Red> 1X1X1X1X 2X2X2X2X 3X3X3X3X <LSB,Blue>
where Red is first, Green is second, Blue is the third diacritics.
MSB for each word must be set so that the pixel would appear brighter on the image editor.
(the font program will only read low 7 bits for each RGB channel)
Each X and Y numbers are SIGN AND MAGNITUDE 8-Bit Integer.
X-positive: nudges towards left
Y-positive: nudges towards down
#### Diacritics Type Bit Encoding

View File

@@ -1,14 +1,15 @@
"""
Convert 1-bit bitmap arrays to TrueType quadratic outlines.
Convert 1-bit bitmap arrays to CFF outlines by tracing connected pixel blobs.
Each set pixel becomes part of a rectangle contour drawn clockwise.
Adjacent identical horizontal runs are merged vertically into rectangles.
Each connected component of filled pixels becomes a single closed contour
(plus additional contours for any holes). Adjacent collinear edges are
merged, minimising vertex count.
Scale: x_left = col * SCALE, y_top = (BASELINE_ROW - row) * SCALE
Scale: x = col * SCALE, y = (BASELINE_ROW - row) * SCALE
where BASELINE_ROW = 16 (ascent in pixels).
"""
from typing import Dict, List, Tuple
from typing import List, Tuple
import sheet_config as SC
@@ -16,15 +17,56 @@ SCALE = SC.SCALE
BASELINE_ROW = 16 # pixels from top to baseline
def _turn_priority(in_dx, in_dy, out_dx, out_dy):
"""
Return priority for outgoing direction relative to incoming.
Lower = preferred (rightmost turn in y-down grid coordinates).
This produces outer contours that are CW in font coordinates (y-up)
and hole contours that are CCW, matching the non-zero winding rule.
"""
# Normalise to unit directions
nidx = (1 if in_dx > 0 else -1) if in_dx else 0
nidy = (1 if in_dy > 0 else -1) if in_dy else 0
ndx = (1 if out_dx > 0 else -1) if out_dx else 0
ndy = (1 if out_dy > 0 else -1) if out_dy else 0
# Right turn in y-down coords: (-in_dy, in_dx)
if (ndx, ndy) == (-nidy, nidx):
return 0
# Straight
if (ndx, ndy) == (nidx, nidy):
return 1
# Left turn: (in_dy, -in_dx)
if (ndx, ndy) == (nidy, -nidx):
return 2
# U-turn
return 3
def _simplify(contour):
"""Remove collinear intermediate vertices from a rectilinear contour."""
n = len(contour)
if n < 3:
return contour
result = []
for i in range(n):
p = contour[(i - 1) % n]
c = contour[i]
q = contour[(i + 1) % n]
# Cross product of consecutive edge vectors
if (c[0] - p[0]) * (q[1] - c[1]) - (c[1] - p[1]) * (q[0] - c[0]) != 0:
result.append(c)
return result if len(result) >= 3 else contour
def trace_bitmap(bitmap, glyph_width_px):
"""
Convert a bitmap to a list of rectangle contours.
Convert a bitmap to polygon contours by tracing connected pixel blobs.
Each rectangle is ((x0, y0), (x1, y1)) in font units, where:
- (x0, y0) is bottom-left
- (x1, y1) is top-right
Returns list of (x0, y0, x1, y1) tuples representing rectangles.
Returns a list of contours, where each contour is a list of (x, y)
tuples in font units. Outer contours are clockwise, hole contours
counter-clockwise (non-zero winding rule).
"""
if not bitmap or not bitmap[0]:
return []
@@ -32,66 +74,79 @@ def trace_bitmap(bitmap, glyph_width_px):
h = len(bitmap)
w = len(bitmap[0])
# Step 1: Find horizontal runs per row
runs = [] # list of (row, col_start, col_end)
for row in range(h):
col = 0
while col < w:
if bitmap[row][col]:
start = col
while col < w and bitmap[row][col]:
col += 1
runs.append((row, start, col))
else:
col += 1
def filled(r, c):
return 0 <= r < h and 0 <= c < w and bitmap[r][c]
# Step 2: Merge vertically adjacent identical runs into rectangles
rects = [] # (row_start, row_end, col_start, col_end)
used = [False] * len(runs)
# -- Step 1: collect directed boundary edges --
# Pixel (r, c) occupies grid square (c, r)-(c+1, r+1).
# Edges are directed so the filled region lies on the right-hand side as
# seen on screen (y-down), which makes outer contours run clockwise.
edge_map = {} # start_vertex -> [end_vertex, ...]
for i, (row, cs, ce) in enumerate(runs):
if used[i]:
continue
# Try to extend this run downward
row_end = row + 1
j = i + 1
while j < len(runs):
r2, cs2, ce2 = runs[j]
if r2 > row_end:
break
if r2 == row_end and cs2 == cs and ce2 == ce and not used[j]:
used[j] = True
row_end = r2 + 1
j += 1
rects.append((row, row_end, cs, ce))
for r in range(h):
for c in range(w):
if not bitmap[r][c]:
continue
if not filled(r - 1, c): # top boundary
edge_map.setdefault((c, r), []).append((c + 1, r))
if not filled(r + 1, c): # bottom boundary
edge_map.setdefault((c + 1, r + 1), []).append((c, r + 1))
if not filled(r, c - 1): # left boundary
edge_map.setdefault((c, r + 1), []).append((c, r))
if not filled(r, c + 1): # right boundary
edge_map.setdefault((c + 1, r), []).append((c + 1, r + 1))
# Step 3: Convert to font coordinates
if not edge_map:
return []
# -- Step 2: trace contours using rightmost-turn rule --
used = set()
contours = []
for row_start, row_end, col_start, col_end in rects:
x0 = col_start * SCALE
x1 = col_end * SCALE
y_top = (BASELINE_ROW - row_start) * SCALE
y_bottom = (BASELINE_ROW - row_end) * SCALE
contours.append((x0, y_bottom, x1, y_top))
for sv in sorted(edge_map):
for ev in edge_map[sv]:
if (sv, ev) in used:
continue
path = [sv]
prev, curr = sv, ev
used.add((sv, ev))
while curr != sv:
path.append(curr)
idx, idy = curr[0] - prev[0], curr[1] - prev[1]
candidates = [e for e in edge_map.get(curr, [])
if (curr, e) not in used]
if not candidates:
break
best = min(candidates,
key=lambda e: _turn_priority(
idx, idy, e[0] - curr[0], e[1] - curr[1]))
used.add((curr, best))
prev, curr = curr, best
path = _simplify(path)
if len(path) >= 3:
contours.append([
(x * SCALE, (BASELINE_ROW - y) * SCALE)
for x, y in path
])
return contours
def draw_glyph_to_pen(contours, pen, x_offset=0, y_offset=0):
"""
Draw rectangle contours to a TTGlyphPen or similar pen.
Each rectangle is drawn as a clockwise closed contour (4 on-curve points).
Draw polygon contours to a T2CharStringPen (or compatible pen).
Each contour is a list of (x, y) vertices forming a closed polygon.
x_offset/y_offset shift all contours (used for alignment positioning).
"""
for x0, y0, x1, y1 in contours:
ax0 = x0 + x_offset
ax1 = x1 + x_offset
ay0 = y0 + y_offset
ay1 = y1 + y_offset
# Clockwise: bottom-left -> top-left -> top-right -> bottom-right
pen.moveTo((ax0, ay0))
pen.lineTo((ax0, ay1))
pen.lineTo((ax1, ay1))
pen.lineTo((ax1, ay0))
for contour in contours:
x, y = contour[0]
pen.moveTo((x + x_offset, y + y_offset))
for x, y in contour[1:]:
pen.lineTo((x + x_offset, y + y_offset))
pen.closePath()
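
The edge-collection and tracing steps in this hunk can be sketched in isolation. This is a minimal illustration, not the project's implementation: `boundary_edges` and `trace_loop` are hypothetical names, and the tracer assumes every vertex has exactly one outgoing edge (no diagonally touching pixels), so it skips the turn-priority rule used in the real code.

```python
def boundary_edges(bitmap):
    """Collect directed boundary edges as start_vertex -> [end_vertex, ...]."""
    h = len(bitmap)
    w = len(bitmap[0]) if h else 0

    def filled(r, c):
        return 0 <= r < h and 0 <= c < w and bool(bitmap[r][c])

    edge_map = {}
    for r in range(h):
        for c in range(w):
            if not bitmap[r][c]:
                continue
            # Pixel (r, c) occupies grid square (c, r)-(c+1, r+1).
            if not filled(r - 1, c):  # top boundary
                edge_map.setdefault((c, r), []).append((c + 1, r))
            if not filled(r + 1, c):  # bottom boundary
                edge_map.setdefault((c + 1, r + 1), []).append((c, r + 1))
            if not filled(r, c - 1):  # left boundary
                edge_map.setdefault((c, r + 1), []).append((c, r))
            if not filled(r, c + 1):  # right boundary
                edge_map.setdefault((c + 1, r), []).append((c + 1, r + 1))
    return edge_map


def trace_loop(edge_map, start):
    """Follow edges from start until the loop closes (unambiguous case only)."""
    path = [start]
    curr = edge_map[start][0]
    while curr != start:
        path.append(curr)
        curr = edge_map[curr][0]
    return path


# Two horizontally adjacent pixels: the shared inner edge is never emitted.
edges = boundary_edges([[1, 1]])
loop = trace_loop(edges, (0, 0))
print(loop)
```

Collinear vertices such as `(1, 0)` still appear in `loop`; in the hunk above, `_simplify` is what removes them before the path is scaled into font units.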

Binary file not shown.

View File

@@ -370,10 +370,9 @@ def build_font(assets_dir, output_path, no_bitmap=False, no_features=False):
x_offset = 0
x_offset -= g.props.nudge_x * SCALE
# For STACK_DOWN marks (below-base diacritics), negative nudge_y
# means "shift content down to below baseline". The sign convention
# is opposite to non-marks where positive nudge_y means shift down.
if g.props.stack_where == SC.STACK_DOWN and g.props.write_on_top >= 0:
# For marks (write_on_top >= 0), positive nudge_y means shift UP
# in the bitmap engine (opposite to non-marks where positive = down).
if g.props.write_on_top >= 0:
y_offset = g.props.nudge_y * SCALE
else:
y_offset = -g.props.nudge_y * SCALE
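
The sign convention described in the new comment can be stated as a small function. This is a sketch under assumptions: `y_offset_for` is a hypothetical helper, and the scale value of 50 is only an example, since the real `SCALE` constant is not shown in this hunk.

```python
def y_offset_for(nudge_y, write_on_top, scale):
    """Return the font-space Y offset for a glyph's nudge_y.

    Marks (write_on_top >= 0): positive nudge_y shifts the glyph up.
    Non-marks: positive nudge_y shifts the glyph down, so the sign flips.
    """
    if write_on_top >= 0:
        return nudge_y * scale
    return -nudge_y * scale


mark_offset = y_offset_for(2, 0, 50)    # a mark nudged by 2 pixels
base_offset = y_offset_for(2, -1, 50)   # a non-mark nudged by 2 pixels
print(mark_offset, base_offset)
```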

View File

@@ -38,6 +38,7 @@ class GlyphProps:
has_kern_data: bool = False
is_kern_y_type: bool = False
kerning_mask: int = 255
dot_removal: Optional[int] = None # codepoint to replace with when followed by a STACK_UP mark
directive_opcode: int = 0
directive_arg1: int = 0
directive_arg2: int = 0
@@ -131,7 +132,8 @@ def parse_variable_sheet(image, sheet_index, cell_w, cell_h, cols, is_xy_swapped
# Kerning data
kerning_bit1 = _tagify(image.get_pixel(code_start_x, code_start_y + 6))
# kerning_bit2 and kerning_bit3 are reserved
kerning_bit2 = _tagify(image.get_pixel(code_start_x, code_start_y + 7))
dot_removal = None if kerning_bit2 == 0 else (kerning_bit2 >> 8)
is_kern_y_type = (kerning_bit1 & 0x80000000) != 0
kerning_mask = (kerning_bit1 >> 8) & 0xFFFFFF
has_kern_data = (kerning_bit1 & 0xFF) != 0
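
The bit layout read in this hunk can be sketched as a standalone decoder. It assumes both inputs are 32-bit RGBA values that have already passed through `_tagify` (whose behaviour is not shown here); `decode_kerning_tags` and the sample pixel values are hypothetical.

```python
def decode_kerning_tags(kerning_bit1, kerning_bit2):
    """Decode the two Keming Machine tag pixels (Y+6 and Y+7)."""
    return {
        # Top bit of the red byte selects the Y-type kerning shape.
        "is_kern_y_type": (kerning_bit1 & 0x80000000) != 0,
        # The 24 RGB bits form the kerning mask.
        "kerning_mask": (kerning_bit1 >> 8) & 0xFFFFFF,
        # A non-zero alpha byte means kerning data is present.
        "has_kern_data": (kerning_bit1 & 0xFF) != 0,
        # The RGB bits of the second pixel hold the replacement codepoint.
        "dot_removal": None if kerning_bit2 == 0 else (kerning_bit2 >> 8),
    }


# Example: mask 0x800AFF with opaque alpha; replacement U+0131 (dotless i).
tags = decode_kerning_tags(0x800AFFFF, 0x000131FF)
print(tags)
```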
@@ -188,7 +190,7 @@ def parse_variable_sheet(image, sheet_index, cell_w, cell_h, cols, is_xy_swapped
align_where=align_where, write_on_top=write_on_top,
stack_where=stack_where, ext_info=ext_info,
has_kern_data=has_kern_data, is_kern_y_type=is_kern_y_type,
kerning_mask=kerning_mask,
kerning_mask=kerning_mask, dot_removal=dot_removal,
directive_opcode=directive_opcode, directive_arg1=directive_arg1,
directive_arg2=directive_arg2,
)

View File

@@ -70,6 +70,11 @@ languagesystem sund dflt;
if ccmp_code:
parts.append(ccmp_code)
# ccmp: dot removal (e.g. i→ı, j→ȷ when followed by STACK_UP marks)
dot_removal_code = _generate_dot_removal(glyphs, has)
if dot_removal_code:
parts.append(dot_removal_code)
# Hangul jamo GSUB assembly
hangul_code = _generate_hangul_gsub(glyphs, has, jamo_data)
if hangul_code:
@@ -162,6 +167,54 @@ def _generate_ccmp(replacewith_subs, has):
return '\n'.join(lines)
def _generate_dot_removal(glyphs, has):
"""Generate ccmp contextual substitution for dot removal.
When a base glyph tagged with dot_removal (kerning bit 2, pixel Y+7) is
followed by a STACK_UP mark, substitute the base with its dotless form.
Matches the Kotlin engine's dotRemoval logic.
"""
# Collect all STACK_UP marks
stack_up_marks = []
for cp, g in glyphs.items():
if g.props.write_on_top >= 0 and g.props.stack_where == SC.STACK_UP and has(cp):
stack_up_marks.append(cp)
if not stack_up_marks:
return ""
# Collect all base glyphs with dot_removal
dot_removal_subs = []
for cp, g in glyphs.items():
if g.props.dot_removal is not None and has(cp) and has(g.props.dot_removal):
dot_removal_subs.append((cp, g.props.dot_removal))
if not dot_removal_subs:
return ""
lines = []
# Define the STACK_UP marks class
mark_names = ' '.join(glyph_name(cp) for cp in sorted(stack_up_marks))
lines.append(f"@stackUpMarks = [{mark_names}];")
lines.append("")
# Single substitution lookup for the replacements
lines.append("lookup DotRemoval {")
for src_cp, dst_cp in sorted(dot_removal_subs):
lines.append(f" sub {glyph_name(src_cp)} by {glyph_name(dst_cp)};")
lines.append("} DotRemoval;")
lines.append("")
# Contextual rules in ccmp
lines.append("feature ccmp {")
for src_cp, _ in sorted(dot_removal_subs):
lines.append(f" sub {glyph_name(src_cp)}' lookup DotRemoval @stackUpMarks;")
lines.append("} ccmp;")
return '\n'.join(lines)
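
For orientation, the feature text that `_generate_dot_removal` assembles has roughly the shape produced below. This is a simplified sketch: `dot_removal_fea` is a hypothetical stand-in that takes glyph names directly, and the names used (`dotlessi`, `uni0237`, `acutecomb`, `gravecomb`) are illustrative rather than what `glyph_name()` actually returns.

```python
def dot_removal_fea(subs, marks):
    """Build ccmp feature text replacing dotted bases before STACK_UP marks."""
    lines = [f"@stackUpMarks = [{' '.join(sorted(marks))}];", ""]
    # Single-substitution lookup holding the actual replacements.
    lines.append("lookup DotRemoval {")
    for src, dst in sorted(subs):
        lines.append(f"    sub {src} by {dst};")
    lines.append("} DotRemoval;")
    lines.append("")
    # Contextual rules: apply the lookup only when a mark follows.
    lines.append("feature ccmp {")
    for src, _ in sorted(subs):
        lines.append(f"    sub {src}' lookup DotRemoval @stackUpMarks;")
    lines.append("} ccmp;")
    return "\n".join(lines)


fea = dot_removal_fea(
    [("i", "dotlessi"), ("j", "uni0237")],
    ["gravecomb", "acutecomb"],
)
print(fea)
```

Keeping the replacements in a separate lookup means the contextual rule only has to name the marked glyph and the lookup, which is why each base needs one line in each block.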
def _generate_hangul_gsub(glyphs, has, jamo_data):
"""
Generate Hangul jamo GSUB lookups for syllable assembly.
@@ -1750,7 +1803,7 @@ def _generate_mark(glyphs, has):
mark_groups = {} # (mark_type, align, is_dia, stack_cat) -> [(cp, g), ...]
for cp, g in marks.items():
is_dia = (0x0300 <= cp <= 0x036F)
is_dia = True # all marks (write_on_top >= 0) are diacritics; Kotlin applies lowheight shiftdown unconditionally
sc = _stack_cat(g.props.stack_where)
key = (g.props.write_on_top, g.props.align_where, is_dia, sc)
mark_groups.setdefault(key, []).append((cp, g))
@@ -1846,7 +1899,9 @@ def _generate_mark(glyphs, has):
# Lowheight adjustment for combining diacritical marks:
# shift base anchor Y down so diacritics sit closer to
# the shorter base glyph.
if is_dia and g.props.is_low_height:
# Only applies to 'up' marks (and overlay), not 'dn',
# matching Kotlin which only adjusts in STACK_UP/STACK_UP_N_DOWN.
if is_dia and g.props.is_low_height and scat != 'dn':
if mark_type == 2: # overlay
ay -= SC.H_OVERLAY_LOWERCASE_SHIFTDOWN * SC.SCALE
else: # above (type 0)
@@ -1878,12 +1933,13 @@ def _generate_mark(glyphs, has):
lines.append(f"lookup {mkmk_name} {{")
if scat == 'up':
m2y = SC.ASCENT + SC.H_DIACRITICS * SC.SCALE
m2y_base = SC.ASCENT + SC.H_DIACRITICS * SC.SCALE
else: # 'dn'
m2y = SC.ASCENT - SC.H_DIACRITICS * SC.SCALE
m2y_base = SC.ASCENT - SC.H_DIACRITICS * SC.SCALE
for cp, g in mark_list:
mx = mark_anchors.get(cp, 0)
m2y = m2y_base
lines.append(
f" pos mark {glyph_name(cp)}"
f" <anchor {mx} {m2y}> mark {class_name};"
@@ -1893,6 +1949,46 @@ def _generate_mark(glyphs, has):
lines.append("")
mkmk_lookup_names.append(mkmk_name)
# --- Nudge-Y gap correction for MarkToMark ---
# Without cascade, the gap between consecutive 'up' marks is
# H_DIACRITICS + nudge_y_2 - nudge_y_1
# which is less than H_DIACRITICS when nudge_y_1 > nudge_y_2
# (e.g. Cyrillic uni2DED nudge=2 followed by uni0487 nudge=0).
# Add contextual positioning to compensate: shift mark2 up by
# (nudge_y_1 - nudge_y_2) * SCALE for each such pair.
# This keeps Thai correct (same nudge on both marks → no correction)
# while fixing Cyrillic (different nudge → correction applied).
nudge_groups = {} # nudge_y -> [glyph_name, ...]
for (mark_type, align, is_dia, scat), mark_list in sorted(mark_groups.items()):
if scat != 'up':
continue
for cp, g in mark_list:
ny = g.props.nudge_y
nudge_groups.setdefault(ny, []).append(glyph_name(cp))
distinct_nudges = sorted(nudge_groups.keys())
correction_pairs = []
for n1 in distinct_nudges:
for n2 in distinct_nudges:
if n1 > n2:
correction_pairs.append((n1, n2, (n1 - n2) * SC.SCALE))
if correction_pairs:
for ny, glyphs in sorted(nudge_groups.items()):
lines.append(f"@up_nudge_{ny} = [{' '.join(sorted(glyphs))}];")
lines.append("")
mkmk_corr_name = "mkmk_nudge_correct"
lines.append(f"lookup {mkmk_corr_name} {{")
for n1, n2, val in correction_pairs:
lines.append(
f" pos @up_nudge_{n1} <0 0 0 0>"
f" @up_nudge_{n2} <0 {val} 0 0>;"
)
lines.append(f"}} {mkmk_corr_name};")
lines.append("")
mkmk_lookup_names.append(mkmk_corr_name)
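
The pair enumeration in the gap-correction block can be checked on its own. A minimal sketch, assuming a hypothetical helper `nudge_correction_pairs` and an example scale of 50 (the real `SC.SCALE` value is not shown in this hunk):

```python
def nudge_correction_pairs(nudges, scale):
    """List (nudge_1, nudge_2, shift) for every ordered pair with nudge_1 > nudge_2.

    shift is how far the second mark must move up to restore the full
    H_DIACRITICS gap after a first mark with a larger nudge.
    """
    distinct = sorted(set(nudges))
    return [
        (n1, n2, (n1 - n2) * scale)
        for n1 in distinct
        for n2 in distinct
        if n1 > n2
    ]


# One mark with nudge 2 and others with nudge 0, as in the Cyrillic example.
cyrillic_pairs = nudge_correction_pairs([0, 2, 0], 50)
# All marks share the same nudge, as in the Thai case: nothing to correct.
thai_pairs = nudge_correction_pairs([1, 1, 1], 50)
print(cyrillic_pairs, thai_pairs)
```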
# Register MarkToBase lookups under mark.
# dev2 is excluded: HarfBuzz/DirectWrite use abvm for Devanagari marks.
# deva is INCLUDED: CoreText's old-Indic shaper may need mark/mkmk

View File

@@ -72,6 +72,9 @@ SHEET_ALPHABETIC_PRESENTATION_FORMS = 38
SHEET_HENTAIGANA_VARW = 39
SHEET_CONTROL_PICTURES_VARW = 40
SHEET_LEGACY_COMPUTING_VARW = 41
SHEET_CYRILIC_EXTB_VARW = 42
SHEET_CYRILIC_EXTA_VARW = 43
SHEET_CYRILIC_EXTC_VARW = 44
SHEET_UNKNOWN = 254
@@ -118,6 +121,9 @@ FILE_LIST = [
"hentaigana_variable.tga",
"control_pictures_variable.tga",
"symbols_for_legacy_computing_variable.tga",
"cyrilic_extB_variable.tga",
"cyrilic_extA_variable.tga",
"cyrilic_extC_variable.tga",
]
CODE_RANGE = [
@@ -151,7 +157,7 @@ CODE_RANGE = [
list(range(0xA720, 0xA800)), # 27: Latin Ext D
list(range(0x20A0, 0x20D0)), # 28: Currencies
list(range(0xFFE00, 0xFFFA0)), # 29: Internal
list(range(0x2100, 0x2150)), # 30: Letterlike
list(range(0x2100, 0x2200)), # 30: Letterlike
list(range(0x1F100, 0x1F200)), # 31: Enclosed Alphanum Supl
list(range(0x0B80, 0x0C00)) + list(range(0xF00C0, 0xF0100)), # 32: Tamil
list(range(0x980, 0xA00)), # 33: Bengali
@@ -163,6 +169,9 @@ CODE_RANGE = [
list(range(0x1B000, 0x1B170)), # 39: Hentaigana
list(range(0x2400, 0x2440)), # 40: Control Pictures
list(range(0x1FB00, 0x1FC00)), # 41: Legacy Computing
list(range(0xA640, 0xA6A0)), # 42: Cyrillic Ext B
list(range(0x2DE0, 0x2E00)), # 43: Cyrillic Ext A
list(range(0x1C80, 0x1C8F)), # 44: Cyrillic Ext C
]
CODE_RANGE_HANGUL_COMPAT = range(0x3130, 0x3190)
@@ -539,5 +548,8 @@ def index_y(sheet_index, c):
SHEET_HENTAIGANA_VARW: lambda: (c - 0x1B000) // 16,
SHEET_CONTROL_PICTURES_VARW: lambda: (c - 0x2400) // 16,
SHEET_LEGACY_COMPUTING_VARW: lambda: (c - 0x1FB00) // 16,
SHEET_CYRILIC_EXTB_VARW: lambda: (c - 0xA640) // 16,
SHEET_CYRILIC_EXTA_VARW: lambda: (c - 0x2DE0) // 16,
SHEET_CYRILIC_EXTC_VARW: lambda: (c - 0x1C80) // 16,
SHEET_HANGUL: lambda: 0,
}.get(sheet_index, lambda: c // 16)()
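
The new `index_y` entries all follow the same rule: subtract the sheet's first codepoint and divide by 16. A small sketch of that rule, where `SHEET_BASES` and `sheet_row` are hypothetical names and the 16-glyphs-per-row layout is inferred from the `// 16` in the lambdas:

```python
# First codepoint covered by each new sheet, taken from CODE_RANGE above.
SHEET_BASES = {
    "cyrillic_ext_b": 0xA640,
    "cyrillic_ext_a": 0x2DE0,
    "cyrillic_ext_c": 0x1C80,
}


def sheet_row(sheet, codepoint):
    """Return the row of a codepoint within its sheet (16 glyphs per row)."""
    return (codepoint - SHEET_BASES[sheet]) // 16


row_ext_a = sheet_row("cyrillic_ext_a", 0x2DED)  # same row as U+2DE0
row_ext_b = sheet_row("cyrillic_ext_b", 0xA66E)  # 0x2E past the base
print(row_ext_a, row_ext_b)
```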

Binary file not shown.

View File

@@ -2,17 +2,21 @@
![Font sample — necessary information in this image is also provided below.](demo.PNG)
This font is a bitmap font used in [my game project called Terrarum](https://github.com/minjaesong/Terrarum) (hence the name). The font supports more than 90 % of european languages, as well as Chinese, Japanese, and Korean.
This font is a bitmap font used in [my game project called Terrarum](https://github.com/curioustorvald/Terrarum) (hence the name). The font supports more than 90% of European languages, as well as Chinese, Japanese, and Korean.
The JAR package is meant to be used with LibGDX (extends ```BitmapFont``` class). If you are not using the framework, please refer to the __Font metrics__ section to implement the font metrics correctly on your system.
The font is provided in the following formats:
* **OTF** — This is most likely the version you want. It's compatible with anything that supports OpenType fonts.
* **WOFF2** — This is the OTF font repackaged as a web font. You will want this if you want to use it on a web page.
* **JAR** — This is the version you want if you work with LibGDX. (extends ```BitmapFont``` class)
The issue page is open. If you have some issues to submit, or have a question, please leave it on the page.
#### Notes and Limitations
- Displaying Bulgarian/Serbian variants of Cyrillic requires special Control Characters. (`GameFontBase.charsetOverrideBulgarian` -- U+FFFC1; `GameFontBase.charsetOverrideSerbian` -- U+FFFC2)
- JAR version comes with its own shaping and typesetting engine, texture caching, and self-contained assets. It is NOT compatible with `GlyphLayout`.
- (JAR only) Displaying Bulgarian/Serbian variants of Cyrillic requires special Control Characters. (`GameFontBase.charsetOverrideBulgarian` -- U+FFFC1; `GameFontBase.charsetOverrideSerbian` -- U+FFFC2)
- All Han characters are in Mainland Chinese variant. There is no plan to support the other variants unless there is someone willing to do the drawing of the characters
- Among Indic scripts, only Devanagari and Tamil have full (as far as I can manage) ligature support -- Bengali script does not have any ligature support
- Slick2d versions are now unsupported. I couldn't stretch myself to work on both versions, but pull requests are still welcome.
### Design Goals
@@ -23,13 +27,11 @@ The issue page is open. If you have some issues to submit, or have a question, p
## Download
- Go ahead to the [release tab](https://github.com/minjaesong/Terrarum-sans-bitmap/releases), and download the most recent version. It is **not** advised to use the .jar found within the repository, they're experimental builds I use during the development, and may contain bugs like leaking memory.
- Go to the [release tab](https://github.com/curioustorvald/Terrarum-sans-bitmap/releases), and download the most recent version. It is **not** advised to use the .jar found within the repository; those are experimental builds I use during development, and they may contain bugs such as memory leaks.
## Using on your game
## Using on your LibGDX project
- Firstly, place the .jar to your library path and assets folder to the main directory of the app, then:
### Using on LibGDX
- Firstly, place the .jar to your library path, then:
On your code (Kotlin):
@@ -40,7 +42,7 @@ On your code (Kotlin):
lateinit var fontGame: Font
override fun create() {
fontGame = TerrarumSansBitmap(path_to_assets, ...)
fontGame = TerrarumSansBitmap(...)
...
}
@@ -62,7 +64,7 @@ On your code (Java):
Font fontGame;
@Override void create() {
fontGame = new TerrarumSansBitmap(path_to_assets, ...);
fontGame = new TerrarumSansBitmap(...);
...
}
@@ -91,7 +93,7 @@ U+100000 is used to disable previously-applied color codes (going back to origin
## Contribution guidelines
Please refer to [CONTRIBUTING.md](https://github.com/minjaesong/Terrarum-sans-bitmap/blob/master/CONTRIBUTING.md)
Please refer to [CONTRIBUTING.md](https://github.com/curioustorvald/Terrarum-sans-bitmap/blob/master/CONTRIBUTING.md)
## Acknowledgement

BIN
demo.PNG

Binary file not shown (size before: 168 KiB; after: 177 KiB).

View File

@@ -25,6 +25,7 @@ How multilingual? Real multilingual!
􏻬আমি কাঁচ খেতে পারি, তাতে আমার কোনো ক্ষতি হয় না। 􀀀
􏻬󿿁Под южно дърво, цъфтящо в синьо, бягаше малко пухкаво зайче󿿀􀀀
􏻬ᎠᏍᎦᏯᎡᎦᎢᎾᎨᎢᎣᏍᏓᎤᎩᏍᏗᎥᎴᏓᎯᎲᎢᏔᎵᏕᎦᏟᏗᏖᎸᎳᏗᏗᎧᎵᎢᏘᎴᎩ ᏙᏱᏗᏜᏫᏗᏣᏚᎦᏫᏛᏄᏓᎦᏝᏃᎠᎾᏗᎭᏞᎦᎯᎦᏘᏓᏠᎨᏏᏕᏡᎬᏢᏓᏥᏩᏝᎡᎢᎪᎢ ᎠᎦᏂᏗᎮᎢᎫᎩᎬᏩᎴᎢᎠᏆᏅᏛᎫᏊᎾᎥᎠᏁᏙᎲᏐᏈᎵᎤᎩᎸᏓᏭᎷᏤᎢᏏᏉᏯᏌᏊ ᎤᏂᏋᎢᏡᎬᎢᎰᏩᎬᏤᎵᏍᏗᏱᎩᎱᎱᎤᎩᎴᎢᏦᎢᎠᏂᏧᏣᏨᎦᏥᎪᎥᏌᏊᎤᎶᏒᎢᎢᏡᎬᎢ ᎹᎦᎺᎵᏥᎻᎼᏏᎽᏗᏩᏂᎦᏘᎾᎿᎠᏁᎬᎢᏅᎩᎾᏂᎡᎢᏌᎶᎵᏎᎷᎠᏑᏍᏗᏪᎩ ᎠᎴ ᏬᏗᏲᏭᎾᏓᏍᏓᏴᏁᎢᎤᎦᏅᏮᏰᎵᏳᏂᎨᎢ􀀀
􏻬Ѳеѡфа́нъ и҆ Алеѯі́й, ѕѣлѡ̀ возлюби́вше ѱалти́рь, воспѣ́ша при свѣ́тѣ ѕвѣ́здъ, помазꙋ́юще сщ҃е́нное мѵ́ро; серафими мн̑оꙮ҆читїи̑, ꙗ҆́кѡ ѻ҆́гнь, ѡ҆крꙋжа́хꙋ прⷭ҇то́лъ Бж҃їй, и҆ всѧ̀ землѧ̀ и҆спо́лнисѧ свѣ́та, ꙗ҆́кѡ ѕмі́й попра́нъ є҆́сть􀀀
􏻬Příliš žluťoučký kůň úpěl ďábelské ódy􀀀
􏻬Quizdeltagerne spiste jordbær med fløde, mens cirkusklovnen Walther spillede på xylofon􀀀
􏻬PACK MY BOX WITH FIVE DOZEN LIQUOR JUGS􀀀
@@ -104,6 +105,10 @@ How multilingual? Real multilingual!
􎳌‣ Full support for Archaic Kana/Hentaigana􀀀
􏻬серафими мн̑оꙮ҆читїи̑, ꙗ҆́кѡ ѻ҆́гнь, ѡ҆крꙋжа́хꙋ прⷭ҇то́лъ Бж҃їй, и҆ всѧ̀ землѧ̀ и҆спо́лнисѧ свѣ́та􀀀
􎳌‣ Fan of Church Slavonic? We've got you!􀀀
􏃯Supported Unicode Blocks:􀀀
Basic Latin
@@ -111,6 +116,7 @@ How multilingual? Real multilingual!
Latin Extended Additional
Latin Extended-A/B/C/D
Armenian
Arrows
Bengali􏿆ᶠⁱ􀀀
Braille Patterns
Cherokee􏿆⁷􀀀
@@ -120,8 +126,9 @@ How multilingual? Real multilingual!
Combining Diacritical Marks
Control Pictures
Currency Symbols
Cyrillic􏿆ᴭ􀀀
Cyrillic Supplement􏿆ᴭ􀀀
Cyrillic
Cyrillic Supplement
Cyrillic Extended-A/B/C
Devanagari
Enclosed Alphanumeric Supplement
General Punctuations
@@ -140,6 +147,7 @@ How multilingual? Real multilingual!
Katakana Phonetic Extensions
Kana Supplement
Kana Extended-A
Number Forms
Small Kana Extension
Letterlike Symbols
Phonetic Extensions
@@ -153,7 +161,7 @@ How multilingual? Real multilingual!
Tamil
Thai
􏿆ᴭ􀀀 No support for archæic letters  􏿆ᴱ􀀀 No support for Coptic
􏿆ᴱ􀀀 No support for Coptic
􏿆ᶠⁱ􀀀 No support for ligatures  􏿆ჼ􀀀 Mkhedruli only
􏿆⁶􀀀 􏿆⁷􀀀 􏿆⁹􀀀 􏿆¹²·¹􀀀 Up to the specified Unicode version

View File

@@ -44,7 +44,7 @@ h1 { font-size: 1.4em; margin-bottom: 4px; color: #111; }
.grid-label-right { text-align: left; }
.grid-sep { grid-column: 1 / -1; height: 3px; background: #999; margin: 2px 0; border-radius: 1px; }
.grid-dot { text-align: center; color: #ccc; font-size: 0.7em; }
.grid-spacer { height: 12px; }
.grid-spacer { height: 36px; }
/* Y toggle */
.y-toggle { margin-top: 16px; }
@@ -98,10 +98,10 @@ h1 { font-size: 1.4em; margin-bottom: 4px; color: #111; }
.pixel-inactive { font-size: 0.85em; color: #999; }
.bit-display { font-family: 'Consolas', 'Fira Code', monospace; font-size: 0.8em; color: #777; margin-top: 2px; }
/* Mask display */
/* Mask input/display */
.mask-section { margin-top: 16px; padding: 10px; background: #f8f8f8; border-radius: 6px; border: 1px solid #e0e0e0; }
.mask-section .label { font-size: 0.8em; color: #1a5fb4; margin-bottom: 4px; }
.mask-val { font-family: 'Consolas', 'Fira Code', monospace; font-size: 0.95em; color: #111; }
.mask-input-row { display: flex; gap: 8px; align-items: center; }
/* Examples */
.examples-section { margin-top: 20px; }
@@ -125,14 +125,14 @@ h1 { font-size: 1.4em; margin-bottom: 4px; color: #111; }
<body>
<h1>Keming Machine Tag Calculator</h1>
<p class="subtitle">Calculate pixel colour values for the three Keming Machine tag pixels (K at Y+6, Y+7, Y+8)</p>
<p class="subtitle">Calculate pixel colour values for the three Keming Machine tag pixels (K at Y+5, Y+6, Y+7)</p>
<div class="main">
<div class="col-left">
<!-- Pixel 1: Lowheight -->
<div class="panel lowheight-section">
<h2>Pixel 1 &mdash; Low Height (Y+6)</h2>
<h2>Pixel 1 &mdash; Low Height (Y+5)</h2>
<div class="lowheight-row">
<button class="lowheight-btn" id="lowheightBtn" onclick="toggleLowheight()">Low Height: OFF</button>
</div>
@@ -144,7 +144,7 @@ h1 { font-size: 1.4em; margin-bottom: 4px; color: #111; }
<!-- Pixel 2: Shape Grid -->
<div class="panel shape-section">
<h2>Pixel 2 &mdash; Glyph Shape (Y+7)</h2>
<h2>Pixel 2 &mdash; Glyph Shape (Y+6)</h2>
<p style="font-size:0.8em; color:#666; margin-bottom:12px;">Click zones to mark which parts of the glyph are occupied.</p>
<div class="grid-wrapper">
@@ -183,9 +183,6 @@ h1 { font-size: 1.4em; margin-bottom: 4px; color: #111; }
<!-- Baseline separator -->
<div class="grid-sep"></div>
<!-- Spacer row -->
<span></span><span class="grid-spacer"></span><span></span><span class="grid-spacer"></span><span></span>
<!-- Row: J K (below baseline) -->
<span class="grid-label grid-label-left">desc</span>
<button class="zone-btn" data-zone="J" onclick="toggleZone(this)">J</button>
@@ -210,17 +207,22 @@ h1 { font-size: 1.4em; margin-bottom: 4px; color: #111; }
</p>
</div>
<!-- Kerning mask output -->
<!-- Kerning mask input/output -->
<div class="mask-section">
<div class="label">Kerning Mask (24-bit, used by rules)</div>
<div id="maskVal" class="mask-val">0x0000FF</div>
<div id="maskBin" class="bit-display">00000000 00000000 11111111</div>
<div class="label">Kerning Mask (24-bit hex colour)</div>
<div class="mask-input-row">
<input type="text" class="cp-input" id="maskInput" placeholder="e.g. #800AFF, 0x800AFF" oninput="updateFromMask()" value="">
</div>
<p class="cp-formats" style="margin-top:4px;">
Accepts: <code>#RRGGBB</code>, <code>0xRRGGBB</code>, or <code>RRGGBB</code>
</p>
<div id="maskBin" class="bit-display" style="margin-top:6px;">00000000 00000000 00000000</div>
</div>
</div>
<!-- Pixel 3: Dot Removal -->
<div class="panel cp-section">
<h2>Pixel 3 &mdash; Dot Removal (Y+8)</h2>
<h2>Pixel 3 &mdash; Dot Removal (Y+7)</h2>
<p style="font-size:0.8em; color:#666; margin-bottom:12px;">Replacement character for diacritics dot removal. All 24 bits encode the codepoint.</p>
<div class="cp-input-row">
@@ -238,7 +240,7 @@ h1 { font-size: 1.4em; margin-bottom: 4px; color: #111; }
<div class="panel output-section">
<h2>Pixel Colour Values</h2>
<div class="pixel-label">Pixel 1: Low Height (Y+6)</div>
<div class="pixel-label">Pixel 1: Low Height (Y+5)</div>
<div class="pixel-row">
<canvas id="swatch1" class="colour-swatch" width="48" height="48"></canvas>
<div class="pixel-info">
@@ -247,7 +249,7 @@ h1 { font-size: 1.4em; margin-bottom: 4px; color: #111; }
</div>
</div>
<div class="pixel-label" style="margin-top:12px;">Pixel 2: Glyph Shape (Y+7)</div>
<div class="pixel-label" style="margin-top:12px;">Pixel 2: Glyph Shape (Y+6)</div>
<div class="pixel-row">
<canvas id="swatch2" class="colour-swatch" width="48" height="48"></canvas>
<div class="pixel-info">
@@ -261,7 +263,7 @@ h1 { font-size: 1.4em; margin-bottom: 4px; color: #111; }
</div>
</div>
<div class="pixel-label" style="margin-top:12px;">Pixel 3: Dot Removal (Y+8)</div>
<div class="pixel-label" style="margin-top:12px;">Pixel 3: Dot Removal (Y+7)</div>
<div class="pixel-row">
<canvas id="swatch3" class="colour-swatch" width="48" height="48"></canvas>
<div class="pixel-info">
@@ -293,6 +295,7 @@ h1 { font-size: 1.4em; margin-bottom: 4px; color: #111; }
const ZONES = ['A','B','C','D','E','F','G','H','J','K'];
const state = { A:0, B:0, C:0, D:0, E:0, F:0, G:0, H:0, J:0, K:0 };
let isLowheight = false;
let maskInputActive = false; // prevent recalc from overwriting mask input while user types
// Bit positions within kerning_mask (24-bit RGB):
// Blue byte: A=bit7, B=bit6, C=bit5, D=bit4, E=bit3, F=bit2, G=bit1, H=bit0
@@ -363,12 +366,71 @@ function recalc() {
document.getElementById('b2').textContent = b;
document.getElementById('bits2').textContent = bin8(r) + ' ' + bin8(g) + ' ' + bin8(b);
document.getElementById('maskVal').textContent = '0x' + fullMask.toString(16).toUpperCase().padStart(6, '0');
// Update mask input only if the change didn't come from the mask input itself
if (!maskInputActive) {
document.getElementById('maskInput').value = '#' + hex2(r) + hex2(g) + hex2(b);
document.getElementById('maskInput').classList.remove('error');
}
document.getElementById('maskBin').textContent = bin8((fullMask >> 16) & 0xFF) + ' ' + bin8((fullMask >> 8) & 0xFF) + ' ' + bin8(fullMask & 0xFF);
drawSwatchSolid('swatch2', r, g, b);
}
function updateFromMask() {
const input = document.getElementById('maskInput');
const raw = input.value.trim();
if (raw === '') {
input.classList.remove('error');
return;
}
// Parse hex colour: #RRGGBB, 0xRRGGBB, or RRGGBB
let hex = null;
if (/^#([0-9A-Fa-f]{6})$/.test(raw)) {
hex = parseInt(RegExp.$1, 16);
} else if (/^0[xX]([0-9A-Fa-f]{6})$/.test(raw)) {
hex = parseInt(RegExp.$1, 16);
} else if (/^([0-9A-Fa-f]{6})$/.test(raw)) {
hex = parseInt(RegExp.$1, 16);
}
if (hex === null) {
input.classList.add('error');
return;
}
input.classList.remove('error');
const r = (hex >> 16) & 0xFF;
const g = (hex >> 8) & 0xFF;
const b = hex & 0xFF;
// Reverse-map to zone states
state.A = (b & 0x80) ? 1 : 0;
state.B = (b & 0x40) ? 1 : 0;
state.C = (b & 0x20) ? 1 : 0;
state.D = (b & 0x10) ? 1 : 0;
state.E = (b & 0x08) ? 1 : 0;
state.F = (b & 0x04) ? 1 : 0;
state.G = (b & 0x02) ? 1 : 0;
state.H = (b & 0x01) ? 1 : 0;
state.J = (g & 0x80) ? 1 : 0;
state.K = (g & 0x40) ? 1 : 0;
// Reverse-map Y toggle
document.getElementById('yToggle').checked = !!(r & 0x80);
// Update all zone buttons
document.querySelectorAll('.zone-btn').forEach(btn => {
btn.classList.toggle('active', !!state[btn.dataset.zone]);
});
// Recalc without overwriting the mask input
maskInputActive = true;
recalc();
maskInputActive = false;
}
function updateCodepoint() {
const input = document.getElementById('cpInput');
const raw = input.value.trim();
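
The reverse mapping done by `updateFromMask()` can be sketched outside the browser. This is a Python illustration of the same bit layout, not part of the calculator; `parse_mask` is a hypothetical name, and the accepted input forms mirror the three listed in the page (`#RRGGBB`, `0xRRGGBB`, `RRGGBB`).

```python
import re


def parse_mask(text):
    """Decode a 24-bit kerning mask string into zone flags and the Y toggle.

    Returns None when the text is not a six-digit hex colour.
    """
    m = re.fullmatch(r"(?:#|0[xX])?([0-9A-Fa-f]{6})", text.strip())
    if m is None:
        return None
    value = int(m.group(1), 16)
    r, g, b = (value >> 16) & 0xFF, (value >> 8) & 0xFF, value & 0xFF
    # Blue byte: A=bit7 down to H=bit0.
    zones = {
        name: bool(b & bit)
        for name, bit in zip("ABCDEFGH", (0x80, 0x40, 0x20, 0x10, 0x08, 0x04, 0x02, 0x01))
    }
    # Green byte: J=bit7, K=bit6.
    zones["J"] = bool(g & 0x80)
    zones["K"] = bool(g & 0x40)
    # Red byte: bit7 is the Y toggle.
    return {"zones": zones, "y_type": bool(r & 0x80)}


decoded = parse_mask("#800AFF")
print(decoded)
```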

View File

@@ -1,7 +1,7 @@
--- Pixel 0
- Lowheight bit
- encoding: has pixel - it's low height
- used by the diacritics system to quickly look up if the character is low height without parsing the Pixel 1
- bit must be set if above-diacritics should be lowered (e.g. lowercase b, which has 'A' shape bit but considered lowheight)
### Legends
#
@@ -106,3 +106,5 @@ dot removal for diacritics:
- encoding:
- <MSB> RRRRRRRR GGGGGGGG BBBBBBBB <LSB>
--- Pixel 3
Unused for now.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

View File

@@ -1,9 +1,13 @@
package net.torvald.terrarumsansbitmap
import net.torvald.terrarumsansbitmap.gdx.CodePoint
/**
* Created by minjaesong on 2021-11-25.
*/
data class DiacriticsAnchor(val type: Int, val x: Int, val y: Int, val xUsed: Boolean, val yUsed: Boolean)
data class DiacriticsAnchor(val type: Int, val x: Int, val y: Int) {
val isZero = (x == 0 && y == 0)
}
/**
* Created by minjaesong on 2018-08-07.
*/
@@ -15,7 +19,7 @@ data class GlyphProps(
val nudgeX: Int = 0,
val nudgeY: Int = 0,
val diacriticsAnchors: Array<DiacriticsAnchor> = Array(6) { DiacriticsAnchor(it, 0, 0, false, false) },
val diacriticsAnchors: Array<DiacriticsAnchor> = Array(6) { DiacriticsAnchor(it, 0, 0) },
val alignWhere: Int = 0, // ALIGN_LEFT..ALIGN_BEFORE
@@ -29,6 +33,8 @@ data class GlyphProps(
val isKernYtype: Boolean = false,
val kerningMask: Int = 255,
val dotRemoval: CodePoint? = null,
val directiveOpcode: Int = 0, // 8-bits wide
val directiveArg1: Int = 0, // 8-bits wide
val directiveArg2: Int = 0, // 8-bits wide
@@ -95,10 +101,6 @@ data class GlyphProps(
diacriticsAnchors.forEach {
hash = hash xor it.type
hash = hash * 16777619
hash = hash xor (it.x or (if (it.xUsed) 128 else 0))
hash = hash * 16777619
hash = hash xor (it.y or (if (it.yUsed) 128 else 0))
hash = hash * 16777619
}
hash = hash xor tags

View File

@@ -888,6 +888,9 @@ class TerrarumSansBitmap(
SHEET_HENTAIGANA_VARW -> hentaiganaIndexY(ch)
SHEET_CONTROL_PICTURES_VARW -> controlPicturesIndexY(ch)
SHEET_LEGACY_COMPUTING_VARW -> legacyComputingIndexY(ch)
SHEET_CYRILIC_EXTB_VARW -> cyrilicExtBIndexY(ch)
SHEET_CYRILIC_EXTA_VARW -> cyrilicExtAIndexY(ch)
SHEET_CYRILIC_EXTC_VARW -> cyrilicExtCIndexY(ch)
else -> ch / 16
}
@@ -918,9 +921,9 @@ class TerrarumSansBitmap(
val isLowHeight = (pixmap.getPixel(codeStartX, codeStartY + 5).and(255) != 0)
// Keming machine parameters
val kerningBit1 = pixmap.getPixel(codeStartX, codeStartY + 6).tagify()
val kerningBit2 = pixmap.getPixel(codeStartX, codeStartY + 7).tagify()
val kerningBit3 = pixmap.getPixel(codeStartX, codeStartY + 8).tagify()
val kerningBit1 = pixmap.getPixel(codeStartX, codeStartY + 6).tagify() // glyph shape
val kerningBit2 = pixmap.getPixel(codeStartX, codeStartY + 7).tagify() // dot removal
val kerningBit3 = pixmap.getPixel(codeStartX, codeStartY + 8).tagify() // unused
var isKernYtype = ((kerningBit1 and 0x80000000.toInt()) != 0)
var kerningMask = kerningBit1.ushr(8).and(0xFFFFFF)
val hasKernData = kerningBit1 and 255 != 0//(kerningBit1 and 255 != 0 && kerningMask != 0xFFFF)
@@ -944,12 +947,12 @@ class TerrarumSansBitmap(
val shift = (3 - (it % 3)) * 8
val yPixel = pixmap.getPixel(codeStartX, codeStartY + yPos).tagify()
val xPixel = pixmap.getPixel(codeStartX, codeStartY + yPos + 1).tagify()
val yUsed = (yPixel ushr shift) and 128 != 0
val xUsed = (xPixel ushr shift) and 128 != 0
val y = if (yUsed) (yPixel ushr shift) and 127 else 0
val x = if (xUsed) (xPixel ushr shift) and 127 else 0
val ySgn = ((yPixel ushr shift) and 128).let { if (it == 0) -1 else 1 }
val xSgn = ((xPixel ushr shift) and 128).let { if (it == 0) -1 else 1 }
val y = ((yPixel ushr shift) and 127) * ySgn
val x = ((xPixel ushr shift) and 127) * xSgn
DiacriticsAnchor(it, x, y, xUsed, yUsed)
DiacriticsAnchor(it, x, y)
}.toTypedArray()
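
The new anchor encoding reads each byte as sign plus magnitude: bit 7 set means positive, bit 7 clear means negative, and the low 7 bits are the magnitude. A minimal sketch of that decoding, where `decode_anchor_component` is a hypothetical name:

```python
def decode_anchor_component(byte):
    """Decode one anchor byte: bit 7 is the sign (set = positive), bits 0-6 the magnitude."""
    sign = 1 if (byte & 0x80) else -1
    return (byte & 0x7F) * sign


positive_five = decode_anchor_component(0x85)
negative_five = decode_anchor_component(0x05)
unset = decode_anchor_component(0x00)
print(positive_five, negative_five, unset)
```

A byte of `0x00` decodes to 0, so an untouched anchor pixel yields `x == 0 && y == 0`, which is the case the new `isZero` property reports and the positioning code treats as "fall back to the glyph width".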
val alignWhere = (0..1).fold(0) { acc, y -> acc or ((pixmap.getPixel(codeStartX, codeStartY + y + 15).and(255) != 0).toInt() shl y) }
@@ -968,7 +971,9 @@ class TerrarumSansBitmap(
GlyphProps.STACK_DONT
else (0..1).fold(0) { acc, y -> acc or ((pixmap.getPixel(codeStartX, codeStartY + y + 18).and(255) != 0).toInt() shl y) }
glyphProps[code] = GlyphProps(width, isLowHeight, nudgeX, nudgeY, diacriticsAnchors, alignWhere, writeOnTop, stackWhere, IntArray(15), hasKernData, isKernYtype, kerningMask, directiveOpcode, directiveArg1, directiveArg2)
val dotRemoval = if (kerningBit2 == 0) null else kerningBit2.ushr(8)
glyphProps[code] = GlyphProps(width, isLowHeight, nudgeX, nudgeY, diacriticsAnchors, alignWhere, writeOnTop, stackWhere, IntArray(15), hasKernData, isKernYtype, kerningMask, dotRemoval, directiveOpcode, directiveArg1, directiveArg2)
// extra info
val extCount = glyphProps[code]?.requiredExtInfoCount() ?: 0
@@ -1015,17 +1020,6 @@ class TerrarumSansBitmap(
(0xD800..0xDFFF).forEach { glyphProps[it] = GlyphProps(0) }
(0x100000..0x10FFFF).forEach { glyphProps[it] = GlyphProps(0) }
(0xFFFA0..0xFFFFF).forEach { glyphProps[it] = GlyphProps(0) }
// manually add width of one orphan insular letter
// WARNING: glyphs in 0xA770..0xA778 has invalid data, further care is required
glyphProps[0x1D79] = GlyphProps(9)
// U+007F is DEL originally, but this font stores bitmap of Replacement Character (U+FFFD)
// to this position. String replacer will replace U+FFFD into U+007F.
glyphProps[0x7F] = GlyphProps(15)
}
private fun Int.halveWidth() = this / 2 + 1
@@ -1120,6 +1114,7 @@ class TerrarumSansBitmap(
var nonDiacriticCounter = 0 // index of last instance of non-diacritic char
var stackUpwardCounter = 0 // TODO separate stack counter for centre- and right aligned
var stackDownwardCounter = 0
var nudgeUpHighWater = 0 // tracks max nudgeY seen in current stack, so subsequent marks with lower nudge still clear the previous one
val HALF_VAR_INIT = W_VAR_INIT.minus(1).div(2)
@@ -1198,6 +1193,7 @@ class TerrarumSansBitmap(
stackUpwardCounter = 0
stackDownwardCounter = 0
nudgeUpHighWater = 0
}
// FIXME HACK: using 0th diacritics' X-anchor pos as a type selector
/*else if (thisProp.writeOnTop && thisProp.diacriticsAnchors[0].x == GlyphProps.DIA_JOINER) {
@@ -1217,30 +1213,32 @@ class TerrarumSansBitmap(
// set X pos according to alignment information
posXbuffer[charIndex] = -thisProp.nudgeX +
when (thisProp.alignWhere) {
GlyphProps.ALIGN_LEFT, GlyphProps.ALIGN_BEFORE -> posXbuffer[nonDiacriticCounter]
GlyphProps.ALIGN_LEFT, GlyphProps.ALIGN_BEFORE -> {
val anchorPointX = if (itsProp.diacriticsAnchors[diacriticsType].isZero) itsProp.width else itsProp.diacriticsAnchors[diacriticsType].x
posXbuffer[nonDiacriticCounter] + anchorPointX
}
GlyphProps.ALIGN_RIGHT -> {
// println("thisprop alignright $kerning, $extraWidth")
val anchorPoint =
if (!itsProp.diacriticsAnchors[diacriticsType].xUsed) itsProp.width else itsProp.diacriticsAnchors[diacriticsType].x
val anchorPointX = if (itsProp.diacriticsAnchors[diacriticsType].isZero) itsProp.width else itsProp.diacriticsAnchors[diacriticsType].x
extraWidth += thisProp.width
posXbuffer[nonDiacriticCounter] + anchorPoint - W_VAR_INIT + kerning + extraWidth
posXbuffer[nonDiacriticCounter] + anchorPointX - W_VAR_INIT + kerning + extraWidth
}
GlyphProps.ALIGN_CENTRE -> {
val anchorPoint =
if (!itsProp.diacriticsAnchors[diacriticsType].xUsed) itsProp.width.div(2) else itsProp.diacriticsAnchors[diacriticsType].x
val anchorPointX = if (itsProp.diacriticsAnchors[diacriticsType].isZero) itsProp.width.div(2) else itsProp.diacriticsAnchors[diacriticsType].x
if (itsProp.alignWhere == GlyphProps.ALIGN_RIGHT) {
if (thisChar in 0x900..0x902)
posXbuffer[nonDiacriticCounter] + anchorPoint + (itsProp.width - 1).div(2)
posXbuffer[nonDiacriticCounter] + anchorPointX + (itsProp.width - 1).div(2)
else
posXbuffer[nonDiacriticCounter] + anchorPoint + (itsProp.width + 1).div(2)
posXbuffer[nonDiacriticCounter] + anchorPointX + (itsProp.width + 1).div(2)
} else {
if (thisChar in 0x900..0x902)
posXbuffer[nonDiacriticCounter] + anchorPoint - (W_VAR_INIT + 1) / 2
posXbuffer[nonDiacriticCounter] + anchorPointX - (W_VAR_INIT + 1) / 2
else
posXbuffer[nonDiacriticCounter] + anchorPoint - HALF_VAR_INIT
posXbuffer[nonDiacriticCounter] + anchorPointX - HALF_VAR_INIT
}
}
else -> throw InternalError("Unsupported alignment: ${thisProp.alignWhere}")
@@ -1277,14 +1275,14 @@ class TerrarumSansBitmap(
// set Y pos according to diacritics position
when (thisProp.stackWhere) {
GlyphProps.STACK_DOWN -> {
posYbuffer[charIndex] = (-thisProp.nudgeY + H_DIACRITICS * stackDownwardCounter) * flipY.toSign()
posYbuffer[charIndex] = (-thisProp.nudgeY + H_DIACRITICS * stackDownwardCounter) * flipY.toSign() - thisProp.nudgeY
stackDownwardCounter++
}
GlyphProps.STACK_UP -> {
posYbuffer[charIndex] = -thisProp.nudgeY + (-H_DIACRITICS * stackUpwardCounter + -thisProp.nudgeY) * flipY.toSign()
val effectiveNudge = maxOf(thisProp.nudgeY, nudgeUpHighWater)
posYbuffer[charIndex] = -effectiveNudge + (-H_DIACRITICS * stackUpwardCounter + -effectiveNudge) * flipY.toSign() + effectiveNudge
// shift down on lowercase if applicable
if (getSheetType(thisChar) in autoShiftDownOnLowercase &&
lastNonDiacriticChar.isLowHeight()) {
if (lastNonDiacriticChar.isLowHeight()) {
//dbgprn("AAARRRRHHHH for character ${thisChar.toHex()}")
//dbgprn("lastNonDiacriticChar: ${lastNonDiacriticChar.toHex()}")
//dbgprn("cond: ${thisProp.alignXPos == GlyphProps.DIA_OVERLAY}, charIndex: $charIndex")
@@ -1294,6 +1292,7 @@ class TerrarumSansBitmap(
posYbuffer[charIndex] += H_STACKUP_LOWERCASE_SHIFTDOWN * flipY.toSign() // if minus-assign doesn't work, try plus-assign
}
nudgeUpHighWater = effectiveNudge
stackUpwardCounter++
// dbgprn("lastNonDiacriticChar: ${lastNonDiacriticChar.charInfo()}; stack counter: $stackUpwardCounter")
@@ -1302,22 +1301,25 @@ class TerrarumSansBitmap(
posYbuffer[charIndex] = (-thisProp.nudgeY + H_DIACRITICS * stackDownwardCounter) * flipY.toSign()
stackDownwardCounter++
posYbuffer[charIndex] = (-thisProp.nudgeY + -H_DIACRITICS * stackUpwardCounter) * flipY.toSign()
val effectiveNudge = maxOf(thisProp.nudgeY, nudgeUpHighWater)
posYbuffer[charIndex] = (-effectiveNudge + -H_DIACRITICS * stackUpwardCounter) * flipY.toSign()
// shift down on lowercase if applicable
if (getSheetType(thisChar) in autoShiftDownOnLowercase &&
lastNonDiacriticChar.isLowHeight()) {
if (lastNonDiacriticChar.isLowHeight()) {
if (diacriticsType == GlyphProps.DIA_OVERLAY)
posYbuffer[charIndex] += H_OVERLAY_LOWERCASE_SHIFTDOWN * flipY.toSign() // if minus-assign doesn't work, try plus-assign
else
posYbuffer[charIndex] += H_STACKUP_LOWERCASE_SHIFTDOWN * flipY.toSign() // if minus-assign doesn't work, try plus-assign
}
nudgeUpHighWater = effectiveNudge
stackUpwardCounter++
}
// for BEFORE_N_AFTER, do nothing in here
}
// nudge Y pos according to anchor position
posYbuffer[charIndex] -= itsProp.diacriticsAnchors[diacriticsType].y
// Don't reset extraWidth here!
}
}
@@ -1327,7 +1329,7 @@ class TerrarumSansBitmap(
if (str.isNotEmpty()) {
val lastCharProp = glyphProps[str.last()]
val penultCharProp = glyphProps[str[nonDiacriticCounter]] ?:
(if (errorOnUnknownChar) throw throw InternalError("No GlyphProps for char '${str[nonDiacriticCounter]}' " +
(if (errorOnUnknownChar) throw InternalError("No GlyphProps for char '${str[nonDiacriticCounter]}' " +
"(${str[nonDiacriticCounter].charInfo()})") else nullProp)
posXbuffer[posXbuffer.lastIndex] = posXbuffer[posXbuffer.lastIndex - 1] + // DON'T add 1 to house the shadow, it totally breaks stuffs
if (lastCharProp != null && lastCharProp.writeOnTop >= 0) {
@@ -1523,8 +1525,8 @@ class TerrarumSansBitmap(
}
// for lowercase i and j, if cNext is a diacritic that goes on top, remove the dots
else if (diacriticDotRemoval.containsKey(c) && (glyphProps[cNext]?.writeOnTop ?: -1) >= 0 && glyphProps[cNext]?.stackWhere == GlyphProps.STACK_UP) {
seq.add(diacriticDotRemoval[c]!!)
else if (glyphProps[c]!!.dotRemoval != null && (glyphProps[cNext]?.writeOnTop ?: -1) >= 0 && glyphProps[cNext]?.stackWhere == GlyphProps.STACK_UP) {
seq.add(glyphProps[c]!!.dotRemoval!!)
}
// BEGIN of tamil subsystem implementation
@@ -2604,6 +2606,9 @@ class TerrarumSansBitmap(
internal const val SHEET_HENTAIGANA_VARW = 39
internal const val SHEET_CONTROL_PICTURES_VARW = 40
internal const val SHEET_LEGACY_COMPUTING_VARW = 41
internal const val SHEET_CYRILIC_EXTB_VARW = 42
internal const val SHEET_CYRILIC_EXTA_VARW = 43
internal const val SHEET_CYRILIC_EXTC_VARW = 44
internal const val SHEET_UNKNOWN = 254
@@ -2625,10 +2630,6 @@ class TerrarumSansBitmap(
const val MOVABLE_BLOCK_1 = 0xFFFF0
private val autoShiftDownOnLowercase = arrayOf(
SHEET_DIACRITICAL_MARKS_VARW
)
private val fileList = arrayOf( // MUST BE MATCHING WITH SHEET INDICES!!
"ascii_variable.tga",
"hangul_johab.tga",
@@ -2672,6 +2673,9 @@ class TerrarumSansBitmap(
"hentaigana_variable.tga",
"control_pictures_variable.tga",
"symbols_for_legacy_computing_variable.tga",
"cyrilic_extB_variable.tga",
"cyrilic_extA_variable.tga",
"cyrilic_extC_variable.tga",
)
internal val codeRange = arrayOf( // MUST BE MATCHING WITH SHEET INDICES!!
0..0xFF, // SHEET_ASCII_VARW
@@ -2704,7 +2708,7 @@ class TerrarumSansBitmap(
0xA720..0xA7FF, // SHEET_EXTD_VARW
0x20A0..0x20CF, // SHEET_CURRENCIES_VARW
0xFFE00..0xFFF9F, // SHEET_INTERNAL_VARW
0x2100..0x214F, // SHEET_LETTERLIKE_MATHS_VARW
0x2100..0x21FF, // SHEET_LETTERLIKE_MATHS_VARW
0x1F100..0x1F1FF, // SHEET_ENCLOSED_ALPHNUM_SUPL_VARW
(0x0B80..0x0BFF) + (0xF00C0..0xF00FF), // SHEET_TAMIL_VARW
0x980..0x9FF, // SHEET_BENGALI_VARW
@@ -2716,6 +2720,9 @@ class TerrarumSansBitmap(
0x1B000..0x1B16F, // SHEET_HENTAIGANA_VARW
0x2400..0x243F, // SHEET_CONTROL_PICTURES_VARW
0x1FB00..0x1FBFF, // SHEET_LEGACY_COMPUTING_VARW
0xA640..0xA69F, // SHEET_CYRILIC_EXTB_VARW
0x2DE0..0x2DFF, // SHEET_CYRILIC_EXTA_VARW
0x1C80..0x1C8F, // SHEET_CYRILIC_EXTC_VARW
)
private val codeRangeHangulCompat = 0x3130..0x318F
@@ -2733,11 +2740,6 @@ class TerrarumSansBitmap(
0x20..0x7F,
)
private val diacriticDotRemoval = hashMapOf(
'i'.toInt() to 0x131,
'j'.toInt() to 0x237
)
internal fun Int.charInfo() = "U+${this.toString(16).padStart(4, '0').toUpperCase()}: ${Character.getName(this)}"
const val NQSP = 0x2000
@@ -3066,6 +3068,9 @@ class TerrarumSansBitmap(
private fun hentaiganaIndexY(c: CodePoint) = (c - 0x1B000) / 16
private fun controlPicturesIndexY(c: CodePoint) = (c - 0x2400) / 16
private fun legacyComputingIndexY(c: CodePoint) = (c - 0x1FB00) / 16
private fun cyrilicExtBIndexY(c: CodePoint) = (c - 0xA640) / 16
private fun cyrilicExtAIndexY(c: CodePoint) = (c - 0x2DE0) / 16
private fun cyrilicExtCIndexY(c: CodePoint) = (c - 0x1C80) / 16
val charsetOverrideDefault = Character.toChars(CHARSET_OVERRIDE_DEFAULT).toSurrogatedString()
val charsetOverrideBulgarian = Character.toChars(CHARSET_OVERRIDE_BG_BG).toSurrogatedString()
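
Review note on the STACK_UP hunks above: the new `nudgeUpHighWater` variable appears to carry the largest upward nudge seen so far to later marks in the same stack, so a mark with a small `nudgeY` is not placed lower than the nudged mark beneath it. The following is a minimal stand-alone sketch of that idea, not the project's code. It assumes no vertical flip (`flipY.toSign()` equal to 1) and a high-water mark that starts at 0 for each base character. The function name `stackUpOffsets` and the parameter `hDiacritics` are hypothetical.

```kotlin
// Hypothetical sketch of high-water-mark stacking; names are illustrative only.
// nudges: the nudgeY of each stacked-up diacritic, in stacking order.
// hDiacritics: vertical step between stacked marks (stands in for H_DIACRITICS).
fun stackUpOffsets(nudges: List<Int>, hDiacritics: Int): List<Int> {
    var highWater = 0 // assumption: reset to 0 for every base character
    return nudges.mapIndexed { stackIndex, nudgeY ->
        // A mark is nudged at least as far as any mark below it.
        val effectiveNudge = maxOf(nudgeY, highWater)
        highWater = effectiveNudge
        // Negative Y means "further up" in this sketch.
        -effectiveNudge - hDiacritics * stackIndex
    }
}

fun main() {
    // The first mark is nudged by 3; the later marks inherit that nudge.
    println(stackUpOffsets(listOf(3, 0, 1), hDiacritics = 4)) // prints [-3, -7, -11]
}
```

Without the high-water mark, the second and third offsets in this example would be -4 and -9, which leaves the second mark only one pixel above the first.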

@@ -229,16 +229,16 @@ class TerrarumTypewriterBitmap(
val nudgeY = nudgingBits.ushr(16).toByte().toInt() // signed 8-bit int
val diacriticsAnchors = (0..5).map {
val yPos = 11 + (it / 3) * 2
val yPos = 13 - (it / 3) * 2
val shift = (3 - (it % 3)) * 8
val yPixel = pixmap.getPixel(codeStartX, codeStartY + yPos).tagify()
val xPixel = pixmap.getPixel(codeStartX, codeStartY + yPos + 1).tagify()
val y = (yPixel ushr shift) and 127
val x = (xPixel ushr shift) and 127
val yUsed = (yPixel ushr shift) >= 128
val xUsed = (yPixel ushr shift) >= 128
val ySgn = ((yPixel ushr shift) and 128).let { if (it == 0) -1 else 1 }
val xSgn = ((xPixel ushr shift) and 128).let { if (it == 0) -1 else 1 }
val y = ((yPixel ushr shift) and 127) * ySgn
val x = ((xPixel ushr shift) and 127) * xSgn
DiacriticsAnchor(it, x, y, xUsed, yUsed)
DiacriticsAnchor(it, x, y)
}.toTypedArray()
val alignWhere = (0..1).fold(0) { acc, y -> acc or ((pixmap.getPixel(codeStartX, codeStartY + y + 15).and(255) != 0).toInt() shl y) }
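
Review note on the anchor hunk above: the new lines read each anchor component as a sign-and-magnitude byte. Bit 7 selects the sign (set means positive, clear means negative) and the low 7 bits hold the magnitude. The old lines instead treated bit 7 as a "used" flag. The sketch below isolates the new decoding so it can be checked by hand. `decodeAnchorComponent` is a hypothetical helper name, and the packed pixel value is made up for illustration.

```kotlin
// Hypothetical helper mirroring the sign/magnitude decoding in the hunk above.
// packed: a pixel value holding one byte per anchor; shift: bit offset of the byte.
fun decodeAnchorComponent(packed: Int, shift: Int): Int {
    val byte = (packed ushr shift) and 0xFF
    val sign = if ((byte and 0x80) == 0) -1 else 1 // bit 7 set means positive
    return (byte and 0x7F) * sign                  // low 7 bits are the magnitude
}

fun main() {
    val packed = 0x00850300 // made-up value: bytes 0x85 and 0x03 at shifts 16 and 8
    println(decodeAnchorComponent(packed, 16)) // prints 5
    println(decodeAnchorComponent(packed, 8))  // prints -3
}
```

A byte of 0x00 decodes to 0 under this scheme, which is consistent with the `isZero` checks that replace `xUsed` in the TerrarumSansBitmap hunks.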

Binary file not shown.
