C-Former: Theory-Predicted Phase Transition via Hodge Decomposition on TD6 Tiles
A mathematical theory diagnosed its own architecture's flaws, prescribed the fixes, and the fixes produced a 64-percentage-point capability jump.
[Figure: interactive TD6 tile, the fundamental compute unit with three Hodge-decomposed channels]
The initial C-Former (v1) contained two provable mathematical flaws that CT theory itself diagnosed: (1) edge currents, computed as pairwise node differences, lie entirely in the gradient subspace, so the cycle component vanishes identically and the cycle channel (B_cx) was dead by construction; (2) all three channels received identical inputs, violating axiom B4. Together, these flaws reduced the architecture to d=2.
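Flaw (1) can be checked numerically. This is a minimal sketch on a hypothetical 4-node toy graph, not the real TD6 incidence structure: any edge current built from pairwise node differences lies in the gradient subspace, so its projection onto the cycle subspace is numerically zero.

```python
import numpy as np

# Toy graph (hypothetical, NOT the TD6 tile): 4 nodes, 5 edges,
# so the cycle subspace is 2-dimensional (m - n + 1 = 5 - 4 + 1 = 2).
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
n, m = 4, len(edges)

# Node-edge incidence matrix: B[u, e] = -1, B[v, e] = +1 for e = (u, v).
B = np.zeros((n, m))
for e, (u, v) in enumerate(edges):
    B[u, e], B[v, e] = -1.0, 1.0

# v1-style edge currents: pairwise node differences, f_e = x_v - x_u,
# i.e. f = B^T x, which lies in the gradient subspace im(B^T) by definition.
x = np.random.default_rng(0).normal(size=n)
f = B.T @ x

# Hodge split of edge space: im(B^T) (gradient) + ker(B) (cycle).
# Rows of Vt past the rank of B form an orthonormal basis of ker(B).
_, s, Vt = np.linalg.svd(B)
cycle_basis = Vt[int((s > 1e-10).sum()):].T   # shape (m, 2)

# Cycle component of a pure-difference current is identically zero:
cycle_component = cycle_basis.T @ f
print(np.abs(cycle_component).max())          # numerically zero
```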
CT prescribed three fixes: inject cycle basis vectors, route channels through independent projections, and chain tiles for long sequences. With all fixes applied, the architecture transitions to d=3, the unique optimal dimensionality. On ListOps, accuracy jumps from 17.4% to 81.3%, beating the standard transformer (77.9%) with 40% fewer parameters. On Pathfinder: 99.97%.
This is not an incremental improvement. It is a qualitative capability jump that emerges from correcting the Hodge decomposition's implementation — exactly as CT's d=3 theorem predicts.
The d=2 to d=3 Phase Transition
| MODEL | LISTOPS | PATHFINDER | PARAMS |
|---|---|---|---|
| C-Former v1 (d=2, broken) | 17.4% | 78.7% | 9.5M |
| Standard Transformer | 77.9% | ~71% | 3.8M |
| C-Former v3 (d=3, fixed) | 81.3% | 99.97% | 2.2M |
The convergence trajectory reveals the phase transition most clearly:
| EPOCH | C-FORMER v3 | STANDARD | C-FORMER v1 |
|---|---|---|---|
| 1 | 59.50% | 13.90% | 10.40% |
| 10 | 73.80% | 60.40% | ~40% |
| 30 | 80.30% | 75.40% | ~65% |
| 50 | 81.30% | 77.90% | 67.85% |
At epoch 1, C-Former v3 is at 59.5% while the standard transformer is at 13.9% and v1 is at 10.4%. The multi-tile architecture with cycle injection learns ListOps hierarchical structure in a single epoch; the standard transformer needs roughly ten epochs to reach comparable accuracy. This is not a training trick. It is the Hodge inductive bias providing structural understanding from the first pass.
Three Fixes Prescribed by CT Theory
Fix 1 (cycle injection). In v1, edge currents are purely gradient. The fix injects the 12 fundamental cycle basis vectors via learned coefficients. Verified: the basis lies in the cycle subspace to numerical precision (max deviation 5.96e-08), is orthonormal, and has full rank. B_cx is now alive.
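The injection and the accompanying checks can be sketched with NumPy on the same kind of hypothetical toy graph (the real 12-cycle TD6 basis is not reproduced here): take an orthonormal cycle basis from the null space of the incidence operator, verify it, and inject stand-in coefficients in place of the learned ones.

```python
import numpy as np

# Hypothetical toy graph, not the real TD6 tile with its 12 fundamental cycles.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
n, m = 4, len(edges)
B = np.zeros((n, m))
for e, (u, v) in enumerate(edges):
    B[u, e], B[v, e] = -1.0, 1.0

# Orthonormal cycle basis C = null space of the incidence operator B.
_, s, Vt = np.linalg.svd(B)
C = Vt[int((s > 1e-10).sum()):].T                # shape (m, 2)

# The kind of verification reported in the text, on this toy basis:
assert np.abs(B @ C).max() < 1e-8                # every column is a cycle
assert np.allclose(C.T @ C, np.eye(C.shape[1]))  # orthonormal, full rank

# Inject cycle flow via coefficients (random stand-ins for learned ones):
rng = np.random.default_rng(1)
x, alpha = rng.normal(size=n), rng.normal(size=C.shape[1])
f = B.T @ x + C @ alpha                          # gradient part + cycle part

# The cycle channel now reads the injected coefficients back:
print(np.abs(C.T @ f - alpha).max())             # numerically zero
```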
Fix 2 (independent channel routing). Each budget channel now receives an independent learned projection of the input, not the same undifferentiated features. Axiom B4 (local additivity: the budgets of independent components add) is now satisfied.
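A sketch of the routing change, with illustrative shapes and weight names (`W_th`, `W_cx`, `W_leak` are stand-ins, not the real C-Former parameters):

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_ch = 16, 8
x = rng.normal(size=d_in)

# v1 (violates B4): one shared projection feeds all three channels.
shared = rng.normal(size=(d_ch, d_in)) @ x
th_v1, cx_v1, leak_v1 = shared, shared, shared   # identical by construction

# v3 (fix): an independent learned projection per budget channel.
W_th, W_cx, W_leak = (rng.normal(size=(d_ch, d_in)) for _ in range(3))
th, cx, leak = W_th @ x, W_cx @ x, W_leak @ x

# Per-channel budgets as energies of now-differentiated features.
budgets = {"B_th": float(th @ th), "B_cx": float(cx @ cx),
           "B_leak": float(leak @ leak)}
print(budgets)
```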
Fix 3 (tile chaining). A chain of TD6 tiles, 6 tokens per tile, with adjacent tiles exchanging state through boundary nodes. Cost: O(L) per layer, linear in sequence length. Inter-tile cycles raise the cycle count from 12 per tile to 13N-1 for N tiles, providing long-range sensing.
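The chaining scheme can be sketched as follows. `chain_tiles` and its doubling "update" are illustrative stand-ins; only the tile size (6) and the cycle-count formula (13N-1) come from the text.

```python
# Illustrative chaining sketch, not the actual C-Former implementation.

def chain_tiles(tokens, tile_size=6):
    pad = (-len(tokens)) % tile_size          # pad to a whole number of tiles
    tokens = list(tokens) + [0.0] * pad
    tiles = [tokens[i:i + tile_size] for i in range(0, len(tokens), tile_size)]

    # One layer: a local per-tile update (O(tile_size) each, O(L) total)...
    tiles = [[2.0 * t for t in tile] for tile in tiles]

    # ...then adjacent tiles exchange state through their boundary nodes.
    for k in range(len(tiles) - 1):
        mean = (tiles[k][-1] + tiles[k + 1][0]) / 2.0
        tiles[k][-1] = tiles[k + 1][0] = mean
    return tiles

def n_cycles(n_tiles):
    # Cycle count from the text: 12 for one tile, 13N - 1 for N chained tiles.
    return 13 * n_tiles - 1

print(len(chain_tiles([1.0] * 14)), n_cycles(1), n_cycles(4))  # 3 12 51
```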
Interpretability: Budget Profiles on UCI HAR
The fixed Hodge projectors produce deterministic, physiologically meaningful decompositions. This finding is independent of the v3 fixes and remains valid across all versions.
| ACTIVITY | B_th | B_cx | B_leak | MATCH |
|---|---|---|---|---|
| WALKING | 21.0% | 50.3% | 28.7% | YES |
| WALK UP | 36.5% | 32.5% | 31.0% | YES |
| WALK DOWN | 40.8% | 40.8% | 18.4% | YES |
| SITTING | 19.1% | 32.2% | 48.7% | YES |
| STANDING | 21.1% | 32.3% | 46.7% | YES |
| LAYING | 21.0% | 26.2% | 52.8% | NO |
5/6 predictions correct (83%). Profiles are identical across all seeds, as expected from the deterministic fixed projectors. Frozen Hodge representations retain 96.6% accuracy with only 8.4% of parameters trainable.
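A sketch of how a budget profile like the rows above could be computed: per-channel energy as a fraction of total energy across the three channels. The channel vectors here are random stand-ins, not actual HAR features.

```python
import numpy as np

def budget_profile(th, cx, leak):
    # Fraction of total energy carried by each channel (B_th, B_cx, B_leak).
    e = np.array([th @ th, cx @ cx, leak @ leak])
    return e / e.sum()

rng = np.random.default_rng(0)
th, cx, leak = (rng.normal(size=32) for _ in range(3))
profile = budget_profile(th, cx, leak)
print(profile)   # three non-negative fractions summing to 1
```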
Complete Cost Accounting
| PHASE | COST | KEY FINDING |
|---|---|---|
| Phase B (synthetic, HAR) | ~$1 | Interpretable budget profiles |
| Phase C (LRA, QM9) | ~$1 | ListOps 17.4%, dead B_cx |
| v1.5 (wrong fix) | ~$0.50 | Symmetric products are not cycle flow |
| v3 (all fixes) | ~$0.50 | ListOps 81.3%, Pathfinder 99.97% |
| Total | ~$3 | Theory-predicted phase transition for the cost of a coffee |