Structured Rules

Exact-K and row-K were learned much faster than parity. One-hot encoding delayed or reduced memorization relative to scalar encoding. Exact-K had a rule-learning time comparable to that of parity with G=2, even though it imposes a global count over all bits. Row-K was learned rapidly, often at around 1,000 to 10,000 steps. Learning of Latin square and Sudoku was delayed because multiple constraints had to be coordinated. In the categorical tasks, column validity emerged later than row validity, indicating anisotropic learning.
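
To make the difference between these rules concrete, the sketch below writes each one as a validity check. The source does not give formal definitions, so the readings here are assumptions: exact-K is taken to mean exactly K ones over the whole grid, row-K exactly K ones in every row, and a Latin square an n-by-n grid where every row and every column is a permutation of 1..n. The function names (`exact_k`, `row_k`, `rows_valid`, `cols_valid`) are hypothetical and chosen for illustration only.

```python
from typing import List

Grid = List[List[int]]


def exact_k(bits: Grid, k: int) -> bool:
    """Assumed reading of exact-K: exactly k ones over the whole grid (global count)."""
    return sum(sum(row) for row in bits) == k


def row_k(bits: Grid, k: int) -> bool:
    """Assumed reading of row-K: every row contains exactly k ones (local count)."""
    return all(sum(row) == k for row in bits)


def rows_valid(grid: Grid) -> bool:
    """Row validity for a square categorical grid: each row is a permutation of 1..n."""
    n = len(grid)
    symbols = set(range(1, n + 1))
    return all(len(row) == n and set(row) == symbols for row in grid)


def cols_valid(grid: Grid) -> bool:
    """Column validity: the same check applied to the transposed grid."""
    return rows_valid([list(col) for col in zip(*grid)])


def is_latin_square(grid: Grid) -> bool:
    """Both constraints must hold at once, which is the coordination step."""
    return rows_valid(grid) and cols_valid(grid)


# A binary grid that satisfies the global count but not the per-row count.
bits = [
    [1, 1, 1, 0],
    [1, 0, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 1],
]
print(exact_k(bits, 8), row_k(bits, 2))

# A categorical grid with valid rows but invalid columns.
rows_only = [
    [1, 2, 3],
    [1, 2, 3],
    [1, 2, 3],
]
print(rows_valid(rows_only), cols_valid(rows_only), is_latin_square(rows_only))
```

The `rows_only` grid is the kind of intermediate state the last claim describes: row validity holds while column validity does not, so the full Latin square rule is not yet satisfied.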