Group Parity

Diffusion transformers learned parity well only for small group sizes at the training endpoint. Increasing the group size increases bit interaction order and provides direct control over rule complexity. High-G parity is difficult because…

1 sources - 6 claims

Diffusion transformers learned parity well only for small group sizes at the training endpoint. Increasing the group size increases bit interaction order and provides direct control over rule complexity. High-G parity is difficult because group parity requires degree-G multiplicative interactions. Novel full samples in parity mainly arise by recombining memorized groups into unseen full samples. For larger parity groups, more rule-valid generations were exact training samples. The main testbed is group-level even parity on 36-bit binary images.