System Neural Diversity

1 source - 5 claims

All-pairs aggregation in System Neural Diversity (SND) should be treated as a design choice rather than an unavoidable default. SND measures behavioral heterogeneity in multi-agent reinforcement learning as the average pairwise distance between agents' action distributions. In Gaussian-policy settings, each pairwise behavioral distance is estimated as a Monte Carlo average of Wasserstein-2 distances over rollout observations. For n agents, full SND therefore requires n choose 2 = n(n-1)/2 pairwise computations on every metric call, so the cost grows quadratically with the number of agents. All-pairs aggregation can thus become a bottleneck when the metric is evaluated repeatedly during training or control.
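
The computation described above can be sketched roughly as follows. This is a minimal illustration, not the implementation from the cited source. It assumes diagonal-covariance Gaussian policies, for which the Wasserstein-2 distance has the closed form sqrt(||mu_a - mu_b||^2 + ||sigma_a - sigma_b||^2). The function names (`w2_diag_gaussian`, `snd`, `sample_pairs`), the array layout, and the random pair-subsampling variant are all hypothetical and are included only to show where the pair set enters as a choice.

```python
import itertools
import numpy as np


def w2_diag_gaussian(mu_a, sigma_a, mu_b, sigma_b):
    """Closed-form W2 distance between diagonal Gaussians.

    Inputs have shape (n_obs, action_dim); sigma_* are standard deviations.
    Returns one distance per observation, shape (n_obs,).
    """
    sq = np.sum((mu_a - mu_b) ** 2, axis=-1) + np.sum((sigma_a - sigma_b) ** 2, axis=-1)
    return np.sqrt(sq)


def snd(mus, sigmas, pairs=None):
    """Average pairwise behavioral distance over a set of agent pairs.

    mus, sigmas: arrays of shape (n_agents, n_obs, action_dim), i.e. each
    agent's policy evaluated on the same rollout observations (assumed layout).
    pairs: iterable of (i, j) index pairs; defaults to all n-choose-2 pairs.
    Returns (metric value, number of pairwise computations performed).
    """
    n_agents = mus.shape[0]
    if pairs is None:
        pairs = list(itertools.combinations(range(n_agents), 2))
    # Monte Carlo average over observations for each pair, then mean over pairs.
    dists = [
        w2_diag_gaussian(mus[i], sigmas[i], mus[j], sigmas[j]).mean()
        for i, j in pairs
    ]
    return float(np.mean(dists)), len(pairs)


def sample_pairs(n_agents, k, rng):
    """Hypothetical alternative: k pairs drawn uniformly without replacement."""
    all_pairs = list(itertools.combinations(range(n_agents), 2))
    idx = rng.choice(len(all_pairs), size=k, replace=False)
    return [all_pairs[t] for t in idx]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_agents, n_obs, action_dim = 8, 64, 2
    mus = rng.normal(size=(n_agents, n_obs, action_dim))
    sigmas = np.abs(rng.normal(size=(n_agents, n_obs, action_dim))) + 0.1

    full_value, full_cost = snd(mus, sigmas)  # all 28 pairs for 8 agents
    sub_value, sub_cost = snd(mus, sigmas, pairs=sample_pairs(n_agents, 10, rng))
    print(full_value, full_cost)
    print(sub_value, sub_cost)
```

Passing an explicit `pairs` argument makes the aggregation scheme a parameter of the metric call. Whether a subsampled estimate is accurate enough for a given training or control loop is a separate question that this sketch does not address.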