Expert Utilization
EMO had a slightly higher Gini coefficient for expert utilization than the fixed baseline, meaning its routing load was somewhat more imbalanced. Even so, the fixed E=128 configuration and EMO showed similar per-layer and per-expert routing distributions, and no EMO layer exhibited routing collapse in the utilization analysis.
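
For readers unfamiliar with the metric, here is a minimal sketch of how a Gini coefficient over expert utilization could be computed from per-expert token counts. It uses the standard sorted-rank formula; the source does not give the paper's exact definition or normalization, so treat this as an assumption. The function names `gini` and `per_layer_gini` and the sample counts are hypothetical and are not taken from the paper.

```python
def gini(counts):
    """Gini coefficient of non-negative per-expert token counts.

    0.0 means perfectly even utilization; (n - 1) / n means all tokens
    went to a single expert out of n. Standard formula, assumed here.
    """
    xs = sorted(counts)              # ascending order, as the rank formula requires
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0                   # no experts or no routed tokens: report no imbalance
    # Sum of rank * value with 1-based ranks over the sorted counts.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


def per_layer_gini(layer_counts):
    """Apply gini() to each layer's list of per-expert counts (hypothetical helper)."""
    return [gini(counts) for counts in layer_counts]


if __name__ == "__main__":
    # Illustrative counts only, not data from the paper.
    layers = [
        [5, 5, 5, 5],    # perfectly balanced layer
        [0, 0, 0, 10],   # fully collapsed layer: one expert takes every token
    ]
    print(per_layer_gini(layers))
```

Under this definition a value near zero for every layer is consistent with balanced routing, while a layer approaching (n - 1) / n would indicate collapse onto a single expert. A modestly higher Gini with no collapsed layer, as reported for EMO, would show up as slightly larger values across layers without any layer nearing that upper bound.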