Expert Utilization

1 source - 4 claims

The paper reports slightly higher expert-utilization imbalance for EMO than for the fixed baseline, measured as a higher Gini coefficient. Even so, the fixed E=128 baseline and EMO showed similar per-layer and per-expert routing distributions, and no EMO layer exhibited routing collapse in the utilization analysis.
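
For readers unfamiliar with the metric, the following is a minimal sketch of how a Gini coefficient over expert utilization can be computed. It assumes the standard Gini definition applied to per-expert token counts within one layer; the summary does not state the paper's exact formulation, and the function names `gini` and `per_layer_gini` are hypothetical.

```python
import numpy as np


def gini(counts):
    """Gini coefficient of per-expert token counts (assumed standard definition).

    0.0 means every expert received the same load; values near 1.0 mean
    almost all tokens were routed to a few experts.
    """
    x = np.sort(np.asarray(counts, dtype=float))  # ascending order
    n = x.size
    total = x.sum()
    if n == 0 or total == 0:
        return 0.0  # no routing information, treat as balanced
    ranks = np.arange(1, n + 1)
    # Closed form for sorted data: sum_i (2i - n - 1) * x_i / (n * sum(x))
    return float((2 * ranks - n - 1) @ x / (n * total))


def per_layer_gini(counts_by_layer):
    """Apply gini() to each layer's per-expert count vector."""
    return [gini(layer_counts) for layer_counts in counts_by_layer]
```

Under this definition a perfectly uniform load gives 0, and routing every token to a single one of n experts gives (n - 1) / n. That is why a slightly higher Gini value can coexist with no routing collapse: the coefficient moves well before any expert becomes unused.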