Policy Blindness

MeZO cumulative updates are much more uniform than Adam's layer-wise update profile. Adam shows heterogeneous layer update patterns in OPT-6.7B, with larger updates in shallow and middle layers. Low correlation between MeZO cumulative upda…

1 sources - 5 claims

MeZO cumulative updates are much more uniform than Adam's layer-wise update profile. Adam shows heterogeneous layer update patterns in OPT-6.7B, with larger updates in shallow and middle layers. Low correlation between MeZO cumulative updates and Adam gradient norms indicates isotropic ZO exploration fails to recover intrinsic layer sensitivity. Policy Blindness refers to standard ZO methods perturbing all layers uniformly despite heterogeneous layer sensitivity. Dense uniform perturbation can inject noise through insensitive layers.