Prompt-MoE Surrogate
The Prompt-MoE surrogate consumes three inputs: normalized parameters, context, and the retrieved prompt. Six expert MLPs each produce a prediction, and a gating network weights those predictions to form the output. Expert disagreement is computed as the weighted squared deviation among the expert predictions. Most adaptation steps update only the retrieved prompts and keep the expert and gate weights frozen. An emergency full update, which trains the experts and the gate from a replay buffer, is triggered when the anomaly score or the predictive variance crosses a threshold.
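The forward pass and the two update modes described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the source's implementation: the class and function names (PromptMoESurrogate, prompt_only_step, choose_update) are hypothetical, the three inputs are assumed to be concatenated, the gate is assumed to be a single softmax layer, each expert is assumed to be a one-hidden-layer tanh MLP with a scalar output, and disagreement is assumed to be measured about the gate-weighted mean. The prompt update uses a finite-difference gradient purely to keep the sketch dependency-free; the source does not say how prompts are optimized.

```python
import math
import random

NUM_EXPERTS = 6  # the source specifies six expert MLPs


def make_linear(rng, n_in, n_out):
    """Random linear layer stored as plain lists (illustrative initialization)."""
    weights = [[rng.gauss(0.0, 0.3) for _ in range(n_in)] for _ in range(n_out)]
    return {"w": weights, "b": [0.0] * n_out}


def linear(layer, x):
    return [sum(wi * xi for wi, xi in zip(row, x)) + b
            for row, b in zip(layer["w"], layer["b"])]


def mlp(layers, x):
    """tanh on every layer except the last, which stays linear."""
    for layer in layers[:-1]:
        x = [math.tanh(v) for v in linear(layer, x)]
    return linear(layers[-1], x)


def softmax(logits):
    top = max(logits)
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]


class PromptMoESurrogate:
    """Hypothetical sketch: six experts plus a softmax gate over concatenated inputs."""

    def __init__(self, n_params, n_context, n_prompt, hidden=8, seed=0):
        rng = random.Random(seed)
        n_in = n_params + n_context + n_prompt
        self.experts = [[make_linear(rng, n_in, hidden), make_linear(rng, hidden, 1)]
                        for _ in range(NUM_EXPERTS)]
        self.gate = [make_linear(rng, n_in, NUM_EXPERTS)]

    def forward(self, params, context, prompt):
        x = list(params) + list(context) + list(prompt)
        preds = [mlp(expert, x)[0] for expert in self.experts]
        weights = softmax(mlp(self.gate, x))
        mean = sum(w * p for w, p in zip(weights, preds))
        # Assumed form of "weighted squared deviation": spread about the weighted mean.
        disagreement = sum(w * (p - mean) ** 2 for w, p in zip(weights, preds))
        return mean, disagreement, weights, preds


def prompt_only_step(model, params, context, prompt, target, lr=0.05, eps=1e-4):
    """Update the prompt only; expert and gate weights are never modified.

    Uses a forward finite-difference gradient of the squared error (an assumption).
    """
    def loss(candidate):
        mean, _, _, _ = model.forward(params, context, candidate)
        return (mean - target) ** 2

    base = loss(prompt)
    grad = []
    for i in range(len(prompt)):
        bumped = list(prompt)
        bumped[i] += eps
        grad.append((loss(bumped) - base) / eps)
    return [p - lr * g for p, g in zip(prompt, grad)]


def choose_update(anomaly_score, predictive_variance,
                  anomaly_threshold, variance_threshold):
    """Return "full" when either signal exceeds its threshold, else "prompt_only"."""
    if anomaly_score > anomaly_threshold or predictive_variance > variance_threshold:
        return "full"  # emergency: retrain experts and gate from the replay buffer
    return "prompt_only"


if __name__ == "__main__":
    model = PromptMoESurrogate(n_params=2, n_context=1, n_prompt=3)
    mean, disagreement, weights, preds = model.forward([0.1, -0.2], [0.5], [0.0] * 3)
    print(mean, disagreement)
    print(choose_update(0.1, disagreement, anomaly_threshold=0.5, variance_threshold=1.0))
```

In this sketch the disagreement value doubles as the predictive variance passed to choose_update; whether the source uses the same quantity for both, and what threshold values it uses, is not stated. The emergency path is left as a returned label because the replay-buffer training procedure is not described in the text.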