Martingale Neural Operators

1 source - 8 claims

MNO converts the Doob-Meyer decomposition into an architectural inductive bias by giving the network separate parametric heads for the drift and martingale components. A temporal gate forces both heads to output zero at t=0, so the model exactly recovers the initial condition. Sharing the FNO backbone between the drift and covariance heads achieves better mean RMSE than using independent backbones, an effect attributed to reduced gradient interference.

Adding the martingale covariance head does not materially degrade MNO's mean prediction accuracy compared to a mean-only baseline. MNO achieves 68× and 120× improvements in Wasserstein-2 distance over Neural SPDE on stochastic Burgers and φ⁴ field theory, respectively. The variance-consistency term is the load-bearing auxiliary loss: removing it alone degrades performance as much as removing all auxiliary terms.

MNO is a marginal learner and does not enforce temporal consistency, conservation laws, or the full martingale tower property. It fails on Gray-Scott systems because the residual factor head misattributes high-frequency deterministic pattern structure to stochastic variance.
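As a rough illustration of the temporal gate and the two-head split described above, the sketch below multiplies a drift head and a covariance-factor head by a gate that vanishes at t=0. It is not MNO's actual implementation. The gate form `1 - exp(-t / tau)`, the placeholder head functions, the rank-1 covariance, and all names (`temporal_gate`, `drift_head`, `factor_head`, `predict`) are assumptions made for illustration. In the real model both heads would be learned and would sit on a shared FNO backbone.

```python
import math


def temporal_gate(t, tau=0.1):
    """Hypothetical smooth gate: exactly 0 at t=0, approaching 1 for t >> tau."""
    return 1.0 - math.exp(-t / tau)


def drift_head(u0, t):
    # Stand-in for the learned drift head (assumption: simple linear decay).
    return [-0.5 * x * t for x in u0]


def factor_head(u0, t):
    # Stand-in for the learned covariance factor head (assumption: rank-1 factor).
    return [0.1 * (1.0 + abs(x)) for x in u0]


def predict(u0, t, tau=0.1):
    """Return (mean, covariance) at time t for initial condition u0."""
    g = temporal_gate(t, tau)
    # Mean = initial condition + gated drift contribution.
    mean = [x + g * d for x, d in zip(u0, drift_head(u0, t))]
    # Gated factor; the covariance is its outer product with itself,
    # which is positive semi-definite by construction.
    factor = [g * f for f in factor_head(u0, t)]
    cov = [[fi * fj for fj in factor] for fi in factor]
    return mean, cov


u0 = [1.0, -2.0, 0.5]
mean0, cov0 = predict(u0, 0.0)  # gate is closed: mean equals u0, covariance is zero
mean1, cov1 = predict(u0, 1.0)  # gate is nearly open: both heads contribute
```

Because the gate multiplies both heads, recovery of the initial condition at t=0 holds for any head outputs, so it is a structural property rather than something the network has to learn.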