Backward Error Analysis
The theoretical analysis uses backward error analysis to interpret a discrete optimizer as exactly following a continuous flow on a modified objective, rather than approximately following the flow on the original one. For full-graph gradient descent, the modified objective is the full-graph loss plus an epsilon-scaled squared-gradient correction. For sampled-subgraph mini-batch training, the modified loss is not simply the full-graph loss, because each update is computed on a sampled subgraph rather than on the whole graph. The formal backward error analysis is limited to vanilla SGD; no matching stochastic graph mini-batch derivation is given for Adam.
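
The full-graph case can be checked numerically on a toy problem. The sketch below is illustrative only and rests on assumptions not stated above: epsilon is taken to be the step size, the correction is taken in the standard backward-error form (epsilon / 4) * ||grad L||^2, and the loss is a hypothetical one-dimensional quadratic L(theta) = 0.5 * a * theta^2 rather than any graph loss. It compares one gradient-descent step against the exact gradient flow on the original loss and on the modified loss.

```python
import math

def gd_step(theta, a, eps):
    """One gradient-descent step on L(theta) = 0.5 * a * theta**2."""
    return theta - eps * a * theta

def flow_original(theta, a, eps):
    """Exact gradient flow on L, integrated for time eps."""
    return theta * math.exp(-a * eps)

def flow_modified(theta, a, eps):
    """Exact gradient flow on L + (eps / 4) * (a * theta)**2 for time eps.

    Assumed form of the correction; its gradient is
    (a + 0.5 * eps * a**2) * theta, so the flow is still a pure exponential.
    """
    rate = a + 0.5 * eps * a * a
    return theta * math.exp(-rate * eps)

# Hypothetical toy values, chosen only for illustration.
theta0, a, eps = 1.0, 1.0, 0.1

discrete = gd_step(theta0, a, eps)
err_original = abs(discrete - flow_original(theta0, a, eps))
err_modified = abs(discrete - flow_modified(theta0, a, eps))

print(f"gap to flow on original loss: {err_original:.6f}")
print(f"gap to flow on modified loss: {err_modified:.6f}")
```

Under these assumptions the discrete step sits roughly an order of magnitude closer to the flow on the modified loss than to the flow on the original loss, which is the sense in which the optimizer "follows" the modified objective. The sketch says nothing about the sampled-subgraph case, where the modified loss differs from the full-graph one.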