Evidence and Limitations

1 source - 4 claims

In simulations, Functional FQI outperformed continuous-action offline RL baselines that collapse distributional actions to scalars. Functional boxplots showed that the learned recommendations stayed within the non-outlying range of the observed behavior distributions. Simulation studies also found that Functional FQE accurately recovered the value of a target functional policy. The authors caution that causal interpretations cannot be drawn from the observational All of Us data alone.
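
To make concrete what "collapsing distributional actions to scalars" can discard, here is a minimal Python sketch. It is illustrative only: the two behavior profiles are made-up numbers, the function names are hypothetical, and using the mean as the scalar summary is an assumption, not a description of the baselines or data used by the authors.

```python
import numpy as np

def collapse_to_scalar(samples):
    """Scalar summary of a behavior profile (here the mean; an assumed choice)."""
    return float(np.mean(samples))

def quantile_function(samples, grid):
    """Empirical quantile function evaluated on a grid of probability levels."""
    return np.quantile(samples, grid)

# Two hypothetical daily profiles (24 hourly values) with the same mean
# but very different shapes. These are invented numbers, not study data.
steady = np.full(24, 5.0)                     # constant level all day
bursty = np.array([0.0] * 18 + [20.0] * 6)    # mostly idle, short intense bursts

grid = np.linspace(0.05, 0.95, 19)

# The scalar summaries coincide, so a scalar-action method cannot tell them apart.
same_scalar = np.isclose(collapse_to_scalar(steady), collapse_to_scalar(bursty))

# The quantile functions differ, so a functional representation can.
gap = quantile_function(steady, grid) - quantile_function(bursty, grid)
l2_gap = float(np.sqrt(np.mean(gap ** 2)))

print("same scalar summary:", bool(same_scalar))
print("RMS gap between quantile functions:", l2_gap)
```

Under these assumptions, both profiles map to the same scalar action even though their distributions are clearly different, which is the kind of information a functional action representation is meant to retain.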