Abstract:
Time series in finance, biology, and physics combine deterministic trends with stochastic fluctuations. Common forecasting methods (ARIMA, LSTM) and symbolic regression tools (PySR, SINDy) produce deterministic equations, whereas real-world processes call for stochastic differential equations (SDEs) that capture their inherent uncertainty. We introduce SAGE (Stochastic Automatic Generative Ensembles), which reframes equation discovery as a probabilistic inference problem and uses evolutionary algorithms to perform mathematical reasoning over the space of SDEs. SAGE exploits the stochasticity of evolutionary optimization: it converts the variability across multiple symbolic regression runs into probability distributions over equation terms and coefficients. This enables data-driven derivation of interpretable SDE models without restrictive parametric assumptions. The resulting SDE models are more robust, and better suited to further automated symbolic manipulation and reasoning, than the ODEs and ODE ensembles that equation discovery usually yields.
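To make the idea of turning run-to-run variability into distributions concrete, the following is a minimal sketch and not the SAGE implementation. It assumes each symbolic regression run has already been reduced to a coefficient vector over a fixed library of candidate terms; the term library `TERMS`, the helper `summarize_runs`, and the numbers in `runs` are all hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical library of candidate terms (illustrative names only).
TERMS = ["1", "x", "x^2", "sin(x)"]


def summarize_runs(coef_runs, tol=1e-8):
    """Summarize coefficients from repeated symbolic regression runs.

    coef_runs: array-like of shape (n_runs, n_terms). An entry with
    magnitude <= tol means that run did not select the term.
    Returns, per term, the empirical inclusion probability and the
    mean and standard deviation of the coefficient over the runs
    that selected it.
    """
    coef_runs = np.asarray(coef_runs, dtype=float)
    active = np.abs(coef_runs) > tol        # which runs selected which term
    inclusion = active.mean(axis=0)         # fraction of runs per term
    summary = {}
    for j, term in enumerate(TERMS):
        vals = coef_runs[active[:, j], j]   # coefficients from selecting runs
        if vals.size == 0:
            summary[term] = {"p_include": 0.0, "mean": 0.0, "std": 0.0}
        else:
            summary[term] = {
                "p_include": float(inclusion[j]),
                "mean": float(vals.mean()),
                "std": float(vals.std()),
            }
    return summary


# Made-up coefficients from four hypothetical runs (rows) over TERMS (columns).
runs = [
    [0.0, -1.0, 0.0, 0.5],
    [0.0, -1.2, 0.1, 0.0],
    [0.0, -0.8, 0.0, 0.5],
    [0.0, -1.0, 0.0, 0.0],
]
summary = summarize_runs(runs)
for term, stats in summary.items():
    print(term, stats)
```

In this toy setting, a term selected in every run with a tightly clustered coefficient would be a candidate for the deterministic part of a model, while terms that appear only in some runs, or whose coefficients vary widely, carry the spread that a probabilistic treatment would represent. How SAGE maps such statistics onto the drift and diffusion of an SDE is not specified here and is left open.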