Probability Paths and the Structure of Predictions over Time
Authors: Zhiyuan Jerry Lin, Hao Sheng, Sharad Goel
NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We now explore the efficacy of GLIM through a series of experiments on two real-world datasets. We have additionally included a simulation study on a synthetic dataset in Appendix B to demonstrate GLIM's empirical finite-sample behavior. In all of our experiments, we use a covariance matrix Σ(X, θ) with autoregressive structure and heteroskedastic variance. We compare GLIM against three representative baselines, one from each of the three classes of models described in Section 2: (1) MMFE: martingale method of forecast evolution [Heath and Jackson, 1994, Zhao et al., 2013]; (2) LR: a set of linear regression models {m_t} that predict the future estimated probability at time t [Brockwell and Davis, 2016]; and (3) MQLSTM: a Bayesian multi-horizon quantile LSTM model [Wen et al., 2017, Eisenach et al., 2020]. As displayed in the plots, GLIM outperforms all three baselines across the board. In particular, Figures 4a and 5a are plotted on a log scale, indicating that GLIM outperforms the baselines by a few orders of magnitude on those metrics. |
| Researcher Affiliation | Collaboration | Zhiyuan Jerry Lin (Facebook, zylin@fb.com); Hao Sheng (Stanford University, haosheng@cs.stanford.edu); Sharad Goel (Harvard University, sgoel@hks.harvard.edu) |
| Pseudocode | No | The paper describes multi-step procedures for model inference and drawing probability paths but does not present them in structured pseudocode or an algorithm block explicitly labeled as such. |
| Open Source Code | Yes | Code to replicate our experiments is available online at: https://github.com/ItsMrLin/probability-paths. |
| Open Datasets | Yes | Specifically, we use a dataset of Australian rainfall observations [Williams, 2011, Young and Young, 2018], and construct daily predictions starting seven days in advance of the target date. Kaggle: Rain in Australia, 2018. https://www.kaggle.com/jsphyg/weather-dataset-rattle-package. |
| Dataset Splits | No | For the basketball dataset, the paper states: 'training our model on the first season, and evaluating our predictions on the second' (2017-2018 for training, 2018-2019 for evaluation). For the weather dataset, it states: 'We randomly sampled 10,000 target dates in the dataset prior to 2014 for training our models, and randomly sampled 10,000 target dates in or after 2014 for testing.' There is no explicit mention of a separate validation split. (A sketch of the weather split appears after the table.) |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU, CPU models, or cloud computing specifications) used to run the experiments. |
| Software Dependencies | No | The paper mentions that Hamiltonian Monte Carlo (HMC) is 'as implemented in Stan [Carpenter et al., 2017]', but it does not specify the version number of Stan or any other software dependencies with their versions. |
| Experiment Setup | Yes | In all of our experiments, we use a covariance matrix Σ(X, θ) with autoregressive structure and heteroskedastic variance. Specifically, we set Σ(i,j) = σ_i σ_j ρ^|i−j|, where the variance at time t is σ_t² = G_β(X, t). In one application we use a regularized linear function G_β(X, t) for GLIM, described in detail in Appendix C.1; in the other we set ρ = 0 and use a regularized quadratic function G_β(X, t), described in detail in Appendix C.2. For each model, all metrics are calculated using 100 simulated samples per probability path. Without further constraints, θ is not fully identified by the data, since multiplying all of the latent variables in Eq. (1) by a positive constant does not affect the sign of the relevant expression. Thus, in our applications below, we constrain the scale of the latent variables by requiring Var(Z_1) = σ_1² = 1. (A sketch of this covariance construction appears after the table.) |
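
The covariance construction quoted in the Experiment Setup row can be made concrete. Below is a minimal Python sketch, assuming the autoregressive form Σ(i,j) = σ_i σ_j ρ^|i−j|; the horizon length, the value of ρ, and the placeholder variances standing in for G_β(X, t) are illustrative assumptions, with only the identification constraint σ_1² = 1 taken from the paper.

```python
import numpy as np

def make_covariance(sigmas: np.ndarray, rho: float) -> np.ndarray:
    """Build Sigma with Sigma[i, j] = sigma_i * sigma_j * rho**|i - j|."""
    n = len(sigmas)
    lags = np.abs(np.arange(n)[:, None] - np.arange(n)[None, :])
    return np.outer(sigmas, sigmas) * rho ** lags

# Hypothetical heteroskedastic variances over n = 7 horizons; a linear
# ramp stands in for the paper's regularized G_beta(X, t) (Appendix C.1).
n = 7
sigmas = np.sqrt(np.linspace(1.0, 2.5, n))
sigmas[0] = 1.0  # identification constraint: Var(Z_1) = sigma_1^2 = 1
Sigma = make_covariance(sigmas, rho=0.6)  # rho = 0.6 is illustrative

# Draw latent Gaussian vectors Z ~ N(0, Sigma), using 100 samples to
# mirror the paper's "100 simulated samples per probability path".
rng = np.random.default_rng(0)
Z = rng.multivariate_normal(np.zeros(n), Sigma, size=100)
print(Z.shape)  # (100, 7)
```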
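
Similarly, the date-based split reported for the weather data can be sketched. This assumes the Kaggle `weatherAUS.csv` file with a parseable `Date` column; sampling rows here stands in for the paper's sampling of target dates, and the file name, column name, and random seed are assumptions rather than details from the paper.

```python
import pandas as pd

# Load the Kaggle "Rain in Australia" data (assumed file/column names).
df = pd.read_csv("weatherAUS.csv", parse_dates=["Date"])

# Split at the 2014 cutoff described in the paper: training dates fall
# before 2014, test dates in or after 2014.
before_2014 = df[df["Date"] < "2014-01-01"]
from_2014 = df[df["Date"] >= "2014-01-01"]

# Randomly sample 10,000 target dates on each side of the cutoff.
train = before_2014.sample(n=10_000, random_state=0)
test = from_2014.sample(n=10_000, random_state=0)
print(len(train), len(test))  # 10000 10000
```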