Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.
Online Portfolio Selection with ML Predictions
Authors: Ziliang Zhang, Tianming Zhao, Albert Zomaya
NeurIPS 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Comprehensive experiments on large-scale equity data strengthen our theory, spanning both synthetic prediction streams and production-grade machine-learning models. Section 3 (Empirical study): We begin by assessing RAM on the canonical New York Stock Exchange (NYSE) benchmark, using i.i.d. random rankings to model a fully oblivious and uninformative oracle. |
| Researcher Affiliation | Academia | Ziliang Zhang, School of Computer Science, The University of Sydney, Camperdown NSW 2050, Australia; Tianming Zhao, School of Computer Science, The University of Sydney, Camperdown NSW 2050, Australia; Albert Y. Zomaya, School of Computer Science, The University of Sydney, Camperdown NSW 2050, Australia |
| Pseudocode | Yes | Algorithm 1 RAM: rebalanced arithmetic mean with predictions |
| Open Source Code | Yes | Source code is available at https://github.com/mroymd/OPML. Both the NYSE and S&P 500 datasets are publicly available; moreover, we supply a one-click Colab notebook that fully replicates all reported experiments. |
| Open Datasets | Yes | Our primary dataset is the original NYSE(O) collection [27], which contains 36 stocks spanning 22 years (1962–1984) over 5,651 trading days. To capture a broader range of market volatility and ensure more recent coverage, we also consider the extended NYSE(N) dataset [28], encompassing 21 assets from 1962 to 2006 (11,178 trading days). We use the nightly-refreshed S&P 500 historical panel [30] available on Kaggle, containing 501 constituents from 2010 to 2024. |
| Dataset Splits | Yes | The model is retrained each trading day using a 250-day sliding window, featuring contemporaneous and three lagged returns per asset. A decaying factor with Θ = 0.995^age prioritizes recent observations while discarding stale information. A 60-day hold-out slice inside the same window provides early-stopping signals, eliminating look-ahead bias. |
| Hardware Specification | Yes | All experiments compute under 6h on one standard CPU. |
| Software Dependencies | No | The paper mentions 'LightGBM LambdaMART [25]' for forecasting ranks, but does not provide a specific version number for this software or any other key software components used in the experiments. |
| Experiment Setup | Yes | The model is retrained each trading day using a 250-day sliding window, featuring contemporaneous and three lagged returns per asset. A decaying factor with Θ = 0.995^age prioritizes recent observations while discarding stale information. A 60-day hold-out slice inside the same window provides early-stopping signals, eliminating look-ahead bias. |
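
The retraining schedule quoted under Dataset Splits and Experiment Setup can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `sliding_window_plan` is hypothetical, and three details are assumptions the excerpt does not state: that the window covers only days strictly before the retraining day, that the 60-day hold-out is the most recent part of the window, and that "age" counts trading days back from the newest observation (age 0).

```python
def sliding_window_plan(day, window=250, holdout=60, decay=0.995):
    """Plan one daily retraining step (hypothetical helper, see assumptions above).

    Returns (train_days, holdout_days, weights), where days are integer
    trading-day indices and weights maps each day in the window to decay**age.
    """
    if holdout >= window:
        raise ValueError("hold-out slice must be shorter than the window")
    if day < window:
        raise ValueError("not enough history for a full window")

    # Assumption: the window is the `window` days strictly before `day`,
    # so no information from `day` itself leaks into training.
    days = list(range(day - window, day))

    # Assumption: the hold-out slice is the most recent `holdout` days.
    split = window - holdout
    train_days, holdout_days = days[:split], days[split:]

    # Assumption: age 0 is the newest observation in the window.
    newest = day - 1
    weights = {d: decay ** (newest - d) for d in days}
    return train_days, holdout_days, weights
```

Under these assumptions, a call such as `sliding_window_plan(1000)` yields 190 training days and 60 hold-out days, with the oldest observation in the window down-weighted to 0.995^249. How the weights enter the ranking model (for example, as per-sample weights) is not specified in the excerpt.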