Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Structured Learning of Compositional Sequential Interventions
Authors: Jialin Yu, Andreas Koukorinis, Nicolo Colombo, Yuchen Zhu, Ricardo Silva
NeurIPS 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We run a number of synthetic and semi-synthetic experiments to evaluate the performance of the CSI-VAE approach. In this section, we summarize our experimental set-up and main results. |
| Researcher Affiliation | Academia | Jialin Yu¹ Andreas Koukorinis¹ Nicolò Colombo² Yuchen Zhu¹ Ricardo Silva¹ ¹University College London ²Royal Holloway, University of London |
| Pseudocode | No | The paper describes the algorithm (CSI-VAE) in prose within Section 3 ('Algorithm and Statistical Inference' and '3.1 Algorithm: CSI-VAE') but does not provide structured pseudocode or an algorithm block. |
| Open Source Code | Yes | The code for reproducing all results and figures is available online². ²The code is available at https://github.com/jialin-yu/CSI-VAE |
| Open Datasets | Yes | For simulator (2), we construct simulated datasets of size 3,000, again containing 5 different interventions, an initial period T0 = 25, a maximum of 3 different interventions per unit, and r = 10. The task is to predict outcomes for interventions not applied yet within any given unit (i.e., at least from the 2 options left). In simulator (2), parameters ฯ and β are learned from real-world data. Interventions are artificial, but inspired by the process of showing different proportions of track types to an user in a Spotify-like environment. ...The dataset comes from Spotify⁵, which is an online music streaming platform... For a detailed description of the dataset, please refer to [10]. Reference [10] is: B. Brost, R. Mehrotra, and T. Jehan. The music streaming sessions dataset. In The World Wide Web Conference, pages 2594–2600, 2019. |
| Dataset Splits | Yes | For both setups, we use a data ratio of 0.7, 0.1, 0.2 for training, validation and test, respectively. |
| Hardware Specification | No | The paper does not specify the exact hardware (e.g., CPU, GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions software components like Adam optimizer, GRU, LSTM, and Transformer models, but it does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | Each experiment was repeated 5 times, using Adam [26] at a learning rate of 0.01, with 50 epochs in the fully-synthetic case and 100 for the semi-synthetic, which was enough for convergence. We selected the best iteration point based on a small holdout set. |
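
The split ratio quoted in the Dataset Splits row (0.7, 0.1, 0.2 for training, validation and test) can be illustrated with a minimal sketch. The paper excerpt states only the ratio; the random shuffling, the per-unit splitting, the fixed seed, and the helper name `split_indices` are assumptions made for illustration and are not taken from the authors' code.

```python
import random


def split_indices(n_units, ratios=(0.7, 0.1, 0.2), seed=0):
    """Shuffle unit indices and cut them into train/validation/test parts.

    Hypothetical helper: the paper reports only the 0.7/0.1/0.2 ratio,
    not how the partition is drawn.
    """
    indices = list(range(n_units))
    # Assumption: a seeded random shuffle, so the split is repeatable.
    random.Random(seed).shuffle(indices)
    n_train = round(ratios[0] * n_units)
    n_val = round(ratios[1] * n_units)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    # The test set takes whatever remains, so no unit is dropped.
    test = indices[n_train + n_val:]
    return train, val, test


# 3,000 is the simulated dataset size quoted for simulator (2).
train, val, test = split_indices(3000)
print(len(train), len(val), len(test))
```

Assigning the remainder to the test set, rather than computing its size from the third ratio, avoids losing units to rounding when the dataset size is not a multiple of 10.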