Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Effective and Efficient Time-Varying Counterfactual Prediction with State-Space Models
Authors: Haotian Wang, Haoxuan Li, Hao Zou, Haoang Chi, Long Lan, Wanrong Huang, Wenjing Yang
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conducted extensive experiments on both synthetic and real-world datasets, demonstrating that Mamba-CDSP not only outperforms baselines by a large margin, but also exhibits prominent running efficiency. |
| Researcher Affiliation | Academia | 1College of Computer Science and Technology, National University of Defense Technology 2Center for Data Science, Peking University 3Tsinghua University 4Intelligent Game and Decision Lab |
| Pseudocode | No | The paper describes methods and theoretical analyses using mathematical equations, but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any explicit statements about releasing source code, nor does it provide links to a code repository. |
| Open Datasets | Yes | Following common practice in benchmarking for counterfactual inference, all the methods are validated on three datasets, including the synthetic tumor growth data (Geng et al., 2017), the MIMIC-III-based semi-synthetic data (Melnychuk et al., 2022; Schulam & Saria, 2017), the MIMIC-III real-world data (Johnson et al., 2016). ... The M5 Forecasting dataset, as cited in (Huang et al., 2024), comprises daily transaction data from Walmart stores across three U.S. states... |
| Dataset Splits | Yes | For the tumor-growth synthetic dataset, ... for each γ, we simulate 10,000 patients for training, 1,000 for validation, and 1,000 for testing. ... By setting da = 3 and dy = 2, the cohort of 1,000 patients is split into train/validation/test subsets via a ratio of 60% / 20% / 20%. ... The train/validation/test subsets are split with the ratio of 70%/15%/15%. |
| Hardware Specification | Yes | Experiments are carried out on 1 NVIDIA GeForce RTX 3090 GPU |
| Software Dependencies | No | The paper mentions various models and architectures (e.g., Mamba, Transformer, LSTM, RNNs) but does not specify any particular software libraries or tools with their version numbers that were used for implementation. |
| Experiment Setup | Yes | Table 4: Ranges for hyperparameter tuning across experiments. Here, we distinguish (1) data using the tumor growth (TG) simulator (experiments with fully-synthetic data), (2) data from the semi-synthetic benchmark, and (3) real-world MIMIC-III data. EL refers to the embedding layer, and PL refers to the projection layer. Mamba-CDSP hyperparameters (TG simulator / Semi-Synthetic / Real-world): Mamba blocks (B): 1 / 1 / 2; Learning rate (η): {0.0005, 0.001, 0.01} in all settings; Minibatch size: 128 / 64 / 64; De-correlation parameter: 1 / 1 / 1; EL hidden units (d_EL): 32 / 32 / 64; PL hidden units (d_PL): 32 / 32 / 64; Dropout rate (p): 0.1 / 0.1 / 0.1; EMA of model weights: 0.99 / 0.99 / 0.99; Input size: da + dx + dy + dv; Output size: dy. |
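For readers attempting reproduction, the reported Table 4 ranges can be sketched as a configuration dictionary. This is a minimal illustration only: the key names are assumptions introduced here, not identifiers from the paper; the values are copied from the table above.

```python
# Hedged reconstruction of the reported hyperparameter table for Mamba-CDSP.
# Keys are illustrative; the three per-setting entries correspond to the
# TG-simulator, semi-synthetic, and real-world MIMIC-III columns.
MAMBA_CDSP_HPARAMS = {
    "mamba_blocks": {"tg": 1, "semi_synthetic": 1, "real_world": 2},
    "learning_rate_grid": [0.0005, 0.001, 0.01],  # same tuning grid in all settings
    "minibatch_size": {"tg": 128, "semi_synthetic": 64, "real_world": 64},
    "decorrelation_parameter": 1,                 # identical across settings
    "el_hidden_units": {"tg": 32, "semi_synthetic": 32, "real_world": 64},
    "pl_hidden_units": {"tg": 32, "semi_synthetic": 32, "real_world": 64},
    "dropout_rate": 0.1,
    "ema_decay": 0.99,                            # EMA of model weights
}

def hparams_for(setting: str) -> dict:
    """Resolve the per-setting values (grid entries left as lists to tune over)."""
    return {
        k: (v[setting] if isinstance(v, dict) else v)
        for k, v in MAMBA_CDSP_HPARAMS.items()
    }
```

The input and output sizes (da + dx + dy + dv and dy) are left out of the sketch, since they depend on dataset-specific dimensions not fully specified here.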