Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Causal View of Time Series Imputation: Some Identification Results on Missing Mechanism

Authors: Ruichu Cai, Kaitao Zheng, Junxian Huang, Zijian Li, Zhengming Chen, Boyan Xu, Zhifeng Hao

IJCAI 2025 | Venue PDF | LLM Run Details

Reproducibility Variable — Result — LLM Response

Research Type — Experimental
LLM Response: Experimental results show that our method surpasses existing time series imputation techniques across various datasets with different missing mechanisms, demonstrating its effectiveness in real-world applications. Our approach is validated on extensive semi-synthetic datasets covering all missing mechanisms; the experimental results show that our DMM method outperforms the state-of-the-art baselines.

Researcher Affiliation — Academia
LLM Response: (1) School of Computer Science, Guangdong University of Technology, Guangzhou 510006, China; (2) Peng Cheng Laboratory, Shenzhen 518066, China; (3) Mohamed bin Zayed University of Artificial Intelligence, Masdar City, Abu Dhabi; (4) College of Science, Shantou University, Shantou 515063, China

Pseudocode — No
LLM Response: The paper describes the methodology using equations and textual explanations, but it does not contain any clearly labeled pseudocode or algorithm blocks in a structured format.

Open Source Code — Yes
LLM Response: Code: https://github.com/DMIRLAB-Group/DMM

Open Datasets — Yes
LLM Response: To evaluate the performance of the proposed method, we consider the following datasets: 1) ETT [Zhou et al., 2021]: {ETTh1, ETTh2, ETTm1, ETTm2}; 2) Exchange [Lai et al., 2018]; 3) Weather (https://www.bgc-jena.mpg.de/wetter/). For each dataset, we systematically generate mask matrices to simulate missing values under the MAR and MNAR missing mechanisms.

Dataset Splits — No
LLM Response: The paper mentions the use of a 'training set', a 'test set', and different mask ratios (e.g., 0.2, 0.4, and 0.6) for missing values, but it does not specify the exact percentages or methodology for splitting the datasets into training, validation, and test sets for model evaluation.

Hardware Specification — No
LLM Response: The paper does not explicitly describe the hardware used to run its experiments; no specific GPU, CPU, or other computing resources are mentioned.

Software Dependencies — No
LLM Response: The paper describes the use of various neural architectures and models (e.g., variational inference, a normalizing-flow-based neural architecture, multi-layer perceptron networks), but it does not provide version numbers for any software dependencies or libraries (e.g., Python, PyTorch, or TensorFlow).

Experiment Setup — No
LLM Response: The paper states that L_total = L_R + β·L_{z_K} + γ·L_{c_K}, where β and γ are hyperparameters, and mentions repeating experiments three times with random seeds. However, it does not provide specific values for these hyperparameters, learning rates, batch sizes, or other detailed training configurations in the main text, instead referring to an appendix for implementation details.
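The Open Datasets and Dataset Splits entries refer to generating mask matrices under MAR and MNAR mechanisms at various mask ratios, but the report does not reproduce the procedure. Below is a minimal NumPy sketch of one common way to simulate such masks; the function names `mar_mask`/`mnar_mask`, the median/quantile thresholds, and the drop probabilities are illustrative assumptions, not the paper's actual protocol. In the masks, True means observed and False means missing.

```python
import numpy as np

rng = np.random.default_rng(0)

def mar_mask(x, miss_rate=0.2):
    """MAR: missingness depends only on *observed* values.
    Illustrative choice: rows whose first (always-observed) column is
    above its median mask the remaining columns with prob 2*miss_rate,
    giving an overall missing rate of ~miss_rate in those columns."""
    n, d = x.shape
    mask = np.ones((n, d), dtype=bool)          # True = observed
    driver = x[:, 0]                            # driver column stays observed
    p = np.where(driver > np.median(driver), 2 * miss_rate, 0.0)
    for j in range(1, d):
        mask[:, j] = rng.random(n) >= p
    return mask

def mnar_mask(x, miss_rate=0.2):
    """MNAR (self-masking): a value's own magnitude drives its missingness.
    Illustrative choice: values above each column's (1 - 2*miss_rate)
    quantile are dropped with probability 0.5, so ~miss_rate go missing."""
    q = np.quantile(x, 1 - 2 * miss_rate, axis=0)
    high = x > q
    drop = high & (rng.random(x.shape) < 0.5)
    return ~drop

x = rng.normal(size=(1000, 4))
m_mar = mar_mask(x, 0.2)
m_mnar = mnar_mask(x, 0.2)
print((~m_mar).mean(), (~m_mnar).mean())        # fraction missing under each mechanism
```

The distinction matters for identification: the MAR mask can be predicted from fully observed columns alone, while the MNAR mask depends on the very values being deleted, which is what makes the missingness non-ignorable.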