Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

BRITS: Bidirectional Recurrent Imputation for Time Series

Authors: Wei Cao, Dong Wang, Jian Li, Hao Zhou, Lei Li, Yitan Li

NeurIPS 2018 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We evaluate our model on three real-world datasets, including an air quality dataset, a healthcare dataset, and a localization dataset for human activity. Experiments show that our model outperforms the state-of-the-art methods in both imputation and classification/regression."
Researcher Affiliation | Collaboration | Wei Cao (Tsinghua University; Bytedance AI Lab), Dong Wang (Duke University), Jian Li (Tsinghua University), Hao Zhou (Bytedance AI Lab), Yitan Li (Bytedance AI Lab), Lei Li (Bytedance AI Lab)
Pseudocode | No | The paper describes its algorithms using equations and text but contains no structured pseudocode block or any section explicitly labeled "Pseudocode" or "Algorithm".
Open Source Code | Yes | "The download links of the datasets, as well as the implementation codes can be found in the GitHub page." https://github.com/caow13/BRITS
Open Datasets | Yes | "We evaluate our models on the air quality dataset, which consists of PM2.5 measurements from 36 monitoring stations in Beijing." "We evaluate our models on health-care data in PhysioNet Challenge 2012 [27]." "The UCI localization data for human activity [18]"
Dataset Splits | Yes | "For the imputation tasks, we randomly select 10% of non-missing values as the validation data. The early stopping is thus performed based on the validation error. For the classification tasks, we first pre-train the model as a pure imputation task and report its imputation accuracy. Then we use 5-fold cross validation to further optimize both the imputation and classification losses simultaneously."
Hardware Specification | Yes | "All models are trained with a GTX 1080 GPU."
Software Dependencies | No | The paper mentions PyTorch and the Python package fancyimpute but does not provide version numbers for these software dependencies.
Experiment Setup | Yes | "To make a fair comparison, we control the number of parameters of all models as around 80,000. We train our model by an Adam optimizer with learning rate 0.001 and batch size 64. For all the tasks, we normalize the numerical values to have zero mean and unit variance for stable training."
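The validation protocol quoted under "Dataset Splits" (randomly holding out 10% of the non-missing values for early stopping) can be sketched as follows. This is an illustrative NumPy implementation, not the authors' code; the function name `make_validation_mask` and the mask convention (1 = observed, 0 = missing) are assumptions for the example.

```python
import numpy as np

def make_validation_mask(observed_mask, frac=0.10, seed=0):
    """Hold out a random fraction of observed entries for validation.

    observed_mask: array with 1 where a value is present, 0 where missing
    (an assumed convention). Returns (train_mask, val_mask): the held-out
    entries are hidden from the model and used only to compute the
    validation imputation error for early stopping.
    """
    rng = np.random.default_rng(seed)
    obs_idx = np.flatnonzero(observed_mask)          # flat indices of observed values
    n_val = int(len(obs_idx) * frac)                 # 10% of the non-missing entries
    val_idx = rng.choice(obs_idx, size=n_val, replace=False)
    val_mask = np.zeros_like(observed_mask)
    val_mask.flat[val_idx] = 1
    train_mask = observed_mask - val_mask            # remaining observed entries
    return train_mask, val_mask
```

For the classification tasks, this masking would be applied only in the imputation pre-training stage; the subsequent 5-fold cross-validation operates on whole sequences rather than individual values.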
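The normalization step quoted under "Experiment Setup" (zero mean, unit variance per feature) can be sketched as below. This is a hedged NumPy illustration, assuming missing values are encoded as NaN so that statistics are computed over observed entries only; the function name `standardize` and the `eps` guard are choices made for this example, and the accompanying hyperparameters are the ones reported in the paper.

```python
import numpy as np

# Reported training configuration (from the paper's experiment setup):
# Adam optimizer, learning rate 0.001, batch size 64, ~80,000 parameters per model.
TRAIN_CONFIG = {"optimizer": "Adam", "lr": 1e-3, "batch_size": 64, "param_budget": 80_000}

def standardize(x, eps=1e-8):
    """Normalize each feature (column) to zero mean and unit variance,
    computing statistics over observed (non-NaN) entries only."""
    mean = np.nanmean(x, axis=0)
    std = np.nanstd(x, axis=0)
    z = (x - mean) / (std + eps)    # eps avoids division by zero for constant features
    return z, mean, std             # keep mean/std to invert the transform at test time
```

Returning `mean` and `std` matters in practice: imputed values produced in normalized space must be mapped back via `z * std + mean` before computing errors in the original units.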