Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
BRITS: Bidirectional Recurrent Imputation for Time Series
Authors: Wei Cao, Dong Wang, Jian Li, Hao Zhou, Lei Li, Yitan Li
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our model on three real-world datasets, including an air quality dataset, a healthcare dataset, and a localization dataset for human activity. Experiments show that our model outperforms the state-of-the-art methods in both imputation and classification/regression. |
| Researcher Affiliation | Collaboration | Wei Cao (Tsinghua University, Bytedance AI Lab); Dong Wang (Duke University); Jian Li (Tsinghua University); Hao Zhou (Bytedance AI Lab); Yitan Li (Bytedance AI Lab); Lei Li (Bytedance AI Lab) |
| Pseudocode | No | The paper describes algorithms using equations and text but does not contain a structured pseudocode block or a section explicitly labeled 'Pseudocode' or 'Algorithm'. |
| Open Source Code | Yes | The download links for the datasets, as well as the implementation code, can be found on the GitHub page: https://github.com/caow13/BRITS |
| Open Datasets | Yes | We evaluate our models on the air quality dataset, which consists of PM2.5 measurements from 36 monitoring stations in Beijing. We evaluate our models on health-care data from the PhysioNet Challenge 2012 [27]. The UCI localization data for human activity [18] |
| Dataset Splits | Yes | For the imputation tasks, we randomly select 10% of non-missing values as the validation data. The early stopping is thus performed based on the validation error. For the classification tasks, we first pre-train the model as a pure imputation task and report its imputation accuracy. Then we use 5-fold cross validation to further optimize both the imputation and classification losses simultaneously. |
| Hardware Specification | Yes | All models are trained with GPU GTX 1080. |
| Software Dependencies | No | The paper mentions 'PyTorch' and the 'python package fancyimpute' but does not provide specific version numbers for these software dependencies. |
| Experiment Setup | Yes | To make a fair comparison, we control the number of parameters of all models to be around 80,000. We train our model with an Adam optimizer with learning rate 0.001 and batch size 64. For all the tasks, we normalize the numerical values to have zero mean and unit variance for stable training. |
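The validation protocol quoted under "Dataset Splits" (randomly holding out 10% of non-missing values for early stopping) can be sketched as follows. This is an illustrative reconstruction, not the authors' code; the function name `make_validation_mask` and the `seed` parameter are assumptions.

```python
import numpy as np

def make_validation_mask(values, missing_mask, frac=0.10, seed=0):
    """Hold out a fraction of the *observed* entries as validation
    targets, mimicking the paper's 10% random hold-out.
    `missing_mask` is 1 where a value is observed, 0 where missing."""
    rng = np.random.default_rng(seed)
    observed = np.flatnonzero(np.asarray(missing_mask).ravel())
    n_val = int(len(observed) * frac)
    val_idx = rng.choice(observed, size=n_val, replace=False)
    val_mask = np.zeros(values.size, dtype=bool)
    val_mask[val_idx] = True
    val_mask = val_mask.reshape(values.shape)
    # The model only sees entries in train_mask; held-out entries are
    # scored against their true values to compute the validation error.
    train_mask = np.asarray(missing_mask, dtype=bool) & ~val_mask
    return train_mask, val_mask
```

Early stopping would then monitor the imputation error on the `val_mask` entries after each epoch.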
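The preprocessing step quoted under "Experiment Setup" (zero-mean, unit-variance normalization of numerical values) is standard and can be sketched as below. The function name `normalize_features` and the `eps` guard are illustrative assumptions; the reported optimizer settings are shown as comments in PyTorch-style pseudocode rather than executable code.

```python
import numpy as np

def normalize_features(x, eps=1e-8):
    """Per-feature zero-mean / unit-variance normalization, applied to
    numerical values before training. x: shape (n_samples, n_features).
    Returns the normalized data plus the statistics needed to invert it."""
    mean = x.mean(axis=0)
    std = x.std(axis=0)
    return (x - mean) / (std + eps), mean, std

# The reported training settings would translate to something like
# (illustrative, not the authors' code):
#   optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
#   loader = DataLoader(dataset, batch_size=64, shuffle=True)
```

Keeping `mean` and `std` allows imputed values to be mapped back to the original scale when reporting errors.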