Retaining Knowledge for Learning with Dynamic Definition
Authors: Zichang Liu, Benjamin Coleman, Tianyi Zhang, Anshumali Shrivastava
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In Section 5, we apply RIDDLE in the dynamic definition setting for four real-world tasks. Our method outperforms baselines by up to 30% on the original dataset while achieving competitive accuracy on the new dataset. |
| Researcher Affiliation | Collaboration | Zichang Liu, Rice University, Houston, TX 77025, zichangliu@rice.edu; Benjamin Coleman, Rice University, Houston, TX 77025, brc7@rice.edu; Tianyi Zhang, Rice University, Houston, TX 77025, tz21@rice.edu; Anshumali Shrivastava, Rice University / Third AI Corp., Houston, TX 77025, anshumali@rice.edu |
| Pseudocode | Yes | Algorithm 1 RIDDLE Training. Input: Training Dataset D, \|D\| = m, number of rows L, number of cells R, learning rate η, error function E(·), number of epochs e, batch size b, random seed s. Output: Trained Model S, Counters C. (A hedged sketch of this interface appears after the table.) |
| Open Source Code | Yes | Code is available at https://github.com/lzcemma/RIDDLE. |
| Open Datasets | Yes | We consider four datasets that reflect dynamic distributions. Every dataset is organized into two parts: the original dataset Do and the update dataset Du. MNIST Binary, CIFAR10, ImageNet, News. The original task for the News dataset [43] is to predict the news topics given a caption and short description. |
| Dataset Splits | No | The paper describes how the original (Do) and update (Du) datasets are constructed based on class definitions, but it does not provide specific numerical train/validation/test dataset splits (e.g., percentages or sample counts) for reproducibility. |
| Hardware Specification | Yes | All experiments are conducted on a machine with 96 logical processors (24-core/2-thread/2-socket Intel Xeon(R) Gold 5220R @ 2.20GHz) and 8 Nvidia V100 32GB GPUs. |
| Software Dependencies | No | The paper mentions using pretrained models and (implicitly) standard deep learning frameworks, but it does not name any software with version numbers needed for reproducibility (e.g., Python, PyTorch, or TensorFlow versions). |
| Experiment Setup | No | While Algorithm 1 mentions 'learning rate' and 'batch size', the paper does not provide specific numerical values for hyperparameters or other training configurations (e.g., optimizer type, number of epochs) in the main text. |
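The Pseudocode row above only reproduces the input/output signature of Algorithm 1, not its body. Purely as an illustration of how that signature could map onto a mini-batch training routine, the sketch below wires the listed inputs (dataset D with \|D\| = m, rows L, cells R, learning rate, error function E, epochs e, batch size b, seed s) into a generic loop that returns a model S and an L×R counter table C. This is a minimal sketch under assumed conventions: the function names (`riddle_train`, `error_fn`), the placeholder linear model, and the counter-update rule are hypothetical and are not taken from the paper or the released code.

```python
import numpy as np

def riddle_train(D, L, R, lr, error_fn, epochs, batch_size, seed):
    """Hypothetical training skeleton mirroring Algorithm 1's interface:
    it consumes the listed inputs and returns (model S, counters C).
    The model and update rules below are placeholders, not RIDDLE itself."""
    rng = np.random.default_rng(seed)           # random seed s
    X, y = D                                    # training dataset D, |D| = m
    m, d = X.shape

    S = rng.normal(scale=0.01, size=d)          # placeholder model parameters S
    C = np.zeros((L, R), dtype=np.int64)        # L x R counter table C

    for _ in range(epochs):                     # number of epochs e
        order = rng.permutation(m)
        for start in range(0, m, batch_size):   # batch size b
            idx = order[start:start + batch_size]
            preds = X[idx] @ S                  # placeholder forward pass
            batch_err = error_fn(preds, y[idx])  # error function E(.), could be logged
            grad = X[idx].T @ (preds - y[idx]) / len(idx)
            S -= lr * grad                      # gradient step with learning rate
            # Placeholder counter update: bucket each example into one cell per row.
            rows = idx % L
            cells = (np.abs(X[idx]).sum(axis=1) * 1000).astype(int) % R
            np.add.at(C, (rows, cells), 1)
    return S, C

# Toy usage with a hypothetical squared-error function:
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(256, 16))
    y = X @ rng.normal(size=16)
    mse = lambda p, t: float(np.mean((p - t) ** 2))
    S, C = riddle_train((X, y), L=4, R=64, lr=0.05,
                        error_fn=mse, epochs=3, batch_size=32, seed=1)
    print(S.shape, C.sum())
```

The sketch only demonstrates how the inputs named in the Pseudocode row (L, R, learning rate, E, e, b, s) would thread through a training loop; the actual RIDDLE update and the role of the counters should be taken from Algorithm 1 in the paper and the repository linked above.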