Retaining Knowledge for Learning with Dynamic Definition

Authors: Zichang Liu, Benjamin Coleman, Tianyi Zhang, Anshumali Shrivastava

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In Section 5, we apply RIDDLE in the dynamic definition setting for four real-world tasks. Our method outperforms baselines by up to 30% on the original dataset while achieving competitive accuracy on the new dataset.
Researcher Affiliation | Collaboration | Zichang Liu, Rice University, Houston, TX 77025, zichangliu@rice.edu; Benjamin Coleman, Rice University, Houston, TX 77025, brc7@rice.edu; Tianyi Zhang, Rice University, Houston, TX 77025, tz21@rice.edu; Anshumali Shrivastava, Rice University / ThirdAI Corp., Houston, TX 77025, anshumali@rice.edu
Pseudocode | Yes | Algorithm 1 (RIDDLE Training). Input: training dataset D with |D| = m, number of rows L, number of cells R, learning rate, error function E(·), number of epochs e, batch size b, random seed s. Output: trained model S, counters C. (See the training-loop skeleton after the table.)
Open Source Code | Yes | Code is available at https://github.com/lzcemma/RIDDLE.
Open Datasets | Yes | We consider four datasets that reflect dynamic distributions. Every dataset is organized into two parts: the original dataset Do and the update dataset Du. The four datasets are MNIST Binary, CIFAR10, ImageNet, and News. The original task for the News dataset [43] is to predict the news topic given a caption and a short description. (See the illustrative Do/Du split after the table.)
Dataset Splits | No | The paper describes how the original (Do) and update (Du) datasets are constructed from class definitions, but it does not provide numerical train/validation/test splits (e.g., percentages or sample counts) for reproducibility.
Hardware Specification | Yes | All experiments are conducted on a machine with 96 logical processors (2 sockets × 24 cores × 2 threads per core, Intel Xeon Gold 5220R @ 2.20 GHz) and 8 NVIDIA V100 32 GB GPUs.
Software Dependencies | No | The paper mentions pretrained models and, implicitly, standard deep learning frameworks, but it does not name any software with version numbers needed for reproducibility (e.g., Python, PyTorch, or TensorFlow versions). (See the environment-logging snippet after the table.)
Experiment Setup | No | While Algorithm 1 lists a learning rate and a batch size among its inputs, the paper does not report numerical values for these hyperparameters or other training settings (e.g., optimizer type, number of epochs) in the main text. (See the placeholder configuration after the table.)
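
As a reading aid for Algorithm 1's signature, the following is a minimal Python skeleton showing how the listed inputs and outputs could fit together. The L × R table-structured model, the per-row hashing, and the use of error_fn as a gradient stand-in are illustrative assumptions, not the authors' implementation; the real update rule is in the paper and the linked repository.

```python
import random

def train_riddle(dataset, L, R, lr, error_fn, epochs, batch_size, seed):
    """Hypothetical skeleton matching Algorithm 1's signature; returns (S, C)."""
    random.seed(seed)
    # Stand-in model: an L x R weight table, with parallel usage counters.
    S = [[0.0] * R for _ in range(L)]
    C = [[0] * R for _ in range(L)]

    for _ in range(epochs):
        data = list(dataset)
        random.shuffle(data)
        for start in range(0, len(data), batch_size):
            for x, y in data[start:start + batch_size]:
                # Assumed structure: hash each example to one cell per row,
                # as in a count-min-style sketch.
                for row in range(L):
                    cell = hash((row, x)) % R
                    C[row][cell] += 1  # count how often this cell is touched
                    # error_fn stands in for the gradient of E(.) with respect
                    # to the cell's weight; the real update is in the paper.
                    S[row][cell] -= lr * error_fn(S[row][cell], y)
    return S, C

# Hypothetical usage with a squared-error gradient on scalar labels:
# S, C = train_riddle([(1.0, 0.0), (2.0, 1.0)], L=4, R=16, lr=0.1,
#                     error_fn=lambda pred, y: 2 * (pred - y),
#                     epochs=3, batch_size=2, seed=42)
```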
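
To make the Do/Du organization concrete, here is a hypothetical sketch of splitting one dataset under two class definitions. The split function, the 20% update fraction, and the MNIST Binary label maps in the usage comment are assumptions; the paper constructs Do and Du per dataset.

```python
import random

def split_by_definition(dataset, original_label, update_label,
                        update_fraction=0.2, seed=0):
    """Build (Do, Du) from one dataset and two class definitions.

    The random split and 20% update fraction are illustrative
    assumptions, not the paper's recipe.
    """
    rng = random.Random(seed)
    data = list(dataset)
    rng.shuffle(data)
    cut = int(len(data) * (1 - update_fraction))
    Do = [(x, original_label(y)) for x, y in data[:cut]]  # original definition
    Du = [(x, update_label(y)) for x, y in data[cut:]]    # updated definition
    return Do, Du

# Hypothetical MNIST Binary example: the original task separates digits
# 0-4 from 5-9, and the update redefines the classes as even vs. odd.
# Do, Du = split_by_definition(mnist, lambda d: int(d >= 5), lambda d: d % 2)
```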
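
Because no versions are pinned, reproducers may want to record their own environment. A minimal snippet, assuming PyTorch is the (unnamed) framework in use:

```python
import sys

# Log the versions actually used, since the paper pins none.
print("python:", sys.version.split()[0])
try:
    import torch  # assumed framework; the paper names none
    print("torch :", torch.__version__)
    print("cuda  :", torch.version.cuda)
except ImportError:
    print("torch : not installed")
```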
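
Correspondingly, anyone reproducing the experiments must choose the training settings themselves. A placeholder configuration, with every value a guess rather than a reported setting:

```python
# Every value below is a placeholder to be replaced; the paper reports
# none of them, so these numbers are guesses, not the authors' settings.
config = {
    "optimizer": "sgd",     # unreported
    "learning_rate": 1e-3,  # appears in Algorithm 1 without a value
    "batch_size": 128,      # appears in Algorithm 1 without a value
    "epochs": 10,           # unreported
    "seed": 0,              # Algorithm 1 takes a random seed s
}
```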