Human-Guided Complexity-Controlled Abstractions

Authors: Andi Peng, Mycal Tucker, Eoin Kenny, Noga Zaslavsky, Pulkit Agrawal, Julie A. Shah

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In finetuning experiments, using only a small number of labeled examples for a new task, we show that (1) tuning the representation to a task-appropriate complexity level supports the highest finetuning performance, and (2) in a human-participant study, users were able to identify the appropriate complexity level for a downstream task using visualizations of discrete representations."
Researcher Affiliation | Academia | Andi Peng (MIT), Mycal Tucker (MIT), Eoin M. Kenny (MIT), Noga Zaslavsky (UC Irvine), Pulkit Agrawal (MIT), Julie A. Shah (MIT)
Pseudocode | No | The paper describes its methods in prose and mathematical equations but does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | "Code available at github.com/mycal-tucker/human-guided-abstractions."
Open Datasets | Yes | "We trained agents on three image classification datasets: Fashion MNIST [33], CIFAR100 [14], and iNaturalist 2019 (iNat) [10]."
Dataset Splits | Yes | "In finetuning, we held out a subset of the data for assessing finetuning accuracy and selected the best-performing encoder via validation set accuracy. For a given k and v, we randomly sampled v datapoints per class label to be part of the validation set and used the remaining data for finetuning." (See the per-class split sketch below the table.)
Hardware Specification | Yes | "Pre-training a single model for 200 epochs took approximately 5 minutes on a desktop computer with one NVIDIA GeForce RTX 2080. Pre-training a single model for 400 epochs took approximately 10 minutes on a desktop computer with one NVIDIA 3080."
Software Dependencies | No | The paper mentions software components such as the 'Adam optimizer', 'feedforward networks', 'ReLU activation', and 'ResNet18', but does not provide specific version numbers for these or other software dependencies.
Experiment Setup | Yes | "Predictor neural networks were instantiated as feedforward networks with 4 fully connected layers, with hidden dimension 256 and ReLU activations, and trained to map from shifted encodings to classifications for 100 epochs using an Adam optimizer with default parameters, with the learning rate decreasing by a factor of 10 based on plateauing training loss, with a patience of 5 epochs, and early stopping if the learning rate fell below 10^-8." (Appendix Tables 1, 2, and 3 also provide detailed hyperparameters such as batch size, epochs, and various lambda values for different encoders and domains. See the predictor training sketch below the table.)
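
The Dataset Splits entry describes a per-class hold-out: for a chosen v, sample v datapoints per class label for the validation set and keep the remaining data for finetuning. A minimal sketch of that procedure, assuming label-indexed NumPy arrays; the function name and arguments are illustrative and not taken from the released code:

```python
import numpy as np

def per_class_split(features, labels, v, seed=0):
    """Hold out v examples per class label for validation;
    keep the rest for finetuning (illustrative sketch only)."""
    rng = np.random.default_rng(seed)
    val_idx = []
    for c in np.unique(labels):
        class_idx = np.flatnonzero(labels == c)
        # Randomly sample v datapoints of this class for the validation set.
        val_idx.extend(rng.choice(class_idx, size=v, replace=False))
    val_idx = np.asarray(val_idx)
    finetune_mask = np.ones(len(labels), dtype=bool)
    finetune_mask[val_idx] = False
    return (features[finetune_mask], labels[finetune_mask],  # finetuning data
            features[val_idx], labels[val_idx])              # validation data
```

For example, per_class_split(X, y, v=5) would hold out 5 examples of every class for validation, matching the quoted procedure.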
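
The Experiment Setup entry fully specifies the predictor networks and their training schedule: 4 fully connected layers, hidden dimension 256, ReLU activations, Adam with default parameters, a 10x learning-rate reduction when training loss plateaus (patience 5), and early stopping once the learning rate falls below 10^-8. A minimal PyTorch sketch under those stated hyperparameters; the encoding dimension, class count, and data loader are placeholders, and this is not the authors' released training code:

```python
import torch
import torch.nn as nn

ENC_DIM, NUM_CLASSES, HIDDEN = 32, 100, 256  # placeholder input/output sizes

# 4 fully connected layers, hidden dimension 256, ReLU activations.
predictor = nn.Sequential(
    nn.Linear(ENC_DIM, HIDDEN), nn.ReLU(),
    nn.Linear(HIDDEN, HIDDEN), nn.ReLU(),
    nn.Linear(HIDDEN, HIDDEN), nn.ReLU(),
    nn.Linear(HIDDEN, NUM_CLASSES),
)

optimizer = torch.optim.Adam(predictor.parameters())  # default parameters
# Reduce the learning rate by a factor of 10 when training loss plateaus.
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.1, patience=5)
criterion = nn.CrossEntropyLoss()

def train(loader, epochs=100):
    for _ in range(epochs):
        epoch_loss = 0.0
        for encodings, targets in loader:  # shifted encodings -> class labels
            optimizer.zero_grad()
            loss = criterion(predictor(encodings), targets)
            loss.backward()
            optimizer.step()
            epoch_loss += loss.item()
        scheduler.step(epoch_loss)
        # Early stopping once the learning rate falls below 10^-8.
        if optimizer.param_groups[0]["lr"] < 1e-8:
            break
```

Note that, following the quoted setup, the plateau scheduler monitors training loss rather than validation loss; the validation set is used only to select the best-performing encoder.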