Distinguishing rule and exemplar-based generalization in learning systems

Authors: Ishita Dasgupta, Erin Grant, Tom Griffiths

ICML 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We find that standard neural network models are feature-biased and exemplar-based, and discuss the implications of these findings for machine learning research on systematic generalization, fairness, and data augmentation." and "To illustrate our framework in a simple statistical learning problem and quantitatively confirm the intuitions outlined in Section 2, we consider a two-dimensional classification problem." (See the data sketch after the table.)
Researcher Affiliation | Collaboration | Departments of Psychology & Computer Science, Princeton University; Department of Electrical Engineering & Computer Sciences, UC Berkeley; now at DeepMind.
Pseudocode | No | The paper does not contain any sections or figures explicitly labeled "Pseudocode" or "Algorithm", nor does it present structured steps formatted like code.
Open Source Code | Yes | Code at https://github.com/eringrant/icml-2022-rules-vs-exemplars.
Open Datasets | Yes | "sentiment analysis on the Internet Movie Database Movie Reviews (IMDb) dataset (Maas et al., 2011)" and "CelebFaces Attributes (CelebA) (Liu et al., 2015)". (See the loading sketch after the table.)
Dataset Splits | Yes | "The upper-right quadrant in all subfigures of Fig. 3, for which p(z_disc = 1, z_dist = 1) = 1, acts as a hold-out set on which we can evaluate generalization to an unseen combination of attribute values." and "We limit our analyses to networks that achieve at least 75% validation accuracy (on held-out samples from its own training distribution) to ensure that, despite differences in data variability across training conditions, all models learn a meaningful decision boundary." (Both appear in the sketches after the table.)
Hardware Specification | No | The paper describes the models trained (e.g., LSTM, ResNets) but does not provide specific details on the hardware used for running the experiments, such as GPU/CPU models, memory amounts, or cloud instance types.
Software Dependencies | No | The paper mentions using the "scikit-learn implementation" and "standard hyperparameters" but does not provide specific version numbers for these or any other software dependencies, such as deep learning frameworks (e.g., PyTorch, TensorFlow) or CUDA.
Experiment Setup | Yes | "We train feedforward rectified linear unit (ReLU) classifiers with varying numbers of hidden layers and hidden units. We use the scikit-learn implementation with default parameters, run 20 times for confidence intervals." and "We train a single-layer LSTM (20 hidden units; default hyperparameters) on each condition" and "We train ResNets of various depths ({10, 18, 34}) and widths ({2, 4, 8, 16, 32, 64}) on 6 different choices for feature pairs, with standard hyperparameters." (See the training sketch after the table.)
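
To make the Research Type and Dataset Splits evidence concrete: below is a minimal sketch of the two-dimensional setup the paper describes, in which one combination of binary attribute values (z_disc = 1, z_dist = 1) never appears in training and serves as the hold-out set. The sampling scheme, jitter scale, and labeling rule here are assumptions for illustration, not the authors' code.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample(n, quadrants):
    """Sample n points whose binary attributes (z_disc, z_dist) are drawn
    uniformly from the allowed quadrants, with Gaussian jitter on the inputs."""
    idx = rng.integers(len(quadrants), size=n)
    z = np.array(quadrants)[idx]                # (n, 2) binary attribute values
    x = z + rng.normal(scale=0.1, size=(n, 2))  # jittered 2-D inputs
    y = z[:, 0]                                 # assumed rule: label = z_disc
    return x, y

# Training distribution covers three quadrants; (1, 1) is never seen.
train_quadrants = [(0, 0), (0, 1), (1, 0)]
x_train, y_train = sample(1000, train_quadrants)

# Hold-out set: the unseen attribute combination, p(z_disc = 1, z_dist = 1) = 1.
x_holdout, y_holdout = sample(200, [(1, 1)])
```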
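
For the Open Datasets row, both datasets are publicly available. A hedged loading sketch, assuming the standard Keras and torchvision loaders (the paper does not say which loaders were used):

```python
# IMDb movie reviews (Maas et al., 2011), via the Keras dataset loader.
from tensorflow.keras.datasets import imdb
(x_imdb_tr, y_imdb_tr), (x_imdb_te, y_imdb_te) = imdb.load_data(num_words=10000)

# CelebA face attributes (Liu et al., 2015), via torchvision.
from torchvision.datasets import CelebA
celeba_train = CelebA(root="data", split="train", target_type="attr", download=True)
```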
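
And for the Experiment Setup row, continuing from the 2-D data sketch above: a minimal sketch of training scikit-learn ReLU classifiers with default parameters over 20 runs, keeping only runs that clear the paper's 75% validation-accuracy bar. The validation split fraction is an assumption; the paper does not specify it.

```python
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

kept = []
for seed in range(20):  # 20 runs for confidence intervals
    x_tr, x_val, y_tr, y_val = train_test_split(
        x_train, y_train, test_size=0.2, random_state=seed)  # split fraction assumed
    clf = MLPClassifier(activation="relu", random_state=seed).fit(x_tr, y_tr)
    if clf.score(x_val, y_val) >= 0.75:  # the paper's validation-accuracy filter
        kept.append(clf.score(x_holdout, y_holdout))  # unseen-quadrant accuracy

print(f"{len(kept)}/20 models kept; mean hold-out accuracy: "
      f"{sum(kept) / max(len(kept), 1):.3f}")
```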