Distinguishing rule and exemplar-based generalization in learning systems
Authors: Ishita Dasgupta, Erin Grant, Tom Griffiths
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We find that standard neural network models are feature-biased and exemplar-based, and discuss the implications of these findings for machine learning research on systematic generalization, fairness, and data augmentation. and To illustrate our framework in a simple statistical learning problem and quantitatively confirm the intuitions outlined in Section 2, we consider a two-dimensional classification problem. |
| Researcher Affiliation | Collaboration | 1 Departments of Psychology & Computer Science, Princeton University; 2 Department of Electrical Engineering & Computer Sciences, UC Berkeley. Now at DeepMind. |
| Pseudocode | No | The paper does not contain any sections or figures explicitly labeled 'Pseudocode' or 'Algorithm', nor does it present structured steps formatted like code. |
| Open Source Code | Yes | Code at https://github.com/eringrant/icml-2022-rules-vs-exemplars. |
| Open Datasets | Yes | sentiment analysis on the Internet Movie Database Movie Reviews (IMDb) dataset (Maas et al., 2011). and CelebFaces Attributes (CelebA) (Liu et al., 2015). |
| Dataset Splits | Yes | The upper-right quadrant in all subfigures of Fig. 3, for which p(zdisc = 1, zdist = 1) = 1, acts as a hold-out set on which we can evaluate generalization to an unseen combination of attribute values. and We limit our analyses to networks that achieve at least 75% validation accuracy (on held-out samples from its own training distribution) to ensure that, despite differences in data variability across training conditions, all models learn a meaningful decision boundary. |
| Hardware Specification | No | The paper describes the models trained (e.g., LSTM, ResNets) but does not provide specific details on the hardware used for running the experiments, such as GPU/CPU models, memory amounts, or cloud instance types. |
| Software Dependencies | No | The paper mentions using 'scikit-learn implementation' and 'standard hyperparameters' but does not provide specific version numbers for these or any other software dependencies like deep learning frameworks (e.g., PyTorch, TensorFlow) or CUDA. |
| Experiment Setup | Yes | We train feedforward rectified linear unit (ReLU) classifiers with varying numbers of hidden layers and hidden units. We use the scikit-learn implementation with default parameters, run 20 times for confidence intervals. and We train a single layer LSTM (20 hidden units; default hyperparameters) on each condition and We train ResNets of various depths ({10, 18, 34}) and widths ({2, 4, 8, 16, 32, 64}) on 6 different choices for feature pairs, with standard hyperparameters. |
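
The Dataset Splits row quotes a hold-out construction in which the attribute combination (zdisc = 1, zdist = 1) never appears during training. A minimal sketch of that construction, assuming a synthetic two-dimensional setup; the sampling scheme, point counts, and noise level here are illustrative assumptions, not the authors' code:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_points(z_disc, z_dist, n, noise=0.1):
    """Sample n 2-D points clustered around the quadrant indexed by (z_disc, z_dist)."""
    center = np.array([z_disc, z_dist], dtype=float)
    return center + noise * rng.standard_normal((n, 2))

# Training distribution: the combination (z_disc=1, z_dist=1) never occurs,
# so the upper-right quadrant is held out entirely.
train_x = np.concatenate([
    sample_points(0, 0, 200),
    sample_points(0, 1, 200),
    sample_points(1, 0, 200),
])
train_y = np.array([0] * 400 + [1] * 200)  # label is determined by z_disc alone

# Evaluation set: the unseen combination of attribute values.
test_x = sample_points(1, 1, 200)
# A "rule-based" learner keyed on z_disc predicts label 1 on all of test_x;
# an "exemplar-based" learner may instead side with nearby training points.
```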
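The Experiment Setup row describes feedforward ReLU classifiers trained with the scikit-learn implementation, default parameters, and 20 repeated runs for confidence intervals. A minimal sketch of that protocol, assuming scikit-learn's MLPClassifier and reusing the synthetic data above; the layer sizes are an illustrative choice, not the paper's exact configuration:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# Reuses train_x, train_y, test_x from the sketch above.
scores = []
for seed in range(20):  # 20 runs for confidence intervals
    clf = MLPClassifier(
        hidden_layer_sizes=(32, 32),  # the paper varies depth and width
        activation="relu",
        random_state=seed,
    )
    clf.fit(train_x, train_y)
    # Fraction of hold-out-quadrant points classified by the z_disc "rule".
    scores.append(clf.predict(test_x).mean())

mean = np.mean(scores)
sem = np.std(scores, ddof=1) / np.sqrt(len(scores))
print(f"rule-consistent predictions on hold-out quadrant: {mean:.2f} +/- {sem:.2f}")
```

The paper additionally restricts analysis to networks reaching at least 75% validation accuracy on held-out samples from their own training distribution; that filter would require a validation split, which this sketch omits for brevity.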