Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Distinguishing rule and exemplar-based generalization in learning systems

Authors: Ishita Dasgupta, Erin Grant, Tom Griffiths

ICML 2022 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We find that standard neural network models are feature-biased and exemplar-based, and discuss the implications of these findings for machine learning research on systematic generalization, fairness, and data augmentation." and "To illustrate our framework in a simple statistical learning problem and quantitatively confirm the intuitions outlined in Section (2), we consider a two-dimensional classification problem."
Researcher Affiliation | Collaboration | 1) Departments of Psychology & Computer Science, Princeton University; 2) Department of Electrical Engineering & Computer Sciences, UC Berkeley. Now at DeepMind.
Pseudocode | No | The paper does not contain any sections or figures explicitly labeled 'Pseudocode' or 'Algorithm', nor does it present structured steps formatted like code.
Open Source Code | Yes | Code is available at https://github.com/eringrant/icml-2022-rules-vs-exemplars.
Open Datasets | Yes | "sentiment analysis on the Internet Movie Database Movie Reviews (IMDb) dataset (Maas et al., 2011)" and "CelebFaces Attributes (CelebA) (Liu et al., 2015)".
Dataset Splits | Yes | "The upper-right quadrant in all subfigures of Fig. (3), for which p(z_disc = 1, z_dist = 1) = 1, acts as a hold-out set on which we can evaluate generalization to an unseen combination of attribute values." and "We limit our analyses to networks that achieve at least 75% validation accuracy (on held-out samples from its own training distribution) to ensure that, despite differences in data variability across training conditions, all models learn a meaningful decision boundary."
Hardware Specification | No | The paper describes the models trained (e.g., LSTM, ResNets) but does not provide specific details on the hardware used for running the experiments, such as GPU/CPU models, memory amounts, or cloud instance types.
Software Dependencies | No | The paper mentions using the 'scikit-learn implementation' and 'standard hyperparameters' but does not provide specific version numbers for these or any other software dependencies such as deep learning frameworks (e.g., PyTorch, TensorFlow) or CUDA.
Experiment Setup | Yes | "We train feedforward rectified linear unit (ReLU) classifiers with varying numbers of hidden layers and hidden units. We use the scikit-learn implementation with default parameters, run 20 times for confidence intervals." and "We train a single layer LSTM (20 hidden units; default hyperparameters) on each condition" and "We train ResNets of various depths ({10, 18, 34}) and widths ({2, 4, 8, 16, 32, 64}) on 6 different choices for feature pairs, with standard hyperparameters."
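The quoted setup (a two-dimensional classification problem with a held-out attribute combination, scikit-learn ReLU classifiers with default parameters, 20 repeated runs for confidence intervals) can be sketched as follows. This is a minimal illustration, not the paper's actual code: the quadrant geometry, jitter scale, sample counts, and hidden-layer size are assumptions for the sketch.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

def sample_quadrant(z_disc, z_dist, n):
    # Place points in the quadrant indexed by two binary attributes,
    # with Gaussian jitter (assumed geometry, not from the paper).
    center = np.array([2.0 * z_disc - 1.0, 2.0 * z_dist - 1.0])
    return center + 0.3 * rng.standard_normal((n, 2))

# Training data: three quadrants; the label is the discriminative
# attribute z_disc. The (z_disc=1, z_dist=1) quadrant is withheld.
X_train = np.vstack([sample_quadrant(zc, zd, 100)
                     for zc, zd in [(0, 0), (0, 1), (1, 0)]])
y_train = np.repeat([0, 0, 1], 100)

# Hold-out set: the unseen attribute combination (1, 1), used to
# evaluate generalization, as in the quoted split description.
X_hold = sample_quadrant(1, 1, 100)

# Repeat training with default-style hyperparameters across seeds to
# obtain confidence intervals (20 runs, as quoted above).
holdout_scores = []
for seed in range(20):
    clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500,
                        random_state=seed)
    clf.fit(X_train, y_train)
    # Fraction of hold-out points classified by z_disc alone; a purely
    # rule-based learner would extrapolate this to the unseen quadrant.
    holdout_scores.append(clf.predict(X_hold).mean())

mean = np.mean(holdout_scores)
ci = 1.96 * np.std(holdout_scores) / np.sqrt(len(holdout_scores))
print(f"hold-out rule-consistency: {mean:.2f} +/- {ci:.2f}")
```

The spread across seeds is the quantity of interest here: run-to-run variability on the unseen quadrant is what the repeated-runs protocol is designed to expose.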