Towards Exact Computation of Inductive Bias

Authors: Akhilan Boopathy, William Yue, Jaedong Hwang, Abhiram Iyer, Ila Fiete

IJCAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We empirically apply our inductive bias metric to a range of domains including supervised image classification, reinforcement learning (RL) and few-shot meta-learning. Consistent with prior work, we find that tasks with higher-dimensional inputs require more inductive bias. We empirically find that neural networks encode massive amounts of inductive bias relative to other expressive model classes.
Researcher Affiliation | Academia | Akhilan Boopathy, William Yue, Jaedong Hwang, Abhiram Iyer, Ila Fiete; Massachusetts Institute of Technology; akhilan@mit.edu
Pseudocode | Yes | Algorithm 1: Kernel-based Sampling (an illustrative sketch follows after this table)
Open Source Code | Yes | https://github.com/FieteLab/Exact-Inductive-Bias
Open Datasets | Yes | Our evaluation includes benchmark tasks across various domains: MNIST [LeCun et al., 1998], CIFAR-10 [Krizhevsky et al., 2009], 5-way 1-shot Omniglot [Lake et al., 2015] and inverted pendulum control [Florian, 2007]. (A dataset-loading sketch follows after this table.)
Dataset Splits | No | The paper mentions 'training data' and a 'test set' but does not specify exact percentages, sample counts, or a detailed methodology for train/validation/test splits in the main text. It refers to an appendix for 'further experimental details', which is not included in the main paper.
Hardware Specification | No | The paper does not provide any specific hardware details such as GPU models, CPU types, or memory used to run the experiments.
Software Dependencies | No | The paper mentions 'TensorFlow' in Table 2, but it does not specify any software dependencies with version numbers (e.g., Python 3.x, specific library versions) that would be needed for reproducibility.
Experiment Setup | No | The paper describes the model architectures (e.g., 'a high-capacity ReLU-activated fully-connected neural network with 9 layers and 512 units per hidden layer') and the loss function (mean squared error), but it does not provide concrete hyperparameter values (e.g., learning rate, batch size, number of epochs) or specific optimizer settings in the main text. It refers to an appendix for 'further experimental details'. (An illustrative architecture sketch follows after this table.)
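The Pseudocode row points to Algorithm 1, "Kernel-based Sampling", which this report does not reproduce. As a rough, hedged illustration only: one generic way to sample functions from a kernel is to draw from a zero-mean Gaussian-process prior evaluated on a fixed set of input points. Everything below (the RBF kernel choice, the lengthscale, and the names rbf_kernel and sample_functions) is an assumption for illustration, not the authors' algorithm.

```python
import numpy as np

def rbf_kernel(X, lengthscale=1.0):
    # Pairwise squared distances between rows of X, turned into an RBF (Gaussian) kernel matrix.
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    return np.exp(-sq_dists / (2.0 * lengthscale ** 2))

def sample_functions(X, n_samples=10, jitter=1e-6):
    """Draw function values at the points X from a zero-mean GP prior with an RBF kernel."""
    K = rbf_kernel(X) + jitter * np.eye(len(X))  # jitter keeps the Cholesky factorization stable
    L = np.linalg.cholesky(K)                    # K = L @ L.T
    z = np.random.randn(len(X), n_samples)
    return L @ z                                 # each column is one sampled function evaluated on X

# Example: 5 sampled functions on 100 random 2-D inputs.
X = np.random.randn(100, 2)
samples = sample_functions(X, n_samples=5)
print(samples.shape)  # (100, 5)
```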
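For the Open Datasets row, MNIST and CIFAR-10 are available directly through tf.keras.datasets; the snippet below is a minimal loading sketch, not the authors' data pipeline, and it omits Omniglot and the inverted pendulum environment, which require separate tooling.

```python
import tensorflow as tf

# MNIST: 60,000 training / 10,000 test grayscale digit images (LeCun et al., 1998).
(mnist_x, mnist_y), (mnist_x_test, mnist_y_test) = tf.keras.datasets.mnist.load_data()

# CIFAR-10: 50,000 training / 10,000 test 32x32 color images (Krizhevsky et al., 2009).
(cifar_x, cifar_y), (cifar_x_test, cifar_y_test) = tf.keras.datasets.cifar10.load_data()

print(mnist_x.shape, cifar_x.shape)  # (60000, 28, 28) (50000, 32, 32, 3)
```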
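The Experiment Setup row quotes 'a high-capacity ReLU-activated fully-connected neural network with 9 layers and 512 units per hidden layer' trained with a mean squared error loss, with remaining hyperparameters deferred to the appendix. Below is a minimal Keras sketch of one reading of that description; the input/output dimensions, the 8-hidden-plus-1-output layer count, and the Adam optimizer with learning rate 1e-3 are placeholder assumptions, not values from the paper.

```python
import tensorflow as tf

def build_mlp(output_dim, hidden_layers=8, width=512):
    """Fully-connected ReLU network: 8 hidden layers + 1 linear output layer = 9 layers total."""
    layers = [tf.keras.layers.Dense(width, activation="relu") for _ in range(hidden_layers)]
    layers.append(tf.keras.layers.Dense(output_dim))  # linear output, paired with an MSE loss
    return tf.keras.Sequential(layers)

# Placeholder dimensions and optimizer settings -- not specified in the paper's main text.
model = build_mlp(output_dim=10)
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3), loss="mse")
model.build(input_shape=(None, 784))
model.summary()
```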