Towards Exact Computation of Inductive Bias
Authors: Akhilan Boopathy, William Yue, Jaedong Hwang, Abhiram Iyer, Ila Fiete
IJCAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically apply our inductive bias metric to a range of domains including supervised image classification, reinforcement learning (RL), and few-shot meta-learning. Consistent with prior work, we find that tasks with higher dimensional inputs require more inductive bias. We empirically find that neural networks encode massive amounts of inductive bias relative to other expressive model classes. |
| Researcher Affiliation | Academia | Akhilan Boopathy, William Yue, Jaedong Hwang, Abhiram Iyer, Ila Fiete; Massachusetts Institute of Technology; akhilan@mit.edu |
| Pseudocode | Yes | Algorithm 1: Kernel-based Sampling (a generic, hedged sketch of kernel-based sampling appears after this table) |
| Open Source Code | Yes | https://github.com/FieteLab/Exact-Inductive-Bias |
| Open Datasets | Yes | Our evaluation includes benchmark tasks across various domains: MNIST [LeCun et al., 1998], CIFAR-10 [Krizhevsky et al., 2009], 5-way 1-shot Omniglot [Lake et al., 2015] and inverted pendulum control [Florian, 2007]. |
| Dataset Splits | No | The paper mentions 'training data' and 'test set' but does not explicitly specify exact percentages, sample counts, or detailed methodology for train/validation/test splits in the main text. It refers to an appendix for 'further experimental details', but those are not provided in the main paper. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU models, CPU types, or memory used for running the experiments. |
| Software Dependencies | No | The paper mentions 'Tensorflow' in Table 2, but it does not specify any software dependencies with their version numbers (e.g., Python 3.x, PyTorch x.x, specific library versions) that would be needed for reproducibility. |
| Experiment Setup | No | While the paper describes the model architectures (e.g., 'a high-capacity ReLU-activated fully-connected neural network with 9 layers and 512 units per hidden layer') and loss function ('mean squared error loss'), it does not provide concrete hyperparameter values (e.g., learning rate, batch size, number of epochs) or specific optimizer settings in the main text. It refers to an appendix for 'further experimental details'. A hedged sketch of the quoted architecture appears after this table. |
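
The evidence above names the paper's Algorithm 1 ("Kernel-based Sampling") but does not reproduce its steps. As a point of reference only, here is a minimal, generic sketch of one common form of kernel-based sampling: drawing function values from a zero-mean Gaussian process prior defined by an RBF kernel. The kernel choice, the function names (`rbf_kernel`, `sample_functions_from_kernel`), and all parameter values are illustrative assumptions and may differ from the paper's actual algorithm.

```python
import numpy as np

def rbf_kernel(X, Y, length_scale=1.0):
    """Squared-exponential (RBF) kernel matrix between rows of X and Y.

    The kernel choice is an assumption for illustration; the paper's
    Algorithm 1 may use a different kernel and sampling scheme.
    """
    sq_dists = (
        np.sum(X**2, axis=1)[:, None]
        + np.sum(Y**2, axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    return np.exp(-0.5 * sq_dists / length_scale**2)

def sample_functions_from_kernel(X, n_samples=5, length_scale=1.0, jitter=1e-6):
    """Draw function values at inputs X from a zero-mean GP prior defined
    by the RBF kernel (a generic form of kernel-based sampling)."""
    K = rbf_kernel(X, X, length_scale)
    K += jitter * np.eye(len(X))   # numerical stabilization before Cholesky
    L = np.linalg.cholesky(K)
    z = np.random.randn(len(X), n_samples)
    return L @ z                   # each column is one sampled function

# Example: sample 5 functions on a 1-D grid
X = np.linspace(-3, 3, 100)[:, None]
f_samples = sample_functions_from_kernel(X, n_samples=5)
print(f_samples.shape)  # (100, 5)
```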
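
The Experiment Setup row quotes a concrete architecture (a 9-layer ReLU fully-connected network with 512 units per hidden layer, trained with mean squared error loss), and the Software Dependencies row notes that the paper mentions TensorFlow; the hyperparameters themselves are deferred to the paper's appendix. The following minimal Keras sketch instantiates that quoted architecture on MNIST, one of the cited benchmarks. The layer-counting convention (8 hidden layers plus an output layer), the preprocessing, and the optimizer, batch size, and epoch count are all assumptions, not the paper's settings.

```python
import tensorflow as tf

# MNIST is one of the benchmark datasets cited in the paper
# (LeCun et al., 1998); the preprocessing here is an assumption.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784).astype("float32") / 255.0
x_test = x_test.reshape(-1, 784).astype("float32") / 255.0
y_train_oh = tf.keras.utils.to_categorical(y_train, 10)
y_test_oh = tf.keras.utils.to_categorical(y_test, 10)

# Fully-connected ReLU network with 512 units per hidden layer.
# "9 layers" is read here as 8 hidden layers plus one output layer;
# that counting convention is an assumption.
model = tf.keras.Sequential(
    [tf.keras.Input(shape=(784,))]
    + [tf.keras.layers.Dense(512, activation="relu") for _ in range(8)]
    + [tf.keras.layers.Dense(10)]
)

# Mean squared error loss, as quoted in the evidence table; the
# optimizer, batch size, and epoch count below are assumptions.
model.compile(optimizer="adam", loss="mse",
              metrics=[tf.keras.metrics.CategoricalAccuracy()])
model.fit(x_train, y_train_oh, batch_size=128, epochs=10,
          validation_data=(x_test, y_test_oh))
```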