A Theoretical Framework for Inference Learning
Authors: Nick Alonso, Beren Millidge, Jeffrey Krichmar, Emre O Neftci
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide extensive simulation results that further support our theoretical interpretations and find IL achieves quicker convergence when trained with mini-batch size one while performing competitively with BP for larger mini-batches when combined with Adam. In this section, we compare the performance of BP based algorithms to IL algorithms. |
| Researcher Affiliation | Academia | 1Department of Cognitive Science, UC Irvine 2Department of Computer Science, UC Irvine 3MRC Brain Network Dynamics Unit, University of Oxford 4Electrical Engineering and Information Technology, RWTH Aachen, Germany and Peter Grünberg Institute, Forschungszentrum Jülich, Germany |
| Pseudocode | Yes | Algorithm 1: Generalized BP |
| Open Source Code | Yes | Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] This information is in supplementary material and in link to code |
| Open Datasets | Yes | We trained our BP and IL algorithms on CIFAR-10 with mini-batch size 64. |
| Dataset Splits | No | No explicit percentages or sample counts for training, validation, and test splits were found in the main text. The paper uses standard datasets like CIFAR-10 and F-MNIST, which have predefined splits, but these details are not explicitly stated within the paper itself. |
| Hardware Specification | No | The paper states: 'Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [Yes] Is included in supplementary material'. However, these details are not provided in the main body of the paper. |
| Software Dependencies | No | The paper mentions software like 'Adam optimizers' but does not specify any version numbers for software dependencies. |
| Experiment Setup | Yes | IL-SGD ... optimizes ĥ using SGD to near convergence (25 update steps). IL-Prox ... optimizes ĥ using SGD to near convergence (25 update steps). IL-Prox Fast ... truncates the optimization of ĥ to only 12 iterations. All models use ReLU activations at hidden layers. Softmax is used at output layer for classification, while sigmoid is used on the autoencoder task. MLP size 784-2x500-10 and autoencoder dimension 784-256-100-256-784 are trained on F-MNIST for 50000+ iterations. MLP size 3072-3x1024-10 and autoencoder 3072-1024-500-100-500-1024-3072 are trained on CIFAR-10 for 50000+ iterations. We train our BP and IL algorithms on CIFAR-10 with mini-batch size 64. |
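For orientation, the sketch below shows a minimal PyTorch version of the BP baseline implied by the setup quoted above: the CIFAR-10 MLP (3072-3x1024-10) with ReLU hidden layers, a softmax readout, the Adam optimizer, and mini-batch size 64. The learning rate, data pipeline, and module names are illustrative assumptions, not details confirmed by the paper; the authors' actual code is available via the link mentioned in their checklist.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# CIFAR-10 MLP baseline as described in the table:
# 3072-3x1024-10, ReLU hidden layers, softmax readout, Adam, mini-batch 64.
# The learning rate and the data pipeline are assumptions.
mlp = nn.Sequential(
    nn.Flatten(),
    nn.Linear(3072, 1024), nn.ReLU(),
    nn.Linear(1024, 1024), nn.ReLU(),
    nn.Linear(1024, 1024), nn.ReLU(),
    nn.Linear(1024, 10),             # softmax is folded into CrossEntropyLoss
)
optimizer = torch.optim.Adam(mlp.parameters(), lr=1e-4)   # lr is an assumption
loss_fn = nn.CrossEntropyLoss()

train_loader = DataLoader(
    datasets.CIFAR10(root="./data", train=True, download=True,
                     transform=transforms.ToTensor()),
    batch_size=64, shuffle=True)

for x, y in train_loader:            # one pass; the paper trains for 50000+ iterations
    optimizer.zero_grad()
    loss = loss_fn(mlp(x), y)
    loss.backward()
    optimizer.step()
```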
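The IL-SGD variant quoted above optimizes the layer activities ĥ with SGD for about 25 inference steps before updating weights. The following is a rough sketch of that two-phase update under common predictive-coding assumptions (a squared prediction-error energy and an output layer clamped to the target); the function name, learning rates, output nonlinearity, and exact energy form are assumptions rather than the paper's precise formulation.

```python
import torch

def il_sgd_update(weights, x, y, act=torch.relu,
                  n_inference=25, lr_h=0.1, lr_w=1e-3):
    """One IL-SGD update: infer activities ĥ, then apply local weight updates.

    `weights` is a list of leaf tensors with requires_grad=True, one per layer.
    `y` is the target (e.g. a one-hot label or a reconstruction target).
    The squared-error energy and learning rates are illustrative assumptions.
    """
    # Feedforward pass to initialize the activity estimates ĥ.
    hs = [x]
    for W in weights:
        hs.append(act(hs[-1] @ W.T))
    free = [h.detach().requires_grad_(True) for h in hs[1:-1]]

    def energy(hidden):
        # Output layer is clamped to the target; hidden activities stay free.
        layers = [x] + list(hidden) + [y]
        # Sum of squared local prediction errors across layers.
        return sum(((layers[l + 1] - act(layers[l] @ W.T)) ** 2).sum()
                   for l, W in enumerate(weights)) / 2

    # Phase 1: optimize ĥ with SGD for ~25 steps ("to near convergence").
    h_opt = torch.optim.SGD(free, lr=lr_h)
    for _ in range(n_inference):
        h_opt.zero_grad()
        energy(free).backward()
        h_opt.step()

    # Phase 2: local weight updates at the (near-)converged activities.
    w_opt = torch.optim.SGD(weights, lr=lr_w)
    w_opt.zero_grad()
    F = energy([h.detach() for h in free])
    F.backward()
    w_opt.step()
    return F.item()

# Example usage with the F-MNIST MLP sizes from the table (784-500-500-10);
# the initialization scale is an assumption.
sizes = [784, 500, 500, 10]
weights = [(torch.randn(o, i) * 0.01).requires_grad_()
           for i, o in zip(sizes, sizes[1:])]
```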