Biologically Inspired Learning Model for Instructed Vision

Authors: Roy Abel, Shimon Ullman

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we evaluate our BU-TD model, learned via Counter-Hebb learning, in two settings: 1) unguided visual processing, to show that CH learning is capable of learning vision models, and 2) guided visual processing, to evaluate the ability of our model to guide the visual process according to instructions.
Researcher Affiliation | Academia | Roy Abel, Weizmann Institute of Science, roy.abel@weizmann.ac.il; Shimon Ullman, Weizmann Institute of Science, shimon.ullman@weizmann.ac.il
Pseudocode | Yes | Algorithm 1: Counter-Hebb Learning (Section 4.1) and Algorithm 2: Instruction-Based Learning (Section 5). (A hedged sketch of a Counter-Hebb-style update is given after this table.)
Open Source Code | Yes | The code for reproducing the experiments and creating BU-TD models for guided models is available at https://github.com/royabel/Top-Down-Networks.
Open Datasets | Yes | In the unguided experiments, we evaluate the performance of the Counter-Hebb learning on standard image classification benchmarks: MNIST [LeCun et al., 1998], Fashion-MNIST [Xiao et al., 2017], and CIFAR10 [Krizhevsky et al., 2009]. We followed the same experiments as Bozkurt et al. [2024] and used two-layer fully connected networks, with a hidden layer of size 500 for both MNIST and Fashion-MNIST datasets and size 1,000 for CIFAR10. Further details including the full set of hyperparameters can be found in Appendix A.4.2. (A sketch of this two-layer network appears after the table.)
Dataset Splits | No | We omitted the validation set, and the hyper-parameters were tuned based solely on the training set.
Hardware Specification | Yes | All the experiments were conducted using either an NVIDIA RTX 6000 GPU or an NVIDIA RTX 8000 GPU. For all experiments but CelebA, a single NVIDIA RTX 6000 GPU was used, with the experiments utilizing only a fraction of its capacity. In the case of the CelebA dataset, either a single NVIDIA RTX 8000 GPU or two NVIDIA RTX 6000 GPUs were used.
Software Dependencies | No | The paper mentions software components such as 'The standard Adam optimizer [Ruder, 2016]' and the use of a 'ResNet-18 [He et al., 2016] architecture (without the final layer) with batch normalization layers [Ioffe and Szegedy, 2015]' (Section 6.2). However, it does not specify version numbers for any programming languages, libraries, or frameworks (e.g., Python, PyTorch, TensorFlow, CUDA) used in the experiments.
Experiment Setup | Yes | We trained for 50 epochs with an exponential learning rate decay with γ = 0.95. The initial learning rate was 10⁻⁴, and the batch size 20. (See the training sketch below, which wires these values into a generic loop.)
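
The Pseudocode row points to Algorithm 1 (Counter-Hebb Learning), which is not reproduced in this report. The snippet below is only a minimal sketch of what a Counter-Hebb-style update for a single linear layer might look like, assuming the weight change is proportional to the outer product of the top-down ("counter") activations and the bottom-up input activations; the function name, shapes, and learning rate are illustrative and not taken from the authors' code.

```python
import torch

def counter_hebb_update(weight, bu_pre, td_post, lr):
    # Hypothetical Counter-Hebb-style rule for one linear layer (assumption):
    # the (out x in) weight matrix is nudged by the outer product of the
    # top-down ("counter") activations at the output units and the
    # bottom-up activations at the input units, averaged over the batch.
    with torch.no_grad():
        weight += lr * (td_post.t() @ bu_pre) / bu_pre.shape[0]

# Illustrative shapes: batch 20, 784 inputs (28x28 images), 500 hidden units.
W = torch.zeros(500, 784)
bu = torch.rand(20, 784)   # bottom-up activations entering the layer
td = torch.rand(20, 500)   # matching top-down (counter) activations
counter_hebb_update(W, bu, td, lr=1e-4)
```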
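The Open Datasets row describes two-layer fully connected networks for the unguided experiments, with a hidden layer of size 500 for MNIST and Fashion-MNIST and 1,000 for CIFAR10. A minimal PyTorch sketch of that architecture follows; the ReLU activation and the 10-class output head are assumptions, as the excerpt does not state them.

```python
import torch.nn as nn

def make_mlp(input_dim, hidden_dim, num_classes=10):
    # Two-layer fully connected network matching the sizes quoted above.
    return nn.Sequential(
        nn.Flatten(),
        nn.Linear(input_dim, hidden_dim),
        nn.ReLU(),                      # activation is an assumption
        nn.Linear(hidden_dim, num_classes),
    )

mnist_model = make_mlp(28 * 28, 500)        # MNIST / Fashion-MNIST
cifar_model = make_mlp(3 * 32 * 32, 1000)   # CIFAR10
```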
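The Experiment Setup row quotes the optimization hyperparameters: Adam, 50 epochs, initial learning rate 10⁻⁴, exponential decay with γ = 0.95, and batch size 20. The sketch below plugs those values into a generic supervised training loop for illustration only; it uses ordinary backpropagation rather than the paper's Counter-Hebb procedure, and the MNIST loader and cross-entropy loss are assumptions.

```python
import torch
import torch.nn as nn
from torchvision import datasets, transforms

# Quoted hyperparameters: 50 epochs, Adam with lr 1e-4,
# ExponentialLR with gamma 0.95, batch size 20.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 500),
                      nn.ReLU(), nn.Linear(500, 10))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.95)

train_set = datasets.MNIST("data", train=True, download=True,
                           transform=transforms.ToTensor())
loader = torch.utils.data.DataLoader(train_set, batch_size=20, shuffle=True)

for epoch in range(50):
    for x, y in loader:
        optimizer.zero_grad()
        loss = nn.functional.cross_entropy(model(x), y)
        loss.backward()
        optimizer.step()
    scheduler.step()  # exponential learning rate decay once per epoch
```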