Logical Activation Functions: Logit-space equivalents of Probabilistic Boolean Operators

Authors: Scott Lowe, Robert Earle, Jason d'Eon, Thomas Trappenberg, Sageev Oore

NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We deploy these new activation functions, both in isolation and in conjunction, to demonstrate their effectiveness on a variety of tasks including tabular classification, image classification, transfer learning, abstract reasoning, and compositional zero-shot learning.
Researcher Affiliation | Collaboration | Scott C. Lowe1,2, Robert Earle1,2, Jason d'Eon1,2, Thomas Trappenberg1, Sageev Oore1,2; 1Faculty of Computer Science, Dalhousie University, Halifax, Nova Scotia, Canada; 2Vector Institute for Artificial Intelligence, Toronto, Ontario, Canada
Pseudocode | No | The paper defines functions mathematically (e.g., AND_AIL(x, y) := x + y if x < 0 and y < 0; min(x, y) otherwise), but does not include any explicitly labeled pseudocode or algorithm blocks. (A minimal code sketch of this definition appears below the table.)
Open Source Code | Yes | Our Python package which provides an implementation of these activation functions is available at https://github.com/DalhousieAI/pytorch-logit-logic, which is also available as PyPI package pytorch-logit-logic.
Open Datasets | Yes | "The Bach Chorale dataset (Boulanger-Lewandowski et al., 2012) consists of 382 chorales composed by JS Bach... We tasked 2-layer MLPs with determining whether a short four-part musical excerpt is taken from a Bach chorale." and "We trained 2-layer MLP and 6-layer CNN models on MNIST with ADAM (Kingma & Ba, 2015)..."
Dataset Splits | Yes | hyperparameters tuned through a random search against a validation set comprised of the last 10k images of the training partition.
Hardware Specification | Yes | Additionally, we gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan Xp GPU used for this research.
Software Dependencies | Yes | For a comprehensive set of baselines, we compared against every activation function built into PyTorch 1.10 (see Appendix A.17).
Experiment Setup | Yes | We trained 2-layer MLP and 6-layer CNN models on MNIST with ADAM (Kingma & Ba, 2015), 1-cycle schedule (Smith & Topin, 2017; Smith, 2018), and using hyperparameters tuned through a random search against a validation set comprised of the last 10k images of the training partition. (A minimal training-configuration sketch appears below the table.)
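
To make the piecewise definition quoted in the Pseudocode row concrete, the following is a minimal sketch of the AND_AIL operator as an elementwise PyTorch function. The function name, tensor shapes, and channel-pairing usage are illustrative assumptions; the authors' pytorch-logit-logic package linked above should be treated as the reference implementation.

```python
import torch

def and_ail(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    """Logit-space AND, per the piecewise definition quoted above:
    x + y when both inputs are negative, min(x, y) otherwise.
    Illustrative only; see the authors' pytorch-logit-logic package."""
    both_negative = (x < 0) & (y < 0)
    return torch.where(both_negative, x + y, torch.minimum(x, y))

# Illustrative usage: apply the operator to paired halves of a pre-activation tensor.
z = torch.randn(8, 64)               # batch of 8 examples, 64 pre-activation logits each
out = and_ail(z[:, :32], z[:, 32:])  # -> shape (8, 32)
```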
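
Likewise, the quoted experiment setup (a 2-layer MLP on MNIST trained with Adam, a 1-cycle schedule, and the last 10k training images held out for validation) can be sketched in PyTorch as below. All hyperparameter values here (hidden width, learning rate, batch size, epoch count) are placeholders rather than the values found by the paper's random search, and nn.ReLU stands in for the logical activation functions under study.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, Subset
from torchvision import datasets, transforms

# Placeholder hyperparameters; the paper tunes these by random search.
batch_size, max_lr, epochs = 128, 1e-3, 10

train_full = datasets.MNIST("data", train=True, download=True,
                            transform=transforms.ToTensor())
# Last 10k images of the training partition held out for validation.
train_set = Subset(train_full, range(0, 50_000))
val_set = Subset(train_full, range(50_000, 60_000))
train_loader = DataLoader(train_set, batch_size=batch_size, shuffle=True)

# A 2-layer MLP; hidden width is a placeholder, and nn.ReLU stands in
# for the logical activation functions compared in the paper.
model = nn.Sequential(nn.Flatten(), nn.Linear(784, 256), nn.ReLU(),
                      nn.Linear(256, 10))

optimizer = torch.optim.Adam(model.parameters(), lr=max_lr)
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer, max_lr=max_lr, epochs=epochs,
    steps_per_epoch=len(train_loader))
criterion = nn.CrossEntropyLoss()

for epoch in range(epochs):
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
        scheduler.step()  # the 1-cycle schedule steps once per batch
    # Validation on val_set omitted here for brevity.
```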