Logical Activation Functions: Logit-space equivalents of Probabilistic Boolean Operators
Authors: Scott Lowe, Robert Earle, Jason d'Eon, Thomas Trappenberg, Sageev Oore
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We deploy these new activation functions, both in isolation and in conjunction, to demonstrate their effectiveness on a variety of tasks including tabular classification, image classification, transfer learning, abstract reasoning, and compositional zero-shot learning. |
| Researcher Affiliation | Collaboration | Scott C. Lowe (1,2), Robert Earle (1,2), Jason d'Eon (1,2), Thomas Trappenberg (1), Sageev Oore (1,2); (1) Faculty of Computer Science, Dalhousie University, Halifax, Nova Scotia, Canada; (2) Vector Institute for Artificial Intelligence, Toronto, Ontario, Canada |
| Pseudocode | No | The paper defines its functions mathematically, e.g. AND_AIL(x, y) := x + y if x < 0 and y < 0, and min(x, y) otherwise, but does not include any explicitly labeled pseudocode or algorithm blocks. A code rendering of this definition is sketched after this table. |
| Open Source Code | Yes | Our Python package which provides an implementation of these activation functions is available at https://github.com/DalhousieAI/pytorch-logit-logic, which is also available as the PyPI package pytorch-logit-logic. |
| Open Datasets | Yes | "The Bach Chorale dataset (Boulanger-Lewandowski et al., 2012) consists of 382 chorales composed by JS Bach... We tasked 2-layer MLPs with determining whether a short four-part musical excerpt is taken from a Bach chorale." and "We trained 2-layer MLP and 6-layer CNN models on MNIST with ADAM (Kingma & Ba, 2015)..." |
| Dataset Splits | Yes | hyperparameters tuned through a random search against a validation set comprised of the last 10k images of the training partition. |
| Hardware Specification | Yes | Additionally, we gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan Xp GPU used for this research. |
| Software Dependencies | Yes | For a comprehensive set of baselines, we compared against every activation function built into PyTorch 1.10 (see Appendix A.17). |
| Experiment Setup | Yes | "We trained 2-layer MLP and 6-layer CNN models on MNIST with ADAM (Kingma & Ba, 2015), 1-cycle schedule (Smith & Topin, 2017; Smith, 2018), and using hyperparameters tuned through a random search against a validation set comprised of the last 10k images of the training partition." A minimal sketch of this setup follows the table. |
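
The piecewise definition quoted in the Pseudocode row translates directly into a few lines of PyTorch. The sketch below is our own rendering of that formula (the function name and vectorized form are assumptions; the authors' official implementations live in the pytorch-logit-logic package linked above):

```python
import torch

def and_ail(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    """Logit-space AND (AIL variant), per the paper's definition:
    AND_AIL(x, y) = x + y      if x < 0 and y < 0,
                    min(x, y)  otherwise.
    """
    both_negative = (x < 0) & (y < 0)
    return torch.where(both_negative, x + y, torch.minimum(x, y))

# Quick check exercising both branches of the piecewise definition:
x = torch.tensor([-2.0, 1.5, -0.5])
y = torch.tensor([-1.0, 2.0, 3.0])
print(and_ail(x, y))  # tensor([-3.0, 1.5, -0.5])
```

By De Morgan-style duality, an OR counterpart should correspond to `-and_ail(-x, -y)` (sum when both inputs are positive, max otherwise), though the packaged implementations are the authoritative source for the paper's exact operator set.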
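
The experiment-setup quote likewise maps onto standard PyTorch components: the Adam optimizer, the OneCycleLR schedule, and a validation split taken from the last 10k training images. A minimal sketch under assumed hyperparameters (the paper's actual values came from a random search and are not reproduced here):

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, Subset
from torchvision import datasets, transforms

BATCH_SIZE, EPOCHS, MAX_LR = 128, 10, 1e-3  # placeholders, not the tuned values

train_full = datasets.MNIST("data", train=True, download=True,
                            transform=transforms.ToTensor())
# Validation set: the last 10k images of the 60k-image MNIST training partition.
train_set = Subset(train_full, range(50_000))
val_set = Subset(train_full, range(50_000, 60_000))
train_loader = DataLoader(train_set, batch_size=BATCH_SIZE, shuffle=True)

# Plain 2-layer MLP stand-in; the paper's models swap in the logical activations.
model = nn.Sequential(nn.Flatten(), nn.Linear(784, 256), nn.ReLU(),
                      nn.Linear(256, 10))

optimizer = torch.optim.Adam(model.parameters(), lr=MAX_LR)
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer, max_lr=MAX_LR, epochs=EPOCHS,
    steps_per_epoch=len(train_loader))
criterion = nn.CrossEntropyLoss()

for epoch in range(EPOCHS):
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
        scheduler.step()  # the 1-cycle schedule advances every batch
```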