Fractional Adaptive Linear Units

Authors: Julio Zamora, Anthony D. Rhodes, Lama Nachman

AAAI 2022, pp. 8988-8996

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Through experiments on a variety of conventional tasks and network architectures, we demonstrate the effectiveness of FALUs when compared to traditional and state-of-the-art AFs."
Researcher Affiliation | Industry | Intel Labs; julio.c.zamora.esquivel@intel.com, anthony.rhodes@intel.com, lama.nachman@intel.com
Pseudocode | No | The paper describes mathematical formulas and derivations but does not include a structured pseudocode or algorithm block. (A hedged sketch of what such a block might look like is given after this table.)
Open Source Code | No | "To facilitate practical use of this work, we plan to make our code publicly available."
Open Datasets | Yes | MNIST (LeCun and Cortes 2010), Fashion-MNIST (Xiao, Rasul, and Vollgraf 2017), CIFAR-10 (Krizhevsky 2009), ImageNet (Deng et al. 2009)
Dataset Splits | No | "For each dataset we use conventional train/test splits used in literature. MNIST consists of 60,000 (50k/10k train/test split) 28×28-resolution grayscale images in 10 classes, with 6,000 images per class."
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used to run the experiments.
Software Dependencies | No | The paper mentions "standard automatic differentiation workflows", "standard Deep Learning libraries", and the "Adam optimizer (Kingma and Ba 2014)", but does not specify software names with version numbers.
Experiment Setup | Yes | "For each experiment we used the Adam optimizer (Kingma and Ba 2014) to train our model, and randomly initialized the FALU parameters in the range α ∈ [0, 1] and β ∈ [1, 1+ϵ], with ϵ = 0.05; ... the FALU function parameters were clamped during training within the domains described previously, i.e., α ∈ [0, 2] and β ∈ [1, 10]. The model was trained for 120 epochs with an initial learning rate of 0.01 decayed by an order of magnitude every 30 epochs, batch size of 128, and random weight initialization." (A hedged PyTorch sketch of this setup follows after the table.)
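
Since the paper gives formulas and derivations but no algorithm block, the following is a minimal PyTorch sketch of what a trainable two-parameter activation module could look like. This is an illustration, not the paper's method: the swish-style body is a placeholder assumption (the actual FALU is built from a fractional-calculus construction not reproduced here), while the initialization ranges for α and β and the clamping domains are taken from the experiment-setup quote above.

    import torch
    import torch.nn as nn

    class AdaptiveActivation(nn.Module):
        """Sketch of a trainable two-parameter activation (NOT the exact FALU form).

        alpha is initialized in [0, 1] and beta in [1, 1 + eps] with eps = 0.05,
        matching the initialization ranges quoted from the paper.
        """
        def __init__(self, eps: float = 0.05):
            super().__init__()
            self.alpha = nn.Parameter(torch.rand(1))               # alpha ~ U[0, 1]
            self.beta = nn.Parameter(1.0 + eps * torch.rand(1))    # beta ~ U[1, 1 + eps]

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # Placeholder body: a beta-scaled swish blended with ReLU by alpha.
            # The true FALU uses a fractional-derivative construction not shown here.
            swish = x * torch.sigmoid(self.beta * x)
            return (1.0 - self.alpha) * x.relu() + self.alpha * swish

        def clamp_(self):
            # The paper clamps alpha to [0, 2] and beta to [1, 10] during training.
            self.alpha.data.clamp_(0.0, 2.0)
            self.beta.data.clamp_(1.0, 10.0)

Keeping the clamp in a separate clamp_() helper, called after each optimizer step, is one straightforward way to realize the paper's statement that the parameters "were clamped during training" within the stated domains.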
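The quoted experiment setup maps directly onto a standard PyTorch training loop. The sketch below is likewise an assumption-laden illustration: the toy MNIST model and the use of torchvision are assumptions, AdaptiveActivation is the placeholder module above, and only the optimizer, learning-rate schedule, epoch count, batch size, and clamping domains come from the paper's quoted text.

    import torch
    from torch.optim import Adam
    from torch.optim.lr_scheduler import StepLR
    from torchvision import datasets, transforms

    # Conventional MNIST training set (the paper reports conventional splits).
    train_set = datasets.MNIST("data", train=True, download=True,
                               transform=transforms.ToTensor())
    train_loader = torch.utils.data.DataLoader(train_set, batch_size=128, shuffle=True)

    # Hypothetical toy model; any architecture using AdaptiveActivation would do.
    model = torch.nn.Sequential(
        torch.nn.Flatten(),
        torch.nn.Linear(28 * 28, 256),
        AdaptiveActivation(),
        torch.nn.Linear(256, 10),
    )

    optimizer = Adam(model.parameters(), lr=0.01)           # initial LR 0.01, per the paper
    scheduler = StepLR(optimizer, step_size=30, gamma=0.1)  # 10x decay every 30 epochs

    for epoch in range(120):                                # 120 epochs, per the paper
        for images, labels in train_loader:
            optimizer.zero_grad()
            loss = torch.nn.functional.cross_entropy(model(images), labels)
            loss.backward()
            optimizer.step()
            # Clamp activation parameters to alpha in [0, 2], beta in [1, 10].
            for m in model.modules():
                if isinstance(m, AdaptiveActivation):
                    m.clamp_()
        scheduler.step()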