Spherization Layer: Representation Using Only Angles

Authors: Hoyong Kim, Kangil Kim

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We validate the functional correctness of the proposed method in a toy task, retention ability in well-known image classification tasks, and effectiveness in word analogy test and few-shot learning.
Researcher Affiliation | Academia | Hoyong Kim, Kangil Kim, Artificial Intelligence Graduate School, Gwangju Institute of Science and Technology, Gwangju 61005, South Korea
Pseudocode | No | The paper describes the spherization layer using mathematical equations and text, but does not include a formal pseudocode or algorithm block.
Open Source Code | Yes | Code is publicly available at https://github.com/GIST-IRR/spherization_layer
Open Datasets | Yes | We validate the functional correctness of the proposed method in a toy task, retention ability in well-known image classification tasks, and effectiveness in word analogy test and few-shot learning. [...] We used BERT [4] and RoBERTa [17] to conduct the word analogy test [...] trained them on WikiText [19] [...] We used ProtoNet [24] with ConvNet and ResNet for few-shot learning on Mini-ImageNet [26].
Dataset Splits | Yes | All experiments were performed five times with random seeds and their training and test accuracy were evaluated, except for the word analogy test and few-shot learning. The mean μ and standard deviation σ of accuracy are represented as μσ in each table.
Hardware Specification | No | The paper states, 'See Appendix and the supplemental material (.zip file)' for compute and resources, but the provided text from the paper and its appendix does not contain specific hardware details such as GPU/CPU models or memory.
Software Dependencies | No | The paper mentions that a 9-layer CNN was 'reproduced in PyTorch', but no specific version numbers for PyTorch or any other software dependencies are provided.
Experiment Setup | Yes | We set a 2-layer neural network as the original network, and trained it with the softmax function, cross-entropy, and SGD at a learning rate of 0.01. We trained both networks on the input samples for 100 epochs with 16 mini-batches [...] trained them on WikiText [19] for 3 epochs with 8 mini-batches, the softmax function following cross-entropy, SGD at a learning rate of 0.0001 on masked-language modeling.
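The quoted toy-task setup maps onto a short training loop. The following is a minimal PyTorch sketch, assuming a placeholder toy dataset and a placeholder hidden width (neither is specified in the quote), with cross-entropy on softmax outputs and SGD at a learning rate of 0.01 for 100 epochs with mini-batches of 16. It does not reproduce the spherization layer itself, which is available in the authors' repository linked above.

```python
# Minimal sketch of the quoted baseline training setup (assumptions noted in comments).
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Placeholder toy data: 2-D inputs, 2 classes (assumed, not taken from the paper).
X = torch.randn(512, 2)
y = (X[:, 0] * X[:, 1] > 0).long()
loader = DataLoader(TensorDataset(X, y), batch_size=16, shuffle=True)  # mini-batches of 16

# 2-layer network standing in for the "original network"; hidden width 16 is an assumption.
model = nn.Sequential(
    nn.Linear(2, 16),
    nn.ReLU(),
    nn.Linear(16, 2),
)

# Softmax + cross-entropy: nn.CrossEntropyLoss applies log-softmax internally.
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

for epoch in range(100):  # 100 epochs, as in the quoted setup
    for xb, yb in loader:
        optimizer.zero_grad()
        loss = criterion(model(xb), yb)
        loss.backward()
        optimizer.step()
```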