Spherization Layer: Representation Using Only Angles
Authors: Hoyong Kim, Kangil Kim
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate the functional correctness of the proposed method in a toy task, retention ability in well-known image classification tasks, and effectiveness in word analogy test and few-shot learning. |
| Researcher Affiliation | Academia | Hoyong Kim, Kangil Kim Artificial Intelligence Graduate School Gwangju Institute of Science and Technology, Gwangju 61005, South Korea |
| Pseudocode | No | The paper describes the spherization layer using mathematical equations and text, but does not include a formal pseudocode or algorithm block. |
| Open Source Code | Yes | Code is publicly available at https://github.com/GIST-IRR/spherization_layer |
| Open Datasets | Yes | We validate the functional correctness of the proposed method in a toy task, retention ability in well-known image classification tasks, and effectiveness in word analogy test and few-shot learning. [...] We used BERT [4] and RoBERTa [17] to conduct the word analogy test [...] trained them on WikiText [19] [...] We used ProtoNet [24] with ConvNet and ResNet for few-shot learning on Mini-ImageNet [26]. |
| Dataset Splits | Yes | All experiments were performed five times with random seeds and their training and test accuracy were evaluated except word analogy test and few-shot learning. The mean μ and standard deviation σ of accuracy are represented as μ ± σ in each table. |
| Hardware Specification | No | The paper states, 'See Appendix and the supplemental material (.zip file)' for compute and resources, but the provided text from the paper and its appendix does not contain specific hardware details such as GPU/CPU models or memory. |
| Software Dependencies | No | The paper mentions that a 9-layer CNN was 'reproduced in PyTorch', but no specific version numbers for PyTorch or any other software dependencies are provided. |
| Experiment Setup | Yes | We set a 2-layer neural network as the original network, and trained it with the softmax function, cross-entropy, and SGD at a learning rate of 0.01. We trained both networks on the input samples for 100 epochs with 16 mini-batches [...] trained them on WikiText [19] for 3 epochs with 8 mini-batches, the softmax function following cross-entropy, SGD at a learning rate of 0.0001 on masked-language modeling. (A hedged sketch of the toy-task setup appears below the table.) |
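
The Experiment Setup row quotes a concrete toy-task configuration: a 2-layer network trained with softmax, cross-entropy, and SGD at a learning rate of 0.01 for 100 epochs with 16 mini-batches. A minimal PyTorch sketch of that configuration is given below; the synthetic 2-D data, hidden width, and the reading of "16 mini-batches" as a mini-batch size of 16 are assumptions, and plain linear layers stand in for the paper's spherization layer rather than reproducing it.

```python
# Minimal sketch of the quoted toy-task setup (assumptions noted inline).
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Synthetic 2-class toy data (assumed; the paper's toy task may differ).
x = torch.randn(512, 2)
y = (x[:, 0] * x[:, 1] > 0).long()          # simple XOR-like labels
loader = DataLoader(TensorDataset(x, y), batch_size=16, shuffle=True)

# "2-layer neural network as the original network" (hidden width assumed).
model = nn.Sequential(
    nn.Linear(2, 32),
    nn.ReLU(),
    nn.Linear(32, 2),                        # logits; softmax is applied inside the loss
)

criterion = nn.CrossEntropyLoss()            # softmax + cross-entropy
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

for epoch in range(100):                     # "100 epochs with 16 mini-batches"
    for xb, yb in loader:
        optimizer.zero_grad()
        loss = criterion(model(xb), yb)
        loss.backward()
        optimizer.step()
```

This only illustrates the training loop and hyperparameters named in the quote; the spherization layer itself and the comparison against the original network are described in the paper and the released code at https://github.com/GIST-IRR/spherization_layer.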