Soft Threshold Ternary Networks

Authors: Weixiang Xu, Xiangyu He, Tianli Zhao, Qinghao Hu, Peisong Wang, Jian Cheng

IJCAI 2020

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "In this section, we evaluate the proposed STTN in terms of qualitative and quantitative studies. Our experiments are conducted on three popular image classification datasets: CIFAR-10, CIFAR-100 and ImageNet (ILSVRC12). We test on several representative CNNs including AlexNet, VGGNet, and ResNet." |
| Researcher Affiliation | Academia | Weixiang Xu¹, Xiangyu He¹, Tianli Zhao¹, Qinghao Hu¹,², Peisong Wang¹,² and Jian Cheng¹,² (¹Institute of Automation, Chinese Academy of Sciences; ²CAS Center for Excellence in Brain Science and Intelligence Technology) |
| Pseudocode | No | The paper describes the methodology using mathematical equations and textual explanations, but it does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any explicit statement about releasing source code, nor a link to a code repository. |
| Open Datasets | Yes | "Our experiments are conducted on three popular image classification datasets: CIFAR-10, CIFAR-100 and ImageNet (ILSVRC12)." |
| Dataset Splits | No | The paper describes data augmentation and training procedures but does not explicitly provide the training/validation/test splits (e.g., percentages, sample counts, or references to predefined splits) needed for reproduction. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. It only mentions "resource-limited devices" as a motivation, without specifying the experimental hardware. |
| Software Dependencies | No | The paper mentions using "Adam with default settings" but does not specify any software libraries or frameworks with version numbers (e.g., PyTorch 1.x, TensorFlow 2.x, CUDA 11.x). |
| Experiment Setup | Yes | "In all CIFAR experiments, we pad 2 pixels on each side of the images and randomly crop 32×32 patches from the padded images during training. For ImageNet experiments, we first proportionally resize images to 256×N (N ≥ 256) so that the short edge is 256. Then we randomly crop 224×224 patches, with mean subtraction and random flipping. ... We use Adam with default settings in all our experiments. The batch size for ImageNet is 256. We set weight decay to 1e-6 and momentum to 0.9. All networks on ImageNet are trained for 110 epochs. The initial learning rate is 0.005, and we use a cosine learning rate decay policy." |
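The stated schedule (initial learning rate 0.005, cosine decay, 110 ImageNet epochs) can be sketched in plain Python. The half-cosine form below is an assumption, since the paper names the policy but not the formula; `cosine_lr` is a hypothetical helper, not code from the paper:

```python
import math

def cosine_lr(epoch, total_epochs=110, base_lr=0.005):
    """Standard cosine annealing from base_lr down to 0.

    Assumed form: the paper only states "cosine learning rate
    decay policy" with an initial learning rate of 0.005 and
    110 training epochs on ImageNet.
    """
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * epoch / total_epochs))
```

Under this form the learning rate starts at 0.005 (`cosine_lr(0)`), passes through 0.0025 at the halfway point (`cosine_lr(55)`), and reaches 0 at epoch 110.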