Towards Fully Sparse Training: Information Restoration with Spatial Similarity

Authors: Weixiang Xu, Xiangyu He, Ke Cheng, Peisong Wang, Jian Cheng | Pages: 2929-2937

AAAI 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Evaluation of accuracy and efficiency shows that we can achieve 2× training acceleration with negligible accuracy degradation on challenging large-scale classification and detection tasks." "In this section, we evaluate the proposed FST in terms of accuracy and efficiency. Our experiments are conducted on image classification and object detection."
Researcher Affiliation | Academia | Weixiang Xu (1,2), Xiangyu He (1,2), Ke Cheng (1,2), Peisong Wang (1), Jian Cheng (1); 1: NLPR, Institute of Automation, Chinese Academy of Sciences; 2: School of Artificial Intelligence, University of Chinese Academy of Sciences; {xuweixiang2018,chengke2017}@ia.ac.cn, {xiangyu.he, peisong.wang, jcheng}@nlpr.ia.ac.cn
Pseudocode | No | The paper describes its methods in prose but does not include any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not contain any explicit statement about releasing source code for the methodology, nor does it provide a link to a code repository.
Open Datasets | Yes | "To verify the effectiveness of our method, we first evaluate it on the large-scale ImageNet." "The PASCAL VOC dataset contains around 16k training images with 20 different classes, while the COCO dataset consists of about 80k training images from 80 different categories."
Dataset Splits | No | The paper mentions training details like 'batch size 256 for 120 epochs' but does not explicitly state the dataset splits (e.g., percentages for training, validation, and test sets).
Hardware Specification | Yes | "The execution environment is as below: Tesla A100 GPU ×1, PyTorch 1.7, CUDA 11.1."
Software Dependencies | Yes | "The execution environment is as below: Tesla A100 GPU ×1, PyTorch 1.7, CUDA 11.1."
Experiment Setup | Yes | "We follow hyperparameter settings as (Zhou et al. 2021): all models are trained with batch size 256 for 120 epochs, and learning rate is annealed from 0.1 to 0 with a cosine scheduler. In order to reproduce their reported accuracy, we set weight decay as 7e-5 and use label smooth."
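As a reading aid, the quoted experiment setup maps onto a minimal PyTorch sketch along the lines below. This is not the authors' code (none was released): the ResNet-18 backbone, the SGD momentum of 0.9, and the label-smoothing factor of 0.1 are illustrative assumptions not stated in the quote, and the label_smoothing argument of nn.CrossEntropyLoss only exists from PyTorch 1.10, so the reported PyTorch 1.7 environment would need a hand-written smoothed loss instead.

# Minimal sketch of the quoted training configuration (not the authors' code).
import torch
import torch.nn as nn
import torchvision

EPOCHS = 120          # "trained with batch size 256 for 120 epochs"
BATCH_SIZE = 256      # used when building the ImageNet DataLoader (not shown)
BASE_LR = 0.1         # "learning rate is annealed from 0.1 to 0"
WEIGHT_DECAY = 7e-5   # "we set weight decay as 7e-5"

# Placeholder backbone for illustration; the paper trains its own sparse models.
model = torchvision.models.resnet18()

# Momentum 0.9 is an assumption (standard for ImageNet SGD), not stated in the quote.
optimizer = torch.optim.SGD(model.parameters(), lr=BASE_LR,
                            momentum=0.9, weight_decay=WEIGHT_DECAY)

# Cosine annealing of the learning rate from 0.1 down to 0 over the full run.
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=EPOCHS, eta_min=0.0)

# Label smoothing: the built-in argument requires PyTorch >= 1.10; the 0.1 factor is assumed.
criterion = nn.CrossEntropyLoss(label_smoothing=0.1)

for epoch in range(EPOCHS):
    # One pass over the ImageNet training DataLoader (batch size BATCH_SIZE),
    # computing criterion(outputs, targets) and stepping the optimizer, goes here.
    scheduler.step()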