Dynamic Neural Response Tuning
Authors: Tian Qiu, Wenxiang Xu, Lin Chen, Linyun Zhou, Zunlei Feng, Mingli Song
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experimental studies indicate that the proposed DNRT is highly interpretable, applicable to various mainstream network architectures, and can achieve remarkable performance compared with existing neural response mechanisms in multiple tasks and domains. (Section 5: Experimental Study) |
| Researcher Affiliation | Academia | Tian Qiu, Wenxiang Xu, Lin Chen, Linyun Zhou, Zunlei Feng, Mingli Song, Zhejiang University {tqiu,xuwx1996,lin_chen,zhoulyaxx,zunleifeng,songml}@zju.edu.cn |
| Pseudocode | No | The paper describes the mechanisms mathematically and in prose but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at https://github.com/horrible-dong/DNRT. |
| Open Datasets | Yes | In the main experiments, we adopt five datasets, including MNIST (LeCun et al., 1998), CIFAR-10 (Krizhevsky et al., 2009), CIFAR-100 (Krizhevsky et al., 2009), ImageNet-100 (Deng et al., 2009), and ImageNet-1K (Deng et al., 2009), to verify the effectiveness of the proposed DNRT. |
| Dataset Splits | Yes | A Long-Tailed CIFAR-10 dataset is generated with reduced training examples per class while the validation set remains unchanged (see the construction sketch after this table). Datasets. In the main experiments, we adopt five datasets, including MNIST (LeCun et al., 1998), CIFAR-10 (Krizhevsky et al., 2009), CIFAR-100 (Krizhevsky et al., 2009), ImageNet-100 (Deng et al., 2009), and ImageNet-1K (Deng et al., 2009) |
| Hardware Specification | Yes | The experiment is conducted on an NVIDIA A100 (80G). The 'Latency' is obtained, on average, from the model inferring 224×224-pixel images on an NVIDIA 3090. |
| Software Dependencies | No | All experiments use the same data augmentations provided by timm (Wightman, 2019), AdamW optimizer with weight decay of 0.05, drop-path rate of 0.1, gradient clipping norm of 1.0, and cosine annealing learning rate scheduler with linear warm-up. (Specific version numbers for software libraries or frameworks are not provided.) |
| Experiment Setup | Yes | In the proposed ARR, the momentum m for updating the moving mean is empirically set to 0.1, and the balanced parameter λ varies depending on networks and datasets (see Appendix A.2). All experiments use the same data augmentations provided by timm (Wightman, 2019), AdamW optimizer with weight decay of 0.05, drop-path rate of 0.1, gradient clipping norm of 1.0, and cosine annealing learning rate scheduler with linear warm-up. Except for simple MLPs, which are trained for only 50 epochs from scratch, other networks are trained for 300 epochs from scratch (a configuration sketch follows the table). |
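
To make the quoted long-tailed split concrete, below is a minimal sketch of building a CIFAR-10 training set with reduced examples per class while leaving the validation set unchanged, assuming PyTorch/torchvision. The exponential per-class decay and the `imbalance_factor` value are illustrative assumptions; the paper's exact reduction scheme is not quoted above.

```python
# Minimal long-tailed CIFAR-10 sketch (assumptions: torchvision available,
# exponential imbalance profile; not the authors' exact construction).
import numpy as np
import torchvision
from torch.utils.data import Subset

def long_tailed_cifar10(root="./data", imbalance_factor=0.01, seed=0):
    train = torchvision.datasets.CIFAR10(root=root, train=True, download=True)
    val = torchvision.datasets.CIFAR10(root=root, train=False, download=True)  # validation set unchanged

    targets = np.array(train.targets)
    num_classes = 10
    n_max = len(targets) // num_classes  # 5000 images per class in balanced CIFAR-10

    rng = np.random.default_rng(seed)
    keep_indices = []
    for c in range(num_classes):
        # Exponentially decaying sample count per class (illustrative assumption).
        n_c = int(n_max * imbalance_factor ** (c / (num_classes - 1)))
        cls_idx = np.where(targets == c)[0]
        keep_indices.extend(rng.choice(cls_idx, size=n_c, replace=False))

    return Subset(train, keep_indices), val
```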
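
The optimization recipe quoted under Software Dependencies and Experiment Setup (AdamW with weight decay 0.05, drop-path rate 0.1, gradient clipping norm 1.0, cosine annealing with linear warm-up, timm augmentations, 300 epochs from scratch) could be wired up roughly as follows. This is a sketch, not the authors' released code: the model name, learning rate, warm-up length, and class count are placeholders, and the DNRT/ARR mechanism itself (including the moving-mean momentum m = 0.1) is not implemented here.

```python
# Sketch of the quoted training configuration, assuming PyTorch + timm.
import timm
import torch
from timm.data import create_transform
from torch.optim.lr_scheduler import CosineAnnealingLR, LinearLR, SequentialLR

model = timm.create_model("vit_tiny_patch16_224", num_classes=100,  # model/classes are placeholders
                          drop_path_rate=0.1)                        # drop-path rate 0.1
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3,           # learning rate assumed
                              weight_decay=0.05)                     # weight decay 0.05

epochs, warmup_epochs = 300, 20  # 300 epochs from scratch; warm-up length assumed
scheduler = SequentialLR(        # call scheduler.step() once per epoch
    optimizer,
    schedulers=[
        LinearLR(optimizer, start_factor=1e-3, total_iters=warmup_epochs),  # linear warm-up
        CosineAnnealingLR(optimizer, T_max=epochs - warmup_epochs),         # cosine annealing
    ],
    milestones=[warmup_epochs],
)

train_transform = create_transform(input_size=224, is_training=True)  # timm's data augmentations

def train_step(images, labels, criterion=torch.nn.CrossEntropyLoss()):
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)  # gradient clipping norm 1.0
    optimizer.step()
    return loss.item()
```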