Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Improving Generalization of Deep Neural Networks by Optimum Shifting
Authors: Yuyan Zhou, Ye Li, Lei Feng, Sheng-Jun Huang
AAAI 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments (including classification and detection) with various deep neural network architectures on benchmark datasets demonstrate the effectiveness of our method. Applying SOS to Trained Models Test Accuracy. We first evaluate SOS by applying it to trained deep models on CIFAR-10 and CIFAR-100 dataset |
| Researcher Affiliation | Academia | ¹MIIT Key Laboratory of Pattern Analysis and Machine Intelligence, College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China. ²Information Systems Technology and Design Pillar, Singapore University of Technology and Design. EMAIL, EMAIL |
| Pseudocode | Yes | Algorithm 1: SOS algorithm during training. Input: training set S = {(x_i, y_i)}_{i=1}^n, batch sizes n_1, n_2 for SGD and SOS, step size γ > 0. 1: for number of training epochs do 2: sample batch B = {(x_1, y_1), ..., (x_{n_2}, y_{n_2})}; 3: compute the input and output matrices 4: A = [a_{L,1}, a_{L,2}, ..., a_{L,n_2}], y_j = [β_j^T a_{L,1}, β_j^T a_{L,2}, ..., β_j^T a_{L,n_2}]; 5: for each column β_j in the final linear layer do 6: conduct Gaussian elimination to make A row-independent: [Ã, ỹ_j] = GaussianEliminate([A, y_j]); 7: update the parameters: β_j = Ã^T (Ã Ã^T)^{-1} ỹ_j; 8: for t = 0, 1, ..., T do 9: update all parameters using SGD: 10: W_t = W_{t-1} − (γ/n_1) Σ_{i=1}^{n_1} ∇_{W_{t-1}} L |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described in this paper. It mentions using YOLOV5 (Jocher et al. 2020) but doesn't provide its own code. |
| Open Datasets | Yes | applying it to trained deep models on CIFAR-10 and CIFAR-100 dataset (Krizhevsky, Hinton et al. 2009)... ImageNet classification dataset... PASCAL VOC dataset (Everingham et al. 2010). |
| Dataset Splits | Yes | applying it to trained deep models on CIFAR-10 and CIFAR-100 dataset (Krizhevsky, Hinton et al. 2009), which consists of 50k training images and 10k testing images in 10 and 100 classes. |
| Hardware Specification | No | The paper mentions 'High Performance Computing Platform of Nanjing University of Aeronautics and Astronautics' in the acknowledgements, but does not specify any particular hardware details such as GPU/CPU models or memory used for the experiments. |
| Software Dependencies | No | The paper refers to using 'PyTorch official pretrained models' but does not provide specific software dependencies with version numbers for replication. |
| Experiment Setup | Yes | Following (Huang et al. 2017), the weight decay is 10⁻⁴ and a Nesterov momentum of 0.9 without damping. The batch size is set to 64 and the models are trained for 300 epochs. The initial learning rate is set to 0.1 and is reduced by a factor of 10 at 50% and 75% of the total training epochs. |
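The per-column update in the pseudocode row above (Algorithm 1) can be sketched in a few lines of NumPy. This is an illustrative reconstruction, not the authors' code: `A` is assumed to stack the final-layer inputs a_{L,i} as rows, and `np.linalg.lstsq` (which returns the minimum-norm least-squares solution) stands in for the explicit Gaussian-elimination and pseudo-inverse steps, since it already handles row-dependent systems.

```python
import numpy as np

def sos_update(A, beta):
    """Sketch of the SOS step: replace each column beta_j of the final
    linear layer with the minimum-norm solution of A @ beta_j = y_j,
    where y_j are the layer's current outputs on the batch (kept fixed).

    A:    (n, d) final-layer inputs, one sample per row
    beta: (d, c) final-layer weight matrix, one column per output unit
    """
    Y = A @ beta                      # current outputs; SOS preserves these
    new_beta = np.empty_like(beta)
    for j in range(beta.shape[1]):
        # Minimum-norm least-squares solution of A beta_j = y_j; this
        # replaces the Gaussian-elimination + pinv steps of Algorithm 1.
        new_beta[:, j], *_ = np.linalg.lstsq(A, Y[:, j], rcond=None)
    return new_beta
```

On an under-determined batch (n_2 < d) the system has exact solutions, so the network's outputs on the batch are unchanged while the weight norm can only shrink, which is the intended effect of shifting to the optimum.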