Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Training Deep Neural Networks via Direct Loss Minimization

Authors: Yang Song, Alexander G. Schwing, Richard S. Zemel, Raquel Urtasun

ICML 2016 | Venue PDF | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We demonstrate the effectiveness of our approach in the context of maximizing average precision for ranking problems. Towards this goal, we develop a novel dynamic programming algorithm that can efficiently compute the weight updates. Our approach proves superior to a variety of baselines in the context of action classification and object detection, especially in the presence of label noise. |
| Researcher Affiliation | Academia | Yang Song (EMAIL), Dept. of Physics, Tsinghua University, Beijing 100084, China; Alexander G. Schwing (EMAIL); Richard S. Zemel (EMAIL); Raquel Urtasun (EMAIL), Dept. of Computer Science, University of Toronto, Toronto, Ontario M5S 2E4, Canada |
| Pseudocode | Yes | Figure 1. Our algorithm for direct loss minimization. Figure 2. Our algorithm for AP loss-augmented maximization or minimization. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. |
| Open Datasets | Yes | In the next experiment we use the PASCAL VOC2012 action classification dataset provided by Everingham et al. (2014). For object detection we use the PASCAL VOC2012 object detection dataset collected by Everingham et al. (2014). |
| Dataset Splits | Yes | We then randomly divide the generated data into a training set containing 10,000 elements and a test set containing the rest. For each of the 10 target classes, we divide the trainval dataset into equal-sized training, validation and test sets. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory amounts) are provided for the experimental setup. |
| Software Dependencies | No | The paper mentions using specific network architectures (e.g., Krizhevsky et al. (2012)), but does not provide version numbers for any software libraries or frameworks used. |
| Experiment Setup | Yes | For all algorithms we used the entire available training set in a single batch and performed 300 iterations. For our final results, we use a learning rate of 0.1, a regularization parameter of 1 × 10⁻⁷, and ϵ = 0.1 for all classes. |
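For context, the quoted setup (entire training set in a single batch, 300 iterations, learning rate 0.1, L2 regularization 1 × 10⁻⁷) describes a plain full-batch gradient descent loop. The sketch below illustrates such a loop on a stand-in least-squares objective; it is not the paper's direct-loss (AP) update, and the model, data, and `train_full_batch` helper are illustrative assumptions only.

```python
import numpy as np

def train_full_batch(X, y, lr=0.1, reg=1e-7, iters=300):
    """Full-batch gradient descent with the hyperparameters quoted in the
    report (lr=0.1, L2 reg=1e-7, 300 iterations). The squared-error loss
    here is a placeholder objective, not the paper's direct AP loss."""
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        # Gradient of mean squared error plus L2 regularization term.
        grad = X.T @ (X @ w - y) / len(y) + reg * w
        w -= lr * grad
    return w

# Synthetic data for illustration: recover a known weight vector.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true
w = train_full_batch(X, y)
```

With this convex placeholder objective, 300 full-batch steps at learning rate 0.1 are more than enough to recover `w_true` closely; the regularization term at 1 × 10⁻⁷ introduces only negligible shrinkage.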