Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Training Deep Neural Networks via Direct Loss Minimization
Authors: Yang Song, Alexander G. Schwing, Richard S. Zemel, Raquel Urtasun
ICML 2016 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the effectiveness of our approach in the context of maximizing average precision for ranking problems. Towards this goal, we develop a novel dynamic programming algorithm that can efficiently compute the weight updates. Our approach proves superior to a variety of baselines in the context of action classification and object detection, especially in the presence of label noise. |
| Researcher Affiliation | Academia | Yang Song (EMAIL), Dept. of Physics, Tsinghua University, Beijing 100084, China; Alexander G. Schwing (EMAIL); Richard S. Zemel (EMAIL); Raquel Urtasun (EMAIL), Dept. of Computer Science, University of Toronto, Toronto, Ontario M5S 2E4, Canada |
| Pseudocode | Yes | Figure 1. Our algorithm for direct loss minimization. Figure 2. Our algorithm for AP loss-augmented maximization or minimization. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. |
| Open Datasets | Yes | In the next experiment we use the PASCAL VOC2012 action classification dataset provided by Everingham et al. (2014). For object detection we use the PASCAL VOC2012 object detection dataset collected by Everingham et al. (2014). |
| Dataset Splits | Yes | We then randomly divide the generated data into a training set containing 10,000 elements and a test set containing the rest. For each of the 10 target classes, we divide the trainval dataset into equal-sized training, validation and test sets. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory amounts) are provided for the experimental setup. |
| Software Dependencies | No | The paper mentions using specific network architectures (e.g., Krizhevsky et al. (2012)), but does not provide version numbers for any software libraries or frameworks used. |
| Experiment Setup | Yes | For all algorithms we used the entire available training set in a single batch and performed 300 iterations. For our final results, we use a learning rate of 0.1, a regularization parameter of 1 × 10⁻⁷, and ϵ = 0.1 for all classes. |