Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.
Deep Value Networks Learn to Evaluate and Iteratively Refine Structured Outputs
Authors: Michael Gygli, Mohammad Norouzi, Anelia Angelova
ICML 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the proposed Deep Value Networks on 3 tasks: multi-label classification, binary image segmentation, and a 3-class face segmentation task. Section 5.4 investigates the sampling mechanisms for DVN training, and Section 5.5 visualizes the learned models. |
| Researcher Affiliation | Collaboration | Michael Gygli 1 * Mohammad Norouzi 2 Anelia Angelova 2 ... 1 ETH Zürich & gifs.com 2 Google Brain, Mountain View, USA. Correspondence to: Michael Gygli <EMAIL>, Mohammad Norouzi <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 Deep Value Network training |
| Open Source Code | Yes | Our source code based on TensorFlow (Abadi et al., 2015) is available at https://github.com/gyglim/dvn. |
| Open Datasets | Yes | We use standard benchmarks in multi-label classification, namely Bibtex and Bookmarks, introduced in (Katakis et al., 2008). |
| Dataset Splits | Yes | We tune the hyperparameters of the model on a validation set and, once best hyper-parameters are found, fine-tune on the combination of training and validation sets. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments. |
| Software Dependencies | No | The paper mentions 'TensorFlow (Abadi et al., 2015)' as the basis for the source code, but does not specify a version number for TensorFlow or any other software dependencies needed to replicate the experiment. |
| Experiment Setup | Yes | We use a learning rate of 0.01 and apply dropout on the first fully connected layer with the keeping probability 0.75 as determined on the validation set. |
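
The Experiment Setup row reports a "keeping probability" of 0.75, which corresponds to a dropout rate of 0.25 in libraries that parameterize dropout by the fraction dropped. The following is a minimal sketch of that relationship in plain Python, assuming the common inverted-dropout convention (surviving activations are rescaled by 1/keep probability). The function and constant names are hypothetical and are not taken from the authors' repository.

```python
import random

# Values quoted in the Experiment Setup row of the table above.
LEARNING_RATE = 0.01
KEEP_PROB = 0.75
# Equivalent setting for APIs that expect the fraction of units to drop.
DROPOUT_RATE = 1.0 - KEEP_PROB


def inverted_dropout(activations, keep_prob, rng):
    """Keep each activation with probability keep_prob, rescaling survivors.

    Hypothetical helper for illustration only; rescaling by 1/keep_prob
    keeps the expected activation unchanged.
    """
    if not 0.0 < keep_prob <= 1.0:
        raise ValueError("keep_prob must be in (0, 1]")
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
hidden = [1.0] * 10000  # stand-in for first fully connected layer outputs
dropped = inverted_dropout(hidden, KEEP_PROB, rng)

kept_fraction = sum(1 for v in dropped if v != 0.0) / len(dropped)
mean_activation = sum(dropped) / len(dropped)
print(f"dropout rate: {DROPOUT_RATE}")
print(f"kept fraction: {kept_fraction:.3f}, mean activation: {mean_activation:.3f}")
```

When reproducing the setup with a newer framework release, it is worth checking whether the dropout argument is a keep probability or a drop rate, since the table notes that the paper does not pin a TensorFlow version.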