Predict and Constrain: Modeling Cardinality in Deep Structured Prediction
Authors: Nataly Brukhim, Amir Globerson
ICML 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our approach outperforms strong baselines, achieving state-of-the-art results on multi-label classification benchmarks. |
| Researcher Affiliation | Academia | Nataly Brukhim, Amir Globerson (Tel Aviv University, Blavatnik School of Computer Science). Correspondence to: Nataly Brukhim <natalybr@mail.tau.ac.il>, Amir Globerson <gamir@post.tau.ac.il>. |
| Pseudocode | Yes | Algorithm 1: Soft projection onto the simplex (a reference sketch of the underlying hard projection appears after this table). |
| Open Source Code | No | The paper mentions that 'The E2E-SPEN results were obtained by running their publicly available code on these datasets.' This refers to a baseline method's code, not the code for the authors' own method. |
| Open Datasets | Yes | We use 3 standard MLC benchmarks, as used by other recent approaches (Belanger & McCallum, 2016; Gygli et al., 2017; Amos & Kolter, 2017): Bibtex, Delicious, and Bookmarks. |
| Dataset Splits | No | The paper states that 'All of the hyperparameters were tuned on development data.' While 'development data' usually implies a validation set, the paper does not specify explicit split percentages or counts for the validation data. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper mentions using 'TensorFlow' but does not specify its version number or any other software dependencies with version numbers, which are required for full reproducibility. |
| Experiment Setup | Yes | For all neural networks we use a single hidden layer, with ReLU activations. For the unrolled optimization we used gradient ascent with momentum 0.9, unrolled for T iterations, with T ranging between 10 and 20, and with R = 2 alternating projection iterations. All of the hyperparameters were tuned on development data. We trained our network using AdaGrad (Duchi et al., 2011) with learning rate η = 0.1. (A hypothetical configuration sketch appears after this table.) |
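
The pseudocode row above quotes the title of the paper's Algorithm 1, a differentiable ("soft") projection onto the simplex. For reference, the sketch below implements the classical hard, sort-based Euclidean projection onto the probability simplex that such soft variants relax; it is not the paper's algorithm, whose exact smoothing is not reproduced here.

```python
import numpy as np

def project_to_simplex(v):
    """Hard Euclidean projection of a 1-D vector v onto the probability
    simplex {x : x >= 0, sum(x) = 1}, via the standard sort-based method.
    The paper's Algorithm 1 replaces the non-differentiable steps below
    (sorting, thresholding) with smooth approximations."""
    n = v.shape[0]
    u = np.sort(v)[::-1]                                # sort in descending order
    css = np.cumsum(u)                                  # running sums of sorted values
    rho = np.nonzero(u + (1.0 - css) / np.arange(1, n + 1) > 0)[0][-1]
    theta = (1.0 - css[rho]) / (rho + 1.0)              # shift enforcing sum(x) = 1
    return np.maximum(v + theta, 0.0)                   # clip to the non-negative orthant

# Example: project an arbitrary score vector onto the simplex.
print(project_to_simplex(np.array([0.8, 1.5, -0.3])))   # non-negative, sums to 1
```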
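
The experiment-setup row describes an unrolled, projected gradient-ascent inference loop. The snippet below is a minimal, hypothetical sketch of that configuration only, not the authors' implementation: `score_grad`, `project_simplex`, and `project_cardinality` are assumed placeholder callables standing in for the paper's score-network gradient and its two constraint projections.

```python
import numpy as np

def unrolled_inference(y0, score_grad, project_simplex, project_cardinality,
                       T=10, R=2, step_size=0.1, momentum=0.9):
    """Hypothetical sketch: T gradient-ascent steps with momentum 0.9 on a score
    function, each followed by R = 2 alternating-projection iterations onto two
    constraint sets (placeholders for the paper's simplex and cardinality sets)."""
    y = y0.copy()
    velocity = np.zeros_like(y0)
    for _ in range(T):
        # momentum gradient-ascent step on the (placeholder) score function
        velocity = momentum * velocity + step_size * score_grad(y)
        y = y + velocity
        # R rounds of alternating projections onto the two constraint sets
        for _ in range(R):
            y = project_cardinality(project_simplex(y))
    return y
```

In the paper this loop is unrolled inside the computation graph so that training (AdaGrad with learning rate 0.1, per the row above) can backpropagate through the T inference steps; the sketch shows only the forward structure.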