Consistent Structured Prediction with Max-Min Margin Markov Networks

Authors: Alex Nowak, Francis Bach, Alessandro Rudi

ICML 2020

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In Section 6, we perform a thorough experimental analysis of the proposed method on classical unstructured and structured prediction settings. ... We perform a comparative experimental analysis for different tasks between M4Ns, M3Ns and CRFs optimized with Generalized BCFW + SP-MP (Algorithm 1 + Algorithm 2), BCFW (Lacoste-Julien et al., 2013) and SDCA (Shalev-Shwartz & Zhang, 2013), respectively. ... The results are in Table 1. |
| Researcher Affiliation | Academia | Alex Nowak-Vila¹, Francis Bach¹, Alessandro Rudi¹. ¹INRIA Département d'Informatique de l'École Normale Supérieure, PSL Research University. Correspondence to: Alex Nowak-Vila <alex.nowak-vila@inria.fr>. |
| Pseudocode | Yes | Algorithm 1 GBCFW (primal) ... Algorithm 2 SP-MP (µ^(K), ν^(K)) O_K(v, µ^(0), ν^(0)) |
| Open Source Code | Yes | Code in https://github.com/alexnowakvila/maxminloss |
| Open Datasets | Yes | We use datasets of the UCI machine learning repository (Asuncion & Newman, 2007) for multi-class classification and ordinal regression, the OCR dataset from Taskar et al. (2004) for sequence prediction and the ranking datasets used by Korba et al. (2018). |
| Dataset Splits | Yes | We use 14 random splits of the dataset into 60% for training, 20% for validation and 20% for testing. |
| Hardware Specification | No | The paper does not provide specific hardware details used for running its experiments. |
| Software Dependencies | No | The paper mentions running methods with its own implementation and using a Gaussian kernel, but does not name specific software with version numbers (libraries, programming languages, or frameworks such as Python, PyTorch, or TensorFlow). |
| Experiment Setup | Yes | We choose the regularization parameter λ in {2^j}_{j=1}^{10} using the validation set and show the average test loss on the test sets in Table 1 of the model with the best λ. We use a Gaussian kernel and perform 50 passes on the data and set the number of iterations of Algorithm 2 to 20 and 10 times the length of the sequence for sequence prediction. |
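
The split and model-selection protocol quoted under Dataset Splits and Experiment Setup can be sketched roughly as follows. This is a minimal illustration and not the authors' code (which lives in the linked repository). The function names `random_splits`, `gaussian_kernel`, and `select_lambda`, the `seed` and `bandwidth` parameters, and the `fit`/`loss` callables are all hypothetical. The λ grid is passed in by the caller rather than hard-coded.

```python
import math
import random


def random_splits(n, n_splits=14, fractions=(0.6, 0.2), seed=0):
    """Yield (train, val, test) index lists for repeated random splits.

    Defaults mirror the quoted protocol: 14 splits, 60% train, 20% validation,
    and the remaining 20% test. The seed is an assumption for repeatability.
    """
    rng = random.Random(seed)
    n_train = round(fractions[0] * n)
    n_val = round(fractions[1] * n)
    for _ in range(n_splits):
        idx = list(range(n))
        rng.shuffle(idx)
        yield idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel exp(-||x - y||^2 / (2 * bandwidth^2)).

    The bandwidth value is a placeholder; the source does not state it.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def select_lambda(fit, loss, train, val, test, lambdas):
    """Pick the lambda with the lowest validation loss, then report test loss.

    `fit(train, lam)` returns a model and `loss(model, data)` returns a float;
    both are hypothetical callables supplied by the caller.
    """
    best = min(lambdas, key=lambda lam: loss(fit(train, lam), val))
    return best, loss(fit(train, best), test)


if __name__ == "__main__":
    # Toy stand-ins: the "model" is just lambda, and the loss is its distance to 1.
    toy_fit = lambda data, lam: lam
    toy_loss = lambda model, data: abs(model - 1.0)
    test_losses = []
    for train, val, test in random_splits(100):
        _, test_loss = select_lambda(toy_fit, toy_loss, train, val, test,
                                     lambdas=[0.1, 1.0, 10.0])  # placeholder grid
        test_losses.append(test_loss)
    # Average test loss over the splits, as reported per method in Table 1.
    print(sum(test_losses) / len(test_losses))
```

Averaging the per-split test loss of the selected model corresponds to the "average test loss on the test sets" described in the Experiment Setup row. The optimizer itself (Algorithm 1 + Algorithm 2) is deliberately left behind the `fit` callable.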