Consistent Structured Prediction with Max-Min Margin Markov Networks
Authors: Alex Nowak-Vila, Francis Bach, Alessandro Rudi
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In Section 6, we perform a thorough experimental analysis of the proposed method on classical unstructured and structured prediction settings. ... We perform a comparative experimental analysis for different tasks between M4Ns, M3Ns and CRFs optimized with Generalized BCFW + SP-MP (Algorithm 1 + Algorithm 2), BCFW (Lacoste-Julien et al., 2013) and SDCA (Shalev-Shwartz & Zhang, 2013), respectively. ... The results are in Table 1. |
| Researcher Affiliation | Academia | Alex Nowak-Vila, Francis Bach, Alessandro Rudi (INRIA, Département d'Informatique de l'École Normale Supérieure, PSL Research University). Correspondence to: Alex Nowak-Vila <alex.nowak-vila@inria.fr>. |
| Pseudocode | Yes | Algorithm 1 GBCFW (primal) ... Algorithm 2 SP-MP: (µ(K), ν(K)) = SP-MP(v, µ(0), ν(0)) |
| Open Source Code | Yes | Code in https://github.com/alexnowakvila/maxminloss |
| Open Datasets | Yes | We use datasets of the UCI machine learning repository (Asuncion & Newman, 2007) for multi-class classification and ordinal regression, the OCR dataset from Taskar et al. (2004) for sequence prediction and the ranking datasets used by Korba et al. (2018). |
| Dataset Splits | Yes | We use 14 random splits of the dataset into 60% for training, 20% for validation and 20% for testing. |
| Hardware Specification | No | The paper does not provide specific hardware details used for running its experiments. |
| Software Dependencies | No | The paper mentions running the methods with its own implementation and using a Gaussian kernel, but does not name any software with version numbers — no specific libraries, programming languages, or frameworks such as Python, PyTorch, or TensorFlow. |
| Experiment Setup | Yes | We choose the regularization parameter λ in {2^(-j)}_{j=1}^{10} using the validation set and show the average test loss on the test sets in Table 1 of the model with the best λ. We use a Gaussian kernel and perform 50 passes on the data, and set the number of iterations of Algorithm 2 to 20 (and to 10 times the sequence length for sequence prediction). |
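The evaluation protocol quoted in the Dataset Splits and Experiment Setup rows can be sketched as follows. This is a minimal illustration, not the authors' code: `train_model` and `test_loss` are hypothetical stand-ins for the actual M4N training routine and task loss, and the 60/20/20 split over 14 random permutations with λ selected on the validation set follows the quoted text.

```python
import numpy as np

def evaluate_protocol(X, y, train_model, test_loss, n_splits=14, seed=0):
    """Sketch of the reported protocol: for each of 14 random
    60%/20%/20% train/validation/test splits, pick the regularization
    parameter lambda on the validation set, then report the average
    test loss of the best-lambda model."""
    lambdas = [2.0 ** -j for j in range(1, 11)]  # the grid {2^(-j)}_{j=1}^{10}
    rng = np.random.default_rng(seed)
    n = len(X)
    losses = []
    for _ in range(n_splits):
        perm = rng.permutation(n)
        n_tr, n_val = int(0.6 * n), int(0.2 * n)
        tr = perm[:n_tr]
        val = perm[n_tr:n_tr + n_val]
        te = perm[n_tr + n_val:]
        # Select lambda by validation loss, then evaluate on the test fold.
        best = min(
            lambdas,
            key=lambda lam: test_loss(train_model(X[tr], y[tr], lam), X[val], y[val]),
        )
        model = train_model(X[tr], y[tr], best)
        losses.append(test_loss(model, X[te], y[te]))
    return float(np.mean(losses))
```

Averaging over the 14 splits (rather than a single split) is what allows the paper to report the standard deviations alongside the mean test losses in Table 1.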