Learning Randomly Perturbed Structured Predictors for Direct Loss Minimization
Authors: Hedda Cohen Indelman, Tamir Hazan
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate empirically the effectiveness of learning this balance in structured discrete spaces. |
| Researcher Affiliation | Academia | Technion. Correspondence to: Hedda Cohen Indelman <cohen.hedda@campus.technion.ac.il>. |
| Pseudocode | No | The paper describes its methods using text and mathematical equations, and includes architectural diagrams, but it does not contain any explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code may be found in https://github.com/HeddaCohenIndelman/PerturbedStructuredPredictorsDirect. |
| Open Datasets | Yes | We report the classification accuracies on the standard test sets in Table 3. For MNIST and Fashion-MNIST, our method matched or outperformed NeuralSort (Grover et al., 2019) and RelaxSubSample (Xie and Ermon, 2019)... For CIFAR-10, our method outperformed NeuralSort and RelaxSubSample... |
| Dataset Splits | No | The paper mentions a 'training set' and 'test set' for the bipartite matching experiment ('the training set consists of 10 random sequences of length d and a test set that consists of a single sequence of the same length d'). For the k-NN experiments on MNIST, Fashion-MNIST, and CIFAR-10, it refers to 'standard test sets', but does not explicitly provide specific train/validation/test splits or percentages. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory, or cloud instances) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions, or specific solver versions) used in the experiments. |
| Experiment Setup | Yes | In all direct loss based experiments we set a negative ϵ. The network µ has a first fully connected layer that links the sets of samples to an intermediate representation (with 32 neurons), and a second (fully connected) layer that turns those representations into batches of latent permutation matrices of dimension d by d each. ... The network σ has a single layer connecting input sample sequences to a single output, which is then activated by a softplus activation. ... to perform 20 Sinkhorn iterations and 10 different reconstructions for each batch sample. |
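The Experiment Setup row describes two small networks and a Sinkhorn normalization step. Below is a minimal PyTorch sketch of that description, not the authors' implementation: the layer sizes, the 20 Sinkhorn iterations, and the softplus-activated scalar output of σ follow the quoted text, while the ReLU between the two layers of µ and the class and function names (`MuNetwork`, `SigmaNetwork`, `sinkhorn`) are assumptions made for illustration.

```python
import torch
import torch.nn as nn


def sinkhorn(log_alpha, n_iters=20):
    """Alternately normalize rows and columns in log space
    (20 iterations, as stated in the experiment setup)."""
    for _ in range(n_iters):
        log_alpha = log_alpha - torch.logsumexp(log_alpha, dim=-1, keepdim=True)  # rows
        log_alpha = log_alpha - torch.logsumexp(log_alpha, dim=-2, keepdim=True)  # columns
    return torch.exp(log_alpha)


class MuNetwork(nn.Module):
    """Maps a batch of length-d sample sequences to latent d x d permutation scores:
    a first fully connected layer with 32 hidden units, followed by a second fully
    connected layer producing d*d scores reshaped to a d x d matrix."""

    def __init__(self, d, hidden=32):
        super().__init__()
        self.d = d
        self.fc1 = nn.Linear(d, hidden)
        self.fc2 = nn.Linear(hidden, d * d)

    def forward(self, x):  # x: (batch, d)
        h = torch.relu(self.fc1(x))  # hidden activation is an assumption
        scores = self.fc2(h).view(-1, self.d, self.d)
        return sinkhorn(scores)  # soft latent permutation matrices


class SigmaNetwork(nn.Module):
    """Single layer mapping an input sequence to one scalar,
    activated by softplus (the learned noise scale)."""

    def __init__(self, d):
        super().__init__()
        self.fc = nn.Linear(d, 1)

    def forward(self, x):  # x: (batch, d)
        return nn.functional.softplus(self.fc(x))


if __name__ == "__main__":
    # Hypothetical usage: a batch of 4 sequences of length d = 8.
    x = torch.randn(4, 8)
    perms = MuNetwork(d=8)(x)      # (4, 8, 8) doubly-stochastic matrices
    noise_scale = SigmaNetwork(d=8)(x)  # (4, 1) positive scalars
    print(perms.shape, noise_scale.shape)
```

This sketch only mirrors the architecture quoted in the table; the perturbation, direct-loss gradient estimation, and the "10 different reconstructions for each batch sample" are not reproduced here.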