Efficient Proximal Mapping of the 1-path-norm of Shallow Networks
Authors: Fabian Latorre, Paul Rolland, Nadav Hallak, Volkan Cevher
ICML 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In section 7, we present numerical evidence that our approach (i) converges faster and to lower values of the objective function, compared to plain SGD; (ii) generates sparse iterates; and, (iii) the magnitude of the regularization parameter of the 1-path-norm allows a better accuracy-robustness trade-off than the common ℓ1 regularization or constraints on layer-wise matrix norms. Our benchmarks are the MNIST (LeCun & Cortes, 2010), Fashion-MNIST (Xiao et al., 2017) and Kuzushiji-MNIST (Clanuwat et al., 2018). (A sketch of the 1-path-norm computation follows the table.) |
| Researcher Affiliation | Academia | 1Laboratory for Information and Inference Systems (LIONS), EPFL, Switzerland. Correspondence to: Fabian Latorre <fabian.latorre@epfl.ch>. |
| Pseudocode | Yes | Algorithm 1 Prox-Grad Method, Algorithm 2 Single-output robust-sparse proximal mapping, Algorithm 3 Multi-output robust-sparse proximal mapping. (A generic proximal-gradient step is sketched after the table.) |
| Open Source Code | No | The paper mentions using PyTorch and TensorFlow, but does not provide any statement or link indicating that the source code for their specific methodology is openly available or will be released. |
| Open Datasets | Yes | Our benchmarks are the MNIST (LeCun & Cortes, 2010), Fashion-MNIST (Xiao et al., 2017) and Kuzushiji-MNIST (Clanuwat et al., 2018). (A dataset-loading sketch follows the table.) |
| Dataset Splits | No | The paper mentions training and testing but does not provide details about a validation set or specific train/validation/test split percentages; it only explicitly mentions a 'test set'. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU/CPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions the use of 'PyTorch (Paszke et al., 2019) or TensorFlow (Abadi et al., 2015)' but does not provide specific version numbers for these or any other software dependencies required for reproducibility. |
| Experiment Setup | Yes | For a wide range of learning rates, number of hidden neurons and regularization parameters λ, we train networks with SGD and Proximal-SGD (with constant learning rate). We do so for 20 epochs and with batch size set to 100. (See the training-loop sketch after the table.) |
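As a point of reference for the "Research Type" row, below is a minimal sketch of how the 1-path-norm of a shallow (one-hidden-layer) network could be computed in PyTorch. It assumes the standard definition that sums the products of absolute weights along every input-hidden-output path; the paper's exact formulation (e.g., its treatment of bias terms) may differ.

```python
import torch

def one_path_norm(W: torch.Tensor, V: torch.Tensor) -> torch.Tensor:
    """1-path-norm of a shallow network x -> V @ sigma(W @ x).

    Sums |V[k, j]| * |W[j, i]| over all input->hidden->output paths,
    which equals the entrywise l1 norm of |V| @ |W|. (Bias terms, if
    the paper includes them, are not handled here.)
    """
    return (V.abs() @ W.abs()).sum()

# Example with MNIST-sized inputs, 100 hidden units, 10 classes.
W = torch.randn(100, 784)   # input -> hidden weights
V = torch.randn(10, 100)    # hidden -> output weights
print(one_path_norm(W, V))
```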
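The "Pseudocode" and "Experiment Setup" rows refer to the paper's Prox-Grad method and its reported settings (constant learning rate, 20 epochs, batch size 100). The sketch below only shows the generic structure of such a proximal-SGD loop: the elementwise soft-thresholding used as the prox step is a placeholder, since the paper's Algorithms 2-3 compute the exact proximal mapping of the 1-path-norm, which is not reproduced here. The learning rate, λ, and hidden width are illustrative values, not the paper's.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def soft_threshold(p: torch.Tensor, thr: float) -> torch.Tensor:
    # Placeholder prox: elementwise l1 soft-thresholding. The paper's
    # Algorithms 2-3 would replace this with the exact 1-path-norm prox.
    return p.sign() * (p.abs() - thr).clamp(min=0.0)

def train_prox_sgd(model, loader, epochs=20, lr=0.1, lam=1e-4):
    # Generic proximal-SGD loop: stochastic gradient step on the data
    # loss, followed by a proximal step on every parameter tensor.
    model.train()
    for _ in range(epochs):
        for x, y in loader:
            loss = F.cross_entropy(model(x.view(x.size(0), -1)), y)
            model.zero_grad()
            loss.backward()
            with torch.no_grad():
                for p in model.parameters():
                    p -= lr * p.grad                      # gradient step
                    p.copy_(soft_threshold(p, lr * lam))  # prox step (placeholder)

# Shallow network of the kind studied in the paper (hidden width is a guess).
model = nn.Sequential(nn.Linear(784, 100), nn.ReLU(), nn.Linear(100, 10))
```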
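For the "Open Datasets" row, the three benchmarks are all available through torchvision; the loaders below are an illustrative assumption (the paper does not state how the data were loaded or preprocessed), using the batch size of 100 reported in the experiment setup.

```python
import torch
from torchvision import datasets, transforms

# Illustrative loaders for the three benchmark datasets; preprocessing
# beyond ToTensor() is an assumption, not taken from the paper.
tfm = transforms.ToTensor()
train_sets = {
    "MNIST": datasets.MNIST("data", train=True, download=True, transform=tfm),
    "Fashion-MNIST": datasets.FashionMNIST("data", train=True, download=True, transform=tfm),
    "Kuzushiji-MNIST": datasets.KMNIST("data", train=True, download=True, transform=tfm),
}
train_loaders = {
    name: torch.utils.data.DataLoader(ds, batch_size=100, shuffle=True)
    for name, ds in train_sets.items()
}
```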