Geometric Losses for Distributional Learning

Authors: Arthur Mensch, Mathieu Blondel, Gabriel Peyré

ICML 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We study the theoretical properties of our loss and showcase its effectiveness on two applications: ordinal regression and drawing generation. We present two experiments that demonstrate the validity and usability of the geometric softmax in practical use cases."
Researcher Affiliation | Collaboration | 1) École Normale Supérieure, DMA, Paris, France; 2) CNRS, France; 3) NTT Communication Science Laboratories, Kyoto, Japan.
Pseudocode | No | No pseudocode or algorithm blocks are explicitly labeled or presented in the paper.
Open Source Code | Yes | "We provide a PyTorch package for reusing the discrete geometric softmax layer." (github.com/arthurmensch/g-softmax)
Open Datasets | Yes | "We use the real-world ordinal datasets provided by Gutierrez et al. (2016), using their predefined 30 cross-validation folds. We train variational auto-encoders on these datasets using, as output layers, (1) the KL divergence with normalized output and (2) our geometric loss with normalized output. These approaches output an image prediction using a softmax/g-softmax over all pixels, which is justified when we seek to output a concentrated distributional output. This is the case for doodles and digits, which can be seen as 1D distributions in a 2D space. It differs from the more common approach that uses a binary cross-entropy loss for every pixel, and it enables capturing interactions between pixels at the feature-extraction level. We use a standard KL penalty on the latent space distribution." Datasets: Google QuickDraw (Ha & Eck, 2018) and MNIST. (See the pixel-softmax sketch after this table.)
Dataset Splits | Yes | "We use the real-world ordinal datasets provided by Gutierrez et al. (2016), using their predefined 30 cross-validation folds. During training, we measure the geometric cross-entropy loss and the Hausdorff divergence on the train and validation set." (A fold-loop sketch follows the table.)
Hardware Specification | No | No specific hardware details (e.g., GPU models, CPU types, or memory) used for the experiments are provided in the paper.
Software Dependencies | No | The paper mentions a PyTorch package for the geometric softmax layer, but it does not specify version numbers for PyTorch or any other software dependency.
Experiment Setup | Yes | "We use a cross-validated ℓ2 penalty term on the linear score model gθ. We use a standard KL penalty on the latent space distribution. Using the g-softmax takes into account a cost between pixels (i, j) and (k, l), which we set to be the Euclidean cost C/σ, where C is the squared ℓ2 cost and σ is the typical distance of interaction; we choose σ = 2 in our experiments. Experimental details are reported in Appendix B." (See the cost-matrix sketch below.)
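
The distributional output layer quoted under Open Datasets is easy to illustrate. The following sketch is our own illustration, not the authors' code: it normalizes decoder logits with a softmax over all pixels, so each MNIST digit or QuickDraw doodle is treated as a probability distribution over the 2D pixel grid, and it trains against the cross-entropy form of the KL divergence (equivalent up to the target's constant entropy). The names pixel_softmax and kl_pixel_loss are hypothetical.

    import torch
    import torch.nn.functional as F

    def pixel_softmax(logits):
        # Normalize (batch, H, W) logits into a distribution over the
        # H*W pixel grid: entries are positive and sum to 1 per image.
        b, h, w = logits.shape
        return F.softmax(logits.view(b, -1), dim=1).view(b, h, w)

    def kl_pixel_loss(logits, target):
        # Cross-entropy -sum_i t_i log p_i, which equals
        # KL(target || prediction) up to the parameter-free entropy of
        # the target; `target` images are renormalized to unit mass.
        b = logits.shape[0]
        log_pred = F.log_softmax(logits.view(b, -1), dim=1)
        t = target.view(b, -1)
        t = t / t.sum(dim=1, keepdim=True)
        return -(t * log_pred).sum(dim=1).mean()

    # Dummy MNIST-sized data:
    loss = kl_pixel_loss(torch.randn(8, 28, 28), torch.rand(8, 28, 28))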
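The Dataset Splits row states that the 30 predefined folds of Gutierrez et al. (2016) are reused rather than re-sampled. A minimal fold loop under that protocol might look as follows; load_fold and fit_and_score are hypothetical placeholders for the dataset reader and the model under evaluation.

    import numpy as np

    def run_predefined_folds(load_fold, fit_and_score, n_folds=30):
        # Evaluate a model on each predefined train/test fold and
        # summarize the test metric across folds.
        scores = []
        for k in range(n_folds):
            (X_train, y_train), (X_test, y_test) = load_fold(k)
            scores.append(fit_and_score(X_train, y_train, X_test, y_test))
        scores = np.asarray(scores)
        return scores.mean(), scores.std()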
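Finally, the pixel-interaction cost from the Experiment Setup row can be reconstructed directly from the quoted description. This sketch assumes the straightforward reading (squared ℓ2 distance between grid coordinates, divided by σ = 2); pixel_cost_matrix is a hypothetical helper name.

    import torch

    def pixel_cost_matrix(h, w, sigma=2.0):
        # Cost between pixels (i, j) and (k, l): the squared l2
        # distance ((i - k)^2 + (j - l)^2) divided by the interaction
        # scale sigma, as an (h*w, h*w) matrix in flattened grid order.
        ys, xs = torch.meshgrid(
            torch.arange(h, dtype=torch.float32),
            torch.arange(w, dtype=torch.float32),
            indexing="ij",
        )
        coords = torch.stack([ys.reshape(-1), xs.reshape(-1)], dim=1)
        return torch.cdist(coords, coords, p=2) ** 2 / sigma

    C = pixel_cost_matrix(28, 28)  # MNIST-sized grid, sigma = 2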