Continuous Regularized Wasserstein Barycenters
Authors: Lingxiao Li, Aude Genevay, Mikhail Yurochkin, Justin M. Solomon
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the effectiveness of our approach and compare against previous work on synthetic examples and real-world applications. |
| Researcher Affiliation | Collaboration | Lingxiao Li MIT CSAIL lingxiao@mit.edu Aude Genevay MIT CSAIL aude.genevay@gmail.com Mikhail Yurochkin IBM Research, MIT-IBM Watson AI Lab mikhail.yurochkin@ibm.com Justin Solomon MIT CSAIL, MIT-IBM Watson AI Lab jsolomon@mit.edu |
| Pseudocode | Yes | Algorithm 1: Stochastic gradient descent to solve the regularized barycenter problem (11) |
| Open Source Code | Yes | The source code is publicly available at https://github.com/lingxiaoli94/CWB. |
| Open Datasets | Yes | We consider Poisson regression for the task of predicting the hourly number of bike rentals using features such as the day of the week and weather conditions.3 (Footnote 3: http://archive.ics.uci.edu/ml/datasets/Bike+Sharing+Dataset) |
| Dataset Splits | No | The paper describes splitting the data into 5 equally-sized subsets for posterior aggregation, but does not explicitly mention a 'validation set' or a dedicated validation split in the context of model training or evaluation. |
| Hardware Specification | Yes | We ran our experiments using an NVIDIA Tesla V100 GPU on a Google Cloud instance with 12 compute-optimized CPUs and 64GB memory. |
| Software Dependencies | Yes | The stochastic gradient descent used to solve (11) and (14) is implemented in TensorFlow 2.1 [Aba+16]. |
| Experiment Setup | Yes | In all experiments below, we use the Adam optimizer [KB14] with learning rate 10^-4 and batch size 4096 or 8192 for training. The dual potentials {f_i, g_i}_{i=1}^n in (11) are each parameterized as neural networks with two fully-connected layers (d → 128 → 256 → 1) using ReLU activations. Every T_i in (14) is parameterized with layers (d → 128 → 256 → d). |
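The network shape reported in the setup row (fully-connected layers d → 128 → 256 → 1 with ReLU activations, batch size 4096) can be sketched as a plain forward pass. This is a minimal NumPy illustration of that architecture only, not the authors' TensorFlow implementation; the initialization scheme and function names here are assumptions for the sketch.

```python
import numpy as np

def init_mlp(d, sizes=(128, 256, 1), seed=0):
    """Initialize a fully-connected network d -> 128 -> 256 -> 1.

    He-style initialization is an assumption; the paper does not
    state which scheme was used.
    """
    rng = np.random.default_rng(seed)
    dims = (d,) + sizes
    return [(rng.normal(0.0, np.sqrt(2.0 / m), (m, n)), np.zeros(n))
            for m, n in zip(dims[:-1], dims[1:])]

def mlp_forward(params, x):
    """Forward pass: ReLU on hidden layers, linear scalar output."""
    h = x
    for W, b in params[:-1]:
        h = np.maximum(h @ W + b, 0.0)  # ReLU activation, as in the paper
    W, b = params[-1]
    return h @ W + b  # one potential value per input sample

# Example: a 2-D problem with the batch size quoted in the setup row.
d = 2
params = init_mlp(d)
batch = np.random.default_rng(1).normal(size=(4096, d))
out = mlp_forward(params, batch)
print(out.shape)  # → (4096, 1)
```

In the paper each dual potential f_i, g_i would be one such network trained with Adam; the maps T_i use the same hidden widths but output d values instead of 1.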