Minimizing $f$-Divergences by Interpolating Velocity Fields
Authors: Song Liu, Jiahao Yu, Jack Simons, Mingxuan Yi, Mark Beaumont
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate their effectiveness using novel applications on domain adaptation and missing data imputation. |
| Researcher Affiliation | Academia | University of Bristol, Bristol, UK. Correspondence to: Song Liu <song.liu@bristol.ac.uk>. |
| Pseudocode | Yes | Algorithm 1 Searching for a Density Ratio Preserving Map s |
| Open Source Code | Yes | The code for reproducing our results can be found at https://github.com/anewgithubname/gradest2. |
| Open Datasets | Yes | We demonstrate our approach in a toy example in Figure 3. Table 2 compares the performance of adapted classifiers on a real-world 10-class classification dataset named office-caltech-10, where images of the same objects are taken from four different domains (amazon, caltech, dslr and webcam). ... In the second experiment, we test the performance of our algorithm on a real-world Breast Cancer classification dataset (Zwitter and Soklic, 1988) in Figure 5. ... Available at https://archive.ics.uci.edu/ml/machine-learning-databases/breast-cancer-wisconsin/wdbc.data. ... In this experiment, we first expand MNIST digits into 32×32 pictures and then add a small random noise. |
| Dataset Splits | No | The paper mentions "We construct training and testing sets using cross validation" but does not provide specific details on the split percentages, sample counts, or the exact cross-validation setup for reproducibility. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU/CPU models, memory) used to run its experiments. |
| Software Dependencies | No | The paper mentions using "the PyTorch torch.clamp function" and "Adam (Kingma and Ba, 2015)", but it does not specify version numbers for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | For all methods, we use naive gradient descent to update particles with a fixed step size 0.1. ... S-shape data: T_WGF = 100, T_GradEst = 2000, σ is chosen by model selection described in Section I.1. UCI Breast Cancer data: T_WGF = 1000, T_GradEst = 100, σ = median(√(pairwise distances of X)) ... we use a kernel bandwidth equal to 1/5 of the pairwise distances in the particle dataset, as it is too computationally expensive to perform cross-validation at each iteration. After each update, we clip pixel values so that they are within [0, 1]. ... at each iteration, we randomly select 4000 samples from the original dataset and 4000 particles from the particle set. ... we set the learning rates for both WGF and feature-space WGF to 0.1, and the kernel bandwidth σ in our local estimators is tuned using cross-validation with a candidate set ranging from 0.1 to 2. |
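The particle-update recipe quoted in the Experiment Setup row above (naive gradient descent with a fixed step size 0.1, minibatches of 4000 samples and 4000 particles, a median-heuristic bandwidth scaled by 1/5, and per-step pixel clipping via torch.clamp) could be sketched as follows. This is a minimal illustration, not the authors' implementation: `velocity_field` is a hypothetical placeholder for the paper's interpolated velocity-field estimator, and the exact way the median heuristic combines particles and data is assumed.

```python
import torch

def median_bandwidth(x):
    # Median heuristic: sigma from the median of nonzero pairwise distances.
    # (The quoted setup uses the median pairwise distance of X.)
    d = torch.cdist(x, x)
    return d[d > 0].median()

def update_particles(particles, data, velocity_field, n_steps,
                     step_size=0.1, batch_size=4000, clip_pixels=False):
    """Naive gradient-descent particle update with a fixed step size.

    `velocity_field(batch_particles, batch_data, sigma)` is a hypothetical
    stand-in for the paper's interpolated velocity-field estimator; it is
    assumed to return one velocity vector per particle in the batch.
    """
    for _ in range(n_steps):
        # Subsample 4000 data points and 4000 particles per iteration,
        # as described for the image experiment.
        idx_p = torch.randperm(particles.shape[0])[:batch_size]
        idx_d = torch.randperm(data.shape[0])[:batch_size]
        batch = particles[idx_p]
        # Bandwidth set to 1/5 of the pairwise-distance scale in the
        # particle set, since per-iteration cross-validation is too costly.
        sigma = median_bandwidth(batch) / 5
        v = velocity_field(batch, data[idx_d], sigma)
        particles[idx_p] = batch + step_size * v
        if clip_pixels:
            # Keep pixel values in [0, 1] after each update, as the paper
            # does with torch.clamp.
            particles[idx_p] = torch.clamp(particles[idx_p], 0.0, 1.0)
    return particles
```

The fixed step size and minibatch size mirror the quoted values; anything else (function names, the clamp toggle) is illustrative scaffolding.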
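Similarly, the MNIST preprocessing quoted in the Open Datasets row (expanding digits to 32×32 and adding a small random noise) might look like the sketch below. The centered zero-padding and the `noise_std` value are assumptions; the paper only says the pictures are expanded to 32×32 and the noise is small.

```python
import torch
import torch.nn.functional as F

def expand_mnist(images, noise_std=0.01):
    """Pad 28x28 MNIST digits to 32x32 and add small random noise.

    `images` is expected as a (N, 1, 28, 28) tensor with values in [0, 1].
    `noise_std` is a hypothetical choice, not a value from the paper.
    """
    x = F.pad(images, (2, 2, 2, 2))           # 28x28 -> 32x32, zero padding
    x = x + noise_std * torch.randn_like(x)   # small Gaussian perturbation
    return torch.clamp(x, 0.0, 1.0)           # keep pixels in [0, 1]
```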