Learning Ultrametric Trees for Optimal Transport Regression
Authors: Samantha Chen, Puoya Tabaghi, Yusu Wang
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 4 Experimental Results: We use PyTorch to implement our tree optimization algorithm, denoted UltTree. We compare the performance of our method with Sinkhorn distance (Cuturi 2013), Flowtree (Backurs et al. 2019), Quadtree (Indyk and Thaper 2003), weight-optimized cluster tree Wasserstein distance (cTWD), weight-optimized Quadtree Wasserstein distance (qTWD), and their sliced variants (Yamada et al. 2022). All reported results for Sinkhorn distance are computed with the Python Optimal Transport (Flamary et al. 2021) library with regularization parameter λ = 1.0. (Hedged sketches of the Sinkhorn baseline and of tree-Wasserstein distance follow this table.) |
| Researcher Affiliation | Academia | ¹Department of Computer Science and Engineering, University of California, San Diego; ²Halıcıoğlu Data Science Institute, University of California, San Diego |
| Pseudocode | Yes | Algorithm 1: Ultrametric tree optimization procedure. Input: discrete set X, distance matrix D on X, learning rate α, maximum number of iterations t_max, training samples S, minimum-spanning-tree algorithm MST. Output: an ultrametric d_u and associated tree T_X^u. 1: D_u^(0) = proj(D); 2: T_X^(0) = MST(D) (computed simultaneously with proj); 3: let k = 0; 4: while k < t_max do; 5: compute C(d_u) for training samples S; 6: given i, j ∈ [N] and i', j' ∈ [N] such that LCA(i, j) = LCA(i', j'), set D̂_{i',j'} ← (D_u^(k) − α∇C(D_u^(k)))_{i',j'}; 7: for all i, j ∈ [N], D̂_{i,j} ← ½(2D̂_{i,j} − D̂_{i,i} − D̂_{j,j}); 8: D_u^(k+1) = proj(D̂); 9: T_X^(k+1) = MST(D̂) (computed simultaneously with proj); 10: end while; 11: return D_u^(t_max), T_X^(t_max). (A runnable sketch of the proj step follows this table.) |
| Open Source Code | Yes | All code is publicly available at github.com/chens5/treelearning. |
| Open Datasets | Yes | We compare 1-Wasserstein distance approximations on the Twitter and BBCSport word datasets (Huang et al. 2016), where the training data consists of word-frequency distributions per document. We also use three graph datasets: (1) USCA312, (2) USAir97 (Rossi and Ahmed 2015), and (3) the Belfast public transit graph (Kujala et al. 2018). Finally, we include a high-dimensional RNAseq dataset (publicly available from the Allen Institute), which consists of vectors in R^2000 (Yao et al. 2021). |
| Dataset Splits | No | The paper mentions 'training data' and a 'training' process but does not specify how the datasets were split into training, validation, and test sets; no split percentages or sample counts are given. |
| Hardware Specification | No | The paper does not provide specific hardware details (like GPU/CPU models, memory, or cloud instances) used for running its experiments. |
| Software Dependencies | No | The paper mentions PyTorch and Python Optimal Transport but does not specify version numbers for these or any other software dependencies. |
| Experiment Setup | No | While Algorithm 1 lists a learning rate α and a maximum number of iterations t_max as inputs, their specific values for the experiments are not provided in the main text. The paper lacks concrete hyperparameter values and detailed training configurations for the proposed method. |
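The Sinkhorn baseline above is reported as computed with the Python Optimal Transport (POT) library at regularization λ = 1.0. The following is a minimal sketch of that computation; the histograms `a` and `b`, the support points, and the Euclidean ground cost are illustrative placeholders, not the paper's data.

```python
# Hedged sketch of the Sinkhorn baseline: entropic-regularized OT via POT
# with regularization reg = 1.0, as reported in the paper. The inputs below
# are synthetic placeholders, not the paper's datasets.
import numpy as np
import ot  # Python Optimal Transport (Flamary et al. 2021)

rng = np.random.default_rng(0)
n = 50
a = rng.random(n); a /= a.sum()            # source distribution (e.g., word frequencies)
b = rng.random(n); b /= b.sum()            # target distribution
pts = rng.random((n, 2))                   # placeholder support points
M = ot.dist(pts, pts, metric="euclidean")  # ground cost matrix

cost = ot.sinkhorn2(a, b, M, reg=1.0)      # regularized transport cost (scalar)
print(float(cost))
```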
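Quadtree, Flowtree, cTWD, and qTWD all exploit the closed form of the 1-Wasserstein distance on a tree: for a rooted tree with edge weights w_e, W_T(μ, ν) = Σ_e w_e · |μ(Γ_e) − ν(Γ_e)|, where Γ_e is the set of nodes below edge e. The sketch below evaluates this sum on a toy tree encoded with parent pointers; the encoding and example are our own illustration, not taken from the paper.

```python
# Hedged sketch: closed-form tree-Wasserstein distance,
# W_T(mu, nu) = sum_e w_e * |mu(subtree_e) - nu(subtree_e)|.
from typing import Dict

def tree_wasserstein(parent: Dict[int, int], weight: Dict[int, float],
                     mu: Dict[int, float], nu: Dict[int, float]) -> float:
    """parent[v]: v's parent (root omitted); weight[v]: length of edge
    (parent[v], v); mu, nu: distributions over nodes. Assumes node ids
    increase with depth so children sort after their parents."""
    all_nodes = set(parent) | set(parent.values())
    # diff[v] accumulates mu(subtree of v) - nu(subtree of v)
    diff = {v: mu.get(v, 0.0) - nu.get(v, 0.0) for v in all_nodes}
    cost = 0.0
    for v in sorted(parent, reverse=True):  # visit children before parents
        cost += weight[v] * abs(diff[v])    # mass crossing edge (parent[v], v)
        diff[parent[v]] += diff[v]          # push subtree mass upward
    return cost

# Toy tree: root 0 with children 1, 2; node 1 has leaves 3 and 4.
parent = {1: 0, 2: 0, 3: 1, 4: 1}
weight = {1: 1.0, 2: 1.0, 3: 0.5, 4: 0.5}
print(tree_wasserstein(parent, weight, {3: 1.0}, {4: 1.0}))  # 1.0: path 3 -> 1 -> 4
```

Evaluating this sum is linear in the number of tree nodes, which is why tree-based methods scale so much better than exact OT.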
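Algorithm 1 alternates gradient steps on the distance matrix with a projection proj back onto the set of ultrametrics, producing the MST as a by-product. One standard realization of such a projection is the subdominant (maximal below) ultrametric, which coincides with the cophenetic distances of single-linkage clustering; the SciPy sketch below uses that construction and is our reading of the step, not necessarily the authors' exact implementation.

```python
# Hedged sketch of a proj step: project a distance matrix onto the
# subdominant ultrametric via single-linkage (MST-based) clustering.
import numpy as np
from scipy.cluster.hierarchy import linkage, cophenet
from scipy.spatial.distance import squareform

def subdominant_ultrametric(D: np.ndarray) -> np.ndarray:
    """Return the maximal ultrametric below the symmetric distance matrix D
    (the single-linkage cophenetic distances)."""
    condensed = squareform(D, checks=False)  # square -> condensed vector
    Z = linkage(condensed, method="single")  # dendrogram built from the MST
    return squareform(cophenet(Z))           # cophenetic distances, square form

D = np.array([[0.0, 1.0, 4.0],
              [1.0, 0.0, 2.0],
              [4.0, 2.0, 0.0]])
Du = subdominant_ultrametric(D)
print(Du)  # every entry satisfies Du[i, j] <= max(Du[i, k], Du[k, j])
```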