Improved Error Bounds for Tree Representations of Metric Spaces

Authors: Samir Chowdhury, Facundo Mémoli, Zane T. Smith

NeurIPS 2016 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Theoretical We prove novel additive distortion bounds for two methods of tree embeddings: one into general trees, and one into ultrametric trees. These additive distortion bounds take into account (1) whether the data is treelike, and (2) whether the data has low doubling dimension, which is a measure of its intrinsic dimension. Thus we prove an answer to Question 1 above, namely, that the approximation error made by an optimal tree metric (or optimal ultrametric) can be bounded nontrivially.
Researcher Affiliation Academia Samir Chowdhury Department of Mathematics The Ohio State University Columbus, OH 43210 chowdhury.57@osu.edu Facundo Mémoli Department of Mathematics Department of Computer Science and Engineering The Ohio State University Columbus, OH 43210 memoli@math.osu.edu Zane Smith Department of Computer Science and Engineering The Ohio State University Columbus, OH 43210 smith.9911@osu.edu
Pseudocode No The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code Yes The supplementary material contains (1) an appendix with proofs omitted from the body, (2) a practical demonstration in A where we apply Gromov s embedding to a bitmap image of a tree and show that our upper bounds perform better than the bounds suggested by Gromov s embedding theorem, and (3) Matlab .m files containing demos of Gromov s embedding being applied to various images of trees.
Open Datasets No The paper primarily deals with theoretical bounds and mathematical proofs. While it mentions 'various real-world data sets, such as Internet latencies and biological, social, and collaboration networks' as examples for which their theory is relevant, it does not specify any particular publicly available dataset used for training or evaluation in the main body. The practical demonstration in the appendix uses 'a bitmap image of a tree' which is a type of data, but not a formally identified public dataset with access information.
Dataset Splits No The paper is theoretical and does not describe experiments that would involve training, validation, or test dataset splits.
Hardware Specification No The paper is theoretical and does not provide details on specific hardware used for any computations or demonstrations.
Software Dependencies No The paper mentions 'Matlab .m files' in the supplementary material, but it does not specify a version number for Matlab or any other software dependency.
Experiment Setup No The paper is theoretical and does not provide details about an experimental setup, hyperparameters, or system-level training settings.