Dimensionality reduction: theoretical perspective on practical measures

Authors: Yair Bartal, Nova Fandina, Ofer Neiman

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | All our theoretical results are backed by empirical experiments. We validate our theoretical findings experimentally on various randomly generated Euclidean and non-Euclidean metric spaces, in Section 6.
Researcher Affiliation | Academia | Yair Bartal, Department of Computer Science, Hebrew University of Jerusalem, Jerusalem, Israel, yair@cs.huji.ac.il; Nova Fandina, Department of Computer Science, Hebrew University of Jerusalem, Jerusalem, Israel, fandina@cs.huji.ac.il; Ofer Neiman, Department of Computer Science, Ben Gurion University of the Negev, Beer-Sheva, Israel, neimano@cs.bgu.ac.il
Pseudocode | No | The paper describes algorithms conceptually but does not include any pseudocode or clearly labeled algorithm blocks.
Open Source Code | No | The paper does not provide a direct link to open-source code for the methodology described, nor does it explicitly state that code is provided in supplementary materials.
Open Datasets | No | The paper states: "We validate our theoretical findings experimentally on various randomly generated Euclidean and non-Euclidean metric spaces, in Section 6." and describes a generation process: "A random Euclidean space X of a fixed size n and dimension d = n = 800 was embedded..." and "The construction of the space is as follows: first, a sampled Euclidean space X, of size and dimension n = d = 100, is generated as above; second, the interpoint distances of X are distorted with a noise factor 1 + ϵ...". The paper does not provide access information (link, DOI, citation) for these generated datasets to be publicly available.
Dataset Splits | No | The paper mentions running experiments on generated data but does not specify any training, validation, or test dataset splits. It describes parameters for generating data like n, d, k, q, but not how these generated datasets are partitioned for different phases of evaluation.
Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments (e.g., CPU/GPU models, memory specifications).
Software Dependencies | No | The paper mentions methods like JL, PCA, Isomap, but does not provide specific software dependencies with version numbers (e.g., library names with their versions).
Experiment Setup | Yes | A random Euclidean space X of a fixed size n and dimension d = n = 800 was embedded into k ∈ [4, 30] dimensions with q = 5, by the JL/PCA/Isomap methods. We stress that we run many more experiments for a wide range of parameter values of n ∈ [100, 3000], k ∈ [2, 100], q ∈ [1, 10]. In the experiment shown in Fig. 2b, tests are shown for embedding dimension k = 20 and q = 2. The construction of the space is as follows: first, a sampled Euclidean space X, of size and dimension n = d = 100, is generated as above; second, the interpoint distances of X are distorted with a noise factor 1 + ϵ, with ϵ ∼ N(0, δ), for δ < 1.
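
Because the paper ships no code, the setup quoted in the Experiment Setup entry can only be approximated. The following is a minimal sketch, not the authors' implementation. It rests on several assumptions that the summary does not confirm: points are sampled with i.i.d. standard Gaussian coordinates, JL is realized as a Gaussian random projection scaled by 1/sqrt(k), δ is treated as the standard deviation of the noise, and q is read as the moment order of an average-distortion measure. All function names are hypothetical, and the sizes are kept far below the quoted n = d = 800 so the sketch runs quickly in pure Python.

```python
import math
import random


def random_euclidean_space(n, d, rng):
    # Assumption: i.i.d. standard Gaussian coordinates; the summary does not
    # say how the paper samples its points.
    return [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]


def pairwise_distances(points):
    # Symmetric matrix of Euclidean interpoint distances with zero diagonal.
    n = len(points)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist[i][j] = dist[j][i] = math.dist(points[i], points[j])
    return dist


def add_noise(dist, delta, rng):
    # Multiply each interpoint distance by 1 + eps with eps ~ N(0, delta).
    # Assumption: delta is used as the standard deviation. The result need
    # not satisfy the triangle inequality, i.e. it may be non-Euclidean.
    n = len(dist)
    noisy = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            factor = 1.0 + rng.gauss(0.0, delta)
            noisy[i][j] = noisy[j][i] = dist[i][j] * factor
    return noisy


def jl_embed(points, k, rng):
    # Gaussian random projection to k dimensions (one common JL variant).
    d = len(points[0])
    proj = [[rng.gauss(0.0, 1.0) / math.sqrt(k) for _ in range(d)]
            for _ in range(k)]
    return [[sum(row[t] * p[t] for t in range(d)) for row in proj]
            for p in points]


def lq_distortion(orig, emb, q):
    # q-th moment of the pairwise distortion max(expansion, contraction),
    # averaged over all pairs. Assumed reading of the parameter q.
    n = len(orig)
    total, pairs = 0.0, 0
    for i in range(n):
        for j in range(i + 1, n):
            ratio = emb[i][j] / orig[i][j]
            total += max(ratio, 1.0 / ratio) ** q
            pairs += 1
    return (total / pairs) ** (1.0 / q)


rng = random.Random(0)                      # fixed seed for repeatability
X = random_euclidean_space(n=30, d=30, rng=rng)
D = pairwise_distances(X)
Y = jl_embed(X, k=10, rng=rng)
print("l_q-distortion of JL embedding:", lq_distortion(D, pairwise_distances(Y), q=2))
noisy_D = add_noise(D, delta=0.1, rng=rng)  # input for the noisy-space experiment
```

PCA and Isomap are omitted here; in practice they would come from a numerical library, whose name and version the paper does not state.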