Optimization Can Learn Johnson-Lindenstrauss Embeddings
Authors: Nikos Tsikouras, Constantine Caramanis, Christos Tzamos
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically validate our theoretical results, demonstrating that first- and second-order information suffices to learn high-quality embeddings. Our goal is to identify an optimal matrix M that produces embeddings satisfying the JL guarantee (see Definition 1) while minimizing distortion. We show that our method generates embeddings with significantly lower distortion than those obtained using the Gaussian construction of the JL lemma. (The Gaussian baseline is sketched after the table.) |
| Researcher Affiliation | Academia | Nikos Tsikouras (UOA & Archimedes / Athena RC, n.tsikouras@athenarc.gr); Constantine Caramanis (UT Austin & Archimedes / Athena RC, constantine@utexas.edu); Christos Tzamos (UOA & Archimedes / Athena RC, christos@tzamos.com) |
| Pseudocode | Yes | Algorithm 1: Hessian Descent. Algorithm 2: An algorithm to find optimal distortion. (A generic second-order step is sketched after the table.) |
| Open Source Code | Yes | We have provided the code we ran for our simulations in the supplementary material. |
| Open Datasets | No | We generate a unit-norm synthetic dataset with n = 100 data points in d = 500 dimensions, and a target dimension of k = 30. |
| Dataset Splits | No | The paper mentions generating a synthetic dataset but does not specify any training, validation, or test splits. It only states the total number of data points and dimensions. |
| Hardware Specification | No | Given that this is a theoretical result, the experiments did not require substantial computational resources, and we conducted all experiments on local machines. |
| Software Dependencies | No | The paper mentions using the 'Adam optimizer' but does not specify any version numbers for Adam or any other software components. |
| Experiment Setup | Yes | We run our optimization for 5000 iterations, using the Adam optimizer [Kingma and Ba, 2014] with a learning rate of 0.01 and batch size N = 20. (An illustrative training loop with these hyperparameters follows the table.) |
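
The Research Type and Open Datasets rows reference the paper's Gaussian baseline and synthetic setup. Below is a minimal sketch of that baseline, assuming the quoted sizes (n = 100 unit-norm points, d = 500, k = 30) and taking distortion to mean the worst-case multiplicative error in pairwise squared distances; the `max_distortion` helper and the NumPy implementation are illustrative assumptions, not the authors' released code.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 100, 500, 30  # sizes quoted in the Open Datasets row

# Unit-norm synthetic dataset, as described in the paper.
X = rng.standard_normal((n, d))
X /= np.linalg.norm(X, axis=1, keepdims=True)

# Classical Gaussian JL construction: i.i.d. N(0, 1/k) entries.
M = rng.standard_normal((k, d)) / np.sqrt(k)
Y = X @ M.T

def max_distortion(X, Y):
    """Worst-case |  ||y_i - y_j||^2 / ||x_i - x_j||^2 - 1  | over all pairs."""
    worst = 0.0
    for i in range(len(X)):
        for j in range(i + 1, len(X)):
            orig = np.sum((X[i] - X[j]) ** 2)
            proj = np.sum((Y[i] - Y[j]) ** 2)
            worst = max(worst, abs(proj / orig - 1.0))
    return worst

print(f"Gaussian JL max distortion: {max_distortion(X, Y):.3f}")
```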
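The Pseudocode row cites Algorithm 1, "Hessian Descent", but its full listing is not quoted in this report. The sketch below shows only the generic second-order pattern the name suggests: gradient steps while the gradient is informative, otherwise a step along the Hessian's most negative-curvature eigenvector. `grad_fn`, `hess_fn`, and the tolerances are hypothetical, and this should not be read as the authors' algorithm.

```python
import numpy as np

def hessian_descent_step(x, grad_fn, hess_fn, lr=0.01, tol=1e-6):
    """One generic second-order step (NOT the paper's Algorithm 1):
    descend along the gradient when it is large, otherwise follow the
    most negative-curvature direction of the Hessian."""
    g = grad_fn(x)
    if np.linalg.norm(g) > tol:
        return x - lr * g              # ordinary gradient step
    lam, V = np.linalg.eigh(hess_fn(x))
    if lam[0] >= -tol:
        return x                       # approximate second-order stationarity
    return x + lr * V[:, 0]            # escape step; either sign decreases f
```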
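The Experiment Setup row quotes Adam with learning rate 0.01, batch size N = 20, and 5000 iterations. The loop below sketches how such an optimization over M might look under those hyperparameters; the squared-ratio surrogate loss and the PyTorch implementation are assumptions, not the paper's exact objective.

```python
import torch

torch.manual_seed(0)
n, d, k = 100, 500, 30
X = torch.randn(n, d)
X = X / X.norm(dim=1, keepdim=True)            # unit-norm synthetic data

M = torch.randn(k, d, requires_grad=True)
opt = torch.optim.Adam([M], lr=0.01)           # hyperparameters quoted above

for step in range(5000):
    batch = X[torch.randperm(n)[:20]]          # batch size N = 20
    diff = batch.unsqueeze(0) - batch.unsqueeze(1)
    mask = torch.triu(torch.ones(20, 20), diagonal=1).bool()
    diff = diff[mask]                          # unique pairwise differences
    ratio = (diff @ M.T).pow(2).sum(-1) / diff.pow(2).sum(-1)
    loss = (ratio - 1.0).pow(2).mean()         # penalize pairwise distortion
    opt.zero_grad()
    loss.backward()
    opt.step()
```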