Optimization Can Learn Johnson-Lindenstrauss Embeddings

Authors: Nikos Tsikouras, Constantine Caramanis, Christos Tzamos

NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We empirically validate our theoretical results, demonstrating that first- and second-order information suffices to learn high-quality embeddings. Our goal is to identify an optimal matrix M that produces embeddings satisfying the JL guarantee (see Definition 1) while minimizing distortion. We show that our method generates embeddings with significantly lower distortion compared to those obtained using the Gaussian construction of the JL lemma. (A hedged distortion-check sketch appears after the table.)
Researcher Affiliation | Academia | Nikos Tsikouras (UOA & Archimedes / Athena RC, n.tsikouras@athenarc.gr); Constantine Caramanis (UT Austin & Archimedes / Athena RC, constantine@utexas.edu); Christos Tzamos (UOA & Archimedes / Athena RC, christos@tzamos.com)
Pseudocode | Yes | Algorithm 1: Hessian Descent. Algorithm 2: An algorithm to find optimal distortion. (A hedged sketch of a Hessian-style step appears after the table.)
Open Source Code | Yes | We have provided the code we ran for our simulations in the supplementary material.
Open Datasets | No | We generate a unit-norm synthetic dataset with n = 100 data points in d = 500 dimensions and a target dimension of k = 30. (A data-generation sketch appears after the table.)
Dataset Splits | No | The paper mentions generating a synthetic dataset but does not specify any training, validation, or test splits. It only states the total number of data points and dimensions.
Hardware Specification | No | Given that this is a theoretical result, the experiments did not require substantial computational resources, and we conducted all experiments on local machines.
Software Dependencies | No | The paper mentions using the 'Adam optimizer' but does not specify version numbers for Adam or any other software components.
Experiment Setup | Yes | We run our optimization for 5000 iterations, using the Adam optimizer [Kingma and Ba, 2014] with a learning rate of 0.01 and batch size N = 20. (A training-loop sketch appears after the table.)
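
The synthetic setup reported in the Open Datasets row (unit-norm points, n = 100, d = 500, k = 30) is straightforward to regenerate. Below is a minimal NumPy sketch; the random seed and the choice of Gaussian samples normalized to the unit sphere are assumptions, since the excerpt only states the dimensions and the unit-norm property.

```python
import numpy as np

rng = np.random.default_rng(42)   # seed is illustrative, not taken from the paper
n, d, k = 100, 500, 30            # dataset size, ambient dimension, target dimension

# Assumption: draw Gaussian points and normalize each to the unit sphere.
X = rng.standard_normal((n, d))
X /= np.linalg.norm(X, axis=1, keepdims=True)

assert np.allclose(np.linalg.norm(X, axis=1), 1.0)   # every point has unit norm
```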
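
To make the comparison against the Gaussian construction of the JL lemma concrete, here is a hedged sketch of a worst-case pairwise distortion check. The squared-distance ratio used below mirrors the usual JL-style guarantee; the paper's Definition 1 may measure distortion differently, so treat `max_pairwise_distortion` as a hypothetical stand-in rather than the paper's metric.

```python
import numpy as np
from itertools import combinations

def max_pairwise_distortion(M, X):
    """Worst-case | ||M(x_i - x_j)||^2 / ||x_i - x_j||^2 - 1 | over all pairs.

    Assumption: this mirrors the usual JL guarantee on squared distances;
    the paper's Definition 1 may differ in detail.
    """
    worst = 0.0
    for i, j in combinations(range(len(X)), 2):
        diff = X[i] - X[j]
        ratio = np.sum((M @ diff) ** 2) / np.sum(diff ** 2)
        worst = max(worst, abs(ratio - 1.0))
    return worst

# Gaussian JL baseline: i.i.d. N(0, 1/k) entries, evaluated on unit-norm
# synthetic points (regenerated here so the sketch runs on its own).
rng = np.random.default_rng(0)
n, d, k = 100, 500, 30
X = rng.standard_normal((n, d))
X /= np.linalg.norm(X, axis=1, keepdims=True)
M_gauss = rng.standard_normal((k, d)) / np.sqrt(k)
print("Gaussian baseline distortion:", max_pairwise_distortion(M_gauss, X))
```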
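
The Pseudocode row cites Algorithm 1, "Hessian Descent", which this report does not reproduce. The sketch below is only a generic second-order step of the kind the name suggests: it moves along the most negative-curvature eigendirection of the Hessian when one exists, and otherwise takes a gradient step. The gating rule, step size, and toy example are all assumptions, not the paper's algorithm.

```python
import numpy as np

def hessian_style_step(x, grad_f, hess_f, eta=0.01, curvature_tol=1e-6):
    """One illustrative second-order step (not the paper's Algorithm 1).

    If the Hessian has a sufficiently negative eigenvalue, move along that
    eigendirection (oriented so it does not oppose the gradient); otherwise
    take a plain gradient step.
    """
    g = grad_f(x)
    eigvals, eigvecs = np.linalg.eigh(hess_f(x))
    if eigvals[0] < -curvature_tol:
        v = eigvecs[:, 0]          # most negative-curvature direction
        if g @ v > 0:
            v = -v
        return x + eta * v
    return x - eta * g

# Toy usage on the saddle f(x) = x0^2 - x1^2, where the gradient vanishes at 0.
grad_f = lambda x: np.array([2.0 * x[0], -2.0 * x[1]])
hess_f = lambda x: np.diag([2.0, -2.0])
x_next = hessian_style_step(np.zeros(2), grad_f, hess_f)
```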
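
The reported optimization settings (5000 iterations, Adam, learning rate 0.01, batch size N = 20) can be assembled into a training loop as follows. This is a minimal PyTorch sketch under explicit assumptions: the projection matrix M is parameterized directly, the batch is interpreted as 20 sampled point pairs, and the loss is a mean squared pairwise-distortion penalty. The paper's actual objective, parameterization, and batching may differ.

```python
import torch

torch.manual_seed(0)                       # seed is illustrative
n, d, k = 100, 500, 30                     # dataset and target dimensions from the paper
N, steps, lr = 20, 5000, 0.01              # batch size, iterations, learning rate from the paper

# Unit-norm synthetic points (same construction as the data sketch above).
X = torch.randn(n, d)
X = X / X.norm(dim=1, keepdim=True)

# Assumption: optimize the k x d projection matrix M directly.
M = torch.nn.Parameter(torch.randn(k, d) / d ** 0.5)
opt = torch.optim.Adam([M], lr=lr)

for step in range(steps):
    i = torch.randint(0, n, (N,))
    j = (i + torch.randint(1, n, (N,))) % n            # guarantees j != i
    diff = X[i] - X[j]                                 # (N, d) pairwise differences
    ratio = (diff @ M.T).pow(2).sum(dim=1) / diff.pow(2).sum(dim=1)
    loss = (ratio - 1.0).pow(2).mean()                 # assumed squared-distortion loss
    opt.zero_grad()
    loss.backward()
    opt.step()
    if step % 1000 == 0:
        print(f"step {step}: loss {loss.item():.6f}")
```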