reproducibilityindex.ai

Scaling Up Ordinal Embedding: A Landmark Approach

Authors: Jesse Anderton, Javed Aslam

ICML 2019

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Figure 2 exhibits L-SOE for a variety of values of n and d, on two simulated datasets: a set of points sampled from a uniform ball, and a set of points sampled from a Gaussian Mixture Model (GMM) with 10 components. ... We also tested LLOE on two real datasets. We embed 20 Newsgroups (n = 18,846, d = 101,631 TF-IDF scores) and MNIST Digits (n = 70,000, d = 784 pixels), and compare the results to LSA in Figure 4.
Researcher Affiliation	Academia	College of Computer and Information Science, Northeastern University, Boston, Massachusetts. Correspondence to: Jesse Anderton <jesse@ccs.edu>.
Pseudocode	Yes	Algorithm 1 LLOE(n, d, L, o)
Open Source Code	Yes	Available at https://github.com/jesand/lloe.
Open Datasets	Yes	on two simulated datasets: a set of points sampled from a uniform ball, and a set of points sampled from a Gaussian Mixture Model (GMM) with 10 components. ... We also tested LLOE on two real datasets. We embed 20 Newsgroups (n = 18,846, d = 101,631 TF-IDF scores) and MNIST Digits (n = 70,000, d = 784 pixels)
Dataset Splits	No	The paper mentions evaluating on datasets and metrics like 'Probability p_err that a random triplet is incorrect in the embedding', but does not provide specific details on training/validation/test splits (e.g., percentages, sample counts, or explicit instructions for partitioning the data).
Hardware Specification	Yes	Embeddings run on a late 2013 15-inch quad-core MacBook Pro with 2 GHz Intel Core i7 CPU and 16 GB of RAM.
Software Dependencies	No	The paper mentions software like 'sklearn.datasets.make_blobs()' and 'L-BFGS', but does not provide specific version numbers for any software dependencies required to replicate the experiments.
Experiment Setup	Yes	Each embedding proceeds for up to 1,000 rounds of L-BFGS, with early termination if no loss decrease is observed. We report the best embedding from 20 random initializations.
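
The table quotes a triplet-error metric (the probability that a random triplet is incorrect in the embedding) and a GMM dataset generated with sklearn.datasets.make_blobs(). The following is a minimal sketch of how such a metric could be estimated by sampling, not the authors' code. The function name triplet_error_rate, the sample sizes, the dimension, and the noisy stand-in embedding are all illustrative assumptions; the paper's exact sampling procedure is not given in the table above.

```python
import numpy as np
from sklearn.datasets import make_blobs


def triplet_error_rate(X_true, X_emb, n_triplets=10_000, seed=0):
    """Estimate the fraction of random triplets (i, j, k) for which the
    ordering 'i is closer to j than to k' differs between X_true and X_emb.

    Hypothetical helper for illustration; not taken from the paper's code.
    """
    rng = np.random.default_rng(seed)
    n = X_true.shape[0]
    i, j, k = rng.integers(0, n, size=(3, n_triplets))

    # Keep only triplets made of three distinct points.
    keep = (i != j) & (i != k) & (j != k)
    i, j, k = i[keep], j[keep], k[keep]

    def closer_to_j(X):
        d_ij = np.linalg.norm(X[i] - X[j], axis=1)
        d_ik = np.linalg.norm(X[i] - X[k], axis=1)
        return d_ij < d_ik

    return float(np.mean(closer_to_j(X_true) != closer_to_j(X_emb)))


# Simulated data: a 10-component Gaussian mixture (n and d are assumptions).
X, _ = make_blobs(n_samples=500, n_features=3, centers=10, random_state=0)

# Stand-in "embedding": a noisy copy of the points, NOT an LLOE result.
noise_rng = np.random.default_rng(1)
X_noisy = X + noise_rng.normal(scale=0.5, size=X.shape)

p_err_same = triplet_error_rate(X, X)         # identical points: no errors
p_err_noisy = triplet_error_rate(X, X_noisy)  # some triplets may flip
```

Because the metric depends only on the ordering of distances, it is unaffected by rotating, translating, or uniformly scaling the embedding, which is why it suits ordinal embeddings that are recovered only up to such transformations.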