Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.

Scaling Up Ordinal Embedding: A Landmark Approach

Authors: Jesse Anderton, Javed Aslam

ICML 2019 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Figure 2 exhibits L-SOE for a variety of values of n and d, on two simulated datasets: a set of points sampled from a uniform ball, and a set of points sampled from a Gaussian Mixture Model (GMM) with 10 components. ... We also tested LLOE on two real datasets. We embed 20 Newsgroups (n = 18,846, d = 101,631 TF-IDF scores) and MNIST Digits (n = 70,000, d = 784 pixels), and compare the results to LSA in Figure 4.
Researcher Affiliation | Academia | College of Computer and Information Science, Northeastern University, Boston, Massachusetts. Correspondence to: Jesse Anderton <EMAIL>.
Pseudocode | Yes | Algorithm 1: LLOE(n, d, L, o)
Open Source Code | Yes | Available at https://github.com/jesand/lloe.
Open Datasets | Yes | on two simulated datasets: a set of points sampled from a uniform ball, and a set of points sampled from a Gaussian Mixture Model (GMM) with 10 components. ... We also tested LLOE on two real datasets. We embed 20 Newsgroups (n = 18,846, d = 101,631 TF-IDF scores) and MNIST Digits (n = 70,000, d = 784 pixels)
Dataset Splits | No | The paper mentions evaluating on datasets and metrics like 'Probability p_err that a random triplet is incorrect in the embedding', but does not provide specific details on training/validation/test splits (e.g., percentages, sample counts, or explicit instructions for partitioning the data).
Hardware Specification | Yes | Embeddings run on a late 2013 15-inch quad-core MacBook Pro with 2 GHz Intel Core i7 CPU and 16 GB of RAM.
Software Dependencies | No | The paper mentions software like 'sklearn.datasets.make_blobs()' and 'L-BFGS', but does not provide specific version numbers for any software dependencies required to replicate the experiments.
Experiment Setup | Yes | Each embedding proceeds for up to 1,000 rounds of L-BFGS, with early termination if no loss decrease is observed. We report the best embedding from 20 random initializations.
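The evaluation metric quoted above, the probability that a random triplet is incorrect in the embedding, can be illustrated with a small sketch. This is not the paper's evaluation code. The function name `triplet_error_rate`, the Monte Carlo sampling scheme, the choice to skip triplets that are tied in the original data, and the choice to count ties in the embedding as errors are all assumptions made for illustration.

```python
import math
import random


def triplet_error_rate(original, embedding, num_triplets=10_000, seed=0):
    """Estimate the fraction of random triplets (i, j, k) for which the
    ordering "i is closer to j than to k" in `original` is not reproduced
    in `embedding`. Hypothetical helper, not taken from the paper."""
    if len(original) != len(embedding):
        raise ValueError("point sets must have the same length")
    if len(original) < 3:
        raise ValueError("need at least three points")

    rng = random.Random(seed)
    n = len(original)
    wrong = 0
    counted = 0
    for _ in range(num_triplets):
        # Draw three distinct indices; i is the anchor point.
        i, j, k = rng.sample(range(n), 3)
        d_ij = math.dist(original[i], original[j])
        d_ik = math.dist(original[i], original[k])
        if d_ij == d_ik:
            continue  # assumption: ties in the original define no ordering
        e_ij = math.dist(embedding[i], embedding[j])
        e_ik = math.dist(embedding[i], embedding[k])
        # A triplet is correct only if the strict ordering is preserved;
        # a tie in the embedding therefore counts as an error.
        correct = (d_ij < d_ik and e_ij < e_ik) or (d_ij > d_ik and e_ij > e_ik)
        counted += 1
        if not correct:
            wrong += 1
    return wrong / counted if counted else 0.0
```

Because ordinal embedding only constrains distance comparisons, a uniformly scaled copy of the original points should give an error rate of zero under this sketch, while an embedding that reverses every comparison gives an error rate of one.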