Local Ordinal Embedding

Authors: Yoshikazu Terada, Ulrike von Luxburg

ICML 2014

Reproducibility assessment (variable, result, and supporting evidence quoted from the paper):
Research Type: Experimental. Evidence: "In our experiments we focus on the case of local ordinal embedding (experiments for more general soft ordinal embedding are provided in the supplementary material). We ran extensive simulations to compare our algorithm to its competitors (in the main paper and the supplementary material). They show that our method is very good at discovering the density structure of data sets and the geometric structure of graphs."
Researcher Affiliation: Collaboration. Evidence: "Yoshikazu Terada (TERADA@SIGMATH.ES.OSAKA-U.AC.JP): Graduate School of Engineering Science, Osaka University, Japan; CiNet, National Institute of Information and Communications Technology, Japan. Ulrike von Luxburg (LUXBURG@INFORMATIK.UNI-HAMBURG.DE): Department of Computer Science, University of Hamburg, Germany."
Pseudocode: Yes. Evidence: "The proof of this proposition is provided in the supplementary material, as well as the pseudocode for solving the soft ordinal embedding problem based on this majorizing function. The pseudocode of the resulting majorization algorithm is presented in Algorithm 1, called LOE-MM in the following."
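LOE-MM is a majorization-minimization (MM) algorithm; the paper's actual majorizing function and Algorithm 1 are in its supplementary material. As a generic illustration of the MM principle only (not the paper's objective), the sketch below minimizes f(x) = |x − 1| + |x + 1| using the standard quadratic majorizer |x − a| ≤ (x − a)²/(2|x_t − a|) + |x_t − a|/2, whose surrogate minimizer is a weighted mean; the anchors and starting point are arbitrary choices for the example.

```python
def mm_step(x, anchors):
    # Minimize the quadratic surrogate sum_a (x - a)^2 / (2|x_t - a|) + const.
    # Its closed-form minimizer is a weighted mean of the anchors with
    # weights 1 / |x_t - a| (a 1-D Weiszfeld-style update).
    w = [1.0 / abs(x - a) for a in anchors]
    return sum(wi * a for wi, a in zip(w, anchors)) / sum(w)

anchors = [1.0, -1.0]
f = lambda x: sum(abs(x - a) for a in anchors)

x = 3.0
for _ in range(50):
    x_new = mm_step(x, anchors)
    # The defining MM property: each surrogate minimization can only
    # decrease (never increase) the original objective.
    assert f(x_new) <= f(x) + 1e-12
    x = x_new
```

The monotone-descent assertion inside the loop is the property that makes MM algorithms attractive: no step size or line search is needed.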
Open Source Code: Yes. Evidence: "The code of our algorithm has been published as an official R-package (Terada & von Luxburg, 2014). URL http://cran.r-project.org/web/packages/loe/index.html."
Open Datasets: No. Evidence: "We sampled n points in R² from a distribution that has two uniform high-density squares, surrounded by a uniform low-density region. Next we consider a higher-dimensional Gaussian mixture model with three components. We define the mean vectors of the three components as µ_l := A c_l (l = 1, 2, 3), where c_1 := (√3, 0), c_2 := (−√3/2, 3/2), c_3 := (−√3/2, −3/2) and A is a random p × 2 orthonormal matrix. The points X_i = [X_i1, …, X_ip]ᵀ (i = 1, …, n) are generated as X_i := Σ_{l=1}^{3} u_il µ_l + ε_i, where u_i = (u_i1, u_i2, u_i3) and ε_i are independently generated from the multinomial distribution for three trials with equal probabilities and the p-dimensional standard normal distribution N_p(0, I_p), respectively."
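The Gaussian-mixture sampling scheme quoted above can be sketched as follows (illustrative Python, not the authors' R code; the values of n, p, and the random seed are arbitrary choices for the example):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 300, 10  # assumed sizes; the paper varies n and p across experiments

# Component centers c_1, c_2, c_3 in R^2, as defined in the quote above.
C = np.array([[np.sqrt(3.0), 0.0],
              [-np.sqrt(3.0) / 2.0, 3.0 / 2.0],
              [-np.sqrt(3.0) / 2.0, -3.0 / 2.0]])

# Random p x 2 orthonormal matrix A via QR of a Gaussian matrix,
# then mean vectors mu_l = A c_l.
A, _ = np.linalg.qr(rng.standard_normal((p, 2)))
mu = C @ A.T  # shape (3, p): one mean vector per mixture component

# u_i ~ Multinomial(3 trials, equal probabilities), eps_i ~ N_p(0, I_p).
U = rng.multinomial(3, [1 / 3, 1 / 3, 1 / 3], size=n)  # shape (n, 3)
eps = rng.standard_normal((n, p))

# X_i = sum_l u_il * mu_l + eps_i, vectorized over all n points.
X = U @ mu + eps
```

Because the rows of U sum to 3 (three multinomial trials), each point is a random integer combination of the three means plus isotropic Gaussian noise.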
Dataset Splits: No. No explicit description of training/validation/test splits, specific percentages, absolute sample counts, or cross-validation setup was found in the paper. The paper mentions statistical evaluation by constructing 100 graphs, but this refers to multiple experiment runs rather than a validation split.
Hardware Specification: Yes. Evidence: "This experiment was performed on a 3 GHz Intel Core i7 with 8 GB of memory. It took approximately 2 months to get the solutions of GNMDS for 10 data sets, after which we stopped (in this experiment, GNMDS was performed on a 1.9 GHz Intel Core i5 with 8 GB of memory)."
Software Dependencies: Yes. Evidence: "LOE-SD (the steepest descent algorithm, as implemented in C and R, with the step size 1 and the rate parameter of the backtracking line search 0.5); LOE-BFGS (a Newton-like algorithm, the Broyden-Fletcher-Goldfarb-Shanno algorithm implemented as the optim function in the R stats package), and our majorization algorithm LOE-MM. The code of our algorithm has been published as an official R-package (Terada & von Luxburg, 2014)."
Experiment Setup: Yes. Evidence: "This experiment was performed on a 3 GHz Intel Core i7 with 8 GB of memory. The results are depicted in Figure 5. To compare the computational costs of the methods, we measured the required time for 50 iterations of each algorithm, for each parameter choice. More specifically, we additionally compared three different methods to minimize our objective functions: LOE-SD (the steepest descent algorithm, as implemented in C and R, with the step size 1 and the rate parameter of the backtracking line search 0.5); LOE-BFGS (a Newton-like algorithm, the Broyden-Fletcher-Goldfarb-Shanno algorithm implemented as the optim function in the R stats package), and our majorization algorithm LOE-MM."
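The LOE-SD settings quoted above (initial step size 1, backtracking rate 0.5) describe steepest descent with an Armijo-style backtracking line search. A minimal sketch of that scheme, using a simple quadratic as a stand-in objective (not the paper's ordinal-embedding loss; the Armijo constant c is an assumed typical value):

```python
import numpy as np

def steepest_descent(f, grad, x0, c=1e-4, rate=0.5, iters=50):
    """Gradient descent with Armijo backtracking line search."""
    x = x0
    for _ in range(iters):
        g = grad(x)
        step = 1.0  # each line search starts from step size 1, as in LOE-SD
        # Halve the step (rate 0.5) until the sufficient-decrease
        # condition f(x - s*g) <= f(x) - c*s*||g||^2 holds.
        while f(x - step * g) > f(x) - c * step * (g @ g):
            step *= rate
        x = x - step * g
    return x

# Stand-in objective: f(x) = 0.5 * x' Q x with Q = diag(1, 2).
Q = np.diag([1.0, 2.0])
f = lambda x: 0.5 * x @ Q @ x
grad = lambda x: Q @ x

x_star = steepest_descent(f, grad, np.array([5.0, 5.0]))
```

The backtracking loop is what makes the method robust to a poorly scaled objective: an overly long trial step is simply halved until the objective provably decreases.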