Optimization Equivalence of Divergences Improves Neighbor Embedding

Authors: Zhirong Yang, Jaakko Peltonen, Samuel Kaski

ICML 2014

Reproducibility

Variable | Result | LLM Response
Research Type | Experimental | We give two examples of developing new visualization methods through the equivalences: 1) we develop weighted symmetric stochastic neighbor embedding (ws-SNE) from Elastic Embedding and analyze its benefits; in experiments ws-SNE performs well across data sets of different types, both vectorial and network data, whereas comparison methods fail for some of the data sets; 2) we develop a γ-divergence version of a PolyLog layout method; the new method is scale-invariant in the output space and makes it possible to efficiently use large-scale smoothed neighborhoods.
Researcher Affiliation | Academia | Zhirong Yang (2), ZHIRONG.YANG@AALTO.FI; Jaakko Peltonen (1, 4), JAAKKO.PELTONEN@AALTO.FI; Samuel Kaski (1, 3), SAMUEL.KASKI@AALTO.FI. Affiliations: (1) Helsinki Institute for Information Technology HIIT; (2) Department of Information and Computer Science, Aalto University, Finland; (3) Department of Computer Science, University of Helsinki; (4) University of Tampere
Pseudocode | No | The paper describes algorithms and mathematical formulations, but it does not include any distinct pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any specific link or statement about making its source code publicly available for the methods described (ws-SNE or γ-Quad Log).
Open Datasets | Yes | The compared methods were used to visualize six data sets: two vectorial data sets and four network data sets. The descriptions of the data sets are given in the supplemental document. Figure 1 shows the resulting visualizations.
Dataset Splits | No | The paper evaluates performance using metrics like AUC and discusses KNN neighborhoods, but it does not specify explicit dataset splits for training, validation, or testing, nor does it detail cross-validation setups for reproducibility of data partitioning.
Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU or GPU models, memory, or cloud instances) used for running the experiments.
Software Dependencies | No | The paper mentions various software components and methods, such as "graphviz," "Lin Log," "t-SNE," "EE," "sfdp layout," "Cauchy kernel," "spectral direction optimization," and "Barnes-Hut trees," but it does not provide specific version numbers for these components.
Experiment Setup | Yes | For ws-SNE, we adopted the Cauchy kernel, spectral direction optimization (Vladymyrov & Carreira-Perpiñán, 2012), and a scalable implementation with Barnes-Hut trees (van der Maaten, 2013; Yang et al., 2013). Both ws-SNE and t-SNE were run for a maximum of 1000 iterations. We used default settings for the other compared methods (graphviz uses the sfdp layout; Hu, 2005).