Optimization Equivalence of Divergences Improves Neighbor Embedding
Authors: Zhirong Yang, Jaakko Peltonen, Samuel Kaski
ICML 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We give two examples of developing new visualization methods through the equivalences: 1) we develop weighted symmetric stochastic neighbor embedding (ws-SNE) from Elastic Embedding and analyze its benefits; in experiments ws-SNE performs well across data sets of different types, both vectorial and network data, whereas comparison methods fail for some of the data sets; 2) we develop a γ-divergence version of a PolyLog layout method; the new method is scale invariant in the output space and makes it possible to efficiently use large-scale smoothed neighborhoods. |
| Researcher Affiliation | Academia | Zhirong Yang (2, ZHIRONG.YANG@AALTO.FI), Jaakko Peltonen (1,4, JAAKKO.PELTONEN@AALTO.FI), Samuel Kaski (1,3, SAMUEL.KASKI@AALTO.FI); (1) Helsinki Institute for Information Technology HIIT; (2) Department of Information and Computer Science, Aalto University, Finland; (3) Department of Computer Science, University of Helsinki; (4) University of Tampere |
| Pseudocode | No | The paper describes algorithms and mathematical formulations, but it does not include any distinct pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any specific link or statement about making its source code publicly available for the methods described (ws-SNE or γ-QuadLog). |
| Open Datasets | Yes | The compared methods were used to visualize six data sets, two vectorial data and four network data. The descriptions of the data sets are given in the supplemental document. Figure 1 shows the resulting visualizations. |
| Dataset Splits | No | The paper evaluates performance using metrics like AUC and discusses KNN neighborhoods, but it does not specify explicit dataset splits for training, validation, or testing, nor does it detail cross-validation setups for reproducibility of data partitioning. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory, or cloud instances) used for running the experiments. |
| Software Dependencies | No | The paper mentions various software components and methods used, such as "graphviz," "LinLog," "t-SNE," "EE," "sfdp layout," "Cauchy kernel," "spectral direction optimization," and "Barnes-Hut trees," but it does not provide specific version numbers for any of these components. |
| Experiment Setup | Yes | For ws-SNE, we adopted the Cauchy kernel, spectral direction optimization (Vladymyrov & Carreira-Perpiñán, 2012) and a scalable implementation with Barnes-Hut trees (van der Maaten, 2013; Yang et al., 2013). Both ws-SNE and t-SNE were run for a maximum of 1000 iterations. We used default settings for the other compared methods (graphviz uses sfdp layout; Hu, 2005). |
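To make the setup row above concrete, the sketch below shows an importance-weighted symmetric SNE objective with a Cauchy kernel, in the spirit of the ws-SNE method the row describes. This is a minimal illustrative sketch, not the authors' implementation: the function name `ws_sne_objective` and the per-point importance vector `M` (e.g., node degrees) are assumptions for illustration, and it omits the spectral direction optimizer and Barnes-Hut acceleration used in the paper.

```python
import numpy as np

def ws_sne_objective(P, Y, M=None):
    """KL(P || Q) for a ws-SNE-style embedding (illustrative sketch).

    P : (n, n) symmetric input affinities, zero diagonal, summing to 1.
    Y : (n, d) low-dimensional embedding coordinates.
    M : (n,) per-point importance weights (e.g., node degrees);
        M=None gives uniform weights, i.e., symmetric SNE with a
        Cauchy kernel, which matches t-SNE's output similarities.
    """
    n = P.shape[0]
    if M is None:
        M = np.ones(n)
    # Pairwise squared Euclidean distances in the embedding space.
    D = np.sum((Y[:, None, :] - Y[None, :, :]) ** 2, axis=2)
    q = 1.0 / (1.0 + D)        # Cauchy (Student-t, 1 dof) kernel
    W = np.outer(M, M) * q     # importance-weighted similarities
    np.fill_diagonal(W, 0.0)   # exclude self-similarities
    Q = W / W.sum()            # normalize to a joint distribution
    mask = P > 0               # KL is summed over the support of P
    return float(np.sum(P[mask] * np.log(P[mask] / Q[mask])))
```

With uniform weights this reduces to the t-SNE objective over joint similarities; nonuniform `M` upweights high-importance points in the normalized output distribution, which is the intuition behind applying ws-SNE to network data.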