Optimal Neighborhood Preserving Visualization by Maximum Satisfiability
Authors: Kerstin Bunte, Matti Järvisalo, Jeremias Berg, Petri Myllymäki, Jaakko Peltonen, Samuel Kaski
AAAI 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our method performs well in experiments, yielding clean embeddings of datasets where a state-of-the-art comparison method yields poor arrangements. In a real-world case study for semi-supervised WLAN signal mapping in buildings we outperform state-of-the-art methods. Experiments: We apply the bit-based MaxSAT encoding to the visualization of five different types of synthetic and real-world datasets. |
| Researcher Affiliation | Academia | Kerstin Bunte (HIIT, Aalto University, Finland); Matti Järvisalo (HIIT, University of Helsinki, Finland); Jeremias Berg (HIIT, University of Helsinki, Finland); Petri Myllymäki (HIIT, University of Helsinki, Finland); Jaakko Peltonen (University of Tampere, Finland; HIIT, Aalto University, Finland); Samuel Kaski (HIIT, Aalto University, University of Helsinki, Finland) |
| Pseudocode | No | The paper describes the encoding using propositional logic and provides detailed explanations of its components, but it does not contain a structured pseudocode or algorithm block that is clearly labeled as such. |
| Open Source Code | Yes | The instance constructing source code, the detailed encoding and additional experimental results can be found at http://research.ics.aalto.fi/mi/software/satnerv/. |
| Open Datasets | Yes | Helix: 100 datapoints from a three-dimensional coiled ring (synthetic), see Fig. 2(A). Coil: A standard dimension reduction and visualization benchmark dataset (Nene, Nayar, and Murase 1996) used in the original t-SNE paper (van der Maaten and Hinton 2008). Olivetti: The Olivetti (Samaria and Harter 1994) database contains 400 grayscale facial images, with size 64 × 64, of several persons. ISMB: Gene expression microarray experiments (Caldas et al. 2009) from the Array Express database (Parkinson et al. 2009). |
| Dataset Splits | No | The paper mentions specific uses of data for 'training' and 'test' in the WLAN case study (e.g., '38 are used for training and are denoted key points, and the remaining 66 are used as test points for evaluation purposes'), but it does not provide explicit train/validation/test split percentages, sample counts, or detailed splitting methodologies for all experiments as required for reproducibility. |
| Hardware Specification | No | The authors thank Jessica Davies for providing the MaxHS solver, Teemu Pulkkinen for help with the WLAN data, and the Aalto Science-IT project for computational resources. This mentions 'computational resources' but lacks specific hardware details like CPU/GPU models or memory. |
| Software Dependencies | No | We used MaxHS (Davies and Bacchus 2013) as the MaxSAT solver. We compared our approach to the current perhaps most widely used NE method, t-SNE (van der Maaten and Hinton 2008), and SPE (Shaw and Jebara 2009). The paper identifies software used but does not provide specific version numbers for these software components. |
| Experiment Setup | Yes | We computed the neighborhood weights following Eq. (3) using perplexity 5 and threshold ϵ = δ = 0.17 for precision and recall, resulting in an effective neighborhood of 2. The σ_i is chosen to set the entropy of the distribution over neighbors equal to log k, where the perplexity k denotes the effective number of local neighbors. We computed the weights with perplexity 15, ϵ = 0.1 and δ = 10⁻⁷, resulting in maximal 5 recall neighbors for the MaxSAT encoding. |
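
The Experiment Setup row states that σ_i is chosen so that the entropy of the distribution over neighbors equals log k for perplexity k. The following is a minimal sketch of that calibration step, not the authors' code. It assumes a Gaussian neighbor distribution of the form p(j|i) ∝ exp(−d_ij² / (2σ_i²)), as is common in SNE-style methods; the paper's Eq. (3) is not reproduced here and may differ in detail. The function names (`neighbor_distribution`, `entropy`, `sigma_for_perplexity`) are hypothetical, and the bisection search is one of several possible ways to solve for σ_i.

```python
import math


def neighbor_distribution(sq_dists, sigma):
    """Neighbor probabilities for one point, given squared distances to the others.

    Assumed form (not taken from the paper): p_j proportional to
    exp(-d_j^2 / (2 * sigma^2)).
    """
    # Shift by the smallest distance so the largest weight is exp(0) = 1,
    # which avoids a zero normalizer for very small sigma.
    d_min = min(sq_dists)
    weights = [math.exp(-(d - d_min) / (2.0 * sigma ** 2)) for d in sq_dists]
    total = sum(weights)
    return [w / total for w in weights]


def entropy(probs):
    """Shannon entropy in nats, skipping zero-probability entries."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def sigma_for_perplexity(sq_dists, perplexity, tol=1e-6, max_iter=200):
    """Find sigma such that the neighbor entropy is close to log(perplexity).

    Entropy grows with sigma, so a bisection on a log scale is sufficient.
    The perplexity must be smaller than the number of neighbors.
    """
    target = math.log(perplexity)
    lo, hi = 1e-10, 1e10  # hypothetical search bounds
    sigma = 1.0
    for _ in range(max_iter):
        sigma = math.sqrt(lo * hi)  # midpoint on a log scale
        h = entropy(neighbor_distribution(sq_dists, sigma))
        if abs(h - target) < tol:
            break
        if h > target:
            hi = sigma  # distribution too flat: shrink sigma
        else:
            lo = sigma  # distribution too peaked: grow sigma
    return sigma


# Usage: squared distances from one point to ten others (made-up values).
example_sq_dists = [1.0, 4.0, 9.0, 16.0, 25.0, 36.0, 49.0, 64.0, 81.0, 100.0]
example_sigma = sigma_for_perplexity(example_sq_dists, perplexity=5)
example_probs = neighbor_distribution(example_sq_dists, example_sigma)
```

With the calibrated σ_i, `math.exp(entropy(example_probs))` recovers the requested perplexity up to the tolerance. Thresholding the resulting weights with ϵ and δ, as described in the row above, is a separate step not shown here.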