VIPR: An Interactive Tool for Meaningful Visualization of High-Dimensional Data

Authors: Donghan Wang, Madalina Fiterau, Artur Dubrawski

IJCAI 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this demonstration, we present a powerful analysis tool that uses IPE methodology in support of fundamental machine learning tasks: regression, classification, and clustering. We show in examples how it can discover hidden interpretable structures embedded in high-dimensional data. We demonstrate how VIPR extracts communicative models and allows its users to visualize informative patterns in low-dimensional projections of highly-dimensional data. We showcase patterns extracted from our own data and from public benchmark datasets, as well as models learned from data received ad-hoc at the demonstration site. The users will see the extraction of Informative Projection models and their visualizations in real time, under multiple settings, of at least 20 different public and proprietary datasets with diverse characteristics.
Researcher Affiliation | Academia | Donghan Wang, Madalina Fiterau, Artur Dubrawski; Carnegie Mellon University, Pittsburgh, PA, USA; Stanford University, Stanford, CA, USA
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | No | The paper provides a link to "https://plot.ly/javascript/", which is an open-source library used for plotting, not the source code for the VIPR methodology itself. There is no explicit statement or link providing access to the authors' own implementation code for VIPR.
Open Datasets | Yes | We showcase patterns extracted from our own data and from public benchmark datasets, as well as models learned from data received ad-hoc at the demonstration site. VIPR was also used for regression on UCI Concrete data, containing 8 input features for 1,030 observations.
Dataset Splits | No | The paper does not provide specific dataset split information (e.g., percentages, sample counts, or a detailed splitting methodology) for training, validation, or testing.
Hardware Specification | No | The paper does not provide specific hardware details (like CPU/GPU models, processor types, or memory) used for running its experiments. It only mentions general concepts like an "interactive web application".
Software Dependencies | No | The paper mentions using "an open source library" (footnote 1) and provides the link "https://plot.ly/javascript/". However, it does not specify a version number for this library or any other software dependencies, which is required for reproducibility.
Experiment Setup | No | The paper describes user-selectable parameters for the demonstration ("number of submodels, the dimensionality of the subspaces, costs associated with features, and the types of base classifier or base regressor to be used"), but it does not specify the concrete values of these parameters or other system-level training settings used for the results presented in the paper (e.g., for Figures 1 and 2).