Fair Neighbor Embedding
Authors: Jaakko Peltonen, Wen Xu, Timo Nummenmaa, Jyrki Nummenmaa
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In experiments the method yields fair visualizations outperforming previous methods. ... We perform DR on six data sets, reducing them to 2D. We compare Fair-NeRV/Fair-t-NeRV with other methods and show output plots. ... We split each data set to a training and test data set; parameters are optimized for training sets, performance is evaluated on test sets. ... Table 1 reports f1k and f1Avg scores of different embeddings for test data sets with best parameters obtained from training data sets. |
| Researcher Affiliation | Academia | 1Faculty of Information Technology and Communication Sciences, Tampere University, Tampere, Finland. Correspondence to: Jaakko Peltonen <jaakko.peltonen@tuni.fi>. |
| Pseudocode | No | The paper describes the mathematical formulation of the method and its gradients in Appendix A, but it does not include a distinct pseudocode block or algorithm listing. |
| Open Source Code | Yes | Software (Matlab & C++) and data are available at https://github.com/wenxu-fi/Fair-NeRV. |
| Open Datasets | Yes | The Adult data (https://tinyurl.com/rsbk8wab), also called Census Income... Communities and Crimes (CC; https://tinyurl.com/34sttp2t)... The German credit data (https://tinyurl.com/y6h95pne)... The Law School (LSAC) data set stems from an admission council survey at 163 US law schools in 1991. ... We used preprocessed data from https://tinyurl.com/mtu84w83. The Pima data (https://tinyurl.com/4ytz6nm8)... |
| Dataset Splits | No | The paper states: "We split each data set to a training and test data set; parameters are optimized for training sets, performance is evaluated on test sets." and "We then split each data set into 50% training and 50% test data again by stratified sampling." It mentions hyperparameter optimization on training sets but does not specify a distinct validation set split. |
| Hardware Specification | No | The paper mentions that "For the experiments with hyperparameter search, for all methods we ran different hyperparameter combinations in parallel on a computing cluster", but it does not provide specific details such as CPU, GPU, or memory specifications of the hardware used. |
| Software Dependencies | No | The paper states that "Software (Matlab & C++) and data are available at https://github.com/wenxu-fi/Fair-NeRV." However, it does not specify version numbers for Matlab, C++, or any libraries/dependencies used. |
| Experiment Setup | Yes | Other hyperparameters (for ct-SNE, a neighborhood perturbation parameter; for Fair-NeRV and Fair-t-NeRV, ω and tradeoff parameters γ, β, τ, τ′) are fitted to each training set. ... For all methods, hyperparameters are chosen to maximize training-set performance f1Avg chosen over 5 rounds of uniform sampling of 3600 hyperparameter sets in the hyperparameter spaces of each method with feasible ranges for each hyperparameter (see Appendix D). ... In our experiments, we set the ranges for Fair-NeRV and Fair-t-NeRV parameters to be τ ∈ [0, 1], τ′ ∈ (τ, 1], β ∈ [0, 1], γ ∈ [0, 1], and ω ∈ [0.5, 0.99]. The ct-SNE method has a hyperparameter β whose range is [10⁻⁷, 1]. |
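The paper describes its data handling as splitting "each data set into 50% training and 50% test data again by stratified sampling" but does not publish the splitting code. A minimal sketch of such a per-class 50/50 stratified split, using only the Python standard library (all function and variable names here are hypothetical, not from the paper's Matlab/C++ code):

```python
import random
from collections import defaultdict

def stratified_split(labels, train_frac=0.5, seed=0):
    """Split sample indices into train/test sets, preserving the
    per-class proportions of `labels` (a sketch of stratified sampling)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    train, test = [], []
    for idxs in by_class.values():
        rng.shuffle(idxs)
        cut = int(round(len(idxs) * train_frac))
        train.extend(idxs[:cut])
        test.extend(idxs[cut:])
    return sorted(train), sorted(test)

# Toy example: 10 samples of class 0 and 6 of class 1.
labels = [0] * 10 + [1] * 6
train, test = stratified_split(labels)
```

With `train_frac=0.5`, each class contributes half its samples to the training set (5 of class 0, 3 of class 1 here), so class proportions match between the splits.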
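The hyperparameter search quoted above (5 rounds of uniform sampling of 3600 hyperparameter sets within the stated feasible ranges, keeping the set that maximizes training-set f1Avg) can be sketched as plain random search. This is an assumption-laden illustration, not the authors' code: `score_fn` stands in for the paper's training-set f1Avg evaluation, and the sampling of τ′ treats the open interval (τ, 1] as a uniform draw above τ.

```python
import random

def sample_hyperparams(rng):
    """Draw one hyperparameter set uniformly from the feasible ranges
    reported for Fair-NeRV/Fair-t-NeRV."""
    tau = rng.uniform(0.0, 1.0)
    return {
        "tau": tau,                                # tau in [0, 1]
        "tau_prime": rng.uniform(tau, 1.0),        # tau' in (tau, 1] (approx.)
        "beta": rng.uniform(0.0, 1.0),             # beta in [0, 1]
        "gamma": rng.uniform(0.0, 1.0),            # gamma in [0, 1]
        "omega": rng.uniform(0.5, 0.99),           # omega in [0.5, 0.99]
    }

def random_search(score_fn, n_rounds=5, n_samples=3600, seed=0):
    """Uniform random search: evaluate n_rounds * n_samples candidate
    hyperparameter sets and return the best-scoring one."""
    rng = random.Random(seed)
    best, best_score = None, float("-inf")
    for _ in range(n_rounds):
        for _ in range(n_samples):
            hp = sample_hyperparams(rng)
            s = score_fn(hp)
            if s > best_score:
                best, best_score = hp, s
    return best, best_score
```

In practice each candidate evaluation here would require fitting an embedding and scoring it, which is why the paper runs combinations in parallel on a computing cluster.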