reproducibilityindex.ai

Fair Streaming Principal Component Analysis: Statistical and Algorithmic Viewpoint

Authors: Junghyun Lee, Hanseul Cho, Se-Young Yun, Chulhee Yun

NeurIPS 2023

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Lastly, we verify the efficacy and memory efficiency of our algorithm on real-world datasets.
Researcher Affiliation	Academia	Junghyun Lee, Hanseul Cho, Se-Young Yun, Chulhee Yun; Kim Jaechul Graduate School of AI, KAIST; {jh_lee00, jhs4015, yunseyoung, chulhee.yun}@kaist.ac.kr
Pseudocode	Yes	The pseudocode of our algorithm is shown in Algorithms 1 and 2.
Open Source Code	Yes	The code for all experiments is available at github.com/HanseulJo/fair-streaming-pca.
Open Datasets	Yes	We evaluate the efficacy of our proposed FNPM on the CelebA dataset (Liu et al., 2015b). For the sake of completeness, we conduct a quantitative evaluation of our algorithm on UCI datasets (Adult Income, COMPAS, German Credit).
Dataset Splits	Yes	We adopt the predefined train-validation split and run our algorithm only on the training set for 5 iterations with block sizes of b = B = 32,000. Then, using the output V of FNPM, we project images selected from the validation set.
Hardware Specification	Yes	All experiments were performed on Apple 2020 Mac mini M1 with 16GB RAM.
Software Dependencies	No	We implement our FNPM using Python JAX NumPy module (Bradbury et al., 2023; Harris et al., 2020) and Pytorch (Paszke et al., 2017). This lists software names but does not provide specific version numbers for reproducibility.
Experiment Setup	Yes	For each channel of colors, we project the data onto a k = 1000-dimensional subspace while nullifying m = 2 leading eigenvectors of covariance difference. [...] run our algorithm only on the training set for 5 iterations with block sizes of b = B = 32,000. [...] Ours (offline, m=15).
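
The Experiment Setup row describes projecting data onto a k-dimensional subspace while nullifying the m leading eigenvectors of the covariance difference between groups. The following is a minimal offline sketch of that idea in NumPy, not the authors' streaming FNPM (Algorithms 1 and 2) and not code from their repository. The function name `fair_pca_offline`, the binary group encoding (0/1), the per-group covariance estimator, and the choice to rank eigenvectors by absolute eigenvalue are all assumptions made for illustration. It covers only the covariance-difference step quoted in the table.

```python
import numpy as np

def fair_pca_offline(X, groups, k, m):
    """Hypothetical offline sketch: top-k principal directions that are
    orthogonal to the m leading eigenvectors of the group covariance difference.

    X      : (n, d) data matrix
    groups : (n,) array of 0/1 group labels (assumed binary)
    Requires k <= d - m.
    """
    X = np.asarray(X, dtype=float)
    groups = np.asarray(groups)
    d = X.shape[1]

    # Per-group covariance (assumed estimator: biased, centered per group).
    S0 = np.cov(X[groups == 0], rowvar=False, bias=True)
    S1 = np.cov(X[groups == 1], rowvar=False, bias=True)
    Q = S1 - S0  # symmetric covariance-difference matrix

    # m leading eigenvectors of Q, ranked by absolute eigenvalue (assumption).
    evals, evecs = np.linalg.eigh(Q)
    N = evecs[:, np.argsort(-np.abs(evals))[:m]]  # (d, m), orthonormal columns

    # Projector onto the orthogonal complement of span(N).
    P = np.eye(d) - N @ N.T

    # PCA of the overall covariance restricted to that complement.
    S = np.cov(X, rowvar=False, bias=True)
    w, U = np.linalg.eigh(P @ S @ P)
    V = U[:, np.argsort(-w)[:k]]  # (d, k) projection basis
    return V, N

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(400, 8))
    groups = np.arange(400) % 2
    X[groups == 1, 0] *= 3.0  # make group 1 differ in variance along feature 0
    V, N = fair_pca_offline(X, groups, k=3, m=2)
    print(V.shape, N.shape)
```

The table reports k = 1000 and m = 2 per color channel on CelebA; the toy values above are chosen only to keep the example small. A streaming variant would replace the two eigendecompositions with block-wise iterative updates, which this sketch does not attempt.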