Convex Formulations for Fair Principal Component Analysis
Authors: Matt Olfat, Anil Aswani
AAAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conclude by showing how our approach can be used to perform a fair (with respect to age) clustering of health data that may be used to set health insurance rates. ...we demonstrate their effectiveness using several datasets. |
| Researcher Affiliation | Academia | Matt Olfat, Anil Aswani (UC Berkeley, Berkeley, CA 94720); molfat@berkeley.edu, aaswani@berkeley.edu |
| Pseudocode | No | The paper describes mathematical formulations and steps, but it does not include a clearly labeled pseudocode or algorithm block. |
| Open Source Code | No | The paper mentions supplementary material at 'https://arxiv.org/pdf/1802.03765.pdf', which is a link to an arXiv paper, not source code. There is no other explicit statement about the release of the authors' source code. |
| Open Datasets | Yes | We use synthetic and real datasets from the UC Irvine Machine Learning Repository (Lichman 2013)... We use minute-level data from the National Health and Nutrition Examination Survey (NHANES) from 2005-2006 (Centers for Disease Control and Prevention (CDC), National Center for Health Statistics (NCHS), 2018) |
| Dataset Splits | Yes | For any SVM run, tuning parameters were chosen using 5-fold cross-validation... After splitting each dataset into separate training (70%) and testing (30%) sets |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments (e.g., GPU/CPU models, memory specifications). |
| Software Dependencies | No | The paper mentions using SVMs and k-means clustering but does not specify the software libraries or their version numbers (e.g., scikit-learn version, PyTorch version). |
| Experiment Setup | Yes | For any SVM run, tuning parameters were chosen using 5-fold cross-validation, and data was normalized to have unit variance in each field. ...After splitting each dataset into separate training (70%) and testing (30%) sets... with δ = 0 and µ = 0.01... We conduct k-means clustering (with k = 3) on the dimensionality-reduced data. (A hedged sketch of this pipeline appears below the table.) |
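
The quoted setup describes a concrete evaluation pipeline: unit-variance normalization, a 70%/30% train/test split, SVM tuning via 5-fold cross-validation, dimensionality reduction, and k-means clustering with k = 3. The sketch below is a minimal, hypothetical reconstruction of that pipeline using scikit-learn; it is not the authors' code, and standard PCA stands in for the paper's convex fair-PCA formulation (an assumption), with placeholder data rather than the UCI or NHANES datasets.

```python
# Hypothetical sketch of the quoted experiment setup (not the authors' code).
# Standard PCA is a stand-in for the paper's fair-PCA projection; the split
# ratio, CV folds, and k are taken from the quoted setup.
import numpy as np
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 20))      # placeholder features
y = rng.integers(0, 2, size=500)    # placeholder binary labels

# Normalize each field to unit variance, then split 70% train / 30% test.
X = StandardScaler().fit_transform(X)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# SVM tuning parameters chosen by 5-fold cross-validation.
svm = GridSearchCV(SVC(kernel="linear"), {"C": [0.01, 0.1, 1, 10]}, cv=5)
svm.fit(X_tr, y_tr)
print("SVM test accuracy:", svm.score(X_te, y_te))

# Dimensionality reduction (standard PCA here, not fair PCA),
# followed by k-means clustering with k = 3 on the reduced data.
Z_tr = PCA(n_components=2).fit_transform(X_tr)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(Z_tr)
print("Cluster sizes:", np.bincount(labels))
```

Reproducing the paper's results would additionally require the fair-PCA projection (solved as a semidefinite program in the paper) and the specific δ and µ values quoted above.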