Variational Fair Clustering
Authors: Imtiaz Masud Ziko, Jing Yuan, Eric Granger, Ismail Ben Ayed | pp. 11202-11209
AAAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We report comprehensive evaluations and comparisons with state-of-the-art methods over various fair clustering benchmarks, which show that our variational formulation can yield highly competitive solutions in terms of fairness and clustering objectives. ... In this section, we present comprehensive empirical evaluations of the proposed fair-clustering algorithm, along with comparisons with state-of-the-art fair-clustering techniques. |
| Researcher Affiliation | Academia | 1 ETS Montreal, Canada 2 Xidian University, China |
| Pseudocode | Yes | Algorithm 1 Proposed Fair-clustering |
| Open Source Code | Yes | Code is available at: https://github.com/imtiazziko/Variational-Fair-Clustering |
| Open Datasets | Yes | We use three datasets from the UCI machine learning repository, one large-scale data set whose demographics are balanced (Census), along with two other data sets with various demographic proportions: the Bank dataset ... Adult is a US census record data set from 1994 ... Census is a large-scale data set corresponding to US census record data from 1990. |
| Dataset Splits | No | The paper mentions using synthetic and real datasets for evaluation and discusses initial partition generation. However, it does not provide specific details on training, validation, or test dataset splits (e.g., percentages, sample counts, or cross-validation methods). |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU model, CPU type, memory) used to run the experiments. |
| Software Dependencies | No | The paper does not provide specific details on ancillary software dependencies, such as programming languages, libraries, or solvers with their version numbers, needed to replicate the experiments. |
| Experiment Setup | Yes | In all the experiments, we fixed L = 2 ... We standardize each dataset so that each feature attribute has zero mean and unit variance. We then performed L2-normalization of the features, and used the standard K-means++ (Arthur and Vassilvitskii 2007) to generate initial partitions for all the models. For Ncut, we use a 20-nearest-neighbor affinity matrix W: w(xp, xq) = 1 if data point xq is within the 20 nearest neighbors of xp, and 0 otherwise. |
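The preprocessing pipeline quoted above (per-feature standardization, L2-normalization of samples, and a binary 20-nearest-neighbor affinity matrix for Ncut) can be sketched as follows. This is a minimal NumPy illustration of those steps only, not the authors' released code; the function names `preprocess` and `knn_affinity` are hypothetical, and the neighbor search here is a brute-force distance computation suitable only for small data.

```python
import numpy as np

def preprocess(X):
    # Standardize each feature attribute to zero mean and unit variance,
    # then L2-normalize each sample, as described in the experiment setup.
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    return X / np.linalg.norm(X, axis=1, keepdims=True)

def knn_affinity(X, k=20):
    # Binary affinity: w(xp, xq) = 1 if xq is among the k nearest
    # neighbors of xp, else 0 (the paper uses k = 20 for Ncut).
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)          # a point is not its own neighbor
    idx = np.argsort(d, axis=1)[:, :k]   # k nearest neighbors of each row
    W = np.zeros_like(d)
    W[np.arange(X.shape[0])[:, None], idx] = 1.0
    return W
```

Note that W built this way is generally asymmetric (kNN relations are not mutual); spectral-clustering pipelines often symmetrize it, e.g. via `np.maximum(W, W.T)`, but the paper's description quoted here does not specify that step.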