Concept Distillation: Leveraging Human-Centered Explanations for Model Improvement

Authors: Avani Gupta, Saurabh Saini, P J Narayanan

NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Benchmark results on standard biased MNIST datasets and on a challenging Texture MNIST dataset that we introduce. Show application on a severely biased classification problem involving age bias. Show application beyond classification to the challenging multi-branch Intrinsic Image Decomposition problem by inducing human-centered concepts as priors."
Researcher Affiliation | Collaboration | Avani Gupta (1,2), Saurabh Saini (2), P J Narayanan (2); 1: M42, UAE; 2: IIIT Hyderabad, India
Pseudocode | Yes | "Algorithm 1: Concept Distillation Pipeline" (a hedged code sketch of such a step appears below the table)
Open Source Code | Yes | "Please visit https://avani17101.github.io/Concept-Distilllation/ for code and more details."
Open Datasets | Yes | "We show results on two standard biased datasets (Color MNIST [42] and Decoy MNIST [16]) and introduce a more challenging Texture MNIST dataset for quantitative evaluations. We also experimented on a real-world gender classification dataset BFFHQ [34]... Following Li and Snavely [44], we fine-tune over the CGIntrinsics [44], IIW [10], and SAW [37] datasets while we report results over ARAP dataset [11], which consists of realistic synthetic images."
Dataset Splits | No | The paper refers to a 'training set' and 'test set' for datasets such as Color MNIST, but it gives no explicit train/validation/test percentages, sample counts, or splitting methodology (an illustrative deterministic split is sketched below the table).
Hardware Specification | Yes | "taking 15-30 secs on a single 12GB Nvidia 1080 Ti GPU"
Software Dependencies | No | The paper mentions a 'DINO ViT-B/8 transformer [13]' and refers to PyTorch near the Table 1 footnote, but neither the main body nor the supplementary material specifies version numbers for any software component (e.g., PyTorch, Python, CUDA); a version-logging snippet is given below the table.
Experiment Setup | No | The paper mentions fine-tuning 'for a few epochs' and describes the architecture of the mapping module and the CAV estimation, but it does not provide concrete hyperparameter values such as learning rate, exact number of epochs, batch size, or optimizer settings; a hedged training-loop sketch follows below the table.
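
Since the paper provides Algorithm 1 only as pseudocode, the following is a minimal PyTorch sketch of a TCAV-style concept-distillation training step, assuming the CAV is the weight vector of a linear probe separating concept from random activations and the concept loss is the cosine similarity between the task-loss gradient and the CAV. Every identifier here (compute_cav, concept_distillation_step, lambda_c) is illustrative, not the authors' code.

```python
import torch
import torch.nn.functional as F

def compute_cav(concept_acts, random_acts):
    """Fit a linear probe separating concept vs. random activations;
    its unit-normalized weight vector is the Concept Activation Vector (CAV)."""
    X = torch.cat([concept_acts, random_acts])            # (N, D) activations
    y = torch.cat([torch.ones(len(concept_acts)),
                   torch.zeros(len(random_acts))])
    w = torch.zeros(X.shape[1], requires_grad=True)
    b = torch.zeros(1, requires_grad=True)
    opt = torch.optim.SGD([w, b], lr=0.1)
    for _ in range(200):                                  # illustrative probe training
        opt.zero_grad()
        loss = F.binary_cross_entropy_with_logits(X @ w + b, y)
        loss.backward()
        opt.step()
    return F.normalize(w.detach(), dim=0)                 # unit-norm CAV

def concept_distillation_step(layer_acts, logits, targets, cav, lambda_c=0.1):
    """One training step: task loss plus a concept loss that desensitizes
    the model to a (biased) concept direction, TCAV-style.
    `layer_acts` must be an intermediate activation kept in the autograd
    graph that produced `logits` (e.g., captured via a forward hook)."""
    task_loss = F.cross_entropy(logits, targets)
    # Directional sensitivity: gradient of the task loss w.r.t. the chosen
    # layer's activations, projected onto the CAV.
    grads, = torch.autograd.grad(task_loss, layer_acts, create_graph=True)
    sensitivity = F.cosine_similarity(grads.flatten(1), cav.unsqueeze(0), dim=1)
    concept_loss = sensitivity.abs().mean()               # drive sensitivity to zero
    return task_loss + lambda_c * concept_loss
```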
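
Because the split protocol is unreported, anyone reproducing the MNIST-variant experiments has to choose their own. Below is one illustrative way to pin a deterministic train/validation split in PyTorch; the 90/10 ratio and the seed are assumptions, not values from the paper.

```python
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

# Assumed 90/10 train/val split with a fixed seed -- not the authors' protocol.
full_train = datasets.MNIST("data", train=True, download=True,
                            transform=transforms.ToTensor())
n_val = len(full_train) // 10
train_set, val_set = random_split(
    full_train, [len(full_train) - n_val, n_val],
    generator=torch.Generator().manual_seed(0))  # seed makes the split reproducible
test_set = datasets.MNIST("data", train=False, download=True,
                          transform=transforms.ToTensor())
```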
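
With no versions pinned in the paper, re-runs should at least record their own environment. A small snippet such as the following logs the versions that matter here:

```python
import platform
import torch
import torchvision

# Record the exact software stack, since the paper omits version numbers.
print("Python     :", platform.python_version())
print("PyTorch    :", torch.__version__)
print("torchvision:", torchvision.__version__)
print("CUDA       :", torch.version.cuda)
print("GPU        :", torch.cuda.get_device_name(0)
      if torch.cuda.is_available() else "none")
```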
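
Finally, for the unreported training setup, a reproduction has to guess the fine-tuning hyperparameters. The skeleton below marks every such guess explicitly; the optimizer, learning rate, batch size, and epoch count are placeholders, not the authors' settings.

```python
import torch
import torch.nn.functional as F

def finetune(model, train_loader, epochs=3, lr=1e-3, device="cuda"):
    """Plain fine-tuning skeleton. The paper says only 'a few epochs';
    Adam, lr=1e-3, and epochs=3 here are assumptions."""
    model.to(device).train()
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for x, y in train_loader:
            x, y = x.to(device), y.to(device)
            opt.zero_grad()
            loss = F.cross_entropy(model(x), y)
            loss.backward()
            opt.step()
    return model
```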