Concept Distillation: Leveraging Human-Centered Explanations for Model Improvement

Authors: Avani Gupta, Saurabh Saini, P J Narayanan

NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Benchmark results on standard biased MNIST datasets and on a challenging Texture MNIST dataset that we introduce. Show application on a severely biased classification problem involving age bias. Show application beyond classification to the challenging multi-branch Intrinsic Image Decomposition problem by inducing human-centered concepts as priors."
Researcher Affiliation | Collaboration | Avani Gupta (1,2), Saurabh Saini (2), P J Narayanan (2); 1: M42, UAE; 2: IIIT Hyderabad, India
Pseudocode | Yes | "Algorithm 1: Concept Distillation Pipeline" (a hedged code sketch of such a step appears below the table)
Open Source Code | Yes | "Please visit https://avani17101.github.io/Concept-Distilllation/ for code and more details."
Open Datasets | Yes | "We show results on two standard biased datasets (Color MNIST [42] and Decoy MNIST [16]) and introduce a more challenging Texture MNIST dataset for quantitative evaluations. We also experimented on a real-world gender classification dataset BFFHQ [34]... Following Li and Snavely [44], we fine-tune over the CGIntrinsics [44], IIW [10], and SAW [37] datasets while we report results over ARAP dataset [11], which consists of realistic synthetic images."
Dataset Splits | No | The paper refers to a 'training set' and 'test set' for datasets such as Color MNIST, but it gives no explicit train/validation/test percentages, sample counts, or splitting methodology (an illustrative deterministic split is sketched below the table).
Hardware Specification | Yes | "taking 15-30 secs on a single 12GB Nvidia 1080 Ti GPU"
Software Dependencies | No | The paper mentions a 'DINO ViT-B/8 transformer [13]' and refers to PyTorch near the Table 1 footnote, but neither the main body nor the supplementary material specifies version numbers for any software component (e.g., PyTorch, Python, CUDA); a version-logging snippet is given below the table.
Experiment Setup | No | The paper mentions fine-tuning 'for a few epochs' and describes the architecture of the mapping module and the CAV estimation, but it does not provide concrete hyperparameter values such as learning rate, exact number of epochs, batch size, or optimizer settings; a hedged training-loop sketch follows below the table.
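
Since the paper provides Algorithm 1 only as pseudocode, the following is a minimal PyTorch sketch of a TCAV-style concept-distillation training step, assuming the CAV is the weight vector of a linear probe separating concept from random activations and the concept loss is the cosine similarity between the task-loss gradient and the CAV. Every identifier here (compute_cav, concept_distillation_step, lambda_c) is illustrative, not the authors' code.

```python
import torch
import torch.nn.functional as F

def compute_cav(concept_acts, random_acts):
    """Fit a linear probe separating concept vs. random activations;
    its unit-normalized weight vector is the Concept Activation Vector (CAV)."""
    X = torch.cat([concept_acts, random_acts])            # (N, D) activations
    y = torch.cat([torch.ones(len(concept_acts)),
                   torch.zeros(len(random_acts))])
    w = torch.zeros(X.shape[1], requires_grad=True)
    b = torch.zeros(1, requires_grad=True)
    opt = torch.optim.SGD([w, b], lr=0.1)
    for _ in range(200):                                  # illustrative probe training
        opt.zero_grad()
        loss = F.binary_cross_entropy_with_logits(X @ w + b, y)
        loss.backward()
        opt.step()
    return F.normalize(w.detach(), dim=0)                 # unit-norm CAV

def concept_distillation_step(layer_acts, logits, targets, cav, lambda_c=0.1):
    """One training step: task loss plus a concept loss that desensitizes
    the model to a (biased) concept direction, TCAV-style.
    `layer_acts` must be an intermediate activation kept in the autograd
    graph that produced `logits` (e.g., captured via a forward hook)."""
    task_loss = F.cross_entropy(logits, targets)
    # Directional sensitivity: gradient of the task loss w.r.t. the chosen
    # layer's activations, projected onto the CAV.
    grads, = torch.autograd.grad(task_loss, layer_acts, create_graph=True)
    sensitivity = F.cosine_similarity(grads.flatten(1), cav.unsqueeze(0), dim=1)
    concept_loss = sensitivity.abs().mean()               # drive sensitivity to zero
    return task_loss + lambda_c * concept_loss
```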
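
Because the split protocol is unreported, anyone reproducing the MNIST-variant experiments has to choose their own. Below is one illustrative way to pin a deterministic train/validation split in PyTorch; the 90/10 ratio and the seed are assumptions, not values from the paper.

```python
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

# Assumed 90/10 train/val split with a fixed seed -- not the authors' protocol.
full_train = datasets.MNIST("data", train=True, download=True,
                            transform=transforms.ToTensor())
n_val = len(full_train) // 10
train_set, val_set = random_split(
    full_train, [len(full_train) - n_val, n_val],
    generator=torch.Generator().manual_seed(0))  # seed makes the split reproducible
test_set = datasets.MNIST("data", train=False, download=True,
                          transform=transforms.ToTensor())
```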
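
With no versions pinned in the paper, re-runs should at least record their own environment. A small snippet such as the following logs the versions that matter here:

```python
import platform
import torch
import torchvision

# Record the exact software stack, since the paper omits version numbers.
print("Python     :", platform.python_version())
print("PyTorch    :", torch.__version__)
print("torchvision:", torchvision.__version__)
print("CUDA       :", torch.version.cuda)
print("GPU        :", torch.cuda.get_device_name(0)
      if torch.cuda.is_available() else "none")
```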
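
Finally, for the unreported training setup, a reproduction has to guess the fine-tuning hyperparameters. The skeleton below marks every such guess explicitly; the optimizer, learning rate, batch size, and epoch count are placeholders, not the authors' settings.

```python
import torch
import torch.nn.functional as F

def finetune(model, train_loader, epochs=3, lr=1e-3, device="cuda"):
    """Plain fine-tuning skeleton. The paper says only 'a few epochs';
    Adam, lr=1e-3, and epochs=3 here are assumptions."""
    model.to(device).train()
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for x, y in train_loader:
            x, y = x.to(device), y.to(device)
            opt.zero_grad()
            loss = F.cross_entropy(model(x), y)
            loss.backward()
            opt.step()
    return model
```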