Information-Theoretic Diffusion
Authors: Xianghao Kong, Rob Brekelmans, Greg Ver Steeg
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate ITD’s efficacy by evaluating its discovered concepts on downstream tasks and through human studies. On ImageNet, ITD achieves an average accuracy of 75.3% when clustering features from a self-supervised model, significantly outperforming alternative approaches. |
| Researcher Affiliation | Collaboration | Mengyu Yang¹, Yi-Fan Zhang², Ethan Liu², Daniel M. Blatman², Gregory M. P. O’Hare³, Georgios Tzimiropoulos¹, Jie Song² (¹University of Oulu, Finland; ²Huawei Noah’s Ark Lab, Ireland; ³University College Dublin, Ireland; ⁴Huawei Noah’s Ark Lab, China) |
| Pseudocode | Yes | Algorithm 1 Information-Theoretic Diffusion (ITD) (a hedged sketch of the core estimator follows the table) |
| Open Source Code | No | Our code will be publicly released upon acceptance. |
| Open Datasets | Yes | ImageNet-1K (ILSVRC 2012) is a widely used benchmark dataset for image classification tasks, containing 1.28 million training images and 50,000 validation images across 1,000 classes. |
| Dataset Splits | Yes | ImageNet-1K (ILSVRC 2012) is a widely used benchmark dataset for image classification tasks, containing 1.28 million training images and 50,000 validation images across 1,000 classes. |
| Hardware Specification | Yes | All experiments were run on NVIDIA A100 GPUs. |
| Software Dependencies | No | The paper mentions software such as PyTorch, scikit-learn, Faiss, and Diffusers, but does not specify version numbers. |
| Experiment Setup | Yes | The backbone feature extractor is ViT-B/16 from DINO [5], pre-trained on ImageNet-1K. We extract features from the penultimate layer of the backbone model. We use AdamW optimizer with a learning rate of 1e-4 and a batch size of 256. The diffusion process runs for 2000 steps with a linear noise schedule, and the number of clusters K is set to 1000 for ImageNet. (A hedged configuration sketch follows the table.) |
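
For the Pseudocode row, the following is a minimal sketch of the quantity at the heart of ITD: the per-sample denoising MSE as a function of log-SNR, which the paper's identity integrates (up to additive constants) to obtain a pointwise likelihood estimate. This is not the authors' released implementation; the denoiser interface `eps_hat(z, logsnr)`, the log-SNR range, and the number of Monte Carlo samples are all illustrative assumptions.

```python
import torch

def mmse_curve_estimate(x, eps_hat, logsnr_min=-10.0, logsnr_max=10.0, n_samples=100):
    """Monte Carlo estimate of the denoising MSE curve over log-SNR.

    ITD expresses -log p(x) as an integral of this curve over log-SNR
    (plus constants); see the paper for the exact expression. The
    denoiser signature eps_hat(z, logsnr) is an assumed interface,
    not the authors' released API.
    """
    d = x.numel()
    # Uniform log-SNR samples over an assumed integration range.
    logsnrs = torch.rand(n_samples) * (logsnr_max - logsnr_min) + logsnr_min
    mses = []
    for logsnr in logsnrs:
        eps = torch.randn_like(x)
        # Variance-preserving noising: z = sqrt(sigmoid(a)) * x + sqrt(sigmoid(-a)) * eps
        z = torch.sigmoid(logsnr).sqrt() * x + torch.sigmoid(-logsnr).sqrt() * eps
        # Per-dimension MSE between true and predicted noise.
        mses.append(((eps_hat(z, logsnr) - eps) ** 2).sum() / d)
    return logsnrs, torch.stack(mses)

# Example usage with a trivial denoiser that predicts zero noise (illustrative only).
x = torch.randn(3, 32, 32)
logsnrs, mses = mmse_curve_estimate(x, eps_hat=lambda z, logsnr: torch.zeros_like(z))
```

Integrating this curve over log-SNR, with the weighting and additive constants given in the paper, yields the per-sample negative log-likelihood estimate that ITD reports.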
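For the Experiment Setup row, a sketch of the quoted configuration: AdamW at learning rate 1e-4, batch size 256, and 2000 diffusion steps with a linear noise schedule. The beta-range endpoints (1e-4, 0.02) and the stand-in network are assumptions, since the row does not specify them.

```python
import torch

T = 2000  # diffusion steps, as quoted in the Experiment Setup row
# Linear beta schedule; the (1e-4, 0.02) endpoints are the common DDPM
# defaults, assumed here because the row does not state them.
betas = torch.linspace(1e-4, 0.02, T)
alphas_cumprod = torch.cumprod(1.0 - betas, dim=0)

# Stand-in for the unspecified denoising network (placeholder assumption).
model = torch.nn.Linear(16, 16)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)  # lr as quoted
batch_size = 256  # as quoted in the row
```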