Superclass-Conditional Gaussian Mixture Model For Learning Fine-Grained Embeddings
Authors: Jingchao Ni, Wei Cheng, Zhengzhang Chen, Takayoshi Asakura, Tomoya Soma, Sho Kato, Haifeng Chen
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on benchmark datasets and a real-life medical dataset indicate the effectiveness of our method. |
| Researcher Affiliation | Industry | Jingchao Ni1, Wei Cheng1, Zhengzhang Chen1, Takayoshi Asakura2, Tomoya Soma2, Sho Kato3, Haifeng Chen1 1NEC Laboratories America, 2NEC Corporation, 3Renascience, Inc. |
| Pseudocode | Yes | Algorithm 1: Superclass-conditional Gaussian mixture model (SCGM) |
| Open Source Code | Yes | The code of SCGM is available at https://github.com/nijingchao/SCGM for reproducibility study. |
| Open Datasets | Yes | The table below summarizes the benchmark datasets: (1) BREEDS (Santurkar et al., 2020) includes four datasets {Living17, Nonliving26, Entity13, Entity30} derived from ImageNet with class hierarchy calibrated...; (2) CIFAR-100 (Krizhevsky, 2009); and (3) tieredImageNet (Ren et al., 2018) |
| Dataset Splits | Yes | For BREEDS and CIFAR-100, the val set is 10% of the train set. ...tieredImageNet... was divided into 20/6/8 splits for (disjoint) train/val/test sets. |
| Hardware Specification | Yes | Table 4: Computational costs on 4 Quadro RTX6000 24G GPUs. |
| Software Dependencies | No | The paper does not provide specific version numbers for software dependencies such as programming languages, libraries, or frameworks (e.g., Python, PyTorch, TensorFlow, CUDA versions). |
| Experiment Setup | Yes | For training, we used cosine annealing with warm restarts schedule (Loshchilov & Hutter, 2017) with 20 epochs per cycle. The batch size was 256 for BREEDS, 1024 for CIFAR-100, and 512 for tieredImageNet. The learning rate was 0.03 for BREEDS, and 0.12 for CIFAR-100 and tieredImageNet. The weight decay was 1e-4. All models were trained for 200 epochs. ... For SCGM, we set γ = 0.5, σ² = 0.1, and λ = 25 (λ follows (Asano et al., 2020)). |
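
The quoted setup maps onto a short training configuration. Below is a minimal sketch assuming a PyTorch implementation: only the quoted hyperparameters (schedule, batch size, learning rate, weight decay, epochs, γ, σ², λ) come from the paper, while the backbone, dataset, loss function, and SGD momentum value are hypothetical placeholders, not the authors' SCGM code.

```python
# Sketch of the reported training configuration (assumed PyTorch setup).
# The model, data, and loss are placeholders for illustration only.
import torch
from torch import nn
from torch.optim.lr_scheduler import CosineAnnealingWarmRestarts
from torch.utils.data import DataLoader, TensorDataset

# Quoted hyperparameters (BREEDS values shown; CIFAR-100 / tieredImageNet differ).
BATCH_SIZE = 256        # 1024 for CIFAR-100, 512 for tieredImageNet
LEARNING_RATE = 0.03    # 0.12 for CIFAR-100 and tieredImageNet
WEIGHT_DECAY = 1e-4
EPOCHS = 200
GAMMA, SIGMA_SQ, LAMBDA = 0.5, 0.1, 25  # SCGM settings (unused by the placeholder loss)

# Hypothetical backbone and dataset, standing in for the paper's encoder and BREEDS.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 128))
dataset = TensorDataset(torch.randn(1024, 3, 32, 32), torch.randint(0, 17, (1024,)))
loader = DataLoader(dataset, batch_size=BATCH_SIZE, shuffle=True)

# Momentum value is an assumption; the paper only states LR and weight decay.
optimizer = torch.optim.SGD(model.parameters(), lr=LEARNING_RATE,
                            momentum=0.9, weight_decay=WEIGHT_DECAY)
# Cosine annealing with warm restarts, 20 epochs per cycle (Loshchilov & Hutter, 2017).
scheduler = CosineAnnealingWarmRestarts(optimizer, T_0=20)

criterion = nn.CrossEntropyLoss()  # stand-in for the SCGM objective
for epoch in range(EPOCHS):
    for step, (x, y) in enumerate(loader):
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()
        # Fractional-epoch stepping, as recommended for warm-restart schedules.
        scheduler.step(epoch + step / len(loader))
```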