Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Expanding the Category of Classifiers with LLM Supervision
Authors: Derui Lyu, Xiangyu Wang, Taiyu Ban, Lyuzhou Chen, Xiren Zhou, Huanhuan Chen
IJCAI 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results show that CEMIL outperforms existing methods using expert-constructed attributes, demonstrating its effectiveness for fully automated classifier expansion without human participation. Extensive experiments demonstrate that CEMIL, which operates solely with an LLM, consistently outperforms state-of-the-art ZSL methods that rely on expert-constructed attributes, across a variety of ZSL datasets. |
| Researcher Affiliation | Academia | Derui Lyu, Xiangyu Wang, Taiyu Ban, Lyuzhou Chen, Xiren Zhou, Huanhuan Chen, University of Science and Technology of China |
| Pseudocode | No | The paper describes the methodology using textual explanations and a framework diagram (Figure 2), but no structured pseudocode or algorithm blocks are explicitly provided. |
| Open Source Code | No | The paper does not contain any explicit statement about the release of source code for the described methodology, nor does it provide a link to a code repository. |
| Open Datasets | Yes | Methods are evaluated on three widely used datasets: 1) AWA2 [Xian et al., 2018], an animal classification dataset featuring 50 mammal species; 2) CUB [Wah et al., 2011], a dataset containing 200 bird species; and 3) SUN [Patterson et al., 2014], a scene recognition dataset with 717 categories. |
| Dataset Splits | Yes | Each dataset is accompanied by expert-constructed class attributes and is divided into seen and unseen classes based on the splitting scheme in [Xian et al., 2017]. We perform evaluations of the methods under both standard and generalized ZSL settings. |
| Hardware Specification | Yes | Experiments are conducted on an NVIDIA GeForce RTX 4090 24GB GPU. |
| Software Dependencies | No | The paper mentions using 'GPT-4o [OpenAI, 2023] as the LLM and the text encoder of CLIP [Radford et al., 2021] as the embedding model' but does not specify versions for these models or for any other software dependencies, libraries, or programming languages used in the implementation. |
| Experiment Setup | Yes | The initial expected view number is set to 50. Each encoder is a single-layer MLP, while each decoder is a two-layer MLP with a hidden dimension of 4096. The attention vectors have dimension 2048. Neural network parameters are initialized randomly from a standard normal distribution. The Adam optimizer is used for training, with up to 500 epochs and an early stopping strategy. The learning rate is set to 1e-5, with batch sizes of 10, 16, and 32 for AWA2, CUB, and SUN, respectively. |
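For readers attempting to reproduce the reported setup, the hyperparameters from the Experiment Setup row can be collected into a small configuration sketch. This is a minimal illustration, not code from the paper: the names `TrainConfig` and `batch_sizes` are our own assumptions, and only the numeric values come from the quoted text.

```python
from dataclasses import dataclass, field

@dataclass
class TrainConfig:
    """Hypothetical container for the hyperparameters reported in the paper."""
    init_expected_views: int = 50      # initial expected view number
    decoder_hidden_dim: int = 4096     # hidden dim of the two-layer MLP decoders
    attn_dim: int = 2048               # dimension of the attention vectors
    lr: float = 1e-5                   # Adam learning rate
    max_epochs: int = 500              # upper bound; early stopping is also used
    # Per-dataset batch sizes as reported for AWA2, CUB, and SUN.
    batch_sizes: dict = field(default_factory=lambda: {
        "AWA2": 10, "CUB": 16, "SUN": 32,
    })

cfg = TrainConfig()
print(cfg.batch_sizes["CUB"])  # → 16
```

A dataclass with a `default_factory` for the mutable dict keeps the defaults safe to share across instances; the values would feed whatever training loop a reimplementation uses.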