Few-Shot Learning via Repurposing Ensemble of Black-Box Models
Authors: Minh Hoang, Trong Nghia Hoang
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct experimental studies showcasing the effectiveness of the proposed methods on a variety of realistic task domains with high-complexity task models (Section 4). To demonstrate the effectiveness of the proposed black-box model fusion via repurposing or reprogramming (MFR) approach, we set up experimental scenarios where multiple specialized pre-trained models need to be combined and reprogrammed to solve another related task with limited domain data. |
| Researcher Affiliation | Academia | Minh Hoang¹, Trong Nghia Hoang²; ¹Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08540; ²School of Electrical Engineering and Computer Science, Washington State University, Pullman, WA 99613 |
| Pseudocode | No | The paper describes the proposed method conceptually and mathematically, but does not include structured pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | The interested readers are referred to the extended version of this paper1 for all appendices. 1https://htnghia87.github.io/publication/aaai2024a. This link leads to a publication page that includes a GitHub repository link for the code. |
| Open Datasets | Yes | These are derived from real-world benchmark datasets in computer vision, which include MNIST (LeCun, Cortes, and Burges 2010), CIFAR-10 (Krizhevsky 2009), Mini-ImageNet (Vinyals et al. 2017) and the Large-Scale CelebFaces Attributes (CelebA) (Liu et al. 2015). |
| Dataset Splits | No | For MNIST: 'Among which 50,000 images are used as training data and the remaining is used for testing (LeCun, Cortes, and Burges 2010).' For CIFAR-10: 'There are 50,000 training images and 10,000 test images.' The paper specifies training and testing data but does not explicitly provide details for a validation split (e.g., size or percentage). |
| Hardware Specification | No | The paper mentions training models and using sophisticated architectures like convolutional neural networks, but it does not provide specific details about the hardware used (e.g., GPU/CPU models, memory, or number of machines). |
| Software Dependencies | No | The paper describes the implementation using flow-based generative models and neural networks, but it does not specify any particular software versions for programming languages (e.g., Python version) or libraries (e.g., PyTorch version). |
| Experiment Setup | No | The paper mentions architectural components such as 'two invertible flow blocks', each composed of Planar Flow and Radial Flow, and discusses various layers and activation functions. However, it explicitly states that 'The specifics of these deconvolution, pooling and feed-forward nets depend on the application domain and are deferred to Appendix D.' and 'The exact configurations of these layers depend on the specifics of the application domain and are deferred to Appendix D.', meaning concrete hyperparameter values and detailed training configurations are not provided in the main text. |
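For context on the flow blocks named in the Experiment Setup row: a planar flow is a standard invertible transform (Rezende & Mohamed 2015). The sketch below is a minimal NumPy illustration of what one planar-flow step computes; it is not the paper's implementation, and all parameter values are arbitrary placeholders.

```python
import numpy as np

def planar_flow(z, u, w, b):
    """One planar-flow step f(z) = z + u * tanh(w.z + b).
    Returns the transformed sample and log|det Jacobian|, the
    quantity needed by the change-of-variables density formula."""
    a = np.tanh(w @ z + b)            # scalar pre-activation, squashed
    f_z = z + u * a                   # transformed sample
    psi = (1.0 - a ** 2) * w          # h'(w.z + b) * w
    log_det = np.log(np.abs(1.0 + u @ psi))
    return f_z, log_det

# Illustrative 2-D example composing two flow steps, loosely mirroring
# the "two invertible flow blocks" phrasing; parameters are random.
rng = np.random.default_rng(0)
z = rng.standard_normal(2)
log_det_total = 0.0
for _ in range(2):
    u, w, b = rng.standard_normal(2), rng.standard_normal(2), 0.1
    z, ld = planar_flow(z, u, w, b)
    log_det_total += ld
```

The accumulated `log_det_total` is what a flow-based model adds to the base-density log-likelihood; the paper's actual blocks (Planar plus Radial, with domain-specific configurations) are deferred to its Appendix D.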