Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
HiFE: Hierarchical Feature Ensemble Framework for Few-shot Hypotheses Adaptation
Authors: Yongfeng Zhong, Haoang Chi, Feng Liu, Xiao-Ming Wu, Bo Han
TMLR 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical evidence from our experiments indicates that these weaker models, while not optimal within the source domain context, contribute to an enhanced generalization capacity of the resultant model for the target domain. Moreover, the HiFE framework we introduce demonstrates superior performance, surpassing other leading baselines across a spectrum of few-shot hypothesis adaptation scenarios. [...] Comprehensive evaluation of the proposed HiFE methodology, conducted over an array of benchmark datasets including MNIST, SVHN, USPS, CIFAR-10, STL-10, Amazon, DSLR, Webcam, and VisDA-C, has established that our approach achieves performance on par with or exceeding current SOTA methods in various FHA tasks. |
| Researcher Affiliation | Academia | Yongfeng Zhong EMAIL Department of Data Science and Artificial Intelligence The Hong Kong Polytechnic University [...] Haoang Chi EMAIL Intelligent Game and Decision Lab, Defense Innovation Institute National University of Defense Technology [...] Feng Liu EMAIL Computing and Information Systems University of Melbourne [...] Xiaoming Wu EMAIL Department of Data Science and Artificial Intelligence The Hong Kong Polytechnic University [...] Bo Han EMAIL Department of Computer Science Hong Kong Baptist University |
| Pseudocode | No | The paper describes the HiFE framework and its components (WRU, DeCL) using architectural diagrams (Figure 2) and mathematical formulations, but it does not present a distinct pseudocode or algorithm block. |
| Open Source Code | Yes | The full code is available at https://github.com/yfZhong/HIFE.git. |
| Open Datasets | Yes | Datasets. We conduct experiments on various standard DA benchmarks to evaluate our approach. Digits. We choose three digit datasets, i.e., MNIST (M), USPS (U), and SVHN (S) for our experiments. [...] Office. We use three domains of the office datasets (Saenko et al., 2010): Amazon (A), DSLR (D), and Webcam (W). [...] Image classification. We use two image classification benchmarks CIFAR-10 (CF) (Krizhevsky, 2009) and STL-10 (ST) (Coates et al., 2011). [...] VisDA-C. VisDA-C (Peng et al., 2017) is a demanding large-scale benchmark designed primarily for the 12-class synthesis-to-real object recognition task. |
| Dataset Splits | Yes | Digits. We choose three digit datasets, i.e., MNIST (M), USPS (U), and SVHN (S) for our experiments. Following (Motiian et al., 2017; Chi et al., 2021), we experiment with different numbers of target samples from 1 to 7 per class. [...] Office. We conduct several experiments with different numbers of target samples per class ranging from 1 to 5. [...] As the two domains are more complex than digits, we increase the number of target samples to 15 and 20 for each class. [...] VisDA-C. We randomly choose 10% of the target data set (7200 images) as the testing set. [...] we experiment on a larger number (10, 30, and 50) of the target samples. |
| Hardware Specification | Yes | The network uses the PyTorch framework on a PC with four NVIDIA 2080 Ti GPUs. |
| Software Dependencies | No | The paper mentions the 'PyTorch framework' and 'SGD with Nesterov momentum' but does not specify version numbers for any software libraries or packages. |
| Experiment Setup | Yes | We trained the source hypothesis using a stochastic gradient descent (SGD) optimizer with a momentum value of 0.5 with the learning rate initialized to 1e-2 and decreased to 1e-5 step by step. During the adaptation, we adopt SGD with Nesterov momentum (Ruder, 2016) with a momentum value of 0.9. Following (Liang et al., 2020), we insert a batch normalization layer and a weight normalization layer before the end of each encoder and classifier, respectively. [...] We study the advantage of our training loss by incorporating the feature DeCL loss L_DeCL in Equation (4) with different β values ranging from 0 to 1.0 with the digit datasets. [...] In our HiFE, we set the number of input features for each WRU to two, leading to a four-layer feature ensemble structure. [...] The batch size, the number of target samples per class, and the number of pre-trained models are set to 128, 7, and 8, respectively. |
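The source-training learning-rate schedule quoted above ("initialized to 1e-2 and decreased to 1e-5 step by step") can be sketched as a stepwise multiplicative decay. This is a minimal illustration, not the authors' code: the decay factor of 10 per step is an assumption, since the paper does not state the exact schedule.

```python
def step_lr_schedule(initial_lr=1e-2, final_lr=1e-5, decay_factor=0.1):
    """Return the sequence of learning rates decayed stepwise
    from initial_lr down to final_lr (decay_factor is assumed).
    """
    lrs = []
    lr = initial_lr
    # Small tolerance guards against floating-point drift
    # (e.g. 0.01 * 0.1 * 0.1 * 0.1 is slightly above 1e-5).
    while lr >= final_lr * 0.999:
        lrs.append(lr)
        lr *= decay_factor
    return lrs

# With the paper's stated endpoints, this yields four stages:
# roughly 1e-2, 1e-3, 1e-4, 1e-5.
print(step_lr_schedule())
```

In a PyTorch training loop, each stage would set the `lr` of the SGD optimizer (momentum 0.5 for source training, Nesterov momentum 0.9 during adaptation, batch size 128, per the paper).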