Accelerating Convergence in Bayesian Few-Shot Classification
Authors: Tianjun Ke, Haoqun Cao, Feng Zhou
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results demonstrate competitive classification accuracy, improved uncertainty quantification, and faster convergence compared to baseline models. |
| Researcher Affiliation | Academia | Center for Applied Statistics and School of Statistics, Renmin University of China, Beijing, China; Beijing Advanced Innovation Center for Future Blockchain and Privacy Computing. |
| Pseudocode | Yes | Algorithm 1: Mirror Descent based Bayesian Few-Shot Classification. Training. Input: features and class labels for $S$ tasks, $\{X_s\}_{s=1}^{S}$ and $\{y_s\}_{s=1}^{S}$; Output: GP kernel hyperparameter $\eta$. Initialize $\eta$ and variational parameters $\tilde{\theta}_0 = 0$, $\theta_1 = \eta$; for each iteration: for each task $s$: (update task-specific parameters) for each step $t$: update $\tilde{\theta}^s_t$ by Equation (4) and Section 3.3, then set $\theta^s_{t+1} = \tilde{\theta}^s_t + \eta$; (update task-common parameters) update $\eta$ by Equation (5). Test. Input: support set $\mathcal{S} = \{X, y\}$, query set $\mathcal{Q} = X^*$, learned hyperparameter $\hat{\eta}$; Output: predicted labels. Initialize $\tilde{\theta}_0 = 0$, $\theta_1 = \hat{\eta}$; for each step $t$: update $\tilde{\theta}_t$ by Equation (4) and Section 3.3, then set $\theta_{t+1} = \tilde{\theta}_t + \hat{\eta}$; for each $x^* \in X^*$: predict $y^*$ by Equation (6). (A Python sketch of this training/test loop appears after the table.) |
| Open Source Code | Yes | Code is publicly available at https://github.com/keanson/MD-BSFC. |
| Open Datasets | Yes | We address three challenging tasks using benchmark datasets, including Caltech-UCSD Birds (Wah et al., 2011), mini-ImageNet (Ravi & Larochelle, 2017), Omniglot (Lake et al., 2011), and EMNIST (Cohen et al., 2017). |
| Dataset Splits | Yes | The standard split of 100 training, 50 validation, and 50 test classes is employed (Snell & Zemel, 2021). ... We employed the common split of 64 training, 16 validation, and 20 test classes as well (Snell & Zemel, 2021). ... In the domain transfer task, we utilize 31 classes for validation and the rest for testing. |
| Hardware Specification | Yes | We use one Quadro RTX 6000 to run each method. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers. |
| Experiment Setup | Yes | The Adam optimizer is employed across all experiments in the outer loop, with a standard learning rate of $10^{-3}$ for the neural network and $10^{-4}$ for the other kernel parameters. For a single epoch, 100 random episodes are sampled from the complete dataset for all methods. For the variational-inference steps, we run 3 steps with $\rho = 1$ during training and 50 steps with $\rho = 0.5$ during testing. (A PyTorch sketch of this optimizer configuration appears below.) |
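
Below is a minimal Python sketch of the training/test control flow in Algorithm 1. The functions `mirror_descent_step`, `update_hyperparameter`, and `predict` are hypothetical stand-ins for the paper's Equations (4), (5), and (6), which are not reproduced in this report; only the loop structure mirrors the pseudocode.

```python
import numpy as np

# Hypothetical stand-ins for the paper's Equations (4)-(6); the real update
# rules are not reproduced here, so these bodies are placeholders only.
def mirror_descent_step(theta_tilde, theta, X, y, rho):
    """Eq. (4) stand-in: one mirror-descent update of the task-specific
    variational parameters. Ignores the data; placeholder only."""
    return (1.0 - rho) * theta_tilde

def update_hyperparameter(eta, theta_tilde, lr=1e-4):
    """Eq. (5) stand-in: one outer-loop step on the GP kernel hyperparameter."""
    return eta - lr * (eta - theta_tilde)

def predict(theta, X_query):
    """Eq. (6) stand-in: predictive labels from the converged posterior."""
    return np.zeros(len(X_query), dtype=int)

def train(tasks, dim, n_iters=100, inner_steps=3, rho=1.0):
    eta = np.zeros(dim)                        # GP kernel hyperparameter
    for _ in range(n_iters):
        for X_s, y_s in tasks:
            theta_tilde = np.zeros(dim)        # task-specific init: theta~_0 = 0
            theta = theta_tilde + eta          # theta_1 = eta
            for _ in range(inner_steps):
                theta_tilde = mirror_descent_step(theta_tilde, theta, X_s, y_s, rho)
                theta = theta_tilde + eta      # theta_{t+1} = theta~_t + eta
            eta = update_hyperparameter(eta, theta_tilde)  # task-common update
    return eta

def test(support, X_query, eta_hat, steps=50, rho=0.5):
    X, y = support
    theta_tilde = np.zeros_like(eta_hat)       # theta~_0 = 0
    theta = theta_tilde + eta_hat              # theta_1 = eta_hat
    for _ in range(steps):
        theta_tilde = mirror_descent_step(theta_tilde, theta, X, y, rho)
        theta = theta_tilde + eta_hat
    return predict(theta, X_query)
```

Under these placeholder rules the script runs end to end but only returns dummy labels; substituting the paper's actual Equations (4)-(6) would recover the real method.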
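
And a minimal PyTorch sketch of the quoted outer-loop optimizer setup, assuming a generic backbone and a single kernel parameter: `feature_net` and `kernel_log_lengthscale` are hypothetical names, not the repository's, while the learning rates and step counts come from the quoted setup.

```python
import torch

# Hypothetical backbone and kernel parameter; only the learning rates and
# step counts below come from the quoted experiment setup.
feature_net = torch.nn.Sequential(torch.nn.Linear(64, 64), torch.nn.ReLU())
kernel_log_lengthscale = torch.nn.Parameter(torch.zeros(1))

# Adam with per-group learning rates, as described in the setup.
optimizer = torch.optim.Adam([
    {"params": feature_net.parameters(), "lr": 1e-3},   # neural network: 1e-3
    {"params": [kernel_log_lengthscale], "lr": 1e-4},   # kernel parameters: 1e-4
])

EPISODES_PER_EPOCH = 100            # 100 random episodes per epoch
TRAIN_VI_STEPS, TRAIN_RHO = 3, 1.0  # variational-inference steps at training time
TEST_VI_STEPS, TEST_RHO = 50, 0.5   # variational-inference steps at test time
```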