Improving Neural Network Generalization on Data-Limited Regression with Doubly-Robust Boosting
Authors: Hao Wang
AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments demonstrate that DRBoost effectively improves the generalization performance of various prevalent neural network backbones. We conduct extensive experiments on real-world datasets to verify the efficacy of DRBoost in diverse settings. Datasets. In this study, we validate the efficacy of DRBoost in regression tasks. The datasets are partitioned into training, validation, and testing sets in a 0.7:0.15:0.15 ratio. |
| Researcher Affiliation | Academia | Zhejiang University haohaow@zju.edu.cn |
| Pseudocode | Yes | Algorithm 1: The workflow of DRBoost. Input: X: input features; Y: labels; G: pretrained neural network with activation function σ. Parameter: L: metric of interest; d_s: number of latent bases for the statistical learner A_s; Z = [z_1, ..., z_P]: initialized population with P individuals for the heuristic optimizer A_z; T: number of iterations. Output: boosted model G′. 1: H ← generate(G, X). 2: Z ← initialize(W^(L-1)). 3: for t ∈ [0, T] do 4: for p ∈ [0, P] do 5: R_p ← σ(H z_p). 6: s_p ← A_s.fit(R_p, Y, d_s). 7: I_p ← L(A_s.predict(R_p, s_p), Y). 8: end for 9: Z ← A_z.update({I_p}_{p=1}^P, Z). 10: z_best ← A_z.updateBest({I_p}_{p=1}^P, Z, z_best). 11: end for 12: s ← A_s.fit(σ(H z_best), Y, d_s). 13: G′ ← derive(G, z_best, s). (See the Python sketch after this table.) |
| Open Source Code | No | The paper does not contain an explicit statement or a link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | We utilize well-known public benchmarks in this domain (Chen et al. 2001; Li et al. 2018; Lai et al. 2018), with relevant dataset statistics provided in Table 1. |
| Dataset Splits | Yes | The datasets are partitioned into training, validation, and testing sets in a 0.7:0.15:0.15 ratio. |
| Hardware Specification | No | The paper does not explicitly state the specific hardware used (e.g., GPU models, CPU types, or cloud instance specifications) for running the experiments. |
| Software Dependencies | No | The paper mentions using the 'Adam optimizer' and implementing 'PLSR' and 'metaheuristic optimizers' (PSO, CS, BAS), but it does not specify any software names with version numbers (e.g., Python, PyTorch, TensorFlow with their respective versions) that are necessary to reproduce the experiment. |
| Experiment Setup | Yes | Pretraining protocol. We train all neural models for 200 epochs with the Adam optimizer, using mean squared error (MSE) as a surrogate loss function. All models are trained with the same set of hyperparameters to make the results comparable. Specifically, we set the learning rate to 0.001 and keep the other hyperparameters consistent with Kingma and Ba (2015). For zero-order optimization, we implement it using metaheuristic optimizers, with hyperparameters tuned to optimize performance on the validation set. To avoid overclaiming, we perform Algorithm 1 with fixed d_s = 48 and disable the ensemble trick in Section 3.4. We select a multilayer perceptron (MLP) with 64-64-64-1 neurons as the backbone model. (See the PyTorch pretraining sketch after this table.) |
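
The paper does not release code, so the following is a minimal NumPy/scikit-learn sketch of the Algorithm 1 workflow, not the authors' implementation. It assumes ReLU as the activation σ, uses `PLSRegression` as the statistical learner A_s, and substitutes a simple Gaussian-perturbation search for the paper's metaheuristic optimizers (PSO, CS, BAS); the function name `drboost`, all shapes, and the population/step-size defaults are illustrative. Note that `d_s` must not exceed the hidden width of `H`, and the fitness is evaluated on the fit data here for brevity, whereas the paper tunes against a validation split.

```python
# Minimal sketch of Algorithm 1 (DRBoost workflow); assumptions noted in the text above.
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.metrics import mean_squared_error


def drboost(H, W_last, Y, d_s=48, P=20, T=50, step=0.05, seed=None):
    """Zero-order boosting of the last hidden layer.

    H      : (n, d) activations feeding the last hidden layer (generate(G, X)).
    W_last : (d, d) pretrained weights of that layer, used to seed the search.
    Y      : (n,) regression targets.
    Returns the boosted weights z_best and the fitted PLSR head s.
    """
    rng = np.random.default_rng(seed)
    act = lambda x: np.maximum(x, 0.0)              # sigma(.) in Algorithm 1 (assumed ReLU)

    def fitness(z):
        R = act(H @ z)                              # R_p <- sigma(H z_p)
        s = PLSRegression(n_components=d_s).fit(R, Y)   # s_p <- A_s.fit(R_p, Y, d_s)
        return mean_squared_error(Y, s.predict(R)), s   # I_p <- L(A_s.predict(R_p, s_p), Y)

    z_best = W_last.copy()                          # Z <- initialize(W^(L-1))
    best_loss, _ = fitness(z_best)

    for _ in range(T):                              # outer loop over iterations
        # Population of P perturbed candidates: a stand-in for the metaheuristic update A_z.
        Z = [z_best + step * rng.standard_normal(z_best.shape) for _ in range(P)]
        for z in Z:                                 # inner loop over individuals
            loss, _ = fitness(z)
            if loss < best_loss:                    # z_best <- A_z.updateBest(...)
                best_loss, z_best = loss, z

    _, s = fitness(z_best)                          # s <- A_s.fit(sigma(H z_best), Y, d_s)
    return z_best, s                                # inputs to derive(G, z_best, s)
```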
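
The pretraining protocol in the Experiment Setup row can be sketched as below. The framework is not stated in the paper, so PyTorch and the batch size of 64 are assumptions; only the stated choices — a 64-64-64-1 MLP backbone, Adam with learning rate 0.001 and otherwise default hyperparameters, an MSE surrogate loss, and 200 epochs — come from the paper.

```python
# Minimal PyTorch sketch of the pretraining protocol; framework and batch size are assumed.
import torch
from torch import nn


def pretrain(X, Y, epochs=200, lr=1e-3, batch_size=64):
    """Train the 64-64-64-1 MLP backbone with Adam and an MSE surrogate loss.

    X : float tensor of shape (n, num_features); Y : float tensor of shape (n, 1).
    """
    model = nn.Sequential(
        nn.Linear(X.shape[1], 64), nn.ReLU(),
        nn.Linear(64, 64), nn.ReLU(),
        nn.Linear(64, 64), nn.ReLU(),
        nn.Linear(64, 1),
    )
    opt = torch.optim.Adam(model.parameters(), lr=lr)   # remaining defaults as in Kingma and Ba (2015)
    loss_fn = nn.MSELoss()
    loader = torch.utils.data.DataLoader(
        torch.utils.data.TensorDataset(X, Y), batch_size=batch_size, shuffle=True)

    for _ in range(epochs):
        for xb, yb in loader:
            opt.zero_grad()
            loss_fn(model(xb), yb).backward()
            opt.step()
    return model
```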