Improving Neural Network Generalization on Data-Limited Regression with Doubly-Robust Boosting

Authors: Hao Wang

AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments demonstrate that DRBoost improves the generalization performance of various prevalent neural network backbones effectively. We conduct extensive experiments on real-world datasets to verify the efficacy of DRBoost in diverse settings. Datasets. In this study, we validate the efficacy of DRBoost in regression tasks. The datasets are partitioned into training, validation, and testing sets in a 0.7:0.15:0.15 ratio.
Researcher Affiliation | Academia | Zhejiang University, haohaow@zju.edu.cn
Pseudocode | Yes | Algorithm 1: The workflow of DRBoost.
    Input: X: input features; Y: labels; G: pretrained neural network with activation function σ.
    Parameters: L: metric of interest; d_s: number of latent bases for the statistical learner A_s; Z = [z_1, ..., z_P]: initialized population with P individuals for the heuristic optimizer A_z; T: number of iterations.
    Output: boosted model G′.
    1: H ← generate(G, X)
    2: Z ← initialize(W^(L-1))
    3: for t ∈ [0, T] do
    4:   for p ∈ [0, P] do
    5:     R_p ← σ(H z_p)
    6:     s_p ← A_s.fit(R_p, Y, d_s)
    7:     I_p ← L(A_s.predict(R_p, s_p), Y)
    8:   end for
    9:   Z ← A_z.update({I_p}_{p=1..P}, Z)
    10:  z_best ← A_z.updateBest({I_p}_{p=1..P}, Z, z_best)
    11: end for
    12: s ← A_s.fit(σ(H z_best), Y, d_s)
    13: G′ ← derive(G, z_best, s)
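For orientation, the following is a minimal Python sketch of the Algorithm 1 workflow, not the authors' code. It assumes σ is ReLU, uses scikit-learn's PLSRegression as the statistical learner A_s, and substitutes a simple random-perturbation search for the metaheuristic optimizer A_z (PSO/CS/BAS); the function name drboost and all shapes are illustrative.

import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.metrics import mean_squared_error

def drboost(H, Y, W_last, d_s=48, P=10, T=20, seed=None):
    """H: penultimate-layer features (n, d_h); W_last: pretrained last hidden-layer weights (d_h, d_r).
    Requires n >= d_s and d_r >= d_s so that PLSR can extract d_s latent bases."""
    sigma = lambda x: np.maximum(x, 0.0)          # assumed activation (ReLU)
    rng = np.random.default_rng(seed)
    # Step 2: initialise the population around the pretrained weights W^(L-1).
    Z = [W_last + 0.1 * rng.standard_normal(W_last.shape) for _ in range(P)]
    z_best, loss_best = W_last, np.inf
    for _ in range(T):                            # Step 3
        for p in range(P):                        # Step 4
            R_p = sigma(H @ Z[p])                 # Step 5: candidate representation
            A_s = PLSRegression(n_components=d_s).fit(R_p, Y)      # Step 6
            I_p = mean_squared_error(Y, A_s.predict(R_p))          # Step 7: metric L (MSE here)
            if I_p < loss_best:                   # Step 10: track the best individual
                z_best, loss_best = Z[p], I_p
        # Steps 9-10 stand-in: perturb the population around the current best individual.
        Z = [z_best + 0.05 * rng.standard_normal(z_best.shape) for _ in range(P)]
    # Steps 12-13: refit the statistical learner on the best representation.
    s_best = PLSRegression(n_components=d_s).fit(sigma(H @ z_best), Y)
    return z_best, s_best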
Open Source Code | No | The paper does not contain an explicit statement or a link indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | We utilize well-known public benchmarks in this domain (Chen et al. 2001; Li et al. 2018; Lai et al. 2018), with relevant dataset statistics provided in Table 1.
Dataset Splits | Yes | The datasets are partitioned into training, validation, and testing sets in a 0.7:0.15:0.15 ratio.
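For concreteness, a 0.7:0.15:0.15 partition can be reproduced with two chained splits. This is a generic sketch using scikit-learn's train_test_split on toy data; the paper's excerpt does not state the seed or splitting tool, so both are assumptions here.

import numpy as np
from sklearn.model_selection import train_test_split

X, y = np.random.rand(1000, 16), np.random.rand(1000)   # toy data for illustration only
# First carve off 30%, then split that portion half-and-half into validation and test.
X_train, X_tmp, y_train, y_tmp = train_test_split(X, y, test_size=0.30, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_tmp, y_tmp, test_size=0.50, random_state=0)
# Result: 70% training, 15% validation, 15% testing.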
Hardware Specification | No | The paper does not explicitly state the specific hardware used (e.g., GPU models, CPU types, or cloud instance specifications) for running the experiments.
Software Dependencies | No | The paper mentions using the 'Adam optimizer' and implementing 'PLSR' and 'metaheuristic optimizers' (PSO, CS, BAS), but it does not specify any software names with version numbers (e.g., Python, PyTorch, TensorFlow with their respective versions) that are necessary to reproduce the experiment.
Experiment Setup | Yes | Pretraining protocol. We train all neural models for 200 epochs with the Adam optimizer, using mean squared error (MSE) as a surrogate loss function. All models are trained with the same set of hyperparameters to make the results comparable. Specifically, we set the learning rate to 0.001 and keep the other hyperparameters consistent with Kingma and Ba (2015). For zero-order optimization, we implement it using metaheuristic optimizers, with hyperparameters tuned to optimize performance on the validation set. To avoid overclaiming, we perform Algorithm 1 with fixed d_s = 48 and disable the ensemble trick in Section 3.4. We select a multilayer perceptron (MLP) with 64-64-64-1 neurons as the backbone model.
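A hedged sketch of this pretraining protocol, assuming a PyTorch implementation (the framework is not named in the excerpt): a 64-64-64-1 MLP backbone trained for 200 epochs with Adam at learning rate 0.001 and MSE loss. Full-batch updates are used purely for brevity, since the paper's batch size is not given.

import torch
import torch.nn as nn

def pretrain_backbone(X_train, y_train, epochs=200, lr=1e-3):
    # X_train: float tensor (n, d); y_train: float tensor (n, 1).
    model = nn.Sequential(
        nn.Linear(X_train.shape[1], 64), nn.ReLU(),
        nn.Linear(64, 64), nn.ReLU(),
        nn.Linear(64, 64), nn.ReLU(),
        nn.Linear(64, 1),                          # 64-64-64-1 MLP backbone
    )
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)   # defaults as in Kingma and Ba (2015)
    loss_fn = nn.MSELoss()                         # MSE surrogate loss
    for _ in range(epochs):                        # 200 epochs, full-batch for brevity
        optimizer.zero_grad()
        loss = loss_fn(model(X_train), y_train)
        loss.backward()
        optimizer.step()
    return model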