Bagging by Design (on the Suboptimality of Bagging)
Authors: Periklis Papakonstantinou, Jia Xu, Zhu Cao
AAAI 2014 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our analytical results are backed up by experiments on classification and regression settings. Empirical results: We provide empirical evidence (i) supporting the covariance assumption and (ii) comparing bagging to design-bagging in classification and regression settings. |
| Researcher Affiliation | Academia | Periklis A. Papakonstantinou, Jia Xu, and Zhu Cao; IIIS, Tsinghua University |
| Pseudocode | Yes | Algorithm 1: Blocks Generating Algorithm (BGA). Input: block size b, number of blocks m, universe size N. Initialize m empty blocks. For i = 1 to b·m do: choose L at random from the set of blocks with the current minimum number of elements; let S be the set of elements in the universe not in L that appear least frequently; L ← L ∪ {e}, where e ∈ S is chosen uniformly at random. End for. Output: m blocks, each with b distinct elements. (A Python sketch of this algorithm follows the table.) |
| Open Source Code | No | The paper does not provide an explicit statement about releasing source code, nor does it include a link to a code repository. |
| Open Datasets | Yes | Our study is on various base learners and data sets from the UCI repository and on the real data set MNIST. The details of the data sets can be obtained from the UCI repository; the number of samples, features, and classes can be found in the full version. |
| Dataset Splits | Yes | For each task, the test set is 10% of the whole data set, selected uniformly at random, and the remaining samples are taken as the training set. On Fisher's Iris data, we applied 10-fold cross-validation to evaluate a binary classification task... |
| Hardware Specification | No | The paper does not provide any specific hardware details such as CPU or GPU models used for running the experiments. |
| Software Dependencies | No | The paper mentions software used (e.g., 'SVM in Matlab' and 'Decision Tree C4.5') but does not specify version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | Bagging and design-bagging are performed on 30 bootstraps (m = 30) combined by voting, and the number of samples in each bootstrap is set to N/2 (cf. Bühlmann and Yu 2002; Friedman and Hall 2007, justifying this choice), where N is the number of training samples. Each classification experiment is repeated 1000 times to remove random noise; polynomial regression is repeated 450K times. (A sketch of this ensemble setup follows the table.) |
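
The BGA pseudocode quoted in the table is compact, so here is a minimal Python sketch of one way to read it, assuming the universe is the integers 0..N-1; the function name `generate_blocks` and the set-based data layout are illustrative choices, not taken from the paper.

```python
import random

def generate_blocks(b, m, n):
    """Sketch of the Blocks Generating Algorithm (BGA) quoted above.
    b: block size, m: number of blocks, n: universe size (elements 0..n-1)."""
    blocks = [set() for _ in range(m)]           # initialize m empty blocks
    freq = [0] * n                               # how often each element has been placed so far
    for _ in range(b * m):
        # choose a block L uniformly among those with the current minimum number of elements
        min_size = min(len(blk) for blk in blocks)
        L = random.choice([blk for blk in blocks if len(blk) == min_size])
        # S: elements of the universe not in L that appear least frequently so far
        candidates = [e for e in range(n) if e not in L]
        min_freq = min(freq[e] for e in candidates)
        S = [e for e in candidates if freq[e] == min_freq]
        e = random.choice(S)                     # chosen uniformly at random from S
        L.add(e)                                 # L <- L ∪ {e}
        freq[e] += 1
    return blocks                                # m blocks, each with b distinct elements
```

Because every insertion goes into a block of currently minimum size, block sizes stay within one of each other, so after b·m insertions all m blocks contain exactly b distinct elements, matching the stated output.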
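The experiment-setup row describes m = 30 sub-training sets of size N/2 combined by majority voting. The following is a minimal sketch of how one such trial could be wired up, reusing `generate_blocks` from the sketch above; the scikit-learn decision tree, the helper names (`fit_ensemble`, `majority_vote`, `run_once`), and the integer-label assumption are stand-ins for illustration, not the authors' Matlab SVM / C4.5 pipeline.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def fit_ensemble(X, y, index_sets):
    """Fit one base learner per index set (assumed stand-in: a decision tree)."""
    return [DecisionTreeClassifier().fit(X[idx], y[idx]) for idx in index_sets]

def majority_vote(models, X):
    """Combine the ensemble by majority voting (assumes integer-encoded labels)."""
    preds = np.stack([m.predict(X) for m in models]).astype(int)  # (n_models, n_test)
    return np.array([np.bincount(col).argmax() for col in preds.T])

def run_once(X_train, y_train, X_test, y_test, m=30):
    """One trial with m = 30 sub-training sets of size N/2, as in the quoted setup."""
    N = len(X_train)
    # plain bagging: bootstrap samples of size N/2 drawn with replacement
    bag_idx = [np.random.choice(N, N // 2, replace=True) for _ in range(m)]
    # design-bagging: balanced blocks of distinct indices from the BGA sketch above
    des_idx = [np.fromiter(blk, dtype=int) for blk in generate_blocks(N // 2, m, N)]
    acc = {}
    for name, idx_sets in (("bagging", bag_idx), ("design-bagging", des_idx)):
        pred = majority_vote(fit_ensemble(X_train, y_train, idx_sets), X_test)
        acc[name] = float(np.mean(pred == y_test))
    return acc
```

Repeating `run_once` over fresh 90%/10% train/test splits (1000 repetitions for the classification tasks, per the quoted setup) would correspond to the averaging the paper describes.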