Robust Model Compression Using Deep Hypotheses
Authors: Omri Armstrong, Ran Gilad-Bachrach
AAAI 2021, pp. 6688-6695
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the success of this algorithm empirically by compressing neural networks and random forests into small decision trees, which are interpretable models, and show that they are more accurate and robust than other comparable methods. In addition, our empirical study shows that our method outperforms Knowledge Distillation on DNN to DNN compression. |
| Researcher Affiliation | Academia | Omri Armstrong, Ran Gilad-Bachrach Tel Aviv University Ramat Aviv 699780, Tel Aviv armstrong@mail.tau.ac.il, rgb@tauex.tau.ac.il |
| Pseudocode | Yes | Algorithm 1: Multiclass Empirical Median Optimization (MEMO) Algorithm (Section 3) and Algorithm 2: Compact Robust Estimated Median Belief Optimization (CREMBO) (Section 4). |
| Open Source Code | Yes | Our code is available at https://github.com/TAU-ML-well/Rubust-Model-Compression. |
| Open Datasets | Yes | We evaluated the CREMBO algorithm on five classification tasks (Table 1) from the UCI repository (Dua and Graff 2017)... The models were trained on the CIFAR-10 dataset (Krizhevsky, Hinton et al. 2009). |
| Dataset Splits | Yes | To find the median tree, we split S_train into train and validation sets, S_train and S_val, with a random 15% split and run the CREMBO algorithm. (Section 5.1). Then we divided the training set into a train and validation set with a random 10% split. (Section 5.2). A hedged sketch of these splits appears after the table. |
| Hardware Specification | No | The paper mentions software used (PyTorch, scikit-learn) and training parameters but does not specify any hardware details like GPU models, CPU types, or memory used for the experiments. |
| Software Dependencies | No | The paper mentions using 'PyTorch (Paszke et al. 2017)' and the 'scikit-learn (Pedregosa et al. 2011) package' but does not specify their version numbers. |
| Experiment Setup | Yes | The DNNs are all fully connected with two hidden layers of 128 units with ReLU activation functions. They were trained with an ADAM optimizer with default parameters and batch size of 32 for 10 epochs. (Section 5.1). We used ADAM optimizer, batch size of 128, learning rate of 0.01 for 60 epochs and then learning rate of 0.001 for another 30 epochs. (Section 5.2). A hedged PyTorch sketch of these configurations appears after the table. |
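
The Research Type and Pseudocode rows describe compressing DNNs and random forests into small, interpretable decision trees via the CREMBO and MEMO algorithms. The sketch below shows only the generic hard-label distillation setting (a teacher model relabels the training data and a small tree is fit to those labels), not the authors' CREMBO or MEMO procedures; the synthetic data and all hyperparameters are illustrative assumptions.

```python
# Generic teacher-to-tree compression baseline, NOT the paper's CREMBO/MEMO
# algorithms. Data, depth, and forest size are illustrative placeholders.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.random((1000, 20))
y = (X[:, 0] + X[:, 1] > 1.0).astype(int)  # synthetic stand-in labels

# Teacher: a random forest trained on the original labels.
teacher = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Student: a small decision tree fit to the teacher's predicted labels.
student = DecisionTreeClassifier(max_depth=4, random_state=0)
student.fit(X, teacher.predict(X))
```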
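For the Open Datasets row, a minimal data-loading sketch, assuming torchvision is used to fetch CIFAR-10 (the excerpt does not say how the data was obtained); the five UCI tasks would be loaded separately, e.g. from the UCI repository files.

```python
# Hedged sketch of fetching CIFAR-10; torchvision is an assumption, not
# something the quoted text confirms.
from torchvision import datasets, transforms

transform = transforms.ToTensor()
cifar_train = datasets.CIFAR10(root="./data", train=True, download=True,
                               transform=transform)
cifar_test = datasets.CIFAR10(root="./data", train=False, download=True,
                              transform=transform)
```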
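For the Dataset Splits row, a minimal sketch of the reported random 15% (Section 5.1) and 10% (Section 5.2) validation splits, assuming scikit-learn's train_test_split; the stand-in arrays and the random seed are illustrative, not taken from the authors' code.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in data; the paper's S_train comes from the UCI tasks.
X_train = np.random.rand(1000, 20)
y_train = np.random.randint(0, 2, size=1000)

# Section 5.1: hold out a random 15% of S_train as a validation set.
X_tr, X_val, y_tr, y_val = train_test_split(
    X_train, y_train, test_size=0.15, random_state=0)

# Section 5.2: a random 10% validation split of the training set.
X_tr2, X_val2, y_tr2, y_val2 = train_test_split(
    X_train, y_train, test_size=0.10, random_state=0)
```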
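For the Experiment Setup row, a hedged PyTorch sketch of the two reported training configurations. The input/output dimensions, the data loaders, and the CIFAR-10 teacher architecture are placeholders (the excerpt does not restate them); only the layer widths, activation, optimizer, batch sizes, learning-rate schedule, and epoch counts come from the quoted text.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

N_FEATURES, N_CLASSES = 20, 2  # hypothetical dimensions; the UCI tasks vary


def build_mlp():
    # Section 5.1 teacher: two fully connected hidden layers of 128 ReLU units.
    return nn.Sequential(
        nn.Linear(N_FEATURES, 128), nn.ReLU(),
        nn.Linear(128, 128), nn.ReLU(),
        nn.Linear(128, N_CLASSES),
    )


def train_section_5_1(model, X, y):
    # Adam with default parameters, batch size 32, 10 epochs.
    loader = DataLoader(TensorDataset(X, y), batch_size=32, shuffle=True)
    opt = torch.optim.Adam(model.parameters())
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(10):
        for xb, yb in loader:
            opt.zero_grad()
            loss_fn(model(xb), yb).backward()
            opt.step()


def train_section_5_2(model, loader):
    # Adam, batch size 128 (set in `loader`), learning rate 0.01 for 60
    # epochs, then 0.001 for another 30 epochs (90 in total).
    opt = torch.optim.Adam(model.parameters(), lr=0.01)
    sched = torch.optim.lr_scheduler.MultiStepLR(opt, milestones=[60], gamma=0.1)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(90):
        for xb, yb in loader:
            opt.zero_grad()
            loss_fn(model(xb), yb).backward()
            opt.step()
        sched.step()
```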