Oblique Decision Trees from Derivatives of ReLU Networks

Authors: Guang-He Lee, Tommi S. Jaakkola

ICLR 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate that our method outperforms alternative techniques for training oblique decision trees in the context of molecular property classification and regression tasks. Empirically, a locally constant network always outperforms alternative methods for training oblique decision trees by a large margin, and the ensemble of locally constant networks is competitive with classic ensemble methods. Here we evaluate the efficacy of our models (LCN, ALCN, and ELCN) using the chemical property prediction datasets from Molecule Net (Wu et al., 2018), where random forest performs competitively. We include 4 (multi-label) binary classification datasets and 1 regression dataset. The statistics are available in Table 1.
Researcher Affiliation | Academia | Guang-He Lee & Tommi S. Jaakkola, Computer Science and Artificial Intelligence Lab, MIT, {guanghe,tommi}@csail.mit.edu
Pseudocode | No | The paper describes computational procedures in numbered lists within the text (for example, under 'Computation and time complexity'), but these are not formatted as distinct pseudocode or algorithm blocks.
Open Source Code | Yes | Our implementation and data are available at https://github.com/guanghelee/iclr20-lcn.
Open Datasets | Yes | Here we evaluate the efficacy of our models (LCN, ALCN, and ELCN) using the chemical property prediction datasets from Molecule Net (Wu et al., 2018), where random forest performs competitively. We include 4 (multi-label) binary classification datasets and 1 regression dataset. The statistics are available in Table 1.
Dataset Splits | Yes | Each dataset is split into (train, validation, test) sets under the criterion specified in Molecule Net.
Hardware Specification | No | The paper does not provide specific hardware specifications (e.g., CPU or GPU models) used for running the experiments.
Software Dependencies | No | The paper names software such as scikit-learn ("For RF: we use the scikit-learn (Pedregosa et al., 2011) implementation of random forest. For GBDT: we use the scikit-learn (Pedregosa et al., 2011) implementation of gradient boosting trees.") but does not specify version numbers for any libraries or dependencies. A sketch of these baselines is given after the table.
Experiment Setup | Yes | For decision trees, LCN, LLN, and ALCN, we tune the tree depth in {2, 3, ..., 12}; for LCN, LLN, and ALCN, we also tune the DropConnect probability in {0, 0.25, 0.5, 0.75}. The models are optimized with mini-batch stochastic gradient descent with batch size 64. For all the classification tasks, we set the learning rate to 0.1, annealed by a factor of 10 every 10 epochs (30 epochs in total). For the regression task, we set the learning rate to 0.0001, annealed by a factor of 10 every 30 epochs (60 epochs in total). A sketch of this optimization schedule is given after the table.
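
As referenced in the Software Dependencies row, the following is a minimal sketch of the RF and GBDT baselines the paper attributes to scikit-learn. The data here is synthetic placeholder data, and hyperparameter values such as n_estimators and max_depth are illustrative assumptions, not settings reported in the paper.

```python
# Hedged sketch of the scikit-learn RF and GBDT baselines named in the paper.
# Placeholder data and hyperparameters; not the paper's reported configuration.
import numpy as np
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.metrics import roc_auc_score

# Synthetic stand-in for one binary classification task.
rng = np.random.default_rng(0)
X_train, y_train = rng.normal(size=(512, 64)), rng.integers(0, 2, size=512)
X_valid, y_valid = rng.normal(size=(128, 64)), rng.integers(0, 2, size=128)

rf = RandomForestClassifier(n_estimators=500, random_state=0)
gbdt = GradientBoostingClassifier(n_estimators=500, max_depth=3, random_state=0)

for name, model in [("RF", rf), ("GBDT", gbdt)]:
    model.fit(X_train, y_train)
    auc = roc_auc_score(y_valid, model.predict_proba(X_valid)[:, 1])
    print(f"{name} validation AUC: {auc:.3f}")
```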
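The Experiment Setup row describes the classification optimization schedule: mini-batch SGD, batch size 64, learning rate 0.1 annealed by a factor of 10 every 10 epochs, 30 epochs in total. Below is a minimal sketch of that schedule; the use of PyTorch, the placeholder MLP, and the binary cross-entropy loss are assumptions of this sketch, not details confirmed by the summary above.

```python
# Hedged sketch of the reported classification training schedule:
# SGD, batch size 64, lr 0.1 annealed by 10x every 10 epochs, 30 epochs total.
# The model and loss are stand-ins, not the paper's LCN architecture.
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

X = torch.randn(1024, 64)                     # placeholder features
y = torch.randint(0, 2, (1024, 1)).float()    # placeholder binary labels
loader = DataLoader(TensorDataset(X, y), batch_size=64, shuffle=True)

model = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 1))
criterion = nn.BCEWithLogitsLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
# Multiply the learning rate by 0.1 every 10 epochs (annealing by a factor of 10).
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1)

for epoch in range(30):
    for xb, yb in loader:
        optimizer.zero_grad()
        loss = criterion(model(xb), yb)
        loss.backward()
        optimizer.step()
    scheduler.step()
```

For the regression task described in the same row, the analogous sketch would swap in a regression loss, a starting learning rate of 0.0001, step_size=30, and 60 epochs.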