Training Decision Trees as Replacement for Convolution Layers
Authors: Wolfgang Fuhl, Gjergji Kasneci, Wolfgang Rosenstiel, Enkelejda Kasneci
AAAI 2020, pp. 3882-3889 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our results on multiple publicly available data sets show that our approach performs similar to conventional neuronal networks. |
| Researcher Affiliation | Academia | Wolfgang Fuhl, Gjergji Kasneci, Wolfgang Rosenstiel, Enkelejda Kasneci, Eberhard Karls Universität Tübingen, Sand 14, Tübingen, Germany {wolfgang.fuhl, gjergji.kasneci, wolfgang.rosenstiel, enkelejda.kasneci}@uni-tuebingen.de |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | An implementation for TensorFlow (Abadi et al. 2016) and Torch (Collobert, Bengio, and Mariéthoz 2002) is also planned since those are currently the most popular frameworks. |
| Open Datasets | Yes | The LeNet-5 model was used in the comparison on the MNIST (LeCun et al. 1998) dataset... The ResNet-34 was used for the comparison on the CIFAR10 (Krizhevsky and Hinton 2009) dataset... we used landmark regression. Therefore, we compared the decision trees with convolutions on the 300W (Zhu and Ramanan 2012) dataset... |
| Dataset Splits | No | The paper explicitly mentions training and test set sizes for MNIST (60,000 training, 10,000 test), CIFAR10 (50,000 training, 10,000 test), and 300W (3,148 training, 689 test), but does not explicitly state a separate validation set split. |
| Hardware Specification | Yes | For LeNet-5 we used a desktop PC with an Intel i5-4570 CPU (3.2 GHz), 16 GB DDR4 RAM, NVIDIA GTX 1050Ti GPU with 4 GB RAM and Windows 7 64 bit operating system. The second hardware setup was used for the ResNet models since those require more GPU RAM. Therefore, we used a server with an Intel i9-9900K CPU (3.6 GHz), 64 GB DDR4 RAM, two RTX 2080ti GPUs with 11.2 GB RAM each and a Windows 8.1 64 bit operating system. |
| Software Dependencies | No | We implemented the decision tree layer in C++ on the CPU and in CUDA on the GPU. The implementation was integrated into the DLIB (King 2009) framework which uses CUDNN functions. |
| Experiment Setup | Yes | Training parameters for MNIST: We used the Adam optimizer (Kingma and Ba 2014) with the first momentum set to 0.9 and the second momentum set to 0.999. Weight decay was set to 5·10⁻⁴ for the convolutions and to 10⁻⁸ for the decision trees. The batch size was set to 400 and each batch was always balanced in terms of available classes. ... The initial learning rate was set to 10⁻² and reduced by 10⁻¹ after each 100 epochs until it reached 10⁻⁴. For the learning rate of 10⁻⁴ we continued the training for additional 1000 epochs and selected the best result. For data augmentation we used random noise in the range of 0-30% of the image resolution. |
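
The quoted MNIST setup can be summarised as a short training-loop sketch. The snippet below is a minimal PyTorch reconstruction of the reported hyperparameters (Adam with momenta 0.9/0.999, weight decay 5·10⁻⁴ for the convolutions, batch size 400, learning rate decayed from 10⁻² by a factor of 10 every 100 epochs down to 10⁻⁴, then 1000 further epochs); the LeNet-5-like model and the plain shuffled MNIST loader are assumed stand-ins, and the authors' actual implementation is C++/CUDA inside DLIB, so this is illustrative only.

```python
# Illustrative PyTorch reconstruction of the reported MNIST schedule.
# The authors' implementation is C++/CUDA inside the DLIB framework; the
# LeNet-5-style model and data pipeline below are stand-ins, not their code.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

model = nn.Sequential(                       # placeholder LeNet-5-like net
    nn.Conv2d(1, 6, 5), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(6, 16, 5), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(), nn.Linear(16 * 4 * 4, 120), nn.ReLU(),
    nn.Linear(120, 84), nn.ReLU(), nn.Linear(84, 10),
)

# Adam with first/second momentum 0.9 / 0.999; weight decay 5e-4 as reported
# for the convolutions (the paper uses 1e-8 for its decision-tree layers).
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2,
                             betas=(0.9, 0.999), weight_decay=5e-4)
criterion = nn.CrossEntropyLoss()

# Batch size 400; the paper additionally balances classes within each batch,
# which a plain shuffled loader does not do.
train_loader = DataLoader(
    datasets.MNIST("data", train=True, download=True,
                   transform=transforms.ToTensor()),
    batch_size=400, shuffle=True)

for epoch in range(1200):                    # ~200 decay epochs + 1000 at 1e-4
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
    # Reduce the learning rate by a factor of 10 every 100 epochs, down to 1e-4.
    if (epoch + 1) % 100 == 0:
        for group in optimizer.param_groups:
            group["lr"] = max(group["lr"] * 0.1, 1e-4)
```

The class-balanced batch construction, the 0-30% random-noise augmentation, and the "select the best result" checkpointing from the quoted setup are only noted in comments here, not reproduced.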