Deep Neural Networks Constrained by Decision Rules

Authors: Yuzuru Okajima, Kunihiko Sadamasa

AAAI 2019, pp. 2496-2505

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on time-series and sentiment classification datasets showed that rule-constrained networks achieved accuracy as high as that of the original neural networks and significantly higher than that of existing rule-based models, while presenting decision rules supporting their decisions.
Researcher Affiliation | Industry | Yuzuru Okajima, Kunihiko Sadamasa; NEC Corporation, 1753 Shimonumabe, Nakahara-ku, Kawasaki, Kanagawa 211-8666, Japan; y-okajima@bu.jp.nec.com, k-sadamasa@az.jp.nec.com
Pseudocode | Yes | Algorithm 1: Generalized EM algorithm (a toy sketch of this alternation appears after the table)
Open Source Code | No | The paper does not provide any links or explicit statements about the availability of its source code.
Open Datasets | Yes | For time-series classification, we used the top five largest binary classification datasets from the UCR time series repository (Chen et al. 2015). For sentiment classification, three review datasets were used: IMDB for movies (Maas et al. 2011), Elec for electronics products, and Yelp for local businesses.
Dataset Splits | Yes | The hyperparameters of CART, RF, and SVMs were selected from Table 2 by grid search with five-fold cross validation using training data (see the grid search example after the table). The hyperparameter λ of SBRL was also selected from 10^0.5, 10, 10^1.5, 10^2, 10^2.5, and 10^3 by validation.
Hardware Specification | No | The paper does not specify any details about the hardware used for running the experiments (e.g., GPU models, CPU types, or cloud instance specifications).
Software Dependencies | No | The paper mentions software such as "scikit-learn implementations" and an "R implementation of SBRL" but gives no version numbers for these or any other components, so the exact software environment cannot be reproduced.
Experiment Setup | Yes | The number of decision trees in RF was set to 100. The hyperparameters of CART, RF, and SVMs were selected from Table 2 by grid search with five-fold cross validation using training data. The hyperparameter λ of SBRL was also selected from 10^0.5, 10, 10^1.5, 10^2, 10^2.5, and 10^3 by validation. The hidden layer size h was 128 for the CNN and 512 for the Bi-LSTM. The RCNs were first trained without rule set optimization, i.e., using all rules in R, for both time-series and sentiment classification. After that, they were trained with rule set optimization with sample size s set to 100. (The reported values are collected in a short config sketch below.)
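
The Algorithm 1 referenced in the Pseudocode row is a generalized EM procedure that alternates between choosing rule sets and updating network weights. Below is a minimal, self-contained toy sketch of that alternation, using feature masks as stand-ins for rule sets and logistic regression as a stand-in for the network; none of this is the paper's actual implementation, only an illustration of the E-step/M-step loop.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: the label depends only on features 0 and 3.
X = rng.normal(size=(200, 10))
y = (X[:, 0] + X[:, 3] > 0).astype(float)

def log_likelihood(w, mask):
    """Bernoulli log-likelihood of a logistic model over masked features."""
    p = 1.0 / (1.0 + np.exp(-(X * mask) @ w))
    eps = 1e-9
    return float(np.sum(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps)))

w = np.zeros(10)        # "network" weights
mask = np.ones(10)      # start with every candidate "rule" active
s = 100                 # number of sampled candidates per E-step

for epoch in range(30):
    # E-step: sample s candidate masks and keep the best if it improves the
    # likelihood. An improving (not necessarily optimal) E-step is what makes
    # this a *generalized* EM loop rather than exact EM.
    candidates = (rng.random((s, 10)) < 0.5).astype(float)
    best = max(candidates, key=lambda m: log_likelihood(w, m))
    if log_likelihood(w, best) > log_likelihood(w, mask):
        mask = best
    # M-step: one gradient ascent step on w under the chosen mask.
    p = 1.0 / (1.0 + np.exp(-(X * mask) @ w))
    w += 0.5 * (X * mask).T @ (y - p) / len(y)

print("active features:", np.nonzero(mask)[0], "weights:", np.round(w, 2))
```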
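
The five-fold grid search described for CART, RF, and the SVMs corresponds to standard scikit-learn usage (the paper says scikit-learn implementations were used). The paper's Table 2 grids are not reproduced here, so the grid and data below are hypothetical stand-ins:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Synthetic stand-in for the training split.
X_train, y_train = make_classification(n_samples=300, n_features=20,
                                       random_state=0)

# Hypothetical grid; the real candidate values come from the paper's Table 2.
param_grid = {"C": [0.1, 1, 10, 100], "gamma": ["scale", 0.01, 0.001]}

# Grid search with five-fold cross validation on the training data only,
# mirroring the selection procedure quoted in the table above.
search = GridSearchCV(SVC(), param_grid, cv=5)
search.fit(X_train, y_train)
print(search.best_params_)
```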
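
Finally, the fixed settings quoted in the Experiment Setup row can be collected in one place. The key names below are our own invention (the paper releases no code), so this is only a readable summary of the reported values plus the one baseline that maps directly onto a scikit-learn call:

```python
from sklearn.ensemble import RandomForestClassifier

# Reported hyperparameters, gathered under hypothetical key names.
rcn_setup = {
    "cnn_hidden_size": 128,       # h for the CNN encoder
    "bilstm_hidden_size": 512,    # h for the Bi-LSTM encoder
    "rule_sample_size": 100,      # s, used in the rule set optimization phase
    "initial_rule_set": "all",    # phase 1 trains with every rule in R
}
print(rcn_setup)

# Baseline forest with the stated number of trees.
rf = RandomForestClassifier(n_estimators=100)
```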