Cost-Aware Pre-Training for Multiclass Cost-Sensitive Deep Learning

Authors: Yu-An Chung, Hsuan-Tien Lin, Shao-Wen Yang

IJCAI 2016

Reproducibility assessment: each variable below is listed with its result, followed by the supporting LLM response.
Research Type: Experimental
"Extensive experimental results justify the validity of the novel loss function for making existing deep learning models cost-sensitive, and demonstrate that our proposed model with cost-aware pre-training and training outperforms non-deep models and other deep models that digest the cost information in other stages."
Researcher Affiliation: Collaboration
Yu-An Chung, Department of CSIE, National Taiwan University (b01902040@ntu.edu.tw); Hsuan-Tien Lin, Department of CSIE, National Taiwan University (htlin@csie.ntu.edu.tw); Shao-Wen Yang, Intel Labs, Intel Corporation (shao-wen.yang@intel.com)
Pseudocode: Yes

Algorithm 1 CSDNN
  Input: Cost-sensitive training set S = {(x_n, y_n, c_n)}_{n=1}^{N}
  1: for each hidden layer θ_i = {W_i, b_i} do
  2:   Learn a CAE by minimizing (9).
  3:   Take {W_i, b_i} of the CAE as θ_i.
  4: end for
  5: Fine-tune the network parameters {{θ_i}_{i=1}^{H}, θ_SOSR} by minimizing (8) using back-propagation.
  Output: The fine-tuned deep neural network with (5) as g_r.
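To make the flow of Algorithm 1 concrete, here is a minimal, hypothetical sketch of CSDNN-style training: greedy layer-wise pre-training of cost-aware autoencoders (CAEs), followed by end-to-end fine-tuning under a smooth one-sided regression (SOSR) output layer. The paper's experiments used Theano; this sketch uses PyTorch for brevity. The exact losses (equations (5), (8), and (9)) are not reproduced in the excerpt above, so the reconstruction-plus-β·SOSR pre-training objective and the softplus smoothing of the one-sided loss are assumptions drawn from the paper's description, not a verified reimplementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def sosr_loss(g, c, y):
    """Smooth one-sided regression loss (assumed form).

    g: (batch, K) predicted per-class costs, c: (batch, K) true cost
    vectors, y: (batch,) true labels.  Over-estimation is penalized
    for the true class and under-estimation for every other class;
    max(0, v) is smoothed as softplus(v) = ln(1 + exp(v)).
    """
    z = -torch.ones_like(c)
    z.scatter_(1, y.unsqueeze(1), 1.0)  # z = +1 for the true class
    return F.softplus(z * (g - c)).sum(dim=1).mean()

class CAE(nn.Module):
    """Cost-aware autoencoder: reconstructs its input and also
    predicts the cost vector from the hidden code (assumed design)."""
    def __init__(self, d_in, d_hid, n_classes):
        super().__init__()
        self.enc = nn.Linear(d_in, d_hid)
        self.dec = nn.Linear(d_hid, d_in)
        self.cost = nn.Linear(d_hid, n_classes)

    def forward(self, x):
        h = torch.sigmoid(self.enc(x))
        return h, self.dec(h), self.cost(h)

def train_csdnn(dims, n_classes, loader, beta=0.1, epochs=10, lr=0.01):
    """dims = [input_dim, hidden_1, ..., hidden_H]; loader yields
    (x, y, c) batches.  All names here are illustrative."""
    encoders = []
    # Lines 1-4 of Algorithm 1: greedy layer-wise CAE pre-training,
    # minimizing reconstruction error + beta * cost term (cf. eq. (9)).
    for d_in, d_hid in zip(dims[:-1], dims[1:]):
        cae = CAE(d_in, d_hid, n_classes)
        opt = torch.optim.SGD(cae.parameters(), lr=lr)
        for _ in range(epochs):
            for x, y, c in loader:
                with torch.no_grad():  # map x through the frozen layers below
                    for enc in encoders:
                        x = torch.sigmoid(enc(x))
                _, x_rec, g = cae(x)
                loss = F.mse_loss(x_rec, x) + beta * sosr_loss(g, c, y)
                opt.zero_grad(); loss.backward(); opt.step()
        encoders.append(cae.enc)  # keep {W_i, b_i} as theta_i
    # Line 5: stack the pre-trained encoders, add the SOSR output
    # layer theta_SOSR, and fine-tune end-to-end (cf. eq. (8)).
    layers = []
    for enc in encoders:
        layers += [enc, nn.Sigmoid()]
    net = nn.Sequential(*layers, nn.Linear(dims[-1], n_classes))
    opt = torch.optim.SGD(net.parameters(), lr=lr)
    for _ in range(epochs):
        for x, y, c in loader:
            opt.zero_grad()
            sosr_loss(net(x), c, y).backward()
            opt.step()
    return net
```

Prediction follows equation (5) in spirit: the class with the smallest estimated cost, argmin_k g_k(x), is returned.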
Open Source Code: No
The paper names the frameworks it used ('All experiments were conducted using Theano.'; 'for CNN, we considered a standard structure in Caffe [Jia et al., 2014]'), but it neither links to nor promises a release of the source code for its own method.
Open Datasets: Yes
"We conducted experiments on MNIST, bg-img-rot (the hardest variant of MNIST provided in [Larochelle et al., 2007]), SVHN [Netzer et al., 2011], and CIFAR-10 [Krizhevsky and Hinton, 2009]."
Dataset Splits: Yes
"For all four datasets, the training, validation, and testing split follows the source websites."
Hardware Specification: No
The paper gives no details of the hardware used for the experiments, such as CPU or GPU models, memory, or cloud computing resources.
Software Dependencies: No
The paper mentions Theano and Caffe but does not give version numbers for these dependencies, which full reproducibility would require.
Experiment Setup: Yes
"The β in (9), needed by SEAE and SCAE algorithms, was selected among {0, 0.05, 0.1, 0.25, 0.4, 0.75, 1}. As mentioned, for CNN, we considered a standard structure in Caffe [Jia et al., 2014]."
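
To illustrate how such a grid would be used, here is a small hypothetical sketch of validation-based selection of β: each candidate value trains a model, and the β with the lowest average validation cost wins. The grid is the one quoted above; `train_and_eval` and `mean_cost` are illustrative stand-ins rather than the authors' code, and scoring by the average cost of the predicted class is an assumption in line with standard cost-sensitive practice.

```python
import numpy as np

BETAS = [0, 0.05, 0.1, 0.25, 0.4, 0.75, 1]  # grid quoted above

def mean_cost(pred, cost_matrix):
    """Average cost incurred by predicted labels: mean of c_n[g(x_n)]."""
    return float(np.mean(cost_matrix[np.arange(len(pred)), pred]))

def select_beta(train_and_eval, betas=BETAS):
    """train_and_eval(beta) is a hypothetical helper that trains the
    model with the given beta and returns its average validation
    cost; the beta achieving the lowest cost is kept."""
    costs = [train_and_eval(b) for b in betas]
    i = int(np.argmin(costs))
    return betas[i], costs[i]
```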