Cost-Aware Pre-Training for Multiclass Cost-Sensitive Deep Learning
Authors: Yu-An Chung, Hsuan-Tien Lin, Shao-Wen Yang
IJCAI 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experimental results justify the validity of the novel loss function for making existing deep learning models cost-sensitive, and demonstrate that our proposed model with cost-aware pre-training and training outperforms non-deep models and other deep models that digest the cost information in other stages. |
| Researcher Affiliation | Collaboration | Yu-An Chung, Department of CSIE, National Taiwan University (b01902040@ntu.edu.tw); Hsuan-Tien Lin, Department of CSIE, National Taiwan University (htlin@csie.ntu.edu.tw); Shao-Wen Yang, Intel Labs, Intel Corporation (shao-wen.yang@intel.com) |
| Pseudocode | Yes | Algorithm 1 CSDNN. Input: cost-sensitive training set S = {(x_n, y_n, c_n)}_{n=1}^{N}. 1: for each hidden layer θ_i = {W_i, b_i} do 2: Learn a CAE by minimizing (9). 3: Take {W_i, b_i} of the CAE as θ_i. 4: end for. 5: Fine-tune the network parameters {{θ_i}_{i=1}^{H}, θ_SOSR} by minimizing (8) using back-propagation. Output: The fine-tuned deep neural network with (5) as g_r. (A hedged code sketch of this procedure follows the table.) |
| Open Source Code | No | The paper mentions standard frameworks like Caffe and Theano that were used ('All experiments were conducted using Theano.', 'for CNN, we considered a standard structure in Caffe [Jia et al., 2014]'), but it does not provide a link or explicit statement about releasing the source code for their own methodology. |
| Open Datasets | Yes | We conducted experiments on MNIST, bg-img-rot (the hardest variant of MNIST provided in [Larochelle et al., 2007]), SVHN [Netzer et al., 2011], and CIFAR-10 [Krizhevsky and Hinton, 2009]. |
| Dataset Splits | Yes | For all four datasets, the training, validation, and testing split follows the source websites; |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used for the experiments, such as CPU or GPU models, memory, or cloud computing specifications. |
| Software Dependencies | No | The paper mentions using 'Theano' and 'Caffe', but it does not provide specific version numbers for these software dependencies, which would be necessary for full reproducibility. |
| Experiment Setup | Yes | The β in (9), needed by SEAE and SCAE algorithms, was selected among {0, 0.05, 0.1, 0.25, 0.4, 0.75, 1}. As mentioned, for CNN, we considered a standard structure in Caffe [Jia et al., 2014]. |
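
The pseudocode reported above is compact, so a minimal, runnable sketch of the same procedure is given here for orientation. It is written in PyTorch as an assumption (the paper's experiments used Theano), and it paraphrases the losses rather than reproducing them exactly: the SOSR loss (8) is rendered as a softplus-smoothed one-sided regression on per-class cost estimates, and the CAE loss (9) as a β-weighted mix of reconstruction error and that same cost-estimation term. Layer sizes, optimizer, learning rate, and epoch counts are illustrative; in the paper, β would be selected on validation data from {0, 0.05, 0.1, 0.25, 0.4, 0.75, 1}.

```python
# Sketch of Algorithm 1 (CSDNN): cost-aware layer-wise pre-training with CAEs,
# then fine-tuning the stacked network with the SOSR loss. PyTorch is an
# assumption for illustration; the paper used Theano.
import torch
import torch.nn as nn
import torch.nn.functional as F


def sosr_loss(cost_estimates, costs, labels):
    """Smoothed one-sided regression (paraphrase of loss (8)):
    softplus(z * (g_k(x) - c[k])), with z = +1 for the labeled class and -1 otherwise."""
    z = -torch.ones_like(costs)
    z.scatter_(1, labels.unsqueeze(1), 1.0)
    return F.softplus(z * (cost_estimates - costs)).sum(dim=1).mean()


class CostAwareAutoencoder(nn.Module):
    """One CAE: reconstructs its input and also estimates the K per-class costs."""
    def __init__(self, d_in, d_hidden, n_classes):
        super().__init__()
        self.encoder = nn.Linear(d_in, d_hidden)
        self.decoder = nn.Linear(d_hidden, d_in)
        self.cost_head = nn.Linear(d_hidden, n_classes)

    def forward(self, x):
        h = torch.sigmoid(self.encoder(x))
        return h, self.decoder(h), self.cost_head(h)


def pretrain_cae(x, costs, labels, d_hidden, n_classes, beta, epochs=10):
    """Cost-aware pre-training of one layer: beta-weighted mix of reconstruction
    error and the SOSR cost-estimation error (paraphrase of loss (9))."""
    cae = CostAwareAutoencoder(x.shape[1], d_hidden, n_classes)
    opt = torch.optim.Adam(cae.parameters(), lr=1e-3)
    for _ in range(epochs):
        opt.zero_grad()
        h, recon, cost_est = cae(x)
        loss = (1 - beta) * F.mse_loss(recon, x) + beta * sosr_loss(cost_est, costs, labels)
        loss.backward()
        opt.step()
    return cae.encoder  # keep {W_i, b_i} as the initialization of layer theta_i


def csdnn(x, costs, labels, hidden_sizes, n_classes, beta=0.1, epochs=10):
    """Algorithm 1: stack CAE-pretrained layers, fine-tune with SOSR,
    and predict the class with minimal estimated cost."""
    layers, current = [], x
    for d_hidden in hidden_sizes:               # steps 1-4: layer-wise CAE pre-training
        enc = pretrain_cae(current, costs, labels, d_hidden, n_classes, beta, epochs)
        layers.append(enc)
        current = torch.sigmoid(enc(current)).detach()
    net = nn.Sequential(*[nn.Sequential(l, nn.Sigmoid()) for l in layers],
                        nn.Linear(hidden_sizes[-1], n_classes))  # output layer = theta_SOSR
    opt = torch.optim.Adam(net.parameters(), lr=1e-3)
    for _ in range(epochs):                     # step 5: fine-tune with SOSR via back-prop
        opt.zero_grad()
        loss = sosr_loss(net(x), costs, labels)
        loss.backward()
        opt.step()
    return net


# Tiny usage example with random data (4 classes, random cost vectors).
if __name__ == "__main__":
    x = torch.randn(64, 20)
    labels = torch.randint(0, 4, (64,))
    costs = torch.rand(64, 4)
    costs.scatter_(1, labels.unsqueeze(1), 0.0)   # zero cost for the correct class
    model = csdnn(x, costs, labels, hidden_sizes=[32, 16], n_classes=4)
    prediction = model(x).argmin(dim=1)           # g_r(x): class with minimal estimated cost
```

This is a sketch under the stated assumptions, not the authors' implementation; in particular, the exact functional forms of (8) and (9) and the training schedule should be taken from the paper itself.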