Protein Secondary Structure Prediction Using Cascaded Convolutional and Recurrent Neural Networks
Authors: Zhen Li, Yizhou Yu
IJCAI 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on the CB6133 dataset, the public CB513 benchmark, and the recent CASP10 and CASP11 datasets demonstrate that our proposed deep network outperforms existing methods and achieves state-of-the-art performance. |
| Researcher Affiliation | Academia | Zhen Li, Yizhou Yu Department of Computer Science, The University of Hong Kong zli@cs.hku.hk, yizhouy@acm.org |
| Pseudocode | No | The paper describes the network architecture using figures and equations but does not include explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our model and results are publicly available (https://github.com/icemansina/IJCAI2016). |
| Open Datasets | Yes | We use four publicly available datasets, CB6133 produced with PISCES Cull PDB [Wang and Dunbrack, 2003], CB513 [Cuff and Barton, 1999] (http://www.princeton.edu/~jzthree/datasets/ICML2014/), CASP10 [Kryshtafovych et al., 2014] and CASP11 [Moult et al., 2014], to evaluate the performance of our proposed deep neural network. |
| Dataset Splits | Yes | CB6133 is a large non-homologous protein sequence and structure dataset that has 6128 proteins, which include 5600 proteins (index 0 to 5599) for training, 256 proteins (index 5877 to 6132) for validation and 272 proteins (index 5605 to 5876) for testing. |
| Hardware Specification | Yes | The entire deep network is trained on a single NVIDIA GeForce GTX TITAN X GPU with 12GB memory. |
| Software Dependencies | No | Our code is implemented in Theano [Bastien et al., 2012; Bergstra et al., 2010], a publicly available deep learning framework, on the basis of the Keras [Chollet, 2015] library. While the software and frameworks are mentioned, specific version numbers for Theano and Keras are not provided. |
| Experiment Setup | Yes | In our experiments, multiscale CNN layers with kernel sizes 3, 7, and 11 are used... Each of the three stacked BGRU layers has 600 hidden units... The output from the BGRU layers is regularized with dropout (rate 0.5)... We set λ1 = 1, λ2 = 0.001... The batch size is set to 128. |
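The CB6133 split quoted above can be written out as explicit index ranges. A minimal sketch in plain Python, assuming the inclusive 0-based indices quoted from the paper (the ranges are reproduced as stated, even though the validation range's upper index, 6132, runs past the stated total of 6128 proteins):

```python
# CB6133 split as quoted in the paper (inclusive index ranges,
# expressed as half-open Python ranges).
TRAIN = range(0, 5600)      # index 0 to 5599  -> 5600 proteins
TEST  = range(5605, 5877)   # index 5605 to 5876 -> 272 proteins
VALID = range(5877, 6133)   # index 5877 to 6132 -> 256 proteins


def split_sizes():
    """Return (train, validation, test) protein counts."""
    return len(TRAIN), len(VALID), len(TEST)


# Sanity checks against the counts stated in the paper.
assert split_sizes() == (5600, 256, 272)

# The three ranges are mutually disjoint.
assert not (set(TRAIN) & set(VALID))
assert not (set(TRAIN) & set(TEST))
assert not (set(VALID) & set(TEST))
```

Note that the quoted split is non-contiguous: the test block (5605-5876) precedes the validation block (5877-6132), and indices 5600-5604 belong to no split.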