Improving Tandem Mass Spectra Analysis with Hierarchical Learning
Authors: Zhengcong Fei
IJCAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on different public datasets demonstrate that our method achieves a new state-of-the-art performance in peptide identification task, leading to a marked improvement in terms of both precision and recall. |
| Researcher Affiliation | Academia | Zhengcong Fei; Key Lab of Intell. Info. Process., Inst. of Comput. Tech., CAS, Beijing, China; University of Chinese Academy of Sciences, Beijing, China; feizhengcong@ict.ac.cn |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. The methods are described in prose. |
| Open Source Code | No | The paper does not include any explicit statement about releasing the source code for the methodology described, nor does it provide a direct link to a code repository. |
| Open Datasets | Yes | All the experiments are conducted on five public data sets from different labs and species. Table 1 presents the basic information of the data sets, where #Spectra represents the number of spectra used in training or testing. The forms of these data were all high-resolution HCD. Open-pFind [Chi et al., 2018] was employed to deal with these raw data sets, and the five data sets were searched against the corresponding reviewed databases of human, mouse, E. coli, and yeast, respectively, which were all downloaded from Uniprot; their versions are consistent with [Chi et al., 2018]. |
| Dataset Splits | Yes | After each epoch, we evaluate the model performance on the validation set and choose the sequencing model with the best rewarding. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU/CPU models, processor types, or memory amounts used for running its experiments. It describes the experimental setup in terms of software and datasets, but not hardware. |
| Software Dependencies | No | The paper mentions using 'Open-pFind [Chi et al., 2018]' and 'T-Net [Qiao et al., 2019]' but does not provide specific version numbers for these or any other software libraries or dependencies, which is required for reproducibility. |
| Experiment Setup | Yes | The length of the guiding tag was set to 5 (k = 5) and the total number of stages was set to 3 (N = 3). For the guiding tag decoder, we use the small transformer (dmodel = 256, dhidden = 256, pdropout = 0.1, nlayer = 3, and nhead = 2). For the extended decoder, we use the base transformer by [Vaswani et al., 2017] (dmodel = 512, dhidden = 512, pdropout = 0.1, nlayer = 3, and nhead = 8). We first train our model under the cross-entropy cost using the Adam optimizer (lr = 0.002) and a momentum parameter of 0.9. Later on, we adopt the proposed RL-based approach on the just-trained sequencing model to further optimize. During this stage, we use Adam with a learning rate of 0.0002. |
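The hyperparameters quoted in the Experiment Setup row can be collected into a small configuration sketch. Since the paper released no code, every identifier below is illustrative; only the numeric values come from the quoted text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransformerConfig:
    """Decoder hyperparameters as reported in the paper (field names are ours)."""
    d_model: int
    d_hidden: int
    p_dropout: float
    n_layer: int
    n_head: int


# Small transformer for the guiding tag decoder.
GUIDING_TAG_DECODER = TransformerConfig(
    d_model=256, d_hidden=256, p_dropout=0.1, n_layer=3, n_head=2
)

# Base transformer [Vaswani et al., 2017] for the extended decoder.
EXTENDED_DECODER = TransformerConfig(
    d_model=512, d_hidden=512, p_dropout=0.1, n_layer=3, n_head=8
)

TAG_LENGTH_K = 5   # length of the guiding tag (k = 5)
NUM_STAGES_N = 3   # total number of stages (N = 3)

# Two-stage optimization: cross-entropy pre-training, then RL fine-tuning
# on the pre-trained sequencing model, both with Adam.
TRAINING_STAGES = [
    {"objective": "cross_entropy", "optimizer": "Adam", "lr": 2e-3, "momentum": 0.9},
    {"objective": "rl_finetune", "optimizer": "Adam", "lr": 2e-4},
]
```

This layout makes the asymmetry between the two decoders explicit (the extended decoder doubles the model width and uses four times as many attention heads) and records the 10x learning-rate drop between the two training stages.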