Learning a Hybrid Architecture for Sequence Regression and Annotation

Authors: Yizhe Zhang, Ricardo Henao, Lawrence Carin, Jianling Zhong, Alexander Hartemink

AAAI 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Results show that the availability of additional continuous response variables can simultaneously improve the annotation of the sequential observations and yield good prediction performance in both synthetic data and real-world datasets. We conducted a simulation study in a PBM scenario, where we are interested in recovering the correct emission matrix and predicting for new observations. We evaluated our algorithm on DREAM5, a publicly available dataset of PBM experiments for methods competition (Weirauch et al. 2013).
Researcher Affiliation | Academia | Yizhe Zhang, Ricardo Henao, and Lawrence Carin: Department of Electrical and Computer Engineering, Duke University, Durham, NC 27708, USA. Jianling Zhong and Alexander J. Hartemink: Program in Computational Biology and Bioinformatics, Duke University, Durham, NC 27708, USA.
Pseudocode | Yes | Algorithm 1: Top D Conditional Viterbi Path Integration.
Open Source Code | No | Further details on multiple regression tasks are provided in the supplements (footnote 1: http://people.duke.edu/ yz196/pdf/AAAI suppl.zip). This link points to supplementary material, but the paper does not explicitly state that it includes source code for the methodology.
Open Datasets | Yes | We evaluated our algorithm on DREAM5, a publicly available dataset of PBM experiments for methods competition (Weirauch et al. 2013).
Dataset Splits | No | 10,000 sequence instances were used for training, while the remaining 10,000 instances were used for evaluation. The overall task is to use one array type (41,000 probe sequences) for training and predict the real signal intensity responses on the other type. Train and test splits are mentioned, but there is no explicit mention of a separate validation set or of a three-way split with specific validation percentages.
Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU models, CPU types, or memory specifications.
Software Dependencies | No | The paper describes algorithms and methods (e.g., EM inference, Viterbi path integration, dynamic programming, coordinate gradient descent, trust-region algorithm) but does not list specific software packages or libraries with version numbers required for reproduction.
Experiment Setup | Yes | The motif length K for all scenarios was set to 6. The initial value for the emission matrix was established from the most frequent 6-mer. Around 100 Viterbi paths usually cover 99% probability mass for most probe sequences.
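
The Experiment Setup entry says that a small number of Viterbi paths can cover most of the probability mass. The sketch below illustrates that idea on a toy hidden Markov model: it keeps the D most probable state paths with a generic k-best (list) Viterbi recursion and reports what fraction of the total likelihood those paths account for. This is not the paper's Algorithm 1, which additionally conditions on the response variable. The function names (`top_d_paths`, `likelihood`, `coverage`) and all toy parameters are assumptions made up for illustration.

```python
import heapq
import numpy as np


def top_d_paths(pi, A, B, obs, D):
    """Return up to D most probable state paths as (joint_prob, path), best first.

    pi: initial state probabilities, A: transition matrix (rows = from-state),
    B: emission matrix (rows = state, columns = symbol), obs: symbol indices.
    """
    n_states = len(pi)
    # beams[s] holds the best partial paths (at most D) that end in state s.
    beams = [[(pi[s] * B[s, obs[0]], (s,))] for s in range(n_states)]
    for o in obs[1:]:
        new_beams = []
        for s in range(n_states):
            # Extend every kept partial path by one step into state s.
            candidates = [
                (p * A[prev, s] * B[s, o], path + (s,))
                for prev in range(n_states)
                for p, path in beams[prev]
            ]
            new_beams.append(heapq.nlargest(D, candidates, key=lambda c: c[0]))
        beams = new_beams
    finals = [c for beam in beams for c in beam]
    return heapq.nlargest(D, finals, key=lambda c: c[0])


def likelihood(pi, A, B, obs):
    """Total probability of obs, summed over all state paths (forward algorithm)."""
    alpha = pi * B[:, obs[0]]
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
    return float(alpha.sum())


def coverage(pi, A, B, obs, D):
    """Fraction of the total likelihood carried by the top-D paths."""
    paths = top_d_paths(pi, A, B, obs, D)
    return float(sum(p for p, _ in paths)) / likelihood(pi, A, B, obs)


# Hypothetical two-state model (e.g. background vs. motif) over two symbols.
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.5, 0.5], [0.1, 0.9]])
obs = [0, 1, 1, 0, 1, 1]

for D in (1, 4, 16, 64):
    print(D, round(coverage(pi, A, B, obs, D), 4))
```

Coverage rises with D and reaches 1 once D is at least the number of possible paths (here 2^6 = 64). How quickly it saturates depends on how peaked the path distribution is, so the toy numbers say nothing about the reported figure of around 100 paths for 99% of the mass.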