TopicEq: A Joint Topic and Mathematical Equation Model for Scientific Texts

Authors: Michihiro Yasunaga, John D. Lafferty

AAAI 2019, pp. 7394-7401

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experimental results show that this joint model significantly outperforms existing topic models and equation models for scientific texts. Moreover, we qualitatively show that the model effectively captures the relationship between topics and mathematics, enabling novel applications such as topic-aware equation generation, equation topic inference, and topic-aware alignment of mathematical symbols and words."
Researcher Affiliation | Academia | "Michihiro Yasunaga, John D. Lafferty, Yale University, michihiro.yasunaga@yale.edu"
Pseudocode | No | The paper describes the model mathematically and textually but does not include any pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any statement about releasing source code or a link to a code repository.
Open Datasets | Yes | "To obtain a dataset of context-equation pairs, we used scientific articles published on arXiv.org. We sampled 100k articles from all domains in the past 5 years, and split them into train, validation and test sets (80%, 10%, 10%)."
Dataset Splits | Yes | "We sampled 100k articles from all domains in the past 5 years, and split them into train, validation and test sets (80%, 10%, 10%)."
Hardware Specification | No | The paper does not specify any particular hardware (e.g., GPU models, CPU types, or cloud instances) used for the experiments.
Software Dependencies | No | The paper mentions using RNNs, LSTMs, and the Adam optimizer, but does not provide version numbers for any software libraries or frameworks (e.g., TensorFlow, PyTorch, scikit-learn).
Experiment Setup | Yes | "For the inference network q(η|C), we use a 2-layer FFNN with 300 units, similar to (Miao, Yu, and Blunsom 2016; Miao, Grefenstette, and Blunsom 2017). The equation TE-LSTM architecture has two layers and state size 500, with dropout rate 0.5 applied to each layer (Srivastava et al. 2014). The parameters of the TopicEq model are jointly optimized by Adam (Kingma and Ba 2015), with batch size 200, learning rate 0.002, and gradient clipping 1.0 (Pascanu, Mikolov, and Bengio 2012)."
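The hyperparameters quoted in the Experiment Setup row map onto a short configuration sketch, shown below. This is a minimal illustration assuming PyTorch (the paper does not name a framework); the topic-conditioning of the TE-LSTM and the variational objective of the topic model are omitted, and the vocabulary size, context dimension, and topic count are hypothetical placeholders rather than values from the paper. Only the quoted figures (2-layer FFNN with 300 units, 2-layer LSTM with state size 500 and dropout 0.5, Adam with batch size 200, learning rate 0.002, gradient clipping 1.0) come from the paper itself.

```python
import torch
import torch.nn as nn

# Placeholder dimensions -- assumptions for illustration, not from the paper.
VOCAB_SIZE = 1000    # equation-token vocabulary size (hypothetical)
CONTEXT_DIM = 2000   # bag-of-words context dimension (hypothetical)
NUM_TOPICS = 50      # topic dimension K (hypothetical)

# Inference network q(eta | C): a 2-layer FFNN with 300 units per layer,
# as quoted above. The output head (mean and log-variance of eta) is an
# assumed detail of the variational parameterization.
inference_net = nn.Sequential(
    nn.Linear(CONTEXT_DIM, 300), nn.ReLU(),
    nn.Linear(300, 300), nn.ReLU(),
    nn.Linear(300, 2 * NUM_TOPICS),
)

# Equation LSTM: two layers, state size 500, dropout 0.5 between layers.
# (In the full model this LSTM is topic-conditioned; that is omitted here.)
embedding = nn.Embedding(VOCAB_SIZE, 500)
equation_lstm = nn.LSTM(input_size=500, hidden_size=500, num_layers=2,
                        dropout=0.5, batch_first=True)
output_head = nn.Linear(500, VOCAB_SIZE)

params = (list(inference_net.parameters()) + list(embedding.parameters())
          + list(equation_lstm.parameters()) + list(output_head.parameters()))

# Joint optimization: Adam with learning rate 0.002; batch size 200 is set
# by the caller, and gradients are clipped to norm 1.0.
optimizer = torch.optim.Adam(params, lr=0.002)

def training_step(context_bow, eq_inputs, eq_targets):
    """One step of a simplified joint objective (equation LM loss only)."""
    optimizer.zero_grad()
    _variational_stats = inference_net(context_bow)  # would parameterize eta
    hidden, _ = equation_lstm(embedding(eq_inputs))
    logits = output_head(hidden)
    loss = nn.functional.cross_entropy(
        logits.reshape(-1, VOCAB_SIZE), eq_targets.reshape(-1))
    loss.backward()
    torch.nn.utils.clip_grad_norm_(params, max_norm=1.0)  # clipping 1.0
    optimizer.step()
    return loss.item()

# Example step with dummy data at the quoted batch size of 200.
ctx = torch.rand(200, CONTEXT_DIM)
toks = torch.randint(0, VOCAB_SIZE, (200, 30))
training_step(ctx, toks[:, :-1], toks[:, 1:])
```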