Disambiguating Symbolic Expressions in Informal Documents

Authors: Dennis Müller, Cezary Kaliszyk

ICLR 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response

Research Type | Experimental | We evaluated several baseline models on this dataset, which failed to yield even syntactically valid LaTeX before overfitting. Consequently, we describe a methodology using a transformer language model pre-trained on sources obtained from arxiv.org, which yields promising results despite the small size of the dataset. We evaluate our model using a plurality of dedicated techniques, taking the syntax and semantics of symbolic expressions into account.

Researcher Affiliation | Academia | Dennis Müller (Knowledge Representation and Management, FAU Erlangen-Nürnberg; Computational Logic, University of Innsbruck; d.mueller@kwarc.info); Cezary Kaliszyk (Computational Logic, University of Innsbruck; Institute of Computer Science, University of Warsaw; cezary.kaliszyk@uibk.ac.at)

Pseudocode | Yes | The generating algorithm takes as input a set of symbols Sym (e.g. all MitM-symbols for which an alignment to SMGLoM exists) and a starting symbol s ∈ Sym (e.g. nattimes; binary multiplication on natural numbers). The algorithm then proceeds as follows: 1. If s : T has a (simple or dependent) function type, we fill in the required arguments. ... (a hedged sketch of this procedure appears after the table)

Open Source Code | Yes | All code and data relevant to this paper is available at https://gl.kwarc.info/dmueller/fifom.

Open Datasets | Yes | We have two datasets of sTeX-content: 1. The SMGLoM (https://gl.mathhub.info/smglom), ... 2. The MiKoMH repository (https://gl.mathhub.info/MiKoMH) of lecture notes by Michael Kohlhase...

Dataset Splits | No | The paper describes a training corpus (911 entries from SMGLoM, 9,200 from MiKoMH, and 23,000 synthesized sentences) and an evaluation dataset (161 symbolic expressions), but it does not specify a separate validation split or its characteristics used during model training.

Hardware Specification | No | The paper does not provide specific details on the hardware used for training or evaluating the models.

Software Dependencies | No | The paper mentions software components such as GPT-2, LaTeXML, and the MMT system, but it does not specify version numbers for these or other software dependencies.

Experiment Setup | Yes | The GPT-2 model was finetuned on these for five epochs, resulting in an average training loss of 0.04 and yielding promising results on the evaluation set. (a hedged fine-tuning sketch appears after the table)
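
The Pseudocode row quotes the paper's synthesis algorithm: starting from a function-typed symbol, every required argument slot is filled recursively until only constants remain. The sketch below illustrates that recursion under stated assumptions; the names (Symbol, generate_expr, the example symbols) and the depth cut-off are hypothetical illustrations, not the paper's actual implementation, which lives in the repository at https://gl.kwarc.info/dmueller/fifom.

import random
from dataclasses import dataclass, field

@dataclass
class Symbol:
    name: str                                            # sTeX-style macro name, e.g. "nattimes"
    arg_types: list[str] = field(default_factory=list)   # empty list => constant
    result_type: str = "nat"

def generate_expr(symbols: list[Symbol], s: Symbol, max_depth: int = 3) -> str:
    """Recursively fill in the required arguments of s (step 1 of the quoted algorithm)."""
    if not s.arg_types:
        return "\\" + s.name                             # constants terminate the recursion
    args = []
    for ty in s.arg_types:
        if max_depth <= 1:
            # near the depth limit, only constants may appear as arguments
            candidates = [c for c in symbols if c.result_type == ty and not c.arg_types]
        else:
            candidates = [c for c in symbols if c.result_type == ty]
        args.append(generate_expr(symbols, random.choice(candidates), max_depth - 1))
    # render as an sTeX-like semantic macro application, e.g. \nattimes{...}{...}
    return "\\" + s.name + "".join("{" + a + "}" for a in args)

# Example: synthesize an expression rooted at nattimes (binary multiplication)
syms = [
    Symbol("natzero"), Symbol("natone"),
    Symbol("natplus", ["nat", "nat"]),
    Symbol("nattimes", ["nat", "nat"]),
]
print(generate_expr(syms, syms[3]))
# possible output: \nattimes{\natplus{\natone}{\natzero}}{\natone}

Rendering the result as nested semantic macros mirrors the sTeX convention the datasets use, where each symbol carries its arguments in braces rather than as plain LaTeX notation.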
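The Experiment Setup row reports only the epoch count and final training loss. The following is a minimal sketch of such a fine-tuning run using the Hugging Face transformers library, assuming the corpus is available as a plain-text file; the file name (stex_corpus.txt), the sequence length, and all other hyperparameters are assumptions not reported in the paper. Note also that the paper's model was additionally pre-trained on arxiv.org sources, whereas this sketch starts from the public gpt2 checkpoint for simplicity.

from transformers import (GPT2LMHeadModel, GPT2TokenizerFast, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)
from datasets import load_dataset

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token          # GPT-2 has no pad token by default
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Hypothetical corpus file: one training sentence per line
dataset = load_dataset("text", data_files={"train": "stex_corpus.txt"})
tokenized = dataset["train"].map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="gpt2-stex", num_train_epochs=5),  # five epochs, per the paper
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),  # causal LM objective
)
trainer.train()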