Disambiguating Symbolic Expressions in Informal Documents

Authors: Dennis Müller, Cezary Kaliszyk

ICLR 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response

Research Type | Experimental | We evaluated several baseline models on this dataset, which failed to yield even syntactically valid LaTeX before overfitting. Consequently, we describe a methodology using a transformer language model pre-trained on sources obtained from arxiv.org, which yields promising results despite the small size of the dataset. We evaluate our model using a plurality of dedicated techniques, taking the syntax and semantics of symbolic expressions into account.

Researcher Affiliation | Academia | Dennis Müller (Knowledge Representation and Management, FAU Erlangen-Nürnberg; Computational Logic, University of Innsbruck; d.mueller@kwarc.info); Cezary Kaliszyk (Computational Logic, University of Innsbruck; Institute of Computer Science, University of Warsaw; cezary.kaliszyk@uibk.ac.at)

Pseudocode | Yes | The generating algorithm takes as input a set of symbols Sym (e.g. all MitM-symbols for which an alignment to SMGLoM exists) and a starting symbol s ∈ Sym (e.g. nattimes; binary multiplication on natural numbers). The algorithm then proceeds as follows: 1. If s : T has a (simple or dependent) function type, we fill in the required arguments. ... (a hedged sketch of this procedure appears after the table)

Open Source Code | Yes | All code and data relevant to this paper is available at https://gl.kwarc.info/dmueller/fifom.

Open Datasets | Yes | We have two datasets of sTeX-content: 1. The SMGLoM (https://gl.mathhub.info/smglom), ... 2. The MiKoMH repository (https://gl.mathhub.info/MiKoMH) of lecture notes by Michael Kohlhase...

Dataset Splits | No | The paper describes a training corpus (911 entries from SMGLoM, 9,200 from MiKoMH, and 23,000 synthesized sentences) and an evaluation dataset (161 symbolic expressions), but it does not specify a separate validation split or its characteristics used during model training.

Hardware Specification | No | The paper does not provide specific details on the hardware used for training or evaluating the models.

Software Dependencies | No | The paper mentions software components such as GPT-2, LaTeXML, and the MMT system, but it does not specify version numbers for these or other software dependencies.

Experiment Setup | Yes | The GPT-2 model was finetuned on these for five epochs, resulting in an average training loss of 0.04 and yielding promising results on the evaluation set. (a hedged fine-tuning sketch appears after the table)
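
The Pseudocode row quotes the paper's synthesis algorithm: starting from a function-typed symbol, every required argument slot is filled recursively until only constants remain. The sketch below illustrates that recursion under stated assumptions; the names (Symbol, generate_expr, the example symbols) and the depth cut-off are hypothetical illustrations, not the paper's actual implementation, which lives in the repository at https://gl.kwarc.info/dmueller/fifom.

import random
from dataclasses import dataclass, field

@dataclass
class Symbol:
    name: str                                            # sTeX-style macro name, e.g. "nattimes"
    arg_types: list[str] = field(default_factory=list)   # empty list => constant
    result_type: str = "nat"

def generate_expr(symbols: list[Symbol], s: Symbol, max_depth: int = 3) -> str:
    """Recursively fill in the required arguments of s (step 1 of the quoted algorithm)."""
    if not s.arg_types:
        return "\\" + s.name                             # constants terminate the recursion
    args = []
    for ty in s.arg_types:
        if max_depth <= 1:
            # near the depth limit, only constants may appear as arguments
            candidates = [c for c in symbols if c.result_type == ty and not c.arg_types]
        else:
            candidates = [c for c in symbols if c.result_type == ty]
        args.append(generate_expr(symbols, random.choice(candidates), max_depth - 1))
    # render as an sTeX-like semantic macro application, e.g. \nattimes{...}{...}
    return "\\" + s.name + "".join("{" + a + "}" for a in args)

# Example: synthesize an expression rooted at nattimes (binary multiplication)
syms = [
    Symbol("natzero"), Symbol("natone"),
    Symbol("natplus", ["nat", "nat"]),
    Symbol("nattimes", ["nat", "nat"]),
]
print(generate_expr(syms, syms[3]))
# possible output: \nattimes{\natplus{\natone}{\natzero}}{\natone}

Rendering the result as nested semantic macros mirrors the sTeX convention the datasets use, where each symbol carries its arguments in braces rather than as plain LaTeX notation.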
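The Experiment Setup row reports only the epoch count and final training loss. The following is a minimal sketch of such a fine-tuning run using the Hugging Face transformers library, assuming the corpus is available as a plain-text file; the file name (stex_corpus.txt), the sequence length, and all other hyperparameters are assumptions not reported in the paper. Note also that the paper's model was additionally pre-trained on arxiv.org sources, whereas this sketch starts from the public gpt2 checkpoint for simplicity.

from transformers import (GPT2LMHeadModel, GPT2TokenizerFast, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)
from datasets import load_dataset

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token          # GPT-2 has no pad token by default
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Hypothetical corpus file: one training sentence per line
dataset = load_dataset("text", data_files={"train": "stex_corpus.txt"})
tokenized = dataset["train"].map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="gpt2-stex", num_train_epochs=5),  # five epochs, per the paper
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),  # causal LM objective
)
trainer.train()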