Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Constraint Detection in Natural Language Problem Descriptions

Authors: Zeynep Kiziltan, Marco Lippi, Paolo Torroni

IJCAI 2016 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To evaluate the method, we develop an original annotated corpus which gathers 110 problem descriptions from several resources. Our results show significant accuracy with respect to metrics used in cognate tasks. We performed experiments on our dataset following the leave-one-problem-out (LOO) procedure.
Researcher Affiliation | Academia | Zeynep Kiziltan, Marco Lippi and Paolo Torroni, Department of Computer Science and Engineering (DISI), University of Bologna, Italy
Pseudocode | No | The paper does not contain explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Our system together with all the reported predictions are available at: http://nlp4cp.disi.unibo.it
Open Datasets | Yes | Being the first ones to tackle constraint detection, we had to construct a dataset, that is, a corpus of NL problem descriptions where the parts of text containing problem constraints are annotated. ... The final dataset contains 1,075 sentences, for a total of 25,317 words... http://nlp4cp.disi.unibo.it
Dataset Splits | Yes | We performed experiments on our dataset following the leave-one-problem-out (LOO) procedure. This is a standard ML methodology, where each problem in turn is selected as test set while the remaining ones form the training set.
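The leave-one-problem-out procedure quoted above can be sketched as follows (a minimal illustration, not the authors' code; the function name and the tuple layout of `examples` are assumptions):

```python
def leave_one_problem_out(examples):
    """Yield (held_out_id, train, test) splits: each problem in turn
    becomes the test set while all remaining problems form the training set.

    `examples` is a list of (problem_id, features, label) tuples, so all
    sentences belonging to one problem are held out together.
    """
    problem_ids = sorted({pid for pid, _, _ in examples})
    for held_out in problem_ids:
        train = [ex for ex in examples if ex[0] != held_out]
        test = [ex for ex in examples if ex[0] == held_out]
        yield held_out, train, test

# Toy usage: three problems, one sentence each -> three splits.
data = [("p1", "x1", 0), ("p2", "x2", 1), ("p3", "x3", 0)]
splits = list(leave_one_problem_out(data))
assert len(splits) == 3
assert splits[0][2] == [("p1", "x1", 0)]  # p1 is held out in the first split
```

Grouping by problem id (rather than splitting individual sentences at random) matters here: sentences from the same problem description are correlated, so holding out whole problems gives a more honest estimate of generalization.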
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory, or cloud instance types) used for running experiments.
Software Dependencies | No | The paper mentions 'Stanford Core NLP library' and 'SVM-HMM', but does not provide specific version numbers for these software components.
Experiment Setup | Yes | Table 1 reports the results obtained on our dataset by different classifiers, as a function of the diameter D used to build contextual features for each word. ... for each word w_j we keep the original (unchanged) term, and we also extract the part-of-speech and the stemmed word, both obtained with the Stanford Core NLP library. ... Finally, we also add the following bag-of-trigrams both for words and for part-of-speech tags: [w_{j-2} w_{j-1} w_j], [w_{j-1} w_j w_{j+1}], [w_j w_{j+1} w_{j+2}].
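The feature scheme quoted above (a contextual window of diameter D plus the three trigrams around each word) can be sketched as below. This is a simplified illustration under stated assumptions: tokenization and POS tags are given as plain lists, stems are omitted, and the feature keys are invented for the sketch; the actual system obtains POS tags and stems from Stanford Core NLP.

```python
def word_features(words, tags, j, D=3):
    """Features for word j: surface forms and POS tags in a window of
    diameter D centered on j, plus word and POS trigrams
    [w_{j-2} w_{j-1} w_j], [w_{j-1} w_j w_{j+1}], [w_j w_{j+1} w_{j+2}].
    """
    feats = {}
    half = D // 2
    for k in range(-half, half + 1):          # contextual window
        i = j + k
        if 0 <= i < len(words):
            feats[f"w[{k}]"] = words[i]
            feats[f"pos[{k}]"] = tags[i]

    def trigram(seq, start):
        # Join three consecutive items, padding outside the sentence.
        return " ".join(seq[i] if 0 <= i < len(seq) else "<pad>"
                        for i in range(start, start + 3))

    for off in (-2, -1, 0):                   # bag-of-trigrams
        feats[f"w3[{off}]"] = trigram(words, j + off)
        feats[f"pos3[{off}]"] = trigram(tags, j + off)
    return feats

words = ["each", "box", "holds", "at", "most", "ten", "items"]
tags  = ["DT", "NN", "VBZ", "IN", "JJS", "CD", "NNS"]
f = word_features(words, tags, 5, D=3)
assert f["w[0]"] == "ten" and f["pos[-1]"] == "JJS"
assert f["w3[-2]"] == "at most ten"
```

In the paper these per-word feature vectors feed sequence classifiers such as SVM-HMM; the sketch only shows how the window and trigram features are assembled for one word.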