Knowledge-Based Sequence Mining with ASP
Authors: Martin Gebser, Thomas Guyet, René Quiniou, Javier Romero, Torsten Schaub
IJCAI 2016 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To this end, we conducted experiments on simulated databases. First, we present time efficiency results comparing our approach with the CP-based one of CPSM [Negrevergne and Guns, 2015]. Then, we illustrate the effectiveness of preference handling to reduce the size of output pattern sets. [...] We ran our respective encodings on two classical databases of the UCI collection [Lichman, 2013]: jmlr (a natural language processing database; each transaction is an abstract of a paper from the Journal of Machine Learning Research) and Unix user (each transaction is a series of shell commands executed by a user during one session). |
| Researcher Affiliation | Academia | Martin Gebser (3), Thomas Guyet (1), René Quiniou (2), Javier Romero (3), Torsten Schaub (2,3); 1: AGROCAMPUS-OUEST/IRISA, France; 2: Inria Centre de Rennes Bretagne Atlantique, France; 3: University of Potsdam, Germany |
| Pseudocode | Yes | Listing 2: Basic encoding of frequent sequence mining, Listing 3: Encoding part for selecting maximal patterns, Listing 4: Modifications for selecting closed patterns, Listing 5: Preference type implementation (a minimal ASP sketch of the basic encoding idea is given after this table) |
| Open Source Code | No | The paper states 'The databases used in our experiments are available at https://sites.google.com/site/aspseqmining.' but does not state that the source code implementing its method is available, nor provide a link to it. |
| Open Datasets | Yes | we generated databases using a retro-engineering process: [...] The databases used in our experiments are available at https://sites.google.com/site/aspseqmining. [...] We ran our respective encodings on two classical databases of the UCI collection [Lichman, 2013]: jmlr (a natural language processing database; each transaction is an abstract of a paper from the Journal of Machine Learning Research) and Unix user (each transaction is a series of shell commands executed by a user during one session). |
| Dataset Splits | No | The paper describes generating and using databases but does not specify any explicit train/validation/test splits, percentages, or methodology for data partitioning. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper mentions 'clingo' and 'asprin' (ASP systems) and 'gecode' (CP solver) but does not provide specific version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | It relies on two parameters: max determines a maximum length for patterns of interest, and k specifies the frequency threshold. [...] In our experiments, we vary the mean length from 10 to 40, and contained items are randomly generated according to a Gaussian law (some items are more frequent than others) over a vocabulary of 50 items. [...] The timeout was set to 20 minutes. |
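
The encodings cited in the Pseudocode row (Listings 2-5) are ASP programs intended for the clingo system, with preferences handled via asprin. The block below is a minimal, self-contained sketch of the basic frequent-sequence-mining idea in clingo syntax; the toy database facts, the predicate names, and the constants `maxlen` and `th` (standing in for the paper's `max` and `k` parameters from the Experiment Setup row) are illustrative assumptions, not the paper's exact Listing 2.

```
% Toy database (illustrative, not from the paper):
% seq(T,P,I) -- item I occurs at position P of transaction T.
seq(1,1,a). seq(1,2,b). seq(1,3,c).
seq(2,1,a). seq(2,2,c).
seq(3,1,b). seq(3,2,a). seq(3,3,c).

% Constants standing in for the paper's parameters:
% maxlen ~ max (maximum pattern length), th ~ k (frequency threshold).
#const maxlen = 3.
#const th = 2.

item(I) :- seq(_,_,I).

% Guess a pattern: positions 1..L (L <= maxlen), each holding exactly one item.
patpos(1).
{ patpos(X+1) } :- patpos(X), X < maxlen.
patlen(L) :- patpos(L), not patpos(L+1).
1 { pat(X,I) : item(I) } 1 :- patpos(X).

% Check which transactions embed the pattern as a subsequence.
occ(T,1,P) :- seq(T,P,I), pat(1,I).
occ(T,X,P) :- occ(T,X-1,Q), seq(T,P,I), pat(X,I), Q < P.
support(T) :- occ(T,L,_), patlen(L).

% Keep only patterns supported by at least th transactions.
:- #count{ T : support(T) } < th.

#show pat/2.
```

Enumerating all answer sets (e.g., `clingo encoding.lp 0`) yields one frequent pattern per answer set; the paper's actual encodings additionally cover maximal and closed patterns (Listings 3 and 4) and asprin preference types (Listing 5).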