Knowledge-Based Sequence Mining with ASP

Authors: Martin Gebser, Thomas Guyet, René Quiniou, Javier Romero, Torsten Schaub

IJCAI 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To this end, we conducted experiments on simulated databases. First, we present time efficiency results comparing our approach with the CP-based one of CPSM [Negrevergne and Guns, 2015]. Then, we illustrate the effectiveness of preference handling to reduce the size of output pattern sets. [...] We ran our respective encodings on two classical databases of the UCI collection [Lichman, 2013]: jmlr (a natural language processing database; each transaction is an abstract of a paper from the Journal of Machine Learning Research) and Unix user (each transaction is a series of shell commands executed by a user during one session).
Researcher Affiliation | Academia | Martin Gebser (3), Thomas Guyet (1), René Quiniou (2), Javier Romero (3), Torsten Schaub (2,3); (1) AGROCAMPUS-OUEST/IRISA, France; (2) Inria Centre de Rennes Bretagne Atlantique, France; (3) University of Potsdam, Germany
Pseudocode | Yes | Listing 2: Basic encoding of frequent sequence mining, Listing 3: Encoding part for selecting maximal patterns, Listing 4: Modifications for selecting closed patterns, Listing 5: Preference type implementation
Open Source Code | No | The paper states 'The databases used in our experiments are available at https://sites.google.com/site/aspseqmining.' but does not explicitly state that the source code for their methodology is available or provide a link to it.
Open Datasets | Yes | we generated databases using a retro-engineering process: [...] The databases used in our experiments are available at https://sites.google.com/site/aspseqmining. [...] We ran our respective encodings on two classical databases of the UCI collection [Lichman, 2013]: jmlr (a natural language processing database; each transaction is an abstract of a paper from the Journal of Machine Learning Research) and Unix user (each transaction is a series of shell commands executed by a user during one session).
Dataset Splits | No | The paper describes generating and using databases but does not specify any explicit train/validation/test splits, percentages, or methodology for data partitioning.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts) used for running its experiments.
Software Dependencies | No | The paper mentions 'clingo' and 'asprin' (ASP systems) and 'gecode' (CP solver) but does not provide specific version numbers for these or any other software dependencies.
Experiment Setup | Yes | It relies on two parameters: max determines a maximum length for patterns of interest, and k specifies the frequency threshold. [...] In our experiments, we vary the mean length from 10 to 40, and contained items are randomly generated according to a Gaussian law (some items are more frequent than others) over a vocabulary of 50 items. [...] The timeout was set to 20 minutes.
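
To make the experiment setup above more concrete, the following Python sketch generates sequences over a vocabulary of 50 items drawn from a Gaussian law and enumerates patterns of length at most `max_len` with support at least `k`. It is a brute-force illustration only, not the authors' ASP encoding or their retro-engineering generator (no patterns are planted). The function names, the spread of the Gaussian, the spread of the sequence lengths, and the embedding semantics (order preserved, gaps allowed) are all assumptions made for this example.

```python
import random


def generate_database(num_sequences, mean_length, vocab_size=50, seed=0):
    """Generate sequences whose items follow a Gaussian law over the vocabulary.

    The standard deviations used here are assumptions; the source only states
    the mean length range (10 to 40) and the vocabulary size (50).
    """
    rng = random.Random(seed)
    database = []
    for _ in range(num_sequences):
        # Sequence length drawn around the requested mean (assumed spread).
        length = max(1, round(rng.gauss(mean_length, mean_length / 4)))
        sequence = []
        for _ in range(length):
            # Items near the middle of the vocabulary are drawn more often.
            item = round(rng.gauss(vocab_size / 2, vocab_size / 6))
            sequence.append(min(vocab_size - 1, max(0, item)))
        database.append(sequence)
    return database


def is_subsequence(pattern, sequence):
    """True if pattern embeds in sequence, preserving order (gaps allowed)."""
    remaining = iter(sequence)
    return all(item in remaining for item in pattern)


def support(pattern, database):
    """Number of sequences in the database that contain the pattern."""
    return sum(is_subsequence(pattern, seq) for seq in database)


def frequent_patterns(database, max_len, k):
    """Level-wise enumeration of patterns with length <= max_len, support >= k.

    Only frequent prefixes are extended, which is safe because every prefix
    of a frequent pattern is itself frequent.
    """
    items = sorted({item for seq in database for item in seq})
    frequent = {}
    level = [()]
    for _ in range(max_len):
        next_level = []
        for prefix in level:
            for item in items:
                candidate = prefix + (item,)
                count = support(candidate, database)
                if count >= k:
                    frequent[candidate] = count
                    next_level.append(candidate)
        level = next_level
    return frequent
```

For example, `frequent_patterns(generate_database(100, 10), max_len=3, k=20)` would mine a small synthetic database; the sizes chosen here are arbitrary and much smaller than a realistic benchmark.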