A Framework for Outlier Description Using Constraint Programming

Authors: Chia-Tung Kuo, Ian Davidson

AAAI 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We empirically evaluate our proposed framework on real datasets, including medical imaging and text corpus, and demonstrate how the results are useful and interpretable in these domains.
Researcher Affiliation | Academia | Chia-Tung Kuo, Department of Computer Science, University of California, Davis, tomkuo@ucdavis.edu; Ian Davidson, Department of Computer Science, University of California, Davis, davidson@cs.ucdavis.edu
Pseudocode | No | The paper describes the formulations and constraints in prose and mathematical notation but does not include any explicit pseudocode or algorithm blocks.
Open Source Code | No | The paper mentions using specific CP languages and solvers (Numberjack, CPLEX, Gurobi) but does not state that the authors' own implementation code for the described methodology is open-source or provide a link.
Open Datasets | Yes | The version we used is from http://qwone.com/~jason/20Newsgroups/ where cross-posts and some headers were removed.
Dataset Splits | No | The paper describes how instances were designated as 'normal' or 'outlier' for the purpose of outlier description (e.g., 'we use the 19 scans from the healthy subjects as normal points and randomly choose 3 scans from the demented subjects as outliers'), but it does not provide train/validation/test splits in the context of model training for reproducibility.
Hardware Specification | No | The paper states: 'Each of our experiments (fMRI and text documents) took about 5 minutes to run on a 12-core workstation.' This describes the number of cores but lacks specific CPU model, GPU, or memory details.
Software Dependencies | No | The paper mentions using 'the CP language Numberjack' and refers to 'Gurobi (Gurobi Optimization 2015) and CPLEX' as well as 'Gecode (Gecode Team 2006)', but it does not provide specific version numbers for these software dependencies.
Experiment Setup | Yes | For Experiment 1, the parameters were set as 'kmin = 1, kmax = 10 and discretized domain for r ∈ {0, 0.5, 1, ..., 10}' and 'a bound constraint on the size of the learnt subspace with Σ_i f_i ≤ 30'. For Experiment 2, 'kmax = 20 and r ∈ {0.01, 0.02, ..., 2.0} and we enforce a constraint on the size of the subspace, Σ_i f_i ≤ 10'. For multiple subspaces, 'an additional bound on k_N - k_O ≥ 5'.
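
Since no implementation is released, the quoted Experiment Setup values can be easier to follow as a small sketch. The Python below is not the authors' code and does not use Numberjack or any solver. It only builds the discretized radius domains quoted above and checks one candidate description by brute force. Everything else is an assumption made for illustration: the names `radius_domain`, `neighbor_count`, `describes_outliers`, `EXPERIMENT_1`, and `EXPERIMENT_2` are hypothetical, the subspace is taken to be a 0/1 feature mask whose sum is bounded from above, and a description is taken to hold when every normal point has at least `k_n` normal neighbors within radius `r` while every outlier has at most `k_o`.

```python
from math import sqrt


def radius_domain(start, stop, step):
    """Discretized radius values from start to stop, inclusive."""
    n = round((stop - start) / step)
    return [round(start + i * step, 10) for i in range(n + 1)]


# Values quoted in the Experiment Setup row; the dict layout is hypothetical.
EXPERIMENT_1 = {"k_min": 1, "k_max": 10,
                "r_domain": radius_domain(0.0, 10.0, 0.5), "max_features": 30}
EXPERIMENT_2 = {"k_max": 20,
                "r_domain": radius_domain(0.01, 2.0, 0.01), "max_features": 10}


def neighbor_count(point, others, mask, r):
    """Count points in `others` within distance r of `point`,
    measuring distance only over features where mask is 1."""
    count = 0
    for q in others:
        d2 = sum(m * (a - b) ** 2 for m, a, b in zip(mask, point, q))
        if sqrt(d2) <= r:
            count += 1
    return count


def describes_outliers(normals, outliers, mask, r, k_n, k_o, max_features):
    """Assumed acceptance test for one candidate (mask, r, k_n, k_o)."""
    if sum(mask) > max_features:  # assumed upper bound on subspace size
        return False
    for i, x in enumerate(normals):
        rest = normals[:i] + normals[i + 1:]  # exclude the point itself
        if neighbor_count(x, rest, mask, r) < k_n:
            return False
    for x in outliers:
        if neighbor_count(x, normals, mask, r) > k_o:
            return False
    return True


# Toy data (made up): feature 0 separates the outlier, feature 1 is noise.
toy_normals = [(0.0, 0.0), (0.1, 5.0), (0.2, -5.0)]
toy_outliers = [(3.0, 0.0)]
ok_subspace = describes_outliers(toy_normals, toy_outliers, (1, 0), 0.5, 2, 0, 30)
noisy_subspace = describes_outliers(toy_normals, toy_outliers, (1, 1), 0.5, 2, 0, 30)
```

On the toy data, the mask that keeps only feature 0 passes the check, while the mask that also keeps the noisy feature 1 fails it. A constraint programming formulation would search over the mask, `r`, `k_n`, and `k_o` jointly rather than test one candidate at a time.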