Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
One-class classification of point patterns of extremes
Authors: Stijn Luca, David A. Clifton, Bart Vanrumste
JMLR 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The approach is illustrated using simulated data and then a real-life application is used as an exemplar, whereby accelerometry data from epileptic seizures are analysed; these are known to be extreme and rare with respect to normal accelerometer data. Keywords: Sequence classification; novelty detection; extreme value theory; class imbalance; asymptotic theory. The method is evaluated using synthetic as well as real-world data, and is compared with commonly used algorithms for outlier detection such as one-class support vector machines (OCSVMs) and hidden Markov models (HMMs). |
| Researcher Affiliation | Academia | Stijn Luca EMAIL KU Leuven Technology Campus Geel Department of Electrical Engineering Kleinhoefstraat 4, 2440, Geel, Belgium. David A. Clifton EMAIL University of Oxford Department of Engineering Science Old Road Campus Research Building Roosevelt Drive, Oxford, OX3 7DQ, UK. Bart Vanrumste EMAIL KU Leuven Technology Campus Geel Department of Electrical Engineering Kleinhoefstraat 4, 2440, Geel, Belgium. |
| Pseudocode | No | The paper describes methods and theoretical concepts, and illustrates features with mathematical formulations and figures. However, it does not include any explicitly labeled pseudocode blocks or algorithms with structured steps. |
| Open Source Code | No | The paper does not contain any explicit statements about releasing source code, nor does it provide links to a code repository in the main text or supplementary materials. |
| Open Datasets | Yes | In this section, a case study in the healthcare domain is considered using a set of acceleration data collected from movements of patients suffering from epilepsy (Cuppens et al., 2013). Table 1: Overview of epileptic accelerometry data set. |
| Dataset Splits | Yes | To this end, 5-fold cross-validation is performed where in each run a random subset of the data from the normal class is used for training and the remainder of the data is split evenly between validation and test data. The randomized runs are kept the same across the different classifiers to allow a consistent comparison. |
| Hardware Specification | No | The paper mentions 'accelerometry data' collected from sensors, but it does not specify any hardware (like GPUs, CPUs, or specific computer models) used for running the experiments or simulations. |
| Software Dependencies | No | The paper mentions using a 'kernel density estimator' and comparing with 'OCSVMs' and 'HMMs'. However, it does not provide specific version numbers for any software libraries, frameworks, or tools used for implementation. |
| Experiment Setup | Yes | For the HMM, the number of states varies from 1 to 4 (Rabiner and Murray, 1989), while for the OCSVM the standard hyperparameters (σ, ν) are optimized that respectively denote the kernel width of the Gaussian kernel that is used and an upper bound on the fraction of outliers (Schölkopf et al., 2001). The threshold on the novelty scores is optimized using the validation data. For the EVT model, no validation step is performed and no data from the abnormal class are considered during training. A threshold of 95% is chosen on the novelty score (motivated from a probabilistic viewpoint). The density of the distribution X describing the normal class is estimated using a kernel density estimation with Gaussian kernels, and where the kernel width is estimated by minimization of the mean integrated squared error (Scott, 1992). |
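
The Dataset Splits row describes five randomized runs in which a random subset of the normal class is used for training and the remaining data are split evenly between validation and test, with the same runs reused for every classifier. The following is a minimal sketch of one way such splits could be generated; it is not the authors' code. The function name `make_runs`, the `train_fraction` of 0.5, the fixed seed, and the choice to pool all abnormal samples with the held-out normal samples before halving are assumptions made for illustration, since the quoted text does not state them.

```python
import random


def make_runs(normal_ids, abnormal_ids, n_runs=5, train_fraction=0.5, seed=0):
    """Return a list of (train, validation, test) id lists, one tuple per run.

    Hypothetical helper: train_fraction and seed are illustrative values,
    not taken from the paper.
    """
    # A single seeded generator makes the runs reproducible, so the same
    # splits can be handed to every classifier being compared.
    rng = random.Random(seed)
    runs = []
    for _ in range(n_runs):
        normal = list(normal_ids)
        rng.shuffle(normal)
        n_train = int(len(normal) * train_fraction)
        train = normal[:n_train]  # training uses the normal class only

        # Assumption: the remainder is the held-out normal data plus all
        # abnormal data, shuffled and cut in half (not stratified by class).
        rest = normal[n_train:] + list(abnormal_ids)
        rng.shuffle(rest)
        half = len(rest) // 2
        runs.append((train, rest[:half], rest[half:]))
    return runs
```

Calling `make_runs(range(100), range(100, 110))` once and passing the resulting list to each classifier keeps the comparison consistent across methods. A stratified split of the remainder may be preferable when the abnormal class is very small, because an unstratified halving can leave one of the two halves with few or no abnormal samples.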