R1SVM: A Randomised Nonlinear Approach to Large-Scale Anomaly Detection
Authors: Sarah M. Erfani, Mahsa Baktashmotlagh, Sutharshan Rajasegarar, Shanika Karunasekera, Chris Leckie
AAAI 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our empirical analysis on several real-life and synthetic datasets shows that our randomised 1SVM algorithm achieves comparable or better accuracy to deep autoencoder and traditional kernelised approaches for anomaly detection, while being approximately 100 times faster in training and testing. |
| Researcher Affiliation | Collaboration | NICTA Victoria Research Laboratory, Department of Computing and Information Systems, The University of Melbourne, Australia |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any concrete access to source code for the methodology described. |
| Open Datasets | Yes | The experiments are conducted on six real-life datasets from the UCI Machine Learning Repository: (i) Forest, (ii) Adult, (iii) Gas Sensor Array Drift (Gas), (iv) Opportunity Activity Recognition (OAR), (v) Daily and Sport Activity (DSA), and (vi) Human Activity Recognition using Smartphones (HAR), with dimensionalities of 54, 123, 128, 242, 315 and 561 features, respectively. |
| Dataset Splits | No | The paper states '80% of records are randomly selected for training and 20% for testing' but does not explicitly mention a validation split. |
| Hardware Specification | Yes | The reported training/testing times are in seconds based on experiments run on a machine with an Intel Core i7 CPU at 3.40 GHz and 8 GB RAM. |
| Software Dependencies | Yes | We used the svdd implementation from Dd-tools (Tax 2013) as the one-class SVM method. Note that the hypersphere-based SVDD model using an RBF kernel is equivalent to a hyperplane-based 1SVM model. In the case of the autoencoder, we implemented a basic autoencoder including five layers with tied weights and a sigmoid activation function for both the encoder and decoder. The training is conducted in mini-batches of q = 100 records. |
| Experiment Setup | Yes | Experimental setup: For visualisation purposes, in the first experiment, we used a tool called improved Visual Assessment of cluster Tendency (iVAT) (Wang et al. 2010), which helps visualise the possible number of clusters in, or the cluster tendency of, a set of objects. In the second experiment we used the svdd implementation from Dd-tools (Tax 2013) as the one-class SVM method. In the case of the autoencoder, we implemented a basic autoencoder including five layers with tied weights and a sigmoid activation function for both the encoder and decoder. The training is conducted in mini-batches of q = 100 records. |
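
The split and autoencoder settings quoted in the table (a random 80%/20% train/test split, mini-batches of q = 100 records, and a five-layer autoencoder with tied weights and sigmoid activations) could be sketched roughly as follows. This is an illustrative NumPy sketch, not the authors' code: the function and class names, the hidden-layer sizes, and the weight initialisation are all assumptions, and only the forward pass is shown (no training loop).

```python
import numpy as np


def split_80_20(X, rng):
    """Randomly assign 80% of the rows to training and 20% to testing."""
    idx = rng.permutation(len(X))
    n_train = (len(X) * 4) // 5
    return X[idx[:n_train]], X[idx[n_train:]]


def minibatches(X, q=100):
    """Yield consecutive mini-batches of at most q records."""
    for start in range(0, len(X), q):
        yield X[start:start + q]


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


class TiedAutoencoder:
    """Five-layer autoencoder d -> h1 -> h2 -> h1 -> d with tied weights.

    The decoder reuses the transposed encoder matrices, so only two
    weight matrices exist. Layer sizes are hypothetical; the quoted
    text does not state them.
    """

    def __init__(self, sizes, rng):
        pairs = list(zip(sizes[:-1], sizes[1:]))
        self.W = [rng.normal(0.0, 0.1, (a, b)) for a, b in pairs]
        self.b_enc = [np.zeros(b) for _, b in pairs]
        self.b_dec = [np.zeros(a) for a, _ in pairs]

    def encode(self, X):
        for W, b in zip(self.W, self.b_enc):
            X = sigmoid(X @ W + b)
        return X

    def decode(self, H):
        # Walk the encoder layers in reverse, using W.T (tied weights).
        for W, b in zip(reversed(self.W), reversed(self.b_dec)):
            H = sigmoid(H @ W.T + b)
        return H

    def reconstruct(self, X):
        return self.decode(self.encode(X))
```

For example, with synthetic data of 54 features (the dimensionality reported for Forest), `split_80_20(X, np.random.default_rng(0))` on 1,000 rows gives 800 training and 200 test records, and iterating `minibatches(train)` yields eight batches of 100 that can each be passed to `TiedAutoencoder((54, 32, 16), rng).reconstruct`.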