reproducibilityindex.ai

R1SVM: A Randomised Nonlinear Approach to Large-Scale Anomaly Detection

Authors: Sarah M. Erfani, Mahsa Baktashmotlagh, Sutharshan Rajasegarar, Shanika Karunasekera, Chris Leckie

AAAI 2015

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our empirical analysis on several real-life and synthetic datasets shows that our randomised 1SVM algorithm achieves comparable or better accuracy to deep autoencoder and traditional kernelised approaches for anomaly detection, while being approximately 100 times faster in training and testing.
Researcher Affiliation	Collaboration	NICTA Victoria Research Laboratory, Department of Computing and Information Systems, The University of Melbourne, Australia
Pseudocode	No	The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code	No	The paper does not provide any concrete access to source code for the methodology described.
Open Datasets	Yes	The experiments are conducted on six real-life datasets from the UCI Machine Learning Repository: (i) Forest, (ii) Adult, (iii) Gas Sensor Array Drift (Gas), (iv) Opportunity Activity Recognition (OAR), (v) Daily and Sport Activity (DSA), and (vi) Human Activity Recognition using Smartphones (HAR), with dimensionalities of 54, 123, 128, 242, 3151 and 561 features, respectively.
Dataset Splits	No	The paper states '80% of records are randomly selected for training and 20% for testing' but does not explicitly mention a validation split.
Hardware Specification	Yes	The reported training/testing times are in seconds based on experiments run on a machine with an Intel Core i7 CPU at 3.40 GHz and 8 GB RAM.
Software Dependencies	Yes	We used the svdd implementation from Dd-tools (Tax 2013) as the one-class SVM method. Note that the hypersphere-based SVDD model using an RBF kernel is equivalent to a hyperplane-based 1SVM model. In the case of the autoencoder, we implemented a basic autoencoder including five-layers with tied weights and a sigmoid activation function for both the encoder and decoder. The training is conducted in mini-batches of q = 100 records.
Experiment Setup	Yes	Experimental setup: For visualisation purposes, in the first experiment, we used a tool called improved Visual Assessment of cluster Tendency (iVAT) (Wang et al. 2010), which helps visualise the possible number of clusters in, or the cluster tendency of, a set of objects. In the second experiment we used the svdd implementation from Dd-tools (Tax 2013) as the one-class SVM method. In the case of the autoencoder, we implemented a basic autoencoder including five-layers with tied weights and a sigmoid activation function for both the encoder and decoder. The training is conducted in mini-batches of q = 100 records.
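
The data handling quoted in the rows above (a random 80%/20% train/test split and training in mini-batches of q = 100 records) can be sketched as follows. This is a minimal illustration, not the authors' code, which is not available. The function names, the fixed seed, and the stand-in records are all assumptions made for the example; the paper reports no seed and no validation split.

```python
import random


def split_train_test(records, train_fraction=0.8, seed=0):
    """Randomly assign records to a training set and a test set.

    The seed is an assumption added for repeatability; the paper
    does not report one.
    """
    rng = random.Random(seed)
    indices = list(range(len(records)))
    rng.shuffle(indices)
    cut = int(round(train_fraction * len(records)))
    train = [records[i] for i in indices[:cut]]
    test = [records[i] for i in indices[cut:]]
    return train, test


def mini_batches(records, q=100):
    """Yield consecutive mini-batches of at most q records."""
    for start in range(0, len(records), q):
        yield records[start:start + q]


# Hypothetical stand-in for the rows of one dataset.
records = list(range(1000))
train, test = split_train_test(records)
batches = list(mini_batches(train))
```

With 1000 stand-in records the split yields 800 training and 200 test records, and the training set is consumed in batches of 100. A real run would substitute the feature vectors of one of the UCI datasets for `records`.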