Unsupervised Representation Learning by Predicting Random Distances

Authors: Hu Wang, Guansong Pang, Chunhua Shen, Congbo Ma

IJCAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Empirical results on 19 real-world datasets show that our learned representations substantially outperform a few state-of-the-art methods for both anomaly detection and clustering tasks." "This section evaluates our method through two typical unsupervised tasks: anomaly detection and clustering."
Researcher Affiliation | Academia | "Hu Wang, Guansong Pang, Chunhua Shen, Congbo Ma, The University of Adelaide, Australia"
Pseudocode | No | The paper describes the RDP framework but does not include any explicit pseudocode or algorithm blocks.
Open Source Code | Yes | "Code is available at: https://git.io/RDP"
Open Datasets | Yes | "As shown in Table 1, 14 publicly available datasets taken from the literature [Liu et al., 2008; Pang et al., 2018; Zong et al., 2018] are used, which are from various domains, including network intrusion, credit card fraud detection, and disease detection."
Dataset Splits | No | The paper mentions public datasets and statistical procedures such as averaging over runs for stability, but does not explicitly provide train/validation/test split percentages, sample counts, or predefined splits for reproduction.
Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments.
Software Dependencies | No | The paper mentions using SGD as an optimiser and K-means, but does not specify version numbers for any software dependencies or libraries.
Experiment Setup | Yes | "The RDP consists of one fully connected layer with 50 hidden units, followed by a leaky-ReLU layer. It is trained using Stochastic Gradient Descent (SGD) as its optimiser for 200 epochs, with 192 samples per batch. The learning rate is fixed to 0.1. Compared to anomaly detection, more semantic information is required for clustering algorithms to work well, so the network consists of 1,024 hidden units and is trained for 1,000 epochs."
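The reported anomaly-detection setup (one fully connected layer with 50 hidden units, leaky-ReLU, SGD, learning rate 0.1, 200 epochs, batch size 192) can be sketched as a toy numpy run. This is a minimal sketch, not the authors' implementation: the synthetic data, the frozen random projection used to produce distance targets, and the mean-inner-product loss scaling are all assumptions made here for illustration, since the report does not quote the exact RDP objective.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in data: n samples with d features (a real run would load a dataset).
n, d, hidden = 512, 20, 50
X = rng.standard_normal((n, d))

# Frozen random projection eta; its pairwise inner products serve as the
# "random distance" targets (an assumption about the objective's exact form).
W_eta = rng.standard_normal((d, hidden)) / np.sqrt(d)

def leaky_relu(z, alpha=0.01):
    return np.where(z > 0, z, alpha * z)

def leaky_relu_grad(z, alpha=0.01):
    return np.where(z > 0, 1.0, alpha)

# Learnable one-layer network phi: 50 hidden units + leaky-ReLU, as reported.
W = rng.standard_normal((d, hidden)) / np.sqrt(d)

# Reported hyperparameters: SGD, learning rate 0.1, 200 epochs, batch size 192.
lr, epochs, batch = 0.1, 200, 192
for epoch in range(epochs):
    order = rng.permutation(n)
    for start in range(0, n, batch):
        i = order[start:start + batch]
        j = rng.permutation(i)  # a random partner per sample in the batch
        # Targets and predictions as mean inner products (the mean scaling is
        # a choice made here for numerical stability of the toy run).
        target = np.mean((X[i] @ W_eta) * (X[j] @ W_eta), axis=1)
        zi, zj = X[i] @ W, X[j] @ W
        hi, hj = leaky_relu(zi), leaky_relu(zj)
        pred = np.mean(hi * hj, axis=1)
        err = (pred - target) / hidden  # d(MSE)/d(pred) up to a constant
        # Manual backprop through both branches of the shared layer.
        gi = err[:, None] * hj * leaky_relu_grad(zi)
        gj = err[:, None] * hi * leaky_relu_grad(zj)
        W -= lr / len(i) * (X[i].T @ gi + X[j].T @ gj)

loss = float(np.mean((pred - target) ** 2))
```

For the clustering experiments the report's quoted values (1,024 hidden units, 1,000 epochs) would substitute for `hidden` and `epochs` above, at correspondingly higher cost.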