Unsupervised Representation Learning by Predicting Random Distances
Authors: Hu Wang, Guansong Pang, Chunhua Shen, Congbo Ma
IJCAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical results on 19 real-world datasets show that our learned representations substantially outperform a few state-of-the-art methods for both anomaly detection and clustering tasks. This section evaluates our method through two typical unsupervised tasks: anomaly detection and clustering. |
| Researcher Affiliation | Academia | Hu Wang, Guansong Pang, Chunhua Shen, Congbo Ma, The University of Adelaide, Australia |
| Pseudocode | No | The paper describes the RDP framework but does not include any explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at: https://git.io/RDP |
| Open Datasets | Yes | As shown in Table 1, 14 publicly available datasets taken from the literature [Liu et al., 2008; Pang et al., 2018; Zong et al., 2018], are used, which are from various domains, including network intrusion, credit card fraud detection, and disease detection. |
| Dataset Splits | No | The paper mentions public datasets and statistical procedures like averaging over runs for stability, but does not explicitly provide train/validation/test split percentages, sample counts, or specific predefined splits for reproduction. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments. |
| Software Dependencies | No | The paper mentions using SGD as an optimizer and K-means, but does not specify version numbers for any software dependencies or libraries used. |
| Experiment Setup | Yes | The RDP consists of one fully connected layer with 50 hidden units, followed by a leaky-ReLU layer. It is trained using Stochastic Gradient Descent (SGD) as its optimiser for 200 epochs, with 192 samples per batch. The learning rate is fixed to 0.1. Compared to anomaly detection, more semantic information is required for clustering algorithms to work well, so the network consists of 1,024 hidden units and is trained for 1,000 epochs. |
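The setup row above pins down the anomaly-detection configuration (one 50-unit fully connected layer, leaky-ReLU, SGD, 200 epochs, batch size 192, learning rate 0.1). A minimal NumPy sketch of that configuration is given below. It is an illustration, not the authors' released code: the toy data, the use of a fixed random linear mapping `eta`, and the reading of "random distances" as pairwise inner products of `eta`'s outputs are all assumptions of this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in data; the real experiments use the tabular datasets of Table 1.
n, d, h = 192, 20, 50        # batch of 192 samples, 50 hidden units (as in the table)
X = rng.standard_normal((n, d))

# Fixed random mapping eta: pairwise inner products of its outputs serve as the
# "random distance" regression targets (an assumption of this sketch).
W_rand = rng.standard_normal((d, h)) / np.sqrt(d)
E = X @ W_rand
target = E @ E.T             # n x n matrix of random inner products

# Trainable network: one fully connected layer followed by leaky-ReLU.
W = rng.standard_normal((d, h)) * 0.01

def leaky_relu(z, alpha=0.01):
    return np.where(z > 0, z, alpha * z)

lr, epochs = 0.1, 200        # fixed learning rate 0.1, 200 epochs (as in the table)
init_loss = np.mean(target ** 2)
for _ in range(epochs):
    pre = X @ W
    Z = leaky_relu(pre)
    diff = Z @ Z.T - target  # error on the predicted pairwise "distances"
    # Manual SGD step: gradient of the mean squared error via the chain rule.
    grad_Z = (4.0 / n ** 2) * diff @ Z
    grad_pre = grad_Z * np.where(pre > 0, 1.0, 0.01)
    W -= lr * (X.T @ grad_pre)

Z = leaky_relu(X @ W)
final_loss = np.mean((Z @ Z.T - target) ** 2)
```

For the clustering configuration described in the same row, only the hyperparameters change under this sketch (1,024 hidden units, 1,000 epochs); the single-layer architecture and optimiser stay the same.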