Unsupervised Representation Learning by Predicting Random Distances
Authors: Hu Wang, Guansong Pang, Chunhua Shen, Congbo Ma
IJCAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical results on 19 real-world datasets show that our learned representations substantially outperform a few state-of-the-art methods for both anomaly detection and clustering tasks. This section evaluates our method through two typical unsupervised tasks: anomaly detection and clustering. |
| Researcher Affiliation | Academia | Hu Wang, Guansong Pang, Chunhua Shen, Congbo Ma, The University of Adelaide, Australia |
| Pseudocode | No | The paper describes the RDP framework but does not include any explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at: https://git.io/RDP |
| Open Datasets | Yes | As shown in Table 1, 14 publicly available datasets taken from the literature [Liu et al., 2008; Pang et al., 2018; Zong et al., 2018], are used, which are from various domains, including network intrusion, credit card fraud detection, and disease detection. |
| Dataset Splits | No | The paper mentions public datasets and statistical procedures like averaging over runs for stability, but does not explicitly provide train/validation/test split percentages, sample counts, or specific predefined splits for reproduction. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments. |
| Software Dependencies | No | The paper mentions using SGD as an optimizer and K-means, but does not specify version numbers for any software dependencies or libraries used. |
| Experiment Setup | Yes | The RDP consists of one fully connected layer with 50 hidden units, followed by a leaky-ReLU layer. It is trained using Stochastic Gradient Descent (SGD) as its optimiser for 200 epochs, with 192 samples per batch. The learning rate is fixed to 0.1. Compared to anomaly detection, more semantic information is required for clustering algorithms to work well, so the network consists of 1,024 hidden units and is trained for 1,000 epochs. |
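The setup row above pins down the anomaly-detection configuration (one 50-unit fully connected layer, leaky-ReLU, SGD, 200 epochs, batch size 192, learning rate 0.1). A minimal NumPy sketch of that configuration is given below. It is an illustration, not the authors' released code: the toy data, the use of a fixed random linear mapping `eta`, and the reading of "random distances" as pairwise inner products of `eta`'s outputs are all assumptions of this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in data; the real experiments use the tabular datasets of Table 1.
n, d, h = 192, 20, 50        # batch of 192 samples, 50 hidden units (as in the table)
X = rng.standard_normal((n, d))

# Fixed random mapping eta: pairwise inner products of its outputs serve as the
# "random distance" regression targets (an assumption of this sketch).
W_rand = rng.standard_normal((d, h)) / np.sqrt(d)
E = X @ W_rand
target = E @ E.T             # n x n matrix of random inner products

# Trainable network: one fully connected layer followed by leaky-ReLU.
W = rng.standard_normal((d, h)) * 0.01

def leaky_relu(z, alpha=0.01):
    return np.where(z > 0, z, alpha * z)

lr, epochs = 0.1, 200        # fixed learning rate 0.1, 200 epochs (as in the table)
init_loss = np.mean(target ** 2)
for _ in range(epochs):
    pre = X @ W
    Z = leaky_relu(pre)
    diff = Z @ Z.T - target  # error on the predicted pairwise "distances"
    # Manual SGD step: gradient of the mean squared error via the chain rule.
    grad_Z = (4.0 / n ** 2) * diff @ Z
    grad_pre = grad_Z * np.where(pre > 0, 1.0, 0.01)
    W -= lr * (X.T @ grad_pre)

Z = leaky_relu(X @ W)
final_loss = np.mean((Z @ Z.T - target) ** 2)
```

For the clustering configuration described in the same row, only the hyperparameters change under this sketch (1,024 hidden units, 1,000 epochs); the single-layer architecture and optimiser stay the same.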