SemanticMask: A Contrastive View Design for Anomaly Detection in Tabular Data

Authors: Shuting Tao, Tongtian Zhu, Hongwei Wang, Xiangming Meng

IJCAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiment results validate the superiority of SemanticMask over the state-of-the-art anomaly detection methods and existing augmentation techniques for tabular data.
Researcher Affiliation | Academia | 1. College of Computer Science and Technology, Zhejiang University; 2. The Zhejiang University-University of Illinois Urbana-Champaign Institute, Zhejiang University
Pseudocode | No | The paper includes a block diagram (Figure 1) and mathematical formulations but does not provide any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | The source code and appendix are available on GitHub at https://github.com/TST826/SemanticMask.
Open Datasets | Yes | We conduct experiments on nine datasets with column names sourced from the Outlier Detection Data Sets (ODDS) [Rayana, 2016], the KEEL datasets [Derrac et al., 2015] and the UCI datasets [Markelle et al., 2013].
Dataset Splits | Yes | We train our method on a randomly selected 50% subset of the normal data. The validation set, consisting of 25% of the normal data, is used to determine the threshold. The methods are then tested on the remaining normal data and all anomalous samples.
Hardware Specification | No | The paper does not provide any specific hardware details such as GPU models, CPU types, or memory specifications used for conducting the experiments.
Software Dependencies | No | The paper mentions software components such as Sentence-BERT and the Adam optimizer but does not provide version numbers for any libraries, frameworks, or programming languages used in the implementation or experimentation.
Experiment Setup | Yes | For SemanticMask and its variants, λ is set to 0.5 and p_m is selected from the set {0.4, 0.5, 0.6}. For SemanticMask+description, ϵ is set to 0.1. The number of k-means clusters k is set proportionally to the feature dimension d: k = 2 for d < 18; k = 3 for 18 ≤ d < 100; and k = d/100 + 3 for complex datasets such as Arrhythmia [Rayana, 2016], where d ≥ 100. Features are partitioned into k clusters, forming two disjoint subsets of k/2 clusters each. The contrastive loss uses a constant temperature τ of 0.01. The threshold for identifying anomalies is the 85th quantile of the Mahalanobis distance on the validation set. The encoder is a multilayer perceptron with two hidden layers of 128 and 64 units and ReLU activations, trained with the Adam optimizer (learning rate 0.001, default values for other hyperparameters).
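The reported setup can be sketched in NumPy. This is a minimal illustration, not the authors' code: the encoder MLP is omitted, and the helper names `choose_k` and `mahalanobis_threshold` are hypothetical. It implements only the k-selection rule (k = 2 for d < 18, k = 3 for 18 ≤ d < 100, k = ⌊d/100⌋ + 3 otherwise) and the 85th-quantile Mahalanobis threshold computed on validation embeddings:

```python
import numpy as np

def choose_k(d):
    """Number of k-means clusters as a function of feature dimension d,
    following the rule reported in the paper."""
    if d < 18:
        return 2
    if d < 100:
        return 3
    return d // 100 + 3  # high-dimensional case, e.g. Arrhythmia (d >= 100)

def mahalanobis_threshold(z_val, quantile=85.0):
    """Anomaly threshold: the given quantile of Mahalanobis distances of
    validation embeddings z_val (shape: n_samples x embed_dim)."""
    mu = z_val.mean(axis=0)
    cov = np.cov(z_val, rowvar=False)
    cov_inv = np.linalg.pinv(cov)  # pseudo-inverse for numerical stability
    diff = z_val - mu
    # Squared Mahalanobis distance per sample: diff_i @ cov_inv @ diff_i
    dists = np.sqrt(np.einsum("ij,jk,ik->i", diff, cov_inv, diff))
    return np.percentile(dists, quantile)
```

At test time, a sample whose Mahalanobis distance to the validation statistics exceeds the returned threshold would be flagged as anomalous.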