Towards a Hierarchical Bayesian Model of Multi-View Anomaly Detection
Authors: Zhen Wang, Chao Lan
IJCAI 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The paper is experimental, stating: "In the experiment, we show the proposed Bayesian detector consistently outperforms state-of-the-art counterparts across several public data sets and three well-known types of multi-view anomalies." A second passage makes the same claim: "In the experiment, we show the proposed model consistently outperforms state-of-the-art multi-view anomaly detectors across both synthetic and real-world multi-view data." |
| Researcher Affiliation | Academia | Zhen Wang and Chao Lan Department of Computer Science, University of Wyoming, WY, USA {zwang10, clan}@uwyo.edu |
| Pseudocode | Yes | Algorithm 1 (Compute Optimal Threshold). Input: data {X, X′}, swapping rate γ, detection rate ζ. Output: detection threshold τ̂_ζ. 1: Generate mixture set X^γ by randomly swapping views. 2: Compute anomaly scores for all points in X and X^γ via Eq. (28), denoted S and S^γ respectively. 3: Calculate the empirical CDF F̂_a. 4: Optimize the threshold by τ̂_ζ = max{ s(x) ∈ S ∪ S^γ : F̂_a(s(x)) ≤ 1 − ζ }. |
| Open Source Code | No | The paper does not contain any explicit statements or links indicating that the source code for the methodology described is publicly available or released. |
| Open Datasets | Yes | "We now show the effectiveness of proposed method on public Outlier Detection Datasets (ODDS), WebKB dataset and MovieLens dataset." Dataset links: http://odds.cs.stonybrook.edu ; http://lig-membres.imag.fr/grimal/data.html ; https://grouplens.org/datasets/movielens/latest |
| Dataset Splits | No | The paper states: 'After the outlier generation stage, we equivalently split all normal instances into two parts, and use one of them as the training set to train the proposed model. Then we verify the outlier detection performance on the test set'. While a train/test split is described, there is no mention of a separate validation set or specific details for reproducibility of the split beyond 'equivalently split'. |
| Hardware Specification | No | The paper describes the proposed model and experimental evaluations, but it does not provide any specific details about the hardware (e.g., CPU, GPU models, memory, or cloud instances) used to run the experiments. |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers (e.g., programming languages, libraries, frameworks, or specific solvers). |
| Experiment Setup | Yes | In particular, we assign the automatic relevance determination (ARD) prior [Neal, 2012] on the projection matrices to sparsify their columns for automatically determining the dimension of the latent factor; we also place Student's t distributions on the latent factor prior and the likelihood to improve robustness of the estimator [Archambeau et al., 2006; Gai et al., 2008]. Since we have no further knowledge about the hyperparameters of priors, we choose broad ones by setting a_α = b_α = β_v = 10^-3, K_v = 10^-3 I_{d_v}, ν_v = d_v + 1, a_ν = 2 and b_ν = 0.1, m = min{d_v − 1; v = 1, ..., V}. On each dataset, we repeat the random outlier generation procedure 20 times and at each time, we perturb 2.5% of the data in that procedure. |
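The threshold step of Algorithm 1 quoted in the Pseudocode row (τ̂_ζ = max{ s(x) ∈ S ∪ S^γ : F̂_a(s(x)) ≤ 1 − ζ }) can be sketched as follows. This is a minimal illustration, not the authors' released code: the scoring function of Eq. (28) is out of scope here, so the sketch takes precomputed score arrays `scores_normal` (from X) and `scores_mixture` (from the view-swapped set X^γ) as inputs, both of which are assumed names.

```python
import numpy as np

def optimal_threshold(scores_normal, scores_mixture, zeta):
    """Pick the largest pooled anomaly score whose empirical-CDF
    value is at most 1 - zeta (the detection rate)."""
    # Pool scores from the normal set and the view-swapped mixture set.
    pooled = np.sort(np.concatenate([scores_normal, scores_mixture]))
    # Empirical CDF evaluated at each pooled score.
    cdf = np.arange(1, pooled.size + 1) / pooled.size
    # Threshold = max score still in the lower (1 - zeta) mass,
    # i.e. roughly the (1 - zeta)-quantile of the pooled scores.
    eligible = pooled[cdf <= 1.0 - zeta]
    return eligible.max() if eligible.size else pooled.min()

# Usage: with normal scores in [0, 1] and mixture scores in [1, 2],
# zeta = 0.5 places the threshold at the boundary between the two.
tau = optimal_threshold(np.linspace(0.0, 1.0, 100),
                        np.linspace(1.0, 2.0, 100), 0.5)
```

With this choice of τ̂_ζ, a fraction ζ of the pooled scores lies above the threshold, matching the role of the detection rate in the quoted algorithm.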