ADMoE: Anomaly Detection with Mixture-of-Experts from Noisy Labels
Authors: Yue Zhao, Guoqing Zheng, Subhabrata Mukherjee, Robert McCann, Ahmed Awadallah
AAAI 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive results on eight datasets (including a proprietary enterprise security dataset) demonstrate the effectiveness of ADMoE, where it brings up to 34% performance improvement over not using it. |
| Researcher Affiliation | Collaboration | Yue Zhao¹*, Guoqing Zheng², Subhabrata Mukherjee², Robert McCann², Ahmed Awadallah². ¹Carnegie Mellon University, ²Microsoft. zhaoy@cmu.edu, {zheng, submukhe, robmccan, hassanam}@microsoft.com |
| Pseudocode | No | The paper describes the ADMoE framework and its components (e.g., MoE layers, gating function, loss function) using textual descriptions and diagrams (like Figure 2) but does not include any explicit pseudocode or algorithm blocks. (A minimal illustrative MoE sketch appears below the table.) |
| Open Source Code | Yes | See code and appendix: https://github.com/microsoft/admoe |
| Open Datasets | Yes | As shown in Table 2, we evaluate ADMoE on seven public datasets adapted from AD repositories (Campos et al. 2016; Han et al. 2022) and a proprietary enterprise-security dataset (with t = 3 sets of noisy labels). |
| Dataset Splits | Yes | For methods with built-in randomness, we run four independent trials and take the average, with a fixed dataset split (70% train, 25% test, 5% validation). (A split sketch appears below the table.) |
| Hardware Specification | No | The paper describes the experimental setup and training process but does not provide specific details on the hardware used, such as GPU models, CPU specifications, or memory. |
| Software Dependencies | No | The paper does not provide a list of software dependencies with specific version numbers (e.g., 'PyTorch 1.9', 'Python 3.8') needed for replication. |
| Experiment Setup | Yes | Backbone AD Algorithms, Model Capacity, and Hyperparameters. We show the generality of ADMoE to enhance (i) simple MLP and (ii) SOTA DeepSAD (Ruff et al. 2019). To ensure a fair comparison, we ensure all methods have the equivalent number of trainable parameters and FLOPs. See Appx. C.2 and code for additional settings. |
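
Since the paper provides no pseudocode, the following is a minimal, illustrative sketch of a Mixture-of-Experts layer with a learned gating function, matching the textual description referenced in the Pseudocode row. It is not the authors' implementation (see https://github.com/microsoft/admoe for that): the expert architecture, gate input, and dimensions are assumptions made only for illustration, and ADMoE's specific handling of multiple sets of noisy labels is not reproduced here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SimpleMoELayer(nn.Module):
    """A plain MoE layer: several small experts combined by a softmax gate."""

    def __init__(self, in_dim: int, hidden_dim: int, num_experts: int):
        super().__init__()
        # One expert per (hypothetical) noisy labeling source.
        self.experts = nn.ModuleList(
            nn.Linear(in_dim, hidden_dim) for _ in range(num_experts)
        )
        # Gating network: maps input features to a distribution over experts.
        self.gate = nn.Linear(in_dim, num_experts)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        gate_weights = F.softmax(self.gate(x), dim=-1)                   # (B, E)
        expert_outs = torch.stack([e(x) for e in self.experts], dim=1)   # (B, E, H)
        # Convex combination of expert outputs, weighted by the gate.
        return torch.einsum("be,beh->bh", gate_weights, expert_outs)


if __name__ == "__main__":
    layer = SimpleMoELayer(in_dim=16, hidden_dim=8, num_experts=3)
    out = layer(torch.randn(4, 16))
    print(out.shape)  # torch.Size([4, 8])
```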
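
For the evaluation protocol quoted in the Dataset Splits row, here is a hedged sketch of a fixed 70%/5%/25% train/validation/test split with four averaged trials. It assumes a stratified split via scikit-learn's `train_test_split` and synthetic data; neither choice is taken from the paper, and `train_and_score` is a hypothetical placeholder.

```python
import numpy as np
from sklearn.model_selection import train_test_split


def fixed_split(X, y, seed=0):
    """Fixed 70% train / 5% validation / 25% test split (stratified)."""
    # Carve off 70% for training; 30% remains held out.
    X_train, X_hold, y_train, y_hold = train_test_split(
        X, y, train_size=0.70, random_state=seed, stratify=y)
    # Split the held-out 30% into 5% validation (1/6) and 25% test (5/6).
    X_val, X_test, y_val, y_test = train_test_split(
        X_hold, y_hold, train_size=1 / 6, random_state=seed, stratify=y_hold)
    return (X_train, y_train), (X_val, y_val), (X_test, y_test)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 10))
    y = (rng.random(1000) < 0.1).astype(int)  # ~10% synthetic "anomalies"
    (Xtr, ytr), (Xva, yva), (Xte, yte) = fixed_split(X, y)
    print(len(Xtr), len(Xva), len(Xte))  # 700 50 250

    # For methods with built-in randomness: run four independent trials on the
    # same fixed split and report the mean (train_and_score is hypothetical).
    # scores = [train_and_score((Xtr, ytr), (Xte, yte), seed=s) for s in range(4)]
    # print(float(np.mean(scores)))
```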