Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
ADPretrain: Advancing Industrial Anomaly Detection via Anomaly Representation Pretraining
Authors: Xincheng Yao, Yan Luo, Zefeng Qian, Chongyang Zhang
NeurIPS 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | For evaluation, based on five embedding-based AD methods, we simply replace their original features with our pretrained representations. Extensive experiments on five AD datasets and five backbones consistently show the superiority of our pretrained features. |
| Researcher Affiliation | Academia | Xincheng Yao¹, Yan Luo³,⁴, Zefeng Qian¹, Chongyang Zhang¹,²; ¹School of Information Science and Electronic Engineering, Shanghai Jiao Tong University; ²MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University; ³College of Artificial Intelligence, Nanjing Agricultural University; ⁴Key Laboratory of Livestock Farming Equipment, Ministry of Agriculture and Rural Affairs, Nanjing Agricultural University |
| Pseudocode | No | The paper describes its methodology using mathematical equations and figures (e.g., Figure 2: Framework overview) but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' block. |
| Open Source Code | Yes | The code is available at https://github.com/xcyao00/ADPretrain. |
| Open Datasets | Yes | We conduct extensive experiments on five AD datasets, including MVTec AD [2], VisA [56], BTAD [25], MVTec3D [4], and MPDD [18], to evaluate the effectiveness of our pretrained AD representations. ... Luckily, the proposal of the Real-IAD dataset [44] provides us with such a prerequisite. |
| Dataset Splits | Yes | All training and test images are resized and cropped to 224 × 224. ... For image-level anomaly detection, the standard metric in anomaly detection, AUROC, is used [33, 3, 2]. ... where we utilize only 10% normal samples from each downstream AD dataset for training. ... We follow the 2-shot and 4-shot settings in KAG-Prompt [38], the results are in Tab.5. |
| Hardware Specification | No | The paper mentions computational costs (parameters and FLOPs) for different backbone networks in Table 7 but does not specify the exact hardware (e.g., CPU/GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper states, 'We use Adam [28] optimizer with 1e-4 learning rate to train' and refers to reproducing methods based on their 'official open-source code', but it does not specify version numbers for any programming languages, libraries, or frameworks (e.g., Python, PyTorch, CUDA). |
| Experiment Setup | Yes | Implementation Details. Like most pretraining works, we employ multiple backbones for anomaly representation pretraining. ... We fix the parameters of the backbone networks, as preserving their basic visual representation capabilities is beneficial (see Tab.2(a)). We use Adam [28] optimizer with 1e-4 learning rate to train. The batch size is set as 32 and the total training epochs are 10. The temperature hyperparameter τ and margin r are set as 0.15 and 0.75. The N_r is set to 2048. We use 42 as the random seed during pretraining. All training and test images are resized and cropped to 224 × 224. ... When generating residual features, we match each input image with 8 reference samples. |
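
For readers attempting a reproduction, the hyperparameters quoted in the Experiment Setup and Dataset Splits rows can be gathered into a single configuration object. The sketch below is illustrative only and is not taken from the authors' repository: the class name `PretrainConfig`, every field name, and the helper `subsample_normals` are hypothetical. The paper excerpt does not say how the 10% subset of normal samples is drawn, so uniform random sampling is assumed here, and reusing the pretraining seed 42 for that sampling is also an assumption.

```python
from dataclasses import dataclass, asdict
import random


@dataclass(frozen=True)
class PretrainConfig:
    """Values quoted in the table above; all field names are hypothetical."""
    optimizer: str = "adam"
    learning_rate: float = 1e-4
    batch_size: int = 32
    epochs: int = 10
    temperature: float = 0.15        # tau in the paper
    margin: float = 0.75             # r in the paper
    n_r: int = 2048                  # N_r in the paper
    seed: int = 42                   # random seed used during pretraining
    image_size: int = 224            # images resized and cropped to 224 x 224
    num_reference_samples: int = 8   # references matched per input image
    freeze_backbone: bool = True     # backbone parameters are kept fixed


def subsample_normals(paths, fraction=0.1, seed=42):
    """Return a reproducible subset of normal training samples.

    Assumption: uniform sampling without replacement; the source only
    states that 10% of the normal samples are used.
    """
    rng = random.Random(seed)  # local generator, leaves global state untouched
    k = max(1, round(len(paths) * fraction))
    return sorted(rng.sample(list(paths), k))


config = PretrainConfig()
# Hypothetical file names standing in for one dataset's normal images.
all_normals = [f"img_{i:03d}.png" for i in range(200)]
subset = subsample_normals(all_normals, fraction=0.1, seed=config.seed)
```

Calling `asdict(config)` yields a plain dictionary that can be logged next to results. Because the table reports no hardware or library versions, recording those separately remains necessary for an exact rerun.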