Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
MTVHunter: Smart Contracts Vulnerability Detection Based on Multi-Teacher Knowledge Translation
Authors: Guokai Sun, Yuan Zhuang, Shuo Zhang, Xiaoyu Feng, Zhenguang Liu, Liguo Zhang
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct experiments on 229,178 real-world smart contracts that concerns four types of common vulnerabilities. Extensive experiments show MTVHunter achieves significantly performance gains over state-of-the-art approaches. |
| Researcher Affiliation | Academia | 1 College of Computer Science and Technology, Harbin Engineering University, Heilongjiang, China; 2 The State Key Laboratory of Blockchain and Data Security, Zhejiang University, Hangzhou, China; 3 Hangzhou High-Tech Zone (Binjiang) Institute of Blockchain and Data Security, Hangzhou, China |
| Pseudocode | No | No explicit pseudocode or algorithm blocks are provided in the paper. The methodology is described in prose. |
| Open Source Code | Yes | The codes are available at https://github.com/KDSCVD/MTVHunter. |
| Open Datasets | Yes | We collected 229,178 public smart contracts from the official Ethereum website. |
| Dataset Splits | Yes | Eventually, we manually labeled the ground truth in each category by auditing the source code of contracts, and split 1627 positive contracts and 5860 negative contracts. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments, such as CPU/GPU models or memory specifications. |
| Software Dependencies | No | Concretely, we first employ solc compiler to generate hexadecimal bytecode from source code, and then disassemble it into opcodes. Later, a CFG is constructed with the opcodes by an off-the-shelf symbolic execution solver, namely Octopus. The paper does not provide specific version numbers for these tools or any other software dependencies. |
| Experiment Setup | No | While the paper discusses various losses and hyperparameters (e.g., α and β for multi-knowledge loss, number of neurons for distillation), it does not provide concrete numerical values for these hyperparameters (e.g., learning rate, batch size, specific values for α and β, number of epochs) or details about the optimizer used, which are essential for reproducing the experimental setup. |
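
The Software Dependencies row quotes a preprocessing step in which hexadecimal bytecode is disassembled into opcodes. The sketch below illustrates what that step involves in general terms. It is not the authors' tooling: the `disassemble` function and the partial `OPCODES` table are hypothetical names introduced here, and only a small subset of EVM opcodes is mapped, so anything else is reported as unknown.

```python
# Minimal sketch of EVM bytecode disassembly (illustrative only).
# OPCODES is a deliberately partial table; `disassemble` is a
# hypothetical helper, not part of solc, Octopus, or MTVHunter.
OPCODES = {
    0x00: "STOP", 0x01: "ADD", 0x15: "ISZERO", 0x34: "CALLVALUE",
    0x50: "POP", 0x52: "MSTORE", 0x54: "SLOAD", 0x55: "SSTORE",
    0x56: "JUMP", 0x57: "JUMPI", 0x5B: "JUMPDEST",
    0xF1: "CALL", 0xF3: "RETURN", 0xFD: "REVERT",
}


def disassemble(hex_bytecode):
    """Return a list of (offset, mnemonic, operand) tuples."""
    if hex_bytecode.startswith("0x"):
        hex_bytecode = hex_bytecode[2:]
    code = bytes.fromhex(hex_bytecode)

    instructions = []
    pc = 0
    while pc < len(code):
        op = code[pc]
        if 0x60 <= op <= 0x7F:
            # PUSH1..PUSH32 carry 1..32 immediate bytes after the opcode.
            n = op - 0x5F
            operand = code[pc + 1 : pc + 1 + n]
            instructions.append((pc, f"PUSH{n}", "0x" + operand.hex()))
            pc += 1 + n
        elif 0x80 <= op <= 0x8F:
            instructions.append((pc, f"DUP{op - 0x7F}", None))
            pc += 1
        elif 0x90 <= op <= 0x9F:
            instructions.append((pc, f"SWAP{op - 0x8F}", None))
            pc += 1
        else:
            name = OPCODES.get(op, f"UNKNOWN_0x{op:02x}")
            instructions.append((pc, name, None))
            pc += 1
    return instructions


if __name__ == "__main__":
    # A five-byte prefix of the kind that often opens compiled contracts.
    for offset, name, operand in disassemble("0x6080604052"):
        print(offset, name, operand or "")
```

The byte offsets are kept alongside each mnemonic because jump targets in a control-flow graph are expressed as offsets into the bytecode. Since the paper gives no version numbers for its tools, output from a real toolchain may differ from this sketch.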