Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Vector Database Watermarking

Authors: Zhiwen Ren, Wei Fan, Qiyi Yao, Jing Qiu, Weiming Zhang, Nenghai Yu

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experimental results show that compared to the current most effective and relevant watermarking schemes, the TVP scheme can significantly reduce the number of missed and false queries by approximately 75%. The paper includes a dedicated '7 Experiments' section which details experimental setup, comparison with other schemes, and parameter analysis. It also presents tables and figures showing quantitative results such as AMQ, AFQ, and BER on various datasets.
Researcher Affiliation	Academia	Zhiwen Ren, Wei Fan, Qiyi Yao, Jing Qiu, Weiming Zhang, Nenghai Yu. School of Cyber Science and Technology, University of Science and Technology of China, Hefei, China. Corresponding author: {qyyao@mail., zhangwm@}ustc.edu.cn.
Pseudocode	Yes	The watermark embedding process consists of four steps: generating vector identifier, vector grouping, cryptographic mapping and vector modification. The pseudo-code is shown in Algorithm 1 in Appendix B. Appendix B is titled 'Pseudo-code' and contains 'Algorithm 1: Watermark Embedding', 'Algorithm 2: Cryptographic Mapping', and 'Algorithm 3: Watermark Extraction'.
Open Source Code	Yes	Justification: I will provide all the code and datasets for the experiment.
Open Datasets	Yes	The data and environment used for the experiments are consistent with Section 6.1, i.e., the commonly used vector dataset ANN_SIFT1M [18] was implemented and experimented with using Python and Faiss libraries on PCs equipped with AMD Ryzen 5 5600G processors. [18] Laurent Amsaleg and Hervé Jégou. Datasets for approximate nearest neighbor search. http://corpus-texmex.irisa.fr/. 2010. Accessed: 2024-08-03.
Dataset Splits	No	The paper does not explicitly mention training/test/validation dataset splits. It focuses on evaluating the impact of watermarking on Approximate Nearest Neighbor (ANN) queries on existing vector databases, rather than a machine learning model training process that would typically require such splits.
Hardware Specification	Yes	The data and environment used for the experiments are consistent with Section 6.1, i.e., the commonly used vector dataset ANN_SIFT1M[18] was implemented and experimented with using Python and Faiss libraries on PCs equipped with AMD Ryzen 5 5600G processors.
Software Dependencies	No	The data and environment used for the experiments are consistent with Section 6.1, i.e., the commonly used vector dataset ANN_SIFT1M[18] was implemented and experimented with using Python and Faiss libraries on PCs equipped with AMD Ryzen 5 5600G processors. No specific version numbers for Python or Faiss are provided.
Experiment Setup	Yes	The default parameters for the experiment are M = 8, efConstruct = 100, and k = 100 neighbors per query. We evaluated the average number of missed queries (AMQ) and average number of false queries (AFQ) introduced by TVP under various HNSW parameter settings (M ∈ {4, 8, 12, 16} and efConstruct ∈ {50, 100, 150}), with the number of retrieved neighbors fixed at k = 100.
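
The Experiment Setup row describes a sweep over HNSW parameters with AMQ and AFQ as the metrics. The following is a minimal sketch of how such a sweep and such metrics could be organized. It is not the authors' code. The function names are hypothetical, and the metric definitions are an assumption, since the quoted text does not define them: a "missed" result is taken to be an ID returned before watermarking but not after, and a "false" result is one returned after but not before. Index construction and search (done with Faiss in the paper) are left out; the sketch only needs the lists of neighbor IDs returned for each query.

```python
from itertools import product

# Parameter values quoted in the Experiment Setup row.
M_VALUES = [4, 8, 12, 16]
EF_CONSTRUCT_VALUES = [50, 100, 150]
K = 100  # neighbors retrieved per query


def hnsw_grid():
    """Return every (M, efConstruct) pair in the evaluated grid."""
    return list(product(M_VALUES, EF_CONSTRUCT_VALUES))


def missed_and_false(original_ids, watermarked_ids):
    """Count missed and false results for a single query.

    ASSUMPTION (not stated in the source): 'missed' means an ID in the
    original result set that is absent after watermarking, and 'false'
    means an ID present after watermarking that was not in the original.
    """
    original = set(original_ids)
    watermarked = set(watermarked_ids)
    return len(original - watermarked), len(watermarked - original)


def average_missed_false(original_results, watermarked_results):
    """Average the per-query counts over all queries, giving (AMQ, AFQ)."""
    pairs = [
        missed_and_false(o, w)
        for o, w in zip(original_results, watermarked_results)
    ]
    if not pairs:
        raise ValueError("at least one query is required")
    n = len(pairs)
    amq = sum(missed for missed, _ in pairs) / n
    afq = sum(false for _, false in pairs) / n
    return amq, afq
```

In use, one would build an index for each pair from `hnsw_grid()`, run the same queries against the original and the watermarked database with `K` neighbors, and pass the two lists of result IDs to `average_missed_false`.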