Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. [K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026].

AugSplicing: Synchronized Behavior Detection in Streaming Tensors

Authors: Jiabao Zhang, Shenghua Liu, Wenting Hou, Siddharth Bhatia, Huawei Shen, Wenjian Yu, Xueqi Cheng

Pages: 4653-4661

AAAI 2021 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We design the experiments to answer the following questions: Q1. Speed and Accuracy: How fast and accurate does our algorithm run compared to the state-of-the-art streaming algorithms and the re-run of batch algorithms on real data? Q2. Real-World Effectiveness: Which anomalies or lockstep behavior does AUGSPLICING spot in real data? Q3. Scalability: How does the running time of our algorithm increase as input tensor grows?
Researcher Affiliation	Collaboration	Jiabao Zhang1, Shenghua Liu1 #, Wenting Hou2, Siddharth Bhatia3, Huawei Shen1, Wenjian Yu4 #, Xueqi Cheng1 1 Institute of Computing Technology, Chinese Academy of Sciences 2 Beijing Innov Sharing Co.Ltd 3 National University of Singapore 4 Dept. Computer Science & Tech., Tsinghua University
Pseudocode	Yes	Algorithm 1 Splice two dense blocks
Open Source Code	Yes	Reproducibility: Our code and datasets are publicly available at https://github.com/BGT-M/AugSplicing.
Open Datasets	Yes	Two rating data are publicly available. App data is mobile device-app installation and uninstallation data under an NDA agreement from a company. Wi-Fi data is device-AP connection and disconnection data from Tsinghua University.
Dataset Splits	No	The paper describes a streaming tensor setting where data arrives incrementally, but it does not specify traditional train/validation/test dataset splits with percentages, sample counts, or predefined split citations.
Hardware Specification	Yes	All experiments are carried out on a 2.3GHz Intel Core i5 CPU with 8GB memory.
Software Dependencies	No	The paper mentions that 'D-CUBE is implemented in Java' but does not provide specific version numbers for Java or any other software dependencies, libraries, or solvers used in the experiments.
Experiment Setup	Yes	We set time stride s to 30 in a day for Yelp, 15 in a day for BeerAdvocate, 1 in a day for App and Wi-Fi, as different time granularity. k is set to 10 and l to 5 for all datasets.