Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Spotting Collective Behaviour of Online Frauds in Customer Reviews

Authors: Sarthika Dhawan, Siva Charan Reddy Gangireddy, Shiv Kumar, Tanmoy Chakraborty

IJCAI 2019 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experiments on four real-world labeled datasets (two of them were prepared by us) show that DeFrauder significantly outperforms five baselines: it beats the best baseline by 11.35% higher accuracy for detecting groups, and 17.11% higher NDCG@50 for ranking groups (averaged over all datasets).
Researcher Affiliation	Academia	Sarthika Dhawan¹, Siva Charan Reddy Gangireddy¹, Shiv Kumar² and Tanmoy Chakraborty¹; ¹Indraprastha Institute of Information Technology Delhi (IIITD), India; ²Netaji Subhas University of Technology (NSUT), Delhi, India; EMAIL, EMAIL, EMAIL
Pseudocode	Yes	Algorithm 1: ExtractGroups
Open Source Code	Yes	DeFrauder: Detecting Fraud Reviewer Groups. Code is available in [Dhawan et al., 2019].
Open Datasets	Yes	We collected four real-world datasets: YelpNYC: hotel/restaurant reviews of New York city [Rayana and Akoglu, 2015]; YelpZip: aggregation of reviews on restaurants/hotels from a number of areas with continuous zip codes starting from New York city [Rayana and Akoglu, 2015]; Amazon: reviews on musical instruments [He and McAuley, 2016]; and Playstore: reviews of different applications available on Google Playstore.
Dataset Splits	No	The paper describes the datasets used and evaluation metrics, but does not explicitly provide details about how the datasets were split into training, validation, and test sets for reproduction.
Hardware Specification	No	The paper does not provide specific details about the hardware (e.g., CPU, GPU models, memory, cloud instances) used to conduct the experiments.
Software Dependencies	No	The paper mentions using 'Word2Vec [Mikolov et al., 2013]' and 'Node2Vec [Grover and Leskovec, 2016]' for embedding, but does not provide specific version numbers for these or any other software dependencies.
Experiment Setup	Yes	ExtractGroups achieves best results with τ_t = 20 and τ_r = (max − min)20% (see Sec. 5.3).