Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Fraud-Proof Revenue Division on Subscription Platforms

Authors: Abheek Ghosh, Tzeh Yuan Neoh, Nicholas Teh, Giannis Tyrovolas

ICML 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Finally, experiments with both real-world and synthetic streaming data support SCALEDUSERPROP as a fairer alternative compared to existing rules.
Researcher Affiliation	Academia	¹University of Oxford, UK; ²Harvard University, USA.
Pseudocode	No	The paper describes methods and rules textually but does not contain a dedicated 'Pseudocode' or 'Algorithm' block.
Open Source Code	Yes	Our code is accessible at https://github.com/nicteh/Fraud-Proof-Revenue-Division.
Open Datasets	Yes	We utilize data from the Music Listening Histories Dataset (Vigliensoni & Fujinaga, 2017), that contains the listening history of approximately 583,000 users, 439,000 artists, and a cumulative total of 27 billion listening events (i.e., user-artist interactions).
Dataset Splits	No	The paper describes how synthetic datasets were generated and uses a real-world dataset, but does not provide specific training/test/validation splits for either.
Hardware Specification	No	The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments.
Software Dependencies	No	The paper does not provide specific ancillary software details, such as library names with version numbers, used to replicate the experiment.
Experiment Setup	Yes	We generate synthetic problem instances involving 10,000 users and 1,000 artists. For each user, we first determine the number of artists they interact with by drawing a value uniformly at random from the range [1, 100]. ... For each chosen artist, the number of times the user streams their music is sampled from a Poisson distribution with λ = 1. We repeat the experiments 100 times. ... we analyze the top and bottom few users based on their pay-per-stream (PPS) relative to GLOBALPROP's PPS, as the revenue share ( ) varies.
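
The synthetic setup quoted in the Experiment Setup row can be sketched in a few lines of Python. This is a rough illustration, not the authors' code (which is at the repository linked above). Several details are assumptions, because the quoted text elides them: the range [1, 100] is treated as inclusive integers, each user's artists are chosen uniformly at random without replacement, and zero stream counts drawn from the Poisson distribution are kept as they are. The names `sample_poisson` and `generate_instance` are hypothetical.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def generate_instance(num_users=10_000, num_artists=1_000,
                      max_artists_per_user=100, lam=1.0, seed=0):
    """Return one synthetic instance as a list of {artist_id: stream_count} dicts."""
    rng = random.Random(seed)
    instance = []
    for _ in range(num_users):
        # Number of artists the user interacts with, uniform on 1..100
        # (ASSUMPTION: inclusive integer range).
        k = rng.randint(1, max_artists_per_user)
        # ASSUMPTION: artists are picked uniformly without replacement;
        # the quoted setup elides how they are chosen.
        chosen = rng.sample(range(num_artists), k)
        # Stream count per chosen artist ~ Poisson(lam); zeros are kept
        # (ASSUMPTION: the source does not say how zero draws are handled).
        instance.append({artist: sample_poisson(lam, rng) for artist in chosen})
    return instance
```

Calling `generate_instance(seed=s)` for 100 different values of `s` would mirror the 100 repetitions mentioned in the table, under the same assumptions.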