Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Steering Social Activity: A Stochastic Optimal Control Point of View
Authors: Ali Zarezade, Abir De, Utkarsh Upadhyay, Hamid R. Rabiee, Manuel Gomez-Rodriguez
JMLR 2017 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Finally, we experiment both with synthetic and real data gathered from Twitter and show that our algorithms consistently steer social activity more effectively than the state of the art." and Section 5, "Experiments" |
| Researcher Affiliation | Academia | Ali Zarezade EMAIL Sharif University of Technology Teheran, Iran, Abir De EMAIL Max Planck Institute for Software Systems Kaiserslautern, Germany, Utkarsh Upadhyay EMAIL Max Planck Institute for Software Systems Kaiserslautern, Germany, Hamid R. Rabiee EMAIL Sharif University of Technology Teheran, Iran, Manuel Gomez-Rodriguez EMAIL Max Planck Institute for Software Systems Kaiserslautern, Germany |
| Pseudocode | Yes | Algorithm 1: Red Queen for fixed s, q and one follower. Algorithm 2: Optimal posting times with an oracle. Algorithm 3: Cheshire: it returns user i and time τ for the next incentivized action |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. It does not contain an explicit statement about releasing the code or a link to a code repository. |
| Open Datasets | Yes | We use data gathered from Twitter as reported in previous work (Cha et al., 2010), which comprises profiles of 52 million users, 1.9 billion directed follow links among these users, and 1.7 billion public tweets posted by the collected users. and To create each data set, we used the Twitter search API to collect all the tweets (corresponding to a 2-3 weeks period around the event date) that contain hashtags related to: [...] We then built the follower-followee network for the users that posted the collected tweets using the Twitter rest API. |
| Dataset Splits | No | The paper describes using historical Twitter data for evaluation and mentions "20 simulation runs" for robustness, but it does not specify explicit training/validation/test splits for the datasets to reproduce the experiment in a standard machine learning context. |
| Hardware Specification | Yes | The experiments are carried out in a single machine with 24 cores and 64 GB of main memory. |
| Software Dependencies | No | The paper mentions using "Ogata's thinning algorithm (Ogata, 1981)" and "Lewis and Shedler (1979)" for sampling, and the "Twitter search API" and "Twitter rest API" for data collection. However, it does not provide specific version numbers for any software libraries, frameworks, or APIs. |
| Experiment Setup | Yes | Unless otherwise stated, we set the significance s_i(t) = 1, ∀t, ∀i and use the parameter q to control the number of posts by Red Queen. For each network, we draw B from a uniform distribution U(0, 10), λ0 also from a uniform distribution U(0, 10) for 20% of the nodes and λ0 = 0 for the remaining 80%, and set ω = 16, t0 = 0 and tf = 5.5. For each data set, we estimate the influence matrix B of the multidimensional Hawkes process defined by Eq. 2 using maximum likelihood, as elsewhere (Farajtabar et al., 2014; Valera and Gomez-Rodriguez, 2015). Moreover, we set the decay parameter ω of the corresponding exponential triggering kernel κ(t) by cross-validation. and we tune the parameters Q, S and F to be diagonal matrices such that the total number of incentivized tweets posted by our method is equal to the budget used in the state of the art methods and baselines. |
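The paper's experiments simulate Hawkes processes via Ogata's thinning algorithm with an exponential triggering kernel (decay ω = 16, horizon [t0, tf] = [0, 5.5]). As a minimal illustration of that simulation step — a univariate sketch, not the authors' multidimensional implementation; the λ0 and excitation values below are illustrative assumptions, not taken from the paper:

```python
import math
import random


def sample_hawkes(lam0, alpha, omega, t0, tf, rng):
    """Sample event times of a univariate Hawkes process on [t0, tf]
    via Ogata's thinning. Intensity: lam0 + sum(alpha*exp(-omega*(t - ti))).
    """
    t, events = t0, []
    while True:
        # Between events the intensity only decays, so its value just
        # after time t upper-bounds it until the next candidate point.
        lam_bar = lam0 + sum(alpha * math.exp(-omega * (t - ti)) for ti in events)
        t += rng.expovariate(lam_bar)  # candidate inter-event time
        if t > tf:
            break
        lam_t = lam0 + sum(alpha * math.exp(-omega * (t - ti)) for ti in events)
        if rng.random() * lam_bar <= lam_t:  # accept with prob lam_t / lam_bar
            events.append(t)
    return events


# Illustrative parameters: omega = 16 as in the paper's synthetic setup;
# lam0 and alpha are assumed values (alpha/omega = 0.5 keeps the process stable).
rng = random.Random(0)
events = sample_hawkes(lam0=1.0, alpha=8.0, omega=16.0, t0=0.0, tf=5.5, rng=rng)
```

In the paper's multidimensional setting, the excitation term is governed by the influence matrix B (drawn from U(0, 10) on synthetic networks, or fitted by maximum likelihood on real data) rather than a single scalar α.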