Optimizing Interventions via Offline Policy Evaluation: Studies in Citizen Science
Authors: Avi Segal, Kobi Gal, Ece Kamar, Eric Horvitz, Grant Miller
AAAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We implemented TCI in the wild on Galaxy Zoo, one of the largest citizen science platforms on the web. We conducted two separate studies, both based on interventions performed in real time in the Galaxy Zoo domain, and found that TCI outperformed the state-of-the-art (myopic) intervention policy for this domain by considering the long-term effects of the intervention messages, significantly increasing the contributions of thousands of users. The policy correction step proved critical: the corrected policy achieved significant gains in user productivity when deployed in the live system, compared to the target policy generated by a version of TCI without the correction step. This work demonstrates the benefit of combining traditional AI planning with offline policy methods to generate intelligent intervention strategies. |
| Researcher Affiliation | Collaboration | Avi Segal, Kobi Gal Ben-Gurion University of the Negev, Israel Ece Kamar, Eric Horvitz Microsoft Research, Redmond WA Grant Miller University of Oxford, U.K. |
| Pseudocode | Yes | Algorithm 1: The TCI Approach; Algorithm 2: Policy Correction for Intervention Policy. (A hedged off-policy evaluation sketch follows this table.) |
| Open Source Code | No | Data and accompanying information to this paper can be found at http://tinyurl.com/ztujcvz. The statement mentions "accompanying information" but does not explicitly state that source code for the methodology is provided. |
| Open Datasets | Yes | The trajectory history consists of an expanded version of the dataset of randomized intervention trials collected from the study of Segal et al. (2016). This data is divided into training, validation and test sets as summarized in Table 2. Data and accompanying information to this paper can be found at http://tinyurl.com/ztujcvz. |
| Dataset Splits | Yes | This data is divided into training, validation and test sets as summarized in Table 2. Table 2 (Users / Interventions / Records): Training 2,302 / 3,265 / 245,695; Validation 1,722 / 1,730 / 114,788; Test 1,281 / 2,173 / 119,457 |
| Hardware Specification | Yes | The running time of computing the TCI optimized policy for the dataset in Table 2 on a MacBook Air (1.7 GHz Intel Core i7, 8 GB 1600 MHz DDR3) was 210.84 minutes. |
| Software Dependencies | No | The paper mentions the Particle Swarm Optimization (PSO) algorithm and the CPLEX solver (the latter in related work, not in their own implementation), but does not specify version numbers for any software dependencies used in the experiments. The implementation language is at most implied, and no specific libraries or versions are named. |
| Experiment Setup | Yes | In all studies, the discount factor δ was set to 0.95 (determined empirically). We used parameter values recommended by Pedersen et al. (2010) with a swarm size of 100 particles and a maximum of 40,000 fitness evaluation steps. Stopping was performed if the evaluation-steps limit was reached or if fitness did not improve in the last 100 iterations (set empirically). (See the PSO sketch after this table.) |
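The TCI and policy-correction procedures appear in the paper only as pseudocode (Algorithms 1 and 2). As a rough, non-authoritative illustration of the off-policy machinery the title refers to, the sketch below estimates the value of a candidate intervention policy from logged trajectories via per-trajectory importance sampling, using the paper's discount factor δ = 0.95. All identifiers (`Step`, `evaluate_policy`, `target_prob`) are assumptions for illustration, not the authors' code.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Step:
    state: tuple          # user/session features at decision time (hypothetical)
    action: int           # intervention chosen by the logging policy
    reward: float         # observed contribution signal
    logging_prob: float   # probability the logging policy assigned to `action`


def evaluate_policy(
    trajectories: Sequence[Sequence[Step]],
    target_prob: Callable[[tuple, int], float],
    delta: float = 0.95,
) -> float:
    """Off-policy value estimate via per-trajectory importance sampling.

    `delta` is the discount factor reported in the paper; everything else
    is an illustrative assumption, not the authors' implementation.
    """
    total = 0.0
    for traj in trajectories:
        weight, ret = 1.0, 0.0
        for t, step in enumerate(traj):
            # Reweight logged outcomes toward the candidate (target) policy.
            weight *= target_prob(step.state, step.action) / step.logging_prob
            ret += (delta ** t) * step.reward
        total += weight * ret
    return total / len(trajectories)
```

A correction step in the spirit of Algorithm 2 would then adjust the optimized policy before live deployment, for instance by keeping it close to regions of the state-action space that the logged data covers well; the paper reports that this correction was critical to the gains observed in the live system.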
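The experiment-setup row quotes a concrete optimization budget. The following is a minimal, self-contained particle swarm optimizer matching that budget (swarm of 100 particles, at most 40,000 fitness evaluations, early stop after 100 iterations without improvement). The coefficients `w`, `c1`, and `c2` are generic placeholders; the paper instead uses the values recommended by Pedersen et al. (2010), and `fitness` stands in for the unspecified off-policy objective.

```python
import numpy as np


def pso_minimize(fitness, dim, bounds, swarm_size=100,
                 max_evals=40_000, patience=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal PSO loop; w, c1, c2 are placeholder coefficients."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    pos = rng.uniform(lo, hi, size=(swarm_size, dim))
    vel = np.zeros_like(pos)
    # Per-particle bests and the global best so far.
    pbest = pos.copy()
    pbest_val = np.array([fitness(p) for p in pos])
    evals = swarm_size
    g = pbest_val.argmin()
    gbest, gbest_val = pbest[g].copy(), pbest_val[g]
    stale = 0
    while evals < max_evals and stale < patience:
        r1 = rng.random((swarm_size, dim))
        r2 = rng.random((swarm_size, dim))
        # Standard velocity update: inertia + cognitive + social terms.
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, lo, hi)
        vals = np.array([fitness(p) for p in pos])
        evals += swarm_size
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
        g = pbest_val.argmin()
        if pbest_val[g] < gbest_val:
            gbest, gbest_val = pbest[g].copy(), pbest_val[g]
            stale = 0  # global best improved this iteration
        else:
            stale += 1  # count iterations without improvement
    return gbest, gbest_val
```

As a quick check under the same stopping rules, `pso_minimize(lambda x: float(np.sum(x**2)), dim=8, bounds=(-5.0, 5.0))` minimizes a toy quadratic.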