Optimizing Interventions via Offline Policy Evaluation: Studies in Citizen Science
Authors: Avi Segal, Kobi Gal, Ece Kamar, Eric Horvitz, Grant Miller
AAAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We implemented TCI in the wild on Galaxy Zoo, one of the largest citizen science platforms on the web. We conducted two separate studies, both based on interventions performed in real time in the Galaxy Zoo domain, and found that TCI outperformed the state-of-the-art (myopic) intervention policy for this domain by considering the long-term effects of the intervention messages, significantly increasing the contributions of thousands of users. The policy correction step proved critical: the corrected policy achieved significant gains in user productivity when deployed in the live system, compared to the target policy generated by a version of TCI without the correction step. This work demonstrates the benefit of combining traditional AI planning with offline policy methods to generate intelligent intervention strategies. |
| Researcher Affiliation | Collaboration | Avi Segal, Kobi Gal Ben-Gurion University of the Negev, Israel Ece Kamar, Eric Horvitz Microsoft Research, Redmond WA Grant Miller University of Oxford, U.K. |
| Pseudocode | Yes | Algorithm 1: The TCI Approach; Algorithm 2: Policy Correction for Intervention Policy. (A hedged off-policy evaluation sketch follows this table.) |
| Open Source Code | No | Data and accompanying information to this paper can be found at http://tinyurl.com/ztujcvz. The statement mentions "accompanying information" but does not explicitly state that source code for the methodology is provided. |
| Open Datasets | Yes | The trajectory history consists of an expanded version of the dataset of randomized intervention trials collected from the study of Segal et al. (2016). This data is divided into training, validation and test sets as summarized in Table 2. Data and accompanying information to this paper can be found at http://tinyurl.com/ztujcvz. |
| Dataset Splits | Yes | This data is divided into training, validation and test sets as summarized in Table 2. Table 2 (Users / Interventions / Records): Training 2,302 / 3,265 / 245,695; Validation 1,722 / 1,730 / 114,788; Test 1,281 / 2,173 / 119,457 |
| Hardware Specification | Yes | The running time of computing the TCI optimized policy for the dataset in Table 2 on a MacBook Air (1.7 GHz Intel Core i7, 8 GB 1600 MHz DDR3) was 210.84 minutes. |
| Software Dependencies | No | The paper mentions the Particle Swarm Optimization (PSO) algorithm and the CPLEX solver (the latter in related work, not in their own implementation), but does not specify version numbers for any software dependencies used in the experiments. The implementation language is at most implied, and no specific libraries or versions are named. |
| Experiment Setup | Yes | In all studies, the discount factor δ was set to 0.95 (determined empirically). We used parameter values recommended by Pedersen et al. (2010) with a swarm size of 100 particles and a maximum of 40,000 fitness evaluation steps. Stopping was performed if the evaluation-steps limit was reached or if fitness did not improve in the last 100 iterations (set empirically). (See the PSO sketch after this table.) |
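The TCI and policy-correction procedures appear in the paper only as pseudocode (Algorithms 1 and 2). As a rough, non-authoritative illustration of the off-policy machinery the title refers to, the sketch below estimates the value of a candidate intervention policy from logged trajectories via per-trajectory importance sampling, using the paper's discount factor δ = 0.95. All identifiers (`Step`, `evaluate_policy`, `target_prob`) are assumptions for illustration, not the authors' code.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Step:
    state: tuple          # user/session features at decision time (hypothetical)
    action: int           # intervention chosen by the logging policy
    reward: float         # observed contribution signal
    logging_prob: float   # probability the logging policy assigned to `action`


def evaluate_policy(
    trajectories: Sequence[Sequence[Step]],
    target_prob: Callable[[tuple, int], float],
    delta: float = 0.95,
) -> float:
    """Off-policy value estimate via per-trajectory importance sampling.

    `delta` is the discount factor reported in the paper; everything else
    is an illustrative assumption, not the authors' implementation.
    """
    total = 0.0
    for traj in trajectories:
        weight, ret = 1.0, 0.0
        for t, step in enumerate(traj):
            # Reweight logged outcomes toward the candidate (target) policy.
            weight *= target_prob(step.state, step.action) / step.logging_prob
            ret += (delta ** t) * step.reward
        total += weight * ret
    return total / len(trajectories)
```

A correction step in the spirit of Algorithm 2 would then adjust the optimized policy before live deployment, for instance by keeping it close to regions of the state-action space that the logged data covers well; the paper reports that this correction was critical to the gains observed in the live system.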
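The experiment-setup row quotes a concrete optimization budget. The following is a minimal, self-contained particle swarm optimizer matching that budget (swarm of 100 particles, at most 40,000 fitness evaluations, early stop after 100 iterations without improvement). The coefficients `w`, `c1`, and `c2` are generic placeholders; the paper instead uses the values recommended by Pedersen et al. (2010), and `fitness` stands in for the unspecified off-policy objective.

```python
import numpy as np


def pso_minimize(fitness, dim, bounds, swarm_size=100,
                 max_evals=40_000, patience=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal PSO loop; w, c1, c2 are placeholder coefficients."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    pos = rng.uniform(lo, hi, size=(swarm_size, dim))
    vel = np.zeros_like(pos)
    # Per-particle bests and the global best so far.
    pbest = pos.copy()
    pbest_val = np.array([fitness(p) for p in pos])
    evals = swarm_size
    g = pbest_val.argmin()
    gbest, gbest_val = pbest[g].copy(), pbest_val[g]
    stale = 0
    while evals < max_evals and stale < patience:
        r1 = rng.random((swarm_size, dim))
        r2 = rng.random((swarm_size, dim))
        # Standard velocity update: inertia + cognitive + social terms.
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, lo, hi)
        vals = np.array([fitness(p) for p in pos])
        evals += swarm_size
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
        g = pbest_val.argmin()
        if pbest_val[g] < gbest_val:
            gbest, gbest_val = pbest[g].copy(), pbest_val[g]
            stale = 0  # global best improved this iteration
        else:
            stale += 1  # count iterations without improvement
    return gbest, gbest_val
```

As a quick check under the same stopping rules, `pso_minimize(lambda x: float(np.sum(x**2)), dim=8, bounds=(-5.0, 5.0))` minimizes a toy quadratic.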