World of Bits: An Open-Domain Platform for Web-Based Agents
Authors: Tianlin Shi, Andrej Karpathy, Linxi Fan, Jonathan Hernandez, Percy Liang
ICML 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we show that agents trained via behavioral cloning and reinforcement learning can complete a range of web-based tasks. [...] 4. Experiments Our goal in this section is to establish baselines that current techniques provide on web environments, and highlight the challenges for future work in this area. |
| Researcher Affiliation | Collaboration | ¹Stanford University, Stanford, USA; ²OpenAI, San Francisco, USA. Correspondence to: Tianlin (Tim) Shi <tianlin@cs.stanford.edu>. |
| Pseudocode | No | The paper describes methods like behavior cloning and reinforcement learning (A3C) but does not provide any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper states: "To interact with a web browser, we developed our platform on top of OpenAI Universe (http://universe.openai.com/)", which refers to a third-party platform the authors used, not their own source code for the methodology. There is no explicit statement or link providing access to their own source code. |
| Open Datasets | No | The paper describes the creation of datasets such as MiniWoB, FormWoB, and QAWoB, and details their characteristics and collection methods (e.g., "Our crowdsourced QAWoB dataset has 521 query templates."). However, it does not provide any specific link, DOI, repository name, or formal citation for public access to these datasets. |
| Dataset Splits | No | The paper states: "We split the tasks on each website into 80% for training, and 20% for testing." It does not explicitly mention a separate validation set split. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper mentions various software components and libraries such as "OpenAI Universe", "Gym", "Chrome browser inside a Docker container", and optimization algorithms like "Adam" and "A3C". However, it does not provide specific version numbers for any of these software dependencies. |
| Experiment Setup | Yes | We obtain a behavior cloning policy by training on the demonstrations using Adam (Kingma & Ba, 2014) with a learning rate of $10^{-3}$ and batch size of 32. We achieved better results by weighing click and keyboard event losses (which are rare compared to move events) 10 times higher in the objective. [...] We run 12 environments in parallel at 12 FPS for up to 1 million steps and perform an update every 200 time steps (i.e. training batches have size $12 \times 200 = 2400$ steps) with Adam and a learning rate of $10^{-4}$. [...] We use similar supervised learning setting as in MiniWoB, except the learning rate is $10^{-4}$ and the keyboard event losses are weighted 20 times higher. For every episode, we sample randomly from the set of queries and run the model at 8 FPS. |
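
The event-loss weighting and the batch-size arithmetic quoted in the Experiment Setup row can be illustrated with a minimal sketch. This is not the authors' code: the event-type names (`move`, `click`, `key`), the function names, and the choice to average over the number of steps rather than the sum of weights are all assumptions made for illustration. Only the 10x weight on click and keyboard events and the 12 × 200 figure come from the quoted text.

```python
import math

# Assumed event-type labels; the quoted setup only says click and keyboard
# events are rare relative to move events and are weighted 10x.
EVENT_WEIGHTS = {"move": 1.0, "click": 10.0, "key": 10.0}


def weighted_nll(probs, event_types, weights=EVENT_WEIGHTS):
    """Weighted negative log-likelihood of demonstrated actions.

    probs[i] is the probability the policy assigned to the demonstrated
    action at step i; event_types[i] names that action's event type.
    Averaging over the step count is an assumption, not from the paper.
    """
    if len(probs) != len(event_types):
        raise ValueError("probs and event_types must have the same length")
    total = 0.0
    for p, kind in zip(probs, event_types):
        total += weights[kind] * -math.log(p)
    return total / len(probs)


def rollout_batch_size(num_envs=12, steps_per_update=200):
    """Steps per update when every parallel environment contributes equally."""
    return num_envs * steps_per_update
```

For the FormWoB setting quoted above, the same sketch would apply with a keyboard weight of 20, for example by passing `weights={"move": 1.0, "click": 1.0, "key": 20.0}`; the click weight in that setting is not stated in the quoted text, so the value 1.0 here is a placeholder.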