Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
ScreenAgent: A Vision Language Model-driven Computer Control Agent
Authors: Runliang Niu, Jindong Li, Shiqi Wang, Yali Fu, Xiyu Hu, Xueyuan Leng, He Kong, Yi Chang, Qi Wang
IJCAI 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Additionally, we construct the ScreenAgent Dataset, which collects screenshots and action sequences when completing daily computer tasks. Finally, we train a model, ScreenAgent, which achieves comparable computer control capabilities to GPT-4V and demonstrated more precise UI positioning capabilities. |
| Researcher Affiliation | Academia | 1 School of Artificial Intelligence, Jilin University 2 Engineering Research Center of Knowledge-Driven Human-Machine Intelligence, Ministry of Education, China EMAIL, EMAIL |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code and more detailed information are at https://github.com/niuzaisheng/ScreenAgent. |
| Open Datasets | Yes | The dataset has 273 complete task sessions, with 203 sessions (3005 screenshots) for training and 70 sessions (898 screenshots) for testing. |
| Dataset Splits | No | The paper explicitly mentions training and testing splits for their dataset, but does not provide details for a validation split for their dataset, nor for the other datasets used for fine-tuning. |
| Hardware Specification | No | The paper does not explicitly describe the hardware used for running its experiments, such as specific GPU or CPU models. |
| Software Dependencies | No | The paper does not provide specific software dependencies (e.g., library names with version numbers) needed to replicate the experiment. |
| Experiment Setup | No | The paper mentions fine-tuning a model and data mixing for training phases, but it does not provide specific experimental setup details such as hyperparameters (e.g., learning rate, batch size, epochs) or optimizer settings. |