Virtual-Taobao: Virtualizing Real-World Online Retail Environment for Reinforcement Learning
Authors: Jing-Cheng Shi, Yang Yu, Qing Da, Shi-Yong Chen, An-Xiang Zeng
Pages: 4902-4909
AAAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In experiments, Virtual-Taobao is trained from hundreds of millions of real Taobao customers' records. Compared with the real Taobao, Virtual-Taobao faithfully recovers important properties of the real environment. We further show that the policies trained purely in Virtual-Taobao, which has zero physical sampling cost, can have significantly superior real-world performance to the traditional supervised approaches, through online A/B tests. |
| Researcher Affiliation | Collaboration | Jing-Cheng Shi (1, 2), Yang Yu (1), Qing Da (2), Shi-Yong Chen (1), An-Xiang Zeng (2). (1) National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China, {shijc, yuy, chensy}@lamda.nju.edu.cn; (2) Alibaba Group, {jingcheng.sjc, daqing.dq}@alibaba-inc.com, renzhong@taobao.com |
| Pseudocode | Yes | Algorithm 1 GAN-SD |
| Open Source Code | No | No explicit statement or link providing access to source code for the described methodology was found. |
| Open Datasets | No | In experiments, Virtual-Taobao is trained from hundreds of millions of real Taobao customers' records. |
| Dataset Splits | No | No explicit train/validation/test dataset splits with percentages, counts, or references to predefined standard splits were found. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory specifications) used for running the experiments were mentioned. |
| Software Dependencies | No | No specific software dependencies with version numbers (e.g., 'PyTorch 1.9', 'Python 3.8') were mentioned. |
| Experiment Setup | Yes | We then retrain the TRPO agent with the ANC strategy in which ρ = 1 and µ = 0.01, and R2P is decreased to 0.115 in Virtual-Taobao which is more acceptable. |
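
The Experiment Setup row gives the ANC parameters ρ = 1 and µ = 0.01 but not the formula they enter. As a rough sketch only, assume ANC (action norm constraint) rescales the reward as r / (1 + ρ · max(‖a‖ − µ, 0)), so that actions whose norm exceeds µ are penalized. This form, the function name `anc_reward`, and the use of the Euclidean norm are all assumptions for illustration and are not stated in the table above.

```python
import math


def anc_reward(reward, action, rho=1.0, mu=0.01):
    """Rescale a reward by an assumed action-norm penalty.

    Hypothetical form: reward / (1 + rho * max(||action|| - mu, 0)).
    rho and mu default to the values quoted in the Experiment Setup row.
    """
    # Euclidean norm of the action vector (assumed choice of norm).
    norm = math.sqrt(sum(x * x for x in action))
    # No penalty while the norm stays at or below mu.
    excess = max(norm - mu, 0.0)
    return reward / (1.0 + rho * excess)


if __name__ == "__main__":
    # A zero action is left unpenalized; a larger action is scaled down.
    print(anc_reward(1.0, [0.0, 0.0]))
    print(anc_reward(1.0, [3.0, 4.0]))
```

Under this assumed form, a larger ρ penalizes large-norm actions more strongly, while µ sets the norm below which the reward is left unchanged.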