Building Personalized Simulator for Interactive Search
Authors: Qianlong Liu, Baoliang Cui, Zhongyu Wei, Baolin Peng, Haikuan Huang, Hongbo Deng, Jianye Hao, Xuanjing Huang, Kam-Fai Wong
IJCAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Experimental results based on real-world dataset demonstrate the effectiveness of our agent and personalized simulator." The paper also contains Section 5, Experiments: 5.1 Dataset and Abstract Query; 5.2 Experiment Setup; 5.3 Experimental Results. |
| Researcher Affiliation | Collaboration | (1) Fudan University, China; (2) Alibaba Group, China; (3) The Chinese University of Hong Kong, Hong Kong; (4) Tianjin University, China |
| Pseudocode | Yes | Algorithm 1: Building real experience buffer with abstract query; Algorithm 2: Rollout Simulation; Algorithm 3: Training Algorithm |
| Open Source Code | No | The paper does not provide any explicit statement or link indicating the open-sourcing of the code for the described methodology. |
| Open Datasets | No | Our dataset is derived from the log of Taobao APP (footnote: the largest e-commerce platform in China), which is processed into transition tuples, i.e., (s, a, r, s′, trmt). |
| Dataset Splits | Yes | Table 1: Description of our dataset, of which 60% for training, 20% for testing, 20% for validation. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU models, CPU types, or memory specifications. |
| Software Dependencies | No | The paper mentions 'word2vec' but does not specify its version or any other software dependencies with version numbers. |
| Experiment Setup | Yes | Implementation Details. The simulator is pre-trained before interacting with our agent and is kept fixed while interacting with the agent. Each word is represented by a word embedding (200-dim) trained on the historical queries from other scenarios of Taobao APP (12.7 million queries, 332,922 words) via word2vec. ϵ = 0.2, γ = 0.9, m = 5, N_s = 5, T = 20; the learning rate is set to 10^-5 and 10^-3 for the training of the environment simulator and the agent, respectively. At each turn, the agent recommends 3 tags to the user (i.e., K = 3). The target network of the agent is updated every 200 steps. The hidden size (both bidirectional LSTM and fully connected layers) of the simulator and the agent is 5 and 10. The length of the simulated experience buffer D_s is 1000. |
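
Since no code is released, the reported settings can be collected into a single configuration object for anyone attempting a reimplementation. The following is a minimal sketch, not the authors' code: the class name `ExperimentSetup`, all field names, and the helper `split_sizes` are hypothetical. It assumes that the hidden sizes "5 and 10" refer to the simulator and the agent in that order, and that the learning rates are 10^-5 (simulator) and 10^-3 (agent). The symbols m, N_s, and T are kept under their paper names because the excerpt above does not define them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentSetup:
    """Hyperparameters quoted in the Experiment Setup row (field names are illustrative)."""
    embedding_dim: int = 200          # word2vec embedding size
    epsilon: float = 0.2              # ϵ as reported
    gamma: float = 0.9                # γ as reported
    m: int = 5                        # m as reported (meaning not given in the excerpt)
    n_s: int = 5                      # N_s as reported (meaning not given in the excerpt)
    t: int = 20                       # T as reported (meaning not given in the excerpt)
    simulator_lr: float = 1e-5        # learning rate for the environment simulator
    agent_lr: float = 1e-3            # learning rate for the agent
    k: int = 3                        # tags recommended per turn (K)
    target_update_steps: int = 200    # target network update interval
    simulator_hidden_size: int = 5    # assumption: first of "5 and 10"
    agent_hidden_size: int = 10       # assumption: second of "5 and 10"
    simulated_buffer_size: int = 1000 # length of D_s


def split_sizes(n: int) -> tuple[int, int, int]:
    """Return (train, test, validation) counts for the reported 60/20/20 split.

    Integer arithmetic avoids floating-point rounding; any remainder
    is assigned to the validation set so the counts always sum to n.
    """
    train = n * 60 // 100
    test = n * 20 // 100
    validation = n - train - test
    return train, test, validation


if __name__ == "__main__":
    cfg = ExperimentSetup()
    print(cfg)
    print(split_sizes(1000))
```

The dataclass is frozen so that a run cannot silently mutate a setting that is meant to match the paper; to try a different value, create a new instance with `dataclasses.replace`.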