AD-VAT: An Asymmetric Dueling mechanism for learning Visual Active Tracking
Authors: Fangwei Zhong, Peng Sun, Wenhan Luo, Tingyun Yan, Yizhou Wang
ICLR 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The experiments are conducted in various 2D and 3D environments to study AD-VAT. The 2D environment is a matrix map where obstacles are randomly placed. In the 2D environments, we evaluate and quantify the effectiveness of our approach under ideal conditions, free from noise in observation and action. We also conduct an ablation study to show the effectiveness of the two key components, the partial zero-sum reward and the tracker-aware network (a simplified reward sketch follows the table). The 3D environments are built on Unreal Engine 4, a popular game engine for building high-fidelity environments. We choose a large room for training, in which the textures of the background/players and the illumination are randomized. Three realistic scenarios built by artists are used to further evaluate robustness. |
| Researcher Affiliation | Collaboration | Nat'l Eng. Lab. for Video Technology, Key Lab. of Machine Perception (MoE), Computer Science Dept., Peking University; Tencent AI Lab; Cooperative Medianet Innovation Center; Peng Cheng Lab; Deepwise AI Lab. Emails: {zfw, yanty18, yizhou.wang}@pku.edu.cn, {pengsun000, whluo.china}@gmail.com |
| Pseudocode | No | No pseudocode or algorithm blocks were explicitly labeled or formatted as such in the paper. |
| Open Source Code | No | The paper does not provide a direct link or explicit statement about the release of its own source code. It mentions using a PyTorch implementation of A3C and provides links for supplementary videos and the UnrealCV wrapper used, but not the AD-VAT implementation itself. |
| Open Datasets | Yes | The 3D environments are built on the Unreal Engine, which can flexibly simulate a photo-realistic world. We employ UnrealCV (Qiu et al., 2017), which provides convenient APIs, along with a wrapper (Zhong et al., 2017) compatible with OpenAI Gym (Brockman et al., 2016), for interactions between RL algorithms and the environment (see the interaction-loop sketch after the table). |
| Dataset Splits | Yes | Validation is performed in parallel, and the best validation network model is applied to report performance in testing environments. Note that the validation environment uses the same settings as training, except that the target is controlled by a Nav agent. Compared with the Ram agent, the Nav agent is more challenging and is thus more suitable for validation. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments (e.g., GPU models, CPU models, memory). It only mentions general setups like "multiple workers are running in parallel". |
| Software Dependencies | No | The paper mentions "A3C (Mnih et al., 2016), a commonly used reinforcement learning algorithm" and that "The code for A3C is based on a pytorch implementation (Griffis)", along with "UnrealCV (Qiu et al., 2017)" and "OpenAI Gym (Brockman et al., 2016)". However, it does not specify version numbers for these software components or for Python itself. |
| Experiment Setup | Yes | Hyper Parameters. For the tracker, the learning rates in the 2D and 3D environments are 0.001 and 0.0001, respectively. The reward discount factor γ = 0.9, the generalized advantage estimate parameter τ = 1.00, and the regularizer factor for the tracker λ1 = 0.01. The parameter updating frequency n is 20, and the maximum number of global iterations for training is 150K. Compared to the tracker, a higher regularizer factor is used to encourage the target to explore: λ2 = 0.2 in 2D and λ2 = 0.05 in 3D. (These values are collected into a config sketch after the table.) |
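
To make the ablated components concrete, here is a minimal sketch of a partial zero-sum reward in the spirit of the paper: the tracker is rewarded for keeping the target at an expected relative position, and the target receives the negated tracker reward only inside a near zone, with an extra distance penalty outside it. The exact functional form, the zone threshold `d_near`, and the penalty weight `mu` are illustrative assumptions, not the paper's equations.

```python
def tracker_reward(dist_err, angle_err, dist_max, angle_max, A=1.0):
    """Tracker reward: maximal (A) when the target sits at the expected
    relative position; decreases linearly with the normalized distance
    and angle errors. Illustrative form only."""
    return A - (abs(dist_err) / dist_max + abs(angle_err) / angle_max)


def target_reward(r_tracker, dist, d_near=2.0, mu=1.0):
    """Partial zero-sum reward for the target (illustrative form):
    inside the near zone the duel is zero-sum, so the target simply
    negates the tracker's reward; outside it, a growing distance
    penalty discourages the target from trivially running away."""
    if dist <= d_near:
        return -r_tracker
    return -r_tracker - mu * (dist - d_near)
```

The intent, per the paper, is that a purely zero-sum reward would push the target to flee as far as possible, whereas the added penalty keeps the duel near the boundary of the tracker's observable range, yielding harder but still trackable behavior.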
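The environment interaction mentioned in the Open Datasets row follows the standard Gym protocol. Below is a minimal sketch, assuming the gym-unrealcv wrapper is installed; the environment id is hypothetical and should be replaced by one actually registered by the wrapper version in use.

```python
import gym
import gym_unrealcv  # registers UnrealCV-backed environments with Gym

# Hypothetical id: the exact environment names depend on the
# gym-unrealcv wrapper version; consult its registry for valid ids.
env = gym.make('UnrealTrack-SomeRoom-DiscreteColor-v0')
obs = env.reset()
done = False
while not done:
    action = env.action_space.sample()         # random tracker action
    obs, reward, done, info = env.step(action)  # classic 4-tuple Gym API
env.close()
```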
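For quick reference, the hyper-parameters reported in the Experiment Setup row can be gathered into a single configuration. The dictionary keys below are our own naming, not the paper's notation; the values are taken verbatim from the row above.

```python
# Hyper-parameters reported in the paper, collected in one place.
HYPER_PARAMS = {
    "lr_tracker_2d": 1e-3,       # tracker learning rate in 2D environments
    "lr_tracker_3d": 1e-4,       # tracker learning rate in 3D environments
    "gamma": 0.9,                # reward discount factor
    "gae_tau": 1.0,              # generalized advantage estimate parameter
    "lambda1_tracker": 0.01,     # regularizer factor for the tracker
    "lambda2_target_2d": 0.2,    # regularizer factor for the target (2D)
    "lambda2_target_3d": 0.05,   # regularizer factor for the target (3D)
    "update_freq_n": 20,         # parameter updating frequency
    "max_global_iter": 150_000,  # maximum global training iterations
}
```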