Tree-Structured Reinforcement Learning for Sequential Object Localization
Authors: Zequn Jie, Xiaodan Liang, Jiashi Feng, Xiaojie Jin, Wen Feng Lu, Shuicheng Yan
NeurIPS 2016 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on PASCAL VOC 2007 and 2012 validate the effectiveness of Tree-RL, which achieves recall comparable to current object proposal algorithms with far fewer candidate windows. |
| Researcher Affiliation | Academia | 1 National University of Singapore, Singapore 2 Carnegie Mellon University, USA |
| Pseudocode | No | The paper describes the proposed method in text and figures, but it does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper states that "The implementations are based on the publicly available Torch7 [25] platform", which refers to a third-party library used, not the authors' own source code for the proposed method. No direct link or explicit statement about releasing their code is provided. |
| Open Datasets | Yes | We train a deep Q-network on VOC 2007+2012 trainval set [7] for 25 epochs. |
| Dataset Splits | No | The paper mentions using the "VOC 2007+2012 trainval set" and a separate "testing set" but does not specify how the trainval set itself was split for training and validation, nor does it provide specific percentages or counts for a validation set. |
| Hardware Specification | Yes | The implementations are based on the publicly available Torch7 [25] platform on a single NVIDIA GeForce Titan X GPU with 12GB memory. |
| Software Dependencies | No | The paper states that "The implementations are based on the publicly available Torch7 [25] platform", but it does not specify a version number for Torch7 or any other software dependencies, which is required for reproducibility. |
| Experiment Setup | Yes | During ϵ-greedy training, ϵ is annealed linearly from 1 to 0.1 over the first 10 epochs, then fixed at 0.1 for the last 15 epochs. The discount factor γ is set to 0.9. Each episode runs for a maximum of 50 steps during training. The replay memory size is set to 800,000, which holds about 1 epoch of transitions. The mini-batch size is set to 64 (see the hedged sketch after this table). |
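
To make the reported hyperparameters concrete, the following minimal Python sketch encodes the training configuration and the linear ϵ-annealing schedule quoted above. It is not the authors' implementation (which was built on the Torch7 platform); the constant names, the `epsilon_at` helper, and the epoch-granularity interpolation are assumptions introduced purely for illustration.

```python
# Hedged sketch, not the authors' Torch7 code: the numeric values below are
# taken from the paper as quoted in the table above; the constant names, the
# epsilon_at() helper, and the epoch-based interpolation are illustrative
# assumptions.

REPLAY_MEMORY_SIZE = 800_000    # holds roughly 1 epoch of transitions
MINI_BATCH_SIZE = 64            # mini-batch size for Q-network updates
GAMMA = 0.9                     # discount factor
MAX_STEPS_PER_EPISODE = 50      # episode length cap during training
TOTAL_EPOCHS = 25               # trained on the VOC 2007+2012 trainval set
ANNEAL_EPOCHS = 10              # epsilon annealed linearly over the first 10 epochs
EPS_START, EPS_END = 1.0, 0.1   # epsilon goes from 1 to 0.1, then stays at 0.1


def epsilon_at(epoch: float) -> float:
    """Linear epsilon schedule: 1.0 -> 0.1 over ANNEAL_EPOCHS, then fixed."""
    if epoch >= ANNEAL_EPOCHS:
        return EPS_END
    return EPS_START + (epoch / ANNEAL_EPOCHS) * (EPS_END - EPS_START)


if __name__ == "__main__":
    for epoch in (0, 5, 10, 20, 25):
        print(f"epoch {epoch:>2}: epsilon = {epsilon_at(epoch):.2f}")
```

Running the sketch prints the assumed ϵ value at a few sample epochs: 1.00 at epoch 0, 0.55 at epoch 5, and 0.10 from epoch 10 onward, matching the schedule described in the setup row.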