Tree-Structured Reinforcement Learning for Sequential Object Localization

Authors: Zequn Jie, Xiaodan Liang, Jiashi Feng, Xiaojie Jin, Wen Feng Lu, Shuicheng Yan

NeurIPS 2016 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on PASCAL VOC 2007 and 2012 validate the effectiveness of the Tree-RL, which can achieve comparable recalls with current object proposal algorithms via much fewer candidate windows.
Researcher Affiliation | Academia | 1 National University of Singapore, Singapore; 2 Carnegie Mellon University, USA
Pseudocode | No | The paper describes the proposed method in text and figures, but it does not include any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper states that "The implementations are based on the publicly available Torch7 [25] platform", which refers to a third-party library, not the authors' own source code for the proposed method. No direct link to their code, or explicit statement about releasing it, is provided.
Open Datasets | Yes | We train a deep Q-network on VOC 2007+2012 trainval set [7] for 25 epochs.
Dataset Splits | No | The paper mentions using the "VOC 2007+2012 trainval set" and a separate "testing set" but does not specify how the trainval set itself was split for training and validation, nor does it provide specific percentages or counts for a validation set.
Hardware Specification | Yes | The implementations are based on the publicly available Torch7 [25] platform on a single NVIDIA GeForce Titan X GPU with 12GB memory.
Software Dependencies | No | The paper states that "The implementations are based on the publicly available Torch7 [25] platform", but it does not specify a version number for Torch7 or any other software dependencies, which are needed for reproducibility.
Experiment Setup | Yes | During ϵ-greedy training, ϵ is annealed linearly from 1 to 0.1 over the first 10 epochs. Then ϵ is fixed to 0.1 in the last 15 epochs. The discount factor γ is set to 0.9. We run each episode with maximal 50 steps during training. The replay memory size is set to 800,000, which contains about 1 epoch of transitions. The mini-batch size in training is set to 64.
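
The ϵ schedule and hyperparameters quoted above are concrete enough to restate in code. The sketch below is a minimal Python illustration of that schedule, assuming per-epoch linear interpolation (the paper does not state the annealing granularity); it is not the authors' Torch7 implementation, and all names are hypothetical.

```python
# Hypothetical restatement of the reported DQN hyperparameters (not the authors' Torch7 code).
EPS_START, EPS_END = 1.0, 0.1   # epsilon annealed linearly from 1 to 0.1
ANNEAL_EPOCHS = 10              # over the first 10 epochs, then fixed at 0.1
TOTAL_EPOCHS = 25               # trained for 25 epochs in total
GAMMA = 0.9                     # discount factor
MAX_STEPS_PER_EPISODE = 50      # maximal episode length during training
REPLAY_MEMORY_SIZE = 800_000    # roughly one epoch of transitions
MINI_BATCH_SIZE = 64            # mini-batch size for Q-network updates

def epsilon_at(epoch: int) -> float:
    """Linear annealing over the first ANNEAL_EPOCHS epochs, then fixed at EPS_END."""
    if epoch >= ANNEAL_EPOCHS:
        return EPS_END
    return EPS_START + (EPS_END - EPS_START) * (epoch / ANNEAL_EPOCHS)

if __name__ == "__main__":
    # Print the per-epoch exploration rate implied by the reported schedule.
    for e in range(TOTAL_EPOCHS):
        print(f"epoch {e:2d}: epsilon = {epsilon_at(e):.2f}")
```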