Embodied Visual Active Learning for Semantic Segmentation
Authors: David Nilsson, Aleksis Pirinen, Erik Gärtner, Cristian Sminchisescu
AAAI 2021, pp. 2373-2383 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We extensively evaluate the proposed models using the photorealistic Matterport3D simulator and show that a fully learnt method outperforms comparable pre-specified counterparts, even when requesting fewer annotations. We develop a battery of methods, ranging from pre-specified ones to a fully trainable deep reinforcement learning-based agent, which we evaluate extensively in the photorealistic Matterport3D environment. We perform extensive evaluation in a photorealistic 3D environment and show that a fully learnt method outperforms comparable pre-specified ones. |
| Researcher Affiliation | Collaboration | David Nilsson1,2, Aleksis Pirinen1, Erik Gärtner1,2, Cristian Sminchisescu1,2 1Department of Mathematics, Faculty of Engineering, Lund University 2Google Research {david.nilsson, aleksis.pirinen, erik.gartner, cristian.sminchisescu}@math.lth.se |
| Pseudocode | No | The paper describes the methods in text but does not include any pseudocode or algorithm blocks. |
| Open Source Code | No | The paper mentions using third-party tools like the 'RLlib reinforcement learning package', 'OpenAI Gym', 'TensorFlow', and 'PWC-Net', but it does not provide an explicit statement or link to the source code for the authors' own methodology described in the paper. |
| Open Datasets | Yes | We evaluate the methods on the Matterport3D dataset (Chang et al. 2017) using the embodied agent framework Habitat (Savva et al. 2019). |
| Dataset Splits | Yes | We use the same 61, 11 and 18 houses for training, validation and testing as Chang et al. (2017). For validation and testing we use 3 and 4 starting positions per scene, respectively, so each agent is tested for a total of 33 episodes in validation and 72 episodes in testing. Hyperparameters of the learnt and pre-specified agents are tuned on the validation set. |
| Hardware Specification | Yes | Our system is implemented in TensorFlow (Abadi et al. 2016), and it takes about 3 days to train an agent using 4 Nvidia Titan X GPUs. |
| Software Dependencies | No | The paper mentions several software components, including 'PPO', 'RLlib', 'OpenAI Gym', 'TensorFlow', and 'PWC-Net', but it does not provide specific version numbers for these dependencies, which would be required to reproduce the software environment. |
| Experiment Setup | Yes | Mini-batches of size 8, which always include the latest added labeled image, are used in training. The network is refined either until it has trained for 1,000 iterations or until the accuracy of a mini-batch exceeds 95%. We use a standard cross-entropy loss averaged over all pixels. The segmentation network is trained using stochastic gradient descent with learning rate 0.01, weight decay 10^-5 and momentum 0.9. For optimization we use Adam (Kingma and Ba 2014) with batch size 512, learning rate 10^-4 and discount rate 0.99. During training, each episode consists of 256 actions. The agent is trained for 4k episodes, which totals 1024k steps. The ResNet-50 feature extractor is pre-trained on ImageNet (Deng et al. 2009) with weights frozen during policy training. |
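The refinement schedule quoted above (train until 1,000 iterations or until a mini-batch reaches 95% accuracy) can be sketched as a simple stopping rule. This is a minimal illustration, not the authors' code: `train_step` is a hypothetical callable standing in for one SGD update on a mini-batch of 8 that returns that batch's pixel accuracy.

```python
MAX_ITERS = 1000       # iteration cap from the paper's setup
ACC_THRESHOLD = 0.95   # per-mini-batch accuracy threshold

def refine(train_step, max_iters=MAX_ITERS, acc_threshold=ACC_THRESHOLD):
    """Run train_step until the iteration cap or the accuracy threshold.

    Returns the number of iterations performed and the last accuracy.
    """
    acc = 0.0
    for it in range(1, max_iters + 1):
        acc = train_step()          # one update on a mini-batch of 8
        if acc >= acc_threshold:    # early stop once the batch is fit
            return it, acc
    return max_iters, acc

# Toy usage with a fake accuracy curve that improves linearly per step.
steps = iter(range(1, 2001))
iters, acc = refine(lambda: next(steps) / 1000.0)
print(iters, acc)  # stops at iteration 950, accuracy 0.95
```

Either exit condition alone would be fragile: the cap bounds per-image cost during an episode, while the accuracy test avoids over-fitting the network to a single newly annotated view.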