Semi-Supervised Learning for Multi-Task Scene Understanding by Neural Graph Consensus

Authors: Marius Leordeanu, Mihai Cristian Pîrvu, Dragos Costea, Alina E Marcu, Emil Slusanschi, Rahul Sukthankar (pp. 1882-1892)

AAAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We give theoretical justifications of the proposed idea and validate it on a large dataset. We show how prediction of different representations such as depth, semantic segmentation, surface normals and pose from RGB input could be effectively learned through self-supervised consensus in our graph. We also compare to state-of-the-art methods for multi-task and semi-supervised learning and show superior performance."
Researcher Affiliation | Collaboration | Marius Leordeanu (1,2), Mihai Cristian Pîrvu (1,2), Dragos Costea (1,2), Alina E Marcu (1,2), Emil Slusanschi (1) and Rahul Sukthankar (3); 1: University Politehnica of Bucharest; 2: Institute of Mathematics of the Romanian Academy; 3: Google Research
Pseudocode | Yes | Algorithm 1: Learning with Neural Graph Consensus. Step 1: Pre-train a set of deep neural networks that transform different input to output representations, using the available labeled data. Step 2: Form the NGC graph by linking the nets so that the output of one (or several) becomes the input of another. Step 3: On a completely new unlabeled set, re-train the nets using, as pseudo-ground truth for a given node (representation), the consensual output of all paths that reach that node. Repeat Step 3, choosing a new unlabeled set and the newly trained nets, until convergence.
Open Source Code | No | The paper states "we provide public access to our dataset and code" and provides a URL in footnote 1: https://sites.google.com/site/aerialimageunderstanding/semisupervised-learning-of-multiple-scene-interpretations-by-neuralgraph. However, this is a general project website, not a direct link to a source-code repository.
Open Datasets | Yes | "To test the NGC approach in the case of many scene representations we capture a large dataset using a customized virtual environment based on the CARLA simulator (Dosovitskiy et al. 2017)... Moreover, we provide public access to our dataset and code."
Dataset Splits | Yes | The dataset is divided into four subsets: a supervised training set (subdivided into 8k images for training and 2k for validation), two test sets (10k images each, used for unsupervised learning iterations 1 and 2), and a separate evaluation set (10k images, never seen during learning).
Hardware Specification | No | The paper vaguely mentions "GPU computational resources" but does not specify exact GPU models, CPU models, or other hardware used to run the experiments.
Software Dependencies | Yes | "We developed a general NGC framework on top of the existing deep learning framework PyTorch (Paszke et al. 2019), which can model arbitrary complex graphs and which we make publicly available."
Experiment Setup | Yes | "All architectures have about 1.1M trainable parameters, making them very light compared to most state-of-the-art nets for similar tasks. They are trained for 100 epochs, with the AdamW optimizer, using our novel PyTorch-based NGC graph framework."
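Step 3 of Algorithm 1 (re-training a net against the consensual output of all paths reaching its node) can be sketched in PyTorch as below. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, a simple mean over path outputs stands in for the paper's consensus rule, and an MSE loss stands in for the task-specific losses.

```python
# Hedged sketch of one NGC self-supervised update (Algorithm 1, Step 3).
# Assumptions (not from the paper): mean-based consensus, MSE loss,
# and plain callables standing in for the graph's edge networks.
import torch
import torch.nn as nn


def consensus_pseudo_label(path_outputs):
    # Pseudo-ground truth for a node = consensus (here: the mean)
    # of the outputs of all graph paths that reach that node.
    return torch.stack(path_outputs).mean(dim=0)


def ngc_iteration(paths_to_node, student_net, unlabeled_batch, optimizer):
    # Compute the consensual target without tracking gradients,
    # then take one supervised step on the student net.
    with torch.no_grad():
        outputs = [path(unlabeled_batch) for path in paths_to_node]
        pseudo_gt = consensus_pseudo_label(outputs)
    pred = student_net(unlabeled_batch)
    loss = nn.functional.mse_loss(pred, pseudo_gt)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In the full algorithm this update would be applied to every node of the graph on a fresh unlabeled set, and the whole of Step 3 repeated until convergence.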
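The reported dataset partition (8k train + 2k validation supervised, two 10k sets for unsupervised learning iterations 1 and 2, and a held-out 10k evaluation set) can be expressed as a simple index-based split. The function name and the contiguous-index layout are assumptions for illustration; the paper does not describe how frames are assigned to subsets.

```python
# Hedged sketch of the reported splits over a 40k-frame dataset:
# 8k train / 2k val (supervised), 2 x 10k unsupervised, 10k eval.
# Contiguous index ranges are an assumption, not the paper's scheme.
def make_ngc_splits(num_frames=40000):
    if num_frames < 40000:
        raise ValueError("need at least 40k frames for the reported splits")
    idx = list(range(num_frames))
    return {
        "train":       idx[0:8000],      # supervised training
        "val":         idx[8000:10000],  # supervised validation
        "unsup_iter1": idx[10000:20000], # unsupervised learning, iter 1
        "unsup_iter2": idx[20000:30000], # unsupervised learning, iter 2
        "eval":        idx[30000:40000], # never seen during learning
    }
```

The disjointness of the five subsets matters here: the evaluation set must never be seen during either supervised or unsupervised learning.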