Learning Deep Structured Multi-Scale Features using Attention-Gated CRFs for Contour Prediction
Authors: Dan Xu, Wanli Ouyang, Xavier Alameda-Pineda, Elisa Ricci, Xiaogang Wang, Nicu Sebe
NeurIPS 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on two publicly available datasets (BSDS500 and NYUDv2) demonstrate the effectiveness of the latent AG-CRF model and of the overall hierarchical framework. |
| Researcher Affiliation | Academia | The University of Trento; The University of Sydney; Perception Group, INRIA; University of Perugia; The Chinese University of Hong Kong |
| Pseudocode | Yes | In order to infer the hidden variables and learn the parameters of the AG-CRFs together with those of the front-end CNN, the AG-CRF updates are implemented as neural-network operations in three steps: (i) message passing from the $s_e$-th scale to the current $s_r$-th scale, $h_{s_e \to s_r} = L_{s_e \to s_r} \circledast h_{s_e}$, where $\circledast$ denotes convolution and $L_{s_e \to s_r}$ denotes the corresponding convolution kernel; (ii) attention map estimation, $q(g_{s_e,s_r}=1) = \sigma\left(h_{s_r} \odot (L_{s_e,s_r} \circledast h_{s_e}) + l_{s_e \to s_r} \circledast h_{s_e} + l_{s_r \to s_e} \circledast h_{s_r}\right)$, where $L_{s_e,s_r}$, $l_{s_e \to s_r}$ and $l_{s_r \to s_e}$ are convolution kernels and $\odot$ represents the element-wise product; (iii) attention-gated message passing from the other scales plus the unary term, $h_{s_r} = f_{s_r}\big(a_{s_r} \circledast \sum_{s_e \neq s_r} q(g_{s_e,s_r}=1) \odot h_{s_e \to s_r}\big)$, where $a_{s_r}$ encodes the effect of the $a^i_{s_r}$ for weighting the message and can be implemented as a $1 \times 1$ convolution. |
| Open Source Code | Yes | The implementation code is available on GitHub: https://github.com/danxuhk/AttentionGatedMulti-ScaleFeatureLearning |
| Open Datasets | Yes | To evaluate the proposed approach we employ two different benchmarks: the BSDS500 and the NYUDv2 datasets. The BSDS500 dataset is an extended dataset based on BSDS300 [1]. [...] The NYUDv2 [33] contains 1449 RGB-D images... |
| Dataset Splits | Yes | The BSDS500 dataset is an extended dataset based on BSDS300 [1]. It consists of 200 training, 100 validation and 200 testing images. [...] The NYUDv2 [33] contains 1449 RGB-D images and it is split into three subsets, comprising 381 training, 414 validation and 654 testing images. |
| Hardware Specification | Yes | The training and testing phases are carried out on an Nvidia Titan X GPU with 12GB of memory. |
| Software Dependencies | No | The proposed AMH-Net is implemented under the deep learning framework Caffe [18]. No specific version number for Caffe or any other software dependency is provided. |
| Experiment Setup | Yes | The initial learning rate is set to 1e-7 in all experiments and is decreased by a factor of 10 after every 10k iterations. The total number of iterations for BSDS500 and NYUDv2 is 40k and 30k, respectively. The momentum and weight decay parameters are set to 0.9 and 0.0002, as in [38]. As the training images have different resolutions, the batch size is set to 1, and for the sake of smooth convergence the parameters are updated only every 10 iterations. [...] Within the AG-CRFs, the kernel size for all convolutional operations is set to 3×3 with stride 1 and padding 1. |
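The three AG-CRF update steps quoted in the Pseudocode row can be sketched in a few lines. The snippet below is a minimal single-pair NumPy illustration, not the authors' Caffe implementation: the kernel names (`L_es`, `l_es`, `l_se`), the scalar stand-in for the 1×1 convolution `a_sr`, and the identity-plus-unary form of `f_sr` are all assumptions made for illustration.

```python
# Minimal NumPy sketch of the AG-CRF updates for one pair of scales.
# Assumed names and shapes are illustrative, not from the paper's code.
import numpy as np
from scipy.signal import convolve2d

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
H = W = 8
h_se = rng.standard_normal((H, W))   # hidden features at source scale s_e
h_sr = rng.standard_normal((H, W))   # hidden features at target scale s_r

# 3x3 kernels with stride 1 and padding 1 (mode="same"), as quoted above
L_es = rng.standard_normal((3, 3)) * 0.1
l_es = rng.standard_normal((3, 3)) * 0.1
l_se = rng.standard_normal((3, 3)) * 0.1

# (i) message passing from scale s_e to scale s_r
msg = convolve2d(h_se, L_es, mode="same")

# (ii) attention map estimation: sigmoid over a gated sum of convolutions,
# with an element-wise product between h_sr and the passed message
attn = sigmoid(h_sr * msg
               + convolve2d(h_se, l_es, mode="same")
               + convolve2d(h_sr, l_se, mode="same"))

# (iii) attention-gated aggregation; a_sr (a 1x1 convolution) reduces to a
# per-map scalar here, and f_sr is sketched as adding back the unary term
a_sr = 0.5
h_sr_new = h_sr + a_sr * (attn * msg)

print(h_sr_new.shape, float(attn.min()), float(attn.max()))
```

In the full model the sum in step (iii) runs over all source scales $s_e \neq s_r$ and every kernel is learned jointly with the front-end CNN; this sketch fixes random kernels and a single source scale only to make the dataflow concrete.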
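The learning-rate schedule in the Experiment Setup row reads as a Caffe-style "step" policy. The helper below is one interpretation of that schedule, assuming "decreased by a factor of 10 every 10k iterations" means multiplication by 0.1 per step; the function name and signature are hypothetical.

```python
# Hedged sketch of the quoted schedule: base LR 1e-7, divided by 10
# every 10k iterations (Caffe "step" policy with gamma=0.1).
def learning_rate(iteration, base_lr=1e-7, gamma=0.1, step=10000):
    """Return the learning rate in effect at a given iteration."""
    return base_lr * gamma ** (iteration // step)

# Trace the schedule over the 40k BSDS500 training run
for it in (0, 9999, 10000, 20000, 39999):
    print(it, learning_rate(it))
```

Note that with batch size 1 and parameter updates accumulated over 10 iterations, each effective update sees 10 images at the learning rate in force at that iteration.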