Incorporating Side Information by Adaptive Convolution
Authors: Di Kang, Debarun Dhar, Antoni B. Chan
NeurIPS 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the effectiveness of ACNN incorporating side information on 3 tasks: crowd counting, corrupted digit recognition, and image deblurring. Our experiments show that ACNN improves the performance compared to a plain CNN with a similar number of parameters. (A hedged sketch of such an adaptive-convolution layer appears after the table.) |
| Researcher Affiliation | Academia | Di Kang, Debarun Dhar, Antoni B. Chan; Department of Computer Science, City University of Hong Kong; {dkang5-c, ddhar2-c}@my.cityu.edu.hk, abchan@cityu.edu.hk |
| Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks. It includes architectural diagrams (e.g., Figure 2, Figure 3, Figure 4) but no step-by-step algorithms. |
| Open Source Code | No | The paper does not provide an explicit statement or link to the open-source code for the described methodology. It mentions that 'More information and demo images can be found in the supplemental' regarding their new dataset, but this does not refer to the code for ACNN. |
| Open Datasets | Yes | For crowd counting, we use two crowd counting datasets: the popular UCSD crowd counting dataset, and our newly collected dataset with camera tilt angle and camera height as side information. We use the MNIST handwritten digits dataset, which contains 60,000 training and 10,000 test examples. The new crowd dataset CityUHK-X contains 55 scenes (3,191 images in total), covering a camera tilt angle range of [−10°, −65°] and a height range of [2.2, 16.0] meters. More information and demo images can be found in the supplemental. |
| Dataset Splits | Yes | We first use the widely adopted protocol of max split, which uses 160 frames (frames 601:5:1400) for training, and the remaining parts (frames 1:600, 1401:2000) for testing. We use the MNIST handwritten digits dataset, which contains 60,000 training and 10,000 test examples. Separate validation and test sets, both containing 90,000 samples, are generated from the original MNIST test set. The training set consists of 43 scenes (2,503 images; 78,592 people), and the test set comprises 12 scenes (688 images; 28,191 people). (The UCSD split indices are reconstructed in the sketch after the table.) |
| Hardware Specification | Yes | We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Tesla K40 GPU used for this research. |
| Software Dependencies | No | The paper mentions software components such as 'ReLU activation' and 'batch normalization layers', but it does not specify any programming languages, libraries, or frameworks with version numbers (e.g., Python, TensorFlow, PyTorch, CUDA) that would be necessary for full reproducibility. |
| Experiment Setup | No | The paper describes some architectural details (e.g., filter sizes, activation functions, use of batch normalization) and general training approaches (e.g., multi-task learning). However, it lacks concrete numerical hyperparameters such as learning rates, batch sizes, optimizer choice, number of training epochs, and data augmentation strategies, which are essential for reproducing the experimental setup precisely. |
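
Since no official implementation is released (see the Open Source Code row above), the following is a minimal PyTorch sketch of the adaptive-convolution idea the paper describes: a small fully connected "filter-manifold network" maps the side information (e.g., camera tilt angle and height) to the weights and bias of a convolutional layer, so the filters adapt per image. All names (`AdaptiveConv2d`, `fmn`, `hidden_dim`) and layer sizes here are hypothetical illustrations, not values from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveConv2d(nn.Module):
    """Sketch of an ACNN-style adaptive convolution (sizes illustrative).

    A small fully connected "filter-manifold network" (fmn) maps the
    side information z to the weights and bias of a conv layer, so the
    convolution filters change with z.
    """

    def __init__(self, in_ch, out_ch, k, side_dim, hidden_dim=64):
        super().__init__()
        self.in_ch, self.out_ch, self.k = in_ch, out_ch, k
        n_params = out_ch * in_ch * k * k + out_ch  # filter weights + bias
        self.fmn = nn.Sequential(
            nn.Linear(side_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, n_params),
        )

    def forward(self, x, z):
        # x: (1, in_ch, H, W) feature map; z: (side_dim,) side information.
        params = self.fmn(z)
        w = params[: -self.out_ch].view(self.out_ch, self.in_ch, self.k, self.k)
        b = params[-self.out_ch:]
        return F.conv2d(x, w, b, padding=self.k // 2)

layer = AdaptiveConv2d(in_ch=16, out_ch=16, k=3, side_dim=2)
x = torch.randn(1, 16, 32, 32)
z = torch.tensor([-35.0, 8.0])  # e.g., tilt angle (deg) and height (m)
print(layer(x, z).shape)  # torch.Size([1, 16, 32, 32])
```

Because the generated filters depend on the per-image side information, this sketch processes one image at a time; batching would need per-sample weights (e.g., via grouped convolution).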
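
The UCSD "max" split quoted in the Dataset Splits row uses a MATLAB-style inclusive range (601:5:1400). A short Python sketch of the equivalent frame indices, assuming the dataset's 1-based numbering over its 2,000 frames:

```python
# Training: every 5th frame from 601..1400 inclusive (160 frames).
train_frames = list(range(601, 1401, 5))  # 601, 606, ..., 1396

# Testing: the remaining parts quoted above, frames 1:600 and 1401:2000.
test_frames = list(range(1, 601)) + list(range(1401, 2001))

assert len(train_frames) == 160
assert len(test_frames) == 1200
assert not set(train_frames) & set(test_frames)  # splits are disjoint
```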