An Appearance-and-Structure Fusion Network for Object Viewpoint Estimation
Authors: Yueying Kao, Weiming Li, Zairan Wang, Dongqing Zou, Ran He, Qiang Wang, Minsu Ahn, Sunghoon Hong
IJCAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimentally, our proposed network outperforms state-of-the-art methods on a public PASCAL 3D+ dataset, which verifies the effectiveness of our method and further corroborates the above proposition. In this section, we present the experimental setup, quantitative and qualitative results on viewpoint estimation. |
| Researcher Affiliation | Collaboration | Yueying Kao1, Weiming Li1, Zairan Wang1, Dongqing Zou1, Ran He2, Qiang Wang1, Minsu Ahn3 and Sunghoon Hong3 1 SAIT China Lab, Samsung Research Institute China Beijing (SRC-B) 2 NLPR & CRIPAC, Institute of Automation, Chinese Academy of Sciences 3 Samsung Advanced Institute of Technology (SAIT) |
| Pseudocode | No | The paper includes architectural diagrams (Figure 2, Figure 3) and descriptive text of the network, but no explicit pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not include any explicit statement about making its source code publicly available or provide a link to a code repository. |
| Open Datasets | Yes | Our method is evaluated on 12 object categories of a public PASCAL 3D+ [Xiang et al., 2014] dataset. There are annotations of viewpoints, keypoints, object classes and object bounding boxes in this dataset. This dataset consists of PASCAL VOC 2012 detection train and validation set, and ImageNet images. |
| Dataset Splits | Yes | We use the PASCAL train set and ImageNet images with GT bounding boxes to train our network. The whole PASCAL validation set is used to evaluate our performance. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments, such as GPU models, CPU types, or memory specifications. |
| Software Dependencies | No | The paper mentions deep learning frameworks and architectures like VGG net and FPN, but it does not specify any software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x). |
| Experiment Setup | Yes | When training the baseline networks and ASFnet, we resize the object images cropped from the training set with their GT bounding boxes to 256 × 256 × 3, then randomly extract a 224 × 224 × 3 patch from the resized image or resized mirror image as the input of all networks. In the piecewise loss weight function f(l), a is set to 0.5, L1 to 100, L2 to 1000 by experience. We initialize the two baseline networks with the trained VGG network on the ImageNet classification task. Then the ASnet is used to initialize our ASFnet. |
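The input pipeline quoted in the Experiment Setup row (resize the GT-box crop to 256 × 256 × 3, take a random 224 × 224 × 3 patch, optionally mirror) can be sketched as below. This is a minimal NumPy reconstruction, not the authors' code: the interpolation method and the test-time center-crop convention are assumptions, as the paper does not specify them.

```python
import numpy as np

def preprocess(image, train=True, rng=None):
    """Resize an object crop to 256x256x3, then take a 224x224x3 patch.

    `image` is an HxWx3 array already cropped with the GT bounding box.
    Resizing uses nearest-neighbor indexing to stay dependency-free
    (an assumption; the paper does not state the interpolation method).
    """
    rng = rng or np.random.default_rng()
    h, w = image.shape[:2]

    # Nearest-neighbor resize to 256x256.
    rows = (np.arange(256) * h / 256).astype(int)
    cols = (np.arange(256) * w / 256).astype(int)
    resized = image[rows][:, cols]

    if not train:
        # Center 224x224 crop at test time (common convention; assumption).
        return resized[16:240, 16:240]

    # Random 224x224 crop from the resized 256x256 image.
    top = rng.integers(0, 256 - 224 + 1)
    left = rng.integers(0, 256 - 224 + 1)
    patch = resized[top:top + 224, left:left + 224]

    # Random horizontal mirror, as described in the setup.
    if rng.random() < 0.5:
        patch = patch[:, ::-1]
    return patch
```

In practice frameworks such as torchvision express the same pipeline with `RandomCrop(224)` and `RandomHorizontalFlip()` after a `Resize((256, 256))`.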