Siamese CNN-BiLSTM Architecture for 3D Shape Representation Learning
Authors: Guoxian Dai, Jin Xie, Yi Fang
IJCAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our proposed method is evaluated on two benchmarks, ModelNet40 and SHREC 2014, demonstrating superiority over the state-of-the-art methods. |
| Researcher Affiliation | Academia | NYU Multimedia and Visual Computing Lab; Dept. of ECE, NYU Abu Dhabi, UAE; Dept. of ECE, NYU Tandon School of Engineering, USA; Dept. of CSE, NYU Tandon School of Engineering, USA. guoxian.dai@nyu.edu, jin.xie@nyu.edu, yfang@nyu.edu |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any concrete access information (specific link, explicit statement of code release, or mention of code in supplementary materials) for the source code of the described methodology. |
| Open Datasets | Yes | Our proposed method is evaluated on two large-scale benchmarks, Princeton ModelNet [Wu et al., 2015] and SHREC 2014 [Li and Lu et al., 2014]. |
| Dataset Splits | Yes | For ModelNet40, we follow the same training and testing split as [Wu et al., 2015; Su et al., 2015], selecting 100 shapes for each class, 80 for training and 20 for testing. For SHREC 2014, we randomly split the shapes of each group into two equal halves for training and testing. (See the split sketch after the table.) |
| Hardware Specification | Yes | In addition, our proposed method is implemented using Caffe with a single GPU, Nvidia Tesla K80. |
| Software Dependencies | No | The paper mentions software like 'Caffe' and 'AlexNet' but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | For the multi-view rendering part, we choose 12 different views on the horizontal plane, directed toward the centroid of the 3D shape and stepped by 30° each. The size of the hidden state for the LSTM is set to 512. The outputs of all BiLSTM cells are passed through average-pooling to generate one compact representation, which is followed by two fully connected layers with sizes of 200 and 200. The whole CNN-BiLSTM network is constructed into a siamese structure with the contrastive loss function, and the margin h is set to 10. The batch size is set to 360, which is 30 sequences; the weight decay rate is set to 0.0005; the momentum rate is set to 0.1; the base learning rate is set to 0.001. (See the model sketch after the table.) |
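
For concreteness, here is a minimal sketch of the two reported splits (80/20 per class for ModelNet40; equal halves per group for SHREC 2014). The `shapes_by_class` / `shapes_by_group` containers mapping a class or group name to a list of shape IDs are hypothetical stand-ins; the paper does not describe its data loading.

```python
import random

def split_modelnet40(shapes_by_class, n_per_class=100, n_train=80, seed=0):
    """ModelNet40 split reported in the paper: 100 shapes per class,
    80 for training and 20 for testing."""
    rng = random.Random(seed)
    train, test = [], []
    for cls, shapes in shapes_by_class.items():
        picked = rng.sample(shapes, n_per_class)  # assumes >= 100 shapes per class
        train += [(cls, s) for s in picked[:n_train]]
        test += [(cls, s) for s in picked[n_train:]]
    return train, test

def split_shrec14(shapes_by_group, seed=0):
    """SHREC 2014 split reported in the paper: each group is randomly
    divided into two equal halves for training and testing."""
    rng = random.Random(seed)
    train, test = [], []
    for grp, shapes in shapes_by_group.items():
        shuffled = shapes[:]
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        train += [(grp, s) for s in shuffled[:half]]
        test += [(grp, s) for s in shuffled[half:]]
    return train, test
```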
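
The setup row can likewise be summarized in code. The authors implemented the method in Caffe with a fine-tuned AlexNet; the PyTorch sketch below is only an approximation under that reading: a shared AlexNet-style CNN per view, a 512-unit BiLSTM over the 12-view sequence, average-pooling, two 200-d fully connected layers, and the contrastive loss with margin h = 10. The fc7 feature choice, the ReLU between the two FC layers, and the use of plain SGD are assumptions, not the authors' exact recipe.

```python
import torch
import torch.nn as nn
import torchvision.models as models

class CNNBiLSTM(nn.Module):
    """One branch of the siamese CNN-BiLSTM: a shared CNN encodes each of
    the 12 rendered views, a BiLSTM (hidden size 512) aggregates the view
    sequence, its outputs are average-pooled, and two 200-d fully
    connected layers produce the shape descriptor."""
    def __init__(self, feat_dim=4096, hidden=512, out_dim=200):
        super().__init__()
        alexnet = models.alexnet(weights=None)  # the paper fine-tunes AlexNet in Caffe
        # Reuse AlexNet up to fc7, giving a 4096-d feature per view (assumed cut point).
        self.cnn = nn.Sequential(alexnet.features, alexnet.avgpool,
                                 nn.Flatten(), alexnet.classifier[:6])
        self.bilstm = nn.LSTM(feat_dim, hidden, batch_first=True,
                              bidirectional=True)
        self.fc = nn.Sequential(nn.Linear(2 * hidden, out_dim), nn.ReLU(),
                                nn.Linear(out_dim, out_dim))

    def forward(self, views):                 # views: (batch, 12, 3, 224, 224)
        b, v = views.shape[:2]
        f = self.cnn(views.flatten(0, 1)).view(b, v, -1)  # per-view features
        h, _ = self.bilstm(f)                 # (batch, 12, 2 * hidden)
        pooled = h.mean(dim=1)                # average-pool over the view sequence
        return self.fc(pooled)                # 200-d shape representation

def contrastive_loss(x1, x2, label, margin=10.0):
    """Contrastive loss over a siamese pair (label = 1 for same class,
    0 otherwise): pull same-class pairs together, push different-class
    pairs beyond the margin h = 10."""
    d = torch.norm(x1 - x2, dim=1)
    return (label * d.pow(2) +
            (1 - label) * torch.clamp(margin - d, min=0).pow(2)).mean()

# Training recipe as reported: batches of 30 view-sequences (360 images),
# lr 0.001, momentum 0.1, weight decay 0.0005 (SGD itself is an assumption).
model = CNNBiLSTM()
opt = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.1,
                      weight_decay=5e-4)
```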