Image to Sphere: Learning Equivariant Features for Efficient Pose Prediction
Authors: David Klee, Ondrej Biza, Robert Platt, Robin Walters
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 5 EXPERIMENTS |
| Researcher Affiliation | Academia | Northeastern University |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | Code is available at https://dmklee.github.io/image2sphere. |
| Open Datasets | Yes | The first dataset, ModelNet10-SO(3) (Liao et al., 2019), is composed of rendered images of synthetic, untextured objects from ModelNet10 (Wu et al., 2015). ... PASCAL3D+ (Xiang et al., 2014) is a popular benchmark for pose estimation ... Lastly, the SYMSOL dataset was recently introduced by Murphy et al. (2021)... |
| Dataset Splits | Yes | The dataset has a standardized train and test split. It provides two training sets: one with 100 views per object instance (we call this the Full Training Set), and one with 20 views per object instance (we call this the Limited Training Set). The test set has 4 views per instance. ... For each shape, there are 50k renderings in the training set and 5K renderings in the test set. ... The training data is found in the ImageNet train, ImageNet val, and PASCALVOC train folders, and the test data is in PASCALVOC val. |
| Hardware Specification | No | The paper mentions "I2S is instantiated with PyTorch and the e3nn library" but does not specify any hardware details like GPU/CPU models, processors, or memory used for running the experiments. |
| Software Dependencies | No | The paper mentions "PyTorch" and the "e3nn library" but does not provide specific version numbers for these software dependencies, which are necessary for full reproducibility. |
| Experiment Setup | Yes | I2S uses a residual network (He et al., 2016) with weights pretrained on ImageNet (Deng et al., 2009) to extract dense feature maps from 2D images. We use a ResNet50 backbone for ModelNet10-SO(3) and SYMSOL, and ResNet101 for PASCAL3D+. The orthographic projection uses a HEALPix grid with recursion level of 2, out of which 20 points are randomly selected during each forward pass. ... It is trained using SGD with Nesterov momentum of 0.9 for 40 epochs using a batch size of 64. The learning rate starts at 0.001 and decays by factor of 0.1 every 15 epochs. |
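
The numbers quoted in the Experiment Setup row can be made concrete with a short sketch. The snippet below is not the authors' code: the function and constant names are hypothetical, and it uses only the Python standard library. It assumes the decay is a plain step schedule (multiply by 0.1 at epochs 15 and 30), that "recursion level 2" corresponds to the standard HEALPix resolution with `nside = 2**level` and `12 * nside**2` pixels over the full sphere, and that the 20 points are drawn uniformly without replacement. The paper may restrict the grid to a subset before sampling, which this sketch does not model.

```python
import math
import random

# Values quoted in the Experiment Setup row of the table.
BASE_LR = 0.001        # initial learning rate
DECAY_FACTOR = 0.1     # multiplicative decay
DECAY_EVERY = 15       # epochs between decays
NUM_EPOCHS = 40
BATCH_SIZE = 64
MOMENTUM = 0.9         # Nesterov momentum (listed for reference only)
HEALPIX_LEVEL = 2
POINTS_PER_PASS = 20


def learning_rate(epoch: int) -> float:
    """Step decay (assumed form): scale by DECAY_FACTOR every DECAY_EVERY epochs."""
    return BASE_LR * DECAY_FACTOR ** (epoch // DECAY_EVERY)


def healpix_pixel_count(level: int) -> int:
    """Pixel count of a full-sphere HEALPix grid with nside = 2**level."""
    nside = 2 ** level
    return 12 * nside ** 2


def sample_grid_indices(level: int, k: int, rng: random.Random) -> list:
    """Pick k distinct grid indices uniformly at random.

    Assumption: sampling is over the whole grid; the paper may use a subset.
    """
    return rng.sample(range(healpix_pixel_count(level)), k)


if __name__ == "__main__":
    for epoch in (0, 14, 15, 29, 30, NUM_EPOCHS - 1):
        print(f"epoch {epoch:2d}: lr = {learning_rate(epoch):.0e}")
    print("grid points at level 2:", healpix_pixel_count(HEALPIX_LEVEL))
    print("sampled:", sample_grid_indices(HEALPIX_LEVEL, POINTS_PER_PASS, random.Random(0)))
```

Under these assumptions the learning rate takes three values over the 40 epochs (0.001, then 0.0001 from epoch 15, then 0.00001 from epoch 30), and each forward pass uses 20 of the 192 points in a level-2 grid.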