3D-Aware Hypothesis & Verification for Generalizable Relative Object Pose Estimation
Authors: Chen Zhao, Tong Zhang, Mathieu Salzmann
ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our comprehensive experiments on the Objaverse, LINEMOD, and CO3D datasets evidence the superior accuracy of our approach in relative pose estimation and its robustness in large-scale pose variations, when dealing with unseen objects. |
| Researcher Affiliation | Collaboration | Chen Zhao EPFL-CVLab chen.zhao@epfl.ch; Tong Zhang EPFL-IVRL tong.zhang@epfl.ch; Mathieu Salzmann EPFL-CVLab, Clear Space SA mathieu.salzmann@epfl.ch |
| Pseudocode | No | The paper describes its methodology using textual descriptions, mathematical formulations, and diagrams (e.g., Figure 2 for the overview of the framework), but it does not include a formal pseudocode block or algorithm listing. |
| Open Source Code | No | The abstract states: "Our project website is at: https://sailor-z.github.io/projects/ICLR2024_3DAHV.html." This is a project overview page, not a direct link to a source-code repository. |
| Open Datasets | Yes | Our comprehensive experiments on the Objaverse, LINEMOD, and CO3D datasets evidence the superior accuracy of our approach in relative pose estimation and its robustness in large-scale pose variations, when dealing with unseen objects. ... To this end, we utilize the Objaverse (Deitke et al., 2023) and LINEMOD (Hinterstoisser et al., 2012) datasets, which include synthetic and real data, respectively. ... We first perform an evaluation using the benchmark defined in (Lin et al., 2023), where the experiments are conducted on the CO3D (Reizenstein et al., 2021) dataset. ... The synthetic images are generated by rendering objects of Objaverse from randomly sampled viewpoints (Liu et al., 2023). We attach these images to random backgrounds which are sampled from COCO (Lin et al., 2014). (A minimal sketch of this background-compositing step is given below the table.) |
| Dataset Splits | No | The paper mentions training and testing data and describes object-level splits (e.g., "reserving the remaining objects for training"). It specifies the number of testing image pairs. However, it does not explicitly provide percentages or counts for a distinct validation dataset split separate from training and testing. |
| Hardware Specification | Yes | Training takes around 4 days on 4 NVIDIA Tesla V100s. |
| Software Dependencies | No | The paper mentions the use of the AdamW optimizer and various architectural components such as transformers and layer normalization, citing the papers where they were introduced. However, it does not specify version numbers for any programming languages, libraries, or frameworks used (e.g., Python, PyTorch, TensorFlow, CUDA versions). |
| Experiment Setup | Yes | We set the number of hypotheses during training and testing to M = 9,000 and M = 50,000, respectively. We define the masking threshold h = 0.25 and the geodesic distance threshold λ = 15 (Zhang et al., 2022; Lin et al., 2023). We train our network for 25 epochs using the AdamW (Loshchilov & Hutter, 2017) optimizer with a batch size of 48 and a learning rate of 10⁻⁴, which is divided by 10 after 20 epochs. (A hypothetical configuration sketch of these settings appears below the table.) |
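
The training-data construction quoted in the "Open Datasets" row (rendering Objaverse objects from random viewpoints and attaching them to random COCO backgrounds) can be illustrated with a minimal compositing sketch. The paper does not release code for this step, so the function name and the assumption that the renderer provides an alpha channel are ours, not the authors'.

```python
import random
import numpy as np

def composite_onto_background(render_rgba, backgrounds):
    """Paste a rendered RGBA object image onto a randomly sampled background.

    `render_rgba` is assumed to be an H x W x 4 uint8 array from the renderer,
    whose alpha channel marks object pixels; `backgrounds` is a list of
    H x W x 3 uint8 images (e.g. COCO crops resized to the render resolution).
    """
    background = random.choice(backgrounds).astype(np.float32)
    rgb = render_rgba[..., :3].astype(np.float32)
    alpha = render_rgba[..., 3:4].astype(np.float32) / 255.0  # ~1 on the object, ~0 elsewhere
    blended = alpha * rgb + (1.0 - alpha) * background
    return blended.astype(np.uint8)
```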
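
The hyperparameters quoted in the "Experiment Setup" row map onto a small configuration sketch. Since no source code is released (see the "Open Source Code" row), the snippet below is a hypothetical reconstruction of the reported settings rather than the authors' implementation; all key and function names are ours. The geodesic-distance function is the standard rotation-error metric that the λ = 15° accuracy threshold presumably refers to.

```python
import numpy as np

# Settings reported in the paper's experiment setup.
CONFIG = {
    "num_hypotheses_train": 9_000,    # M during training
    "num_hypotheses_test": 50_000,    # M during testing
    "masking_threshold": 0.25,        # h
    "geodesic_threshold_deg": 15.0,   # lambda, accuracy threshold in degrees
    "epochs": 25,
    "batch_size": 48,
    "learning_rate": 1e-4,            # divided by 10 after 20 epochs
    "lr_decay_epoch": 20,
    "lr_decay_factor": 0.1,
}

def learning_rate_at(epoch):
    """Step schedule: 1e-4 for the first 20 epochs, then divided by 10."""
    lr = CONFIG["learning_rate"]
    if epoch >= CONFIG["lr_decay_epoch"]:
        lr *= CONFIG["lr_decay_factor"]
    return lr

def geodesic_distance_deg(R_pred, R_gt):
    """Geodesic distance (degrees) between two 3x3 rotation matrices."""
    cos = (np.trace(R_pred.T @ R_gt) - 1.0) / 2.0
    cos = np.clip(cos, -1.0, 1.0)  # guard against numerical drift
    return float(np.degrees(np.arccos(cos)))

def is_correct(R_pred, R_gt):
    """A predicted relative rotation counts as correct if its error is below lambda."""
    return geodesic_distance_deg(R_pred, R_gt) < CONFIG["geodesic_threshold_deg"]
```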