VinT-6D: A Large-Scale Object-in-hand Dataset from Vision, Touch and Proprioception
Authors: Zhaoliang Wan, Yonggen Ling, Senlin Yi, Lu Qi, Wang Wei Lee, Minglei Lu, Sicheng Yang, Xiao Teng, Peng Lu, Xu Yang, Ming-Hsuan Yang, Hui Cheng
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | This paper addresses the scarcity of large-scale datasets for accurate object-in-hand pose estimation...VinT-6D comprises 2 million VinT-Sim and 0.1 million VinT-Real splits, collected via simulations in MuJoCo and Blender and a custom-designed real-world platform. Built upon VinT-6D, we present a benchmark method that shows significant improvements in performance by fusing multi-modal information. Extensive experiments show the effectiveness of our method compared with the other works. (Abstract and Introduction) Section 5. Experimental Results. |
| Researcher Affiliation | Collaboration | 1School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China 2Robotics X, Tencent, Shenzhen, China 3University of California, Merced, Merced, USA 4Institute of Automation, Chinese Academy of Sciences, Beijing, China. |
| Pseudocode | No | The paper describes the architecture of VinT-Net (Section 4) and its sub-modules, including an overview diagram (Figure 8). However, it does not provide any pseudocode blocks or algorithm listings labeled as 'Pseudocode' or 'Algorithm'. |
| Open Source Code | Yes | The project is available at https://VinT-6D.github.io/. |
| Open Datasets | Yes | VinT-6D comprises 2 million VinT-Sim and 0.1 million VinT-Real splits, collected via simulations in MuJoCo and Blender and a custom-designed real-world platform. (Abstract) The project is available at https://VinT-6D.github.io/. |
| Dataset Splits | No | The paper mentions training parameters like learning rate, batch size, and epochs (Section 5.1 Implementation Details). However, it does not explicitly specify the percentages or counts for training, validation, and test splits for its dataset, nor does it reference predefined splits from established benchmarks for reproducibility. |
| Hardware Specification | Yes | The training and testing processes were executed on a computing server equipped with 6 Quadro RTX 8000 GPUs. The VinT-Sim synthesis procedures were conducted on a cloud computing platform that utilized 16 NVIDIA P40 GPUs. |
| Software Dependencies | No | The paper mentions software tools like MuJoCo and Blender for simulation, and SAM (Segment Anything Model) for segmentation. However, it does not provide specific version numbers for these or any other software components, libraries, or programming languages used in the experiments. |
| Experiment Setup | Yes | We used the Adam optimizer with an initial learning rate of 0.01 for training and set the batch size at 24. The training process was conducted over 25 epochs, and we set the hyper-parameters λ1, λ2, and λ3 to 1, 2, and 1, respectively. |
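The reported loss weights (λ1=1, λ2=2, λ3=1) imply a linear combination of three loss terms in the training objective. A minimal sketch of that weighting, assuming a simple weighted sum; the individual terms `l1`, `l2`, `l3` are placeholders, not the paper's actual loss definitions:

```python
# Reported hyper-parameters from Section 5.1: λ1=1, λ2=2, λ3=1.
LAMBDA = (1.0, 2.0, 1.0)

def total_loss(l1: float, l2: float, l3: float) -> float:
    """Combine three per-term losses with the reported λ weights.

    The terms themselves are hypothetical stand-ins; only the
    weighting scheme is taken from the paper's reported setup.
    """
    lam1, lam2, lam3 = LAMBDA
    return lam1 * l1 + lam2 * l2 + lam3 * l3

# Example: per-term losses of 0.5, 0.25, 0.1
# combine as 1*0.5 + 2*0.25 + 1*0.1 = 1.1
print(total_loss(0.5, 0.25, 0.1))
```

The other reported settings (Adam optimizer, initial learning rate 0.01, batch size 24, 25 epochs) would configure the training loop around this objective.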