DOCTR: Disentangled Object-Centric Transformer for Point Scene Understanding
Authors: Xiaoxuan Yu, Hao Wang, Weiming Li, Qiang Wang, Soonyong Cho, Younghun Sung
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Qualitative and quantitative experimental results demonstrate that our method achieves state-of-the-art performance on the challenging ScanNet dataset. |
| Researcher Affiliation | Industry | Xiaoxuan Yu¹*, Hao Wang¹, Weiming Li¹, Qiang Wang¹, Soonyong Cho², Younghun Sung²; ¹Samsung Research China Beijing, ²Samsung Advanced Institute of Technology; xiaoxuan1.yu@samsung.com, hao1.wang@samsung.com, weiming.li@samsung.com, qiang.w@samsung.com, soonyong.cho@samsung.com, younghun.sung@samsung.com |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | Code is available at https://github.com/SAITPublic/DOCTR. |
| Open Datasets | Yes | We follow exactly the same data split and pre-processing method as DIMR. |
| Dataset Splits | Yes | We follow exactly the same data split and pre-processing method as DIMR. |
| Hardware Specification | Yes | on a single Nvidia RTX A6000 GPU |
| Software Dependencies | No | The paper mentions the 'AdamW ... optimizer' and 'Minkowski Res16UNet34C', but does not provide specific version numbers for programming languages (e.g., Python), deep learning frameworks (e.g., PyTorch), or other libraries (e.g., CUDA) required for reproducibility. |
| Experiment Setup | Yes | During training, we use the AdamW (Loshchilov and Hutter 2017) optimizer for 600 epochs with a batch size of 5 on a single Nvidia RTX A6000 GPU for all the experiments. A one-cycle learning rate schedule (Smith and Topin 2019) is utilized with a maximum learning rate of 10⁻⁴ and a minimum learning rate of 10⁻⁶. Standard data augmentations are performed on the point cloud, including horizontal flipping, random rotations around the z-axis, elastic distortion, and random scaling. |
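The training recipe quoted in the last row maps onto standard deep learning tooling. Below is a minimal sketch of that setup, assuming PyTorch (the paper names no framework or versions). Only the max/min learning rates, epoch count, batch size, and augmentation list come from the paper; the function names, weight decay, one-cycle `div_factor`/`final_div_factor` split, and the scaling range are illustrative assumptions.

```python
import numpy as np
import torch
from torch.optim import AdamW
from torch.optim.lr_scheduler import OneCycleLR

def build_optimizer_and_scheduler(model, steps_per_epoch):
    # Max LR 1e-4 as reported; weight decay is an assumption (paper gives none).
    optimizer = AdamW(model.parameters(), lr=1e-4, weight_decay=1e-4)
    # div_factor=25 starts the cycle at 4e-6; final_div_factor=4 then anneals
    # down to the reported minimum of 1e-6. This split is assumed, not stated.
    scheduler = OneCycleLR(
        optimizer,
        max_lr=1e-4,
        epochs=600,                      # as reported in the paper
        steps_per_epoch=steps_per_epoch,
        div_factor=25.0,
        final_div_factor=4.0,
    )
    return optimizer, scheduler

def augment_points(xyz, rng):
    """Horizontal flip, random z-axis rotation, and random scaling on an
    (N, 3) point cloud; elastic distortion is omitted from this sketch."""
    xyz = xyz.copy()
    if rng.random() < 0.5:                       # horizontal flip
        xyz[:, 0] = -xyz[:, 0]
    theta = rng.uniform(0.0, 2.0 * np.pi)        # rotation around the z-axis
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    xyz = xyz @ rot.T
    scale = rng.uniform(0.9, 1.1)                # scaling range is assumed
    return xyz * scale
```

With `OneCycleLR`, `scheduler.step()` is called once per optimizer step, so `steps_per_epoch` must equal the number of batches per epoch (dataset size divided by the reported batch size of 5).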