NeRF-SOS: Any-View Self-supervised Object Segmentation on Complex Scenes
Authors: Zhiwen Fan, Peihao Wang, Yifan Jiang, Xinyu Gong, Dejia Xu, Zhangyang Wang
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive results on the LLFF, BlendedMVS, CO3Dv2, and Tank & Temples datasets validate the effectiveness of NeRF-SOS. |
| Researcher Affiliation | Academia | ¹Department of Electrical and Computer Engineering, University of Texas at Austin {zhiwenfan,atlaswang}@utexas.edu |
| Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at: https://github.com/VITA-Group/NeRF-SOS. |
| Open Datasets | Yes | We evaluate all methods on four representative datasets: Local Light Field Fusion (LLFF) dataset (Mildenhall et al., 2019), BlendedMVS (Yao et al., 2020), CO3Dv2 (Reizenstein et al., 2021), and Tank and Temples (T&T) dataset (Riegler & Koltun, 2020). |
| Dataset Splits | No | The paper mentions selecting '12.5% of total images for testing' for some datasets and refers to 'training' generally, but it does not provide explicit train/validation/test dataset splits or percentages for all datasets. |
| Hardware Specification | Yes | All models are trained on an NVIDIA RTX A6000 GPU with 48 GB memory. |
| Software Dependencies | No | The paper mentions using 'official pre-trained DINO-ViT' but does not specify version numbers for programming languages, libraries, or other key software dependencies required for reproducibility. |
| Experiment Setup | Yes | The loss weights λ0, λ1, λ2, λid, and λneg are set 0, 1, 0.01, 1 and 1 in training the segmentation branch. The segmentation branch is formulated as a four-layer MLP with ReLU as the activation function. The dimensions of hidden layers and the number of output layers are set as 256 and 2, respectively. We randomly sample eight patches from different viewpoints (a.k.a batch size N is 8) in training. The patch size of each sample is set as 64 × 64, with the patch stride as 6. Hyperparameters of NeRF-SOS on different datasets are shown in Table 5. |
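
The Experiment Setup values above can be collected into a small configuration sketch. This is not the authors' code: the names `SegBranchConfig`, `patch_extent`, `sample_patch_pixels`, and `mlp_layer_shapes` are hypothetical, as is the `in_dim` parameter (the input feature width is not stated in the row). Two readings are assumed: that "patch stride" is the pixel spacing between neighbouring sampled rays within a patch, and that "four-layer MLP" means four linear layers with a hidden width of 256 and a 2-dimensional output.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class SegBranchConfig:
    """Values quoted in the Experiment Setup row (field names are hypothetical)."""
    lambda_0: float = 0.0
    lambda_1: float = 1.0
    lambda_2: float = 0.01
    lambda_id: float = 1.0
    lambda_neg: float = 1.0
    num_layers: int = 4     # "four-layer MLP"
    hidden_dim: int = 256   # width of the hidden layers
    out_dim: int = 2        # output size of the segmentation branch
    batch_size: int = 8     # patches per step, each from a different viewpoint
    patch_size: int = 64    # a patch holds 64 x 64 sampled pixels
    patch_stride: int = 6   # ASSUMPTION: spacing in pixels between sampled rays


def patch_extent(cfg: SegBranchConfig) -> int:
    """Side length, in image pixels, covered by one strided patch."""
    return (cfg.patch_size - 1) * cfg.patch_stride + 1


def sample_patch_pixels(height, width, cfg, rng):
    """Pick a random strided patch and return its (row, col) pixel coordinates."""
    extent = patch_extent(cfg)
    if extent > height or extent > width:
        raise ValueError("image is smaller than the strided patch footprint")
    top = rng.randrange(height - extent + 1)
    left = rng.randrange(width - extent + 1)
    rows = range(top, top + extent, cfg.patch_stride)
    cols = range(left, left + extent, cfg.patch_stride)
    return [(r, c) for r in rows for c in cols]


def mlp_layer_shapes(cfg: SegBranchConfig, in_dim: int):
    """(in, out) sizes of each linear layer; in_dim is a hypothetical input width."""
    dims = [in_dim] + [cfg.hidden_dim] * (cfg.num_layers - 1) + [cfg.out_dim]
    return list(zip(dims[:-1], dims[1:]))


if __name__ == "__main__":
    cfg = SegBranchConfig()
    rng = random.Random(0)
    pixels = sample_patch_pixels(400, 500, cfg, rng)
    print(patch_extent(cfg), len(pixels), mlp_layer_shapes(cfg, in_dim=256))
```

Under the stride reading assumed here, one 64 × 64 patch at stride 6 spans (64 − 1) · 6 + 1 = 379 pixels per side, so the sketch rejects images smaller than that footprint. The per-dataset hyperparameters in the paper's Table 5 are not reproduced here.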