S$^3$-NeRF: Neural Reflectance Field from Shading and Shadow under a Single Viewpoint

Authors: Wenqi Yang, Guanying Chen, Chaofeng Chen, Zhenfang Chen, Kwan-Yee K. Wong

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on multiple challenging datasets show that our method is capable of recovering 3D geometry, including both visible and invisible parts, of a scene from single-view images.
Researcher Affiliation | Collaboration | Wenqi Yang (The University of Hong Kong, wqyang@cs.hku.hk); Guanying Chen (FNii and SSE, CUHK-Shenzhen, chenguanying@cuhk.edu.cn); Chaofeng Chen (Nanyang Technological University, chaofenghust@gmail.com); Zhenfang Chen (MIT-IBM Watson AI Lab, chenzhenfang2013@gmail.com); Kwan-Yee K. Wong (The University of Hong Kong, kykwong@cs.hku.hk)
Pseudocode | No | No structured pseudocode or algorithm block was found in the paper.
Open Source Code | Yes | Our code and model can be found at https://ywq.github.io/s3nerf.
Open Datasets | Yes | Specifically, we used 10 3D objects for data rendering: 5 objects from the DiLiGenT-MV dataset [27] (namely, BEAR, BUDDHA, COW, POT2, and READING), 2 objects from the internet (namely, BUNNY and ARMADILLO), and 3 objects from NeRF's Blender dataset [35] (namely, LEGO, CHAIR, and HOTDOG).
Dataset Splits | No | The paper mentions a 'train view' and 'novel views' for evaluation but does not specify explicit training, validation, and test splits with percentages or counts.
Hardware Specification | Yes | We train each scene for 800 epochs on one NVIDIA RTX 3090 card, which takes about 16 hours to converge.
Software Dependencies | No | The paper mentions the Adam optimizer but does not specify versions for programming languages, frameworks, or other key software dependencies.
Experiment Setup | Yes | Similar to UNISURF [39], we use an 8-layer MLP (256 channels with softplus activation) to predict the occupancy $o$ and output a 256-dimensional feature vector. Two additional 4-layer MLPs then take the feature vector and point coordinates as input to predict the albedo $\rho_d$ and the weights $\omega$ of the SG bases. We sample $N_V = 256$ points along the camera ray and $N_L = 256$ points along the surface-to-light line segment. We use the Adam optimizer [23] with an initial learning rate of 0.0002, which decays at 200 and 400 epochs. We train each scene for 800 epochs on one NVIDIA RTX 3090 card, which takes about 16 hours to converge.
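
To make the quoted setup concrete, below is a minimal PyTorch-style sketch of the described networks and optimizer schedule. It is not the authors' released code: class and variable names, the number of SG bases, and the learning-rate decay factor are assumptions for illustration; only the layer counts, widths, activations, sample counts, and schedule milestones come from the quoted text.

```python
# Hypothetical sketch of the architecture and training configuration quoted above.
# Layer counts, widths, softplus activation, N_V/N_L, lr, and decay epochs follow the
# paper's description; N_SG, gamma, and all names are assumptions, not the authors' code.
import torch
import torch.nn as nn

N_V = 256    # samples along each camera ray
N_L = 256    # samples along each surface-to-light segment
N_SG = 24    # number of spherical-Gaussian bases (assumed; not stated in the quote)

class OccupancyMLP(nn.Module):
    """8-layer MLP (256 channels, softplus) predicting occupancy o and a 256-d feature."""
    def __init__(self, in_dim=3, width=256, depth=8):
        super().__init__()
        layers, d = [], in_dim
        for _ in range(depth):
            layers += [nn.Linear(d, width), nn.Softplus(beta=100)]
            d = width
        self.backbone = nn.Sequential(*layers)
        self.occ_head = nn.Linear(width, 1)      # occupancy o
        self.feat_head = nn.Linear(width, 256)   # 256-d feature vector

    def forward(self, x):
        h = self.backbone(x)
        return torch.sigmoid(self.occ_head(h)), self.feat_head(h)

class MaterialMLP(nn.Module):
    """4-layer MLP head taking (feature, point coords) and predicting
    either the albedo rho_d or the SG basis weights omega."""
    def __init__(self, out_dim, width=256, depth=4):
        super().__init__()
        layers, d = [], 256 + 3
        for _ in range(depth - 1):
            layers += [nn.Linear(d, width), nn.Softplus(beta=100)]
            d = width
        layers += [nn.Linear(d, out_dim)]
        self.net = nn.Sequential(*layers)

    def forward(self, feat, x):
        return self.net(torch.cat([feat, x], dim=-1))

occ_net = OccupancyMLP()
albedo_net = MaterialMLP(out_dim=3)        # rho_d (RGB albedo)
sg_weight_net = MaterialMLP(out_dim=N_SG)  # omega (SG basis weights)

params = (list(occ_net.parameters()) + list(albedo_net.parameters())
          + list(sg_weight_net.parameters()))
optimizer = torch.optim.Adam(params, lr=2e-4)
# Decay the learning rate at epochs 200 and 400; train for 800 epochs in total.
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[200, 400], gamma=0.5)  # gamma is assumed
```

In a full training loop, scheduler.step() would be called once per epoch so the learning rate drops at epochs 200 and 400, with training run to 800 epochs as described.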