Spline Positional Encoding for Learning 3D Implicit Signed Distance Fields

Authors: Peng-Shuai Wang, Yang Liu, Yu-Qi Yang, Xin Tong

IJCAI 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We verified the superiority of our approach over other positional encoding schemes on the tasks of 3D shape reconstruction from input point clouds and shape space learning. The efficacy of our approach when extended to image reconstruction is also demonstrated and evaluated. Our implementation is based on PyTorch, and all experiments were done on a desktop PC with an Intel Core i7 CPU (3.6 GHz) and a GeForce 2080 Ti GPU (11 GB memory).
Researcher Affiliation | Collaboration | Peng-Shuai Wang¹, Yang Liu¹, Yu-Qi Yang²,¹, Xin Tong¹ (¹Microsoft Research Asia, ²Tsinghua University); {penwan, t-yuqyan, yangliu, xtong}@microsoft.com
Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks.
Open Source Code | Yes | Our code and trained models are available at https://wang-ps.github.io/spe.
Open Datasets | Yes | We collect 7 3D shapes as the benchmark, which include detailed geometric textures (Bunny, Armadillo, and Gargoyle), smooth surfaces (Bimba and Dragon), and sharp features (Fandisk). The D-Faust point cloud is produced by a real scanner provided by [Bogo et al., 2017]. We conducted the experiment on the D-Faust dataset [Bogo et al., 2017] by following the setup of [Gropp et al., 2020].
Dataset Splits | No | The paper specifies training and testing sets for D-Faust (6258 shapes for training and 181 for testing) but does not mention a separate validation split for any dataset used.
Hardware Specification | Yes | Our implementation is based on PyTorch, and all experiments were done with a desktop PC with an Intel Core i7 CPU (3.6 GHz) and GeForce 2080 Ti GPU (11 GB memory).
Software Dependencies | No | The paper names PyTorch as the implementation framework but does not specify its version or list any other software dependencies with their versions.
Experiment Setup | Yes | By default, we use an MLP with 4 fully-connected (FC) layers with the Softplus activation function, each of which contains 256 hidden units, and choose linear B-spline bases for SPE. In each iteration during the training stage, we randomly sample 10k to 20k points from the input point cloud and the same number of random points from the 3D bounding box containing the shape. We set the parameters of SPE to K = 256, C = 64, M = 3, resulting in a 64-dimensional encoding for each point. The parameters λ and τ in Eq. (1) are set to 0.1 and 1. The MLP and SPE are optimized via the Adam [Kingma and Ba, 2014] solver with a learning rate of 0.0001, without weight decay or normalization techniques. For the multi-scale optimization, we first initialize SPE with K = 2, then progressively increase K to 8, 32, 128, and 256, with the initialization method provided in Eq. (6).
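
The setup quoted above maps onto a short PyTorch sketch. The `SplinePositionalEncoding` below is a simplified stand-in, not the authors' released implementation (available at the link above): for each of C channels it linearly interpolates K learned control values along each of M projection axes and sums the result, matching the linear B-spline choice and the K = 256, C = 64, M = 3 defaults. The coordinate-axis projections, the default Softplus parameters, and the loss of Eq. (1) are assumptions here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SplinePositionalEncoding(nn.Module):
    """Simplified linear B-spline encoding with K knots, C channels, M axes."""
    def __init__(self, K=256, C=64, M=3):
        super().__init__()
        self.K, self.C, self.M = K, C, M
        # K learned control values per channel per projection axis.
        self.control = nn.Parameter(torch.randn(M, C, K) * 0.01)
        # Projection directions; the coordinate axes are assumed here.
        self.register_buffer('axes', torch.eye(3)[:M])

    def forward(self, xyz):                          # xyz: (N, 3) in [-1, 1]
        t = (xyz @ self.axes.T + 1) / 2 * (self.K - 1)   # (N, M) knot coords
        i0 = t.floor().long().clamp(0, self.K - 2)       # left knot index
        w = (t - i0.float()).unsqueeze(1)                # (N, 1, M) weights
        idx = i0.unsqueeze(1).expand(-1, self.C, -1)     # (N, C, M)
        ctrl = self.control.permute(2, 1, 0)             # (K, C, M)
        v0 = ctrl.gather(0, idx)                         # value at left knot
        v1 = ctrl.gather(0, idx + 1)                     # value at right knot
        feat = (1 - w) * v0 + w * v1                     # linear interpolation
        return feat.sum(dim=-1)                          # (N, C), sum over axes

# Default network: SPE features into a 4-layer Softplus MLP with 256 units.
spe = SplinePositionalEncoding(K=256, C=64, M=3)
mlp = nn.Sequential(
    nn.Linear(64, 256), nn.Softplus(),
    nn.Linear(256, 256), nn.Softplus(),
    nn.Linear(256, 256), nn.Softplus(),
    nn.Linear(256, 1),
)
# Adam at lr = 1e-4, no weight decay, per the quoted setup.
optimizer = torch.optim.Adam(
    list(spe.parameters()) + list(mlp.parameters()), lr=1e-4)

# One iteration's point sampling: 10k surface samples plus 10k random
# samples from the bounding box (here assumed to be [-1, 1]^3).
surface_pts = torch.rand(10_000, 3) * 2 - 1   # placeholder surface samples
box_pts = torch.rand(10_000, 3) * 2 - 1
sdf_pred = mlp(spe(torch.cat([surface_pts, box_pts], dim=0)))
```

For the multi-scale schedule, one plausible reading of the Eq. (6) initialization is that the control values at the finer knot set are obtained by sampling the coarse spline; for uniform linear B-splines this reproduces the coarse function exactly, which a single interpolation call can express:

```python
def refine_control(control, new_K):
    # (M, C, K_old) -> (M, C, new_K); align_corners keeps endpoints fixed,
    # so the refined spline coincides with the coarse one at all points.
    return F.interpolate(control, size=new_K, mode='linear', align_corners=True)
```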