Learning High-Frequency Functions Made Easy with Sinusoidal Positional Encoding

Authors: Chuanhao Sun, Zhihang Yuan, Kai Xu, Luo Mai, Siddharth N, Shuo Chen, Mahesh K. Marina

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our experiments demonstrate that SPE, without hyperparameter tuning, consistently achieves enhanced fidelity and faster training across various tasks, including 3D view synthesis, Text-to-Speech generation, and 1D regression."
Researcher Affiliation | Collaboration | "1 The University of Edinburgh, Edinburgh, UK; 2 MIT-IBM Watson AI Lab, Cambridge, MA, US."
Pseudocode | No | The paper includes mathematical equations and figures, but it does not contain any clearly labeled "Pseudocode" or "Algorithm" blocks with structured steps.
Open Source Code | Yes | Code: github.com/zhyuan11/SPE
Open Datasets | Yes | "We assess a few-view NeRF model (Mildenhall et al., 2021) with varied L for different objects on the Blender dataset (used in (Sitzmann et al., 2019; Mildenhall et al., 2021))."
Dataset Splits | No | The paper states "We use the 8-views setup that is aligned with FreeNeRF (Yang et al., 2023) and DietNeRF (Jain et al., 2021)" but does not provide specific percentages or counts for training, validation, or test splits. It refers to "training view number" in figures but gives no explicit split ratios.
Hardware Specification | No | The paper does not provide specific details about the hardware used for the experiments, such as GPU models, CPU types, or memory specifications.
Software Dependencies | No | The paper mentions using "Jax (Bradbury et al., 2018) and the same Python library (Novak et al., 2019)" but does not specify version numbers for these or other software dependencies, which are necessary for a reproducible setup.
Experiment Setup | Yes | "To ensure a fair comparison, we use L = 10 for the space position input (i.e., the (x, y, z) in Figure 5), and use L = 4 for the viewing direction input (i.e., the dir in Figure 5), which is the empirical configuration on the Blender dataset. [...] For the training with SPE on FreeNeRF, we find that it is effective to train FreeNeRF and SPE with adversarial loss to minimize the Wasserstein distance to the target view."
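To make the L = 10 / L = 4 configuration concrete, here is a minimal sketch of the classic NeRF-style sinusoidal positional encoding that these L values parameterize (this is the standard baseline encoding, not the paper's learnable SPE variant; the function name is illustrative):

```python
import math

def positional_encoding(p, L):
    """Classic NeRF sinusoidal encoding gamma(p).

    Each input coordinate x is mapped to
    [sin(2^0 * pi * x), cos(2^0 * pi * x), ..., sin(2^(L-1) * pi * x), cos(2^(L-1) * pi * x)],
    so a D-dimensional input yields a vector of length D * 2 * L.
    """
    out = []
    for x in p:
        for k in range(L):
            freq = (2.0 ** k) * math.pi  # frequency band 2^k * pi
            out.append(math.sin(freq * x))
            out.append(math.cos(freq * x))
    return out

# Per the setup quoted above: L = 10 for the (x, y, z) position,
# L = 4 for the viewing direction.
xyz = [0.1, -0.3, 0.7]
print(len(positional_encoding(xyz, L=10)))  # 60 (= 3 * 2 * 10)
direction = [0.0, 1.0, 0.0]
print(len(positional_encoding(direction, L=4)))  # 24 (= 3 * 2 * 4)
```

With L = 10, each 3D position expands to 60 features (plus the raw coordinates, in the usual NeRF setup), which is the high-frequency input representation that SPE aims to improve upon.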