Neural Experts: Mixture of Experts for Implicit Neural Representations

Authors: Yizhak Ben-Shabat, Chamin Hewa Koneputugodage, Sameera Ramasinghe, Stephen Gould

NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We evaluate the effectiveness of our approach on multiple reconstruction tasks, including surface reconstruction, image reconstruction, and audio signal reconstruction, and show improved performance compared to non-MoE methods. |
| Researcher Affiliation | Collaboration | Yizhak Ben-Shabat (Roblox, The Australian National University) sitzikbs@gmail.com; Chamin Hewa Koneputugodage (The Australian National University) chamin.hewa@anu.edu.au; Sameera Ramasinghe (Amazon, Australia) sameera.ramasinghe@adelaide.edu.au; Stephen Gould (The Australian National University) stephen.gould@anu.edu.au |
| Pseudocode | No | The paper describes the architecture and method steps in text and diagrams, but it does not include a clearly labeled pseudocode or algorithm block. |
| Open Source Code | Yes | Code is available at our project page https://sitzikbs.github.io/neural-experts-projectpage/. |
| Open Datasets | Yes | We conduct a comprehensive evaluation of image reconstruction on the full Kodak dataset [12] (24 images) in Table 1. |
| Dataset Splits | No | The paper describes training for 30K iterations and then training only the experts for a final period, but it does not specify a distinct validation dataset or split used for hyperparameter tuning or early stopping. |
| Hardware Specification | Yes | These models were trained on an NVIDIA A5000 GPU. ... We run our surface reconstruction experiments on a single RTX 3090 (24GB VRAM). |
| Software Dependencies | No | The paper mentions using an "Adam optimizer" and, implicitly, deep learning frameworks, but it does not specify software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x, CUDA 11.x). |
| Experiment Setup | Yes | For our approach we used 4 experts, 2 hidden layers for the encoder and 2 for the experts. The manager has a similar architecture with 2 layers for the manager encoder and 2 for the final manager block. Each layer has 128 elements. All models were trained using an Adam optimizer with a learning rate of 10^-5 and exponential decay. All models are trained for 30K iterations, where for our approach we use t_all = 80% and t_e = 20%. |
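
The Experiment Setup row above describes the full architecture, so a small code sketch helps with orientation. Below is a minimal PyTorch sketch, not the authors' implementation, of the described layout: a shared 2-layer encoder feeding 4 experts, plus a manager with its own 2-layer encoder and a 2-layer scoring block, with all layers 128 units wide. The activation function, input/output dimensions, where exactly the output layers sit, and the gate-weighted combination of expert outputs are assumptions for illustration; they are not specified in the quoted text.

```python
# Minimal sketch of the described Neural Experts layout (not the authors'
# implementation). Layer counts and widths follow the quoted setup; the
# activation, input/output dimensions, and gating rule are assumptions.
import torch
import torch.nn as nn


def mlp(dims, act=nn.ReLU, final_act=True):
    """Stack of Linear layers with an activation after each, optionally
    omitting the activation after the last layer."""
    layers = []
    for i, (d_in, d_out) in enumerate(zip(dims[:-1], dims[1:])):
        layers.append(nn.Linear(d_in, d_out))
        if final_act or i < len(dims) - 2:
            layers.append(act())
    return nn.Sequential(*layers)


class NeuralExpertsINR(nn.Module):
    def __init__(self, in_dim=2, out_dim=3, width=128, num_experts=4):
        super().__init__()
        # Shared encoder: 2 hidden layers of 128 units.
        self.encoder = mlp([in_dim, width, width])
        # 4 experts: 2 layers of 128 units each plus an output head
        # (whether the head counts as one of the "2" is an assumption).
        self.experts = nn.ModuleList(
            [mlp([width, width, width, out_dim], final_act=False)
             for _ in range(num_experts)]
        )
        # Manager: its own 2-layer encoder and a 2-layer block that
        # scores the experts.
        self.manager_encoder = mlp([in_dim, width, width])
        self.manager_head = mlp([width, width, num_experts], final_act=False)

    def forward(self, coords):
        feats = self.encoder(coords)                                # (B, 128)
        expert_out = torch.stack(
            [expert(feats) for expert in self.experts], dim=1)      # (B, E, out)
        gate = torch.softmax(
            self.manager_head(self.manager_encoder(coords)), dim=-1)  # (B, E)
        # Assumed combination rule: gate-weighted sum over experts.
        return (gate.unsqueeze(-1) * expert_out).sum(dim=1)         # (B, out)
```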
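
The two-phase schedule quoted in the Dataset Splits and Experiment Setup rows (30K Adam iterations at a learning rate of 10^-5 with exponential decay, training everything for the first 80% of iterations and only the experts for the final 20%) can be sketched as a training loop. It reuses the NeuralExpertsINR class from the sketch above; the decay rate, the reconstruction loss, and the random data source are placeholder assumptions.

```python
# Sketch of the quoted schedule: 30K Adam iterations at lr 1e-5 with
# exponential decay, all parameters for the first 80% of steps (t_all),
# experts only for the final 20% (t_e). Decay rate, loss, and data are
# placeholder assumptions.
import torch

model = NeuralExpertsINR()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.9999)  # assumed decay rate

total_iters = 30_000
expert_only_start = int(0.8 * total_iters)  # t_all = 80%, t_e = 20%


def random_batch(batch_size=1024):
    """Placeholder data source: random 2-D coordinates and RGB targets."""
    return torch.rand(batch_size, 2), torch.rand(batch_size, 3)


for it in range(total_iters):
    if it == expert_only_start:
        # Final phase: freeze everything except the expert parameters.
        for name, param in model.named_parameters():
            param.requires_grad_(name.startswith("experts"))

    coords, targets = random_batch()
    pred = model(coords)
    loss = torch.nn.functional.mse_loss(pred, targets)  # assumed reconstruction loss

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()
```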