reproducibilityindex.ai

Acoustic Volume Rendering for Neural Impulse Response Fields

Authors: Zitong Lan, Chenhao Zheng, Zhiwei Zheng, Mingmin Zhao

NeurIPS 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experiments show that AVR surpasses current leading methods by a substantial margin. Additionally, we develop an acoustic simulation platform, AcoustiX, which provides more accurate and realistic IR simulations than existing simulators.
Researcher Affiliation	Academia	Zitong Lan (1), Chenhao Zheng (2), Zhiwei Zheng (1), Mingmin Zhao (1); (1) University of Pennsylvania, (2) University of Washington
Pseudocode	No	The paper describes its methods and processes in narrative text and mathematical equations, but it does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code	Yes	Code for AVR and AcoustiX are available at https://zitonglan.github.io/avr.
Open Datasets	Yes	We evaluate our model's performance on the datasets collected from real scenes. We adopt two commonly used room impulse response datasets: MeshRIR [20] and Real Acoustic Field [10]. We use our simulation platform to simulate monaural impulse responses in three rooms and evaluate all methods' performance (Tab. 2). We also include two complicated 3D rooms from iGibson dataset [24, 56].
Dataset Splits	No	We use 90% of the data to train and the rest 10% for testing. The paper mentions a total loss including a multiresolution STFT loss L_stft [58] and an energy loss L_energy similar to that in [30], but it does not specify a separate validation dataset split.
Hardware Specification	Yes	The optimization process takes 24 hours on a single NVIDIA L40 GPU.
Software Dependencies	No	AcoustiX uses the Sionna ray tracing engine [16]. The paper does not provide version numbers for Sionna or any other software dependencies such as deep learning frameworks (e.g., PyTorch, TensorFlow) or programming languages (e.g., Python).
Experiment Setup	Yes	The sampling numbers used in the experiments are N_θ = 80, N_ϕ = 40, and N_r = 64. We set the weights of loss components to be λ_amp = λ_phase = 0.5, λ_time = 100, λ_stft = 1, λ_energy = 5. We train our model for 200 epochs for each scene. We use Adam optimizer with a cosine learning rate scheduler that starts at a learning rate 10^-3 and decays to 10^-4.
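
The Experiment Setup and Dataset Splits rows can be read as a small training configuration. The sketch below is illustrative only and is not taken from the AVR code release: the function names (cosine_lr, total_loss, split_indices) are hypothetical, the standard cosine-annealing formula is assumed, the schedule is assumed to step once per epoch, and the 90/10 split is assumed to be a seeded random shuffle. None of these details are stated in the quoted text. Only the numeric values come from the table.

```python
import math
import random

# Values quoted in the Experiment Setup and Dataset Splits rows.
EPOCHS = 200
LR_START, LR_END = 1e-3, 1e-4
LOSS_WEIGHTS = {"amp": 0.5, "phase": 0.5, "time": 100.0, "stft": 1.0, "energy": 5.0}


def cosine_lr(epoch, total=EPOCHS, lr_start=LR_START, lr_end=LR_END):
    """Cosine annealing from lr_start (epoch 0) to lr_end (epoch == total).

    Assumption: one scheduler step per epoch; the source does not say.
    """
    progress = min(max(epoch / total, 0.0), 1.0)
    return lr_end + 0.5 * (lr_start - lr_end) * (1.0 + math.cos(math.pi * progress))


def total_loss(terms, weights=LOSS_WEIGHTS):
    """Weighted sum of per-component loss values, keyed like `weights`."""
    return sum(weights[name] * terms[name] for name in weights)


def split_indices(n, train_frac=0.9, seed=0):
    """Shuffle indices 0..n-1 and cut them into train and test lists.

    Assumption: a seeded random split; the source only gives the 90/10 ratio.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = round(n * train_frac)
    return idx[:cut], idx[cut:]


if __name__ == "__main__":
    for epoch in (0, 100, 200):
        print(epoch, cosine_lr(epoch))
    train, test = split_indices(1000)
    print(len(train), len(test))
```

With these weights the time-domain term dominates the total whenever the component losses are of similar magnitude, which is worth keeping in mind when comparing against a reimplementation that uses different scaling.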