Acoustic Volume Rendering for Neural Impulse Response Fields
Authors: Zitong Lan, Chenhao Zheng, Zhiwei Zheng, Mingmin Zhao
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show that AVR surpasses current leading methods by a substantial margin. Additionally, we develop an acoustic simulation platform, AcoustiX, which provides more accurate and realistic IR simulations than existing simulators. |
| Researcher Affiliation | Academia | Zitong Lan1 Chenhao Zheng2 Zhiwei Zheng1 Mingmin Zhao1 1University of Pennsylvania 2University of Washington |
| Pseudocode | No | The paper describes its methods and processes in narrative text and mathematical equations, but it does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code for AVR and AcoustiX are available at https://zitonglan.github.io/avr. |
| Open Datasets | Yes | We evaluate our model's performance on the datasets collected from real scenes. We adopt two commonly used room impulse response datasets: MeshRIR [20] and Real Acoustic Field [10]. We use our simulation platform to simulate monaural impulse responses in three rooms and evaluate all methods' performance (Tab. 2). We also include two complicated 3D rooms from iGibson dataset [24, 56]. |
| Dataset Splits | No | We use 90% of the data to train and the rest 10% for testing. The paper mentions a total loss including a multi-resolution STFT loss L_stft [58] and an energy loss L_energy similar to that in [30], but it does not specify a separate validation split. |
| Hardware Specification | Yes | The optimization process takes 24 hours on a single NVIDIA L40 GPU. |
| Software Dependencies | No | AcoustiX uses Sionna ray tracing engine [16]. The paper does not provide version numbers for Sionna or any other software dependencies such as deep learning frameworks (e.g., PyTorch, TensorFlow) or programming languages (e.g., Python). |
| Experiment Setup | Yes | The sampling numbers used in the experiments are N_θ = 80, N_ϕ = 40, and N_r = 64. We set the weights of loss components to be λ_amp = λ_phase = 0.5, λ_time = 100, λ_stft = 1, λ_energy = 5. We train our model for 200 epochs for each scene. We use Adam optimizer with a cosine learning rate scheduler that starts at a learning rate 10^-3 and decays to 10^-4. |
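
The Experiment Setup row gives enough numbers to sketch the training schedule. The snippet below is a minimal, framework-free illustration and not the authors' code. It assumes the cosine schedule is stepped once per epoch and that the total loss is a plain weighted sum of its components. The function names (`cosine_lr`, `total_loss`) and the component keys are hypothetical.

```python
import math

# Loss weights as reported in the Experiment Setup row.
# The dictionary keys are illustrative names, not identifiers from the paper's code.
LOSS_WEIGHTS = {"amp": 0.5, "phase": 0.5, "time": 100.0, "stft": 1.0, "energy": 5.0}


def cosine_lr(epoch, total_epochs=200, lr_start=1e-3, lr_end=1e-4):
    """Cosine-annealed learning rate for a 0-indexed epoch.

    Assumption: the schedule is updated once per epoch; the paper's
    summary does not say whether it steps per epoch or per iteration.
    """
    progress = min(max(epoch / total_epochs, 0.0), 1.0)
    return lr_end + 0.5 * (lr_start - lr_end) * (1.0 + math.cos(math.pi * progress))


def total_loss(components, weights=LOSS_WEIGHTS):
    """Weighted sum of per-component loss values.

    Assumption: the components are combined linearly with the reported weights.
    """
    return sum(weights[name] * components[name] for name in weights)


if __name__ == "__main__":
    # Print the learning rate at the start, midpoint, and end of training.
    for epoch in (0, 100, 200):
        print(epoch, cosine_lr(epoch))
```

With these assumptions the learning rate starts at 10^-3, passes through the midpoint of the two rates halfway through training, and ends at 10^-4 after 200 epochs. In PyTorch, the comparable built-in is `torch.optim.lr_scheduler.CosineAnnealingLR` with `eta_min` set to the final rate.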