Acoustic NLOS Imaging with Cross Modal Knowledge Distillation

Authors: Ui-Hyeon Shin, Seungwoo Jang, Kwangsu Kim

IJCAI 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Additionally, we evaluate real-world datasets and demonstrate that the proposed method outperforms state-of-the-art methods in acoustic NLOS imaging. The experimental results indicate that CMKD is an effective solution for addressing the limitations of current acoustic NLOS imaging methods."
Researcher Affiliation | Academia | Ui-Hyeon Shin (Department of Artificial Intelligence, Sungkyunkwan University, Korea); Seungwoo Jang (Department of Artificial Intelligence, Sungkyunkwan University, Korea); Kwangsu Kim (College of Computing and Informatics, Sungkyunkwan University, Korea)
Pseudocode | No | The paper describes the network architecture and methodology in detail but does not include any explicit pseudocode blocks or algorithm listings.
Open Source Code | Yes | "Our code, model, and data are available at https://github.com/shineh96/Acoustic-NLOS-CMKD."
Open Datasets | Yes | "To facilitate this task, we collect a large dataset of 3,600 corresponding frames that consist of RGB images, depth maps, and multi-channel audio. We collect a new acoustic NLOS dataset and make it available to the public. Our code, model, and data are available at https://github.com/shineh96/Acoustic-NLOS-CMKD."
Dataset Splits | Yes | "The data for the training objects are divided into 1920 samples for training, 240 samples for validation, and 240 samples for testing." (An 80/10/10 split; see the split sketch below the table.)
Hardware Specification | No | The paper describes the acoustic system used for data acquisition (speakers, microphones, audio interface, power amplifier) but does not give the computational hardware (e.g., GPU or CPU models) used for training or running the experiments.
Software Dependencies | No | The paper does not provide version numbers for software dependencies such as the programming language, libraries, or frameworks used in the experiments.
Experiment Setup | Yes | "We set α to 100 and β to 0.01. The image network is trained using only the depth loss, which is the pointwise L1 error between the estimated depth map and the actual depth map. We utilize a conditional adversarial network loss based on Batvision." (See the loss sketch below the table.)
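
The Dataset Splits row reports 1920/240/240 samples, i.e., an 80/10/10 partition of the 2,400 training-object samples. The following is a minimal sketch of such a split; the function name, the random seed, and the use of a shuffled index list are illustrative assumptions, not taken from the released code.

```python
# Hypothetical sketch of the reported 1920/240/240 (80/10/10) split.
# The seed and shuffling strategy are assumptions for illustration only.
import random

def split_indices(n_samples=2400, n_train=1920, n_val=240, n_test=240, seed=0):
    assert n_train + n_val + n_test == n_samples
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return (indices[:n_train],
            indices[n_train:n_train + n_val],
            indices[n_train + n_val:])

train_idx, val_idx, test_idx = split_indices()
print(len(train_idx), len(val_idx), len(test_idx))  # 1920 240 240
```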
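
The Experiment Setup row describes two loss terms: a pointwise L1 depth loss (the only loss used for the image network) and a conditional adversarial loss based on Batvision, with weights α = 100 and β = 0.01. Below is a hedged PyTorch sketch of how these terms could combine into a generator objective; that α weights the depth term and β the adversarial term, and the non-saturating BCE form of the cGAN loss, are assumptions about the paper's setup rather than confirmed details.

```python
# Hedged sketch of a combined objective with the reported weights.
# How alpha and beta are applied is an assumption, not the authors' code.
import torch
import torch.nn.functional as F

ALPHA, BETA = 100.0, 0.01  # weights reported in the paper

def generator_loss(pred_depth, gt_depth, disc_logits_fake):
    # Pointwise L1 error between estimated and ground-truth depth maps.
    depth_loss = F.l1_loss(pred_depth, gt_depth)
    # Non-saturating cGAN generator term: push the discriminator to
    # predict "real" (label 1) for generated depth maps.
    adv_loss = F.binary_cross_entropy_with_logits(
        disc_logits_fake, torch.ones_like(disc_logits_fake))
    return ALPHA * depth_loss + BETA * adv_loss

# Toy usage with random tensors; shapes are illustrative only.
pred = torch.rand(1, 1, 128, 128)
gt = torch.rand(1, 1, 128, 128)
logits = torch.randn(1, 1, 14, 14)  # e.g., a PatchGAN-style discriminator output
print(generator_loss(pred, gt, logits))
```

Under this reading, the large α keeps the reconstruction term dominant while the small β adds a light adversarial sharpening pressure, which matches the paper's statement that the image network can be trained with the depth loss alone.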