Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
OmniResponse: Online Multimodal Conversational Response Generation in Dyadic Interactions
Authors: Cheng Luo, Jianghui Wang, Bing Li, Siyang Song, Bernard Ghanem
NeurIPS 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Comprehensive evaluations on ResponseNet demonstrate that OmniResponse outperforms baseline models in terms of semantic speech content, audio-visual synchronization, and generation quality. Our dataset, code, and models are publicly available at https://omniresponse.github.io/. |
| Researcher Affiliation | Academia | 1King Abdullah University of Science and Technology, 2University of Exeter |
| Pseudocode | No | The paper describes the model architecture and methodology in detail, but it does not include any explicitly labeled pseudocode or algorithm blocks. Figure 3 illustrates the architecture of TempoVoice but is a diagram, not pseudocode. |
| Open Source Code | No | Our dataset, code, and models are publicly available at https://omniresponse.github.io/. NeurIPS Paper Checklist Question 5: Does the paper provide open access to the data and code...? Answer: [No] Justification: All code and data will be made available upon acceptance of the paper. |
| Open Datasets | No | To fill the dataset gap, we introduce ResponseNet that comprises 696 temporally synchronized dyadic video pairs, totaling over 14 hours of natural conversational exchanges. Our dataset, code, and models are publicly available at https://omniresponse.github.io/. NeurIPS Paper Checklist Question 5: Does the paper provide open access to the data and code...? Answer: [No] Justification: All code and data will be made available upon acceptance of the paper. |
| Dataset Splits | No | The paper mentions evaluating on the "ResponseNet test set" in Table 2, but it does not provide specific details on how the dataset is split into training, validation, and test sets (e.g., percentages or sample counts) in the main text. |
| Hardware Specification | Yes | Our framework was implemented using PyTorch [52] and trained on four NVIDIA Tesla A100 GPUs. |
| Software Dependencies | No | Our framework was implemented using PyTorch [52] and trained on four NVIDIA Tesla A100 GPUs. The model optimization was performed using the AdamW optimizer [33] with a learning rate of 2 × 10⁻⁵, β1 = 0.9, β2 = 0.999, and a weight decay of 10⁻⁴, accompanied by a cosine learning rate scheduler. Training was executed with a batch size of one for 2,000 epochs. Additionally, we fine-tuned the LLM using the LoRA [26] technique with a LoRA rank of 64 and a LoRA alpha value of 16. While PyTorch, AdamW, LoRA, Spark-TTS, and MossFormer2 are mentioned, specific version numbers for these software dependencies are not provided. |
| Experiment Setup | Yes | The model optimization was performed using the AdamW optimizer [33] with a learning rate of 2 × 10⁻⁵, β1 = 0.9, β2 = 0.999, and a weight decay of 10⁻⁴, accompanied by a cosine learning rate scheduler. Training was executed with a batch size of one for 2,000 epochs. Additionally, we fine-tuned the LLM using the LoRA [26] technique with a LoRA rank of 64 and a LoRA alpha value of 16. |
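
The hyperparameters quoted in the Experiment Setup row can be made concrete with a small sketch. The code below is illustrative only and is not the authors' implementation: it uses the textbook cosine annealing formula and the conventional LoRA scaling factor alpha / rank. The base learning rate, epoch count, LoRA rank, and LoRA alpha come from the table; the absence of warmup, the minimum learning rate of zero, per-epoch stepping, and all function names are assumptions, since the quoted text does not specify them.

```python
import math

# Values quoted in the Experiment Setup row.
BASE_LR = 2e-5        # learning rate
TOTAL_EPOCHS = 2000   # training length
LORA_RANK = 64        # LoRA rank
LORA_ALPHA = 16       # LoRA alpha

# Assumption (not stated in the source): the schedule decays to zero,
# has no warmup phase, and is stepped once per epoch.
MIN_LR = 0.0


def cosine_lr(epoch: int,
              base_lr: float = BASE_LR,
              total: int = TOTAL_EPOCHS,
              min_lr: float = MIN_LR) -> float:
    """Textbook cosine annealing: base_lr at epoch 0, min_lr at epoch `total`."""
    # Clamp the epoch so out-of-range values do not restart the cosine curve.
    progress = min(max(epoch, 0), total) / total
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


def lora_scaling(alpha: float = LORA_ALPHA, rank: int = LORA_RANK) -> float:
    """Conventional LoRA scaling factor applied to the low-rank update."""
    return alpha / rank


if __name__ == "__main__":
    for epoch in (0, 500, 1000, 1500, 2000):
        print(f"epoch {epoch:4d}: lr = {cosine_lr(epoch):.3e}")
    print(f"LoRA scaling (alpha / rank) = {lora_scaling()}")
```

Under these assumptions the learning rate is halved at the midpoint of training, and with a rank of 64 and an alpha of 16 the low-rank update is scaled by 0.25. A real training run would more likely rely on the optimizer and scheduler classes of the framework in use than on hand-written functions like these.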