reproducibilityindex.ai

Meta-DiffuB: A Contextualized Sequence-to-Sequence Text Diffusion Model with Meta-Exploration

Authors: Yun-Yen Chuang, Hung-Min Hsu, Kevin Lin, Chen-Sheng Gu, Ling-Zhen Li, Ray-I Chang, Hung-yi Lee

NeurIPS 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In this section, we conduct experiments to verify the performance of our Meta-DiffuB on four benchmark Seq2Seq datasets [48, 6, 17, 8].
Researcher Affiliation	Collaboration	Yun-Yen Chuang (1, 2), Hung-Min Hsu (3), Kevin Lin (4), Chen-Sheng Gu (1, 2), Ling-Zhen Li (1, 2), Ray-I Chang (2), Hung-yi Lee (2). Affiliations: (1) Maxora AI; (2) National Taiwan University; (3) University of Washington; (4) Microsoft.
Pseudocode	Yes	Algorithm 1: Meta-DiffuB
Open Source Code	Yes	Code and datasets for Meta-DiffuB are available at: https://github.com/Meta-DiffuB/Meta-DiffuB.
Open Datasets	Yes	In our experiment, we use four datasets: the Commonsense Conversation dataset (CC) [48], the Quasar-T dataset (QT) [6], the Wiki-Auto dataset (WA) [17], and the Quora Question Pairs dataset (QQP) [8].
Dataset Splits	Yes	The training set contains 3,382,137 pairs, the development set has 2,048, and the test set includes 10,000 pairs.
Hardware Specification	Yes	Experiments are conducted on NVIDIA A100 Tensor Core GPUs, utilizing 4 GPUs for training and a single GPU for inference.
Software Dependencies	No	The paper mentions general software components like 'Transformer model' and 'LSTM' but does not provide specific version numbers for any libraries or dependencies.
Experiment Setup	Yes	The diffusion step count is set at 2,000, and the maximum sequence length is 128. The Minimum Bayes risk (MBR) [23] decoding size, denoted as |S|, is 10; this involves generating sentences from 10 random seeds and selecting the best output sequence. The total batch size for both training and testing phases is 2048.
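
The MBR decoding step quoted under Experiment Setup can be illustrated with a minimal sketch. It assumes the |S| = 10 candidates have already been generated (one per random seed) and picks the candidate that agrees most, on average, with the others. The utility function here (whitespace-token overlap F1) and the names `unigram_f1` and `mbr_select` are assumptions made for illustration; the excerpt above does not state which utility the paper uses, and this is not the authors' implementation.

```python
from collections import Counter
from typing import Callable, Sequence


def unigram_f1(hyp: str, ref: str) -> float:
    """Token-overlap F1 between two whitespace-tokenized sentences.

    Assumption: a stand-in utility; the paper's metric may differ.
    """
    h, r = hyp.split(), ref.split()
    if not h or not r:
        return 0.0
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(h) & Counter(r)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(h)
    recall = overlap / len(r)
    return 2 * precision * recall / (precision + recall)


def mbr_select(
    candidates: Sequence[str],
    utility: Callable[[str, str], float] = unigram_f1,
) -> str:
    """Return the candidate with the highest mean utility against the rest."""
    if not candidates:
        raise ValueError("candidates must be non-empty")
    if len(candidates) == 1:
        return candidates[0]

    def expected_utility(i: int) -> float:
        others = [c for j, c in enumerate(candidates) if j != i]
        return sum(utility(candidates[i], c) for c in others) / len(others)

    best = max(range(len(candidates)), key=expected_utility)
    return candidates[best]


if __name__ == "__main__":
    # Hypothetical outputs standing in for samples from different seeds.
    samples = [
        "how do i learn python",
        "how can i learn python",
        "how do i learn python fast",
        "what is the weather",
    ]
    print(mbr_select(samples))
```

With |S| = 10, this selection costs 90 pairwise utility evaluations per input, which is small next to the cost of running the 2,000-step diffusion sampler ten times.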