DOC2PPT: Automatic Presentation Slides Generation from Scientific Documents

Authors: Tsu-Jui Fu, William Yang Wang, Daniel McDuff, Yale Song

AAAI 2022, pp. 634-642

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We release a dataset of about 6K paired documents and slide decks used in our experiments. We show that our approach outperforms strong baselines and produces slides with rich content and aligned imagery.
Researcher Affiliation | Collaboration | Tsu-Jui Fu¹, William Yang Wang¹, Daniel McDuff², Yale Song²; ¹ UC Santa Barbara, ² Microsoft Research
Pseudocode | No | The paper includes architectural diagrams and mathematical equations but does not present any pseudocode or clearly labeled algorithm blocks.
Open Source Code | Yes | Project webpage: https://doc2ppt.github.io/
Open Datasets | Yes | To help accelerate research in this domain, we release a dataset of about 6K paired documents and slide decks used in our experiments. Project webpage: https://doc2ppt.github.io/
Dataset Splits | Yes | Table 1: Descriptive statistics of our dataset. We report both the total count and the average number (in parentheses). Train / Val / Test: CV 2,073 / 265 / 262; NLP 741 / 93 / 97; ML 1,872 / 234 / 236; Total 4,686 / 592 / 595.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments, only general statements like 'We train our network end-to-end'.
Software Dependencies | No | The paper mentions software components like RoBERTa, ResNet-152, Bi-GRU, Seq2Seq, and ADAM, but it does not specify version numbers for these software dependencies.
Experiment Setup | Yes | For the DR, we use a Bi-GRU with 1,024 hidden units and set the MLPs to output 1,024-dimensional embeddings. Each layer of the PT is based on a 256-unit GRU. The PAR is designed as Seq2Seq (Bahdanau, Cho, and Bengio 2015) with 512-unit GRU. We train our network end-to-end using ADAM (Diederik P. Kingma 2014) with learning rate 3e-4. We tune the two hyper-parameters θR and θA via cross-validation (we set θR = 0.8, θA = 0.9).
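
To make the reported experiment setup easier to relate to code, the sketch below instantiates the stated module sizes in PyTorch. It is not the authors' released implementation: the class names (DocumentReader, ProgressTracker), the input feature dimension, and the MLP layout are assumptions, and the Seq2Seq PAR module with its 512-unit GRU is omitted.

```python
# Minimal sketch of the reported module sizes, assuming 1,024-d input features.
import torch
import torch.nn as nn

class DocumentReader(nn.Module):
    # DR: Bi-GRU with 1,024 hidden units; MLPs output 1,024-dimensional embeddings.
    def __init__(self, in_dim=1024, hidden=1024, out_dim=1024):
        super().__init__()
        self.bigru = nn.GRU(in_dim, hidden, batch_first=True, bidirectional=True)
        self.mlp = nn.Sequential(
            nn.Linear(2 * hidden, hidden),  # MLP layout is an assumption
            nn.ReLU(),
            nn.Linear(hidden, out_dim),
        )

    def forward(self, x):                   # x: (batch, seq_len, in_dim)
        h, _ = self.bigru(x)                # h: (batch, seq_len, 2 * hidden)
        return self.mlp(h)                  # -> (batch, seq_len, out_dim)

class ProgressTracker(nn.Module):
    # PT: each layer is a 256-unit GRU (the hierarchical structure is omitted here).
    def __init__(self, in_dim=1024, hidden=256):
        super().__init__()
        self.gru = nn.GRU(in_dim, hidden, batch_first=True)

    def forward(self, x):
        out, _ = self.gru(x)
        return out

# Thresholds reported in the paper, tuned via cross-validation.
THETA_R, THETA_A = 0.8, 0.9

if __name__ == "__main__":
    dr, pt = DocumentReader(), ProgressTracker()
    # End-to-end training with ADAM, learning rate 3e-4, as reported.
    optim = torch.optim.Adam(list(dr.parameters()) + list(pt.parameters()), lr=3e-4)
    x = torch.randn(2, 10, 1024)            # dummy (batch, sequence, feature) input
    print(dr(x).shape, pt(dr(x)).shape)     # (2, 10, 1024) and (2, 10, 256)
```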