MPMQA: Multimodal Question Answering on Product Manuals
Authors: Liang Zhang, Anwen Hu, Jing Zhang, Shuo Hu, Qin Jin
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We construct a large-scale dataset PM209 with human annotations to support the research on the MPMQA task. It contains 22,021 QA annotations over 209 product manuals in 27 well-known consumer electronic brands. We conduct experiments to validate our URA model on the proposed PM209 dataset. Table 4 shows the comparison between URA and the baselines described above. |
| Researcher Affiliation | Collaboration | ¹School of Information, Renmin University of China; ²Samsung Research China Beijing (SRC-B) |
| Pseudocode | No | The paper describes its proposed model and methods in prose and with a block diagram (Figure 7), but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | We release the dataset, code, and model at https://github.com/AIM3-RUC/MPMQA. |
| Open Datasets | Yes | We construct a large-scale dataset PM209 with human annotations to support the research on the MPMQA task. It contains 22,021 QA annotations over 209 product manuals in 27 well-known consumer electronic brands. The PM209 dataset is available at https://github.com/AIM3-RUC/MPMQA. |
| Dataset Splits | Yes | We divide the manuals in the PM209 dataset into Train/Val/Test as shown in Table 3. Table 3: Number of samples in each data split. |
| Hardware Specification | Yes | It takes about 20 hours to converge on 1 NVIDIA RTX A6000 GPU. |
| Software Dependencies | No | We implement the above-mentioned models based on PyTorch (Paszke et al. 2019) and Huggingface Transformers (Wolf et al. 2020). No version numbers are given for either library; see the version-logging sketch after the table. |
| Experiment Setup | Yes | We train the models for 20 epochs with a batch size of 8 and a learning rate of 3e-5. A minimal training sketch with these hyperparameters follows the table. |
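
Because the paper pins no library versions, a reproduction should record the exact environment it ran under. A minimal sketch, assuming only that PyTorch and Huggingface Transformers are installed; the printed lines can be committed alongside the results:

```python
# Log the exact library versions used in a reproduction run, since the
# paper cites PyTorch and Huggingface Transformers without version numbers.
import torch
import transformers

print(f"torch=={torch.__version__}")
print(f"transformers=={transformers.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
```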
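
The reported recipe is 20 epochs, batch size 8, and learning rate 3e-5 on a single NVIDIA RTX A6000. The sketch below wires those numbers into a standard PyTorch loop; the linear model, random tensors, and AdamW optimizer are placeholder assumptions, not the paper's URA architecture or PM209 data (both available from the repository above):

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Hyperparameters as reported in the paper.
EPOCHS, BATCH_SIZE, LR = 20, 8, 3e-5

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Placeholder model and data standing in for URA and PM209.
model = nn.Linear(512, 2).to(device)
data = TensorDataset(torch.randn(64, 512), torch.randint(0, 2, (64,)))
loader = DataLoader(data, batch_size=BATCH_SIZE, shuffle=True)

# Optimizer choice is an assumption; the paper does not name one.
optimizer = torch.optim.AdamW(model.parameters(), lr=LR)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(EPOCHS):
    for inputs, labels in loader:
        inputs, labels = inputs.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = loss_fn(model(inputs), labels)
        loss.backward()
        optimizer.step()
```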