A Systematic Evaluation of Object Detection Networks for Scientific Plots

Authors: Pritha Ganguly, Nitesh S Methani, Mitesh M. Khapra, Pratyush Kumar
Pages: 1379-1387

AAAI 2021

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To answer this question, we train and compare the accuracy of Fast/Faster R-CNN, SSD, YOLO and RetinaNet on the PlotQA dataset with over 220,000 scientific plots. |
| Researcher Affiliation | Academia | Pritha Ganguly, Nitesh S Methani*, Mitesh M. Khapra, Pratyush Kumar. Robert Bosch Centre for Data Science and AI (RBC-DSAI), Department of Computer Science and Engineering, Indian Institute of Technology Madras, Chennai, India. {prithag, nmethani, miteshk, pratyush}@cse.iitm.ac.in |
| Pseudocode | No | The paper does not contain structured pseudocode or clearly labeled algorithm blocks. It describes the model architecture and components in text and diagrams. |
| Open Source Code | No | The paper does not provide any statement about making its source code openly available, nor does it provide any links to a code repository. |
| Open Datasets | Yes | We run our experiments on the PlotQA dataset (Methani et al. 2020), as it is based on real-world data while both FigureQA and DVQA are based on synthetic data. |
| Dataset Splits | No | The paper mentions using a 'validation dataset' (e.g., 'Based on evaluation on the validation dataset, we modified the parameters...'), but it does not provide specific details on the split percentages or sample counts for training, validation, and test sets. |
| Hardware Specification | No | The paper states 'We are extremely thankful to Google for funding this work. Such extensive experimentation would not have been possible without their invaluable support,' suggesting the use of substantial computational resources, likely Google's infrastructure, but it does not specify any particular GPU, CPU models, or other hardware components used for the experiments. |
| Software Dependencies | No | The paper mentions using existing implementations for the R-CNN family, YOLO-v3, SSD, and RetinaNet, and specifies backbone feature extractors (ResNet50, InceptionNet, DarkNet53) and an optimizer (Adam), but it does not provide specific version numbers for any software libraries, frameworks, or programming languages used. |
| Experiment Setup | Yes | These models were trained with an initial base learning rate of 0.025 with momentum stochastic gradient descent algorithm. The network's classification and regression heads use a batch-size of 512 ROIs. RetinaNet and SSD models were trained with a batch-size of 32 with a learning rate of 0.004. ... The model was trained with a batch-size of 64 and a learning rate of 0.001. ... We trained our model for 10 epochs using Adam optimizer (Kingma and Ba 2014) with a learning rate of 0.0001. |
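
The hyperparameters quoted under Experiment Setup can be collected into a small structure to show which values the excerpt states and which it leaves out. The following is a minimal sketch, not the authors' configuration: the names `TrainConfig`, `CONFIGS`, and `missing_fields` are hypothetical, and because the quoted text elides which model each setting belongs to, the `models` labels only describe the quote and do not name architectures that the excerpt leaves unnamed.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass(frozen=True)
class TrainConfig:
    """One training setting as quoted; None means the excerpt does not state it."""
    models: str
    learning_rate: float
    optimizer: Optional[str] = None
    batch_size: Optional[int] = None
    roi_batch_size: Optional[int] = None
    epochs: Optional[int] = None


# Values copied from the Experiment Setup quote. Labels in `models` are
# descriptive placeholders wherever the quote does not name the model.
CONFIGS: List[TrainConfig] = [
    TrainConfig(models="unnamed models (first quoted sentence)",
                learning_rate=0.025, optimizer="momentum SGD",
                roi_batch_size=512),
    TrainConfig(models="RetinaNet and SSD",
                learning_rate=0.004, batch_size=32),
    TrainConfig(models="unnamed model (third quoted sentence)",
                learning_rate=0.001, batch_size=64),
    TrainConfig(models="unnamed model (last quoted sentence)",
                learning_rate=0.0001, optimizer="Adam", epochs=10),
]


def missing_fields(cfg: TrainConfig) -> List[str]:
    """Return the names of settings that the quoted text leaves unspecified."""
    return [f.name for f in fields(cfg) if getattr(cfg, f.name) is None]


if __name__ == "__main__":
    for cfg in CONFIGS:
        print(f"{cfg.models}: lr={cfg.learning_rate}, "
              f"not stated: {missing_fields(cfg)}")
```

Listing the unstated fields next to the stated ones makes it easier to see what a reproduction attempt would still need to recover from the full paper.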