A Self Validation Network for Object-Level Human Attention Estimation

Authors: Zehua Zhang, Chen Yu, David Crandall

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate on two public datasets, demonstrating that the Self Validation Module significantly benefits both training and testing and that our model outperforms the state-of-the-art.
Researcher Affiliation | Academia | Zehua Zhang¹, Chen Yu², David Crandall¹; ¹Luddy School of Informatics, Computing, and Engineering, ²Department of Psychological and Brain Sciences, Indiana University Bloomington; {zehzhang, chenyu, djcran}@indiana.edu
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | No | More information is available at http://vision.soic.indiana.edu/mindreader/. This link points to a project page, not an explicit code repository or a statement that code for the method is released.
Open Datasets | Yes | We evaluate on two public datasets, ATT [68] (adult-toddler toy play) and EPIC-Kitchens [13].
Dataset Splits | Yes | We randomly select 90% of the samples in each object class for training and use the remaining 10% for testing, resulting in about 17,000 training and 1,900 testing samples, each with 15 continuous frames. ... We randomly select 90% of samples for training, yielding about 120,000 training and 13,000 testing samples. (See the split sketch below the table.)
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU models, CPU types, or memory) used for running its experiments.
Software Dependencies | No | We implemented our model with Keras [12] and TensorFlow [1]. No version numbers are provided for these software dependencies.
Experiment Setup | Yes | We use stochastic gradient descent with learning rate 0.03, momentum 0.9, decay 0.0001, and L2 regularizer 5e-5. The loss function consists of four parts: global classification $L_{global\_class}$, attention $L_{attn}$, anchor box classification $L_{box\_class}$, and box regression $L_{box}$: $L_{total} = \alpha L_{global\_class} + \beta L_{attn} + \frac{1}{N_{pos}}\left(\gamma L_{box\_class} + L_{box}\right)$, where we empirically set $\alpha = \beta = \gamma = 1$. (See the configuration sketch below the table.)
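
The per-class 90/10 split quoted in the Dataset Splits row can be outlined in a few lines. The sketch below is only an illustration of that description, not the authors' code: the `samples_by_class` mapping, the helper name, and the fixed seed are all assumptions.

```python
# Hypothetical sketch of the per-class 90/10 random split described above.
# The authors do not release splitting code; names and the seed are assumed.
import random

def split_per_class(samples_by_class, train_frac=0.9, seed=0):
    """samples_by_class: dict mapping object class -> list of sample ids."""
    rng = random.Random(seed)
    train, test = [], []
    for cls, samples in samples_by_class.items():
        samples = list(samples)
        rng.shuffle(samples)
        cut = int(round(train_frac * len(samples)))
        train.extend(samples[:cut])   # ~90% of each class for training
        test.extend(samples[cut:])    # remaining ~10% for testing
    return train, test
```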
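
The Experiment Setup row quotes concrete hyperparameters and the weighted loss, so a minimal configuration sketch may help a reproduction attempt. It is not the authors' implementation: it assumes an older tf.keras API in which SGD still accepts a `decay` keyword, and the per-term loss tensors (`global_class_loss`, `attn_loss`, `box_class_loss`, `box_reg_loss`) are hypothetical placeholders. Only the hyperparameter values and the form of the weighted sum come from the quoted text.

```python
# Minimal sketch of the reported optimization setup, assuming an older
# tf.keras API (pre-2.11) where SGD accepts a `decay` keyword argument.
import tensorflow as tf

ALPHA = BETA = GAMMA = 1.0  # empirically set to 1 in the paper

# SGD with learning rate 0.03, momentum 0.9, decay 1e-4 as quoted above.
optimizer = tf.keras.optimizers.SGD(learning_rate=0.03, momentum=0.9, decay=1e-4)

# The 5e-5 L2 regularizer would be attached to layers via kernel_regularizer.
l2_reg = tf.keras.regularizers.l2(5e-5)

def total_loss(global_class_loss, attn_loss, box_class_loss, box_reg_loss, n_pos):
    """L_total = a*L_global_class + b*L_attn + (1/N_pos)*(g*L_box_class + L_box)."""
    n_pos = tf.maximum(tf.cast(n_pos, tf.float32), 1.0)  # guard against divide-by-zero
    return (ALPHA * global_class_loss
            + BETA * attn_loss
            + (GAMMA * box_class_loss + box_reg_loss) / n_pos)
```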