Experiments on Visual Information Extraction with the Faces of Wikipedia

Authors: Md. Kamrul Hasan, Christopher Pal

AAAI 2014 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We present a series of visual information extraction experiments using the Faces of Wikipedia database" and "Our best probabilistic extraction pipeline yields an expected average accuracy of 77% compared to image only and text only baselines which yield 63% and 66% respectively."
Researcher Affiliation | Academia | Md. Kamrul Hasan and Christopher Pal, Département de génie informatique et génie logiciel, Polytechnique Montréal, 2500 Chemin de Polytechnique, Université de Montréal, Montréal, Québec, Canada
Pseudocode | No | The paper describes the methods in prose and uses diagrams, but does not include structured pseudocode or algorithm blocks.
Open Source Code | No | The paper states: "we provide our database to the community including registered faces, hand labeled and automated disambiguations, processed captions, meta data and evaluation protocols" and "Finally, we release the Faces of Wikipedia database along with these experiments to the community." It mentions releasing the database, but not the source code for the methodology.
Open Datasets | Yes | "We present a series of visual information extraction experiments using the Faces of Wikipedia database, a new resource that we release into the public domain for both recognition and extraction research containing over 50,000 identities and 60,000 disambiguated images of faces." The paper also uses LFW (Huang et al. 2007).
Dataset Splits | No | "For each face count group, a randomly chosen 70% of its labeled instances plus all labeled data from its immediate above and below group (if any) were used as training, while the remaining 30% of the examples were used for testing." The paper explicitly mentions training and testing splits but does not mention a separate validation set (see the split sketch below the table).
Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments, such as GPU or CPU models.
Software Dependencies | No | The paper mentions using "the OpenCV face detector (Viola and Jones 2004)" and "the Stanford Named Entity Detector (NED) (Finkel, Grenager, and Manning 2005)" but does not provide specific version numbers for these or any other software dependencies (see the face-detection sketch below the table).
Experiment Setup | Yes | "Roughly one in every three images had at least one face of at least a moderate resolution (40x40 pixels) and we used this as the minimum size for inclusion in our experiments." "Equation (4) uses hyper-parameters α, which balances the relative importance given to positive matches vs. negative matches, and β, which controls the strength of the regularization term." "For each face count group, a randomly chosen 70% of its labeled instances plus all labeled data from its immediate above and below group (if any) were used as training, while the remaining 30% of the examples were used for testing." The results are averaged over 10 runs.
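
The paper reports using the OpenCV (Viola-Jones) face detector with a minimum face size of 40x40 pixels, but gives no versions or code. The following is a minimal sketch, not the authors' pipeline, of how such a filter might look with the current OpenCV Python bindings and the standard bundled Haar cascade; the file name and function are assumptions for illustration.

```python
# Minimal sketch (not the authors' code): detect faces with OpenCV's
# Viola-Jones cascade and keep only faces of at least 40x40 pixels,
# the minimum resolution the paper reports using for inclusion.
import cv2

# Standard frontal-face Haar cascade shipped with opencv-python (assumed choice).
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
)

def detect_faces(image_path, min_size=(40, 40)):
    """Return bounding boxes (x, y, w, h) of faces no smaller than min_size."""
    image = cv2.imread(image_path)
    if image is None:
        return []
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    faces = cascade.detectMultiScale(
        gray,
        scaleFactor=1.1,   # image pyramid step
        minNeighbors=5,    # detection stability threshold
        minSize=min_size,  # discard faces below 40x40 pixels
    )
    return list(faces)

if __name__ == "__main__":
    boxes = detect_faces("example_wikipedia_image.jpg")  # hypothetical file
    print(f"{len(boxes)} face(s) of at least 40x40 pixels detected")
```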
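
The split protocol quoted above (70% of each face-count group for training, plus all labeled data from the adjacent groups, 30% held out for testing, averaged over 10 runs) can be read literally as the sketch below. The data layout (a dict mapping face-count group id to labeled examples) and the function name are assumptions; this is not the released evaluation protocol.

```python
# Minimal sketch (assumed data layout, not the released evaluation protocol):
# for each face-count group, 70% of its labeled instances plus all labeled
# data from the immediately adjacent groups form the training set; the
# remaining 30% are held out for testing.
import random

def split_by_group(groups, train_frac=0.7, seed=0):
    """groups: dict mapping face-count group id (int) -> list of labeled examples."""
    rng = random.Random(seed)
    splits = {}
    for gid in sorted(groups):
        examples = list(groups[gid])
        rng.shuffle(examples)
        cut = int(round(train_frac * len(examples)))
        train, test = examples[:cut], examples[cut:]
        # Literal reading of the paper: add all labeled data from the
        # immediate above and below groups (if any) to the training set.
        for neighbour in (gid - 1, gid + 1):
            if neighbour in groups:
                train = train + list(groups[neighbour])
        splits[gid] = {"train": train, "test": test}
    return splits

if __name__ == "__main__":
    # Toy example averaged over 10 random runs, as in the paper's protocol.
    toy_groups = {1: list(range(10)), 2: list(range(10, 30)), 3: list(range(30, 45))}
    for run in range(10):
        splits = split_by_group(toy_groups, seed=run)
        # ...train and evaluate on splits[gid]["train"] / splits[gid]["test"] here
```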