Contrastive Attention Networks for Attribution of Early Modern Print

Authors: Nikolai Vogler, Kartik Goyal, Kishore PV Reddy, Elizaveta Pertseva, Samuel V. Lemley, Christopher N. Warren, Max G'Sell, Taylor Berg-Kirkpatrick

AAAI 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our method successfully improves downstream damaged type-imprint matching among printed works from this period, as validated by in-domain human experts. The results of our approach on two important philosophical works from the Early Modern period demonstrate potential to extend the extant historical research about the origins and content of these books. We evaluate our approach against other common methods for image comparison on a downstream damaged type-imprint matching dataset of English early modern (c. 1500-1800) books.
Researcher Affiliation | Academia | (1) University of California, San Diego; (2) Toyota Technological Institute at Chicago; (3) Carnegie Mellon University
Pseudocode | No | The paper includes architectural diagrams (Figure 1) and a depiction of the data generation process (Figure 4), but it does not contain any sections or figures explicitly labeled 'Pseudocode' or 'Algorithm' with structured, code-like steps.
Open Source Code | Yes | Code is available at https://github.com/nvog/damaged-type.
Open Datasets | Yes | We obtain page image scans from 38 different English books printed from the 1650s-1690s by both known and unknown printers of historical interest. We use two different hand-curated datasets from recent bibliographical studies that manually identified and matched damaged type-imprints for attribution of two major early modern printed works (Warren et al. 2020, 2021).
Dataset Splits | Yes | Areopagitica validation set: We collect a small validation set of the manually identified type-imprint matches used in the study for printer attribution of John Milton's anonymously printed Areopagitica (Warren et al. 2020). We train each model for 60 epochs and early stop using the best Areopagitica validation set recall.
Hardware Specification | No | The paper does not provide any specific details about the hardware used for running experiments, such as GPU models, CPU specifications, or memory configurations. It mentions computational models and neural networks but no underlying hardware.
Software Dependencies | No | The paper mentions the Ocular OCR system (Berg-Kirkpatrick, Durrett, and Klein 2013a) and scikit-image morphology (van der Walt et al. 2014), but it does not provide specific version numbers for these dependencies or for any other key libraries or programming languages. (An illustrative morphology sketch appears after the table.)
Experiment Setup | Yes | We train CAML with the popular triplet loss (Weinberger and Saul 2009), which operates on an anchor/query embedding $e$ along with the embedding $e^{+}$ of a candidate image that matches the anchor and the embedding $e^{-}$ of a non-matching candidate image. This results in the following loss: $\max(\|e - e^{+}\|_2 - \|e - e^{-}\|_2 + m,\ 0)$, which focuses on minimizing the Euclidean distance between the anchor's and the positive matching image's embeddings, and maximizing the distance between the anchor's and the non-matching image's embeddings, such that the positive and negative examples are separated by a margin of at least $m$. We train each model for 60 epochs and early stop using the best Areopagitica validation set recall. We sample negative examples uniformly at random from our batch. (A sketch of this objective appears after the table.)
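
For concreteness, the quoted objective is the standard hinge-form triplet loss. Below is a minimal PyTorch sketch of that loss together with uniform in-batch negative sampling; the function names and the `margin` default are our own illustrative choices, not taken from the paper or its released code.

```python
import torch
import torch.nn.functional as F

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-form triplet loss: max(||e - e+||_2 - ||e - e-||_2 + m, 0).

    anchor, positive, negative: (batch, dim) embedding tensors.
    """
    d_pos = torch.norm(anchor - positive, p=2, dim=1)  # anchor-to-match distances
    d_neg = torch.norm(anchor - negative, p=2, dim=1)  # anchor-to-non-match distances
    return F.relu(d_pos - d_neg + margin).mean()

def in_batch_negatives(candidates):
    """Draw negatives uniformly at random from the same batch, mirroring the
    sampling strategy quoted above. (Uniform draws can occasionally pick an
    item's own match; a fuller implementation would resample such collisions.)
    """
    n = candidates.size(0)
    idx = torch.randint(n, (n,), device=candidates.device)
    return candidates[idx]

# Illustrative usage with a hypothetical embedding model:
# e, e_pos = model(anchor_crops), model(matching_crops)
# loss = triplet_loss(e, e_pos, in_batch_negatives(e_pos))
```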
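The Software Dependencies row cites scikit-image morphology within the paper's data generation process. As a purely hypothetical illustration of how binary morphological operations can imitate physical wear on a binarized glyph crop, consider the sketch below; the function name, the erosion/dilation coin flip, and the disk footprint are all our assumptions, not the paper's documented procedure.

```python
import numpy as np
from skimage.morphology import binary_dilation, binary_erosion, disk

def simulate_wear(glyph, rng=None, radius=1):
    """Randomly erode (ink loss) or dilate (ink spread) a boolean glyph image.

    An illustrative guess at how morphology could synthesize damaged
    type-imprints; not the paper's actual pipeline.
    """
    rng = rng if rng is not None else np.random.default_rng()
    op = binary_erosion if rng.random() < 0.5 else binary_dilation
    return op(glyph, footprint=disk(radius))
```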