Contrastive Attention Networks for Attribution of Early Modern Print
Authors: Nikolai Vogler, Kartik Goyal, Kishore PV Reddy, Elizaveta Pertseva, Samuel V. Lemley, Christopher N. Warren, Max G'Sell, Taylor Berg-Kirkpatrick
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our method successfully improves downstream damaged type-imprint matching among printed works from this period, as validated by in-domain human experts. The results of our approach on two important philosophical works from the Early Modern period demonstrate potential to extend the extant historical research about the origins and content of these books. We evaluate our approach against other common methods for image comparison on a downstream damaged type-imprint matching dataset of English early modern (c. 1500–1800) books. |
| Researcher Affiliation | Academia | 1 University of California, San Diego 2 Toyota Technological Institute at Chicago 3 Carnegie Mellon University |
| Pseudocode | No | The paper includes architectural diagrams (Figure 1) and a depiction of the data generation process (Figure 4), but it does not contain any sections or figures explicitly labeled 'Pseudocode' or 'Algorithm' with structured, code-like steps. |
| Open Source Code | Yes | Code located at https://github.com/nvog/damaged-type. |
| Open Datasets | Yes | We obtain page image scans from 38 different English books printed from the 1650s–1690s by both known and unknown printers of historical interest. We use two different hand-curated datasets from recent bibliographical studies that manually identified and matched damaged type-imprints for attribution of two major early modern printed works (Warren et al. 2020, 2021). |
| Dataset Splits | Yes | Areopagitica validation set: We collect a small validation set of the manually identified type-imprint matches used in the study for printer attribution of John Milton's anonymously printed Areopagitica (Warren et al. 2020). We train each model for 60 epochs and early stop using the best Areopagitica validation set recall. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used for running experiments, such as GPU models, CPU specifications, or memory configurations. It mentions computational models and neural networks but no underlying hardware. |
| Software Dependencies | No | The paper mentions using 'Ocular OCR system (Berg-Kirkpatrick, Durrett, and Klein 2013a)' and 'scikit-image morphology (van der Walt et al. 2014)' but does not provide specific version numbers for these software dependencies or any other key libraries or programming languages. |
| Experiment Setup | Yes | We train CAML with the popular triplet loss (Weinberger and Saul 2009), which operates on an anchor/query embedding $e$ along with the embedding $e^{+}$ of a candidate image that matches the anchor and a non-matching candidate image's embedding $e^{-}$. This results in the following loss: $\max(\lVert e - e^{+}\rVert^2 - \lVert e - e^{-}\rVert^2 + m,\ 0)$, which focuses on minimizing the Euclidean distance between the embeddings of the anchor and the positive matching image, and maximizing the distance between the embeddings of the anchor and the non-matching image, such that the positive and negative examples are separated by a margin of at least $m$. We train each model for 60 epochs and early stop using the best Areopagitica validation set recall. We sample negative examples uniformly at random from our batch. |
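The triplet loss quoted in the experiment-setup row can be sketched as follows. This is a minimal NumPy illustration of the standard margin-based triplet loss, not the paper's released implementation; the function name, margin default, and toy vectors are assumptions for illustration.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    # max(||e - e+||^2 - ||e - e-||^2 + m, 0): penalize the anchor being
    # closer to the non-matching embedding than to the matching one,
    # until they are separated by at least the margin m.
    d_pos = np.sum((anchor - positive) ** 2)  # squared distance to match
    d_neg = np.sum((anchor - negative) ** 2)  # squared distance to non-match
    return max(d_pos - d_neg + margin, 0.0)

# Anchor coincides with the positive and is far from the negative:
# the hinge is inactive and the loss is zero.
e = np.zeros(3)
print(triplet_loss(e, np.zeros(3), np.ones(3)))   # 0.0
# Anchor is far from the positive and coincides with the negative:
# the loss is d_pos - d_neg + m = 3 - 0 + 1 = 4.
print(triplet_loss(e, np.ones(3), np.zeros(3)))   # 4.0
```

During training, the loss would be averaged over (anchor, positive, negative) triplets, with negatives drawn uniformly at random from the batch as the paper describes.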