Visalogy: Answering Visual Analogy Questions
Authors: Fereshteh Sadeghi, C. Lawrence Zitnick, Ali Farhadi
NeurIPS 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this paper we study the problem of visual analogies for natural images and show the first results of its kind on solving visual analogy questions for natural images. Our experimental evaluations show promising results on solving visual analogy questions. |
| Researcher Affiliation | Collaboration | Fereshteh Sadeghi University of Washington fsadeghi@cs.washington.edu C. Lawrence Zitnick Microsoft Research larryz@microsoft.com Ali Farhadi University of Washington, The Allen Institute for AI ali@cs.washington.edu |
| Pseudocode | No | The paper presents a network architecture diagram in Figure 2, but no pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any explicit statement about releasing source code or provide a link to a code repository. |
| Open Datasets | Yes | To evaluate the capability of our trained network for solving analogy questions in the test scenarios explained above, we use a large dataset of 3D chairs [4] as well as a novel dataset of natural images (VAQA), that we collected for solving analogy questions on natural images. |
| Dataset Splits | Yes | We randomly select 1000 styles and 16 view points for training and keep the rest for testing. We have also used the double margin loss function introduced in Equation 3 with m_P = 0.2, m_N = 0.4, which we empirically found to give the best results on a held-out validation set. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models or memory used for experiments. |
| Software Dependencies | No | The paper mentions 'Alex Net pre-trained network for the task of large-scale object recognition (ILSVRC2012) provided by the BVLC Caffe website [31]', but does not specify version numbers for Caffe or any other software dependencies. |
| Experiment Setup | Yes | In all the experiments, we use stochastic gradient descent (SGD) to train our network. We fine-tune the last two fully connected layers (fc6, fc7) and the last convolutional layer (conv5) unless stated otherwise. We have also used the double margin loss function introduced in Equation 3 with m_P = 0.2, m_N = 0.4, which we empirically found to give the best results on a held-out validation set. |
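The double margin loss cited in the table (Equation 3 of the paper, with m_P = 0.2 and m_N = 0.4) is not reproduced in this report. A minimal sketch, assuming the common double-margin contrastive formulation (matching pairs are penalized only beyond the positive margin, non-matching pairs only inside the negative margin; the function name and per-pair scalar interface here are illustrative, not from the paper):

```python
def double_margin_loss(d, is_positive, m_p=0.2, m_n=0.4):
    """Double-margin contrastive loss for one embedding pair (assumed form).

    d           -- distance between the two embeddings
    is_positive -- True if the pair should match
    m_p         -- positive margin: matching pairs are pulled within m_p
    m_n         -- negative margin: non-matching pairs are pushed beyond m_n
    """
    if is_positive:
        # no penalty once a matching pair is closer than m_p
        return max(0.0, d - m_p) ** 2
    # no penalty once a non-matching pair is farther than m_n
    return max(0.0, m_n - d) ** 2
```

With these margins, pairs whose distance falls between 0.2 and 0.4 incur no gradient from either term, which is the usual motivation for using two margins instead of one.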