CAISE: Conversational Agent for Image Search and Editing
Authors: Hyounghun Kim, Doo Soon Kim, Seunghyun Yoon, Franck Dernoncourt, Trung Bui, Mohit Bansal
AAAI 2022, pp. 10903–10911 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We introduce a novel generator-extractor model as a strong starting point baseline for this task and dataset. We employ a copying mechanism... Our experiments show our baseline model performs effectively as a starting point, and we demonstrate a large human-machine performance gap for useful future work. ... We split the total 1,611 dialogues into 1,052, 262, and 297 for train, validation, and test set, respectively. ... We use accuracy as the evaluation metric. |
| Researcher Affiliation | Collaboration | Hyounghun Kim,1 Doo Soon Kim,2 Seunghyun Yoon,3 Franck Dernoncourt,3 Trung Bui,3 Mohit Bansal1 1UNC Chapel Hill 2Roku Inc. 3Adobe Research {hyounghk, mbansal}@cs.unc.edu {syoon, dernonco, bui}@adobe.com |
| Pseudocode | No | The paper includes diagrams of the model architecture (Figure 3) and descriptions of commands, but no explicit pseudocode or algorithm blocks are provided. |
| Open Source Code | Yes | Data and code are available: https://github.com/hyounghk/CAISE. |
| Open Datasets | Yes | Thus, we propose a dataset of an automated Conversational Agent for Image Search and Editing (CAISE). To our knowledge, this is the first dataset that provides conversational image search and editing annotations... Data and code are available: https://github.com/hyounghk/CAISE. |
| Dataset Splits | Yes | We split the total 1,611 dialogues into 1,052, 262, and 297 for train, validation, and test set, respectively. From the dialogue splits, we obtain 4,059/1,002/1,112 (train/valid/test) instance splits. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or cloud computing instance specifications used for running the experiments. |
| Software Dependencies | No | The paper mentions some tools used during data collection (Adobe Stock, Adobe Photoshop, OpenCV) and model components (Faster RCNN, Adam) but does not provide specific version numbers for any software dependencies required to replicate the experiments. |
| Experiment Setup | Yes | We use 512 as the hidden size and 256 as the word embedding dimension. We use Adam (Kingma and Ba 2015) as the optimizer with the learning rate 1 × 10⁻⁴. |
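The splits and hyperparameters reported above can be collected into a small configuration sketch for quick sanity checks. This is pure Python; the variable names are our own illustration, not taken from the authors' released code (see https://github.com/hyounghk/CAISE for their actual configuration):

```python
# Dataset splits and baseline hyperparameters as reported in the paper,
# gathered in one place. Names are illustrative, not from the CAISE repo.

DIALOGUE_SPLITS = {"train": 1052, "valid": 262, "test": 297}
INSTANCE_SPLITS = {"train": 4059, "valid": 1002, "test": 1112}

HYPERPARAMS = {
    "hidden_size": 512,
    "word_embedding_dim": 256,
    "optimizer": "Adam",
    "learning_rate": 1e-4,
}

def total(splits):
    """Sum the split sizes to check them against the paper's reported total."""
    return sum(splits.values())

if __name__ == "__main__":
    # The paper reports 1,611 dialogues in total.
    assert total(DIALOGUE_SPLITS) == 1611
    print(total(DIALOGUE_SPLITS), total(INSTANCE_SPLITS))
```

A check like this is useful when re-deriving instance splits from the released dialogue files, since a mismatch in totals is the quickest signal that a preprocessing step diverged from the paper's setup.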