Dress like an Internet Celebrity: Fashion Retrieval in Videos

Authors: Hongrui Zhao, Jin Yu, Yanan Li, Donghui Wang, Jie Liu, Hongxia Yang, Fei Wu

IJCAI 2020

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments conducted on a new large-scale cross-domain video-to-shop dataset show that DPRNet is efficient and outperforms the state-of-the-art methods on the video-to-shop task. |
| Researcher Affiliation | Collaboration | Hongrui Zhao¹, Jin Yu², Yanan Li³, Donghui Wang¹, Jie Liu², Hongxia Yang² and Fei Wu¹ — ¹College of Computer Science and Technology, Zhejiang University; ²Alibaba Group; ³Institute of Artificial Intelligence, Zhejiang Lab |
| Pseudocode | No | The paper describes the proposed network architecture and components but does not provide pseudocode or a clearly labeled algorithm block. |
| Open Source Code | No | The paper does not provide an explicit statement about releasing the source code, nor a link to a code repository for the described methodology. |
| Open Datasets | Yes | Deep Fashion: DeepFashion [Liu et al., 2016b] provides over 800,000 real-world images with rich additional information about categories, attributes, landmarks, etc. Its consumer-to-shop retrieval subset is a popular benchmark for cross-domain fashion retrieval, so the fashion retrieval network is trained and evaluated on the Deep Fashion consumer-to-shop retrieval dataset. Deep Fashion2: Because Deep Fashion is limited to a single clothing item per image, the clothing detector is trained on the training set of DeepFashion2 [Ge et al., 2019], which contains 191,961 images annotated with a bounding box and category label for each clothing item. |
| Dataset Splits | Yes | The method outperforms previous state-of-the-art methods under both the val+test split (95,961 query images and 22,669 gallery images) and the test split (47,434 query images and 11,312 gallery images). |
| Hardware Specification | No | DPRNet needs only around two weeks to automatically accomplish the video-to-shop task on 10 million videos and hundreds of millions of gallery images with 200 GPUs; however, specific GPU models or other hardware details are not provided. |
| Software Dependencies | No | The paper does not provide specific version numbers for software components, libraries, or frameworks used in the experiments. |
| Experiment Setup | No | The paper describes the model architecture and loss functions but does not provide specific experimental setup details such as hyperparameters (e.g., learning rate, batch size, number of epochs) or optimizer settings. |