Towards Better Understanding the Clothing Fashion Styles: A Multimodal Deep Learning Approach

Authors: Yihui Ma, Jia Jia, Suping Zhou, Jingtian Fu, Yejun Liu, Zijian Tong

AAAI 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Employing the benchmark dataset we build with 32133 full-body fashion show images, we use BCDA to map the visual features to the FSS. The experiment results indicate that our model outperforms (+13% in terms of MSE) several alternative baselines, confirming that our model can better understand the clothing fashion styles.
Researcher Affiliation | Collaboration | Yihui Ma [1], Jia Jia [1], Suping Zhou [3], Jingtian Fu [1, 2], Yejun Liu [1, 2], Zijian Tong [4]. [1] Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China; Key Laboratory of Pervasive Computing, Ministry of Education; Tsinghua National Laboratory for Information Science and Technology (TNList). [2] Academy of Arts & Design, Tsinghua University, Beijing, China. [3] Beijing University of Posts and Telecommunications, Beijing, China. [4] Sogou Corporation, Beijing, China. Contact: jjia@mail.tsinghua.edu.cn
Pseudocode | Yes | Algorithm 1 Bimodal Correlative Deep Autoencoder
Open Source Code | No | The paper states: 'We are willing to make our dataset open to facilitate other people's research on clothing fashion.' (footnote 2: https://pan.baidu.com/s/1boPm2OB). This link is specifically for the dataset, not the source code for the methodology.
Open Datasets | Yes | We build a large full-annotated benchmark dataset, which employs 32133 full-body fashion show images in the last 10 years downloaded from Vogue. It covers both men and women clothing, and contains 550 fashion brands. ... We are willing to make our dataset open to facilitate other people's research on clothing fashion. (footnote 2: https://pan.baidu.com/s/1boPm2OB)
Dataset Splits | Yes | All the experiments are performed on 5-folder cross-validation.
Hardware Specification | Yes | On this condition, the experiment lasts for about 20 minutes in a quadcore 2.80GHz CPU, 16GB memory environment.
Software Dependencies | No | The paper mentions using 'Support Vector Machine (SVM)' and 'WordNet::Similarity' but does not provide specific version numbers for any software or libraries, which is required for reproducibility.
Experiment Setup | Yes | The cost function to evaluate the difference between $x, c$ and $\hat{x}, \hat{c}$ is defined as: $J(W, b) = \lambda_1 \sum_{i=1}^{m} \|x_i - \hat{x}_i\|^2 + \lambda_2 \sum_{i=1}^{m} \|c_i - \hat{c}_i\|^2 + \lambda_3 \sum_{l} \left( \|W^{(l)}\|_F^2 + \|b^{(l)}\|_2^2 \right)$ (2) where $m$ is the number of samples, $\lambda_1, \lambda_2, \lambda_3$ are hyperparameters... The optimization method we adopt is Stochastic Gradient Descent Algorithm... where $\alpha$ is the step size in gradient descent algorithm... According to Figure 6(b), the performance do increase with layer number less than 5, but get worse after the number become larger because of overfitting. Therefore, we take 5 layers in our experiments.
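
To make the structure of the cost function quoted in the Experiment Setup row concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes the three visible terms only: two reconstruction errors weighted by λ1 and λ2, plus a λ3-weighted penalty on the weights and biases of every layer. Any normalization constants that may have been lost from the extracted text are omitted. The names `bcda_cost` and `sgd_step`, the list-based data layout, and the toy values are all hypothetical; `sgd_step` shows the generic gradient descent update with step size α, with gradients supplied by the caller.

```python
def sq_norm(v):
    """Squared Euclidean norm of a flat vector."""
    return sum(a * a for a in v)


def sq_diff(u, v):
    """Squared Euclidean distance ||u - v||^2 between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def bcda_cost(xs, xs_hat, cs, cs_hat, weights, biases, lam1, lam2, lam3):
    """Hypothetical helper: evaluate the three-term cost J(W, b).

    xs, cs         : inputs of the two modalities, one vector per sample
    xs_hat, cs_hat : their reconstructions, in the same layout
    weights        : one matrix (list of rows) per layer
    biases         : one vector per layer
    """
    recon_x = sum(sq_diff(x, xh) for x, xh in zip(xs, xs_hat))
    recon_c = sum(sq_diff(c, ch) for c, ch in zip(cs, cs_hat))
    # Squared Frobenius norm = sum of squared entries of the matrix.
    reg = sum(
        sum(sq_norm(row) for row in W) + sq_norm(b)
        for W, b in zip(weights, biases)
    )
    return lam1 * recon_x + lam2 * recon_c + lam3 * reg


def sgd_step(params, grads, alpha):
    """Generic gradient descent update: p <- p - alpha * g (assumed form)."""
    return [p - alpha * g for p, g in zip(params, grads)]


# Toy values, chosen only so the arithmetic is easy to check by hand.
xs, xs_hat = [[1.0, 2.0], [0.0, 1.0]], [[1.0, 1.0], [0.0, 0.0]]
cs, cs_hat = [[1.0], [2.0]], [[0.0], [2.0]]
weights = [[[1.0, 0.0], [0.0, 1.0]]]   # a single 2x2 layer
biases = [[1.0, 1.0]]

cost = bcda_cost(xs, xs_hat, cs, cs_hat, weights, biases,
                 lam1=1.0, lam2=0.5, lam3=0.25)
print(cost)  # 1.0*2 + 0.5*1 + 0.25*4 = 3.5
```

In this sketch λ1 and λ2 control how much each modality's reconstruction error contributes, while λ3 sets the strength of the regularization that the paper relies on, together with limiting the depth to 5 layers, to counter overfitting.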