Cross-modal Representation Learning and Relation Reasoning for Bidirectional Adaptive Manipulation

Authors: Lei Li, Kai Fan, Chun Yuan

IJCAI 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Experiments on CUB and Visual Genome verify that our approach outperforms the leading methods of single-modal controllable manipulation. |
| Researcher Affiliation | Collaboration | Lei Li (1,3), Kai Fan (2), Chun Yuan (3). 1: Department of Computer Science and Technology, Tsinghua University; 2: Alibaba DAMO Academy, Alibaba Group Inc.; 3: Tsinghua Shenzhen International Graduate School, Peng Cheng Lab. |
| Pseudocode | No | The paper describes the proposed method in prose and mathematical equations but does not include a formal pseudocode block or algorithm. |
| Open Source Code | No | The paper does not contain any statement about releasing source code or a link to a code repository for the described method. |
| Open Datasets | Yes | For cross-modal attribute manipulation, we evaluated our method on the CUB dataset [Wah et al., 2011]. For cross-modal relation manipulation, we validated on the Visual Genome (VG) [Krishna et al., 2017]. |
| Dataset Splits | No | The paper mentions using the test set of the CUB dataset for evaluation but does not give the training/validation/test split counts or percentages needed to reproduce the data partitioning. |
| Hardware Specification | Yes | All methods are benchmarked on a single Nvidia GeForce RTX 3080 GPU. |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers. |
| Experiment Setup | No | The paper describes the evaluation metrics and datasets but does not provide specific hyperparameters (e.g., learning rate, batch size, number of epochs, optimizer settings) or other training configuration details needed for reproduction. |