Cross-modal Representation Learning and Relation Reasoning for Bidirectional Adaptive Manipulation
Authors: Lei Li, Kai Fan, Chun Yuan
IJCAI 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on CUB and Visual Genome verify that our approach outperforms the leading methods of single-modal controllable manipulation. |
| Researcher Affiliation | Collaboration | Lei Li¹,³, Kai Fan², and Chun Yuan³ — ¹Department of Computer Science and Technology, Tsinghua University; ²Alibaba DAMO Academy, Alibaba Group Inc.; ³Tsinghua Shenzhen International Graduate School, Peng Cheng Lab |
| Pseudocode | No | The paper describes the proposed method using text and mathematical equations but does not include a formal pseudocode block or algorithm. |
| Open Source Code | No | The paper does not contain any statement about releasing source code or a link to a code repository for the methodology described. |
| Open Datasets | Yes | For cross-modal attribute manipulation, we evaluated our method on the CUB dataset [Wah et al., 2011]. For cross-modal relation manipulation, we validated on the Visual Genome (VG) [Krishna et al., 2017]. |
| Dataset Splits | No | The paper mentions evaluating on the 'test set' of the CUB dataset, but it does not report explicit train/validation/test split percentages or counts needed for reproduction. |
| Hardware Specification | Yes | All methods are benchmarked on a single Nvidia GeForce RTX 3080 GPU. |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers. |
| Experiment Setup | No | The paper describes the evaluation metrics and datasets but does not provide specific hyperparameters (e.g., learning rate, batch size, number of epochs, optimizer settings) or other training configuration details needed for reproduction. |