Optimal Transport-based Labor-free Text Prompt Modeling for Sketch Re-identification
Authors: Rui Li, Tingting Ren, Jie Wen, Jinxing Li
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments conducted on two public benchmarks consistently demonstrate the robustness and superiority of our OLTM over state-of-the-art methods. |
| Researcher Affiliation | Academia | Rui Li, Tingting Ren, Jie Wen, Jinxing Li; School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen |
| Pseudocode | Yes | Algorithm 1: The OT problem via Sinkhorn-Knopp (see the Sinkhorn-Knopp sketch below the table). |
| Open Source Code | No | Checklist question: "Does the paper provide open access to the data and code, with sufficient instructions to faithfully reproduce the main experimental results, as described in supplemental material?" Answer: [No]. Justification: "This paper does not currently provide open-access code, but it is planned to be made public in the future." |
| Open Datasets | Yes | Two publicly available benchmark datasets, namely PKU-Sketch [4] and Market-Sketch-1K [6], are utilized for performance evaluation. |
| Dataset Splits | Yes | Market-Sketch-1K is a large-scale dataset derived from Market-1501 [1], created by six artists based on descriptions and featuring multiple perspectives and artistic styles. The training set consists of 2,332 sketches and 12,936 photos, while the testing set comprises 2,375 sketches and 19,732 photos. |
| Hardware Specification | Yes | The model is implemented in PyTorch on an RTX 4090 (24 GB) GPU. |
| Software Dependencies | No | The paper mentions PyTorch, CLIP ViT-B/16, the CLIP Text Transformer, a ViLT-based VQA model, and the Adam optimizer, but specifies no version numbers for any of these software components. |
| Experiment Setup | Yes | Input images are resized to 288 × 144 and augmented with random horizontal flipping and style augmentation [61]... The iteration number of the Optimal Transport algorithm is 3; in the Triplet Assignment Loss, the iteration number is 50. The model is trained with the Adam optimizer, starting with a learning rate of 1e-5 and decaying with a cosine scheduler... The dimensions of image and text features are set to 512. Within a batch, we randomly select 8 identities, each comprising 4 images and 4 sketches. Each image is associated with 9 fine-grained textual attributes. To ensure more reliable comparisons, the random seeds are all set to 0. (A hedged configuration sketch follows the table.) |
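
The Pseudocode row cites Algorithm 1, the OT problem solved via Sinkhorn-Knopp. Below is a minimal, generic Sinkhorn-Knopp sketch in PyTorch, not the authors' released code: the function name, the entropic regularization weight `eps`, and the uniform marginals are assumptions, while the iteration counts (3 and 50) come from the reported experiment setup.

```python
import torch

def sinkhorn_knopp(cost, eps=0.05, n_iters=3):
    """Entropy-regularized OT plan via Sinkhorn-Knopp (generic sketch).

    cost    : (m, n) pairwise cost matrix between two feature sets.
    eps     : entropic regularization weight (assumed value).
    n_iters : number of row/column scaling iterations
              (3 in the main OT module, 50 in the Triplet Assignment Loss,
              per the reported setup).
    """
    m, n = cost.shape
    # Uniform source and target marginals (assumption).
    mu = torch.full((m,), 1.0 / m, device=cost.device)
    nu = torch.full((n,), 1.0 / n, device=cost.device)
    # Gibbs kernel derived from the cost matrix.
    K = torch.exp(-cost / eps)
    u = torch.ones_like(mu)
    v = torch.ones_like(nu)
    for _ in range(n_iters):
        # Alternate row/column scaling so the plan's marginals
        # approach mu and nu.
        v = nu / (K.t() @ u)
        u = mu / (K @ v)
    # Transport plan: diag(u) @ K @ diag(v).
    return u.unsqueeze(1) * K * v.unsqueeze(0)
```

For instance, `sinkhorn_knopp(cost, n_iters=3)` matches the OT iteration count reported for the main module, and `n_iters=50` matches the count reported for the Triplet Assignment Loss.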
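
For the Experiment Setup row, the sketch below collects the reported hyperparameters into a hypothetical PyTorch-style configuration. All identifier names and the epoch count are illustrative assumptions; only the numeric values are taken from the paper's description.

```python
import torch

# Hypothetical configuration; the keys are illustrative,
# the values are those quoted in the Experiment Setup row.
CONFIG = {
    "input_size": (288, 144),    # resize (height, width)
    "feature_dim": 512,          # image and text feature dimension
    "ot_iters": 3,               # OT iterations in the main module
    "ot_iters_triplet": 50,      # OT iterations in the Triplet Assignment Loss
    "ids_per_batch": 8,          # identities sampled per batch
    "images_per_id": 4,
    "sketches_per_id": 4,
    "attributes_per_image": 9,   # fine-grained textual attributes
    "lr": 1e-5,
    "seed": 0,
}

def build_training(model, num_epochs=60):
    """Adam optimizer with cosine learning-rate decay, as reported.

    num_epochs is an assumption; the quoted setup does not state it.
    """
    torch.manual_seed(CONFIG["seed"])
    optimizer = torch.optim.Adam(model.parameters(), lr=CONFIG["lr"])
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
        optimizer, T_max=num_epochs
    )
    return optimizer, scheduler
```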