TaCo: Textual Attribute Recognition via Contrastive Learning

Authors: Chang Nie, Yiqing Hu, Yanqiu Qu, Hao Liu, Deqiang Jiang, Bo Ren

AAAI 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Extensive experiments show that TaCo surpasses the supervised counterparts and advances the state-of-the-art remarkably on multiple attribute recognition tasks."
Researcher Affiliation | Industry | Tencent YouTu Lab, {changnie, hooverhu, yanqiuqu, ivanhliu, dqiangjiang, timren}@tencent.com
Pseudocode | No | The paper does not contain any explicit pseudocode or algorithm blocks.
Open Source Code | No | "Online services of TaCo will be publicly released soon to assist relevant researchers and designers."
Open Datasets | No | "Now there exist no publicly available datasets for textual attributes." The authors constructed a large-scale synthetic dataset (SynAttr) comprising one million images of text segments for system pre-training and fine-tuning.
Dataset Splits | No | The paper states that "for validation, we manually annotated a dataset Attr-5k comprising 5k individual sentence images," but it does not specify train/validation/test splits (e.g., percentages or counts) for either the main synthetic dataset (SynAttr) or Attr-5k, which limits reproducibility.
Hardware Specification | Yes | "All experiments are implemented on a platform with 8 Nvidia V100 GPUs."
Software Dependencies | No | The paper mentions frameworks such as SimSiam and model architectures such as ResNet-50 and Deformable DETR, but does not provide version numbers for software dependencies (e.g., programming languages or deep learning libraries).
Experiment Setup | Yes | The standard SGD optimizer with a learning rate of 0.1 is used for optimization. Training runs for 100 epochs (taking 26 hours), with the learning rate adjusted via a cosine annealing schedule. The patch size P and the number of attention heads of MAEM are both set to 4. The data augmentations for the pretext tasks are: 1) randomly reordering the words with a probability of 0.5; 2) randomly cropping views from the original image with ratio ranges of (0.8, 1) and (0.6, 1), then rescaling and padding them to a fixed size of (32, 256) without changing the aspect ratio; and 3) color jittering that alters the brightness, contrast, saturation, and hue of an image with offset degrees of (0.4, 0.4, 0.4, 0.1) and a probability of 0.8.
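The augmentation recipe described above can be sketched with plain PIL. This is a minimal illustration, not the authors' code: the function names, the crop/pad details, and the HSV-based hue shift are all assumptions made for the sketch.

```python
import random
from PIL import Image, ImageEnhance

def reorder_words(words, p=0.5):
    """Randomly shuffle word order with probability p (pretext task 1)."""
    if random.random() < p:
        words = words[:]
        random.shuffle(words)
    return words

def crop_rescale_pad(img, h_ratio=(0.8, 1.0), w_ratio=(0.6, 1.0),
                     target=(32, 256)):
    """Crop a random view with the given height/width ratio ranges,
    rescale to the target height preserving aspect ratio, then pad
    the width to the fixed size (pretext task 2)."""
    w, h = img.size
    ch = int(h * random.uniform(*h_ratio))
    cw = int(w * random.uniform(*w_ratio))
    top = random.randint(0, h - ch)
    left = random.randint(0, w - cw)
    view = img.crop((left, top, left + cw, top + ch))
    th, tw = target
    new_w = min(tw, max(1, round(view.width * th / view.height)))
    view = view.resize((new_w, th))
    canvas = Image.new("RGB", (tw, th))   # black padding on the right
    canvas.paste(view, (0, 0))
    return canvas

def color_jitter(img, p=0.8, degrees=(0.4, 0.4, 0.4, 0.1)):
    """Jitter brightness, contrast, saturation, and hue with
    probability p (pretext task 3)."""
    if random.random() >= p:
        return img
    b, c, s, hdeg = degrees
    img = ImageEnhance.Brightness(img).enhance(1 + random.uniform(-b, b))
    img = ImageEnhance.Contrast(img).enhance(1 + random.uniform(-c, c))
    img = ImageEnhance.Color(img).enhance(1 + random.uniform(-s, s))
    # Hue shift approximated by rotating the H channel in HSV space.
    hsv = img.convert("HSV")
    h_ch, s_ch, v_ch = hsv.split()
    shift = int(256 * random.uniform(-hdeg, hdeg))
    h_ch = h_ch.point(lambda x: (x + shift) % 256)
    return Image.merge("HSV", (h_ch, s_ch, v_ch)).convert("RGB")
```

In a real training loop these would be applied per view, with two augmented views per image feeding the SimSiam-style contrastive objective the paper builds on.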