Instance Smoothed Contrastive Learning for Unsupervised Sentence Embedding

Authors: Hongliang He, Junlei Zhang, Zhenzhong Lan, Yue Zhang

AAAI 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our method on standard semantic textual similarity (STS) tasks and achieve an average Spearman's correlation of 78.30%, 79.47%, 77.73%, and 79.42% on BERT-base, BERT-large, RoBERTa-base, and RoBERTa-large respectively, a 2.05%, 1.06%, 1.16%, and 0.52% improvement over unsup-SimCSE.
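
For context, the headline metric works as follows: embed both sentences of each STS pair, score the pair by cosine similarity, and report Spearman's rank correlation against the human ratings. A minimal sketch, assuming the embeddings are already available as numpy arrays (the function and variable names are illustrative, not from the authors' code):

```python
import numpy as np
from scipy.stats import spearmanr

def sts_spearman(emb_a, emb_b, gold):
    """Spearman's rank correlation between predicted and gold similarities.

    emb_a, emb_b: (N, d) arrays of sentence embeddings for each pair.
    gold:         (N,) array of human similarity ratings.
    """
    a = emb_a / np.linalg.norm(emb_a, axis=1, keepdims=True)
    b = emb_b / np.linalg.norm(emb_b, axis=1, keepdims=True)
    predicted = (a * b).sum(axis=1)  # cosine similarity per sentence pair
    return spearmanr(predicted, gold).correlation
```
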
Researcher Affiliation | Academia | Hongliang He (1,2*), Junlei Zhang (1,2*), Zhenzhong Lan (2,3), Yue Zhang (2,3) -- 1: Zhejiang University, China; 2: School of Engineering, Westlake University, China; 3: Institute of Advanced Technology, Westlake Institute for Advanced Study, China
Pseudocode | No | The paper describes its methods through text and mathematical equations but does not include any structured pseudocode or algorithm blocks.
Open Source Code | Yes | Our code is available at https://github.com/dll-wu/IS-CSE
Open Datasets | Yes | We conduct our main experiments on 7 standard semantic textual similarity (STS) tasks: STS 2012-2016 (Agirre et al. 2012, 2013, 2014, 2015, 2016), STS Benchmark (Cer et al. 2017) and SICK-Relatedness (Marelli et al. 2014). We also include 7 transfer learning tasks (Conneau et al. 2017), taking STS as the main result for comparison following previous SimCSE-related papers (Gao, Yao, and Chen 2021; Wang et al. 2022; Zhou et al. 2022). Following SimCSE, the training corpus contains 10^6 sentences randomly sampled from English Wikipedia.
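
These STS tasks are conventionally evaluated through the SentEval toolkit in SimCSE-lineage work. A hedged sketch of that harness, assuming SentEval is installed and its data downloaded; `RandomEncoder` is a stand-in for the trained model, not part of the paper:

```python
import numpy as np
import senteval  # https://github.com/facebookresearch/SentEval

class RandomEncoder:
    """Placeholder for the trained sentence encoder (returns random vectors)."""
    def encode(self, sentences):
        return np.random.randn(len(sentences), 768)

model = RandomEncoder()

def prepare(params, samples):
    return  # no task-specific preparation needed for a fixed encoder

def batcher(params, batch):
    sentences = [" ".join(tokens) for tokens in batch]
    return model.encode(sentences)  # -> (batch_size, dim) numpy array

params = {"task_path": "SentEval/data", "usepytorch": True, "kfold": 10}
se = senteval.engine.SE(params, batcher, prepare)
tasks = ["STS12", "STS13", "STS14", "STS15", "STS16",
         "STSBenchmark", "SICKRelatedness"]
results = se.eval(tasks)  # per-task Spearman correlations, averaged for the headline number
```
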
Dataset Splits | No | The paper refers to the 'STS-B development set' (e.g., in Tables 3-7), which implies a validation set, but it does not explicitly state percentages or counts for training, validation, and test splits, nor does it cite the exact splits used.
Hardware Specification | Yes | Our experiments are conducted on one NVIDIA A100 GPU.
Software Dependencies | No | The paper mentions using pre-trained checkpoints from Huggingface and the Adam optimizer, but it does not provide version numbers for these or for other software dependencies such as Python, PyTorch, or TensorFlow.
Experiment Setup | Yes | Batch size: 64 (BERT-base), 64 (BERT-large), 512 (RoBERTa-base), 512 (RoBERTa-large); learning rate: 3e-5, 1e-5, 1e-5, and 3e-5 respectively. We train our model for 1 epoch and use the Adam optimizer (Kingma and Ba 2014). Cosine similarity with τ = 0.05 is used to calculate sentence similarity. In IS-CSE, we set the buffer size L = 1024 and the number of kNN neighbors k = 16. The temperature β for self-attention aggregation is set to 2. For BERT-base and RoBERTa-base we set α = 0.1. For BERT-large and RoBERTa-large we set a cosine schedule (Equ. 8) for α from 0.005 to 0.05.
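
To make the quoted hyperparameters concrete, here is a hedged PyTorch sketch of the smoothing step as the setup describes it: retrieve k = 16 nearest neighbors from a buffer of L = 1024 recent embeddings, aggregate them with similarity-weighted attention at temperature β = 2, interpolate with weight α, and train with InfoNCE at τ = 0.05. The exact attention formulation, the direction of the α schedule, and all function names are assumptions; the released code at the repository above is authoritative.

```python
import math

import torch
import torch.nn.functional as F

def smooth_positive(z_pos, buffer, k=16, beta=2.0, alpha=0.1):
    """Sketch of instance smoothing: interpolate each positive embedding with
    a similarity-weighted aggregate of its k nearest neighbors retrieved from
    a buffer of embeddings kept from previous steps.

    z_pos:  (B, d) positive-view embeddings for the current batch.
    buffer: (L, d) retained embeddings (the paper uses L = 1024).
    """
    q = F.normalize(z_pos, dim=-1)
    mem = F.normalize(buffer, dim=-1)
    sims = q @ mem.t()                                   # (B, L) cosine similarities
    top_sim, top_idx = sims.topk(k, dim=-1)              # kNN retrieval, k = 16
    neighbors = buffer[top_idx]                          # (B, k, d)
    attn = F.softmax(top_sim / beta, dim=-1)             # attention weights, temperature beta = 2
    group = (attn.unsqueeze(-1) * neighbors).sum(dim=1)  # (B, d) aggregated group embedding
    return (1.0 - alpha) * z_pos + alpha * group         # smoothing weight alpha

def infonce_loss(z1, z2, tau=0.05):
    """Standard in-batch InfoNCE with cosine similarity and temperature tau."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = (z1 @ z2.t()) / tau
    labels = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, labels)

def alpha_cosine(step, total_steps, lo=0.005, hi=0.05):
    """Cosine schedule for alpha (the paper's Equ. 8). Whether it rises or
    decays between 0.005 and 0.05 is an assumption here; this version rises."""
    return lo + 0.5 * (hi - lo) * (1.0 - math.cos(math.pi * step / total_steps))
```

Not shown: the buffer itself, which would presumably be refreshed with detached embeddings from each step in first-in-first-out fashion.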