Learning Contrastive Embedding in Low-Dimensional Space

Authors: Shuo Chen, Chen Gong, Jun Li, Jian Yang, Gang Niu, Masashi Sugiyama

NeurIPS 2022

Reproducibility Variable Result LLM Response
Research Type Experimental In this section, we show experimental results on real-world datasets to validate the effectiveness of our proposed method. In detail, we first conduct an ablation study to reveal the usefulness of our introduced new block and new regularizers. Then, we compare our proposed learning algorithm with existing state-of-the-art models on vision and language tasks. Finally, we test our method on the CL-based reinforcement learning task.
Researcher Affiliation Academia S. Chen and G. Niu are with RIKEN Center for Advanced Intelligence Project (AIP), Japan (Email: {shuo.chen.ya@riken.jp, gang.niu.ml@gmail.com}). C. Gong, J. Li, and J. Yang are with the PCA Lab, Key Lab of Intelligent Perception and Systems for High Dimensional Information of Ministry of Education, and Jiangsu Key Lab of Image and Video Understanding for Social Security, School of Computer Science and Engineering, Nanjing University of Science and Technology, China (E-mail: {junli, chen.gong, csjyang}@njust.edu.cn). M. Sugiyama is with RIKEN Center for Advanced Intelligence Project (AIP), Japan; and also with the Graduate School of Frontier Sciences, The University of Tokyo, Japan (E-mail: sugi@k.u-tokyo.ac.jp).
Pseudocode Yes Algorithm 1: Solving Eq. (8) via SGD. Input: training data X = {x_i}_{i=1}^N; step size η > 0; regularization parameters λ, α > 0; batch size n ∈ N+. Initialize: iteration number t = 0. For t from 1 to T: 1) uniformly pick (n+1) data points {x_{b_j}}_{j=1}^{n+1} from X; 2) compute the gradient of f(Φ, L; {x_{b_j}}_{j=1}^{n+1}) = ℓ(Φ; {x_{b_j}}_{j=1}^{n+1}) + λ R_B(Φ, L; {x_{b_j}}_{j=1}^{n+1}) via Eq. (10); 3) update the learning parameters: Φ^(t+1) ← Φ^(t) − η ∇_Φ f(Φ, L; {x_{b_j}}_{j=1}^{n+1}) and L^(t+1) ← L^(t) − η ∇_L f(Φ, L; {x_{b_j}}_{j=1}^{n+1}) (Eq. (9)). End. Output: the converged Φ̃ and L̃.
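The loop structure of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: `sgd_solve`, `loss_fn`, and `reg_fn` are hypothetical names standing in for Φ, ℓ, and R_B, a plain linear map replaces the learned network, and finite-difference gradients replace the paper's analytic gradients from Eq. (10).

```python
import numpy as np

def sgd_solve(X, loss_fn, reg_fn, T=150, n=8, eta=0.05, lam=0.1, d=2, seed=0):
    """Mini-batch SGD over (Phi, L), following the structure of Algorithm 1."""
    rng = np.random.default_rng(seed)
    N, D = X.shape
    Phi = rng.standard_normal((d, D)) * 0.1  # embedding parameters (toy linear stand-in)
    L = rng.standard_normal((d, d)) * 0.1    # auxiliary matrix L

    def objective(P, M, batch):
        # f = contrastive loss + lambda * regularizer, as in the batch objective
        return loss_fn(P, batch) + lam * reg_fn(P, M, batch)

    def num_grad(f, W, eps=1e-5):
        # finite-difference gradient; the paper instead uses analytic gradients (Eq. (10))
        G = np.zeros_like(W)
        for idx in np.ndindex(W.shape):
            E = np.zeros_like(W); E[idx] = eps
            G[idx] = (f(W + E) - f(W - E)) / (2 * eps)
        return G

    for _ in range(T):
        # 1) uniformly pick (n + 1) data points from X
        batch = X[rng.choice(N, size=n + 1, replace=False)]
        # 2)-3) gradient steps on Phi and L with step size eta
        Phi = Phi - eta * num_grad(lambda P: objective(P, L, batch), Phi)
        L = L - eta * num_grad(lambda M: objective(Phi, M, batch), L)
    return Phi, L
```

Any differentiable surrogate can be plugged in for `loss_fn` and `reg_fn`; the sketch only mirrors the sampling and alternating-update structure of the pseudocode.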
Open Source Code Yes Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)?[Yes] See Section 4 and supplemental material.
Open Datasets Yes We use the STL-10 and CIFAR-10 datasets to train the baseline SimCLR [7] and two implementations of CLLR... We employ the BookCorpus dataset [23] to evaluate the performance of all compared methods on six text classification tasks... Here we select contrastive multiview coding (CMC) [35] as a baseline method... on STL-10 [12], CIFAR-10 [24], and ImageNet-100 [31] datasets... All methods are tested on the DeepMind control suite [34]
Dataset Splits Yes Here the 10-fold cross validation is adopted, and the average classification accuracy is listed in Tab. 2.
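The 10-fold protocol quoted above can be sketched as follows. The nearest-centroid classifier here is a hypothetical stand-in for the paper's actual evaluation model; only the split-and-average protocol behind the reported accuracy is the point.

```python
import numpy as np

def kfold_accuracy(X, y, k=10, seed=0):
    """10-fold cross validation: average test accuracy over k held-out folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), k)  # disjoint index folds
    accs = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        # toy nearest-centroid "model" fit on the k-1 training folds
        classes = np.unique(y[train])
        centroids = np.stack([X[train][y[train] == c].mean(axis=0) for c in classes])
        dist = np.linalg.norm(X[test][:, None, :] - centroids[None], axis=2)
        pred = classes[np.argmin(dist, axis=1)]
        accs.append(np.mean(pred == y[test]))
    return float(np.mean(accs))  # the figure reported as "average classification accuracy"
```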
Hardware Specification Yes The training process is implemented on PyTorch [29] with NVIDIA Tesla V100 GPUs.
Software Dependencies No The training process is implemented on PyTorch [29]... (PyTorch is mentioned, but without the specific version number used for the experiments).
Experiment Setup Yes The regularization parameters λ and α are fixed to 0.1 and 10, respectively. The hyperparameters of compared methods are set to the recommended values according to their original papers. We train all models for 100 and 400 epochs, respectively, with the same batch size and learning rate.