Collaborative Group Learning
Authors: Shaoxiong Feng, Hongshen Chen, Xuancheng Ren, Zhuoye Ding, Kan Li, Xu Sun (pp. 7431–7438)
AAAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical evaluations on both image and text tasks indicate that our method significantly outperforms various state-of-the-art collaborative approaches whilst enhancing computational efficiency. ... We present our results on six publicly available datasets of three classification tasks covering image classification, topic classification, and sentiment analysis. |
| Researcher Affiliation | Collaboration | 1School of Computer Science & Technology, Beijing Institute of Technology 2JD.com 3MOE Key Laboratory of Computational Linguistics, School of EECS, Peking University 4Center for Data Science, Peking University |
| Pseudocode | No | The paper describes the proposed method in detail with mathematical formulations and diagrams but does not include any explicit pseudocode or algorithm blocks. |
| Open Source Code | No | The paper mentions a third-party tool, 'SentencePiece', and provides its GitHub link, but it does not explicitly state that the authors' own source code for the methodology described in the paper is openly available or provide a link to it. |
| Open Datasets | Yes | We present our results on six publicly available datasets of three classification tasks covering image classification, topic classification, and sentiment analysis. ... Table 1 summarizes the statistics of all datasets. (CIFAR-10 (Krizhevsky 2009), CIFAR-100 (Krizhevsky 2009), IMDB Review (Maas et al. 2011), Yelp Review Full (Zhang, Zhao, and LeCun 2015), Yahoo! Answers (Zhang, Zhao, and LeCun 2015), Amazon Review Full (Zhang, Zhao, and LeCun 2015)) |
| Dataset Splits | Yes | Table 1: Statistics of six classification datasets used in our experiments. (includes # Train, # Holdout, # Test columns). The student that obtains the best score on the holdout set is used for evaluation. |
| Hardware Specification | No | The paper discusses the optimization settings and network architectures used (e.g., 'ResNet-18 and ResNet-34', 'VDCNN-9', 'Transformer') but does not specify any hardware details such as GPU or CPU models used for running the experiments. |
| Software Dependencies | No | The paper mentions using 'SentencePiece' for tokenization, but it does not provide specific version numbers for this or any other software libraries or frameworks used in the experiments. |
| Experiment Setup | Yes | Experiment Settings: For ResNet-18 and ResNet-34, we use Adam (Kingma and Ba 2015) for optimization with a mini-batch of size 64. The initial learning rate is 0.001, divided by 2 at epochs 60, 120, and 160 of the total 200 training epochs. For VDCNN-9, we adopted the same experimental settings as (Conneau et al. 2017; Zhang, Zhao, and LeCun 2015). Training is performed with Adam, using a mini-batch of size 64 and a learning rate of 0.001 for the total 20 training epochs. We use SentencePiece (BPE) to tokenize IMDB Review and set the vocabulary size, embedding dimension, and maximum sequence length to 16000, 512, and 512. For Transformer, the number of blocks and heads is 3 and 4, respectively. We set the size of the hidden state and feed-forward layer to 128 and 512. Training is performed with Adam, using a mini-batch of size 64 and a learning rate of 0.0001 for the total 30 training epochs. We run each method 3 times and report mean (std). |
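The ResNet learning-rate schedule quoted above (initial rate 0.001, halved at epochs 60, 120, and 160 over 200 epochs) can be sketched as a small framework-agnostic helper; the function name and signature are hypothetical, not from the paper:

```python
def step_lr(epoch, base_lr=0.001, milestones=(60, 120, 160), factor=0.5):
    """Return the learning rate for a given epoch under a step schedule.

    The rate starts at `base_lr` and is multiplied by `factor` (here 0.5,
    i.e. "divided by 2") once each milestone epoch has been reached.
    """
    lr = base_lr
    for m in milestones:
        if epoch >= m:
            lr *= factor
    return lr


# Example: rates at a few epochs of the 200-epoch ResNet run.
for epoch in (0, 60, 120, 160):
    print(epoch, step_lr(epoch))
```

In a PyTorch training loop the same schedule would typically be expressed with `torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[60, 120, 160], gamma=0.5)`.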