AutoAttend: Automated Attention Representation Search
Authors: Chaoyu Guan, Xin Wang, Wenwu Zhu
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show the superiority of our proposed AutoAttend model over previous state-of-the-arts on eight text classification tasks in NLP and four node classification tasks in GRL. |
| Researcher Affiliation | Academia | Chaoyu Guan, Xin Wang, Wenwu Zhu; Department of Computer Science and Technology, Tsinghua University. Correspondence to: Xin Wang <xin_wang@tsinghua.edu.cn>, Wenwu Zhu <wwzhu@tsinghua.edu.cn>. |
| Pseudocode | No | The paper describes its algorithms verbally (e.g., 'Monte Carlo to estimate the expectation and use Gradient Descent to find the optimal solution', 'evolutionary search') but does not present them in structured pseudocode or an algorithm block; a hedged sketch of such an evolutionary search loop is given below the table. |
| Open Source Code | Yes | Code will be published at https://github.com/THUMNLab/AutoAttend |
| Open Datasets | Yes | The tasks and datasets used in this paper are introduced in Section 5.1. The detailed information of datasets we use is shown in Table 1 (SST, SST-B, AG, DBP, YELP-B, YELP, YAHOO, AMZ-B with number of classes and train/valid/test splits) and in Table 2 (CORA, CITESEER, PUBMED, PPI with #CLASS, #FEATURE, #NODE, #EDGE). The word embeddings are initialized from pretrained GloVe (Pennington et al., 2014) and are fine-tuned during training. |
| Dataset Splits | Yes | Table 1. Detailed information of natural language processing datasets used in this paper. SST: 5 classes, 8,544 train, 1,101 valid, 2,210 test; SST-B: 2 classes, 6,920 train, 872 valid, 1,821 test; AG: 4 classes, 120,000 train, 7,600 test; DBP: 14 classes, 560,000 train, 70,000 test; YELP-B: 2 classes, 560,000 train, 38,000 test; YELP: 5 classes, 650,000 train, 50,000 test; YAHOO: 10 classes, 1,400,000 train, 60,000 test; AMZ-B: 2 classes, 3,600,000 train, 400,000 test (no validation split is listed for AG through AMZ-B). |
| Hardware Specification | No | The paper does not provide specific details about the hardware used, such as GPU models, CPU types, or memory specifications. |
| Software Dependencies | No | The paper mentions software components like GloVe and Adam (an optimizer often used with PyTorch), but it does not specify version numbers for any software dependencies (e.g., 'PyTorch 1.9', 'Python 3.8'). |
| Experiment Setup | Yes | For searching in NLP, we set the layer number to 24 to stay consistent with previous works. The word embeddings are initialized from pretrained GloVe (Pennington et al., 2014) and are fine-tuned during training. When searching, we use hidden size 64, batch size 128, learning rate 0.005 with Adam (Kingma & Ba, 2015), dropout 0.1, and max input sentence length 64. A hedged configuration sketch follows the table. |
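Since the paper reports no structured pseudocode, the following is a minimal, self-contained sketch of the kind of evolutionary search it describes only verbally. The architecture encoding (one operation choice per layer), the candidate operation set, and the toy fitness function are illustrative assumptions, not the authors' implementation, which evaluates candidates with shared supernet weights on validation data.

```python
import random

SEARCH_SPACE = ["conv", "rnn", "self_attention", "skip"]  # assumed operation set
NUM_LAYERS = 24  # layer number used for the NLP search in the paper


def sample_architecture():
    """Randomly choose one operation per layer."""
    return [random.choice(SEARCH_SPACE) for _ in range(NUM_LAYERS)]


def mutate(arch):
    """Re-sample the operation at one randomly chosen layer."""
    child = list(arch)
    child[random.randrange(NUM_LAYERS)] = random.choice(SEARCH_SPACE)
    return child


def fitness(arch):
    """Toy stand-in for validation accuracy scored with shared supernet weights."""
    return sum(op == "self_attention" for op in arch) + random.random()


def evolutionary_search(population_size=50, generations=200, sample_size=10):
    # Start from a population of randomly sampled architectures.
    population = [sample_architecture() for _ in range(population_size)]
    scores = [fitness(a) for a in population]

    for _ in range(generations):
        # Tournament selection: the best of a random subsample is the parent.
        idx = random.sample(range(len(population)), sample_size)
        parent = population[max(idx, key=lambda i: scores[i])]

        # Mutate the parent, evaluate the child, and age out the oldest
        # individual (regularised-evolution style replacement).
        child = mutate(parent)
        population.append(child)
        scores.append(fitness(child))
        population.pop(0)
        scores.pop(0)

    best = max(range(len(population)), key=lambda i: scores[i])
    return population[best]


if __name__ == "__main__":
    print(evolutionary_search())
```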
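For reference, a minimal sketch of the NLP search-stage configuration quoted above: 24 layers, hidden size 64, batch size 128, learning rate 0.005 with Adam, dropout 0.1, max input sentence length 64, and GloVe-initialised embeddings fine-tuned during training. The vocabulary size and the plain Transformer stack standing in for the searched encoder are assumptions for illustration; the paper's actual encoder is drawn from its attention search space.

```python
import torch
import torch.nn as nn

# Search-stage hyperparameters reported in the paper for NLP.
config = dict(
    num_layers=24,
    hidden_size=64,
    batch_size=128,
    learning_rate=0.005,
    dropout=0.1,
    max_seq_len=64,
)

vocab_size = 30_000  # assumed; not stated in the quoted setup
# The paper initialises embeddings from pretrained GloVe vectors and
# fine-tunes them; random vectors stand in for GloVe in this sketch.
pretrained = torch.randn(vocab_size, config["hidden_size"])
embedding = nn.Embedding.from_pretrained(pretrained, freeze=False)

# Stand-in encoder: a plain Transformer stack with the stated depth and
# dropout (the paper's actual encoder comes from its search space).
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(
        d_model=config["hidden_size"],
        nhead=4,
        dropout=config["dropout"],
        batch_first=True,
    ),
    num_layers=config["num_layers"],
)

params = list(embedding.parameters()) + list(encoder.parameters())
optimizer = torch.optim.Adam(params, lr=config["learning_rate"])

# One dummy optimisation step with a batch of the stated shape.
tokens = torch.randint(0, vocab_size, (config["batch_size"], config["max_seq_len"]))
loss = encoder(embedding(tokens)).mean()
loss.backward()
optimizer.step()
```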