U-BERT: Pre-training User Representations for Improved Recommendation

Authors: Zhaopeng Qiu, Xian Wu, Jingyue Gao, Wei Fan

AAAI 2021, pp. 4320-4327 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on six benchmark datasets from different domains demonstrate the state-of-the-art performance of U-BERT. The experimental results of all models are summarized in Table 2.
Researcher Affiliation | Collaboration | 1 Tencent Medical AI Lab, 2 Peking University; {zhaopengqiu, kevinxwu, davidwfan}@tencent.com, gaojingyue1997@pku.edu.cn
Pseudocode | No | The paper describes its methods using equations and text but does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide a direct link to, or an explicit statement about the release of, its source code.
Open Datasets | Yes | Dataset: We choose the experimental datasets from the following two sources: Amazon product review datasets (http://jmcauley.ucsd.edu/data/amazon/links.html): ... Yelp challenge dataset (https://www.kaggle.com/yelp-dataset/yelp-dataset): ...
Dataset Splits | Yes | We randomly selected 80% of user-item pairs in each fine-tuning dataset for training, 10% for validation, and 10% for test. (A split sketch follows the table.)
Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments (e.g., GPU/CPU models, memory specifications).
Software Dependencies | No | The paper mentions 'PyTorch' and using 'the original BERT's weights' but does not specify their version numbers or other software dependencies with specific versions.
Experiment Setup | Yes | The dimensionality of all embeddings is set to 768, i.e., d = 768. In the pre-training and fine-tuning stages, we set the maximum length of the reviews to 200 and 220, respectively. Since the reviews of the Music domain are relatively longer, we set the maximum review length of this domain to 300. The weight β in the loss function is set to 3. At both stages, we use the Adam optimizer with a learning rate of 3 × 10⁻⁵. Other training settings, such as the dropout rate and weight decay rate, are kept the same as in the original BERT. (A configuration sketch follows the table.)
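
Since no official code is released, the 80%/10%/10% random split of user-item pairs reported under Dataset Splits could be approximated as in the sketch below. This is a minimal sketch, assuming the pairs are held in a pandas DataFrame; the function name, column names, and random seed are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the 80/10/10 random split of user-item pairs per
# fine-tuning dataset; all names here are assumptions, not the paper's code.
import pandas as pd

def split_user_item_pairs(pairs: pd.DataFrame, seed: int = 42):
    """Randomly split user-item pairs into 80% train, 10% validation, 10% test."""
    shuffled = pairs.sample(frac=1.0, random_state=seed).reset_index(drop=True)
    n = len(shuffled)
    n_train = int(0.8 * n)
    n_valid = int(0.1 * n)
    train = shuffled.iloc[:n_train]
    valid = shuffled.iloc[n_train:n_train + n_valid]
    test = shuffled.iloc[n_train + n_valid:]
    return train, valid, test

# Usage (assuming a CSV of review records with user and item columns):
# pairs = pd.read_csv("reviews.csv")
# train, valid, test = split_user_item_pairs(pairs)
```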
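
The hyperparameters quoted under Experiment Setup can be gathered into a small configuration object. The dataclass layout and field names below are illustrative assumptions; only the numeric values come from the paper.

```python
# Illustrative configuration of the reported hyperparameters; names are
# assumptions, values are taken from the Experiment Setup row above.
from dataclasses import dataclass

@dataclass
class UBertTrainingConfig:
    embed_dim: int = 768               # d = 768 for all embeddings
    max_review_len_pretrain: int = 200
    max_review_len_finetune: int = 220
    max_review_len_music: int = 300    # Music-domain reviews are longer
    loss_weight_beta: float = 3.0      # weight beta in the loss function
    learning_rate: float = 3e-5        # Adam, pre-training and fine-tuning

# Usage with a PyTorch model (the paper reports the Adam optimizer):
# import torch
# cfg = UBertTrainingConfig()
# optimizer = torch.optim.Adam(model.parameters(), lr=cfg.learning_rate)
```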