Attention-over-Attention Field-Aware Factorization Machine

Authors: Zhibo Wang, Jinxin Ma, Yongquan Zhang, Qian Wang, Ju Ren, Peng Sun

AAAI 2020, pp. 6323–6330 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results show that the proposed AoAFFM improves FM and FFM by a large margin, and outperforms state-of-the-art algorithms on three public benchmark datasets.
Researcher Affiliation | Academia | (1) Key Lab of Aerospace Information Security and Trusted Computing, School of Cyber Science and Engineering, Wuhan University; (2) Department of Computer Science and Technology, Tsinghua University; (3) State Key Laboratory of Industrial Control Technology, Zhejiang University; (4) Key Laboratory of Computer Network Technology of Jiangsu Province, Southeast University
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper provides no explicit statement about releasing its source code and no link to a code repository for the described methodology.
Open Datasets | Yes | We tested the performance of AoAFFM on three real-world benchmark datasets: Movielens [2], Frappe [3], and Criteo [4]. [2] https://grouplens.org/datasets/movielens/ [3] http://baltrunas.info/research-menu/frappe [4] http://labs.criteo.com/2014/02/kaggle-display-advertising-challenge-dataset/
Dataset Splits | Yes | For the Movielens and Frappe datasets, we randomly split them into training (70%), validation (20%), and test (10%) sets. For the Criteo dataset, we use the train, validation, and test sets it provides. (A split sketch follows this table.)
Hardware Specification | No | The paper alludes to 'computation power limits', implying hardware constraints, but gives no specifics such as GPU or CPU models or cloud instance types.
Software Dependencies | No | The paper mentions a 'tensorflow version' of its FM implementation but does not name any software dependency with a version number, as reproducibility would require.
Experiment Setup | Yes | The learning rate is searched in [0.005, 0.01, 0.05, 0.1], and the best one is selected for each model. All models are learned using Adagrad in minibatches. The batch size for Movielens is set to 4096; for Frappe, it is 128. Without special mention, t1, which denotes the size of the hidden layer, is set to 256 for the best performance. We adopted an early-stopping strategy based on validation-set performance and carefully tuned the dropout ratios and regularization strengths for all models to prevent over-fitting. (A training-loop sketch follows this table.)
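
The 70/20/10 random split reported for Movielens and Frappe is straightforward to mirror. Below is a minimal sketch in Python/NumPy; the random seed and the dataset size used in the example are assumptions, since the paper reports neither.

```python
import numpy as np

def random_split(num_rows: int, seed: int = 2020):
    """Shuffle row indices and carve out 70% train / 20% validation / 10% test,
    mirroring the split described for Movielens and Frappe.
    The seed is an assumption: the paper does not report one."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(num_rows)
    n_train = int(0.7 * num_rows)
    n_valid = int(0.2 * num_rows)
    return (idx[:n_train],
            idx[n_train:n_train + n_valid],
            idx[n_train + n_valid:])

# Example with a hypothetical dataset size; Criteo is excluded because the
# paper uses the splits shipped with that dataset.
train_idx, valid_idx, test_idx = random_split(1_000_000)
```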
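
Similarly, the quoted setup amounts to a small grid search over learning rates with Adagrad, minibatch training, and early stopping on the validation set. A minimal TensorFlow sketch of that loop is below; the placeholder network, the squared-error loss, the dropout ratio, the epoch cap, and the early-stopping patience are all assumptions standing in for details the paper does not specify, and the AoAFFM architecture itself is not reproduced here.

```python
import tensorflow as tf

LEARNING_RATES = [0.005, 0.01, 0.05, 0.1]  # grid reported in the paper
BATCH_SIZE = 4096                          # Movielens; the paper uses 128 for Frappe
HIDDEN_SIZE = 256                          # t1, the hidden-layer size

def build_model(num_features: int) -> tf.keras.Model:
    """Placeholder network standing in for AoAFFM; only the training
    procedure below mirrors the paper's reported setup."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=(num_features,)),
        tf.keras.layers.Dense(HIDDEN_SIZE, activation="relu"),
        tf.keras.layers.Dropout(0.5),  # ratio was "carefully tuned"; 0.5 is an assumption
        tf.keras.layers.Dense(1),
    ])

def train_with_grid(x_train, y_train, x_valid, y_valid):
    """Select the best learning rate by validation loss, with early stopping."""
    best_model, best_val = None, float("inf")
    for lr in LEARNING_RATES:
        model = build_model(x_train.shape[1])
        model.compile(optimizer=tf.keras.optimizers.Adagrad(learning_rate=lr),
                      loss="mse")  # squared-error loss is an assumption
        # Early stopping on validation performance, as described in the paper;
        # the patience value is an assumption.
        stop = tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=5,
                                                restore_best_weights=True)
        hist = model.fit(x_train, y_train,
                         batch_size=BATCH_SIZE, epochs=100,
                         validation_data=(x_valid, y_valid),
                         callbacks=[stop], verbose=0)
        val = min(hist.history["val_loss"])
        if val < best_val:
            best_val, best_model = val, model
    return best_model
```

Regularization strength is omitted above; the paper tunes it per model but does not report the chosen values.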