Attention-over-Attention Field-Aware Factorization Machine
Authors: Zhibo Wang, Jinxin Ma, Yongquan Zhang, Qian Wang, Ju Ren, Peng Sun
AAAI 2020, pp. 6323-6330 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results show that the proposed AoAFFM improves FM and FFM by a large margin, and outperforms state-of-the-art algorithms on three public benchmark datasets. |
| Researcher Affiliation | Academia | (1) Key Lab of Aerospace Information Security and Trusted Computing, School of Cyber Science and Engineering, Wuhan University; (2) Department of Computer Science and Technology, Tsinghua University; (3) State Key Laboratory of Industrial Control Technology, Zhejiang University; (4) Key Laboratory of Computer Network Technology of Jiangsu Province, Southeast University |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any explicit statements about the release of its source code or a link to a code repository for the described methodology. |
| Open Datasets | Yes | We tested the performance of AoAFFM on three real-world benchmark datasets: Movielens (https://grouplens.org/datasets/movielens/), Frappe (http://baltrunas.info/research-menu/frappe), and Criteo (http://labs.criteo.com/2014/02/kaggle-display-advertising-challenge-dataset/). |
| Dataset Splits | Yes | For the Movielens and Frappe datasets, we randomly split them into training (70%), validation (20%), and test (10%) sets. For the Criteo dataset, we utilize the train, validation, and test sets that it provides. A hedged sketch of such a split appears after this table. |
| Hardware Specification | No | The paper mentions 'computation power limits' implying hardware was used, but it does not provide any specific hardware details such as GPU or CPU models, or cloud instance types. |
| Software Dependencies | No | The paper mentions a 'tensorflow version' for the FM implementation but does not specify any software dependencies with version numbers for reproducibility. |
| Experiment Setup | Yes | The learning rate is searched over [0.005, 0.01, 0.05, 0.1], and the best one is selected for each model. All models are learned using Adagrad in minibatches. The batch size is 4096 for Movielens and 128 for Frappe. Unless otherwise mentioned, t1, the size of the hidden layer, is set to 256 for the best performance. An early-stopping strategy based on validation-set performance was adopted, and dropout ratios and regularization strengths were carefully tuned for all models to prevent over-fitting. A hedged training-configuration sketch appears after this table. |
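
For reproduction purposes, the 70%/20%/10% random split described in the Dataset Splits row can be recreated as follows. This is a minimal sketch, assuming each example is one line of a text file; the file name, the loader, and the random seed are illustrative assumptions, not details from the paper.

```python
# Minimal sketch of the 70%/20%/10% random split. Assumptions (not from the
# paper): one example per line, the file name, and the fixed seed.
import numpy as np

def split_dataset(lines, seed=2020):
    """Shuffle examples and return 70% train, 20% validation, 10% test."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(lines))
    n_train = int(0.7 * len(lines))
    n_valid = int(0.2 * len(lines))
    train = [lines[i] for i in idx[:n_train]]
    valid = [lines[i] for i in idx[n_train:n_train + n_valid]]
    test  = [lines[i] for i in idx[n_train + n_valid:]]
    return train, valid, test

with open("movielens.libfm") as f:  # hypothetical file name
    train, valid, test = split_dataset(f.read().splitlines())
```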
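The Experiment Setup row can likewise be translated into a concrete training configuration. The sketch below wires together the reported choices (Adagrad, the learning-rate grid, the Movielens batch size of 4096, a hidden size of 256, early stopping on the validation set) in TensorFlow/Keras. Since the paper releases no code, `build_aoaffm`, the dropout value, the loss, the epoch budget, and the patience are all placeholders.

```python
# Hedged sketch of the reported training setup: Adagrad, learning-rate grid
# search, minibatches, early stopping on the validation set. The model
# constructor `build_aoaffm`, the loss, epochs, and patience are assumptions.
import tensorflow as tf

LEARNING_RATES = [0.005, 0.01, 0.05, 0.1]  # grid reported in the paper
BATCH_SIZE = 4096                          # Movielens; 128 for Frappe

best_model, best_val_loss = None, float("inf")
for lr in LEARNING_RATES:
    model = build_aoaffm(hidden_size=256, dropout=0.5)  # t1 = 256; dropout tuned
    model.compile(optimizer=tf.keras.optimizers.Adagrad(learning_rate=lr),
                  loss="mse")
    early_stop = tf.keras.callbacks.EarlyStopping(
        monitor="val_loss", patience=5, restore_best_weights=True)
    history = model.fit(train_x, train_y,
                        validation_data=(valid_x, valid_y),
                        batch_size=BATCH_SIZE, epochs=200,
                        callbacks=[early_stop], verbose=0)
    val_loss = min(history.history["val_loss"])
    if val_loss < best_val_loss:
        best_model, best_val_loss = model, val_loss
```

Restoring the best weights at early stopping mirrors selecting the model by validation performance; note that only the learning rate is grid-searched here, whereas the paper also tunes dropout ratios and regularization strengths per model.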