Field-wise Learning for Multi-field Categorical Data

Authors: Zhibin Li, Jian Zhang, Yongshun Gong, Yazhou Yao, Qiang Wu

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The experiment results on two large-scale datasets show the superior performance of our model, the trend of the generalization error bound, and the interpretability of learning outcomes.
Researcher Affiliation | Academia | University of Technology Sydney; Nanjing University of Science and Technology
Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Our code is available at https://github.com/lzb5600/Field-wise-Learning.
Open Datasets | Yes | Criteo: ... http://labs.criteo.com/2014/02/kaggle-display-advertising-challenge-dataset/ Avazu: ... https://www.kaggle.com/c/avazu-ctr-prediction
Dataset Splits | Yes | We randomly split the data into the train (80%), validation (10%) and test (10%) sets. (A minimal split sketch follows the table.)
Hardware Specification | Yes | All experiments were run on a Linux workstation with one NVIDIA Quadro RTX6000 GPU of 24GB memory.
Software Dependencies | No | The paper states, 'We implemented our model using PyTorch [29].', but does not specify a version number for PyTorch or any other software.
Experiment Setup | Yes | The weight λ for regularization term was selected from {10^-3, ..., 10^-8}. For setting the rank r in our model, we tried two strategies: 1) chose different ranks for different fields by r_i = log_b(d_i) and selected b from {1.2, 1.4, 1.6, 1.8, 2, 3, 4}; 2) set the rank to be the same for all fields and selected r from {4, 8, 12, ..., 28}. For all models, the learning rates were selected from {0.01, 0.1} and the weight decays or weights for L2-regularization were selected from {10^-3, ..., 10^-8} where applicable. We applied the early-stopping strategy based on the validation sets for all models. All models except GBDT used the Adagrad [23] optimizer with a batch size of 2048. (A configuration sketch follows the table.)
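
For concreteness, here is a minimal sketch of the 80%/10%/10% random split quoted in the Dataset Splits row. It assumes the examples can be addressed by integer index; the function name, seed, and sample count are illustrative and not taken from the paper or the released code.

import numpy as np

def split_indices(n_samples, seed=0):
    """Shuffle example indices and split them 80% / 10% / 10% into train / val / test."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    n_train = int(0.8 * n_samples)
    n_val = int(0.1 * n_samples)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

# Example: split a hypothetical dataset of 1,000,000 examples.
train_idx, val_idx, test_idx = split_indices(1_000_000)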
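
The Experiment Setup row can likewise be read as a small hyperparameter grid plus a standard PyTorch Adagrad configuration. The sketch below only illustrates that reading: the field cardinalities, the stand-in embedding model, and the chosen grid points are assumptions, not the authors' actual architecture or selected values.

import math
import torch
import torch.nn as nn

# Hypothetical per-field vocabulary sizes d_i (not from the paper).
field_cardinalities = [1460, 583, 10000]

# Rank strategy 1 from the quoted setup: field-specific rank r_i = log_b(d_i),
# with b chosen from {1.2, 1.4, 1.6, 1.8, 2, 3, 4}.
b = 1.6
ranks = [max(1, round(math.log(d, b))) for d in field_cardinalities]

# Stand-in model: one embedding table per field (not the authors' model; it only
# serves to show how the optimizer and regularization weight are wired up).
model = nn.ModuleList(nn.Embedding(d, r) for d, r in zip(field_cardinalities, ranks))

# Grids quoted above: learning rate from {0.01, 0.1}, weight decay / L2 weight
# from {10^-3, ..., 10^-8}; batch size fixed at 2048, with early stopping on the
# validation split during model selection.
lr_grid = [0.01, 0.1]
lambda_grid = [10.0 ** -k for k in range(3, 9)]

optimizer = torch.optim.Adagrad(model.parameters(), lr=lr_grid[0], weight_decay=lambda_grid[0])
batch_size = 2048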