Field-wise Learning for Multi-field Categorical Data
Authors: Zhibin Li, Jian Zhang, Yongshun Gong, Yazhou Yao, Qiang Wu
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The experiment results on two large-scale datasets show the superior performance of our model, the trend of the generalization error bound, and the interpretability of learning outcomes. |
| Researcher Affiliation | Academia | ¹University of Technology Sydney, ²Nanjing University of Science and Technology |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available at https://github.com/lzb5600/Field-wise-Learning. |
| Open Datasets | Yes | Criteo: ... http://labs.criteo.com/2014/02/kaggle-display-advertising-challenge-dataset/ Avazu: ... https://www.kaggle.com/c/avazu-ctr-prediction |
| Dataset Splits | Yes | We randomly split the data into the train (80%), validation (10%) and test (10%) sets. (A split sketch follows this table.) |
| Hardware Specification | Yes | All experiments were run on a Linux workstation with one NVIDIA Quadro RTX6000 GPU of 24GB memory. |
| Software Dependencies | No | The paper states, 'We implemented our model using PyTorch [29]', but does not specify a version number for PyTorch or any other software. |
| Experiment Setup | Yes | The weight λ for the regularization term was selected from {10⁻³, ..., 10⁻⁸}. For setting the rank r in our model, we tried two strategies: 1) chose a different rank for each field by r_i = log_b(d_i) and selected b from {1.2, 1.4, 1.6, 1.8, 2, 3, 4}; 2) set the rank to be the same for all fields and selected r from {4, 8, 12, ..., 28}. For all models, the learning rates were selected from {0.01, 0.1} and the weight decays or weights for L2-regularization were selected from {10⁻³, ..., 10⁻⁸} where applicable. We applied the early-stopping strategy based on the validation sets for all models. All models except GBDT used the Adagrad [23] optimizer with a batch size of 2048. (A hedged training sketch follows this table.) |
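Below is a minimal sketch of the 80/10/10 random split quoted in the Dataset Splits row, assuming a shuffled index split. The paper does not state the exact splitting procedure or a random seed, so the seed, rounding, and function name here are illustrative.

```python
import numpy as np

def random_split_indices(n_samples: int, seed: int = 0):
    """Shuffle row indices and cut them 80/10/10 (train/validation/test).

    The seed and the index-based approach are assumptions; the paper only
    states that the data were split randomly into 80%/10%/10%.
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    n_train = int(0.8 * n_samples)
    n_val = int(0.1 * n_samples)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

# Example: split a dataset with 1,000,000 rows.
train_idx, val_idx, test_idx = random_split_indices(1_000_000)
```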
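The Experiment Setup row pins down the optimizer (Adagrad), the batch size (2048), the learning-rate and weight-decay grids, and validation-based early stopping, but not the full training loop. The sketch below is one plausible reading of that setup in PyTorch; the `model`, the data loaders, the patience value, and the binary cross-entropy loss (standard for CTR prediction on Criteo/Avazu) are assumptions, not details from the paper.

```python
import torch

def evaluate(model, loader, criterion):
    """Mean validation loss; used here only for the early-stopping check."""
    model.eval()
    total, count = 0.0, 0
    with torch.no_grad():
        for features, labels in loader:
            total += criterion(model(features), labels.float()).item() * len(labels)
            count += len(labels)
    return total / count

def train(model, train_loader, val_loader, lr=0.1, weight_decay=1e-6,
          max_epochs=50, patience=3):
    # Reported setup: Adagrad optimizer with a batch size of 2048 (the
    # DataLoaders are assumed to already use batch_size=2048); lr is drawn
    # from {0.01, 0.1} and weight_decay from {1e-3, ..., 1e-8}.
    optimizer = torch.optim.Adagrad(model.parameters(), lr=lr,
                                    weight_decay=weight_decay)
    criterion = torch.nn.BCEWithLogitsLoss()  # assumed loss for binary CTR labels
    best_val, stalled = float("inf"), 0
    for _ in range(max_epochs):
        model.train()
        for features, labels in train_loader:
            optimizer.zero_grad()
            loss = criterion(model(features), labels.float())
            loss.backward()
            optimizer.step()
        val_loss = evaluate(model, val_loader, criterion)
        if val_loss < best_val:
            best_val, stalled = val_loss, 0
        else:
            stalled += 1
            if stalled >= patience:  # early stopping on the validation set
                break
    return model
```

Under rank strategy 1), a per-field rank could be derived as, e.g., `max(1, round(math.log(d_i, b)))` for a field with cardinality `d_i`; the exact rounding is not specified in the quoted setup, so that expression is an assumption.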