Field-wise Learning for Multi-field Categorical Data
Authors: Zhibin Li, Jian Zhang, Yongshun Gong, Yazhou Yao, Qiang Wu
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The experiment results on two large-scale datasets show the superior performance of our model, the trend of the generalization error bound, and the interpretability of learning outcomes. |
| Researcher Affiliation | Academia | ¹University of Technology Sydney, ²Nanjing University of Science and Technology |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available at https://github.com/lzb5600/Field-wise-Learning. |
| Open Datasets | Yes | Criteo: ... http://labs.criteo.com/2014/02/kaggle-display-advertising-challenge-dataset/ Avazu: ... https://www.kaggle.com/c/avazu-ctr-prediction |
| Dataset Splits | Yes | We randomly split the data into the train (80%), validation (10%) and test (10%) sets. (A split sketch follows this table.) |
| Hardware Specification | Yes | All experiments were run on a Linux workstation with one NVIDIA Quadro RTX6000 GPU of 24GB memory. |
| Software Dependencies | No | The paper states, 'We implemented our model using PyTorch [29]', but does not specify a version number for PyTorch or any other software. |
| Experiment Setup | Yes | The weight λ for the regularization term was selected from {10⁻³, ..., 10⁻⁸}. For setting the rank r in our model, we tried two strategies: 1) chose a different rank for each field by r_i = log_b(d_i) and selected b from {1.2, 1.4, 1.6, 1.8, 2, 3, 4}; 2) set the rank to be the same for all fields and selected r from {4, 8, 12, ..., 28}. For all models, the learning rates were selected from {0.01, 0.1} and the weight decays or weights for L2-regularization were selected from {10⁻³, ..., 10⁻⁸} where applicable. We applied the early-stopping strategy based on the validation sets for all models. All models except GBDT used the Adagrad [23] optimizer with a batch size of 2048. (A hedged training sketch follows this table.) |
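Below is a minimal sketch of the 80/10/10 random split quoted in the Dataset Splits row, assuming a shuffled index split. The paper does not state the exact splitting procedure or a random seed, so the seed, rounding, and function name here are illustrative.

```python
import numpy as np

def random_split_indices(n_samples: int, seed: int = 0):
    """Shuffle row indices and cut them 80/10/10 (train/validation/test).

    The seed and the index-based approach are assumptions; the paper only
    states that the data were split randomly into 80%/10%/10%.
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    n_train = int(0.8 * n_samples)
    n_val = int(0.1 * n_samples)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

# Example: split a dataset with 1,000,000 rows.
train_idx, val_idx, test_idx = random_split_indices(1_000_000)
```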
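The Experiment Setup row pins down the optimizer (Adagrad), the batch size (2048), the learning-rate and weight-decay grids, and validation-based early stopping, but not the full training loop. The sketch below is one plausible reading of that setup in PyTorch; the `model`, the data loaders, the patience value, and the binary cross-entropy loss (standard for CTR prediction on Criteo/Avazu) are assumptions, not details from the paper.

```python
import torch

def evaluate(model, loader, criterion):
    """Mean validation loss; used here only for the early-stopping check."""
    model.eval()
    total, count = 0.0, 0
    with torch.no_grad():
        for features, labels in loader:
            total += criterion(model(features), labels.float()).item() * len(labels)
            count += len(labels)
    return total / count

def train(model, train_loader, val_loader, lr=0.1, weight_decay=1e-6,
          max_epochs=50, patience=3):
    # Reported setup: Adagrad optimizer with a batch size of 2048 (the
    # DataLoaders are assumed to already use batch_size=2048); lr is drawn
    # from {0.01, 0.1} and weight_decay from {1e-3, ..., 1e-8}.
    optimizer = torch.optim.Adagrad(model.parameters(), lr=lr,
                                    weight_decay=weight_decay)
    criterion = torch.nn.BCEWithLogitsLoss()  # assumed loss for binary CTR labels
    best_val, stalled = float("inf"), 0
    for _ in range(max_epochs):
        model.train()
        for features, labels in train_loader:
            optimizer.zero_grad()
            loss = criterion(model(features), labels.float())
            loss.backward()
            optimizer.step()
        val_loss = evaluate(model, val_loader, criterion)
        if val_loss < best_val:
            best_val, stalled = val_loss, 0
        else:
            stalled += 1
            if stalled >= patience:  # early stopping on the validation set
                break
    return model
```

Under rank strategy 1), a per-field rank could be derived as, e.g., `max(1, round(math.log(d_i, b)))` for a field with cardinality `d_i`; the exact rounding is not specified in the quoted setup, so that expression is an assumption.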