Learning Optimal Tree Models under Beam Search

Authors: Jingwei Zhuo, Ziru Xu, Wei Dai, Han Zhu, Han Li, Jian Xu, Kun Gai

ICML 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experiments on both synthetic and real data verify the rationality of our theoretical analysis and demonstrate the superiority of our algorithm compared to state-of-the-art methods."
Researcher Affiliation | Industry | Jingwei Zhuo, Ziru Xu, Wei Dai, Han Zhu, Han Li, Jian Xu, Kun Gai (Alibaba Group). Correspondence to: Jingwei Zhuo <zjw169463@alibaba-inc.com>.
Pseudocode | Yes | "Algorithm 1: Learning Optimal Tree Models under Beam Search" (a hedged sketch of the training loop follows the table)
Open Source Code | No | The paper does not include an unambiguous statement or link indicating the release of open-source code for the described methodology.
Open Datasets | Yes | "Our experiments are conducted on two large-scale real datasets for recommendation tasks: Amazon Books (McAuley et al., 2015; He & McAuley, 2016) and UserBehavior (Zhu et al., 2018)."
Dataset Splits | Yes | "We discard the user-based data which has less than 10 items and split the rest into training set D_tr, validation set D_val and testing set D_te in the same way as Zhu et al. (2018; 2019)."
Hardware Specification | Yes | "This number is averaged over 5000 training iterations on a single Tesla P100-PCIE-16GB GPU."
Software Dependencies | No | The paper does not provide ancillary software details with version numbers (e.g., library or solver names such as Python 3.8 or CPLEX 12.4).
Experiment Setup | Yes | "More specifically, T is set to be a random binary tree over I and g(x, n) = θ_n^⊤ x + b_n is parameterized as a linear scorer, where θ_n ∈ R^d and b_n ∈ R are trainable parameters. The neural network consists of three fully connected layers with hidden sizes 128, 64 and 24, and parametric ReLU is used as the activation function. In experiments, we set b to be a negative value such that the number of non-zero entries is less than 0.1M, to simulate the practical case where the number of relevant targets is much smaller than the target set size. Throughout the experiments, we use OTM to denote the tree model trained according to Algorithm 1, since its goal is to learn optimal tree models under beam search. To perform an ablation study, we consider two variants of OTM: OTM (-BS) differs from OTM by replacing B_h(x; θ_t) with S_h(y) = S_h^+(y) ∪ S_h^-(y), and OTM (-OptEst) differs from OTM by replacing ẑ_n(x; θ_t) in Eq. (13) with z_n in Eq. (1)." (a scorer sketch follows the table)
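
The Experiment Setup row quotes two scorer parameterizations: a per-node linear scorer g(x, n) = θ_n^⊤ x + b_n and a three-layer fully connected network (hidden sizes 128, 64, 24) with parametric ReLU. Below is a minimal PyTorch sketch of both. The quoted text fixes only the layer sizes and activation; how the node enters the network (here, a learned node embedding concatenated with x) and every dimension choice beyond that are assumptions, not details from the paper.

```python
import torch
import torch.nn as nn

class LinearNodeScorer(nn.Module):
    """Per-node linear scorer g(x, n) = theta_n^T x + b_n, as quoted."""
    def __init__(self, num_nodes: int, d: int):
        super().__init__()
        self.theta = nn.Embedding(num_nodes, d)  # theta_n in R^d
        self.bias = nn.Embedding(num_nodes, 1)   # b_n in R

    def forward(self, x: torch.Tensor, nodes: torch.Tensor) -> torch.Tensor:
        # x: (batch, d); nodes: (batch,) integer node ids -> scores: (batch,)
        return (self.theta(nodes) * x).sum(dim=-1) + self.bias(nodes).squeeze(-1)

class MLPNodeScorer(nn.Module):
    """Three fully connected layers (128, 64, 24) with PReLU, as quoted,
    followed by a scalar output head (an assumption)."""
    def __init__(self, num_nodes: int, d: int, node_dim: int = 24):
        super().__init__()
        # Assumption: the node is represented by a learned embedding
        # concatenated with the input features x.
        self.node_emb = nn.Embedding(num_nodes, node_dim)
        self.net = nn.Sequential(
            nn.Linear(d + node_dim, 128), nn.PReLU(),
            nn.Linear(128, 64), nn.PReLU(),
            nn.Linear(64, 24), nn.PReLU(),
            nn.Linear(24, 1),
        )

    def forward(self, x: torch.Tensor, nodes: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([x, self.node_emb(nodes)], dim=-1)).squeeze(-1)
```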
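
Algorithm 1 is only named in the table, but the ablation quote identifies its moving parts: beam search B_h(x; θ_t) selects which nodes are trained at each level, and pseudo-targets ẑ_n(x; θ_t) replace the raw labels z_n. The sketch below shows that level-wise loop in generic form; it is not the paper's exact algorithm. `tree.root`, `tree.children`, `tree.height`, and the `pseudo_target` callback are hypothetical stand-ins, and the paper's estimator from Eq. (13) is not reproduced here.

```python
import torch
import torch.nn.functional as F

def beam_search_levels(scorer, x, tree, beam_size):
    """Level-wise beam search: expand the current beam's children and keep
    the top-`beam_size` nodes by score at every level of the tree."""
    beam, per_level = [tree.root], []
    with torch.no_grad():  # beam selection uses a parameter snapshot (theta_t)
        for _ in range(tree.height):  # tree.* are hypothetical helpers
            candidates = [c for n in beam for c in tree.children(n)]
            nodes = torch.tensor(candidates)
            scores = scorer(x.expand(len(candidates), -1), nodes)
            top = scores.topk(min(beam_size, len(candidates))).indices.tolist()
            beam = [candidates[i] for i in top]
            per_level.append(beam)
    return per_level

def otm_style_loss(scorer, x, per_level, pseudo_target):
    """Binary cross-entropy on beam nodes against pseudo-targets.
    `pseudo_target(n)` stands in for the paper's estimator z_hat_n(x; theta_t)."""
    loss = torch.zeros(())
    for beam in per_level:
        nodes = torch.tensor(beam)
        logits = scorer(x.expand(len(beam), -1), nodes)
        targets = torch.tensor([pseudo_target(n) for n in beam], dtype=torch.float)
        loss = loss + F.binary_cross_entropy_with_logits(logits, targets, reduction="sum")
    return loss
```

Under this reading, a training step runs `beam_search_levels`, computes `otm_style_loss`, and backpropagates. The two quoted ablations then map onto the sketch directly: OTM (-BS) would replace the beam sets with S_h(y) = S_h^+(y) ∪ S_h^-(y), and OTM (-OptEst) would pass the raw labels z_n as `pseudo_target`.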