A Domain Generalization Perspective on Listwise Context Modeling

Authors: Lin Zhu, Yihong Chen, Bowen He

AAAI 2019, pp. 5965-5972 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our techniques on benchmark datasets, demonstrating that QILCM outperforms previous state-of-the-art approaches by a substantial margin.
Researcher Affiliation | Industry | Lin Zhu, Yihong Chen, Bowen He; Ctrip Travel Network Technology Co., Limited; {zhulb, yihongchen, bwhe}@ctrip.com
Pseudocode | No | The paper describes the model architecture using equations and descriptive text, but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper provides links to the implementations of baseline methods (DCM and DLCM) but does not provide a link or explicit statement for the open-source code of their proposed QILCM.
Open Datasets | Yes | We used two large-scale LETOR datasets: Istella-S (Lucchese et al. 2016) and Microsoft Letor 30K (Qin and Liu 2013). ... For this task, we used Airline Itinerary, which is an anonymized version of the dataset used in (Mottini and Acuna-Agost 2017)...
Dataset Splits | Yes | Each dataset is split in train, validation and test sets according to a 60%-20%-20% scheme. (A minimal split sketch follows the table.)
Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments, such as GPU or CPU models.
Software Dependencies | No | The paper mentions the use of the Adam algorithm and an open-source implementation of LambdaMART, but does not provide specific version numbers for any software dependencies or libraries used in the experiments.
Experiment Setup | Yes | More specifically, the dimensions of the nonlinear transformations (1) in the Input Encoder were fixed as 100, while MLPs used in (3) and (9) consist of 2 hidden layers with either 256 or 128 ELUs. The models were trained with the Adam algorithm (Kingma and Ba 2014) with a learning rate of 0.001, batch size of 80. Training generally converged after less than 100 passes through the entire training dataset. (A hedged configuration sketch follows the table.)
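
The 60%-20%-20% split reported in the Dataset Splits row is easy to mirror when re-running the experiments. Below is a minimal sketch, assuming LETOR-style data grouped by query id; the grouping key, random seed, and per-query granularity are our own assumptions and are not stated in the paper.

```python
import numpy as np

def split_queries(qids, seed=0):
    """Partition unique query ids into 60% train / 20% validation / 20% test.

    Assumes splitting at the query level of a LETOR-style dataset; the seed
    and the per-query granularity are assumptions, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    unique_qids = np.unique(np.asarray(qids))
    rng.shuffle(unique_qids)
    n = len(unique_qids)
    train_end, valid_end = int(0.6 * n), int(0.8 * n)
    return (
        set(unique_qids[:train_end]),           # 60% train
        set(unique_qids[train_end:valid_end]),  # 20% validation
        set(unique_qids[valid_end:]),           # 20% test
    )
```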
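
The Experiment Setup row pins down several hyperparameters. The PyTorch sketch below assembles only those pieces: a 100-dimensional nonlinear input transformation, an MLP with 2 hidden layers of ELU units (the excerpt says "either 256 or 128", so the exact per-layer widths are an assumption), and Adam with a learning rate of 0.001 and a batch size of 80. The raw feature dimension, module names, output head, and the overall QILCM wiring are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

FEATURE_DIM = 220   # assumption: raw feature size is not stated in the excerpt
ENCODER_DIM = 100   # "dimensions of the nonlinear transformations ... fixed as 100"

class ScoringMLP(nn.Module):
    """Sketch of the MLPs described in the setup: 2 hidden layers of ELU units."""
    def __init__(self, in_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ELU(),   # assumed widths: 256 then 128
            nn.Linear(256, 128), nn.ELU(),
            nn.Linear(128, 1),                  # assumption: one score per item
        )

    def forward(self, x):
        return self.net(x)

# Input encoder: a 100-dimensional nonlinear transformation of the raw features.
input_encoder = nn.Sequential(nn.Linear(FEATURE_DIM, ENCODER_DIM), nn.ELU())
scorer = ScoringMLP(ENCODER_DIM)

# Training configuration quoted in the table: Adam, lr = 0.001, batch size 80.
optimizer = torch.optim.Adam(
    list(input_encoder.parameters()) + list(scorer.parameters()), lr=1e-3
)
BATCH_SIZE = 80
```

Per the quoted setup, training with this optimizer configuration generally converged in fewer than 100 passes over the training data.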