Propensity Matters: Measuring and Enhancing Balancing for Recommendation

Authors: Haoxuan Li, Yanghao Xiao, Chunyuan Zheng, Peng Wu, Peng Cui

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments are conducted on three real-world datasets including a large industrial dataset, and the results show that our approach boosts the balancing property and results in enhanced debiasing performance.
Researcher Affiliation | Academia | 1 Peking University, 2 University of Chinese Academy of Sciences, 3 University of California, San Diego, 4 Beijing Technology and Business University, 5 Tsinghua University.
Pseudocode | No | The paper does not contain pseudocode or clearly labeled algorithm blocks.
Open Source Code | No | The paper does not provide an explicit statement or link for open-source code for the described methodology.
Open Datasets | Yes | Following the previous studies (Saito, 2020; Wang et al., 2019a; 2021; Chen et al., 2021), we conduct extensive experiments on two real-world datasets, COAT and YAHOO! R3, and a public large-scale industrial dataset, PRODUCT (Gao et al., 2022). Specifically, COAT has 6,960 biased ratings and 4,640 unbiased ratings from 290 users to 300 items. YAHOO! R3 has 311,704 biased ratings and 54,000 unbiased ratings from 15,400 users to 1,000 items. Both datasets are five-scale, and we binarize the ratings greater than three as 1, otherwise as 0. PRODUCT is collected from a short video sharing platform, and it is an almost fully exposed industrial dataset. There are 4,676,570 outcomes from 1,411 users on 3,327 items with a density of 99.6%. The video watching ratios greater than two are denoted as 1, otherwise as 0. Dataset links: COAT: https://www.cs.cornell.edu/~schnabts/mnar/; YAHOO! R3: http://webscope.sandbox.yahoo.com/; PRODUCT (KuaiRec): https://github.com/chongminggao/KuaiRec. A minimal binarization sketch is given after the table.
Dataset Splits | No | The paper mentions training and evaluation using 'biased' and 'unbiased' ratings but does not provide specific details on how the dataset was split into training, validation, and test sets (e.g., percentages or counts), nor does it reference standard splits.
Hardware Specification | No | The paper does not provide any specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies | No | All the methods are implemented on PyTorch with Adam as the optimizer. This names the software stack but gives no specific version numbers for PyTorch or other libraries.
Experiment Setup | Yes | All the methods are implemented on PyTorch with Adam as the optimizer. We tune the learning rate in {0.0005, 0.001, 0.005, 0.01}, weight decay in {0, 1e-6, 1e-5, ..., 1e-1}, and the regularization hyperparameter λ in {0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1}. A sketch of this search grid is given after the table.
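
The binarization rules quoted in the Open Datasets row (ratings greater than three for COAT and YAHOO! R3, video watching ratios greater than two for PRODUCT) can be written as small preprocessing helpers. The following is a minimal sketch, assuming ratings and watch ratios arrive as plain numeric arrays; the function names and threshold arguments are illustrative and not taken from the paper.

```python
import numpy as np

def binarize_ratings(ratings, threshold=3):
    """Binarize five-scale explicit feedback: ratings strictly greater than
    the threshold become 1, everything else becomes 0 (COAT / YAHOO! R3 rule)."""
    ratings = np.asarray(ratings, dtype=float)
    return (ratings > threshold).astype(np.int64)

def binarize_watch_ratio(watch_ratios, threshold=2):
    """Binarize PRODUCT (KuaiRec) video watching ratios: ratios strictly
    greater than the threshold are treated as positive outcomes."""
    watch_ratios = np.asarray(watch_ratios, dtype=float)
    return (watch_ratios > threshold).astype(np.int64)

# Example usage on a five-scale rating vector and a watch-ratio vector.
print(binarize_ratings([1, 3, 4, 5]))         # -> [0 0 1 1]
print(binarize_watch_ratio([0.5, 2.0, 2.7]))  # -> [0 0 1]
```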
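The grids in the Experiment Setup row translate directly into a search space. Below is a minimal sketch, assuming a generic PyTorch `model` object and omitting the training and validation loop, which the paper does not describe in detail; the helper name `build_optimizer` and the expansion of the paper's "1e-6, 1e-5, ..., 1e-1" weight-decay list into consecutive powers of ten are assumptions.

```python
import itertools
import torch

# Hyperparameter grids as reported in the paper's experiment setup.
LEARNING_RATES = [0.0005, 0.001, 0.005, 0.01]
# The paper lists {0, 1e-6, 1e-5, ..., 1e-1}; read here as powers of ten.
WEIGHT_DECAYS = [0.0] + [10 ** -k for k in range(6, 0, -1)]
LAMBDAS = [0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1.0]

def build_optimizer(model, lr, weight_decay):
    # Adam is the optimizer stated in the paper; the model itself is a placeholder.
    return torch.optim.Adam(model.parameters(), lr=lr, weight_decay=weight_decay)

# Enumerate the full grid; model construction, training, and validation are
# not specified in the paper and are left out of this sketch.
for lr, wd, lam in itertools.product(LEARNING_RATES, WEIGHT_DECAYS, LAMBDAS):
    # optimizer = build_optimizer(model, lr, wd)  # 'model' is a placeholder
    ...
```

Grid search over these three lists yields 4 x 7 x 7 = 196 configurations, which matches the scale of tuning the reported ranges imply.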