On Estimating Recommendation Evaluation Metrics under Sampling
Authors: Ruoming Jin, Dong Li, Benjamin Mudrak, Jing Gao, Zhi Liu (pp. 4147-4154)
AAAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we report the experimental evaluation on estimating the top-K metrics based on sampling, as well as the learning of empirical rank distribution P(R). Specifically, we aim to answer the following questions: (Question 1) How do the new estimators based on the learned empirical distribution perform against the CLS and BV approach proposed in (Krichene and Rendle 2020) on estimating the top-K metrics based on sampling? ... We use four of the most commonly used datasets for recommendation studies in our study... |
| Researcher Affiliation | Collaboration | Ruoming Jin¹, Dong Li¹, Benjamin Mudrak¹, Jing Gao², Zhi Liu²; ¹Kent State University, ²iLambda; {rjin1,dli12,bmudrak1}@kent.edu, {jgao,zliu}@ilambda.com |
| Pseudocode | No | The paper describes methods using mathematical equations and textual explanations but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain an explicit statement about the release of its source code for the methodology, nor does it provide a link to a code repository. |
| Open Datasets | Yes | We use four of the most commonly used datasets for recommendation studies in our study... (ml-1m dataset (Harper and Konstan 2015))... Table 3: Dataset: citeulike with sample size =99. |
| Dataset Splits | No | The paper mentions 'testing dataset' and 'sampling ranked results', and refers to sample sizes related to the evaluation process (e.g., 'sample size =99'), but it does not provide explicit details about train, validation, and test dataset splits (e.g., percentages or counts). |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., CPU/GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., programming languages, libraries, or frameworks with their versions) that would be needed to replicate the experiments. |
| Experiment Setup | Yes | The estimators include CLS, BV (with the tradeoff parameters γ = 0.1 and γ = 0.01), MLE (Maximal Likelihood Estimation), WMLE (Weighted Maximal Likelihood Estimation where the weighted function is MNDCG with C = 10), MES (Maximal Entropy with Squared distribution distance, where η = 0.001). |
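
The table refers to top-K metrics estimated "based on sampling" with a sample size of 99. As a rough illustration of why such estimators are needed, the sketch below simulates sampled evaluation and compares Recall@K computed on sampled ranks with Recall@K computed on global ranks. It is not the paper's code: the function names, the synthetic uniform rank distribution, the catalogue size of 10,000, and the choice to draw negatives uniformly with replacement (so the sampled rank is one plus a binomial count) are all assumptions made for this example.

```python
import random


def sampled_rank(global_rank, num_items, num_negatives, rng):
    """Rank of the target item among itself plus sampled negatives.

    Assumption: negatives are drawn uniformly with replacement from the
    other num_items - 1 items, so each one outranks the target with
    probability (global_rank - 1) / (num_items - 1).
    """
    p_better = (global_rank - 1) / (num_items - 1)
    better = sum(1 for _ in range(num_negatives) if rng.random() < p_better)
    return 1 + better


def recall_at_k(ranks, k):
    """Fraction of test cases whose target item is ranked within the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


def demo(seed=0, num_items=10_000, num_users=2_000, num_negatives=99, k=10):
    """Return (global Recall@k, sampled Recall@k) on synthetic ranks."""
    rng = random.Random(seed)
    # Hypothetical global ranks; a real study would take them from a model.
    global_ranks = [rng.randint(1, num_items) for _ in range(num_users)]
    sampled_ranks = [
        sampled_rank(r, num_items, num_negatives, rng) for r in global_ranks
    ]
    return recall_at_k(global_ranks, k), recall_at_k(sampled_ranks, k)


if __name__ == "__main__":
    global_recall, sampled_recall = demo()
    print(f"global Recall@10:  {global_recall:.4f}")
    print(f"sampled Recall@10: {sampled_recall:.4f}")
```

Under these assumptions the sampled metric comes out far larger than the global one at the same K, because a rank among 100 candidates is much easier to place in the top 10 than a rank among 10,000. That gap is what estimators such as those listed in the Experiment Setup row are meant to correct.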