Addressing the Long-term Impact of ML Decisions via Policy Regret
Authors: David Lindner, Hoda Heidari, Andreas Krause
IJCAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically compare our algorithm with several baselines and find that it consistently outperforms them, in particular for long time horizons. In this section, we empirically investigate the effectiveness of our noise-handling approach on several datasets. |
| Researcher Affiliation | Academia | ETH Zurich; Carnegie Mellon University |
| Pseudocode | Yes | Algorithm 1 The Single-Peaked Optimism (SPO) algorithm. |
| Open Source Code | Yes | Code to reproduce all of our experiments can be found at https://github.com/david-lindner/single-peaked-bandits. |
| Open Datasets | Yes | Motivated by our initial example of a budget planner in Section 1, we simulate a credit lending scenario based on the FICO credit scoring dataset from 2003 [Reserve, 2007]. |
| Dataset Splits | No | The paper uses datasets like FICO and synthetic data but does not explicitly provide details about training, validation, or test splits (e.g., percentages, sample counts, or specific splitting methodology) for their experiments. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory, or processor types) used for running the experiments were mentioned in the paper. |
| Software Dependencies | No | No specific software dependencies with version numbers (e.g., library names like 'Python 3.8' or 'PyTorch 1.9') are explicitly mentioned in the paper. |
| Experiment Setup | Yes | We consider three datasets: (1) a set of synthetic reward functions, (2) a simulation of a user interacting with a recommender system, and (3) a dataset constructed from the FICO credit scoring data. ... We assume the user's inherent preferences stay constant, but the novelty factor decays when showing an item more often. ... The reward is f_i(0) = 0 for never showing an item, and subsequent rewards are defined as f_i(t) = f_i(t-1) + n·γ^t − c·(f_i(t) − v). |
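
The experiment-setup excerpt describes the recommender-system simulation through the reward recursion quoted above. The following is a minimal sketch of that recursion; the parameter values (novelty weight `n`, decay `gamma`, convergence rate `c`, inherent value `v`) and the function name are illustrative assumptions, not the paper's settings.

```python
# Sketch of the quoted recursion: f_i(0) = 0 and
# f_i(t) = f_i(t-1) + n*gamma^t - c*(f_i(t) - v), solved for f_i(t).
# Parameter defaults are hypothetical placeholders, not taken from the paper.

def item_rewards(horizon: int, n: float = 1.0, gamma: float = 0.9,
                 c: float = 0.1, v: float = 0.5) -> list[float]:
    """Reward f_i(t) after showing an item t times (assumed parameters)."""
    f = [0.0]  # f_i(0) = 0: the item has never been shown
    for t in range(1, horizon + 1):
        # Rearranging the recursion: f(t) * (1 + c) = f(t-1) + n*gamma^t + c*v
        f.append((f[-1] + n * gamma ** t + c * v) / (1.0 + c))
    return f

if __name__ == "__main__":
    rewards = item_rewards(horizon=30)
    # The sequence rises while the novelty term n*gamma^t dominates, then
    # decays toward the inherent value v, giving a single-peaked reward curve.
    print([round(r, 3) for r in rewards])
```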