Counterfactual Learning with General Data-Generating Policies

Authors: Yusuke Narita, Kyohei Okumura, Akihiro Shimizu, Kohei Yata

AAAI 2023

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We validate our method with experiments on partly and entirely deterministic logging policies." "Simulation Experiments. We validate our method with two simulation experiments." "Real-World Application. We empirically apply our method to evaluate and optimize coupon targeting policies." |
| Researcher Affiliation | Collaboration | Yusuke Narita (Yale University), Kyohei Okumura (Northwestern University), Akihiro Shimizu (Mercari, Inc.), Kohei Yata (University of Wisconsin-Madison) |
| Pseudocode | No | The paper describes its methods using mathematical formulas and prose, but it does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper states "The full version of the paper, which includes technical appendices, can be found at https://arxiv.org/abs/2212.01925" but does not link to open-source code for the methodology or state that code is released. |
| Open Datasets | No | The real-world application is based on proprietary data provided by Mercari, Inc. For the simulations, the paper describes how data was generated ("We generate a random sample...") but does not use or provide access to a recognized public dataset. |
| Dataset Splits | No | The paper describes data-generation processes for the simulations and mentions a "training sample" for internal models, but it does not provide specific training/validation/test splits (exact percentages, sample counts, or citations to predefined splits) for its experiments. |
| Hardware Specification | No | The paper does not report any specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for running its experiments. |
| Software Dependencies | No | The paper mentions software such as sklearn's Random Forest Regressor and pylift, but does not specify their version numbers or the Python version, which would be needed for reproducible software dependencies. |
| Experiment Setup | Yes | "For the counterfactual policy π, we use D to train a model f(x, a) that predicts the reward given the context and action, using sklearn's Random Forest Regressor with 500 trees and otherwise default parameters." |
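The Experiment Setup row quotes the one concretely reproducible detail: a reward model f(x, a) fit with a 500-tree Random Forest. The sketch below illustrates that step under stated assumptions; the logged-data arrays (X, A, R), the context/action encoding, and the greedy derivation of the counterfactual policy pi from f are all hypothetical, since the paper specifies only the regressor and its tree count.

```python
# Minimal sketch of the quoted reward-model step, not the authors' code.
# Assumptions: array names X/A/R, the feature encoding (action appended as a
# column), and the greedy policy construction are illustrative choices.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Hypothetical logged data D: contexts X, discrete actions A, rewards R.
n, d, n_actions = 1000, 5, 3
X = rng.normal(size=(n, d))
A = rng.integers(0, n_actions, size=n)
R = rng.normal(size=n)

# f(x, a): as quoted, "sklearn's Random Forest Regressor with 500 trees
# and otherwise default parameters", trained on (context, action) pairs.
model = RandomForestRegressor(n_estimators=500)
model.fit(np.column_stack([X, A]), R)

def pi(x):
    """Greedy counterfactual policy: pick the action with the highest
    predicted reward f(x, a). One plausible reading; the paper's exact
    construction of pi from f is not reproduced here."""
    candidates = np.column_stack([np.tile(x, (n_actions, 1)),
                                  np.arange(n_actions)])
    return int(np.argmax(model.predict(candidates)))

print(pi(X[0]))  # action chosen for the first logged context
```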