Performative Prediction with Bandit Feedback: Learning through Reparameterization

Authors: Yatong Chen, Wei Tang, Chien-Ju Ho, Yang Liu

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We provide empirical results using a toy example to demonstrate the efficiency of our method. In particular, we compare our proposed method (which minimizes PR as a function of the distribution parameter ϕ after reparameterization) with the baseline method (which directly minimizes PR as a function of the model parameter θ). We observe that both methods converge under different settings, but our proposed method (shown in orange) is more efficient: it exhibits a much faster convergence rate on average over multiple runs, indicating that our reparameterization is effective when the performative risk is nonconvex in θ but convex in ϕ (as per Assumption 1). The plot can be found in Appendix H. The details for reproducing our experimental results can be found at https://github.com/UCSC-REAL/PP-bandit-feedback. (An illustrative sketch of this θ-space versus ϕ-space comparison appears after this table.)
Researcher Affiliation | Academia | 1 Department of Computer Science and Engineering, University of California, Santa Cruz, California, United States; 2 Data Science Institute, Columbia University; 3 Department of Decisions, Operations, and Technology, The Chinese University of Hong Kong; 4 Department of Computer Science and Engineering, Washington University in St. Louis.
Pseudocode | Yes | Algorithm 1: Bandit algorithm for minimizing an indirectly convex function with noisy oracles. Algorithm 2: Learn a model that approximately induces a given distribution parameter ϕ.
Open Source Code | Yes | The details for reproducing our experimental results can be found at https://github.com/UCSC-REAL/PP-bandit-feedback.
Open Datasets | No | The paper states it uses a 'toy example' for empirical evaluation and references Appendix H for plots, but it does not specify a publicly available dataset by name, provide a link, or cite a dataset for public access.
Dataset Splits | No | The paper mentions empirical results using a 'toy example' but does not specify any training, validation, or test dataset splits, nor does it refer to standard predefined splits.
Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments, such as GPU/CPU models, memory, or cloud resources.
Software Dependencies | No | The paper does not list any specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions, or other libraries).
Experiment Setup | No | The paper describes the algorithms and their theoretical properties but does not provide specific experimental setup details, such as hyperparameter values, optimization settings, or training schedules for the empirical evaluation on the 'toy example'.
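
Below is a minimal, self-contained Python sketch of the comparison referenced in the Research Type row: the same two-point zeroth-order ("bandit feedback") descent loop is run once directly over the model parameter θ and once over the distribution parameter ϕ. Everything in it is an illustrative assumption rather than the paper's Appendix H example or repository code: the map g(θ) = θ³, the quadratic risk, the noise level, the step size, and the deployment budget are made up for illustration, and inverting g merely stands in for the role of Algorithm 2 (recovering a model that induces a chosen ϕ).

```python
"""
Illustrative sketch only -- not the paper's Appendix H experiment or repository
code.  It contrasts zeroth-order ("bandit feedback") minimization of a toy
performative risk in theta-space versus phi-space.
"""
import numpy as np

rng = np.random.default_rng(0)

# Assumed reparameterization: deploying model parameter theta induces the
# distribution parameter phi = g(theta).  g is strictly increasing, hence
# invertible; inverting it stands in for recovering a model that induces a
# chosen phi (the role played by Algorithm 2 in the paper).
def g(theta):
    return theta ** 3

def g_inv(phi):
    return np.sign(phi) * np.abs(phi) ** (1.0 / 3.0)

PHI_STAR = 2.0   # minimizer of the risk over phi (assumed)
SIGMA = 0.05     # std of the noisy risk oracle (assumed)

def pr(phi):
    """Toy performative risk as a function of phi: convex (quadratic) by construction."""
    return (phi - PHI_STAR) ** 2

def noisy_pr(phi):
    """Bandit feedback: only a noisy evaluation of the deployed risk is observed."""
    return pr(phi) + SIGMA * rng.standard_normal()

def zeroth_order_descent(objective, x0, lr, steps, delta=0.1):
    """Two-point zeroth-order gradient descent using noisy function values only."""
    x, iterates = x0, []
    for _ in range(steps):
        grad_est = (objective(x + delta) - objective(x - delta)) / (2.0 * delta)
        x = x - lr * grad_est
        iterates.append(x)
    return np.array(iterates)

STEPS, LR, THETA0 = 300, 0.05, 0.1

# Baseline view: minimize PR(g(theta)) directly over theta (nonconvex in theta).
thetas = zeroth_order_descent(lambda th: noisy_pr(g(th)), THETA0, LR, STEPS)
risk_theta_run = pr(g(thetas))

# Reparameterized view: minimize PR(phi) over phi (convex in phi), then map the
# final phi back to a model parameter that induces it.
phis = zeroth_order_descent(noisy_pr, g(THETA0), LR, STEPS)
risk_phi_run = pr(phis)
theta_recovered = g_inv(phis[-1])

print(f"mean risk over last 50 deployments, theta-space run: {risk_theta_run[-50:].mean():.4f}")
print(f"mean risk over last 50 deployments, phi-space run:   {risk_phi_run[-50:].mean():.4f}")
print(f"model parameter recovered from the final phi:        {theta_recovered:.4f}")
```

With these assumed choices the risk is convex in ϕ but nonconvex in θ, and the ϕ-space run reaches a much lower average risk within the same deployment budget, mirroring the qualitative behavior the response above attributes to the paper's toy example.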