Sequential Information Design: Learning to Persuade in the Dark

Authors: Martino Bernasconi, Matteo Castiglioni, Alberto Marchesi, Nicola Gatti, Francesco Trovò

NeurIPS 2022

Reproducibility Variable Result LLM Response
Research Type Theoretical We study a repeated information design problem faced by an informed sender who tries to influence the behavior of a self-interested receiver. We consider settings where the receiver faces a sequential decision making (SDM) problem. At each round, the sender observes the realizations of random events in the SDM problem. This begets the challenge of how to incrementally disclose such information to the receiver to persuade them to follow (desirable) action recommendations. We study the case in which the sender does not know the random events' probabilities and, thus, has to gradually learn them while persuading the receiver. We start by providing a non-trivial polytopal approximation of the set of the sender's persuasive information structures. This is crucial to designing efficient learning algorithms. Next, we prove a negative result: no learning algorithm can be persuasive. Thus, we relax the persuasiveness requirement by focusing on algorithms that guarantee that the receiver's regret in following recommendations grows sub-linearly. In the full-feedback setting, where the sender observes the realizations of all random events, we provide an algorithm with O(√T) regret for both the sender and the receiver. Instead, in the bandit-feedback setting, where the sender only observes the realizations of random events actually occurring in the SDM problem, we design an algorithm that, given an α ∈ [1/2, 1] as input, ensures O(T^α) and O(T^{max{α, 1−α/2}}) regrets for the sender and the receiver, respectively. This result is complemented by a lower bound showing that such a regret trade-off is essentially tight.
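The bandit-feedback result describes a trade-off curve between the two regret rates, parameterized by α. A minimal sketch of that trade-off, assuming the exponent forms T^α (sender) and T^{max{α, 1−α/2}} (receiver) as stated in the abstract:

```python
# Sketch of the sender/receiver regret exponents from the paper's
# bandit-feedback result, as a function of the input parameter alpha.
# The exponent forms are taken from the abstract; this is only an
# illustration of the trade-off, not the paper's algorithm.

def regret_exponents(alpha: float) -> tuple[float, float]:
    """Return (sender, receiver) regret exponents for a given alpha.

    Sender regret grows as O(T^alpha); receiver regret grows as
    O(T^{max(alpha, 1 - alpha/2)}).
    """
    if not 0.5 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [1/2, 1]")
    sender = alpha
    receiver = max(alpha, 1.0 - alpha / 2.0)
    return sender, receiver

if __name__ == "__main__":
    for a in (0.5, 2 / 3, 0.75, 1.0):
        s, r = regret_exponents(a)
        print(f"alpha={a:.3f}: sender ~ T^{s:.3f}, receiver ~ T^{r:.3f}")
```

Note that the two exponents coincide at α = 2/3 (where α = 1 − α/2), the point at which neither party's rate can be lowered without raising the other's, consistent with the lower bound showing the trade-off is essentially tight.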
Researcher Affiliation Academia Martino Bernasconi (Politecnico di Milano); Matteo Castiglioni (Politecnico di Milano); Alberto Marchesi (Politecnico di Milano); Nicola Gatti (Politecnico di Milano); Francesco Trovò (Politecnico di Milano). Email: {martino.bernasconideluca, matteo.castiglioni, alberto.marchesi, nicola.gatti, francesco1.trovo}@polimi.it
Pseudocode Yes Algorithm 1: Full-feedback algorithm; Algorithm 2: Bandit-feedback algorithm
Open Source Code No The paper does not provide concrete access to source code for the methodology described. There is no mention of a repository link, an explicit code release statement, or code being available in supplementary materials. The ethics statement explicitly marks 'N/A' for inclusion of code (3a).
Open Datasets No The paper is theoretical and does not conduct experiments with datasets, thus it does not mention or provide access information for a publicly available or open dataset for training. The ethics statement marks 'N/A' for experimental sections.
Dataset Splits No The paper is theoretical and does not involve empirical evaluation using datasets, so it does not provide training/test/validation dataset splits. The ethics statement marks 'N/A' for experimental sections.
Hardware Specification No The paper is theoretical and does not conduct experiments, therefore it does not describe the hardware used. The ethics statement explicitly marks 'N/A' for compute resources (3d).
Software Dependencies No The paper is theoretical and does not describe experiments that would require specific ancillary software dependencies with version numbers. The ethics statement marks 'N/A' for experimental sections.
Experiment Setup No The paper is theoretical and does not describe an experimental setup with hyperparameters or system-level training settings. The ethics statement explicitly marks 'N/A' for training details (3b).