Learning Influence Adoption in Heterogeneous Networks

Authors: Vincent Conitzer, Debmalya Panigrahi, Hanrui Zhang (pp. 6411-6419)

AAAI 2022

Reproducibility Variable Result LLM Response
Research Type Experimental From Section 6 (Experimental Evaluation): Our results are shown in Figure 2. First note that the hypothesis classes we choose satisfy d(Hg) = 1 and d(Hs) = k. So applying Corollary 16, up to the resolution ε0, given m labeled samples, the error rate of our learner is O((log n + k)/m). This roughly aligns with our experimental observations: as m grows, the error rate drops roughly as the bound indicates, and tends to plateau as it approaches the stability parameter ε0 of the network, which appears to be quite small in this setup. For all 4 values of n the curves are very similar, which corresponds to the logarithmic dependence on n. The dependence on k is more significant, as suggested by the theoretical results. For comparison, we also include the error rates of the feature-oblivious baseline algorithm given in (Conitzer, Panigrahi, and Zhang 2020), which ignores the sets of weaknesses and simply assumes all nodes are susceptible (there are two curves because the networks are generated differently for different values of k). As the figure shows, the baseline algorithm fails to exploit the heterogeneity of the network, and suffers significantly higher error rates.
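The O((log n + k)/m) bound with a plateau at the stability parameter ε0 can be sketched numerically. This is an illustrative assumption only: the constant factor C and the value of eps0 below are made up for the sketch, not taken from the paper.

```python
import math

def error_bound(m, n, k, C=1.0, eps0=0.01):
    """Illustrative shape of the learner's error-rate bound:
    O((log n + k)/m), floored at the network's stability parameter eps0.
    C and eps0 are hypothetical values, not from the paper."""
    return max(C * (math.log(n) + k) / m, eps0)

# As m grows, the bound shrinks roughly like 1/m, then plateaus at eps0;
# the dependence on n is only logarithmic, while k enters additively.
rates = [error_bound(m, n=1000, k=5) for m in (100, 200, 400, 10**6)]
```

Doubling the sample size roughly halves the bound until it hits the eps0 floor, matching the plateau behavior described in the quoted passage.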
Researcher Affiliation Academia Vincent Conitzer (1), Debmalya Panigrahi (1), Hanrui Zhang (2); (1) Duke University, (2) Carnegie Mellon University; conitzer@cs.duke.edu, debmalya@cs.duke.edu, hanruiz1@cs.cmu.edu
Pseudocode No The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code No The paper does not provide a concrete statement or link for open-source code for the methodology described.
Open Datasets No The paper describes how the data instances were generated for the experiments ('Fixing the number of nodes n, each edge (u, v) realizes independently with probability puv...'), but does not provide concrete access information (link, DOI, formal citation) for a publicly available or open dataset.
Dataset Splits No The paper describes using 'labeled samples' and varying 'sample size m' but does not provide specific dataset split information (exact percentages, sample counts, or detailed splitting methodology) for training, validation, or testing.
Hardware Specification No The paper only states that 'Each point is the average of 40 independent runs, and each run takes less than a minute on a laptop,' which does not constitute a specific hardware specification.
Software Dependencies No The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment.
Experiment Setup Yes For the generator and susceptibility hypothesis classes, we consider the setting discussed in our introductory example (Figure 1): there are k potential weaknesses, and each susceptibility hypothesis (corresponding to a virus) is induced by a subset of the k weaknesses, corresponding to those the virus targets. Recall that a node is susceptible as long as it has some weakness that the virus targets. For simplicity, we assume the generator hypothesis class is singletons of nodes, meaning that exactly one (unknown) computer has been initially exposed to the virus. In our experiments, we choose the ground truth generator node uniformly at random, and the ground truth susceptibility hypothesis by adding each weakness to the set targeted independently with probability 3/k (so in expectation the virus targets 3 weaknesses). Then we generate the set of weaknesses of each node by adding each weakness with probability 0.2. All these numbers are chosen specifically to make the instance rich and the learning problem nontrivial.