Modeling with Homophily Driven Heterogeneous Data in Gossip Learning

Authors: Abhirup Ghosh, Cecilia Mascolo

IJCAI 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our model on real and synthetic datasets that we generate using a novel generative model for communication networks with heterogeneous data. Our exhaustive empirical evaluation verifies that our proposed method attains a faster convergence rate than the baselines.
Researcher Affiliation | Academia | Abhirup Ghosh, Cecilia Mascolo, University of Cambridge, UK {ag2187, cm542}@cam.ac.uk
Pseudocode | Yes | Algorithm 1: Gossip Learning and Algorithm 2: Softmax-distribution-based weighting at node i with validation set Qi and model Mi. (A hedged sketch of the weighting step appears below the table.)
Open Source Code | Yes | Technical supplementary and code: https://t.ly/xSnS
Open Datasets | Yes | We distribute the MNIST dataset on the different communication networks in Table 1 (except the bird image contributor contacts). We use the network in Figure 1(b) to test other datasets: Fashion MNIST [Xiao et al., 2017], CIFAR-10, and UCI Human Activity Recognition (UCIHAR) [Anguita et al., 2013], plus a dataset built from iNaturalist. (Footnote markers in the original link to the respective dataset sources.)
Dataset Splits | Yes | Algorithm 2 summarizes the proposed method. Each node i sets aside a set of validation examples, Qi, from its local data. The validation set contains a random 10% sample of the local data at each node (Figure 2 caption).
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., CPU or GPU models) used to run the experiments.
Software Dependencies | No | The paper does not provide version numbers for software dependencies or libraries (e.g., PyTorch, TensorFlow, Python, or CUDA versions) beyond mentioning the use of EfficientNet-B0 and a cross-entropy loss.
Experiment Setup | Yes | We initialize Wii to 0.5 and reduce it every epoch using the function Wii = 0.5^(log(t)^β), where t denotes the cumulative number of local epochs so far. The hyperparameter β controls the rate of decrease: if β = 0, Wii remains fixed at 0.5, and a larger β reduces Wii at a faster rate. The hyperparameter α is an amplification factor. The relative weighting of the two loss terms is controlled by λt, which changes as λt = 1 / (1 + exp(−t/T)); λt → 1/2 as T → ∞, and for T → 0 we get λt → 1. The hyperparameter T controls the growth rate of λt. The weights (Wij) are cached and recomputed every P rounds. (Specific values for α, β, T, and P appear in figure captions, e.g. "α = 4 and β = 4" in Figure 2.) A hedged sketch of these schedules follows the table.
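The table only names Algorithm 2; as a rough illustration, here is a minimal, runnable Python sketch of softmax-distribution-based peer weighting and a self-weighted merge, assuming a "model" is a plain parameter vector and that a lower validation loss on Qi should yield a higher mixing weight. The function names (softmax_weights, merge) and the toy data are ours, not the paper's, so details may differ from the authors' Algorithm 2.

```python
import math
import random

random.seed(0)

def softmax_weights(losses, alpha=4.0):
    """Softmax over negative validation losses.

    alpha is the amplification factor from the paper's setup row: a larger
    alpha sharpens the contrast between well- and poorly-fitting peer models.
    (Hypothetical helper; the paper's exact formulation may differ.)
    """
    scores = [math.exp(-alpha * l) for l in losses]
    z = sum(scores)
    return [s / z for s in scores]

def merge(local_model, peer_models, peer_weights, w_self=0.5):
    """Weighted average: w_self on the local model (Wii), and (1 - w_self)
    split across peers according to their softmax weights (Wij)."""
    merged = [w_self * p for p in local_model]
    for model, w in zip(peer_models, peer_weights):
        for k, p in enumerate(model):
            merged[k] += (1.0 - w_self) * w * p
    return merged

# Hold out a random 10% of local samples as the validation set Qi
# (the split rule quoted in the Dataset Splits row).
data = list(range(100))
random.shuffle(data)
q_i, train_i = data[:10], data[10:]

# Peer models received via gossip, with their losses measured on Qi;
# the low-loss peer dominates the merge.
peers = [[1.0, 2.0], [0.2, -0.4]]
losses = [0.3, 1.2]
w = softmax_weights(losses, alpha=4.0)
print(merge([0.0, 0.0], peers, w))
```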
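Because the schedule formulas in the setup row were garbled in extraction, the sketch below encodes the reconstructed readings. The exponent placement in Wii is inferred from the stated limiting behaviour (β = 0 keeps Wii at 0.5; larger β decays it faster), not copied from the paper, so treat it as an assumption.

```python
import math

def self_weight(t, beta):
    """Reconstructed self-weight decay: Wii = 0.5 ** (log(t) ** beta).

    t is the cumulative number of local epochs so far (t >= 1).
    beta = 0 keeps Wii fixed at 0.5; a larger beta shrinks Wii faster.
    """
    return 0.5 ** (math.log(t) ** beta)

def loss_mixing(t, T):
    """Reconstructed loss-term weighting: lambda_t = 1 / (1 + exp(-t / T)).

    Matches the quoted limits: lambda_t -> 1/2 as T -> infinity, and
    lambda_t -> 1 as T -> 0 (for t > 0). T controls the growth rate.
    """
    return 1.0 / (1.0 + math.exp(-t / T))

# Illustrative values with the figure-caption settings alpha = beta = 4.
for t in (1, 5, 20, 100):
    print(t, round(self_weight(t, beta=4), 4), round(loss_mixing(t, T=10), 4))
```

In a full run, recomputing the Wij weights only every P rounds (as the setup row describes) would amortize the validation-loss evaluations across gossip rounds.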