Competitive-Cooperative Multi-Agent Reinforcement Learning for Auction-based Federated Learning

Authors: Xiaoli Tang, Han Yu

IJCAI 2023

Reproducibility Variable | Result | LLM Response

Research Type | Experimental
"Extensive experiments on six commonly adopted benchmark datasets show that MARL-AFL is significantly more advantageous compared to six state-of-the-art approaches, outperforming the best by 12.2%, 1.9% and 3.4% in terms of social welfare, revenue and accuracy, respectively."

Researcher Affiliation | Academia
"Xiaoli Tang and Han Yu, School of Computer Science and Engineering, Nanyang Technological University, Singapore. {xiaoli001, han.yu}@ntu.edu.sg"

Pseudocode | Yes
"Algorithm 1: Learning Θi in Eq. (3); Algorithm 2: MARL-AFL"

Open Source Code | No
The paper makes no explicit statement about source-code availability and provides no link to a code repository for the described methodology.

Open Datasets | Yes
"To evaluate the performance of MARL-AFL, we conduct experiments based on six commonly used datasets in FL studies, including MNIST (http://yann.lecun.com/exdb/mnist/), CIFAR-10 (https://www.cs.toronto.edu/~kriz/cifar.html), Fashion-MNIST (i.e., FMNIST) [Xiao et al., 2017], EMNIST-digits (i.e., EMNIST-D), EMNIST-letters (i.e., EMNIST-L) [Cohen et al., 2017] and Kuzushiji-MNIST (i.e., KMNIST) [Clanuwat et al., 2018]."

Dataset Splits | Yes
"Both the test set and the validation set for each data consumer include 2,000 samples."
(see the data-loading sketch below)

Hardware Specification | No
The paper does not describe the hardware used for its experiments, such as specific GPU or CPU models. It mentions training FL models and using a VGG11 network, but gives no hardware specifics.

Software Dependencies | No
The paper describes the neural network architectures, the optimizer (RMSprop), the learning rate, the discount factor, and other hyperparameters, but it does not specify software versions for the programming language, libraries (e.g., PyTorch, TensorFlow), or other key components.

Experiment Setup | Yes
"The proposed method utilizes fully connected neural networks with three hidden layers, each containing 64 nodes, to generate bid prices for data owners on behalf of their respective data consumers. The action-value functions Qi and Q̄i are trained using a replay buffer D with a size of 5,000. During training, the agents explore the environment using an ε-greedy policy with an annealing rate from 1.0 to 0.05. To update Qi, 32 episodes uniformly sampled from D are used for each training step, and Q̄i is updated twice after each episode to speed up convergence. The target networks of Qi and Q̄i are updated once every 20 training episodes. We use RMSprop with a learning rate of 0.0005 to train all neural networks, and set the discount factor γ to 0.99 and the temperature hyperparameter τ to 4."
(see the configuration sketch below)

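All six datasets quoted in the Open Datasets row have torchvision counterparts, so the evaluation-data pipeline can be sketched concretely. The snippet below is a hypothetical reconstruction, not the authors' code (none is released): the root directory, the fixed seed, and the consumer_eval_split helper are assumptions; only the dataset choices and the 2,000-sample validation and test sizes come from the rows above.

```python
# Hypothetical reconstruction of the evaluation-data pipeline; everything
# except the dataset names and the 2,000-sample split sizes is an assumption.
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

to_tensor = transforms.ToTensor()
root = "./data"  # assumed download directory

# The six benchmark datasets named in the Open Datasets row.
test_pools = {
    "MNIST":    datasets.MNIST(root, train=False, download=True, transform=to_tensor),
    "CIFAR-10": datasets.CIFAR10(root, train=False, download=True, transform=to_tensor),
    "FMNIST":   datasets.FashionMNIST(root, train=False, download=True, transform=to_tensor),
    "EMNIST-D": datasets.EMNIST(root, split="digits", train=False, download=True, transform=to_tensor),
    "EMNIST-L": datasets.EMNIST(root, split="letters", train=False, download=True, transform=to_tensor),
    "KMNIST":   datasets.KMNIST(root, train=False, download=True, transform=to_tensor),
}

def consumer_eval_split(pool, n=2000, seed=0):
    """Carve a 2,000-sample validation set and a 2,000-sample test set
    for one data consumer out of a held-out pool (seed is an assumption)."""
    gen = torch.Generator().manual_seed(seed)
    val, test, _ = random_split(pool, [n, n, len(pool) - 2 * n], generator=gen)
    return val, test

val_set, test_set = consumer_eval_split(test_pools["MNIST"])
print(len(val_set), len(test_set))  # -> 2000 2000
```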
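The Experiment Setup row likewise pins down enough numbers for a minimal PyTorch sketch of one bidding agent's networks and training hyperparameters. BidNet, the state and action dimensions, and the annealing horizon are assumptions rather than the authors' implementation; only the quoted values (three 64-node hidden layers, buffer size 5,000, ε annealed from 1.0 to 0.05, 32 sampled episodes per step, target updates every 20 episodes, RMSprop at 0.0005, γ = 0.99, τ = 4) are grounded in the paper.

```python
# Minimal sketch of one bidding agent's setup using the quoted
# hyperparameters; names and state/action sizes are assumptions.
import random
from collections import deque
import torch
import torch.nn as nn

class BidNet(nn.Module):
    """Fully connected network with three 64-node hidden layers."""
    def __init__(self, state_dim: int, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, n_actions),  # one output per candidate bid price
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

STATE_DIM, N_ACTIONS = 8, 10              # placeholder dimensions
q_net = BidNet(STATE_DIM, N_ACTIONS)
target_net = BidNet(STATE_DIM, N_ACTIONS)
target_net.load_state_dict(q_net.state_dict())

replay_buffer = deque(maxlen=5000)        # replay buffer D, size 5,000
optimizer = torch.optim.RMSprop(q_net.parameters(), lr=5e-4)
GAMMA = 0.99                              # discount factor γ
TAU = 4                                   # temperature τ (its role in the loss is not quoted)
BATCH_EPISODES = 32                       # episodes sampled uniformly from D per training step
TARGET_SYNC_EVERY = 20                    # target-network update period, in episodes

def epsilon(step: int, start=1.0, end=0.05, anneal_steps=10_000) -> float:
    """ε-greedy schedule annealed from 1.0 to 0.05 (the horizon is an assumption)."""
    frac = min(step / anneal_steps, 1.0)
    return start + frac * (end - start)

def act(state: torch.Tensor, step: int) -> int:
    """ε-greedy action selection over candidate bid prices."""
    if random.random() < epsilon(step):
        return random.randrange(N_ACTIONS)
    with torch.no_grad():
        return int(q_net(state).argmax())
```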