Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Reputation-aware Revenue Allocation for Auction-based Federated Learning
Authors: Xiaoli Tang, Han Yu
AAAI 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through extensive experiments on widely used benchmark datasets, ARAS-AFL demonstrates superior performance compared to state-of-the-art approaches. It outperforms the best baseline by 49.06%, 98.69%, 10.32%, and 4.77% in terms of total revenue, number of data owners, public reputation and accuracy of federated learning models, respectively. |
| Researcher Affiliation | Academia | College of Computing and Data Science, Nanyang Technological University, Singapore EMAIL |
| Pseudocode | No | The paper describes the proposed method and its components through mathematical formulations and descriptive text, but it does not include a distinct, structured pseudocode block or algorithm section. |
| Open Source Code | No | The paper does not contain an explicit statement about releasing source code, nor does it provide any links to a code repository. |
| Open Datasets | Yes | Our experiments are based on six commonly adopted datasets in FL: MNIST, CIFAR-10, Fashion MNIST (FMNIST) (Xiao, Rasul, and Vollgraf 2017), EMNIST-digits (EMNISTD) / letters (EMNISTL) (Cohen et al. 2017), Kuzushiji-MNIST (KMNIST) (Clanuwat et al. 2018). |
| Dataset Splits | No | The paper describes experimental scenarios such as 'FL training task over-release market' and 'FL training task under-release market' with details on the number of data owners and tasks, but it does not specify how the mentioned datasets (MNIST, CIFAR-10, etc.) were split into training, validation, or test sets for model training and evaluation. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU models, CPU types, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions using 'traditional FL training algorithms like FedAvg (McMahan et al. 2017)' but does not specify any software libraries or frameworks with their version numbers that were used for implementation. |
| Experiment Setup | Yes | We set the confidence degree γ in Equation (3) to 0.5 for each DO and the weighting factor ρ in Equation (11) to 0.1. Additionally, we set 0.3 ≤ α_i(t) < 0.5 to ensure that the basic cost of DOs and the basic operating cost of the AFL marketplace were covered. |
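The experiment-setup row above reports the paper's stated hyperparameters (γ = 0.5, ρ = 0.1, and the bound 0.3 ≤ α_i(t) < 0.5). A minimal sketch of that configuration is below; note the paper does not say how α_i(t) is chosen within its bounds, so the uniform sampling here, the function name `sample_alpha`, and the per-round loop are all assumptions for illustration only.

```python
import random

# Hyperparameters as reported in the paper's experiment setup.
GAMMA = 0.5  # confidence degree gamma in Equation (3), same for each data owner (DO)
RHO = 0.1    # weighting factor rho in Equation (11)

# Bounds on alpha_i(t); the paper only states 0.3 <= alpha_i(t) < 0.5.
ALPHA_MIN, ALPHA_MAX = 0.3, 0.5


def sample_alpha(rng: random.Random) -> float:
    """Draw a value for alpha_i(t) within the reported bounds.

    Uniform sampling is an assumption: the paper states only the
    interval, not the selection rule.
    """
    return rng.uniform(ALPHA_MIN, ALPHA_MAX)


# Hypothetical usage: one alpha per data owner in a single round.
rng = random.Random(0)
num_data_owners = 10
alphas = [sample_alpha(rng) for _ in range(num_data_owners)]
assert all(ALPHA_MIN <= a < ALPHA_MAX for a in alphas)
```

Pinning γ and ρ as module-level constants mirrors how the paper fixes them across all DOs and rounds, while α_i(t) varies per data owner and round.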