Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Federated Nearest Neighbor Machine Translation
Authors: Yichao Du, Zhirui Zhang, Bingzhe Wu, Lemao Liu, Tong Xu, Enhong Chen
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments show that FedNN significantly reduces computational and communication costs compared with FedAvg, while maintaining promising translation performance in different FL settings. |
| Researcher Affiliation | Collaboration | University of Science and Technology of China; State Key Laboratory of Cognitive Intelligence; Tencent AI Lab |
| Pseudocode | No | The paper includes a workflow diagram (Figure 1) but does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is open-sourced on https://github.com/duyichao/FedNN-MT. |
| Open Datasets | Yes | We adopt WMT14 En-De data (Bojar et al., 2014) and multi-domain En-De dataset (Koehn & Knowles, 2017) to simulate two typical FL scenarios for model evaluation: 1) the non-independently identically distribution (Non-IID setting) where each client distributes data from different domains; 2) the independently identically distribution (IID setting) where each client contains the same data distribution from all domains. |
| Dataset Splits | Yes | Table 3: The statistics of datasets for server and clients. Server WMT14 ... Dev 45,206 ... Client IT ... Dev 2,000 |
| Hardware Specification | Yes | We train all models with 4 Tesla-V100 GPU and set patience to 5 to select the best checkpoint on the validation set. |
| Software Dependencies | No | The paper mentions software such as FAIRSEQ, the Adam optimizer, FAISS, the Moses toolkit, and sacreBLEU, but it does not specify version numbers for these components, which are required for reproducibility. |
| Experiment Setup | Yes | The input embedding size of the transformer layer is 512, the FFN layer dimension is 2048, and the number of self-attention heads is 8. During training, we deploy the Adam optimizer (Kingma & Ba, 2015) with a learning rate of 5e-4 and 4K warm-up updates to optimize model parameters. Both label smoothing coefficient and dropout rate are set to 0.1. The batch size is set to 16K tokens. |
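
The hyperparameters quoted in the Experiment Setup and Hardware Specification rows can be collected into a single configuration object, which makes it easier to see what is and is not specified. The following is a minimal sketch, not the authors' code: the class name `ExperimentSetup`, its field names, and the `head_dim` helper are hypothetical, and "16K tokens" is assumed to mean 16,000 rather than 16,384. Settings the table does not report (for example, the learning-rate schedule after warm-up) are deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentSetup:
    """Hyperparameters quoted in the table above (field names are hypothetical)."""

    embed_dim: int = 512            # input embedding size of the transformer layer
    ffn_dim: int = 2048             # FFN layer dimension
    attention_heads: int = 8        # number of self-attention heads
    learning_rate: float = 5e-4     # Adam learning rate
    warmup_updates: int = 4000      # "4K warm-up updates"
    label_smoothing: float = 0.1    # label smoothing coefficient
    dropout: float = 0.1            # dropout rate
    batch_tokens: int = 16_000      # "16K tokens"; assumed 16,000, not 16,384
    num_gpus: int = 4               # 4 Tesla V100 GPUs
    patience: int = 5               # early-stopping patience on the validation set

    def head_dim(self) -> int:
        """Per-head dimension implied by the embedding size and head count."""
        if self.embed_dim % self.attention_heads != 0:
            raise ValueError("embed_dim must be divisible by attention_heads")
        return self.embed_dim // self.attention_heads


setup = ExperimentSetup()
print(setup.head_dim())  # 512 / 8
```

Freezing the dataclass keeps the recorded values from being changed by accident when the object is passed around a training script.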