Faster federated optimization under second-order similarity
Authors: Ahmed Khaled, Chi Jin
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 5 EXPERIMENTS We run linear regression with ℓ2 regularization, where each client has a loss function of the form... We do two sets of experiments: in the first set, we generate the data vectors zm,i synthetically... In the second set, we use the a9a dataset from LIBSVM... Our results are given in Figure 1. |
| Researcher Affiliation | Academia | Ahmed Khaled Princeton University Chi Jin Princeton University |
| Pseudocode | Yes | Algorithm 1: Stochastic Proximal Point Method (SPPM). Data: Stepsize η, initialization x_0, number of steps K, proximal solution accuracy b. for k = 0, 1, 2, . . . , K − 1 do... (a runnable sketch of SPPM follows the table) |
| Open Source Code | Yes | We attach the code used to run the experiments as supplementary material to the paper. |
| Open Datasets | Yes | In the second set, we use the a9a dataset from LIBSVM (Chang & Lin, 2011) |
| Dataset Splits | No | The paper states using synthetic data and the a9a dataset from LIBSVM, and that 'each client's data is constructed by sampling from the original training dataset with n = 2000 samples per client.' However, it does not provide specific train/validation/test split percentages, sample counts for splits, or a methodology for creating these splits to ensure reproducibility. |
| Hardware Specification | No | We simulate our results on a single machine, running each method for 10000 communication steps. No specific hardware details (e.g., CPU, GPU model, memory) are provided. |
| Software Dependencies | No | In the second set, we use the a9a dataset from LIBSVM (Chang & Lin, 2011). The paper mentions LIBSVM but does not provide specific version numbers for it or any other software dependencies used in the experiments. |
| Experiment Setup | Yes | We run linear regression with ℓ2 regularization, where each client has a loss function of the form... with regularization constant λ = 1... and set the regularization parameter as λ = 0.1... We simulate our results on a single machine, running each method for 10000 communication steps. ...each client's data is constructed by sampling from the original training dataset with n = 2000 samples per client. We compare SVRP against SVRG, SCAFFOLD, and the Accelerated Extragradient algorithms, using the optimal theoretical stepsize for each algorithm. (a sketch of this setup follows the table) |
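The pseudocode quoted in the Pseudocode row translates directly into code. Below is a minimal Python sketch of SPPM under the assumption that each client exposes an exact proximal operator; the names `sppm` and `client_proxes` are our own, and the paper additionally allows the proximal subproblem to be solved only to accuracy b, which this sketch omits.

```python
import numpy as np

def sppm(client_proxes, eta, x0, K, seed=0):
    """Sketch of Algorithm 1 (SPPM): at each step, sample a client
    uniformly at random and move to the proximal point of its loss,
    x_{k+1} = argmin_x f_i(x) + ||x - x_k||^2 / (2 * eta).
    Each entry of client_proxes is a callable prox(v, eta) computing
    that minimizer exactly (the paper allows solving it to accuracy b)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(K):
        i = rng.integers(len(client_proxes))  # sample client i_k uniformly
        x = client_proxes[i](x, eta)          # proximal step on f_{i_k}
    return x
```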
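For the ℓ2-regularized linear regression experiments described in the Experiment Setup row, the proximal step has a closed form, so SPPM can be simulated exactly. The sketch below (reusing `sppm` from above) assumes a client loss of the form f_m(x) = ||Z_m x − y_m||² / (2n) + (λ/2)||x||², which matches the quoted setup in spirit but is our reconstruction; the data generation and the client count M are illustrative, not the paper's synthetic scheme.

```python
def make_quadratic_prox(Z, y, lam):
    """Closed-form prox for an assumed client loss
    f(x) = ||Z x - y||^2 / (2 n) + (lam / 2) ||x||^2.
    Setting the gradient of f(x) + ||x - v||^2 / (2 eta) to zero gives
    (Z^T Z / n + lam I + I / eta) x = Z^T y / n + v / eta."""
    n, d = Z.shape
    H = Z.T @ Z / n + lam * np.eye(d)
    b = Z.T @ y / n
    def prox(v, eta):
        return np.linalg.solve(H + np.eye(d) / eta, b + v / eta)
    return prox

# Illustrative run mirroring the paper's scale: n = 2000 samples per client.
rng = np.random.default_rng(0)
d, n, M = 20, 2000, 10                        # M clients is our assumption
clients = [make_quadratic_prox(rng.normal(size=(n, d)),
                               rng.normal(size=n), lam=1.0)
           for _ in range(M)]
x_out = sppm(clients, eta=1.0, x0=np.zeros(d), K=1000)
```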