Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
The Fair Value of Data Under Heterogeneous Privacy Constraints in Federated Learning
Authors: Justin Singh Kang, Ramtin Pedarsani, Kannan Ramchandran
TMLR 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We numerically investigate the mechanism design problem under this example and see how heterogeneity impacts the optimal behavior of the platform. Finally, Section 5 explores the platform mechanism design problem. In Theorem 3 we establish that there are three distinct regimes in which the platform's optimal behavior differs depending on the common privacy sensitivity of the users. [...] Fig 7 shows the numerical solution to equation 19 for two different choices of s2 and r2. |
| Researcher Affiliation | Academia | Justin S. Kang EMAIL UC Berkeley Ramtin Pedarsani EMAIL UC Santa Barbara Kannan Ramchandran EMAIL UC Berkeley |
| Pseudocode | Yes | Algorithm 1: Find optimal α. Input: nᵢ, cᵢ, i = 1, ..., N. Output: α*. n_Array(i) ← nᵢ, i = 1, ..., N; c_Array(i) ← cᵢ, i = 1, ..., N; partitions ← Get_Valid_Partitions(nᵢ : i = 1, ..., N) /* all ρ that produce unique U */; for i = 1 to len(partitions) do: rho ← partitions(i) /* one representative ρ from the partition */; for j = 1 to N do: phi(j) ← Shapley(ρ, j, n_Array) /* actual code skips repeated calculations */; end; for α in grid do: ne_Exists ← Tree_Search(α·phi, c_Array) /* check if any ρ in the partition is an NE */; if ne_Exists then: curr_Util ← (1 − α)·Utility(ρ, n_Array); if curr_Util > max_Util then: α* ← α /* update α if needed */; max_Util ← curr_Util; end; end; end; end |
| Open Source Code | No | The paper does not contain an explicit statement about releasing source code for the methodology described, nor does it provide a link to a code repository. |
| Open Datasets | No | In this example, we use DP as our heterogeneous privacy framework. Let Xi represent independent and identically distributed data of user i, with Pr(Xi = 1/2) = p and Pr(Xi = −1/2) = 1 − p, with p ∼ Unif(0, 1). The platform's goal is to construct an ϵ-DP estimator for µ := E[Xi] = p − 1/2 that minimizes Bayes risk. [...] There is no mention of a specific, publicly available dataset used in the experiments. The data is described as synthetically generated for an example scenario. |
| Dataset Splits | No | The paper describes a theoretical framework and uses simulated data for its examples and numerical solutions. It does not mention or define any specific training, validation, or test splits, as would be common in empirical machine learning experiments on a dataset. |
| Hardware Specification | No | The paper focuses on theoretical analysis and numerical simulations. There is no mention of any specific hardware (e.g., GPU/CPU models, cloud instances) used for running these simulations or computations. |
| Software Dependencies | No | The paper describes mathematical models, theorems, and algorithms, including a numerical solution approach. However, it does not specify any particular software libraries, frameworks, or their version numbers that were used for implementation or numerical computation. |
| Experiment Setup | No | The paper describes problem formulations, utility functions, and parameters for theoretical analysis (e.g., N=10 users, s2=100, r2=1 for heterogeneity). It also mentions a 'grid search to determine the optimal α' for the mechanism design problem. However, it does not provide details about a typical experimental setup involving machine learning models, such as hyperparameters (learning rate, batch size, epochs), specific model initialization, or optimizer settings. The 'numerical solution' described is more about finding optimal game-theoretic parameters rather than training a machine learning model. |
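The grid search described in the paper's Algorithm 1 (scan candidate values of the platform's share α, check whether a participation profile is a Nash equilibrium, and keep the α that maximizes the platform's retained utility) can be sketched as follows. This is an illustrative simplification under assumptions not in the paper: users are treated as symmetric so each participant's Shapley value reduces to an equal share of the coalition utility, and the names `utility`, `is_nash_equilibrium`, and `find_optimal_alpha` are hypothetical, not the authors' implementation.

```python
import itertools

def utility(k):
    # Toy concave platform utility in the number of participants k
    # (a stand-in for the paper's Utility(rho, n_Array)).
    return k ** 0.5

def is_nash_equilibrium(participants, alpha, costs, n_users):
    # Simple participation game: each user compares their alpha-scaled
    # (symmetric) Shapley payment against their private data cost.
    k = len(participants)
    for i in range(n_users):
        if i in participants:
            # A participant must weakly prefer staying in.
            if alpha * utility(k) / k < costs[i]:
                return False
        else:
            # A non-participant must weakly prefer staying out.
            if alpha * utility(k + 1) / (k + 1) > costs[i]:
                return False
    return True

def find_optimal_alpha(costs, alpha_grid):
    # Grid search over alpha; for each alpha, enumerate participation
    # profiles and keep the best platform utility over equilibria.
    n = len(costs)
    best_alpha, best_util = None, float("-inf")
    for alpha in alpha_grid:
        for k in range(n + 1):
            for participants in itertools.combinations(range(n), k):
                if is_nash_equilibrium(set(participants), alpha, costs, n):
                    platform_util = (1 - alpha) * utility(k)
                    if platform_util > best_util:
                        best_util, best_alpha = platform_util, alpha
    return best_alpha, best_util
```

The brute-force enumeration over participation profiles stands in for the partition enumeration and tree search in the paper's pseudocode, which avoid this exponential blow-up.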