Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Differentially Private Community Detection for Stochastic Block Models
Authors: Mohamed S Mohamed, Dung Nguyen, Anil Vullikanti, Ravi Tandon
ICML 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we present experimental results to assess the performance of our proposed private community detection algorithms, and the associated tradeoffs between privacy and community recovery for both synthetically generated graphs (SBMs) as well as real-world graphs. The proposed mechanisms are implemented in MATLAB 2020b and the optimization (SDP) is done through CVX solver (Grant et al., 2009). In the numerical results, we perform Monte Carlo simulations, where in each iteration we compute the normalized hamming distance between Ļ and ĖĻ as an estimate for the error probability. Our numerical experiments address the following questions: |
| Researcher Affiliation | Academia | 1Department of Electrical and Computer Engineering, University of Arizona. 2Biocomplexity Institute and Initiative, University of Virginia. 3Department of Computer Science, University of Virginia. Correspondence to: Mohamed Seif <EMAIL>, Dung Nguyen <EMAIL>, Ravi Tandon <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 MĖĻ Stability(G): Stability Based Mechanism |
| Open Source Code | No | The paper does not provide a link to its source code or explicitly state that the code is publicly available. |
| Open Datasets | Yes | We now discuss our results for two real-world datasets (shown in Figs. 2 (g) & (h)): (1) Zachary s Karate Club dataset which contains a social network of friendships between 34 members of a karate club at a US university in the 1970s. (Girvan & Newman, 2002) and (2) The Political Blogosphere dataset (Adamic & Glance, 2005) which consists of 1490 political blogs captured during 2004 US elections. |
| Dataset Splits | No | The paper discusses synthetic and real-world datasets but does not explicitly provide information on training, validation, or test splits. It conducts Monte Carlo simulations on the overall datasets. |
| Hardware Specification | No | The paper does not provide any specific hardware specifications used for running the experiments. |
| Software Dependencies | Yes | The proposed mechanisms are implemented in MATLAB 2020b and the optimization (SDP) is done through CVX solver (Grant et al., 2009). |
| Experiment Setup | Yes | We first study community recovery on synthetic graphs (SBM) with n = 100 vertices, r = 2 communities, b = 0.1 and vary the parameter a. ... In Figs. 2 (b) and (c), we fix n = 200, a = 3.5, b = 0.1 and study the impact of privacy budget ϵ on the error probability for the case of r = 2 and r = 3 communities. |