Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Exact Community Recovery under Side Information: Optimality of Spectral Algorithms

Authors: Julia Gaudio, Nirmit Joshi

ICLR 2025

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We also run simulations to verify our theoretical results (see Figure 2 and Appendix A.1). ... Lighter pixels correspond to higher rate of success. The blue and red curves are theoretical thresholds with and without side information respectively. |
| Researcher Affiliation | Collaboration | Julia Gaudio, Northwestern University, EMAIL; Nirmit Joshi, Toyota Technological Institute at Chicago, EMAIL |
| Pseudocode | Yes | Algorithm 1: An informal sketch of the spectral algorithm. Algorithm 2: Spectral recovery algorithm for ROS, without or with side information. Algorithm 3: Degree-Profiling algorithm for ROS in the presence of BEC or BSC side information. Algorithm 4: Find Linear Combination Coefficients. Algorithm 5: Spectral recovery algorithm for SBM (Rank-2). Algorithm 6: (Spectral) Recovery algorithm for SBM (Rank-1). Algorithm 7: Degree-Profiling algorithm for SBM in the presence of BEC or BSC side information. |
| Open Source Code | No | The paper does not explicitly state that source code for the current work is being released, nor does it provide a direct link to a code repository. It references implementation details in previous works but not a release for this paper. |
| Open Datasets | No | The paper uses models such as the Stochastic Block Model (SBM) and Z2-Synchronization to generate synthetic data for simulations, rather than relying on or providing access to pre-existing publicly available datasets. For example, it states: "Empirical Validations. In Figure 2, we consider the Z2-Synchronization setting..." |
| Dataset Splits | No | The paper uses simulated data based on theoretical models and does not describe traditional dataset splits (e.g., train/test/validation percentages or counts). The experimental setup involves generating data for a given n and varying model parameters, not splitting a fixed dataset. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU/CPU models, memory) used to run the simulations. |
| Software Dependencies | No | The paper does not explicitly mention any software or library dependencies with version numbers. |
| Experiment Setup | Yes | In Figure 2, we consider the Z2-Synchronization model, which is ROS_n(1/2, a, a). With n = 300, for each type of side information channel of strength β (see Appendix A.1), we validate the performance of the spectral algorithm over N = 50 trials over a grid of values for (β, a). ... For the symmetric SBM, i.e. SBM_n(1/2, a, b), we let n = 300 and consider the BEC side information channel with ϵ = n^{-β} for β ∈ {0, 5, 0.7}... |
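Since no code is released for the paper, the sketch below illustrates how a success-rate experiment of the kind described in the Experiment Setup entry might be organized: sample a Z2-Synchronization instance, estimate the labels from the leading eigenvector, and count exact recoveries over repeated trials. It is not the authors' implementation. The function names, the signal scaling `a * sqrt(log(n) / n)`, and the unit-variance Gaussian noise are all assumptions made for illustration, and side information is omitted entirely.

```python
import numpy as np


def sample_observation(n, a, rng):
    """Draw hidden +/-1 labels and a noisy rank-one observation matrix.

    ASSUMPTION: the signal scale a * sqrt(log(n) / n) and the unit-variance
    symmetric Gaussian noise are illustrative choices, not taken from the paper.
    """
    sigma = rng.choice([-1, 1], size=n)
    upper = np.triu(rng.standard_normal((n, n)), k=1)
    noise = upper + upper.T  # symmetric noise with a zero diagonal
    signal = a * np.sqrt(np.log(n) / n) * np.outer(sigma, sigma)
    return sigma, signal + noise


def spectral_estimate(matrix):
    """Estimate labels as the entrywise signs of the leading eigenvector."""
    _, vecs = np.linalg.eigh(matrix)  # eigenvalues come back in ascending order
    top = vecs[:, -1]
    return np.where(top >= 0, 1, -1)


def success_rate(n, a, trials, seed=0):
    """Fraction of trials with exact recovery, up to a global sign flip."""
    rng = np.random.default_rng(seed)
    wins = 0
    for _ in range(trials):
        sigma, matrix = sample_observation(n, a, rng)
        est = spectral_estimate(matrix)
        if np.array_equal(est, sigma) or np.array_equal(est, -sigma):
            wins += 1
    return wins / trials


if __name__ == "__main__":
    # Hypothetical grid over the signal strength a, with n and the trial
    # count matching the values quoted in the Experiment Setup entry.
    for a in (0.5, 1.0, 2.0, 4.0):
        print(a, success_rate(n=300, a=a, trials=50))
```

Extending the sketch to BEC or BSC side information would require the combination rule from the paper's Algorithm 2, which is not reproduced in this summary.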