Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Detecting Latent Communities in Network Formation Models
Authors: Shujie Ma, Liangjun Su, Yichong Zhang
JMLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The finite sample performance of the new estimation and inference methods is illustrated through both simulated and real datasets. In this section, we conduct some simulations to evaluate the performance of our procedure. In this section, we apply the proposed method to study the community structure of social network datasets. |
| Researcher Affiliation | Academia | Shujie Ma (EMAIL), Department of Statistics, University of California, Riverside, CA 92521, USA; Liangjun Su (EMAIL), School of Economics and Management, Tsinghua University, Beijing, 100084, China; Yichong Zhang (EMAIL), School of Economics, Singapore Management University, 178903, Singapore |
| Pseudocode | No | The paper describes a multi-step estimation procedure in Section 3 using numbered bullet points to detail the steps. While it outlines a method, it is presented as a narrative description rather than structured pseudocode or a formally labeled algorithm block. |
| Open Source Code | No | The text does not provide an explicit statement about releasing source code or a link to a code repository for the methodology described in this paper. |
| Open Datasets | Yes | Pokec is a popular on-line social network in Slovakia. The whole dataset has more than 1.6 million users, and it can be downloaded from https://snap.stanford.edu/data/soc-Pokec.html. The dataset contains Facebook friendship networks at one hundred American colleges and universities at a single point in time. It was provided and analyzed by Traud et al. (2012), and can be downloaded from https://archive.org/details/oxford-2005-facebook-matrix. |
| Dataset Splits | No | The paper describes data generation mechanisms for simulations and initial data cleaning for empirical applications (e.g., 'select the first 10000 users', 'After deleting the nodes with missing values'). However, it does not specify explicit training, validation, or test splits for evaluating the model on these datasets, nor does it refer to predefined splits with citations for this purpose. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, processor types, memory amounts) used for running its experiments or simulations. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment. Although it mentions 'Python code by Graham (2017)', this refers to a third-party tool and does not specify version information for any software used in the authors' implementation. |
| Experiment Setup | Yes | We select the number of communities $K_1$ by an eigenvalue ratio method given as follows. Let $\hat\sigma_{1,1} \geq \cdots \geq \hat\sigma_{K_{\max},1}$ be the first $K_{\max}$ singular values of the SVD decomposition of $\hat\Theta_1$ from the nuclear norm penalization method given in Section 3.1.1. We estimate $K_1$ by $\hat K_1$ defined in (16) by setting $c_1 = 0.1$ and $K_{\max} = 10$. We set the tuning parameter $\lambda_n = C_\lambda\{\sqrt{n\bar Y} + \sqrt{\log n}\}/\{n(n-1)\}$ with $C_\lambda = 2$ and similarly for $\lambda_n^{(1)}$. To require that the estimator of $\hat\Theta_{l,ij}$ is bounded by finite constants, we let $M = 2$ and $C_M = 2$. The performance of the method is not sensitive to the choice of these finite constants. Define the mean squared error (MSE) of the nuclear norm estimator $\hat\Theta_l$ for $\Theta^*_l$ as $\sum_{i \neq j}(\hat\Theta_{l,ij} - \Theta^*_{l,ij})^2/\{n(n-1)\}$ for $l = 0, 1$. Table 1 reports the MSEs for $\hat\Theta_l$, the mean of $\hat K_1$ and the percentage of correctly estimating $K_1$ based on the 200 realizations. |
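
The two quantities quoted in the Experiment Setup row, the off-diagonal MSE and the tuning parameter with constant 2, can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code (none is released). The function names `offdiag_mse` and `tuning_parameter` are hypothetical. The square roots in the numerator of the tuning parameter, and the reading of Y as the average off-diagonal edge indicator, are assumptions about the paper's formula.

```python
import numpy as np


def offdiag_mse(est, target):
    """MSE over off-diagonal entries: sum_{i != j} (est - target)^2 / {n(n-1)}."""
    est = np.asarray(est, dtype=float)
    target = np.asarray(target, dtype=float)
    n = est.shape[0]
    off = ~np.eye(n, dtype=bool)  # True everywhere except the diagonal
    return float(np.sum((est[off] - target[off]) ** 2) / (n * (n - 1)))


def tuning_parameter(adjacency, c_lambda=2.0):
    """Assumed form: C * {sqrt(n * Ybar) + sqrt(log n)} / {n(n-1)}.

    Ybar is taken to be the mean of the off-diagonal adjacency entries
    (an assumption; the summary does not define it).
    """
    a = np.asarray(adjacency, dtype=float)
    n = a.shape[0]
    off = ~np.eye(n, dtype=bool)
    y_bar = a[off].mean()
    return float(c_lambda * (np.sqrt(n * y_bar) + np.sqrt(np.log(n))) / (n * (n - 1)))
```

Diagonal entries are excluded in both functions because the sum in the MSE runs over pairs with i different from j, so self-links never enter the error or the edge density.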