Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
HID: Hierarchical Multiscale Representation Learning for Information Diffusion
Authors: Honglu Zhou, Shuyuan Xu, Zuohui Fu, Gerard de Melo, Yongfeng Zhang, Mubbasir Kapadia
IJCAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on three real-world datasets showcase the superiority of our method. |
| Researcher Affiliation | Academia | Department of Computer Science, Rutgers University, New Brunswick, NJ 08901 EMAIL, EMAIL, EMAIL, EMAIL |
| Pseudocode | Yes | Algorithm 1 HID(s, p, d, Dtrain, F) and Algorithm 2 UPSCALING(s, p, Dtrain) |
| Open Source Code | Yes | https://github.com/hongluzhou/HID |
| Open Datasets | Yes | Memetracker [Leskovec et al., 2009], Twitter [Yang and Leskovec, 2011], Digg [Hogg and Lerman, 2012]. |
| Dataset Splits | Yes | For each dataset, the set of diffusion paths is randomly split into two parts: 80% for training and validation (Dtrain), and the remainder for testing (Dtest). The hyper-parameters are chosen based on validation performance. |
| Hardware Specification | Yes | All models run on a single machine with 256 GB memory, 48 CPU cores at 2.30GHz, and an NVIDIA Quadro K6000 graphics card. |
| Software Dependencies | No | The paper mentions the models used (CDK, CSDK, Forest, HARP, Walklets) but does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | For CDK, the maximum training epoch was 8,000 and per epoch the number of samples was 5,000. The initial learning rate was 0.01 with a decay of 1 × 10⁻⁶. CSDK shared the same parameters, except 10,000 for the number of samples per epoch and 1 × 10⁻¹² for decay. For Forest, HARP, and Walklets, we used the parameters suggested by the authors. Forest used a maximum training epoch of 24. The user representation dimensionality was 64. HID requires two hyper-parameters, s and p. For results in Table 2, different values of s in {1, 2, 3} and different values of p in {1.2, 1.5, 2, 3, 4} were tried with a grid search using the validation data before choosing the best-performing settings. |
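
The split and hyper-parameter search described in the Dataset Splits and Experiment Setup rows can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `split_paths` and `grid_search`, the fixed seed, the toy paths, and the placeholder scoring function are all assumptions made for illustration. Only the 80% training fraction and the candidate values of s and p come from the table above.

```python
import itertools
import random


def split_paths(paths, train_fraction=0.8, seed=0):
    """Randomly split diffusion paths into (D_train, D_test).

    The 80% fraction follows the table; the seed is an assumption.
    """
    shuffled = list(paths)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def grid_search(score_fn, s_values=(1, 2, 3), p_values=(1.2, 1.5, 2, 3, 4)):
    """Return the (s, p) pair with the highest validation score.

    score_fn is a hypothetical callable standing in for "train HID with
    (s, p) and evaluate on validation data".
    """
    return max(itertools.product(s_values, p_values),
               key=lambda sp: score_fn(*sp))


# Toy usage with placeholder paths and a placeholder scoring function.
paths = [[f"user{i}", f"user{i + 1}"] for i in range(10)]
d_train, d_test = split_paths(paths)
best_s, best_p = grid_search(lambda s, p: -abs(s - 2) - abs(p - 1.5))
```

In practice `score_fn` would hold out part of the training paths for validation, as the table states that hyper-parameters are chosen on validation performance; how that validation subset is carved out is not specified here.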