HID: Hierarchical Multiscale Representation Learning for Information Diffusion

Authors: Honglu Zhou, Shuyuan Xu, Zuohui Fu, Gerard de Melo, Yongfeng Zhang, Mubbasir Kapadia

IJCAI 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on three real-world datasets showcase the superiority of our method.
Researcher Affiliation | Academia | Department of Computer Science, Rutgers University, New Brunswick, NJ 08901 {honglu.zhou, shuyuan.xu, zuohui.fu}@rutgers.edu, gdm@demelo.org, yongfeng.zhang@rutgers.edu, mk1353@cs.rutgers.edu
Pseudocode | Yes | Algorithm 1 HID(s, p, d, Dtrain, F) and Algorithm 2 UPSCALING(s, p, Dtrain)
Open Source Code | Yes | https://github.com/hongluzhou/HID
Open Datasets | Yes | Memetracker [Leskovec et al., 2009], Twitter [Yang and Leskovec, 2011], and Digg [Hogg and Lerman, 2012]
Dataset Splits | Yes | For each dataset, the set of diffusion paths is randomly split into two parts: 80% for training and validation (Dtrain) and the remainder for testing (Dtest). The hyper-parameters are chosen based on validation performance (see the split sketch below the table).
Hardware Specification | Yes | All models run on a single machine with 256 GB of memory, 48 CPU cores at 2.30 GHz, and an NVIDIA Quadro K6000 graphics card.
Software Dependencies | No | The paper mentions the models used (CDK, CSDK, Forest, HARP, Walklets) but does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, or TensorFlow versions).
Experiment Setup | Yes | For CDK, the maximum number of training epochs was 8,000, with 5,000 samples per epoch; the initial learning rate was 0.01 with a decay of 1 × 10^-6. CSDK shared the same parameters, except for 10,000 samples per epoch and a decay of 1 × 10^-12. For Forest, HARP, and Walklets, we used the parameters suggested by the authors; Forest used a maximum of 24 training epochs. The user representation dimensionality was 64. HID requires two hyper-parameters, s and p. For the results in Table 2, values of s in {1, 2, 3} and values of p in {1.2, 1.5, 2, 3, 4} were tried in a grid search on the validation data before choosing the best-performing settings (see the configuration sketch below the table).
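
The Dataset Splits row can be illustrated with a minimal sketch, assuming the diffusion paths for one dataset are available as a Python list. The 80%/20% boundary between Dtrain and Dtest comes from the paper; the random seed and the exact validation fraction carved out of Dtrain are not stated in the excerpt, so they are left as explicit assumptions.

import random

def split_diffusion_paths(paths, train_val_frac=0.8, seed=0):
    """Randomly split one dataset's diffusion paths into Dtrain (training + validation)
    and Dtest, following the 80%/20% protocol described above.
    The seed is an assumption; the paper does not report one."""
    rng = random.Random(seed)
    shuffled = list(paths)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_val_frac)
    d_train, d_test = shuffled[:cut], shuffled[cut:]
    return d_train, d_test

Any further split of d_train into training and validation subsets would be an additional assumption, since the excerpt only fixes the boundary between Dtrain and Dtest.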
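
The Experiment Setup row can likewise be summarized as a configuration sketch. The numeric values below are taken directly from that row; the dictionary layout and the train_hid / evaluate callables are hypothetical placeholders, not the authors' actual interfaces.

from itertools import product

# Baseline hyper-parameters as reported in the experiment setup
# (values from the paper; the dictionary layout is only an illustration).
BASELINE_CONFIG = {
    "CDK":    {"max_epochs": 8000, "samples_per_epoch": 5000,  "lr": 0.01, "lr_decay": 1e-6},
    "CSDK":   {"max_epochs": 8000, "samples_per_epoch": 10000, "lr": 0.01, "lr_decay": 1e-12},
    "Forest": {"max_epochs": 24},   # HARP and Walklets use the settings suggested by their authors
}
EMBEDDING_DIM = 64                  # user representation dimensionality

# Grid-search space for HID's two hyper-parameters, s and p.
S_VALUES = [1, 2, 3]
P_VALUES = [1.2, 1.5, 2, 3, 4]

def grid_search_hid(d_train, d_val, train_hid, evaluate):
    """Select (s, p) by validation performance.
    train_hid and evaluate are hypothetical callables standing in for the
    authors' training and validation routines."""
    best_score, best_setting = float("-inf"), None
    for s, p in product(S_VALUES, P_VALUES):
        model = train_hid(d_train, s=s, p=p, d=EMBEDDING_DIM)
        score = evaluate(model, d_val)
        if score > best_score:
            best_score, best_setting = score, (s, p)
    return best_setting, best_score

Passing train_hid and evaluate in as callables keeps the sketch independent of the authors' training code while still reflecting the reported search space of 3 × 5 = 15 (s, p) combinations.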