Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Ghidorah: Towards Robust Multi-Scale Information Diffusion Prediction via Test-Time Training
Authors: Wenting Zhu, Chaozhuo Li, Litian Zhang, Senzhang Wang, Xi Zhang
AAAI 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results across several benchmark datasets validate the superiority of our approach. [...] Experimental Settings Datasets Following previous works (Yang et al. 2021; Sun et al. 2022), we evaluate the proposed framework on four datasets collected from real-world platforms: Christianity, Android, Memetracker (Jiao et al. 2024), and Douban. [...] Performance Comparison Table 2 and Table 3 report the results for microscopic prediction, while Table 4 summarizes the results for macroscopic prediction. [...] Ablation Study We conduct a series of ablation studies on the Christianity and Douban datasets to evaluate the importance of each module within Ghidorah. [...] Hyperparameter Sensitivity Analysis Mask Ratio pm in MAE [...] Auxiliary Task Loss Weight α [...] Number of Constructed Environment F [...] Gradient Steps δ during Test-Time Training |
| Researcher Affiliation | Academia | 1Key Laboratory of Trustworthy Distributed Computing and Service (MoE), Beijing University of Posts and Telecommunications, China 2School of Cyber Science and Technology, Beihang University, China 3School of Computer Science and Engineering, Central South University, China EMAIL, EMAIL, EMAIL, EMAIL |
| Pseudocode | No | The paper describes the methodology and workflow in detail with figures and equations, but does not include a distinct pseudocode or algorithm block. |
| Open Source Code | No | The paper does not contain an explicit statement about open-sourcing the code or a link to a code repository. |
| Open Datasets | Yes | Following previous works (Yang et al. 2021; Sun et al. 2022), we evaluate the proposed framework on four datasets collected from real-world platforms: Christianity, Android, Memetracker (Jiao et al. 2024), and Douban. |
| Dataset Splits | Yes | We randomly sample 80% of the cascades for training, 10% for validation, and the remaining 10% for testing. |
| Hardware Specification | No | The paper states: "Our model is implemented in PyTorch. The results for Ghidorah are presented as the mean of five runs to ensure reliable evaluation." However, it does not specify any hardware details such as CPU or GPU models, or memory. |
| Software Dependencies | No | Our model is implemented in PyTorch. The paper mentions PyTorch but does not provide a specific version number for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | All hyperparameters are selected through a grid search algorithm based on validation set performance, with final results reported on the test set. [...] Mask Ratio pm in MAE [...] Ghidorah performs best with a mask ratio of 0.4, which masks 40% of users. [...] Auxiliary Task Loss Weight α [...] Optimal performance is achieved when α is set to 0.5. [...] Gradient Steps δ during Test-Time Training [...] performance improves with an increasing number of steps, peaks at 15 gradient updates, and then begins to decline. |
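The splitting protocol quoted in the Dataset Splits row (a random 80% / 10% / 10% partition over cascades) can be sketched as follows. This is an illustrative reconstruction, not the authors' code; the function name, seed, and use of Python's `random` module are assumptions.

```python
import random

def split_cascades(cascades, seed=42):
    """Randomly partition cascades into 80% train, 10% validation,
    10% test, as described in the paper's splitting protocol.
    (Function name and seed are illustrative, not from the paper.)"""
    rng = random.Random(seed)
    shuffled = list(cascades)
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train = int(0.8 * n)
    n_val = int(0.1 * n)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test

train, val, test = split_cascades(range(100))
print(len(train), len(val), len(test))  # 80 10 10
```

Note that with integer truncation the test set absorbs any remainder when the cascade count is not divisible by ten, which keeps the three sets an exact partition of the data.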