reproducibilityindex.ai

Efficient Correlated Subgraph Searches for AI-powered Drug Discovery

Authors: Hiroaki Shiokawa, Yuma Naoi, Shohei Matsugu

IJCAI 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our experimental analysis confirms that Corgi has a shorter running time and improved accuracy compared to existing state-of-the-art methods, while a case study demonstrates that Corgi is suitable for practical AI-powered drug discovery.
Researcher Affiliation	Academia	Hiroaki Shiokawa (1), Yuma Naoi (2), and Shohei Matsugu (2). (1) Center for Computational Sciences, University of Tsukuba, Japan; (2) Graduate School of Science and Technology, University of Tsukuba, Japan
Pseudocode	Yes	Algorithm 1 (Phase 1) View generation; Algorithm 2 (Phase 2) Mv Tk search
Open Source Code	No	The paper states 'All methods were implemented in C/C++ using the -O3 option.' but does not provide any links or explicit statements about the public release of their source code for the methodology.
Open Datasets	Yes	We tested 12 public molecule databases published by NCI [Nicklaus et al., 2012], DUD-E [Mysinger et al., 2012], LIT-PCBA [Nguyen et al., 2020], and ZINC 20 [Irwin et al., 2012]. Table 2 shows their statistics, where n, n , and d denote the average graph size, the average summarized graph view size, and the average degree, respectively. For more details, please refer to Appendix A.
Dataset Splits	No	The paper describes a correlated subgraph search problem and does not utilize traditional machine learning dataset splits (e.g., training, validation, test sets) for model training or evaluation. The 'validation step' mentioned in Section 3.3 refers to a step within the algorithm to filter results, not a dataset split.
Hardware Specification	Yes	Evaluations were conducted on a server with an Intel Xeon CPU 2.90 GHz and 1 TiB RAM.
Software Dependencies	No	The paper states 'All methods were implemented in C/C++ using the -O3 option.' However, it does not specify any particular software libraries, frameworks, or their version numbers that were used in the implementation.
Experiment Setup	Yes	We employed TopCor [Ke et al., 2009] for the CSS method invoked in Algorithm 2 and set ϵ = 0.05 and T to the smallest value derived by Lemma 5. For each database, p is set to the largest possible value. ... Consistent with [Ke et al., 2009; Prateek et al., 2020], ten queries are generated for each database by randomly selecting subgraphs from the database. The results are averaged over the above ten queries.
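
The query protocol quoted under Experiment Setup (ten queries per database, each a randomly selected subgraph, with results averaged over the ten) can be illustrated with a minimal Python sketch. This is not the authors' code, which is stated to be in C/C++ and is not released. The adjacency-dict graph representation, the function names, the default query size of 3, the fixed seed, and the frontier-growth sampling strategy are all assumptions made for illustration; the paper excerpt does not say how subgraphs are sampled.

```python
import random
from statistics import mean


def sample_connected_subgraph(adj, size, rng):
    """Grow a connected vertex set of up to `size` vertices from a random seed.

    `adj` maps each vertex to a list of its neighbours (hypothetical format).
    Returns the chosen vertices and the edges induced among them.
    """
    start = rng.choice(sorted(adj))
    chosen = {start}
    frontier = set(adj[start])
    while len(chosen) < size and frontier:
        v = rng.choice(sorted(frontier))
        chosen.add(v)
        frontier |= set(adj[v])
        frontier -= chosen
    edges = {(u, v) for u in chosen for v in adj[u] if v in chosen and u < v}
    return chosen, edges


def generate_queries(database, num_queries=10, size=3, seed=0):
    """Draw `num_queries` query subgraphs from randomly chosen database graphs."""
    rng = random.Random(seed)  # fixed seed so a run can be repeated
    return [
        sample_connected_subgraph(rng.choice(database), size, rng)
        for _ in range(num_queries)
    ]


def average_over_queries(queries, measure):
    """Average any per-query measurement (e.g. running time) over all queries."""
    return mean(measure(q) for q in queries)


# Toy database (assumed data): a 4-vertex path and a triangle.
toy_db = [
    {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]},
    {0: [1, 2], 1: [0, 2], 2: [0, 1]},
]
queries = generate_queries(toy_db)
avg_edges = average_over_queries(queries, lambda q: len(q[1]))
```

In a real evaluation, the `measure` callback would run the search method on each query and return its running time or accuracy. Here it only counts edges, to keep the sketch self-contained.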