Efficient Correlated Subgraph Searches for AI-powered Drug Discovery
Authors: Hiroaki Shiokawa, Yuma Naoi, Shohei Matsugu
IJCAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experimental analysis confirms that Corgi has a shorter running time and improved accuracy compared to existing state-of-the-art methods, while a case study demonstrates that Corgi is suitable for practical AI-powered drug discovery. |
| Researcher Affiliation | Academia | Hiroaki Shiokawa¹, Yuma Naoi² and Shohei Matsugu²; ¹Center for Computational Sciences, University of Tsukuba, Japan; ²Graduate School of Science and Technology, University of Tsukuba, Japan |
| Pseudocode | Yes | Algorithm 1 (Phase 1) View generation; Algorithm 2 (Phase 2) Mv Tk search |
| Open Source Code | No | The paper states 'All methods were implemented in C/C++ using the -O3 option.' but does not provide any links or explicit statements about the public release of their source code for the methodology. |
| Open Datasets | Yes | We tested 12 public molecule databases published by NCI [Nicklaus et al., 2012], DUD-E [Mysinger et al., 2012], LIT-PCBA [Nguyen et al., 2020], and ZINC 20 [Irwin et al., 2012]. Table 2 shows their statistics, where n, n , and d denote the average graph size, the average summarized graph view size, and the average degree, respectively. For more details, please refer to Appendix A. |
| Dataset Splits | No | The paper describes a correlated subgraph search problem and does not utilize traditional machine learning dataset splits (e.g., training, validation, test sets) for model training or evaluation. The 'validation step' mentioned in Section 3.3 refers to a step within the algorithm to filter results, not a dataset split. |
| Hardware Specification | Yes | Evaluations were conducted on a server with an Intel Xeon CPU 2.90 GHz and 1 TiB RAM. |
| Software Dependencies | No | The paper states 'All methods were implemented in C/C++ using the -O3 option.' However, it does not specify any particular software libraries, frameworks, or their version numbers that were used in the implementation. |
| Experiment Setup | Yes | We employed TopCor [Ke et al., 2009] for the CSS method invoked in Algorithm 2 and set ϵ = 0.05 and T to the smallest value derived by Lemma 5. For each database, p is set to the largest possible value. ... Consistent with [Ke et al., 2009; Prateek et al., 2020], ten queries are generated for each database by randomly selecting subgraphs from the database. The results are averaged over the above ten queries. |