HIEDS: A Generic and Efficient Approach to Hierarchical Dataset Summarization
Authors: Gong Cheng, Cheng Jin, Yuzhong Qu
IJCAI 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We systematically experiment with our approach on real-world RDF datasets. In this section, we empirically study the quality of summaries for real-world RDF datasets generated by HIEDS under various configurations, compare it with a baseline method, and report its running time. |
| Researcher Affiliation | Academia | Gong Cheng, Cheng Jin, Yuzhong Qu. National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China |
| Pseudocode | Yes | Algorithm 1: Computing CPV 0(G) and Algorithm 2: Finding Relations that Satisfy Eq. (11) |
| Open Source Code | No | The paper states 'We have implemented an online prototype of our HIEDS approach.' with a footnote giving http://ws.nju.edu.cn/hieds/, a project page that demonstrates the prototype; it does not provide explicit access to the source code for the methodology. |
| Open Datasets | Yes | Two extensively used RDF datasets were tested. SWDF [2] (Semantic Web Dog Food) offers 200K entity-property-value triples describing 20K entities in the research domain (e.g., papers, researchers). LinkedMDB [3] offers 6M entity-property-value triples describing 0.6M movie-related entities (e.g., actors). Footnotes: [2] http://data.semanticweb.org/ [3] http://data.linkedmdb.org/ |
| Dataset Splits | No | The paper discusses how groups are iteratively subdivided and evaluated, but does not specify train, validation, or test splits of the datasets themselves, nor does it refer to cross-validation or predefined splits. |
| Hardware Specification | Yes | We tested the running time of HIEDS on an Intel E3-1225 v3 with 30G memory for our Java program. |
| Software Dependencies | No | The paper mentions 'our Java program' and 'Lucene (lucene.apache.org)' but does not provide specific version numbers for these software dependencies. |
| Experiment Setup | Yes | In this experiment, groups were iteratively subdivided until each leaf group contained not more than 50 entities. We fixed µ_E = +∞ and ε_E = 0.01 (cf. Algorithm 1), to focus on the effects of other parameters. The following parameters were fixed for simplicity: α = β = 0.5, δ = 0, ε_E = ε_R = 0.01, k = 7, and not introducing constraint 4. |
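
The stopping rule quoted in the Experiment Setup row (subdivide until each leaf group contains not more than 50 entities) can be illustrated with a minimal sketch. This is not the HIEDS algorithm: the `split` callback, the `halve` placeholder, and the dictionary-based tree are hypothetical names introduced here only to show the termination condition, and the sketch uses recursion where the paper describes iterative subdivision.

```python
from typing import Callable, Dict, List, Sequence

MAX_LEAF_SIZE = 50  # threshold quoted in the Experiment Setup row


def subdivide(
    entities: Sequence[int],
    split: Callable[[Sequence[int]], List[Sequence[int]]],
    max_leaf_size: int = MAX_LEAF_SIZE,
) -> Dict:
    """Build a group hierarchy whose leaves hold at most max_leaf_size entities.

    `split` is a hypothetical stand-in for whatever criterion produces
    subgroups; the paper's actual subdivision method is not reproduced here.
    """
    node = {"size": len(entities), "children": []}
    if len(entities) <= max_leaf_size:
        return node  # small enough: this group is a leaf
    subgroups = [g for g in split(entities) if len(g) > 0]
    # Safety guard (an assumption of this sketch): stop when the split makes
    # no progress, so the recursion always terminates.
    if not subgroups or any(len(g) >= len(entities) for g in subgroups):
        return node
    node["children"] = [subdivide(g, split, max_leaf_size) for g in subgroups]
    return node


def leaf_sizes(node: Dict) -> List[int]:
    """Collect the sizes of all leaf groups, left to right."""
    if not node["children"]:
        return [node["size"]]
    sizes: List[int] = []
    for child in node["children"]:
        sizes.extend(leaf_sizes(child))
    return sizes


def halve(entities: Sequence[int]) -> List[Sequence[int]]:
    """Placeholder split: cut the group into two halves."""
    mid = len(entities) // 2
    return [entities[:mid], entities[mid:]]


if __name__ == "__main__":
    tree = subdivide(list(range(200)), halve)
    print(leaf_sizes(tree))
```

Swapping `halve` for a domain-specific split changes the shape of the hierarchy, but the leaf-size bound is enforced by `subdivide` itself as long as the split keeps making progress.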