Constructing Minimal Perfect Hash Functions Using SAT Technology
Authors: Sean Weaver, Marijn Heule1668-1675
AAAI 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental Evaluation We evaluated the effectiveness of our SAT-based approach to constructing MPHFs near the information-theoretic limit. The encoding and decoding tools are available at https://github.com/weaversa/MPHF. We used cube-and-conquer (Heule et al. 2012) to solve the resulting formulas as this approach demonstrated strong performance on the hardest MPHF instances. [...] Table 2: Runtime in seconds to compute the minimum-weight bipartite perfect matching on n keys, per the construction described above. [...] Table 3: Achieved bits per key (BPK) and seconds taken to compute a blocked MPHF on n keys, per the construction described above. |
| Researcher Affiliation | Collaboration | Sean A. Weaver Laboratory for Advanced Cybersecurity Research U.S. National Security Agency saweave@evoforge.org Marijn J.H. Heule Computer Science Department Carnegie Mellon University marijn@cmu.edu Amazon Scholar |
| Pseudocode | No | The paper describes algorithms in natural language and mathematical notation but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The encoding and decoding tools are available at https://github.com/weaversa/MPHF. |
| Open Datasets | No | The paper uses 'random keys' for its experiments but does not provide information about a publicly available or open dataset (e.g., a specific link, DOI, or formal citation). |
| Dataset Splits | No | The paper does not provide information on training/validation/test dataset splits, as its focus is on constructing minimal perfect hash functions for given sets of keys rather than training and evaluating a machine learning model on a pre-split dataset. |
| Hardware Specification | Yes | All runs were performed on an Amazon EC2 instance of type m5.4xlarge. Such an instance has 16 cores, 64 Gi B of memory, and currently costs 0.768 USD per hour. |
| Software Dependencies | Yes | We used glucose 3.0 (Audemard and Simon 2009) as conquer solver. |
| Experiment Setup | Yes | Our first construction encodes the MPHF construction as a Boolean formula and computes a solution for it using a SAT solver. [...] The parameter k should be chosen in a way that guarantees that G possesses a perfect matching. [...] optimal value for parameter k is max(3, ln n + ln(ln n)). [...] the XORSAT-filter parameters were raised from those suggested by Weaver et al. (Weaver, Roberts, and Smith 2018) to a block size of 4608. |