Constructing Minimal Perfect Hash Functions Using SAT Technology

Authors: Sean Weaver, Marijn Heule1668-1675

AAAI 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Experimental Evaluation We evaluated the effectiveness of our SAT-based approach to constructing MPHFs near the information-theoretic limit. The encoding and decoding tools are available at https://github.com/weaversa/MPHF. We used cube-and-conquer (Heule et al. 2012) to solve the resulting formulas as this approach demonstrated strong performance on the hardest MPHF instances. [...] Table 2: Runtime in seconds to compute the minimum-weight bipartite perfect matching on n keys, per the construction described above. [...] Table 3: Achieved bits per key (BPK) and seconds taken to compute a blocked MPHF on n keys, per the construction described above.
Researcher Affiliation Collaboration Sean A. Weaver Laboratory for Advanced Cybersecurity Research U.S. National Security Agency saweave@evoforge.org Marijn J.H. Heule Computer Science Department Carnegie Mellon University marijn@cmu.edu Amazon Scholar
Pseudocode No The paper describes algorithms in natural language and mathematical notation but does not include structured pseudocode or algorithm blocks.
Open Source Code Yes The encoding and decoding tools are available at https://github.com/weaversa/MPHF.
Open Datasets No The paper uses 'random keys' for its experiments but does not provide information about a publicly available or open dataset (e.g., a specific link, DOI, or formal citation).
Dataset Splits No The paper does not provide information on training/validation/test dataset splits, as its focus is on constructing minimal perfect hash functions for given sets of keys rather than training and evaluating a machine learning model on a pre-split dataset.
Hardware Specification Yes All runs were performed on an Amazon EC2 instance of type m5.4xlarge. Such an instance has 16 cores, 64 Gi B of memory, and currently costs 0.768 USD per hour.
Software Dependencies Yes We used glucose 3.0 (Audemard and Simon 2009) as conquer solver.
Experiment Setup Yes Our first construction encodes the MPHF construction as a Boolean formula and computes a solution for it using a SAT solver. [...] The parameter k should be chosen in a way that guarantees that G possesses a perfect matching. [...] optimal value for parameter k is max(3, ln n + ln(ln n)). [...] the XORSAT-filter parameters were raised from those suggested by Weaver et al. (Weaver, Roberts, and Smith 2018) to a block size of 4608.