reproducibilityindex.ai

Parallel Empirical Evaluations: Resilience despite Concurrency

Authors: Johannes K. Fichte, Tobias Geibinger, Markus Hecher, Matthias Schlögel

AAAI 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our experimental evaluation shows that despite parallel execution our approach reduces the runtime instability on the majority of instances to one second.
Researcher Affiliation	Academia	Johannes K. Fichte (1), Tobias Geibinger (2), Markus Hecher (3), Matthias Schlögel (2); (1) AIICS, IDA, Linköping University, Sweden; (2) KBS Group, Institute for Logic and Computation, TU Wien, Austria; (3) Massachusetts Institute of Technology, USA
Pseudocode	No	The paper describes the methods conceptually and with examples, but it does not include any structured pseudocode or algorithm blocks.
Open Source Code	Yes	Therefore, we designed a tool, called copperbench [1], that generates jobs from compact descriptions for experiments. Our tool creates a script that wraps the experimental task to resolve the aforementioned issues. After the job finished, we collect data, parse the output files, and compile a summary. In that way, experimenting can uniformly be automatized. [1] github.com/tlyphed/copperbench
Open Datasets	Yes	We take the instance set set-asp-gauss, which contains 200 publicly available SAT instances from a variety of domains with increasing practical hardness (Hoos et al. 2013).
Dataset Splits	No	The paper focuses on evaluating combinatorial solvers on instances and repetitions, not on machine learning-style dataset splits (train/validation/test).
Hardware Specification	Yes	We run on a cluster consisting of 11 nodes. Each node is equipped with two Intel Xeon E5-2650v4 processors consisting of 12 physical cores running at a base frequency of 2.2GHz, 256GB shared RAM in total. Hyperthreading is disabled.
Software Dependencies	Yes	The operating system is a Ubuntu 22.04.2 LTS running a 5.19.0-41-generic Linux kernel. ... We take the solvers glucose (Audemard and Simon 2019), CaDiCaL (Biere 2019) and Kissat (Biere et al. 2020), which show robust performance.
Experiment Setup	Yes	We set timeouts to 900s and memouts to 10GB... We repeat each instance 5 times per solver... We compare varying number of occupied cores together with activated and deactivated cache partitioning... Hyperthreading is disabled... We implement caching partitioning by the resctrl Kernel feature. To set up the configuration for each SLURM job, we utilize a custom prolog script, which runs prior to the job execution. The script creates a new resctrl resource group, sets the bitmask according to the formula stated in the previous subsection, and inserts the identifier of the process into that group.
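
The three prolog steps quoted under Experiment Setup (create a resctrl resource group, set its cache bitmask, add the process to the group) can be illustrated with a minimal Python sketch. This is not the authors' prolog script. The function names, the group name, and the `contiguous_mask` helper are hypothetical, and the bitmask formula from the paper is not reproduced here, so the masks are passed in as plain inputs. The sketch assumes the Linux resctrl filesystem is mounted at `/sys/fs/resctrl` and that L3 cache allocation is available. The `root` parameter exists only so the logic can be exercised against an ordinary directory.

```python
from pathlib import Path


def contiguous_mask(first_way: int, n_ways: int) -> int:
    """Hypothetical helper: bitmask with n_ways consecutive bits set,
    starting at bit first_way. Not the formula from the paper."""
    if first_way < 0 or n_ways < 1:
        raise ValueError("first_way must be >= 0 and n_ways >= 1")
    return ((1 << n_ways) - 1) << first_way


def assign_to_cache_group(group, pid, masks, root="/sys/fs/resctrl"):
    """Create a resource group, set its L3 masks, and move a process into it.

    masks maps an L3 cache id to the capacity bitmask for that cache.
    Returns the schemata line that was written.
    """
    group_dir = Path(root) / group
    # Step 1: a new directory under the resctrl mount is a new resource group.
    group_dir.mkdir(exist_ok=True)
    # Step 2: write the L3 masks in hexadecimal, e.g. "L3:0=f;1=f0".
    schemata = "L3:" + ";".join(
        f"{cache_id}={mask:x}" for cache_id, mask in sorted(masks.items())
    )
    (group_dir / "schemata").write_text(schemata + "\n")
    # Step 3: writing a PID to the tasks file moves that process into the group.
    (group_dir / "tasks").write_text(f"{pid}\n")
    return schemata
```

On a real system this requires root privileges, and the hardware may reject masks that are not contiguous or that exceed the number of available cache ways. In a SLURM prolog, the group name would typically be derived from the job identifier, which is also an assumption of this sketch.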