GoTube: Scalable Statistical Verification of Continuous-Depth Models

Authors: Sophie A. Gruenbacher, Mathias Lechner, Ramin Hasani, Daniela Rus, Thomas A. Henzinger, Scott A. Smolka, Radu Grosu

AAAI 2022, pp. 6755-6764 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We show that GoTube substantially outperforms state-of-the-art verification tools in terms of the size of the initial ball, speed, time-horizon, task completion, and scalability on a large set of experiments.
Researcher Affiliation | Academia | TU Wien, IST Austria, CSAIL MIT, Stony Brook University
Pseudocode | Yes | Algorithm 1: GoTube (an illustrative sampling sketch follows this table)
Open Source Code | Yes | Code / Appendix: https://github.com/DatenVorsprung/GoTube
Open Datasets | No | The paper names standard benchmarks like 'CartPole-v1' and classical dynamical systems. While these are commonly understood to be public, the paper does not provide concrete access information (specific links, DOIs, or formal citations with authors/year) for these datasets/environments within the text.
Dataset Splits | No | The paper does not provide specific training/validation/test dataset splits. The experiments are conducted on continuous dynamical systems and neural models, where the concept of data splits from typical supervised learning does not directly apply.
Hardware Specification | Yes | We run our evaluations on a standard workstation machine setup (12 vCPUs, 64 GB memory) equipped with a single GPU for a per-run timeout of 1 hour (except for runtimes reported in Figure 4).
Software Dependencies | No | The paper mentions 'JAX' as an implementation tool but does not provide specific version numbers for JAX or any other software dependencies. It also refers only generically to 'advanced automatic differentiation toolboxes'.
Experiment Setup | No | The paper mentions some general settings like 'µ = 1.1 as the tightness factor' and a '99% confidence level' but does not provide a comprehensive experimental setup, including specific hyperparameters (e.g., learning rates, batch sizes), model initialization details, or other system-level training settings.
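
Illustrative sketch (not from the paper): a minimal JAX example of the Monte Carlo sampling step underlying a GoTube-style reachtube, using the µ = 1.1 tightness factor quoted above. The dynamics f, sample count, horizon, and initial radius are placeholder assumptions; the actual Algorithm 1 additionally uses stochastic Lipschitz-based bounds to deliver its stated statistical (e.g., 99%) confidence guarantee, which this sketch does not reproduce.

import jax
import jax.numpy as jnp
from jax.experimental.ode import odeint

def f(x, t):
    # Placeholder dynamics: a damped oscillator standing in for a continuous-depth model.
    return jnp.array([x[1], -x[0] - 0.1 * x[1]])

def sample_ball(key, center, r0, n):
    # Draw n points uniformly on the sphere of radius r0 around the initial center.
    d = center.shape[0]
    v = jax.random.normal(key, (n, d))
    v = v / jnp.linalg.norm(v, axis=1, keepdims=True)
    return center + r0 * v

def reachtube_radius(key, center, r0, ts, mu=1.1, n=256):
    # Propagate the center and the boundary samples through the ODE, then bound the
    # tube radius at each time step by the largest sample deviation, inflated by mu.
    xs = sample_ball(key, center, r0, n)
    center_traj = odeint(f, center, ts)                        # shape (T, d)
    sample_traj = jax.vmap(lambda x0: odeint(f, x0, ts))(xs)   # shape (n, T, d)
    dists = jnp.linalg.norm(sample_traj - center_traj[None], axis=-1)
    return mu * dists.max(axis=0)                              # one radius per time step

key = jax.random.PRNGKey(0)
ts = jnp.linspace(0.0, 5.0, 50)
radii = reachtube_radius(key, jnp.array([1.0, 0.0]), r0=0.1, ts=ts)

The sketch illustrates why JAX is a natural fit here: vmap parallelizes the sample propagation and the whole pipeline is differentiable, which the paper exploits for its Lipschitz estimates via automatic differentiation.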