Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
On the Sample Complexity of Lipschitz Constant Estimation
Authors: Julien Walden Huang, Stephen J. Roberts, Jan-Peter Calliess
TMLR 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | As a first theoretical contribution, we derive novel lower bounds on the sample complexity of this problem for both noise-free and noisy settings under mild assumptions. Moreover, we propose a simple Lipschitz learning algorithm called Lipschitz Constant Estimation by Least Squares Regression (referred to as LCLS). We show that LCLS is asymptotically consistent for general noise assumptions and offers finite sample guarantees that can be translated to new upper bounds on the sample complexity of the Lipschitz learning problem. Our analysis shows that the sample complexity rates derived in this paper are optimal in both the noise-free setting and in the noisy setting when the noise is assumed to follow a Gaussian distribution and that LCLS is a sample-optimal algorithm in both cases. Finally, we show that by design, the LCLS algorithm is computationally faster than existing theoretically consistent methods, and can be readily adapted to various noise assumptions with little to no prior knowledge of the target function properties or noise distribution. In Section 3.4, we illustrate and compare the empirical performance of the LCLS algorithm against Strongin-based Lipschitz learning algorithms on a set of test functions. |
| Researcher Affiliation | Academia | Julien Walden Huang EMAIL University of Oxford Stephen Roberts EMAIL University of Oxford Jan-Peter Calliess EMAIL University of Oxford |
| Pseudocode | Yes | Algorithm 1 General LCLS — Input: Ω (oracle), (H_I)_{I∈ℕ} (partition sequence); Output: {L̂_I} (Lipschitz estimates); procedure LCLS(Ω, (H_I)_{I∈ℕ}): initialise I ← 1; repeat ... Algorithm 2 Hypercube LCLS on [0, M]^d — Input: Ω (oracle), K (bound from (2)), η (covering constant), σ² (noise variance), (ϵ, δ) (precision); Output: L̂ (Lipschitz constant estimate); procedure LCLS(Ω, K, η, σ², (ϵ, δ)): initialise L̂ ← 0, I ← ⌈C₁(d)MK/(ηϵ)⌉, N_I ← ⌈C₂(d, q)σ²/(M²ϵ²)⌉; H ← hypercube partition of [0, M]^d with side length M/I; for H ∈ H do: (X_H, f_H) ← D_H generated by Ω; β̂_H ← (X_Hᵀ X_H)⁻¹ X_Hᵀ f_H; L̂ ← max(‖β̂_H‖_q, L̂); end for; return L̂ |
| Open Source Code | No | The paper does not provide a direct link to a source-code repository or an explicit statement about code release. It only discusses the algorithm's properties and performance. |
| Open Datasets | No | Table 1 provides an overview of the four test functions that are used in the experiments discussed in this section. The choice of these functions represents different testing points that are of interest: Function (a) reaches the maximum of the normed gradient in a single unique point of the input space, Function (b) is a classical optimisation testing function which we have also defined to have large second degree partial derivatives, Function (c) is a trigonometric function which provides an illustration of convergence for simple target functions and finally, Function (d) is a higher dimensional version of Function (a) with 3 dimensional inputs. |
| Dataset Splits | Yes | Figure 6: ... The predictions of the trained methods are plotted in orange and the training data in light blue (800 observations). Figure 7: ... The values shown in the plot are computed on a test set containing 500 independently sampled observations. |
| Hardware Specification | No | The paper does not explicitly describe the hardware used for running its experiments. It focuses on the theoretical computational complexity and empirical performance without specifying any underlying hardware. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. It mentions implementing methods but gives no details on the environment or libraries used. |
| Experiment Setup | Yes | As benchmarks we utilise the classical Strongin Lipschitz learning algorithm (Strongin (1973)) in the noiseless setting and the popular modified Strongin-based Lipschitz constant estimator in the bounded noise setting (see in particular Novara et al. (2013), Calliess et al. (2020) and Khajenejad et al. (2021) for applications in control problems). We note that this modified Strongin estimator is strongly dependent on a precise estimate of the smallest upper bound of the noise b̄ > 0 in order to properly specify the ē ∈ ℝ₊ hyper-parameter. Indeed, if ē is smaller than b̄, then the Lipschitz constant estimates generated by the modified Strongin estimator converge to +∞ as the number of observations increases. In contrast, if ē is bigger than b̄, then the generated Lipschitz constant estimates will converge to an underestimate of L_p(f) and never be feasible. Benchmarking algorithms: (Noiseless Setting) Strongin Estimator: L̂ := max_{i≠j} \|f_i − f_j\| / ‖x_i − x_j‖. (Noisy Setting) Modified Strongin Estimator: L̂ := max_{i≠j} (\|f_i − f_j\| − 2ē) / ‖x_i − x_j‖, where ē is a hyper-parameter that estimates the tightest upper bound b̄ on the noise. We consider modified Strongin Lipschitz estimators with a correctly specified hyper-parameter (ē = b̄) and a hyper-parameter that is slightly larger than the true upper bound (ē = 1.2 b̄) as benchmarks. |
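The Hypercube LCLS pseudocode quoted in the table reduces to a least-squares fit per hypercube cell, keeping the largest slope norm. The paper released no code, so the sketch below is illustrative: the function name `lcls_estimate`, the affine (slope + intercept) per-cell model, and uniform sampling inside each cell are assumptions, and the constants C₁, C₂ from the paper are replaced by user-supplied cell/sample counts.

```python
import numpy as np
from itertools import product

def lcls_estimate(oracle, M, d, n_cells, n_samples_per_cell, q=2, rng=None):
    """Illustrative sketch of Hypercube LCLS on [0, M]^d.

    Partitions the domain into n_cells^d hypercubes, fits an affine
    least-squares model to samples drawn in each cell, and returns the
    largest q-norm of the fitted slopes as the Lipschitz estimate.
    """
    rng = np.random.default_rng(rng)
    side = M / n_cells  # side length of each hypercube cell
    L_hat = 0.0
    for idx in product(range(n_cells), repeat=d):
        lo = np.array(idx, dtype=float) * side
        # Sample uniformly inside this cell and query the oracle.
        X = lo + side * rng.random((n_samples_per_cell, d))
        y = oracle(X)
        # Affine design matrix: slope columns plus an intercept column.
        A = np.hstack([X, np.ones((n_samples_per_cell, 1))])
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        slope = beta[:d]
        L_hat = max(L_hat, np.linalg.norm(slope, ord=q))
    return L_hat
```

On a noiseless linear target such as f(x) = 3x on [0, 1], the per-cell fits recover the slope exactly, so the estimate equals the true Lipschitz constant 3.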
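The two benchmark estimators in the experiment-setup row are simple pairwise-slope computations; a minimal sketch, assuming distinct sample points (the function names are illustrative, not from the paper):

```python
import numpy as np

def strongin(X, f):
    """Classical Strongin estimator (noiseless): largest pairwise slope
    |f_i - f_j| / ||x_i - x_j|| over all distinct point pairs."""
    diffs_f = np.abs(f[:, None] - f[None, :])
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    mask = ~np.eye(len(X), dtype=bool)  # exclude i == j pairs
    return np.max(diffs_f[mask] / dists[mask])

def modified_strongin(X, f, e_bar):
    """Modified Strongin estimator (bounded noise): subtracts 2*e_bar
    from each numerator, where e_bar estimates the noise bound."""
    diffs_f = np.abs(f[:, None] - f[None, :]) - 2.0 * e_bar
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    mask = ~np.eye(len(X), dtype=bool)
    return np.max(diffs_f[mask] / dists[mask])
```

This makes the hyper-parameter sensitivity described above concrete: increasing `e_bar` shrinks every numerator, pushing the estimate toward an underestimate, while an `e_bar` below the true noise bound leaves noise-driven slopes in the maximum.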