Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Nonparametric Inference under B-bits Quantization
Authors: Kexuan Li, Ruiqi Liu, Ganggang Xu, Zuofeng Shang
JMLR 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive simulation studies together with a real-data analysis are used to demonstrate the validity and effectiveness of the proposed tests. |
| Researcher Affiliation | Collaboration | Kexuan Li, Global Biometrics and Data Sciences, Bristol Myers Squibb, Princeton Pike, NJ 08648, USA; Ruiqi Liu, Department of Mathematics and Statistics, Texas Tech University, Lubbock, TX 79409, USA; Ganggang Xu, Department of Management Science, University of Miami, Coral Gables, FL 33146, USA; Zuofeng Shang, Department of Mathematical Sciences, New Jersey Institute of Technology, Newark, NJ 07102, USA |
| Pseudocode | Yes | Algorithm 1: Two-Stage Quantization; Algorithm 2: Quantization Estimation of Variance |
| Open Source Code | No | The paper does not contain any explicit statements about releasing source code for the methodology, nor does it provide a link to a code repository. |
| Open Datasets | Yes | In this section, we apply the proposed methods to the Combined Cycle Power Plant Data (Kaya et al., 2012; Tüfekci, 2014), which can be downloaded at https://archive.ics.uci.edu/ml/datasets/Combined+Cycle+Power+Plant. |
| Dataset Splits | No | The paper mentions generating data for simulations with various sample sizes (e.g., n = 1000, 2000, 3000, 5000, 10000) and using a real dataset of n = 9568 observations. However, it does not specify any training, validation, or test splits for these datasets. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the simulation studies or the real-data analysis experiments. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers. |
| Experiment Setup | Yes | For all simulation studies, we consider the uniform quantization scheme outlined in Section 2.2. Specifically, for the data quantization step, for a given bits budget $B$, we choose $c$, $k$ following the approach suggested in Section 2.5 with a $T_n/\sigma^2 = 2.5\log(n)$. For each simulation, the quantization ranges $t_1$, $t_{k-1}$ are defined as $t_1 = \mu_0 - \sqrt{2.5\sigma^2\log(n)}$, $t_{k-1} = \mu_0 + \sqrt{2.5\sigma^2\log(n)}$, where $\mu_0 = \int_0^1 g_0(x)\,dx$ with $g_0(\cdot)$ being the regression function in model (1). The target significance level was chosen as $\alpha = 0.1$. ... The tuning parameter $\lambda$ was set as $\lambda = \hat{\lambda}_{\mathrm{GCV}}/\log(c)$ with $\hat{\lambda}_{\mathrm{GCV}}$ being picked by GCV. |
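
The quantization range quoted in the Experiment Setup row can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `quantization_range` and `uniform_quantize`, the trapezoidal integration grid, and the equally spaced placement of the thresholds between the two endpoints are assumptions made for illustration. The paper's actual scheme (Section 2.2) and its rule for choosing c and k (Section 2.5) are not reproduced in this summary, so `k` is simply passed in.

```python
import numpy as np


def quantization_range(g0, sigma2, n, grid_size=10_001):
    """Return the endpoints mu0 -/+ sqrt(2.5 * sigma^2 * log(n)).

    mu0 is the integral of g0 over [0, 1], approximated here with a
    trapezoidal rule (the grid size is an illustrative assumption).
    g0 must accept a NumPy array.
    """
    x = np.linspace(0.0, 1.0, grid_size)
    y = np.asarray(g0(x), dtype=float)
    mu0 = float(np.sum((y[1:] + y[:-1]) / 2.0) * (x[1] - x[0]))
    half_width = float(np.sqrt(2.5 * sigma2 * np.log(n)))
    return mu0 - half_width, mu0 + half_width


def uniform_quantize(y, t_low, t_high, k):
    """Map responses to k cells using k - 1 equally spaced thresholds.

    Assumption: thresholds are spread evenly from t_low to t_high
    (both included), so k must be at least 3. Labels run from 0 to k - 1.
    """
    if k < 3:
        raise ValueError("k must be at least 3 for this sketch")
    thresholds = np.linspace(t_low, t_high, k - 1)
    labels = np.searchsorted(thresholds, np.asarray(y, dtype=float), side="right")
    return labels, thresholds


if __name__ == "__main__":
    # Hypothetical regression function and noise level, for illustration only.
    g0 = lambda x: np.sin(2.0 * np.pi * x)
    t_low, t_high = quantization_range(g0, sigma2=1.0, n=1000)
    rng = np.random.default_rng(0)
    x = rng.uniform(size=1000)
    y = g0(x) + rng.normal(size=1000)
    labels, thresholds = uniform_quantize(y, t_low, t_high, k=8)
    print(t_low, t_high, np.bincount(labels, minlength=8))
```

Because the half-width grows like the square root of log(n), the range widens slowly with the sample size, so under Gaussian-like noise only a small fraction of responses would be expected to land in the two unbounded outer cells.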