Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Interaction Hard Thresholding: Consistent Sparse Quadratic Regression in Sub-quadratic Time and Space

Authors: Shuo Yang, Yanyao Shen, Sujay Sanghavi

NeurIPS 2019 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We also demonstrate its value via synthetic experiments. Moreover, we numerically show that IntHT can be extended to higher-order regression problems, and also theoretically analyze an SVRG variant of IntHT. ... 5 Synthetic Experiments To examine the sub-quadratic time and space complexity, we design three tasks to answer the following three questions:
Researcher Affiliation	Academia	Shuo Yang Department of Computer Science University of Texas at Austin Austin, TX 78712 EMAIL Yanyao Shen ECE Department University of Texas at Austin Austin, TX 78712 EMAIL Sujay Sanghavi ECE Department University of Texas at Austin Austin, TX 78712 EMAIL
Pseudocode	Yes	Algorithm 1 INTERACTION HARD THRESHOLDING (INTHT) ... Algorithm 2 APPROXIMATED TOP ELEMENTS EXTRACTION (ATEE)
Open Source Code	No	The paper does not provide any links to open-source code or state that the code for their method is publicly available.
Open Datasets	No	The paper states: "We generate feature vectors x_i, whose coordinates follow i.i.d. uniform distribution on [−1, 1]. ... The output y_i's are generated following x_i^⊤ Θ x_i." This indicates the use of synthetically generated data, not a publicly available dataset with concrete access information.
Dataset Splits	No	The paper uses synthetically generated data and mentions experiments are "averaged over 3 independent runs" or "averaged over 5 independent runs," but it does not specify explicit training, validation, or test dataset splits in terms of percentages or sample counts.
Hardware Specification	No	The paper does not mention any specific hardware specifications (e.g., GPU models, CPU types, memory) used for running the experiments.
Software Dependencies	No	The paper does not specify any software dependencies, libraries, or their version numbers used for the implementation or experiments.
Experiment Setup	Yes	Experimental setting ... by default, we set p = 200, d = 3, K = 20, k = 3K, η = 0.2. Support recovery results with different b-K combinations are averaged over 3 independent runs, results for m-p combinations are averaged over 5 independent runs. All experiments are terminated after 150 iterations.
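
Since no code or dataset is released, the following is a minimal Python sketch of how data of the kind quoted under Open Datasets might be regenerated: features drawn i.i.d. uniform on [−1, 1] and noise-free outputs y_i = x_i^⊤ Θ x_i. It is not the authors' code. The function name `make_synthetic`, the sample count `n`, the seed, the reading of K as the number of nonzero entries of Θ, and the ±1 values given to those entries are all assumptions made for illustration; only p = 200 and K = 20 are taken from the reported defaults.

```python
import numpy as np


def make_synthetic(n, p=200, K=20, seed=0):
    """Hypothetical generator for sparse quadratic regression data.

    Assumptions (not stated in the source): K is the number of nonzero
    entries of Theta, nonzero values are +/-1, and outputs are noise-free.
    """
    rng = np.random.default_rng(seed)

    # Feature vectors with i.i.d. coordinates uniform on [-1, 1].
    X = rng.uniform(-1.0, 1.0, size=(n, p))

    # K-sparse interaction matrix Theta with randomly placed +/-1 entries.
    theta = np.zeros((p, p))
    flat_idx = rng.choice(p * p, size=K, replace=False)
    rows, cols = np.unravel_index(flat_idx, (p, p))
    theta[rows, cols] = rng.choice([-1.0, 1.0], size=K)

    # Outputs y_i = x_i^T Theta x_i, computed for all samples at once.
    y = np.einsum("ni,ij,nj->n", X, theta, X)
    return X, y, theta
```

Calling `make_synthetic(n=1000)` returns a feature matrix of shape (1000, 200), an output vector of length 1000, and the ground-truth Θ against which a recovered support could be compared. Fixing the seed per run is one way to make the "averaged over 3 independent runs" protocol repeatable.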