Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Robust Nonparametric Regression under Poisoning Attack
Authors: Puning Zhao, Zhiguo Wan
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section we show some numerical experiments. In particular, we show the curve of the growth of mean square error over the attacked sample size q. More numerical results are shown in the full paper (Zhao and Wan 2023). |
| Researcher Affiliation | Industry | Zhejiang Lab, Hangzhou, Zhejiang, China |
| Pseudocode | No | The computational complexity is higher than kernel regression up to a ln(M/ϵ) factor. |
| Open Source Code | No | More numerical results are shown in the full paper (Zhao and Wan 2023)... The detailed implementation and results are shown in Appendix I in the full paper (Zhao and Wan 2023). |
| Open Datasets | No | For each case, we generate N = 10000 training samples, with each sample follows uniform distribution in [0, 1]^d. We have also conducted numerical experiments using real data... The detailed implementation and results are shown in Appendix I in the full paper (Zhao and Wan 2023). |
| Dataset Splits | No | For each case, we generate N = 10000 training samples, with each sample follows uniform distribution in [0, 1]^d. |
| Hardware Specification | No | No specific hardware details for running the experiments are mentioned in the paper. |
| Software Dependencies | No | No specific software dependencies with version numbers are mentioned in the paper. |
| Experiment Setup | Yes | For the initial estimator (5), the parameters are T = 1 and M = 3. The corrected estimator is described in the full paper (Zhao and Wan 2023). For d = 1, the grid count is m = 50. For d = 2, m_1 = m_2 = 20. Consider that the optimal bandwidth (h in (5)) need to increase with the dimension, in (4), the bandwidths of all these four methods are set to be h = 0.03 for one dimensional distribution, and h = 0.1 for two dimensional case. |
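
The quoted setup (N = 10000 uniform samples, bandwidth h = 0.03 for d = 1, mean square error tracked against the attacked sample size q) can be illustrated with a rough sketch. This is not the authors' code, which is not released. It implements only an ordinary Nadaraya-Watson kernel regression baseline with a box kernel, not the robust estimators from the paper. The regression function `true_f`, the noise level, the attack in `poison`, and the reuse of m = 50 as the number of evaluation points are all assumptions made for illustration.

```python
import math
import random


def true_f(x):
    # Hypothetical regression function; the paper's choice is not quoted here.
    return math.sin(2 * math.pi * x)


def make_samples(n, rng, noise_sd=0.1):
    # Covariates uniform on [0, 1], as in the quoted setup (d = 1).
    # The Gaussian noise and its scale are assumptions.
    xs = [rng.random() for _ in range(n)]
    ys = [true_f(x) + rng.gauss(0.0, noise_sd) for x in xs]
    return xs, ys


def poison(ys, q, rng, value=10.0):
    # Hypothetical attack: overwrite q randomly chosen labels with a constant.
    attacked = list(ys)
    for i in rng.sample(range(len(ys)), q):
        attacked[i] = value
    return attacked


def kernel_regression(x0, xs, ys, h):
    # Nadaraya-Watson estimate with a box kernel of bandwidth h:
    # the average of all labels whose covariate lies within h of x0.
    total, count = 0.0, 0
    for x, y in zip(xs, ys):
        if abs(x - x0) <= h:
            total += y
            count += 1
    return total / count if count else 0.0


def grid_mse(xs, ys, h, m):
    # Mean square error against true_f on m evenly spaced evaluation points.
    grid = [(j + 0.5) / m for j in range(m)]
    errors = [(kernel_regression(g, xs, ys, h) - true_f(g)) ** 2 for g in grid]
    return sum(errors) / m


rng = random.Random(0)
N, h, m = 10000, 0.03, 50
xs, ys = make_samples(N, rng)
mse_by_q = {q: grid_mse(xs, poison(ys, q, rng), h, m) for q in (0, 100, 500)}
for q, mse in mse_by_q.items():
    print(f"q = {q:4d}  MSE = {mse:.5f}")
```

Under these assumptions the error of the unprotected baseline grows with q, which is the kind of curve the quoted Research Type entry refers to. Reproducing the paper's actual figures would require the estimators and attack strategies defined in the full paper.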