Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
Gaussian Processes for Shuffled Regression
Authors: Masahiro Kohjima
NeurIPS 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on benchmark datasets confirm the effectiveness of our GPSR proposal. |
| Researcher Affiliation | Industry | Masahiro Kohjima NTT, Inc. EMAIL |
| Pseudocode | Yes | Algorithm 1: Inference Procedure of GPSR; Algorithm 2: Inference of SS-GPSR (BSLR using Random Fourier Features) |
| Open Source Code | No | Answer: [No] Justification: We do not provide the code but provide implementation details of proposed methods and baselines in Section 6 and Appendix G. Also, the experiment is conducted in open-access data (provided in UCI machine learning repository). |
| Open Datasets | Yes | We evaluated GPSR and SS-GPSR using four publicly available data sets found in the UCI machine learning repository: airfoil data (Airfoil), concrete compressive strength data (Concrete), Boston housing data (Housing), auto-MPG data (MPG). |
| Dataset Splits | Yes | We prepared 5 data sets by randomly dividing the data and using 60% for training, 20% for validation, and 20% for testing. |
| Hardware Specification | Yes | Experiments were run on a computer with Apple M1. |
| Software Dependencies | No | Specifically, LR uses sklearn.linear_model.LinearRegression and GPR uses sklearn.gaussian_process.GaussianProcessRegressor with default settings (we adopted Gaussian kernel for GPR). Details of DR are presented in the next paragraph. Neural network-based methods (DR and SDR): As stated in the Experiments section, we used a one-hidden-layer feedforward neural network with the ReLU activation function for neural network-based methods (DR and SDR). Hyperparameters of these methods were set following [12]. ... The above was implemented using PyTorch [46]. |
| Experiment Setup | Yes | The number of units was set to 20 for all problems. The parameters of DR were optimized using Adam [45] with a learning rate of 0.001. The parameters of SDR were optimized by the stochastic sparse EM algorithm using Adam with a learning rate of 0.001. The mini-batch size of DR and that of SDR were 32 and 32/L, respectively. The maximum number of epochs was 2000 in common. ... For GPSR, the maximum number of iterations Tmax was set to 100. The parameter for simulated annealing was configured with Smax = 10 * N * log2(L), an initial temperature of T = 1.0 and a cooling rate γ = 0.99. In the case of SS-GPSR, the maximum number of iterations Tmax was set to 20. The dimension size of RFF H was fixed at 100 for all datasets. |
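
Since no code is released, the split protocol and annealing settings quoted in the table can be illustrated with a small sketch. This is not the author's implementation. The function names (`make_splits`, `annealing_steps`, `temperature`), the seeding scheme, the rounding of split sizes, and the geometric per-step cooling rule are all assumptions made for illustration. Only the 60/20/20 proportions, the 5 repetitions, Smax = 10 * N * log2(L), T = 1.0, and γ = 0.99 come from the quoted text. N and L are treated here as plain positive integers.

```python
import math
import random


def make_splits(n_samples, n_repeats=5, train_frac=0.6, val_frac=0.2, seed=0):
    """Return n_repeats random (train, val, test) index partitions.

    The 60/20/20 fractions and 5 repeats follow the quoted setup;
    the seeding scheme and rounding are assumptions.
    """
    splits = []
    n_train = round(train_frac * n_samples)
    n_val = round(val_frac * n_samples)
    for r in range(n_repeats):
        rng = random.Random(seed + r)  # hypothetical: one seed per repeat
        idx = list(range(n_samples))
        rng.shuffle(idx)
        train = idx[:n_train]
        val = idx[n_train:n_train + n_val]
        test = idx[n_train + n_val:]  # remainder, about 20%
        splits.append((train, val, test))
    return splits


def annealing_steps(n, l):
    """Smax = 10 * N * log2(L), as quoted for GPSR (returned as a float)."""
    return 10 * n * math.log2(l)


def temperature(step, t0=1.0, gamma=0.99):
    """Temperature after `step` cooling updates.

    Assumes geometric cooling T <- gamma * T; the source gives only
    the initial temperature and the cooling rate, not the update rule.
    """
    return t0 * gamma ** step


if __name__ == "__main__":
    for train, val, test in make_splits(100):
        print(len(train), len(val), len(test))
    print(annealing_steps(8, 4), temperature(2))
```

Each partition is drawn independently, so the five test sets may overlap. Whether the original experiments used independent random splits or a fold-style scheme is not stated in the quoted text.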