Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.
How to Learn a Star: Binary Classification with Starshaped Polyhedral Sets
Authors: Marie-Charlotte Brandenburg, Katharina Jochemko
NeurIPS 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conducted small-scale experiments where we tested Algorithm 1, implemented in SageMath 10.5 [The Sage Developers, 2024], on two-dimensional synthetic data. The computations were done on a MacBook Pro equipped with an M2 Pro chip and 32 GB of RAM. For comparison, we also applied several standard binary classification methods leading to convex optimization problems on the same dataset, as well as a ReLU neural network. The computation running time ranged from few seconds to one hour. |
| Researcher Affiliation | Academia | Marie-Charlotte Brandenburg Ruhr Universität Bochum Universitätsstr. 150, 44801 Bochum, Germany EMAIL Katharina Jochemko KTH Royal Institute of Technology 100 44 Stockholm, Sweden EMAIL |
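| Pseudocode | Yes | Algorithm 1: Computation of the maximum likelihood estimator. Input: , $X = \{(x^{(i)}, y^{(i)})\}_{i=1}^{m}$, $\lambda$. Output: $a$. 1: determine $A_X$. 2: solve $\operatorname{argmax}_{a>0} \sum_{i=1}^{m} y^{(i)} \log\left(1 - e^{-\lambda (A_X a)_i}\right) + (1 - y^{(i)})(-\lambda)(A_X a)_i$ |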
| Open Source Code | No | Answer: [No] Justification: The small experiments in this article are not central to the contribution and easily reproducible with the description provided in the article. |
| Open Datasets | No | Figure 3a illustrates 500 data points sampled from a given star-shaped region (in green) defined on eight rays. The data was generated as follows: we randomly selected the x- and y-coordinates of all points from the interval [-1, 1] using a uniform distribution and discarded any resulting points (x, y) lying outside the unit circle. This was done to achieve a near rotational symmetry of the data set. For each remaining point, we then checked whether it lies inside or outside the star-shaped region. The corresponding label was assigned accordingly, with a 90% probability of being correct. |
| Dataset Splits | No | The data was generated as follows: we randomly selected the x- and y-coordinates of all points from the interval [-1, 1] using a uniform distribution and discarded any resulting points (x, y) lying outside the unit circle. This was done to achieve a near rotational symmetry of the data set. For each remaining point, we then checked whether it lies inside or outside the star-shaped region. The corresponding label was assigned accordingly, with a 90% probability of being correct. |
| Hardware Specification | Yes | The computations were done on a MacBook Pro equipped with an M2 Pro chip and 32 GB of RAM. |
| Software Dependencies | Yes | We conducted small-scale experiments where we tested Algorithm 1, implemented in SageMath 10.5 [The Sage Developers, 2024], on two-dimensional synthetic data. |
| Experiment Setup | Yes | Running Algorithm 1 on the synthetic data set, the optimal value of the regularization parameter was found to be approximately λ = 0.83, yielding an accuracy of 0.852. The resulting optimal star classifier is shown in Figure 10a. For comparison, we also tested standard implementations of SVMs (with linear, polynomial, RBF, and sigmoid kernels), logistic regression, and a ReLU neural network with two hidden layers of sizes 5 and 2, respectively. |
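
Since neither code nor data is released, the data generation quoted in the table can only be approximated. The following is a minimal Python sketch of that procedure, not the authors' SageMath implementation. The radii in `RADII`, the equal spacing of the eight rays, the label convention (1 = inside), the seed, and the function names `in_star` and `generate` are all assumptions made for illustration; the paper's actual star-shaped region is not specified in the text above. It is also unclear from the quoted text whether 500 counts the points before or after discarding, so the sketch takes the number of candidate points as a parameter.

```python
import math
import random

# Hypothetical radii along eight equally spaced rays (NOT taken from the paper).
RADII = [0.9, 0.3, 0.8, 0.35, 0.9, 0.3, 0.8, 0.35]


def in_star(x, y, radii=RADII):
    """Return True if (x, y) lies in the star-shaped polygon spanned by the rays."""
    k = len(radii)
    step = 2 * math.pi / k
    theta = math.atan2(y, x) % (2 * math.pi)
    j = int(theta // step) % k  # index of the cone containing (x, y)
    # Star vertices on the two rays bounding that cone.
    a0, a1 = j * step, (j + 1) * step
    r0, r1 = radii[j], radii[(j + 1) % k]
    v0 = (r0 * math.cos(a0), r0 * math.sin(a0))
    v1 = (r1 * math.cos(a1), r1 * math.sin(a1))
    # Solve (x, y) = s*v0 + t*v1 by Cramer's rule; the point lies in the
    # triangle (origin, v0, v1) exactly when s + t <= 1.
    det = v0[0] * v1[1] - v0[1] * v1[0]
    s = (x * v1[1] - y * v1[0]) / det
    t = (v0[0] * y - v0[1] * x) / det
    return s + t <= 1.0


def generate(n_candidates, p_correct=0.9, seed=0):
    """Sample points uniformly in [-1, 1]^2, keep those in the unit disc,
    and label them (1 = inside the star), correct with probability p_correct."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_candidates):
        x, y = rng.uniform(-1, 1), rng.uniform(-1, 1)
        if x * x + y * y > 1:
            continue  # discard points outside the unit circle
        label = 1 if in_star(x, y) else 0
        if rng.random() >= p_correct:
            label = 1 - label  # flip the label with probability 1 - p_correct
        data.append((x, y, label))
    return data
```

Calling `generate(500)` returns a list of `(x, y, label)` tuples; roughly a fraction π/4 of the candidates survive the unit-circle filter, so fewer than 500 points are returned.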