Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Convex Relaxation for Solving Large-Margin Classifiers in Hyperbolic Space
Authors: Sheng Yang, Peihan Liu, Cengiz Pehlevan
TMLR 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | From extensive empirical experiments, these methods are shown to achieve better classification accuracies than the projected gradient descent approach in most of the synthetic and real two-dimensional hyperbolic embedding datasets under the one-vs-rest multiclass-classification scheme. |
| Researcher Affiliation | Academia | Sheng Yang (EMAIL), John A. Paulson School of Engineering and Applied Sciences, Harvard University; Peihan Liu (EMAIL), John A. Paulson School of Engineering and Applied Sciences, Harvard University; Cengiz Pehlevan (EMAIL), John A. Paulson School of Engineering and Applied Sciences, Center for Brain Science, Kempner Institute for the Study of Natural and Artificial Intelligence, Harvard University |
| Pseudocode | No | The paper provides detailed mathematical formulations and transformations of the problem (e.g., Equation (7), (8), (13), (14)), but it does not include any clearly labeled pseudocode or algorithm blocks with structured steps. |
| Open Source Code | Yes | The code for our implementations is at https://github.com/yangshengaa/hsvm-relax. |
| Open Datasets | Yes | Regarding real datasets, our experiments include two machine learning benchmark datasets, CIFAR-10 (Krizhevsky et al., 2009) and Fashion-MNIST (Xiao et al., 2017), with their hyperbolic embeddings obtained through standard hyperbolic embedding procedures (Chien et al., 2021; Khrulkov et al., 2020; Klimovskaia et al., 2020) to assess image classification performance. Additionally, we incorporate three graph embedding datasets, football, karate, and polbooks, obtained from Chien et al. (2021), to evaluate the effectiveness of our methods on graph-structured data. We also explore cell embedding datasets, including the Paul Myeloid Progenitors developmental dataset (Paul et al., 2015), the Olsson Single-Cell RNA sequencing dataset (Olsson et al., 2016), the Krumsiek Simulated Myeloid Progenitors dataset (Krumsiek et al., 2011), and the Moignard blood cell developmental trace dataset from single-cell gene expression (Moignard et al., 2015). |
| Dataset Splits | Yes | The primary metrics for assessing model performance are average training and testing loss, accuracy, and weighted F1 score under a stratified 5-fold train-test split scheme. |
| Hardware Specification | Yes | All experiments are run and timed on a machine with 8 Intel Broadwell/Ice Lake CPUs and 40GB of memory. |
| Software Dependencies | Yes | We use MOSEK (ApS, 2022) in Python as our optimization solver without any intermediate parser...Our Python code also uses some common publicly available packages, including NumPy (Harris et al., 2020)... Matplotlib (Hunter, 2007)... Pandas (McKinney et al., 2010) under a BSD license, scikit-learn (Pedregosa et al., 2011) |
| Experiment Setup | Yes | The PGD implementation follows from adapting the MATLAB code in Cho et al. (2019), with learning rate 0.001 and 2000 epochs for synthetic and 4000 epochs for real datasets, warm-started with a Euclidean SVM solution... We first report performances of three models using the one-vs-rest training scheme, described in Appendix D, in Tables 6 to 8 for C ∈ {0.1, 1.0, 10} respectively |
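The evaluation protocol above reports metrics under a stratified 5-fold train-test split. A minimal sketch of such a split is shown below; this is an illustrative stdlib-only implementation, not the authors' code (their pipeline relies on scikit-learn, per the dependencies row, where `sklearn.model_selection.StratifiedKFold` provides the same behavior).

```python
# Hedged sketch: stratified k-fold splitting, where each test fold
# preserves the class proportions of the full label set.
# Not the authors' implementation; a stand-in for scikit-learn's StratifiedKFold.
from collections import defaultdict


def stratified_kfold(labels, k=5):
    """Yield (train_idx, test_idx) pairs over range(len(labels))."""
    # Group sample indices by class label.
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    # Deal each class's indices round-robin across the k folds,
    # so every fold keeps roughly the same class balance.
    folds = [[] for _ in range(k)]
    for idxs in by_class.values():
        for j, i in enumerate(idxs):
            folds[j % k].append(i)
    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, test
```

With 10 samples of each of two classes, every test fold contains exactly two samples per class, matching the stratification requirement.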