Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Diffeomorphic Learning

Authors: Laurent Younes

JMLR 2020

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We present diverse applications, mostly with synthetic examples, demonstrating the potential of the approach, as well as some insight on how it can be improved. Keywords: Diffeomorphisms, Reproducing kernel Hilbert Spaces, Classification... We now provide a few experiments that illustrate some of the advantages of the proposed diffeomorphic learning method, and some of its limitations as well.
Researcher Affiliation	Academia	Laurent Younes EMAIL Department of Applied Mathematics and Statistics and Center for Imaging Science Johns Hopkins University 3400 N. Charles St. Baltimore, MD 21209
Pseudocode	No	The paper describes algorithms and methods verbally and mathematically but does not include any explicit pseudocode blocks or figures labeled as such.
Open Source Code	No	The paper does not provide an explicit statement about open-sourcing the code or a link to a code repository.
Open Datasets	Yes	To conclude this section we provide (in Table 9) classification results on a subset of the MNIST digit recognition dataset (Le Cun, 1998), with 10 classes and 100 examples per class for training.
Dataset Splits	Yes	The classification rates that were reported were evaluated on a test set containing 2,000 examples per class (except for MNIST, for which we used the test set available with this data). We used the scikit-learn Python package (Pedregosa et al., 2011) with the following parameters (most being default in scikit-learn).
Hardware Specification	Yes	With our implementation (on a four-core Intel i7 laptop), we were able to handle up to 2,000 training samples in 100 dimensions with 10 time steps, which required about one day for 2,000 gradient iterations.
Software Dependencies	No	We used the scikit-learn Python package (Pedregosa et al., 2011) with the following parameters (most being default in scikit-learn). The paper mentions a software package (scikit-learn) but does not provide a specific version number.
Experiment Setup	Yes	We used the scikit-learn Python package (Pedregosa et al., 2011) with the following parameters (most being default in scikit-learn). Linear SVM: ℓ2 penalty with default weight C = 1, with one-vs-all multi-class strategy when relevant. Kernel SVM: ℓ2 penalty with weight C estimated as described in section 7. Gaussian (i.e., RBF) kernel with coefficient γ identical to that used for the kernel in diffeomorphic learning. Random forests: 100 trees, with Gini entropy splitting rule, with the default choice (d) of number of features at each node. k-nearest neighbors: with the standard Euclidean metric and the default (five) number of neighbors. Multi-layer perceptrons: with ReLU activations, ADAM solver, constant learning rate and 10,000 maximal iterations, using 1, 2 or 5 hidden layers each composed of 100 units. Logistic regression: ℓ2 penalty with weight C = 1.
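
The baseline settings quoted in the Experiment Setup row can be sketched in scikit-learn roughly as follows. This is an illustrative reconstruction, not the author's code: the function name `build_baselines`, its arguments `gamma` and `kernel_svm_C`, and the dictionary keys are hypothetical. The paper estimates C for the kernel SVM and takes γ from the diffeomorphic learning kernel, so both are left as inputs here. Because no scikit-learn version is reported, any setting left at its default may differ from what the paper actually ran.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC, LinearSVC


def build_baselines(gamma, kernel_svm_C):
    """Return unfitted baseline classifiers keyed by a (hypothetical) name.

    gamma        -- RBF coefficient; the paper reuses the value from the
                    diffeomorphic learning kernel (not given in this summary).
    kernel_svm_C -- penalty weight for the kernel SVM; the paper estimates it
                    (section 7 of the paper), so it is an input here.
    """
    models = {
        # Linear SVM: the defaults are an l2 penalty and one-vs-rest; C = 1.
        "linear_svm": LinearSVC(C=1.0),
        # Kernel SVM with a Gaussian (RBF) kernel.
        "kernel_svm": SVC(kernel="rbf", C=kernel_svm_C, gamma=gamma),
        # Random forest: 100 trees, Gini criterion, default max_features.
        "random_forest": RandomForestClassifier(n_estimators=100, criterion="gini"),
        # k-nearest neighbors: Euclidean metric, five neighbors.
        "knn": KNeighborsClassifier(n_neighbors=5, metric="euclidean"),
        # Logistic regression: the default penalty is l2; C = 1.
        "logistic_regression": LogisticRegression(C=1.0),
    }
    # Multi-layer perceptrons with 1, 2 or 5 hidden layers of 100 units each.
    for n_layers in (1, 2, 5):
        models[f"mlp_{n_layers}_layers"] = MLPClassifier(
            hidden_layer_sizes=(100,) * n_layers,
            activation="relu",
            solver="adam",
            learning_rate="constant",
            max_iter=10_000,
        )
    return models
```

Calling `build_baselines(gamma=0.5, kernel_svm_C=1.0)` (placeholder values, not taken from the paper) returns eight unfitted estimators, each of which can then be trained with its usual `fit(X_train, y_train)` method.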