Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

FineMorphs: Affine-Diffeomorphic Sequences for Regression

Authors: Michele Lohr, Laurent Younes

JMLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on real data sets from the UCI repository are presented, with favorable results in comparison with state-of-the-art in the literature, neural ordinary differential equation models, and densely-connected neural networks in TensorFlow.
Researcher Affiliation | Academia | Michele Lohr EMAIL Laurent Younes EMAIL Department of Applied Mathematics and Statistics, Center for Imaging Science, The Johns Hopkins University, Baltimore, MD 21218-2683, USA
Pseudocode | No | The paper describes mathematical derivations and conditions for optimality, but it does not contain any clearly labeled pseudocode or algorithm blocks. The implementation section describes the overall approach but not in pseudocode format.
Open Source Code | Yes | The FineMorphs code and data sets used in the experiments are available at https://github.com/diffeomorphic-learning/finemorphs.
Open Datasets | Yes | We test our diffeomorphic regression models on real data sets from the UCI repository (Dua and Graff, 2017), with favorable results in comparison with the literature and with neural ODEs (NODEs) (Chen et al., 2018; Dupont et al., 2019) and densely-connected NNs (DNNs) in TensorFlow (Abadi et al., 2015).
Dataset Splits | Yes | For the standard splits, 20 randomized train-test splits (90% train, 10% test) of each data set are provided, with the exception of the larger Protein (5 splits) and Year (1 split) data sets. ... For the gap splits, d_X train-test splits of each data set are provided, each split corresponding to one of the d_X dimensions of that data set. These splits are generated by creating gaps in the training data, by first sorting the data set in increasing order in the dimension of interest, then assigning the outer two-thirds to the training set and the middle third to the test set (Foong et al., 2019).
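The gap-split procedure quoted above (sort on one input dimension, train on the outer two-thirds, test on the middle third) can be sketched as follows. `gap_split` is a hypothetical helper, not the authors' released code, and the handling of set sizes not divisible by three may differ from their implementation.

```python
import numpy as np

def gap_split(X, y, dim):
    """Gap train-test split along input dimension `dim` (Foong et al., 2019):
    sort the data by that dimension, assign the outer two-thirds to the
    training set and the middle third to the test set."""
    order = np.argsort(X[:, dim])          # indices sorted by the chosen dimension
    n = len(order)
    third = n // 3
    test_idx = order[third : n - third]    # middle third becomes the test set
    train_idx = np.concatenate([order[:third], order[n - third:]])  # outer thirds
    return X[train_idx], y[train_idx], X[test_idx], y[test_idx]
```

Because the test points lie in a region of input space the model never sees during training, these splits probe extrapolation rather than interpolation, which is why one split is generated per input dimension.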
Hardware Specification | Yes | Kernel computations are performed using PyKeOps with an NVIDIA RTX A5000 GPU with CUDA 12.1. The NODE and DNN experiments are executed on an NVIDIA GeForce GTX 1050 GPU with CUDA 10.1.
Software Dependencies | Yes | The model is implemented in Python using an Euler discretization approach... Kernel computations are performed using PyKeOps with an NVIDIA RTX A5000 GPU with CUDA 12.1. The DNN models, implemented in TensorFlow...
Experiment Setup | Yes | We set λ = 1 and assign dimensions s = 1, r = 0, and d_1 = d_X + s... For module D_1, we set T_1 = 10... The optimization algorithm is the limited-memory Broyden-Fletcher-Goldfarb-Shanno algorithm (L-BFGS) with Wolfe conditions on the line search. ... Parameters for the NODE experiments follow those used in Dupont et al. (2019), including hidden layer size of 32, learning rate of 0.001, batch size of 256, and augmented dimension of 5 for ANODE. ... In TensorFlow, we use the Adam optimizer (Kingma and Ba, 2014), MSE loss, and 400 training epochs. Default values are assumed for all other TensorFlow parameters, including learning rate of 0.001, batch size of 32...
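The quoted setup pairs a regularization weight λ with L-BFGS optimization under Wolfe line-search conditions. As a minimal illustration of that optimizer choice only, not of the FineMorphs model itself, the hypothetical `fit_ridge_lbfgs` below minimizes a λ-regularized least-squares objective with SciPy's L-BFGS-B routine, whose built-in line search enforces Wolfe-type conditions.

```python
import numpy as np
from scipy.optimize import minimize

def fit_ridge_lbfgs(X, y, lam=1.0):
    """Minimize 0.5*||Xw - y||^2 + 0.5*lam*||w||^2 with L-BFGS.
    `lam` plays the role of the regularization weight λ in the setup above."""
    d = X.shape[1]

    def objective(w):
        r = X @ w - y
        return 0.5 * r @ r + 0.5 * lam * w @ w

    def grad(w):
        # Analytic gradient: X^T (Xw - y) + lam * w
        return X.T @ (X @ w - y) + lam * w

    res = minimize(objective, np.zeros(d), jac=grad, method="L-BFGS-B")
    return res.x
```

Supplying the analytic gradient via `jac` matters here: L-BFGS builds its curvature approximation from gradient differences, so exact gradients give faster, more stable convergence than finite differences.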