Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Riemannian Consistency Model

Authors: Chaoran Cheng, Yusong Wang, Yuxin Chen, Xiangxin Zhou, Nanning Zheng, Ge Liu

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Through extensive experiments, we manifest the superior generative quality of RCM in few-step generation on various non-Euclidean manifolds, including flat-tori, spheres, and the 3D rotation group SO(3), spanning a variety of crucial real-world applications such as RNA and protein generation. ... Section 4 Experiments: To demonstrate the effectiveness of the RCM framework on Riemannian manifolds, we carry out extensive experiments on various non-Euclidean settings.
Researcher Affiliation | Academia | (1) University of Illinois Urbana-Champaign, (2) Xi'an Jiaotong University, (3) University of Chinese Academy of Sciences
Pseudocode | Yes | Algorithm 1: Simplified Riemannian Consistency Distillation (sRCD); Algorithm 2: Simplified Riemannian Consistency Training (sRCT)
Open Source Code | No | Question: Does the paper provide open access to the data and code, with sufficient instructions to faithfully reproduce the main experimental results, as described in supplemental material? Answer: [Yes] Justification: All datasets used in our paper are publicly available. We will also publicize our code once our paper gets accepted.
Open Datasets | Yes | For our experiments on spherical manifolds, we utilize the real-world data comprising four distinct earth location datasets: volcanic eruptions [40], earthquakes [39], floods [4], and wildfires [46], collected by [36]. ... We evaluate our RCM framework on flat tori using a synthetic checkerboard dataset as well as pre-processed protein [32] and RNA [38] datasets, whose torsion angles can be represented on the 2D and 7D tori, respectively.
Dataset Splits | Yes | For the checkerboard data, we randomly generated 100k sampling points for training. The protein dataset contains 166,305 samples, and the RNA dataset contains 9,473 samples. ... For each dataset, 100k samples are generated for training the Riemannian flow matching and consistency models. During evaluation, 10k rotations are sampled for each model for MMD calculation. ... 5k points are sampled as the training dataset, and the same number of samples is generated for parameter estimation.
Hardware Specification | Yes | All experiments were carried out on a single A100.
Software Dependencies | No | For the estimation of kernel density on the 2-sphere, we use the off-the-shelf implementation from Scikit-Learn with the haversine distance and the von Mises-Fisher kernel to match the spherical manifold. ... No specific version numbers for software dependencies are provided.
Experiment Setup | Yes | In all our experiments, we use 4 blocks, each with dimensions [256, 512, 512, 256], and the dimension of time embedding is 256. For flow matching, we use a learning rate of 10^-3, while for the consistency model, we use a learning rate of 10^-4 with the Adam optimizer. The batch size varies depending on the dataset size and is typically 512 or 1024. We do not use dropout, nor do we employ other tricks such as learning rate decay. We train our model using a total of 50 million data samples.
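
The Experiment Setup entry can be collected into a small configuration object, which also makes the implied training length easy to check. The following is a minimal sketch, not the authors' code: the class name, field names, and the `optimizer_steps` helper are hypothetical, the learning rates are taken to be 1e-3 and 1e-4, and "50 million data samples" is assumed to mean the total number of samples consumed across all batches.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentSetup:
    """Hyperparameters quoted in the Experiment Setup entry (field names are illustrative)."""
    num_blocks: int = 4
    block_dims: tuple = (256, 512, 512, 256)
    time_embedding_dim: int = 256
    flow_matching_lr: float = 1e-3       # assumed reading of the quoted learning rate
    consistency_lr: float = 1e-4         # assumed reading of the quoted learning rate
    optimizer: str = "Adam"
    dropout: float = 0.0                 # "We do not use dropout"
    lr_decay: bool = False               # no learning rate decay
    total_training_samples: int = 50_000_000


def optimizer_steps(total_samples: int, batch_size: int) -> int:
    """Number of optimizer steps needed to consume total_samples at a given batch size."""
    return math.ceil(total_samples / batch_size)


setup = ExperimentSetup()

# The report gives 512 and 1024 as typical batch sizes.
steps_by_batch = {
    batch_size: optimizer_steps(setup.total_training_samples, batch_size)
    for batch_size in (512, 1024)
}
print(steps_by_batch)
```

Under these assumptions, a run corresponds to roughly 98k optimizer steps at batch size 512 and roughly 49k at batch size 1024, all on the single A100 listed under Hardware Specification.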