Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Robust Point Matching with Distance Profiles

Authors: YoonHaeng Hur, Yuehaw Khoo

JMLR 2025 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate the performance of the proposed methods using simulations and real data applications to complement the theoretical findings.
Researcher Affiliation | Academia | Yoon Haeng Hur (EMAIL), Department of Statistics, University of Chicago, Chicago, IL 60637, USA; Yuehaw Khoo (EMAIL), Department of Statistics, University of Chicago, Chicago, IL 60637, USA
Pseudocode | Yes | Algorithm 1: Distance Profile Matching; Algorithm 2: Distance Profile Matching by Assignment
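The paper states both procedures only as pseudocode. As a rough illustration of the ingredient they share, a distance profile is commonly taken to be the sorted vector of a point's distances to every other point in its cloud; this construction is an assumption here, not a definition quoted from the paper:

```python
import numpy as np

def distance_profiles(points: np.ndarray) -> np.ndarray:
    """Sorted distances from each point to all other points.

    points: (n, d) array; returns an (n, n-1) array whose i-th row is
    the ascending distance profile of point i.
    """
    # Pairwise Euclidean distance matrix.
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    n = len(points)
    # Drop the zero diagonal, then sort each row.
    off_diag = dists[~np.eye(n, dtype=bool)].reshape(n, n - 1)
    return np.sort(off_diag, axis=1)
```

Because rigid transformations preserve pairwise distances, congruent point clouds have identical profiles up to a row permutation, which is what makes profiles usable as matching signatures.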
Open Source Code | No | The paper does not explicitly provide any links to open-source code repositories or make a clear statement about releasing the code for the described methodology.
Open Datasets | Yes | We apply it to the shape correspondence using CAPOD data (Papadakis, 2014). The cryo-EM data of human γ-secretase (Bai et al., 2015) essentially represents a probability density function of the intensities of the object, stored as voxels on a 3D grid. Figures 6(b)-(d) show the projections of the cryo-EM data along the x, y, and z axes, respectively, which clearly demonstrate high noise levels. Alignment problems in cryo-EM are often tackled by using the density functions (Singer and Yang, 2024; Harpaz and Shkolnisky, 2023) with suitable transformations or by obtaining the point clouds from the density functions (Riahi et al., 2023, 2025), say, utilizing the topology representing network algorithm (Martinetz and Schulten, 1994). The proposed algorithm can be applied to the latter setting. Here, we sample two point clouds from the cryo-EM data, one from the original density function and the other from the density function after applying a rigid transformation; see Figure 7(a). The goal is to recover the rigid transformation from the point clouds. The proposed partial matching Algorithm 1 provides correspondences between the points. Once we have the correspondences, we can estimate the rigid transformation using the standard orthogonal Procrustes problem after centering between matrices X, Y ∈ R^{n×d} whose rows are the point clouds X1, ..., Xn and Yπ(1), ..., Yπ(n), respectively, where π is the output of Algorithm 1. Figure 7(b) shows the aligned point clouds, which seem reasonable. However, for better results, it is beneficial to subset the good matches by using the threshold ρ in Algorithm 1. The crux is that we only need to partially match the points for alignment, and focusing on good matches can improve the alignment results. Figures 7(c) and 7(d) apply Algorithm 1 with ρ being the 0.5 and 0.1 quantiles of W(1, π(1)), ..., W(n, π(n)) to use the best 50% and 10% of the correspondences, respectively.
In these cases, we obtain the set of indices I from the output of Algorithm 1. The data is listed as 5A63 at the Protein Data Bank (PDB), https://www.rcsb.org/structure/5A63, and as EMD-3061 at the Electron Microscopy Data Bank (EMDB), https://www.ebi.ac.uk/emdb/EMD-3061. We repeat the same process for the cryo-EM data of the voltage-gated calcium channel (Cav1.1) of rabbit (Wu et al., 2016), which plays a role as the sensor for excitation-contraction coupling of skeletal muscles.
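The alignment step quoted above, estimating a rigid transformation from matched pairs via the orthogonal Procrustes problem after centering, can be sketched with the standard SVD solution. This is a minimal numpy sketch, not the authors' code; X and Y are assumed to already hold the matched rows X1, ..., Xn and Yπ(1), ..., Yπ(n):

```python
import numpy as np

def procrustes_rigid(X: np.ndarray, Y: np.ndarray):
    """Estimate a rotation R and translation t with Y ~ X @ R.T + t.

    X, Y: (n, d) arrays of matched points (row i of Y is the match of row i of X).
    """
    mu_x, mu_y = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mu_x, Y - mu_y
    # SVD solution to the orthogonal Procrustes problem on the centered clouds.
    U, _, Vt = np.linalg.svd(Yc.T @ Xc)
    # Correct the sign so R is a proper rotation (det = +1), not a reflection.
    S = np.eye(X.shape[1])
    S[-1, -1] = np.sign(np.linalg.det(U @ Vt))
    R = U @ S @ Vt
    t = mu_y - mu_x @ R.T
    return R, t
```

To use only the best matches, as in Figures 7(c) and 7(d), one would restrict the rows of X and Y to the indices whose matching cost falls below the chosen quantile of W(1, π(1)), ..., W(n, π(n)) before calling this function.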
Dataset Splits | No | The paper does not provide specific details on training, validation, or test dataset splits for any of the datasets used in the experiments. It describes how point clouds are sampled or used but does not specify data partitioning.
Hardware Specification | No | The paper does not explicitly provide any specific hardware details (e.g., CPU, GPU models, memory) used for running its experiments or simulations.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., libraries, frameworks, or solvers) used in the implementation of the proposed methods or for running experiments.
Experiment Setup | Yes | Figure 2(a) shows the probability of the perfect matching, namely, the left-hand side of (11). Here, d = 3 and K = 10, and we consider the clean case t = s = K and the case with one outlier component t = s = K + 1 for the sample size n = m ∈ {500, 1000}; also, we let pk = qk = 1/t and consider Gaussian mixtures, namely, μk = N(θk, σ²Id) and νk = N(ηk, σ²Id), where we vary σ. We can observe that the probability of the perfect matching decreases as σ grows for both cases. Next, we provide simulation results demonstrating the noise stability discussed in Section 3.3. We first obtain n locations θ1, ..., θn ∈ R^d and define the other locations η1, ..., ηn as in (15) using some rigid transformation T and a permutation π*. Then, we generate the points according to (16) and obtain π̂ by applying Algorithm 2; we estimate P(π̂ = π*) by repeating this 100 times. Figure 3 shows the results, where the noise variables ξi's and ζi's follow N(0, σ²Id), and we vary σ and compare with the probability P(π̂OT = π*) for π̂OT obtained by the procedure (18). As shown in Figure 3(a), when the rigid transformation is based on a rotation matrix that only affects the first two coordinates, both methods can recover the true permutation to some extent for small enough σ; in this case, as the rotation matrix is relatively close to the identity matrix, the standard optimal transport method (18) shows good performance. For a rotation matrix that changes all the coordinates, however, the standard optimal transport method fails to recover the true permutation as shown in Figure 3(b), while the distance profile matching method performs almost as in Figure 3(a). Lastly, we briefly comment that the plots of Figure 2(a) and Figure 3 suggest the existence of a phase transition in the noise level, where the performance of the matching methods might change dramatically under certain settings. We leave this for future work.
We apply k-means clustering to both point clouds with k = 7 to cluster them based on the body parts.
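A toy version of the quoted simulation can illustrate matching by assignment. The sketch below compares sorted distance profiles under an ℓ1 cost and solves a linear assignment problem with scipy; both the profile construction and the ℓ1 cost are assumptions standing in for the paper's Algorithm 2 and its cost W, not the authors' implementation:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_by_profiles(X: np.ndarray, Y: np.ndarray) -> np.ndarray:
    """Match rows of X to rows of Y; returns perm with Y[perm[i]] matched to X[i]."""
    def profiles(P):
        # Sorted distances from each point to all other points in its cloud.
        D = np.linalg.norm(P[:, None, :] - P[None, :, :], axis=-1)
        n = len(P)
        return np.sort(D[~np.eye(n, dtype=bool)].reshape(n, n - 1), axis=1)

    PX, PY = profiles(X), profiles(Y)
    # Stand-in cost between points i and j: l1 distance of their profiles.
    cost = np.abs(PX[:, None, :] - PY[None, :, :]).sum(axis=-1)
    _, perm = linear_sum_assignment(cost)
    return perm
```

Since rigid transformations preserve pairwise distances, a noiseless run recovers the true correspondence exactly; repeating the run with Gaussian noise added to both clouds, as in the quoted setup, gives an estimate of the probability of perfect recovery.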