Adjust Pearson's $r$ to Measure Arbitrary Monotone Dependence

Authors: Xinbo Ai

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Simulation experiments and real-life investigations show that the rearrangement correlation is more accurate in measuring nonlinear monotone dependence than the three classical correlation coefficients and other recently proposed dependence measures.
Researcher Affiliation | Academia | Xinbo Ai, School of Intelligent Engineering and Automation, Beijing University of Posts and Telecommunications, Beijing 100876, China. axb@bupt.edu.cn
Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks.
Open Source Code | Yes | The simulation procedure is implemented in the recor R package, which is available in the supplemental materials as recor_1.0.2.tar.gz. For the latest version, please visit https://github.com/byaxb/recor.
Open Datasets | Yes | In addition to simulated scenarios, we also investigate the performance of these measures on real-life scenarios provided by NIST (National Institute of Standards and Technology, 2003). Data and details about these scenarios are publicly available at: https://www.itl.nist.gov/div898/strd/nls/nls_main.shtml
Dataset Splits | No | The paper describes generating data for simulations and using real-life scenarios, but it does not specify training, validation, or test splits or percentages, nor any split methodology such as k-fold cross-validation.
Hardware Specification | Yes | Hardware environment configuration for this study was: Dell OptiPlex 7070 Tower, equipped with an 8-core Core i7-9700 CPU @ 3.00 GHz and 24 GB of DDR4 2666 MHz RAM.
Software Dependencies | No | All the experiments are implemented with the R language (R Core Team, 2024), along with several add-on packages. The paper lists the specific R functions used (e.g., recor::loose_pearson(), stats::cor(), dHSIC::dhsic(), energy::dcor(), minerva::mine_stat(), XICOR::calculateXI()), but does not specify package version numbers.
Experiment Setup | Yes | For each scenario $y = f(x)$, we generated 512 pairs of $(x, y)$ from the regression model $y = f(x) + \varepsilon$, and computed the values of different measures between $x$ and $y$ at different $R$ levels. In the regression model, the $x$ sample is uniformly distributed on the unit interval $(0, 1)$, and the noise is normally distributed as $\varepsilon \sim N(0, \sigma)$, with $\sigma$ controlling $R$ to a certain level. For the sake of robustness, the computation process is repeated 10 times for each measure at each $R$ level, and the mean value is adopted.
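
The experiment setup above can be illustrated with a minimal sketch. This is not the paper's implementation (that lives in the recor R package); it is a standard-library Python approximation under stated assumptions. The function names `pearson_r` and `simulate`, the `seed` parameter, and the choice of Pearson's r as the only measure are all hypothetical choices made for illustration. The sketch also treats σ as the noise standard deviation and takes it as a direct input, because the mapping from a target R level to σ is not given in the text above.

```python
import math
import random
import statistics


def pearson_r(xs, ys):
    """Plain Pearson correlation, used here only as a stand-in measure."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate(f, sigma, measure=pearson_r, n=512, repeats=10, seed=0):
    """Mean of `measure` over `repeats` samples of y = f(x) + eps.

    Assumptions (not from the source): sigma is the noise standard
    deviation, and it is supplied directly instead of being derived
    from a target R level.
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    values = []
    for _ in range(repeats):
        # x is drawn uniformly from the unit interval
        xs = [rng.random() for _ in range(n)]
        # additive Gaussian noise around the scenario function f
        ys = [f(x) + rng.gauss(0.0, sigma) for x in xs]
        values.append(measure(xs, ys))
    return statistics.fmean(values)


if __name__ == "__main__":
    # A noise-free linear scenario and a noise-free nonlinear monotone one.
    print(simulate(lambda x: x, sigma=0.0))
    print(simulate(lambda x: math.exp(5 * x), sigma=0.0))
```

With no noise, the linear scenario yields a Pearson value of essentially 1, while the exponential scenario yields a noticeably smaller value even though y is a strictly increasing function of x. That gap is the kind of underestimation of nonlinear monotone dependence the summary refers to; any other measure can be swapped in through the `measure` argument.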