reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Marginal Distance and Hilbert-Schmidt Covariances-Based Independence Tests for Multivariate Functional Data

Authors: Mirosław Krzyśko, Łukasz Smaga, Piotr Kokoszka

JAIR 2022

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In extensive simulation studies and examples based on a real economic data set, we investigate and compare the performance of the tests in terms of size control and power. An important finding is that tests based on distance and Hilbert-Schmidt covariances are usually more powerful than their marginal versions under linear dependence, while the reverse is true under non-linear dependence.
Researcher Affiliation	Academia	Mirosław Krzyśko EMAIL Interfaculty Institute of Mathematics and Statistics, Calisia University-Kalisz, Nowy Świat 4, 62-800 Kalisz, Poland; Łukasz Smaga (corresponding author) EMAIL Faculty of Mathematics and Computer Science, Adam Mickiewicz University, Uniwersytetu Poznańskiego 4, 61-614 Poznań, Poland; Piotr Kokoszka EMAIL Department of Statistics, Colorado State University, Fort Collins, CO 80523, USA
Pseudocode	No	The paper describes mathematical formulations and theoretical justifications for the test procedures, along with simulation studies and real data examples. However, it does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks, nor does it present structured steps in a code-like format.
Open Source Code	No	All numerical experiments in this paper were performed in the R program (R Core Team, 2020). The code is available from the authors upon request.
Open Datasets	Yes	The data set was constructed based on the annual reports of the World Economic Forum (WEF) (http://www.weforum.org). The countries are listed in Table 3 of Górecki et al. (2020).
Dataset Splits	No	For the simulation studies, the paper describes how the data were generated (e.g., 'n = 20 observations', 'dimensions p = p1 = p2 ∈ {3, 6}'). For the real data example, it mentions using '38 European countries in the period 2008-2015'. However, there is no explicit mention of splitting these datasets into training, validation, or test sets for model evaluation or reproduction, nor are there details on cross-validation folds or random seeds for splitting.
Hardware Specification	No	A part of calculations for simulation study and real data example was made at the Poznań Supercomputing and Networking Center.
Software Dependencies	Yes	All numerical experiments in this paper were performed in the R program (R Core Team, 2020).
Experiment Setup	Yes	In Model 1 below, we used the B-spline basis as the basis representation of the functional data, since the Fourier basis was applied to generate simulation data. On the other hand, in Models 2 and 3, we considered both Fourier and B-spline bases. For simplicity, all numbers of basis functions Bij were set equal to five. The coefficients of the basis representation were estimated by the least squares method.
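
The Experiment Setup entry mentions representing each functional observation with five basis functions and estimating the coefficients by least squares. The following is a minimal sketch of that idea only, not the authors' code: the paper's experiments were run in R and the code is available only on request. It is written in Python with NumPy, and the helper names (`fourier_basis`, `fit_coefficients`), the interval [0, 1], the 50-point grid, the noise level, and the "true" coefficients are all assumptions chosen for illustration.

```python
import numpy as np


def fourier_basis(t, n_basis=5):
    """Evaluate a Fourier basis on [0, 1] at the points t.

    Hypothetical helper: returns an array of shape (len(t), n_basis) whose
    columns are 1, sqrt(2) sin(2 pi t), sqrt(2) cos(2 pi t), ...
    """
    t = np.asarray(t, dtype=float)
    cols = [np.ones_like(t)]
    k = 1
    while len(cols) < n_basis:
        cols.append(np.sqrt(2.0) * np.sin(2.0 * np.pi * k * t))
        if len(cols) < n_basis:
            cols.append(np.sqrt(2.0) * np.cos(2.0 * np.pi * k * t))
        k += 1
    return np.column_stack(cols)


def fit_coefficients(t, y, n_basis=5):
    """Least-squares estimate of the basis coefficients of one curve."""
    design = fourier_basis(t, n_basis)
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


# Illustrative data (assumed, not from the paper): one noisy curve
# observed at 50 equally spaced points.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 50)
true_coef = np.array([1.0, 0.5, -0.3, 0.2, 0.1])
y = fourier_basis(t) @ true_coef + rng.normal(scale=0.05, size=t.size)

est_coef = fit_coefficients(t, y)
print(np.round(est_coef, 2))
```

In a multivariate setting, the same fit would be repeated for each component of each observation, and the resulting coefficient vectors would then feed the independence tests. A B-spline basis could replace the Fourier basis by changing only the design matrix.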