Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Marginal Distance and Hilbert-Schmidt Covariances-Based Independence Tests for Multivariate Functional Data
Authors: Mirosław Krzyśko, Łukasz Smaga, Piotr Kokoszka
JAIR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In extensive simulation studies and examples based on a real economic data set, we investigate and compare the performance of the tests in terms of size control and power. An important finding is that tests based on distance and Hilbert-Schmidt covariances are usually more powerful than their marginal versions under linear dependence, while the reverse is true under non-linear dependence. |
| Researcher Affiliation | Academia | Mirosław Krzyśko EMAIL Interfaculty Institute of Mathematics and Statistics Calisia University-Kalisz Nowy Świat 4, 62-800 Kalisz, Poland; Łukasz Smaga (corresponding author) EMAIL Faculty of Mathematics and Computer Science Adam Mickiewicz University Uniwersytetu Poznańskiego 4, 61-614 Poznań, Poland; Piotr Kokoszka EMAIL Department of Statistics Colorado State University Fort Collins, CO 80523, USA |
| Pseudocode | No | The paper describes mathematical formulations and theoretical justifications for the test procedures, along with simulation studies and real data examples. However, it does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks, nor does it present structured steps in a code-like format. |
| Open Source Code | No | All numerical experiments in this paper were performed in the R program (R Core Team, 2020). The code is available from the authors upon request. |
| Open Datasets | Yes | The data set was constructed based on the annual reports of the World Economic Forum (WEF) (http://www.weforum.org). The countries are listed in Table 3 of Górecki et al. (2020). |
| Dataset Splits | No | For the simulation studies, the paper describes how data was generated (e.g., 'n = 20 observations', 'dimensions p = p1 = p2 ∈ {3, 6}'). For the real data example, it mentions using '38 European countries in the period 2008-2015'. However, there is no explicit mention of splitting these datasets into training, validation, or test sets for model evaluation or reproduction, nor are there details on cross-validation folds or random seeds for splitting. |
| Hardware Specification | No | A part of calculations for simulation study and real data example was made at the Poznań Supercomputing and Networking Center. |
| Software Dependencies | Yes | All numerical experiments in this paper were performed in the R program (R Core Team, 2020). |
| Experiment Setup | Yes | In Model 1 below, we used the B-spline basis as the basis representation of the functional data, since the Fourier basis was applied to generate simulation data. On the other hand, in Models 2 and 3, we considered both Fourier and B-spline bases. For simplicity, all numbers of basis functions Bij were set equal to five. The coefficients of the basis representation were estimated by the least squares method. |
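
The Experiment Setup entry says each curve is represented in a basis of five functions, with coefficients estimated by least squares. The sketch below illustrates that step in plain Python for a Fourier basis. It is not the authors' R code, which is available only on request. The basis normalization, the unit period, the 50-point grid, and the example coefficients are all assumptions made for illustration, and the function names are hypothetical.

```python
import math


def fourier_basis(t, n_basis=5, period=1.0):
    """Evaluate the first n_basis Fourier basis functions at time t.

    Order: constant, sin, cos, sin, cos, ... (normalization is an assumption).
    """
    values = [1.0]
    k = 1
    while len(values) < n_basis:
        omega = 2.0 * math.pi * k / period
        values.append(math.sqrt(2.0) * math.sin(omega * t))
        if len(values) < n_basis:
            values.append(math.sqrt(2.0) * math.cos(omega * t))
        k += 1
    return values


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def fit_basis_coefficients(times, observations, n_basis=5):
    """Least-squares coefficients from the normal equations (Phi^T Phi) c = Phi^T y."""
    phi = [fourier_basis(t, n_basis) for t in times]
    gram = [[sum(row[i] * row[j] for row in phi) for j in range(n_basis)]
            for i in range(n_basis)]
    rhs = [sum(row[i] * y for row, y in zip(phi, observations))
           for i in range(n_basis)]
    return solve_linear_system(gram, rhs)


# Hypothetical example: a noise-free curve observed on an equally spaced grid.
times = [j / 50 for j in range(50)]
true_coefs = [0.5, 1.0, -0.3, 0.0, 0.2]
observations = [sum(c * b for c, b in zip(true_coefs, fourier_basis(t)))
                for t in times]
estimated = fit_basis_coefficients(times, observations)
```

Because the example curve lies exactly in the span of the five basis functions, `estimated` matches `true_coefs` up to floating-point rounding. With noisy observations, or with a B-spline basis, the same normal-equations step would apply to a different design matrix.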