Systematically Exploring Associations among Multivariate Data

Authors: Lifeng Zhang
Pages: 6786-6794

AAAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The mechanisms of these measures are proved in theory and demonstrated with numerical analyses. Subsequently, empirical studies are performed to evaluate the effectiveness of the new statistics and make comparisons with previous approaches.
Researcher Affiliation | Academia | Lifeng Zhang, School of Information, Renmin University of China, 59 Zhongguancun Street, Haidian, Beijing, P.R. China, 100872, l.zhang@ruc.edu.cn
Pseudocode | Yes | Algorithm 1: NN algorithm based data reordering. Input: Euclidean distance matrix of sample data {x(t)}, denoted by [λ_pq]_{N×N}, where λ_pq = ‖x(p) − x(q)‖; Output: concomitants {y[k:N] | 1 ≤ k ≤ N}; Start on data point t ← 1 as the current data point, set n(1) ← 1 and y[1:N] ← y(t); for k ← 1 to N − 1 do: Find the shortest distance connecting the current data point t and an unvisited data point i ∉ {n(1), …, n(k)}, such that i* ← arg min_i λ_it; Move the current data point to t ← i*, set n(k+1) ← i* and y[k+1:N] ← y(i*); end for
Open Source Code | No | The paper does not explicitly state that source code for the described methodology is publicly available, nor does it provide a direct link to a code repository.
Open Datasets | Yes | Example 5, called the two-spirals problem, is a benchmark task for nonlinear classification, which consists of two spirals, each with 200 samples in a 2-D space. nCor based statistics were used to explore a real-world data set that consists of 357 social, economic, health, and political indicators for 202 countries around the world for the time period from 1960 through 2005. It was originally collected from the World Health Organization (WHO) and partner organizations (Rosling 2008; W.H.O. 2009).
Dataset Splits | No | No specific training, validation, or test dataset splits (e.g., percentages, sample counts, or predefined split references) are explicitly provided in the paper. It mentions generating data of length 1000 and the two-spirals problem with 200 samples per spiral, but not how they are partitioned for training, validation, or testing.
Hardware Specification | No | No specific hardware details (e.g., CPU/GPU models, memory, or cloud instance types) used for running the experiments are mentioned in the paper.
Software Dependencies | No | No specific software dependencies with version numbers (e.g., Python 3.8, PyTorch 1.9, CPLEX 12.4) are mentioned in the paper. It refers to various methods (MIC, dCor, MI, CODCF, RDC) and the use of 'linear regression and feedforward artificial neural network (ANN)', but without version details.
Experiment Setup | No | The paper does not provide specific experimental setup details such as hyperparameter values (e.g., learning rate, batch size, number of epochs) or other system-level training settings for the models (e.g., ANNs) used in the empirical studies. It only states that '10 ANNs was trained for each case' but lacks further configuration details.
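
Since no source code is released, the nearest-neighbour reordering quoted in the Pseudocode row can be sketched as below. This is a minimal illustration, not the author's implementation. Its assumptions: the function name `nn_reorder` and the `start` parameter are hypothetical, indices are 0-based (the pseudocode starts at data point 1), the distance matrix is computed inside the function instead of being passed in, and ties are broken by the lowest index.

```python
import numpy as np


def nn_reorder(x, y, start=0):
    """Greedy nearest-neighbour reordering (sketch of the quoted Algorithm 1).

    x: array of shape (N,) or (N, d) with the sample points x(t).
    y: array of length N with the paired responses y(t).
    Returns the visiting order n(1..N) (0-based) and y reordered accordingly.
    """
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]  # treat a 1-D input as N points in one dimension
    y = np.asarray(y)
    n = x.shape[0]

    # Euclidean distance matrix [lambda_pq], lambda_pq = ||x(p) - x(q)||
    dist = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)

    visited = np.zeros(n, dtype=bool)
    order = np.empty(n, dtype=int)

    t = start  # current data point
    order[0] = t
    visited[t] = True

    for k in range(1, n):
        # Mask already visited points so they can never be selected again.
        candidates = np.where(visited, np.inf, dist[t])
        t = int(np.argmin(candidates))  # closest unvisited point
        order[k] = t
        visited[t] = True

    return order, y[order]
```

For example, `nn_reorder([0.0, 10.0, 1.0, 11.0, 2.0], [0, 1, 2, 3, 4])` visits the points in the order 0, 2, 4, 1, 3, walking along the nearby points first and then jumping to the distant pair. Building the full distance matrix costs O(N²) memory, which is reasonable for the sample sizes mentioned here (200 to 1000 points).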