A Dataset Complexity Measure for Analogical Transfer

Authors: Fadi Badra

IJCAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Three experiments were run. The first one tests the hypothesis that the complexity measure $\Gamma$ is an indicator of the quality of the similarity measure $\sigma_S$. The second one evaluates the performance of CoAT on a regression task, and the third one evaluates the performance of CoAT on classification tasks.
Researcher Affiliation | Academia | Fadi Badra, Université Sorbonne Paris Nord, Laboratoire d'Informatique Médicale et d'Ingénierie des Connaissances en e-Santé, LIMICS, INSERM, UMR 1142, F-93000, Bobigny, France. badra@sorbonne-paris-nord.fr
Pseudocode | Yes | "Algorithm 1 Optimize weights in a weighted sum" and "Algorithm 2 Complexity-based analogical transfer" are present.
Open Source Code | No | The paper does not provide an explicit statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | "The PIMA indian diabetes dataset [2] includes data about 768 native American women. The prediction task consists in predicting if a person suffers from diabetes (taken as a binary class) from the value of 8 continuous attributes. [2] https://kaggle.com/uciml/pima-indians-diabetes-database" and "The Automobile dataset [4] includes data about 205 automobiles, from which were kept only the 159 instances that contain no missing values. [4] https://archive.ics.uci.edu/ml/datasets/Automobile" and "6 classical datasets of the UCI repository [5] (Tab. 2): the Monks datasets (monks1, monks2, and monks3), the User Modeling dataset (user), the Iris dataset (iris), and the Zoo dataset (zoo). [5] https://archive.ics.uci.edu/ml/"
Dataset Splits | Yes | "The quality of each similarity scale is estimated by the accuracy of the k-Nearest Neighbor (k-NN) algorithm, with k = 5, computed using 10-fold cross validation."
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., CPU, GPU models, memory) used for running the experiments.
Software Dependencies | No | The paper does not provide specific version numbers for ancillary software components, libraries, or solvers used in the experiments.
Experiment Setup | Yes | "The quality of each similarity scale is estimated by the accuracy of the k-Nearest Neighbor (k-NN) algorithm, with k = 5, computed using 10-fold cross validation." Also, details about similarity measure construction and weight optimization are provided: "We choose $\sigma_R = \sigma^{\text{price}}_{p_{2,1000}}$ and the similarity measure $\sigma_S$ is assumed to be a weighted sum of the similarity $\sigma^{\text{nb rooms}}_{p_{2,6}}$ according to the number of rooms and the similarity $\sigma^{\text{area}}_{=}$ according to the location area: $\sigma_S(uv) = w \, \sigma^{\text{nb rooms}}_{p_{2,6}}(uv) + (1 - w) \, \sigma^{\text{area}}_{=}(uv)$" and "The feature scale $\sigma^{\phi}_{=}$ was used for each binary feature $\phi$, and a polynomial scale was used for each continuous feature. The weights are set by the method proposed in Sec. 4."
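
The evaluation protocol quoted in the Experiment Setup row (a weighted-sum similarity scored by 5-NN accuracy under 10-fold cross validation) can be sketched roughly as follows. This is an illustrative sketch only, not the paper's code, which is not publicly available. All function names and the toy data are hypothetical. The equality scale is assumed to return 1 for matching values and 0 otherwise, and the numeric scale is a simple linear stand-in, since the paper's polynomial scale is not defined in this summary.

```python
import random
from collections import Counter


def equality_similarity(a, b):
    """Assumed reading of an equality scale: 1.0 if the values match, else 0.0."""
    return 1.0 if a == b else 0.0


def bounded_distance_similarity(a, b, scale):
    """Hypothetical linear stand-in for a numeric scale (NOT the paper's polynomial scale)."""
    return max(0.0, 1.0 - abs(a - b) / scale)


def weighted_similarity(u, v, w):
    """Weighted sum w * sim_rooms + (1 - w) * sim_area for cases given as (nb_rooms, area)."""
    sim_rooms = bounded_distance_similarity(u[0], v[0], scale=6.0)
    sim_area = equality_similarity(u[1], v[1])
    return w * sim_rooms + (1.0 - w) * sim_area


def knn_predict(train, query, w, k=5):
    """Majority vote among the k training cases most similar to the query."""
    ranked = sorted(
        train,
        key=lambda case: weighted_similarity(case[0], query, w),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def cross_validated_accuracy(cases, w, k=5, n_folds=10, seed=0):
    """Mean k-NN accuracy over n_folds disjoint test folds."""
    shuffled = cases[:]
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::n_folds] for i in range(n_folds)]
    accuracies = []
    for i, test_fold in enumerate(folds):
        train = [c for j, fold in enumerate(folds) if j != i for c in fold]
        correct = sum(knn_predict(train, x, w, k) == y for x, y in test_fold)
        accuracies.append(correct / len(test_fold))
    return sum(accuracies) / n_folds


# Synthetic toy cases (not from the paper): the label depends only on the area.
toy_cases = []
for i in range(40):
    rooms = 1 + (i // 2) % 6
    area = "center" if i % 2 == 0 else "suburb"
    label = "high" if area == "center" else "low"
    toy_cases.append(((rooms, area), label))

for w in (0.0, 0.5, 1.0):
    print(w, cross_validated_accuracy(toy_cases, w))
```

Sweeping `w` as in the final loop gives a crude way to compare candidate weightings by their cross-validated accuracy. The paper instead sets the weights with the method of its Sec. 4, which is not reproduced here.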