A Dataset Complexity Measure for Analogical Transfer
Authors: Fadi Badra
IJCAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Three experiments were run. The first one tests the hypothesis that the complexity measure $\Gamma$ is an indicator of the quality of the similarity measure $\sigma_S$. The second one evaluates the performance of CoAT on a regression task, and the third one evaluates the performance of CoAT on classification tasks." |
| Researcher Affiliation | Academia | Fadi Badra, Université Sorbonne Paris Nord, Laboratoire d'Informatique Médicale et d'Ingénierie des Connaissances en e-Santé, LIMICS, INSERM, UMR 1142, F-93000, Bobigny, France. badra@sorbonne-paris-nord.fr |
| Pseudocode | Yes | "Algorithm 1 Optimize weights in a weighted sum" and "Algorithm 2 Complexity-based analogical transfer" are present. |
| Open Source Code | No | The paper does not provide an explicit statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | "The PIMA indian diabetes dataset² includes data about 768 native American women. The prediction task consists in predicting if a person suffers from diabetes (taken as a binary class) from the value of 8 continuous attributes. ² https://kaggle.com/uciml/pima-indians-diabetes-database" and "The Automobile dataset⁴ includes data about 205 automobiles, from which were kept only the 159 instances that contain no missing values. ⁴ https://archive.ics.uci.edu/ml/datasets/Automobile" and "6 classical datasets of the UCI repository⁵ (Tab. 2): the Monks datasets (monks1, monks2, and monks3), the User Modeling dataset (user), the Iris dataset (iris), and the Zoo dataset (zoo). ⁵ https://archive.ics.uci.edu/ml/" |
| Dataset Splits | Yes | "The quality of each similarity scale is estimated by the accuracy of the k-Nearest Neighbor (k-NN) algorithm, with k = 5, computed using 10-fold cross validation." |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., CPU, GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific version numbers for ancillary software components, libraries, or solvers used in the experiments. |
| Experiment Setup | Yes | "The quality of each similarity scale is estimated by the accuracy of the k-Nearest Neighbor (k-NN) algorithm, with k = 5, computed using 10-fold cross validation." Details about similarity measure construction and weight optimization are also provided: "We choose $\sigma_R = \sigma^{\text{price}}_{p_{2,1000}}$ and the similarity measure $\sigma_S$ is assumed to be a weighted sum of the similarity $\sigma^{\text{nb rooms}}_{p_{2,6}}$ according to the number of rooms and the similarity $\sigma^{\text{area}}_{=}$ according to the location area: $\sigma_S(uv) = w \, \sigma^{\text{nb rooms}}_{p_{2,6}}(uv) + (1 - w) \, \sigma^{\text{area}}_{=}(uv)$" and "The feature scale $\sigma^{\phi}_{=}$ was used for each binary feature $\phi$, and a polynomial scale was used for each continuous feature. The weights are set by the method proposed in Sec. 4." |
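
The evaluation protocol quoted in the table (a weighted-sum similarity measure scored by 5-NN accuracy under 10-fold cross-validation) can be illustrated with a short sketch. This is not the paper's code, which is not publicly available: every function name, the linear `bounded_scale` (a stand-in for the paper's polynomial scales, whose exact form is not given here), and the toy data are assumptions made for illustration only.

```python
import random
from collections import Counter


def equality_scale(a, b):
    """Equality scale: 1.0 when the two values match, 0.0 otherwise."""
    return 1.0 if a == b else 0.0


def bounded_scale(a, b, max_diff):
    """Hypothetical continuous scale: decreases linearly with |a - b|, floored at 0.
    Stand-in only; the paper uses polynomial scales instead."""
    return max(0.0, 1.0 - abs(a - b) / max_diff)


def weighted_similarity(u, v, w):
    """Weighted sum w * sim(rooms) + (1 - w) * sim(area) for cases (rooms, area)."""
    return (w * bounded_scale(u[0], v[0], 6.0)
            + (1.0 - w) * equality_scale(u[1], v[1]))


def knn_predict(train, query, k, w):
    """Majority label among the k training cases most similar to `query`."""
    ranked = sorted(train,
                    key=lambda case: weighted_similarity(case[0], query, w),
                    reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def cross_val_accuracy(cases, k=5, folds=10, w=0.5, seed=0):
    """Mean accuracy of the similarity-based k-NN over `folds` folds."""
    shuffled = cases[:]
    random.Random(seed).shuffle(shuffled)
    scores = []
    for i in range(folds):
        # Every `folds`-th case (offset i) is held out; the rest is training data.
        test = shuffled[i::folds]
        train = [c for j, c in enumerate(shuffled) if j % folds != i]
        hits = sum(knn_predict(train, x, k, w) == y for x, y in test)
        scores.append(hits / len(test))
    return sum(scores) / folds


# Toy data (hypothetical): 60 cases ((rooms, area), label), label equal to the area.
toy_cases = [((1 + i % 6, "ABC"[i % 3]), "ABC"[i % 3]) for i in range(60)]
print(cross_val_accuracy(toy_cases, k=5, folds=10, w=0.0))
```

Comparing the accuracy obtained for different values of `w` gives a rough way to rank candidate similarity measures, which is the role the quoted k-NN estimate plays in the first experiment.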