reproducibilityindex.ai

Weighted distance nearest neighbor condensing

Authors: Lee-Ad Gottlieb, Timor Sharabi, Roi Weiss

ICML 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We study the theoretical properties of this new model, and show that it can produce dramatically better condensing than the standard nearest neighbor rule... We then suggest a condensing heuristic for our new problem... We demonstrate Bayes consistency for this heuristic, and also show promising empirical results. ... Section 5. Experimental Results
Researcher Affiliation	Academia	Department of Computer Science, Ariel University, Ariel, Israel. Correspondence to: Lee-Ad Gottlieb <leead@ariel.ac.il>, Timor Sharabi <timorsharabi@gmail.com>, Roi Weiss <roiw@ariel.ac.il>.
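Pseudocode	Yes	Algorithm 1 Greedy weighted heuristic. Input: point set $S$. Initialize solution set $T \leftarrow \emptyset$, $S' \leftarrow S$, weight function $w : S \to \{1\}$. while $S' \neq \emptyset$ do: $x \leftarrow \arg\max_{x \in S'} |B(x, d_{ne}(x)) \cap S'|$; $S' \leftarrow S' \setminus B(x, d_{ne}(x))$; $T \leftarrow T \cup \{x\}$; $w(x) \leftarrow d_{ne}(x)$. end while. return $T, w$.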
Open Source Code	No	The paper does not provide any specific links or explicit statements about the public availability of its source code.
Open Datasets	Yes	As proof of concept, we selected representative datasets from the condensing experiments of (Garcia et al., 2012)... appearing in Table 1... Magic, Sat Image, Spambase, Twonorm, Phoneme, Segment, Shuttle... Accordingly, we ran trials on the small banana, circle and iris data sets... Iris. This is the very popular data set of the UCI Machine Learning Repository.
Dataset Splits	No	For each data set, we randomly split it into training samples (70%) and testing samples (30%). No separate validation split is explicitly mentioned.
Hardware Specification	No	The paper does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for running its experiments.
Software Dependencies	No	The paper mentions implementing a program 'using the Python cvxpy library' but does not provide specific version numbers for Python, cvxpy, or any other software dependencies.
Experiment Setup	No	The paper describes the data split percentages (70% training, 30% testing) and mentions the use of an integer programming solver implemented with the Python cvxpy library, but it does not provide specific hyperparameters, detailed training configurations, or other system-level settings for the algorithms used.
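
Since no source code is released, the greedy heuristic quoted in the Pseudocode row and the 70%/30% split quoted in the Dataset Splits row can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It rests on several assumptions that the rows above do not confirm: the metric is Euclidean, B(x, r) is an open ball, d_ne(x) is the distance from x to its nearest differently labeled point, ties are broken by lowest index, and a query is labeled by the kept point minimizing d(query, x) / w(x). The function names and the `seed` parameter are hypothetical.

```python
import math
import random


def split_70_30(n, seed=0):
    """Random 70%/30% split of the indices 0..n-1 (the seed is hypothetical)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = (7 * n) // 10
    return idx[:cut], idx[cut:]


def nearest_enemy_distance(points, labels, i):
    """d_ne(x_i): distance from point i to the closest differently labeled point.

    Assumes the data contain at least two classes.
    """
    return min(
        math.dist(points[i], points[j])
        for j in range(len(points))
        if labels[j] != labels[i]
    )


def greedy_weighted_condense(points, labels):
    """Greedy cover following the structure of Algorithm 1.

    Returns the indices of the kept points (T) and a dict of their weights (w).
    """
    n = len(points)
    d_ne = [nearest_enemy_distance(points, labels, i) for i in range(n)]
    # ball[i]: indices strictly closer to point i than its nearest enemy.
    # Open ball and Euclidean metric are assumptions; i always covers itself.
    ball = [
        {j for j in range(n) if j == i or math.dist(points[i], points[j]) < d_ne[i]}
        for i in range(n)
    ]
    remaining = set(range(n))  # plays the role of S'
    kept, weight = [], {}
    while remaining:
        # Remaining point whose ball covers the most remaining points
        # (ties go to the lowest index).
        best = max(sorted(remaining), key=lambda i: len(ball[i] & remaining))
        remaining -= ball[best]
        kept.append(best)
        weight[best] = d_ne[best]
    return kept, weight


def predict(query, points, labels, kept, weight):
    """Label a query by its weighted-distance nearest neighbor in the kept set.

    Assumption: the weighted distance is d(query, x) / w(x).
    """
    nearest = min(kept, key=lambda i: math.dist(query, points[i]) / weight[i])
    return labels[nearest]
```

For example, on the six points `[(0, 0), (1, 0), (0, 1), (5, 5), (6, 5), (5, 6)]` with labels `[0, 0, 0, 1, 1, 1]`, `greedy_weighted_condense` keeps one point per cluster. Under the assumed d(query, x) / w(x) rule, `predict` then recovers the label of every training point: a covered point lies at weighted distance below 1 from its covering point and at weighted distance at least 1 from any kept point of the other class.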