Weighted distance nearest neighbor condensing
Authors: Lee-Ad Gottlieb, Timor Sharabi, Roi Weiss
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We study the theoretical properties of this new model, and show that it can produce dramatically better condensing than the standard nearest neighbor rule... We then suggest a condensing heuristic for our new problem... We demonstrate Bayes consistency for this heuristic, and also show promising empirical results. ... Section 5. Experimental Results |
| Researcher Affiliation | Academia | Department of Computer Science, Ariel University, Ariel, Israel. Correspondence to: Lee-Ad Gottlieb <leead@ariel.ac.il>, Timor Sharabi <timorsharabi@gmail.com>, Roi Weiss <roiw@ariel.ac.il>. |
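| Pseudocode | Yes | Algorithm 1 (Greedy weighted heuristic). Input: point set $S$. Initialize solution set $T \leftarrow \emptyset$, $S' \leftarrow S$, weight function $w : S \to \{1\}$. While $S' \neq \emptyset$ do: $x \leftarrow \arg\max_{x \in S'} \lvert B(x, d_{\mathrm{ne}}(x)) \cap S' \rvert$; $S' \leftarrow S' \setminus B(x, d_{\mathrm{ne}}(x))$; $T \leftarrow T \cup \{x\}$; $w(x) \leftarrow d_{\mathrm{ne}}(x)$. End while. Return $T, w$. |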
| Open Source Code | No | The paper does not provide any specific links or explicit statements about the public availability of its source code. |
| Open Datasets | Yes | As proof of concept, we selected representative datasets from the condensing experiments of (Garcia et al., 2012)... appearing in Table 1... Magic, Sat Image, Spambase, Twonorm, Phoneme, Segment, Shuttle... Accordingly, we ran trials on the small banana, circle and iris data sets... Iris. This is the very popular data set of the UCI Machine Learning Repository. |
| Dataset Splits | No | For each data set, we randomly split it into training samples (70%) and testing samples (30%). No separate validation split is explicitly mentioned. |
| Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for running its experiments. |
| Software Dependencies | No | The paper mentions implementing a program 'using the Python cvxpy library' but does not provide specific version numbers for Python, cvxpy, or any other software dependencies. |
| Experiment Setup | No | The paper describes the data split percentages (70% training, 30% testing) and mentions the use of an integer programming solver implemented with the Python cvxpy library, but it does not provide specific hyperparameters, detailed training configurations, or other system-level settings for the algorithms used. |
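
Since the paper's code is not public, the greedy heuristic quoted in the Pseudocode row can be sketched in plain Python as below. This is a minimal illustration, not the authors' implementation. It assumes that `d_ne(x)` denotes the distance from `x` to its nearest differently labeled point, that `B(x, r)` is the open Euclidean ball, and that a query is labeled by the condensed point minimizing `distance / weight`. The function names (`nearest_enemy_distance`, `greedy_weighted_condense`, `predict`) and the tie-breaking by lowest index are hypothetical choices made for this example.

```python
import math


def nearest_enemy_distance(points, labels):
    """d_ne(x) for every point: distance to the closest point with a different label.

    Assumes at least two distinct labels and no identical points with
    different labels (otherwise a radius would be undefined or zero).
    """
    return [
        min(math.dist(p, q) for q, lq in zip(points, labels) if lq != lp)
        for p, lp in zip(points, labels)
    ]


def greedy_weighted_condense(points, labels):
    """Greedy cover of the sample by balls B(x, d_ne(x)).

    Returns the indices of the selected points (T) and a dict mapping each
    selected index to its weight w(x) = d_ne(x).
    """
    n = len(points)
    d_ne = nearest_enemy_distance(points, labels)
    # Open ball around each point, stored as a set of indices. By the
    # definition of d_ne, a ball contains only points sharing its center's label.
    balls = [
        {j for j in range(n) if math.dist(points[i], points[j]) < d_ne[i]}
        for i in range(n)
    ]
    uncovered = set(range(n))  # plays the role of S'
    chosen, weights = [], {}
    while uncovered:
        # Pick the uncovered point whose ball covers the most uncovered points
        # (ties broken by lowest index, an arbitrary choice for this sketch).
        best = max(sorted(uncovered), key=lambda i: len(balls[i] & uncovered))
        uncovered -= balls[best]
        chosen.append(best)
        weights[best] = d_ne[best]
    return chosen, weights


def predict(query, points, labels, chosen, weights):
    """Label a query by its weighted-distance nearest neighbor in the condensed set.

    Assumption: the weighted distance to a condensed point x is d(query, x) / w(x).
    """
    best = min(chosen, key=lambda i: math.dist(query, points[i]) / weights[i])
    return labels[best]


# Toy usage on a one-dimensional sample with two classes.
points = [(0.0,), (1.0,), (2.0,), (5.0,), (6.0,), (7.0,)]
labels = [0, 0, 0, 1, 1, 1]
chosen, weights = greedy_weighted_condense(points, labels)
print(chosen, weights, predict((3.0,), points, labels, chosen, weights))
```

Under these assumptions every training point lies in a selected ball whose center has its own label, at weighted distance below 1, while every selected point of another label is at weighted distance at least 1, so the condensed set labels the training sample consistently. The quadratic-time ball construction keeps the sketch short and would need a spatial index for the larger data sets listed in the table.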