GeomCA: Geometric Evaluation of Data Representations
Authors: Petra Poklukar, Anastasiia Varava, Danica Kragic
ICML 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate its applicability by analyzing representations obtained from a variety of scenarios, such as contrastive learning models, generative models and supervised learning models. We apply GeomCA to different practical setups. First, we consider a contrastive learning scenario and evaluate the structural similarity between the encodings belonging to different classes of the training and validation datasets (Section 4). Second, we evaluate generative models by comparing the connected components of the training and generated datasets (Section 5). Finally, we apply GeomCA to investigate if features extracted by a supervised model are separated according to their respective classes (Section 6). |
| Researcher Affiliation | Academia | Petra Poklukar, Anastasiia Varava, Danica Kragic (KTH Royal Institute of Technology, Stockholm, Sweden). |
| Pseudocode | Yes | Algorithm 1: GeomCA |
| Open Source Code | Yes | Our code is available on GitHub: https://github.com/petrapoklukar/GeomCA |
| Open Datasets | Yes | We used a StyleGAN trained on the FFHQ dataset (Karras et al., 2019) and replicated the truncation experiment as performed in (Kynkäänniemi et al., 2019). We applied GeomCA to VGG16 representations of the ImageNet dataset. We evaluated two models for learning contrastive representations, Siamese and SimCLR, on an image dataset introduced by (Chamzas et al., 2020). |
| Dataset Splits | No | Each dataset consists of 5000 training images and 5000 test images not used during training. No explicit mention of a separate validation dataset or split percentages for it was found. |
| Hardware Specification | No | The paper mentions libraries used for implementation and discusses computational efficiency, implying hardware capabilities, but does not provide specific details on the CPU, GPU, memory, or computing environment used for experiments. |
| Software Dependencies | No | We implemented GeomCA described in Algorithm 1 using the GUDHI library (The GUDHI Project, 2020), which supports efficient computation of geometric sparsification, and the NetworkX library (Hagberg et al., 2008) for building and analyzing ε-graphs. While GUDHI's citation includes a version (3.4.0), NetworkX's does not, so not all key software dependencies are pinned to specific versions. (An illustrative ε-graph sketch follows the table.) |
| Experiment Setup | Yes | Moreover, we used δ = ε / 2 to allow the homogeneous clusters to also form a component (see discussion in Sections 3.1 and 3.2), and chose ηc = 0.75, ηq = 0.45 in order to analyze only consistent components of high quality. Due to the large dimensionality of the representations, we chose ε = ε(10) and ηc, ηq = 0. Since the generated representations E are in an ideal case well aligned with R, we chose δ = ε. |
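
The dependency and setup rows above only name the building blocks: an ε-graph is built over the joint set of reference (R) and evaluated (E) representations with NetworkX, geometrically sparsified with GUDHI, and its connected components are then filtered with the consistency and quality thresholds ηc and ηq. The sketch below is a minimal illustration of the ε-graph construction and thresholding steps only; it is not the authors' implementation, the function names are hypothetical, the consistency and quality formulas are simplified stand-ins for the scores defined in the paper, and the GUDHI sparsification step governed by δ is omitted.

```python
# Illustrative sketch only: a simplified epsilon-graph over R and E with NetworkX,
# followed by component filtering with thresholds eta_c, eta_q. The score formulas
# are stand-ins, not GeomCA's exact definitions.
import numpy as np
import networkx as nx
from scipy.spatial.distance import cdist


def build_epsilon_graph(R, E, epsilon):
    """Connect every pair of points from R and E that lies closer than epsilon."""
    points = np.vstack([R, E])
    origin = ["R"] * len(R) + ["E"] * len(E)
    G = nx.Graph()
    G.add_nodes_from((i, {"origin": origin[i]}) for i in range(len(points)))
    dist = cdist(points, points)
    rows, cols = np.where(dist < epsilon)
    G.add_edges_from((i, j) for i, j in zip(rows, cols) if i < j)
    return G


def keep_consistent_components(G, eta_c=0.75, eta_q=0.45):
    """Keep components whose (simplified) consistency and quality clear the thresholds."""
    kept = []
    for nodes in nx.connected_components(G):
        sub = G.subgraph(nodes)
        n_r = sum(1 for n in sub if sub.nodes[n]["origin"] == "R")
        n_e = sub.number_of_nodes() - n_r
        # Stand-in consistency: how balanced the component is between R and E points.
        consistency = 1.0 - abs(n_r - n_e) / (n_r + n_e)
        # Stand-in quality: fraction of edges joining an R point to an E point.
        hetero = sum(1 for u, v in sub.edges()
                     if sub.nodes[u]["origin"] != sub.nodes[v]["origin"])
        quality = hetero / sub.number_of_edges() if sub.number_of_edges() else 0.0
        if consistency >= eta_c and quality >= eta_q:
            kept.append(sub)
    return kept


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    R = rng.normal(size=(200, 2))  # toy reference representations
    E = rng.normal(size=(200, 2))  # toy evaluated representations
    G = build_epsilon_graph(R, E, epsilon=0.5)
    components = keep_consistent_components(G, eta_c=0.75, eta_q=0.45)
    print(f"{len(components)} components passed the thresholds")
```

With the thresholds reported for the contrastive-learning setup (ηc = 0.75, ηq = 0.45), only balanced, well-mixed components would survive, whereas setting ηc, ηq = 0, as in the StyleGAN experiment, keeps every component.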