Persistence Homology Distillation for Semi-supervised Continual Learning

Authors: Yan Fan, Yu Wang, Pengfei Zhu, Dongyue Chen, Qinghua Hu

NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Finally, experimental results on three widely used datasets validate that the proposed PsHD outperforms the state of the art by 3.9% on average, and also achieves a 1.5% improvement while reducing the memory buffer size by 60%, highlighting the potential of utilizing unlabeled data in SSCL.
Researcher Affiliation | Academia | Yan Fan, Yu Wang, Pengfei Zhu, Dongyue Chen, Qinghua Hu; College of Intelligence and Computing, Tianjin University, China; Haihe Laboratory of Information Technology Application Innovation, China; fyan_0411@tju.edu.cn, wang.yu@tju.edu.cn, zhupengfei@tju.edu.cn, dyuechen@tju.edu.cn, huqinghua@tju.edu.cn
Pseudocode | Yes | Algorithm 1: Persistence Homology Distillation.
Input: replayed batch of unlabeled data B_u^o. Parameters: embedding encoders of the current task f_new and the old task f_old.
1: E ← B_u^o, i ← 0, S ← ∅
2: repeat
3:   x_i ← RandomSample(E)
4:   Compute the embeddings (f_new(x_i), f_old(x_i))
5:   Group the weighted k-hop neighborhood set N(x_i, k) by f_old(x_i)
6:   E ← E \ N(x_i, k), S ← S ∪ {x_i}, i ← i + 1
7:   Compute the Wasserstein distance d_x^{N_k}(f_old, f_new) according to Eq. 5
8: until E is empty
9: Minimize the total loss L_hd = (1/|S|) Σ_{x∈S} d_x^{N_k}(f_old, f_new)
(A hedged Python sketch of this loop is given after the table.)
Open Source Code | Yes | Our code is available: https://github.com/fanyan0411/PsHD.
Open Datasets | Yes | Datasets: We evaluate our method on three datasets with different class counts and resolutions. CIFAR-10 [39] contains colored images in 10 classes, with 5,000 training samples and 1,000 testing samples of size 32 × 32 per class. CIFAR-100 [39] comprises 500 training images and 100 testing images per class, with the same image size as CIFAR-10. ImageNet-100 [40], a subset of ImageNet-1000, is composed of 100 classes with 1,300 images per class for training and 500 images per class for validation.
Dataset Splits | Yes | CIFAR-10 [39] contains colored images in 10 classes, with 5,000 training samples and 1,000 testing samples of size 32 × 32 per class. CIFAR-100 [39] comprises 500 training images and 100 testing images per class, with the same image size as CIFAR-10. ImageNet-100 [40], a subset of ImageNet-1000, is composed of 100 classes with 1,300 images per class for training and 500 images per class for validation.
Hardware Specification | No | The paper reports training time in hours and memory sizes as examples, but does not specify the CPUs, GPUs, or other hardware used to run the experiments.
Software Dependencies | No | The paper mentions using FixMatch and the Gudhi package, but does not provide version numbers for these or for other key software dependencies such as programming languages or deep learning frameworks.
Experiment Setup | Yes | For both CIFAR-10 and CIFAR-100, we train the models with different levels of supervision, i.e., λ ∈ {0.8%, 5%, 25%}. For instance, these label ratios correspond to 4, 25, and 125 annotated samples per class in CIFAR-100. For ImageNet-100, we choose label ratios of {1%, 7.7%}. Following [13, 7], we set the buffer size to 500 and 2000. In our analysis p = 1. We choose 1 for CIFAR-10 and CIFAR-100, and 1.5 for ImageNet-100 in our experiments. (The per-class labeled counts these ratios imply are worked out in the short arithmetic sketch after the table.)
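To make Algorithm 1 concrete, the following is a minimal Python sketch of the distillation loop; it is not the authors' released implementation. It assumes pre-computed feature matrices feat_old and feat_new (old and new encoders applied to the same replayed unlabeled batch), stands in for the weighted k-hop neighborhood N(x_i, k) with the k nearest neighbors in the old feature space, and uses the Gudhi package (with POT installed) for 0-dimensional Rips persistence diagrams and the Wasserstein distance; all names and defaults are illustrative.

import numpy as np
import gudhi
from gudhi.wasserstein import wasserstein_distance

def persistence_diagram(points):
    # 0-dimensional persistence diagram of a Vietoris-Rips complex on `points`.
    rips = gudhi.RipsComplex(points=points)
    st = rips.create_simplex_tree(max_dimension=1)
    st.persistence()
    dgm = st.persistence_intervals_in_dimension(0)
    return dgm[np.isfinite(dgm[:, 1])]  # drop the single infinite bar

def phd_loss(feat_old, feat_new, k=5, seed=0):
    # Distillation loss of Algorithm 1, with k nearest neighbors in the old
    # feature space standing in for the weighted k-hop neighborhood N(x_i, k).
    rng = np.random.default_rng(seed)
    remaining = list(range(len(feat_old)))            # E <- B_u^o
    total, n_groups = 0.0, 0
    while remaining:                                  # repeat ... until E is empty
        i = int(rng.choice(remaining))                # x_i <- RandomSample(E)
        dists = np.linalg.norm(feat_old[remaining] - feat_old[i], axis=1)
        hood = [remaining[j] for j in np.argsort(dists)[:k + 1]]    # N(x_i, k)
        dgm_old = persistence_diagram(feat_old[hood])
        dgm_new = persistence_diagram(feat_new[hood])
        total += wasserstein_distance(dgm_old, dgm_new, order=1.0)  # d per Eq. 5, p = 1
        n_groups += 1
        remaining = [j for j in remaining if j not in hood]         # E <- E \ N(x_i, k)
    return total / max(n_groups, 1)                   # L_hd averaged over groups

In training, this term would be added to the supervised and unsupervised objectives on the replayed batch. The sketch is written in NumPy for readability; a gradient-carrying version would keep feat_new in the autodiff graph (Gudhi's wasserstein_distance accepts enable_autodiff=True for that purpose).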
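The per-class labeled counts in the experiment setup follow directly from the stated label ratios and the per-class training sizes quoted above (500 images per class for CIFAR-100, 1,300 for ImageNet-100). A tiny arithmetic check follows; the helper name is illustrative, and the ImageNet-100 counts are derived here rather than quoted from the paper.

def labeled_per_class(per_class_train, label_ratio):
    # Number of annotated samples per class implied by a given label ratio.
    return round(per_class_train * label_ratio)

for ratio in (0.008, 0.05, 0.25):          # lambda = 0.8%, 5%, 25%
    print("CIFAR-100:", labeled_per_class(500, ratio))      # -> 4, 25, 125

for ratio in (0.01, 0.077):                # lambda = 1%, 7.7%
    print("ImageNet-100:", labeled_per_class(1300, ratio))  # -> 13, 100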