Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.
Towards Free Data Selection with General-Purpose Models
Authors: Yichen Xie, Mingyu Ding, Masayoshi Tomizuka, Wei Zhan
NeurIPS 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments verify the effectiveness of FreeSel on various computer vision tasks. |
| Researcher Affiliation | Academia | Yichen Xie, Mingyu Ding, Masayoshi Tomizuka, Wei Zhan UC Berkeley EMAIL |
| Pseudocode | Yes | Algorithm 1: Semantic Pattern Extraction |
| Open Source Code | Yes | Our code is available at https://github.com/yichen928/FreeSel. |
| Open Datasets | Yes | We carry out experiments on PASCAL VOC [14]. In line with prior work [1, 57], we combine the training and validation sets of PASCAL VOC 2007 and 2012 as the training data pool with 16,551 images. |
| Dataset Splits | No | In line with prior work [1, 57], we combine the training and validation sets of PASCAL VOC 2007 and 2012 as the training data pool with 16,551 images. The paper combines training and validation sets into a single training pool, but does not explicitly describe separate validation splits for model training or reproduction. |
| Hardware Specification | Yes | The time is estimated on a single NVIDIA TITAN RTX GPU. |
| Software Dependencies | No | The model is implemented based on mmdetection. We follow [57, 1] to train the model for 300 epochs with batch size 32 using SGD optimizer (momentum 0.9). No specific version numbers for mmdetection, PyTorch, or other libraries are provided. |
| Experiment Setup | Yes | The model is trained for 300 epochs with batch size 32 using SGD optimizer (momentum 0.9). The initial learning rate is 0.001, which decays to 0.0001 after 240 epochs. |
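
The Experiment Setup row can be read as a small training configuration. The sketch below is illustrative only and is not taken from the authors' code: the `TrainSchedule` class and `learning_rate` function are hypothetical names, and it assumes epochs are counted from 0, so that the first 240 epochs use the initial rate and the remaining 60 use the decayed one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainSchedule:
    """Hyperparameters quoted in the Experiment Setup row (names are hypothetical)."""
    total_epochs: int = 300    # "trained for 300 epochs"
    batch_size: int = 32       # "batch size 32"
    momentum: float = 0.9      # SGD momentum
    base_lr: float = 1e-3      # initial learning rate
    decayed_lr: float = 1e-4   # learning rate after the decay point
    decay_epoch: int = 240     # assumption: decay applies from 0-indexed epoch 240 onward


def learning_rate(epoch: int, schedule: TrainSchedule = TrainSchedule()) -> float:
    """Return the step-decayed learning rate for a 0-indexed epoch."""
    if not 0 <= epoch < schedule.total_epochs:
        raise ValueError(f"epoch must be in [0, {schedule.total_epochs})")
    if epoch < schedule.decay_epoch:
        return schedule.base_lr
    return schedule.decayed_lr


if __name__ == "__main__":
    # Print the rate just before and just after the decay point.
    for epoch in (0, 239, 240, 299):
        print(epoch, learning_rate(epoch))
```

In a PyTorch-based setup, the same schedule would typically be expressed with `torch.optim.SGD` and a `MultiStepLR` scheduler using `milestones=[240]` and `gamma=0.1`; whether the authors' mmdetection configuration does exactly this is not stated in the table.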