Internet Explorer: Targeted Representation Learning on the Open Web

Authors: Alexander Cong Li, Ellis Langham Brown, Alexei A. Efros, Deepak Pathak

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We evaluate Internet Explorer across several datasets and show that it outperforms or matches CLIP oracle performance by using just a single GPU desktop to actively query the Internet for 30-40 hours. Results, visualizations, videos, and code on our website: internet-explorer-ssl.github.io/"
Researcher Affiliation | Academia | "(1) Carnegie Mellon University, (2) University of California, Berkeley."
Pseudocode | Yes | "An overview of the Internet Explorer method is depicted in Figure 2 and described in Algorithm 1."
Open Source Code | Yes | "Results, visualizations, videos, and code on our website: internet-explorer-ssl.github.io/" and "Our code has been released at https://github.com/internet-explorer-ssl/internet-explorer, which we hope will clarify any remaining implementation details and make it easy for the community to reproduce and build on our work."
Open Datasets | Yes | "We evaluate Internet Explorer on 4 popular small-scale fine-grained classification datasets: Birdsnap (Berg et al., 2014), Flowers-102 (Nilsback & Zisserman, 2008), Food101 (Bossard et al., 2014), and Oxford-IIIT Pets (Parkhi et al., 2012). These small datasets consist of 2,040 to 75,750 training examples, making them ideal for testing whether Internet Explorer can efficiently find relevant, useful data. We also evaluate on PASCAL VOC 2007 (Cls) (Everingham et al., 2010), a coarse-grained multi-label classification task, and ImageNet-100 (Tian et al., 2020). Finally, we try FMoW (Christie et al., 2018), a satellite domain classification task."
Dataset Splits | Yes | "Figure 5 shows how Internet Explorer improves the k-NN accuracy more efficiently than sampling queries uniformly at random from the concept vocabulary." (Referring to Figure 5, which shows "k-NN Val Accuracy (%)" on its y-axis, indicating a validation split.)
Hardware Specification | Yes | "using just a single GPU desktop to actively query the Internet for 30-40 hours" and "using only a single 3090 GPU desktop machine that runs for 30-40 hours"
Software Dependencies | No | "We use the MoCo-v3 algorithm (Chen et al., 2021)" and "We use GPT-J-6B (Wang & Komatsuzaki, 2021), a free, open-source autoregressive language model" and "We use a pre-trained text embedding model (Reimers & Gurevych, 2019) to compute 384-dimensional text embeddings." While these name software/models, they lack specific version numbers for the implementation environment, such as the Python version, the PyTorch/TensorFlow version, or specific library releases.
Experiment Setup | Yes | "Table 7 shows our hyperparameter values, which are shared across datasets. We perform minimal hyperparameter tuning and copy most of the values from the MoCo-v3 (Chen et al., 2021) ResNet-50 configuration." (Table 7: Internet Explorer hyperparameters lists specific values for batch size, learning rate, epochs per iteration, etc.)
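The k-NN validation accuracy cited in the Dataset Splits row refers to a standard protocol for probing self-supervised representations: classify each validation image by a majority vote among its nearest training images in feature space. The sketch below is an illustrative reimplementation of that generic protocol, not the paper's released code; the function name, the choice of k, and the use of cosine similarity are assumptions.

```python
import numpy as np

def knn_accuracy(train_feats, train_labels, val_feats, val_labels, k=5):
    """k-NN classification accuracy on L2-normalized features.

    A common probe for self-supervised representations: each validation
    example is labeled by a majority vote among its k nearest training
    examples under cosine similarity.
    """
    # Normalize rows so the dot product equals cosine similarity.
    train = train_feats / np.linalg.norm(train_feats, axis=1, keepdims=True)
    val = val_feats / np.linalg.norm(val_feats, axis=1, keepdims=True)
    sims = val @ train.T                       # (n_val, n_train) similarities
    top_k = np.argsort(-sims, axis=1)[:, :k]   # indices of k nearest neighbors
    preds = []
    for row in top_k:
        votes = np.bincount(train_labels[row])  # majority vote among neighbors
        preds.append(np.argmax(votes))
    return float(np.mean(np.array(preds) == val_labels))
```

Because the probe is non-parametric, it can be recomputed after every exploration iteration without any fine-tuning, which is what makes it a cheap progress signal for curves like the paper's Figure 5.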