Learning to Interpret Satellite Images using Wikipedia

Authors: Burak Uzkent, Evan Sheehan, Chenlin Meng, Zhongyi Tang, Marshall Burke, David Lobell, Stefano Ermon

IJCAI 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | On the recently released fMoW dataset, our pre-training strategies can boost the performance of a model pre-trained on ImageNet by up to 4.5% in F1 score.
Researcher Affiliation | Academia | Department of Computer Science, Stanford University; Department of Earth System Science, Stanford University
Pseudocode | No | The paper describes methods in text and uses diagrams (e.g., Figure 4) to illustrate workflows, but it does not include any pseudocode or clearly labeled algorithm blocks.
Open Source Code | Yes | Our implementation to perform image to text matching and weak supervision can be found in our repository: https://github.com/ermongroup/PretrainingWikiSatNet
Open Datasets | Yes | we construct a novel dataset called WikiSatNet by pairing georeferenced Wikipedia articles with satellite imagery of their corresponding locations. ...Our resulting WikiSatNet multi-modal dataset is a set of tuples D = {(c_1, x_1, y_1), ..., (c_N, x_N, y_N)}... WikiSatNet contains N = 888,696 article-image pairs. ... we use a recently released large-scale satellite image recognition dataset named fMoW [Christie et al., 2018]. ... pre-training on ImageNet [Russakovsky et al., 2015], (2) pre-training on CIFAR10... Additionally, we perform classification across 66 land cover classes using remote sensing images with 0.6m GSD obtained by the USDA's National Agriculture Imagery Program (NAIP). (A minimal pairing sketch appears after the table.)
Dataset Splits | Yes | The validation and test sets contain 14,241 and 16,948 bounding boxes and are left unchanged in our experiments. ... The final dataset consists of 100,000 training and 50,000 validation and test images.
Hardware Specification | No | The paper mentions training CNNs and using a DenseNet architecture, but it does not specify any particular hardware (e.g., GPU models, CPU types, or cloud computing instances) used for experiments.
Software Dependencies | No | The paper mentions using the Adam optimizer and Doc2Vec, but it does not specify software dependencies such as programming languages, deep learning frameworks (e.g., TensorFlow, PyTorch), or their versions. (A hedged Doc2Vec sketch appears after the table.)
Experiment Setup | Yes | After experimentation, we set the learning rate and batch size to 0.0001 and 64, respectively, and the Adam optimizer is used to train the model [Kingma and Ba, 2014]. ... The learning rates for the weakly supervised and image to text matching model are set to 1e-4 and 1e-5 after experimentation. On the other hand, for the ImageNet model the learning rate is set to 1e-4, while it is set to 1e-3 for the CIFAR10 and trained-from-scratch models. (A hedged training-setup sketch appears after the table.)
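
To make the Open Datasets row concrete, here is a minimal sketch, assuming a Python pipeline, of how georeferenced articles could be paired with imagery into the tuples D = {(c_1, x_1, y_1), ..., (c_N, x_N, y_N)} quoted above. Every name here (ArticleImagePair, fetch_satellite_image, build_wikisatnet) is a hypothetical placeholder, not the authors' code; their actual pipeline is in the repository linked in the Open Source Code row.

    # Hypothetical sketch of the WikiSatNet construction: pair each
    # georeferenced Wikipedia article c_i with a satellite image x_i
    # of its location y_i.
    from dataclasses import dataclass
    from typing import List, Tuple

    @dataclass
    class ArticleImagePair:
        article_text: str             # c_i: Wikipedia article body
        image_path: str               # x_i: satellite image of the location
        lat_lon: Tuple[float, float]  # y_i: the article's georeference

    def fetch_satellite_image(lat: float, lon: float) -> str:
        """Placeholder: download a high-resolution image centered at (lat, lon)."""
        raise NotImplementedError("imagery source is not specified in this sketch")

    def build_wikisatnet(articles: List[dict]) -> List[ArticleImagePair]:
        pairs = []
        for article in articles:
            lat, lon = article["coordinates"]   # geotag parsed from the article
            image = fetch_satellite_image(lat, lon)
            pairs.append(ArticleImagePair(article["text"], image, (lat, lon)))
        return pairs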
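For the Software Dependencies row: the paper names Doc2Vec but no library. One plausible reconstruction uses gensim's Doc2Vec; gensim itself, the 300-dimensional vectors, and the remaining hyperparameters below are assumptions for illustration, not details confirmed by the paper.

    # Hedged sketch: embedding Wikipedia articles with gensim's Doc2Vec.
    from gensim.models.doc2vec import Doc2Vec, TaggedDocument

    def train_article_embedder(article_texts):
        # gensim expects each tokenized document wrapped with a unique tag.
        corpus = [TaggedDocument(words=text.lower().split(), tags=[i])
                  for i, text in enumerate(article_texts)]
        return Doc2Vec(corpus, vector_size=300, window=5,
                       min_count=2, epochs=20, workers=4)

    # Usage: infer a fixed-length vector for an unseen article.
    # model = train_article_embedder(texts)
    # vector = model.infer_vector("city park near the river".split())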
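For the Experiment Setup row: the quoted hyperparameters (Adam, learning rate 1e-4, batch size 64) pin down the optimizer but not the framework. A minimal sketch, assuming PyTorch and the DenseNet family mentioned in the Hardware Specification row (the 121-layer variant and the cross-entropy loss are further assumptions):

    from torch import nn, optim
    from torchvision import models

    NUM_CLASSES = 66        # e.g., the 66 NAIP land-cover classes quoted above
    LEARNING_RATE = 1e-4    # reported for the ImageNet-pretrained model
    BATCH_SIZE = 64         # reported batch size

    # ImageNet-pretrained backbone with a fresh classification head.
    model = models.densenet121(weights="IMAGENET1K_V1")
    model.classifier = nn.Linear(model.classifier.in_features, NUM_CLASSES)

    optimizer = optim.Adam(model.parameters(), lr=LEARNING_RATE)
    criterion = nn.CrossEntropyLoss()

    def train_one_epoch(loader):
        # loader is assumed to yield (images, labels) batches of size 64.
        model.train()
        for images, labels in loader:
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()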