Tile2Vec: Unsupervised Representation Learning for Spatially Distributed Data
Authors: Neal Jean, Sherrie Wang, Anshul Samar, George Azzari, David Lobell, Stefano Ermon
Pages: 3967-3974
AAAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate empirically that Tile2Vec learns semantically meaningful representations for both image and non-image datasets. Our learned representations significantly improve performance in downstream classification tasks and, similarly to word vectors, allow visual analogies to be obtained via simple arithmetic in the latent space. |
| Researcher Affiliation | Academia | Neal Jean (1, 2), Sherrie Wang (3, 4), Anshul Samar (1), George Azzari (4), David Lobell (4), Stefano Ermon (1). (1) Department of Computer Science, Stanford University, Stanford, CA 94305; (2) Department of Electrical Engineering, Stanford University, Stanford, CA 94305; (3) Institute of Computational and Mathematical Engineering, Stanford University, Stanford, CA 94305; (4) Department of Earth System Science, Stanford University, Stanford, CA 94305. {nealjean, sherwang, asamar, gazzari, dlobell}@stanford.edu, ermon@cs.stanford.edu |
| Pseudocode | Yes | Algorithm 1: SampleTileTriplets(D, N, s, r) |
| Open Source Code | No | The paper includes a link to the arXiv version of the paper's appendix ("Appendix available at https://arxiv.org/abs/1805.02855") but does not provide an explicit statement or link for the source code for the methodology. |
| Open Datasets | Yes | The USDA's National Agriculture Imagery Program (NAIP) provides aerial imagery for public use... The Cropland Data Layer (CDL) is a raster geo-referenced land cover map collected by the USDA for the continental United States (USDA-NASS 2016). Offered at 30 m resolution, it includes 132 class labels... The USGS and NASA's Landsat 8 satellite provide moderate-resolution (30 m) multispectral imagery on a 16-day collection cycle. Landsat datasets are public and widely used... |
| Dataset Splits | Yes | To ensure that training and test sets are spatially disjoint, we split the area into a 12x12 grid of rectangular blocks, which we then partitioned randomly into train (104 blocks), validation (20 blocks), and test (20 blocks) (Fig. 2, right). |
| Hardware Specification | No | The paper mentions the use of a "ResNet-18 architecture" for the CNN, but it does not specify any details about the hardware (e.g., GPU model, CPU, memory, cloud instance) used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software names with version numbers (e.g., libraries, frameworks, programming languages, solvers) used in the experiments. |
| Experiment Setup | Yes | We train Tile2Vec embeddings on 100k triplets sampled from the NAIP dataset. ... The Tile2Vec CNN is a ResNet-18 architecture (He et al. 2016) modified for 28x28 CIFAR-10 images (1) with an additional residual block to handle our larger input and (2) without the final classification layer. ... We tune the two main hyperparameters of Algorithm 1 by searching over a grid of tile sizes and neighborhoods. We run the CDL land cover classification experiment 20 times in total, using combinations of tile size in [25, 50, 75, 100] and neighborhood radius in [50, 100, 500, 1000, None]... Using a margin of 50, we trained Tile2Vec for 10 trials with different random initializations... |
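
The dataset split and hyperparameter grid quoted in the table can be checked with a short sketch. This is only an illustration under stated assumptions, not the authors' code: the function names `split_blocks` and `hyperparameter_grid`, the use of (row, column) tuples to identify blocks, and the random seed are all hypothetical. The block counts and the grid values are taken from the quoted text.

```python
import itertools
import random


def split_blocks(grid_size=12, n_val=20, n_test=20, seed=0):
    """Randomly partition the blocks of a grid into train/validation/test.

    Blocks are identified by (row, col) tuples, which is an assumption made
    for illustration. The seed is also an assumption; the quoted text does
    not state one.
    """
    blocks = [(row, col) for row in range(grid_size) for col in range(grid_size)]
    rng = random.Random(seed)
    rng.shuffle(blocks)
    val = blocks[:n_val]
    test = blocks[n_val:n_val + n_test]
    train = blocks[n_val + n_test:]  # all remaining blocks
    return train, val, test


def hyperparameter_grid():
    """Enumerate every (tile size, neighborhood radius) pair from the table."""
    tile_sizes = [25, 50, 75, 100]
    radii = [50, 100, 500, 1000, None]  # None: no radius limit, as listed
    return list(itertools.product(tile_sizes, radii))


train, val, test = split_blocks()
print(len(train), len(val), len(test))  # 104 20 20
print(len(hyperparameter_grid()))  # 20
```

The counts are consistent with the quoted setup: a 12x12 grid has 144 blocks, which leaves 104 for training after 20 validation and 20 test blocks are set aside, and 4 tile sizes times 5 radius settings give the 20 classification runs.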