Earth from Space

I recently completed the 12-week Metis Data Science bootcamp in Chicago. This post focuses on my capstone project, a client-based project for DigitalGlobe, a commercial vendor of satellite imagery. It involves analyzing DigitalGlobe’s high-resolution satellite data to obtain a detailed representation of land use on Earth.

Why Satellite Images?

Satellite images are ground truth of what is happening on the planet. They can be used to detect and monitor changes on Earth’s surface due to both natural causes and human activity. The public and commercial interests are vast: from tracking landscape changes to estimating the number of vehicles in retail parking lots. Given the choice between sending field agents and analyzing satellite images, I’d say that satellite images are our future.

Mapping Land Cover

I analyzed images of the greater Las Vegas area, focusing on obtaining detailed distributions of land cover. In arid places like Vegas, a small loss of vegetation or water can have an adverse effect on the natural ecosystem. On top of that, remote places often escape the human radar, and progressive changes go unnoticed. Satellite images can help us combat that lack of knowledge.

From the perspective of computer vision, this is a standard yet challenging problem: how can we take a representative set of training images and extract informative features to learn a robust classifier that automatically maps the terrain for us? It becomes even more complex when we do not have ground truth labels (polygonal areas or masks for land cover classes such as road, grass, water, etc.). The satellites capture the ground truth snapshots of Earth; they don’t label the ground truth for us.

Labeling images by hand (the standard for academic computer vision research) is just not realistic. We need to analyze massive amounts of data in real time.

At present, the remote sensing and geospatial industry relies on third-party resources for masks and labels as ground truth. OpenStreetMap (OSM), an open-source mapping project where users can access regional maps and tag landmarks such as roads, vegetation, buildings, and other points of interest, is the most extensive public database of map features to date. Although OSM is the current industry standard, its data can be incomplete or inaccurate for a given satellite image’s coverage. This was the case for most of my images.

A prerequisite of this project was obtaining labels for regions of different land cover classes. I came up with a simple but powerful method for generating masks and labels from the data itself, which then let me use supervised machine learning for land cover classification and segmentation. Without further ado, here are the key steps and findings of my three weeks of capstone project work.

1 Classification and Segmentation

Classification and segmentation reveal different levels of detail in images. Classification aims to tag an image based on the presence or absence of a particular land cover class (e.g., is there vegetation?). Segmentation, on the other hand, aims to map or mask out the regions of land cover classes in an image (e.g., where is the vegetation?).

img_seg_mask
Figure 1: Example of a satellite image with a masked region representing vegetation.

To develop machine learning models for classification and segmentation, I need to provide ground truth masks for land cover classes. That way, the algorithm knows where the land covers are and can learn their patterns.

2 Classification using OpenStreetMap Masks and Labels

Let’s look at a couple of examples of OSM masks.

osm_mask1
Figure 2: The purple region overlaid on the image is the OpenStreetMap (OSM) mask for vegetation (combining tags such as grass, forest, wood, etc. from the OSM database).
osm_mask2
Figure 3: The entire image is tagged as residential in the OSM database.

Note in Figure 2 that the mask information does not cover the full satellite image. Also note that the masked region shows bare ground; it is possible that the mask is simply out of date and the land cover has changed since the user labeled it. In Figure 3, the entire image is tagged as residential. This is a coarse tag that does not trace the boundaries of individual land cover classes. These scenarios are very common across the entire dataset.

I used OSM masks and labels to train a convolutional neural network (CNN) to perform classification. The classification results I got were poor (see Figure 4), probably because of the inadequate ground truth information. There are many misclassifications like this, and the class assignments seem random: I was unable to see a systematic trend where a certain class is predominantly misclassified as some other class.

osm_result
Figure 4: Example result obtained from CNN classification model trained using OSM masks and labels. Many of the land cover classes are misclassified. For instance, part of the water body in the image has been misclassified as grass.
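For context, the kind of model this refers to is a small convolutional image classifier. Below is a minimal Keras sketch under an assumed input size and class list; the actual architecture used in the project is not specified here, so every choice in this snippet is illustrative:

```python
# A minimal sketch of a small CNN classifier for land cover tiles.
# The input shape and class list are assumptions, not the project's
# actual architecture.
from tensorflow.keras import layers, models

n_classes = 4  # e.g. vegetation, water, road, residential (assumed)

model = models.Sequential([
    layers.Conv2D(32, 3, activation="relu", input_shape=(256, 256, 3)),
    layers.MaxPooling2D(2),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(2),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(n_classes, activation="softmax"),  # one score per class
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(tiles, osm_labels, ...)  # trained on OSM-derived labels
```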

OSM, the industry standard, had failed me. What now?

3 Generating Masks from Satellite Data

I came up with a data-driven and reliable way to generate masks from the data itself. It does not rely on crowd-sourced or third-party resources like OpenStreetMap. The method simply uses RGB color information to distinguish between land cover types.

  • I extracted 16x16-pix patches from different objects such as grass, tree, house, road, etc. (see Figure 5).
obj_patch
Figure 5: Examples of 16x16-pix patches of different objects in a satellite image. Note that these patches are not drawn to scale.
  • Next, I defined a few classes that are differentiable in RGB space, such as vegetation, residential, road, etc., and placed objects with similar RGB colors in the same class. For instance, in the image in Figure 5, grass and tree would go in the vegetation class, while house and road would each form their own class.
  • For each patch, I defined three features: average R, average G and average B color of the patch.
  • Now that I had features and class labels, I could train a K-nearest neighbor (KNN) classification model on the training patches.
  • Finally, I applied the trained KNN model to classify the individual patches of an image (a minimal sketch of this pipeline follows below).
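Here is a minimal sketch of the pipeline just described, using scikit-learn. The function and variable names are mine, introduced for illustration; `train_patches` and `train_labels` stand in for the hand-picked example patches and their class labels:

```python
# Patch-based KNN land cover classification: average RGB features,
# 16x16 non-overlapping patches. A sketch, not the project's exact code.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

PATCH = 16  # patch size in pixels

def patch_features(patch):
    """Average R, G, and B over a patch -> a 3-element feature vector."""
    return patch.reshape(-1, 3).mean(axis=0)

def image_to_patches(image):
    """Tile an H x W x 3 image into non-overlapping 16x16 patches."""
    h, w, _ = image.shape
    return [image[r:r + PATCH, c:c + PATCH]
            for r in range(0, h - PATCH + 1, PATCH)
            for c in range(0, w - PATCH + 1, PATCH)]

def classify_image(image, train_patches, train_labels, k=5):
    """Train a KNN model on labeled patches, then label every patch of `image`."""
    X_train = np.array([patch_features(p) for p in train_patches])
    knn = KNeighborsClassifier(n_neighbors=k).fit(X_train, train_labels)
    X_new = np.array([patch_features(p) for p in image_to_patches(image)])
    return knn.predict(X_new)  # one land cover label per patch
```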

4 K-nearest Neighbor Classification Results

knn_result
Figure 6: Example of classification obtained using KNN classifier. The breakdown of land cover classes in this image is as follows: 60.61% vegetation, 19.03% water, 20.36% residential.

The classification obtained using the K-nearest neighbor (KNN) classifier is noticeably more accurate than the OSM-based CNN results. Results like this provide a comprehensive summary of land cover, which could be useful for city planners deciding where to plant more trees or undertake construction work.
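A percentage breakdown like the one in Figure 6 is straightforward to compute from the per-patch predictions. A small sketch, assuming `predicted` holds the labels returned by the hypothetical `classify_image` from the earlier snippet:

```python
from collections import Counter

def class_breakdown(predicted):
    """Percentage of patches assigned to each land cover class."""
    counts = Counter(predicted)
    total = sum(counts.values())
    return {cls: 100.0 * n / total for cls, n in counts.items()}

# e.g. {'vegetation': 60.61, 'water': 19.03, 'residential': 20.36}
```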

The best part of this approach is that it’s data-driven: it does not rely on crowd-sourced data or third-party resources like OpenStreetMap. I only had to extract 10 patches per class, so labeling took minimal effort, and classification can be done in real time!

5 Segmentation

The next step was segmentation, which builds on the KNN classification results. So far, I have segmented one land cover type, vegetation, but this can easily be extended to other classes.

For segmentation, I used U-net, a neural network that performs pixel-level classification. U-net is a relatively new addition to the neural network clan. It has shown success in segmenting biomedical images.

During training, U-net takes an image and its corresponding mask as input; once trained, its output for an image is a segmentation map (see Figure 7).

unet_input
Figure 7: From left to right: satellite image, KNN-mask representing patches of vegetation, segmentation of the vegetation class.
unet_input
Figure 8: Segmentation map showing probability of each pixel belonging to vegetation class.
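To make the architecture concrete, here is a minimal U-net-style model sketch in Keras. This is an illustrative toy under an assumed input shape, far shallower than a real U-net, and not the exact model used in this project:

```python
# A tiny U-net-style network for binary (vegetation / not) segmentation.
# Depths, widths, and the 256x256 input are illustrative assumptions.
from tensorflow.keras import layers, Model

def tiny_unet(input_shape=(256, 256, 3)):
    inputs = layers.Input(shape=input_shape)

    # Contracting path
    c1 = layers.Conv2D(16, 3, activation="relu", padding="same")(inputs)
    p1 = layers.MaxPooling2D(2)(c1)
    c2 = layers.Conv2D(32, 3, activation="relu", padding="same")(p1)
    p2 = layers.MaxPooling2D(2)(c2)

    # Bottleneck
    b = layers.Conv2D(64, 3, activation="relu", padding="same")(p2)

    # Expanding path with skip connections
    u2 = layers.UpSampling2D(2)(b)
    u2 = layers.concatenate([u2, c2])
    c3 = layers.Conv2D(32, 3, activation="relu", padding="same")(u2)
    u1 = layers.UpSampling2D(2)(c3)
    u1 = layers.concatenate([u1, c1])
    c4 = layers.Conv2D(16, 3, activation="relu", padding="same")(u1)

    # One sigmoid unit per pixel: probability of vegetation
    outputs = layers.Conv2D(1, 1, activation="sigmoid")(c4)
    return Model(inputs, outputs)

model = tiny_unet()
model.compile(optimizer="adam", loss="binary_crossentropy")
# model.fit(images, masks, ...)  # masks: the KNN-derived vegetation masks
```

The skip connections (the `concatenate` calls) are what let U-net combine coarse context from deep layers with fine spatial detail from shallow ones.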

The segmentation map is essentially a probability map: each pixel’s value is the probability that the pixel represents vegetation. Since this is a binary segmentation map, any non-zero value may be considered vegetation. This sort of mapping is useful for tracking progressive land cover changes such as deforestation, sea level rise, etc.
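As a small, hypothetical illustration, turning such a probability map into a binary vegetation mask (using the non-zero convention described above; a stricter cutoff such as 0.5 is also common) could look like:

```python
def vegetation_mask(prob_map):
    """Binarize a segmentation probability map into a vegetation mask.

    `prob_map` is assumed to be a NumPy array of per-pixel probabilities.
    Per the convention above, any non-zero probability counts as vegetation.
    """
    mask = prob_map > 0.0
    return mask, mask.mean()  # the mask, plus the fraction of vegetation pixels
```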

6 Future Directions

These are my first results. While encouraging, there are definitely ways I can improve them:

  • Including more features along with the RGB values may help distinguish between objects with similar colors (e.g., building and sand). Possible features include average grayscale intensity, HOG (histogram of oriented gradients), multiband intensities, and land-cover-specific metrics such as the Normalized Difference Vegetation Index (NDVI); see the sketch after this list.
  • So far, only non-overlapping patches have been considered for classification. Smoother classification results could be obtained by considering overlapping patches.
  • All of this can be extended to multiclass segmentation.
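For reference, NDVI contrasts the near-infrared and red bands, which vegetation reflects very differently. A minimal sketch, assuming `nir` and `red` are arrays of band intensities:

```python
import numpy as np

def ndvi(nir, red, eps=1e-8):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red).

    `eps` guards against division by zero on dark pixels.
    """
    nir, red = nir.astype(float), red.astype(float)
    return (nir - red) / (nir + red + eps)
```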

Thanks to Alan Schoen, a Metis alumnus who currently works at DigitalGlobe, for his support on this project. Alan put together a series of very comprehensive tutorials to introduce Metis students to geospatial data, remote sensing, and satellite imagery. The tutorials gave me software-carpentry-level skills and a kick start on domain knowledge.

Written on July 11, 2017