Chapter 13 Image Processing
The collection of imagery is challenging and time-consuming, with sensor design and deployment often requiring the development of new technologies. Driven by the growing demand for relevant information about environmental change, great emphasis is placed on the creation of “the next new satellite” or “a smaller, yet more powerful drone.” Although innovation in these physical technologies is critical for the advancement of image collection, it does not guarantee that the imagery will be useful. In reality, the pixel values collected by a sensor are not purely the reflectance values from the surface of interest. Depending on the wavelengths being observed, there may be variations in the amount of photons emitted from the light source and a variety of atmospheric effects, such as scattering. There may also be slight inconsistencies between images collected by the same sensor on the same day or over adjacent areas, which can affect the quality of subsequent analyses.
Learning Objectives
- Relate common issues of image collection to relevant image processing techniques
- Understand the logic supporting the use of specific image processing techniques
- Explore a variety of processing methodologies employed by published research
Key Terms
pixel, spatial resolution, temporal resolution, radiometric resolution, spectral resolution, geometric correction, atmospheric correction
13.1 Overview
The application of correction methods that address these inconsistencies is often described generally as “image processing.” In this chapter we will explore a variety of common image processing techniques and strive to understand the logic behind applying one, or more, of them to remotely sensed imagery. Before diving into specific processing workflows that render imagery scientifically useful, however, it is important to review some key terms.
First and foremost, image processing, or digital image analysis, refers to any actions taken to improve the accuracy of one or more components of raw imagery. In remote sensing science, the goal of image processing is to generate a product that provides accurate and useful information for scientific pursuit. This is in contrast to image processing for artistic purposes, which could include many similar steps but focuses on generating a product that is visually appealing.
Noise is another common term associated with image processing and it refers to any element of the data that is not wanted. There are a variety of noise types, which we will discuss later. In contrast to noise, signal describes wanted components of the imagery. Combined, signal and noise provide guidance on what specific steps should be taken in relation to the data during image processing. Before they are addressed, however, it is also important to confirm the spatial accuracy of the data.
It is also important to review the elements of an image, or raster. The terms image and raster are interchangeable and refer to a collection of spatially adjacent cells, or pixels. The terms cell and pixel are also interchangeable. An empty raster, or image, would contain pixels with no information. To say an image has been collected simply means that a sensor has collected information and stored that information in adjacent pixels. Although this seems straightforward, there are a multitude of environmental and engineering factors that can affect the accurate collection of information. It is these confounding factors that image processing attempts to resolve, in hopes of generating a data product that can be compared across space and time.
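As a small illustration of these terms, the sketch below builds an empty raster and then fills its pixels with values, using the terra package in R; the dimensions and values are arbitrary placeholders chosen for demonstration.

```r
library(terra)

# A 10 x 10 raster: a grid of spatially adjacent cells (pixels)
r <- rast(nrows = 10, ncols = 10, xmin = 0, xmax = 10, ymin = 0, ymax = 10)
hasValues(r)   # FALSE: the cells exist but hold no information yet

# "Collecting" an image amounts to storing a value in every pixel
values(r) <- runif(ncell(r))
hasValues(r)   # TRUE
```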
13.2 Geometric correction
By definition, remotely sensed data is collected by a sensor at some distance from the object(s) being observed. In many instances the sensor is in motion, and at the very least it is subject to influence from environmental actors like wind. Interactions with phenomena like wind can cause the instantaneous field of view (IFOV) to move slightly during data collection, introducing spatial errors to the imagery. On top of issues relating to sensor displacement, sensors in motion may also observe adjacent areas with different topographic properties.
A combination of these two effects can be visualized by imagining an airplane flying over a forested hillside. As the sensor collects data, the plane can be buffeted by wind, shifting the direction of the sensor’s IFOV to a location that is not the original target. At the same time, the elevation of the ground is constantly undulating, altering the distance at which the sensor observes the landscape below. Both of these issues affect the spatial components of image collection, compromising the accuracy of an image, and need to be corrected. The general term used to refer to the spatial correction of an image is geometric correction.
13.2.1 Relief displacement
The first component of geometric correction we will discuss deals with the terrain being observed. Changes in the terrain over which an image is collected lead to inconsistencies in the distances at which information is collected. These differences cause objects in the image to appear in locations that do not match reality, and they can be rectified using the spatial information of the sensor, the datum and the object. Examples of this effect can be observed in any photograph containing tall structures, which appear to lean outward from the center of the image, or principal point. The rectification of relief displacement can be represented by the equation:
\[\begin{equation} d = \frac{rh}{H} \tag{13.1} \end{equation}\]

where d = relief displacement, r = distance from the principal point to the image point of interest, h = difference in height between the datum and the point of interest, and H = the height of the sensor above the datum.
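To make the arithmetic concrete, the short sketch below applies Equation (13.1) in R to a set of hypothetical values; the distances and heights are invented for illustration only.

```r
# Relief displacement (Equation 13.1): d = r * h / H
r <- 60    # mm; distance from the principal point to the image point
h <- 150   # m; height of the point of interest above the datum
H <- 3000  # m; height of the sensor above the datum

d <- r * h / H
d  # 3 mm of displacement away from the principal point
```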
Your turn!
What is the image displacement of a pixel that is 0.5 mm from the principal point, 57 m below the datum, and collected from a sensor that is 135 m above the datum?
13.2.2 Georeferencing
The removal of inaccuracies in the spatial location of an image can be conducted using a technique called georeferencing. The basic concept of georeferencing is to alter the coordinates of an image by associating them with highly precise coordinates collected on site using a GPS. Coordinates collected in the field are often called ground points, or control points, and they form the basis of successful georeferencing. In general, increasing the number and accuracy of ground points leads to increased spatial accuracy of the image.
There are a variety of techniques used to transform the coordinates of an image based on control points, all of which require some level of mathematics. In general, the more complex the polynomial used to transform the dataset, the more closely the output coordinates can be fit to the control points. Of course, the overall accuracy of the transformation depends on how accurate the control points themselves are. Upon transforming a raster’s coordinates, it is important to evaluate how accurate the output raster is compared to the input raster. A common method of evaluating this success is calculating the square root of the mean of the squared errors, often called the Root Mean Squared Error (RMSE). In short, RMSE represents the average distance between the transformed locations and the ground control points. The smaller the RMSE, the more accurate the transformation.
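As a rough illustration of both ideas, the sketch below fits a first-order (affine) polynomial to a handful of hypothetical ground control points in base R and then reports the RMSE of the fit; the coordinates are invented for demonstration and are not tied to any real dataset.

```r
# Hypothetical control points: raw image coordinates (x, y) and their
# field-collected map coordinates (X, Y)
gcp <- data.frame(
  x = c(10, 250, 480, 30),      y = c(15, 40, 470, 460),
  X = c(500012, 500251, 500484, 500031),
  Y = c(4200478, 4200454, 4200021, 4200036)
)

# First-order polynomial (affine) transformation fitted by least squares
fit_X <- lm(X ~ x + y, data = gcp)
fit_Y <- lm(Y ~ x + y, data = gcp)

# RMSE: the square root of the mean squared distance between the
# transformed positions and the control points
rmse <- sqrt(mean(residuals(fit_X)^2 + residuals(fit_Y)^2))
rmse
```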
13.2.3 Georegistration (georectification)
Similar to georeferencing, georegistration involves adjusting the raw coordinates of an image to match more accurate ones. In the case of georegistration, however, ground points collected in the field are replaced with coordinates from a map or image that has been verified as spatially accurate. This method can be thought of as a matching of two products, enabling them to be analyzed together. An example would be matching two images collected one year apart. If the first image is georeferenced accurately, the second image can simply be georegistered by identifying shared features, such as intersections or buildings, and linking them through the creation of control points in each image.
13.2.4 Resampling
Despite all the efforts to accurately place an image in space, it is likely that any spatial alterations also change the shape or alignment of the image’s pixels. This mismatch in pixel size within the image, as well as potential changes in pixel orientation, alters our capacity to evaluate the radiation values stored within them. Imagine that a raw image is represented by a tablecloth. On top of the tablecloth are thousands of pixels, all of uniform height and width. The corners of this tablecloth each have X and Y coordinates. Now imagine that you have to match these corners to four more accurate ground points that do not align with the tablecloth. Completing this task requires you to stretch one corner of the tablecloth outwards while the other three corners are shifted inwards, altering the tablecloth’s shape and, in doing so, altering the location and shape of the raw pixels.
This lack of uniformity in pixel size and shape could compromise future image analysis and needs to be corrected. To do so, users can implement a variety of methods to reassign the spatially accurate values to spatially uniform pixels. This process of transforming an image from one set of coordinates to another is called resampling (Parker 1983).
The most important concept to understand about resampling is also the first step in the process: the creation of a new, empty raster in which all cells are of equal size and are aligned north. The transformed raster is overlaid with this empty raster, which has a user-defined cell size, before each empty cell is assigned a value based on values from the transformed raster. Confusing? Perhaps, but there are a variety of processes that can be used to assign cell values to the empty raster, and we will discuss three of them, followed by a short code sketch, in hopes of clarifying this methodology.
Nearest neighbor
Nearest neighbor (NN) is the simplest method of resampling, as it looks at only one pixel from the transformed raster. This pixel is selected based on the proximity of its center to the center of the empty cell, and its value is assigned without further transformation. The simplicity of this method makes it excellent at preserving categorical data, like land cover or aspect classes, but it struggles to capture transitions between cells and can result in output rasters that appear somewhat crude and blocky.
Bilinear interpolation
In contrast to NN resampling, bilinear interpolation (BI) uses the values of multiple neighboring cells in the transformed raster to determine the value of a single cell in the empty raster. Essentially, the four cells with centers nearest to the center of the empty cell are selected as input values. A weighted average of these four values is calculated based on their distance from the empty cell, and this averaged value becomes the value of the empty cell. Because an average is calculated, the output value is likely not identical to any of the input values, but it remains within their range. These features make BI ideal for datasets with continuous variables, like elevation, rather than categorical ones.
Cubic convolution
Similar to bilinear interpolation, cubic convolution (CC) uses multiple cells in the transformed raster to generate the output for a single cell in the empty raster. Instead of using four neighbors, however, this method uses the 16 nearest. Using 16 neighbors results in an output raster whose cell values are more similar to each other than the values of the input raster. This effect is called smoothing and is effective at removing noise, which makes CC an ideal resampling method for imagery. There is one drawback of this smoothing effect, however: the output value of a cell may fall outside the range of the 16 input cell values.
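The sketch below shows a minimal version of these three options using the resample() function from the terra package in R; the rasters, extents and cell sizes are arbitrary placeholders rather than real data.

```r
library(terra)

# A "transformed" raster holding arbitrary continuous values
transformed <- rast(nrows = 90, ncols = 90, xmin = 0, xmax = 90, ymin = 0, ymax = 90)
values(transformed) <- runif(ncell(transformed))

# A new, empty, north-aligned raster with a user-defined cell size
template <- rast(nrows = 30, ncols = 30, xmin = 0, xmax = 90, ymin = 0, ymax = 90)

# Assign values to the empty cells using each of the three methods
nn <- resample(transformed, template, method = "near")      # nearest neighbor
bi <- resample(transformed, template, method = "bilinear")  # bilinear interpolation
cc <- resample(transformed, template, method = "cubic")     # cubic convolution
```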
13.3 Atmospheric correction
Following logic similar to that supporting the need for geometric correction, atmospheric correction is intended to minimize discrepancies in pixel values within and across images that occur due to interactions between the observed radiation and the atmosphere. The severity of the atmosphere’s impact on an image relates to the amount of heterogeneity in the atmosphere during image collection. The majority of impacts are caused by the three main types of scattering, which were presented in Chapter 11.
13.3.1 Atmospheric windows
A key characteristic of the Earth’s atmosphere that impacts the collection of passive remotely sensed data is its impediment of certain wavelengths. If solar radiation of a specific wavelength cannot reach Earth’s surface, it is impossible for a sensor to detect the reflected radiance of that wavelength. There are, however, certain regions of the electromagnetic spectrum (EMS), called atmospheric windows, that are less affected by absorption and scattering than others, and it is the observation of these regions that remote sensing relies on. Even within these windows, however, a variety of atmospheric constituents can affect image quality.
13.3.2 Clouds and shadows
Two of the most common culprits in the disturbance of remotely sensed imagery are clouds and shadows. Both are relatively transient, making the prediction of their inclusion in an image difficult and rendering their effects within an image relatively inconsistent. On top of issues of presence, each introduces unique challenges for image correction.
Clouds are an inherent component of Earth’s atmosphere and therefore warrant respect and care in image processing, rather than sighs of frustration. You could imagine that a single image with 30% cloud cover may not be entirely useful, but the persistent yet transient nature of clouds means that data users must work to reduce their effects, if not remove them entirely.
When approaching the removal of clouds, it is important to recall the physics that drive Mie scattering (Chapter 11). Essentially, water droplets in the atmosphere scatter visible and near-infrared light, generating what appear to be white objects in the sky. Since the visible and near-infrared regions of the EMS fall within an atmospheric window in which many sensors detect radiation, clouds can be recorded as part of an image.
The removal of clouds is often referred to as masking and can prove challenging, depending on the region in question, because clouds also generate cloud-shadows. You could imagine a study attempting to evaluate snow cover over a landscape using imagery comprised of wavelengths in the visible region of the EMS. If there was intermittent cloud cover, cloudy areas with no snow could be classified as having snow, and snowy areas under cloud-shadows might be classified as having no snow.
Martinuzzi et al. (2007) published a masking method to remove clouds and cloud-shadows from Landsat imagery. They proposed the use of radiation values collected in the blue (Band 1) and thermal (Band 6) ranges of the EMS to differentiate clouds from objects on the Earth’s surface. To remove cloud-shadow, they suggested using information collected from Band 4, in the near-infrared region, combined with the predicted location of shadows based on the spatial relationships of the cloud, sun and sensor.
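A very simplified sketch of this kind of threshold-based masking is shown below, using the terra package in R. The synthetic band values and the thresholds are hypothetical stand-ins chosen for illustration, not the values or the full decision rules published by Martinuzzi et al. (2007).

```r
library(terra)

# Synthetic blue-band reflectance and thermal brightness temperature (K)
blue    <- rast(nrows = 40, ncols = 40, vals = runif(1600, 0, 0.6))
thermal <- rast(nrows = 40, ncols = 40, vals = runif(1600, 270, 310))

# Clouds tend to be bright in the blue band and cold in the thermal band;
# the thresholds below are hypothetical
cloud <- blue > 0.3 & thermal < 285
cloud_mask <- ifel(cloud, NA, 1)   # NA where a cloud is flagged

# Masking sets the flagged pixels to NA in any other band of interest
masked_blue <- mask(blue, cloud_mask)
```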
Although Martinuzzi et al. (2007) presented a straightforward method for masking clouds and cloud-shadows in an image, they did not address the inclusion of shadows caused by land cover structures, such as trees or buildings (Martinuzzi, Gould, and Gonzalez 2007). Shadows present unique problems for image analysis, as they can shade out underlying structures and can also be classified as separate, individual objects. The former issue presents problems for studies evaluating land cover, while the latter confounds machine learning algorithms attempting to identify unique classes in the image based on spectral similarities. Another confounding issue is that the location and size of shadows change throughout the day in accordance with the position of the sun.
An important feature of any shadow is that the shaded area is still considered to be illuminated, but only by skylight. The exclusion of direct sunlight from the area creates a unique opportunity for shadow identification and removal (Finlayson and Hordley 2001). Finlayson et al.’s method of shadow removal is complex, using derivative calculus to capitalize on the fact that an illumination-invariant function can be recognized based solely on surface reflectance. Although more complicated than Martinuzzi et al.’s approach, it may be worth reviewing (Finlayson and Hordley 2001) if you are interested in learning more about shadow removal.
13.3.3 Smoke and Haze
Smoke and haze present unique issues for image processing, as they tend to vary in presence, consistency and density. They also represent different types of scattering, with smoke causing Mie scattering and haze causing non-selective scattering. Makarau et al. (2014) demonstrated that haze can be removed through the creation of a haze thickness map (HTM) (Makarau et al. 2014). This methodology is equally as complex as that of Finlayson and Hordley (2001) and is perhaps beyond the scope of this book. It is important to note, though, that the removal of shadows, clouds, smoke and haze relies on an understanding of how their respective scattering types affect incoming solar radiation. Successful removal, then, depends on using spectral bands that are not affected by the particular type of scattering occurring within the image.
13.4 Radiometric correction
We have discussed how the creation of an image by a remote sensor leads to slight variations in spatial and atmospheric properties between pixels and that these inconsistencies must be corrected for. In this section, we will discuss some issues affecting the information within a pixel and some common remedies. In essence, we will explore how the raw digital numbers collected by a sensor can be converted to radiance and reflectance.
13.4.1 Signal-to-noise
A key concept of radiometric correction is the ratio of desired information, or signal, to background information, or noise, within a pixel. The signal-to-noise ratio (SNR) is a common way of presenting this information and provides an overall statement about image quality. A common method of calculating SNR is to divide the mean (µ) signal value of the sensor by its standard deviation (𝛔), where signal represents an optical intensity (Equation (13.2)).
\[\begin{equation} SNR = \mu_{signal} / \sigma_{signal} \tag{13.2} \end{equation}\]

Equation (13.2) shows that the mean signal value represents the quantity a sensor is designed to capture, and that increasing the signal relative to its variability improves the SNR. What remains unclear, however, is what causes a sensor to observe and record undesired noise in the first place. In reality, there are a variety of noise types that can affect the SNR of a sensor.
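For example, given a handful of hypothetical repeat measurements of a uniform target, Equation (13.2) can be applied directly in R; the values below are invented for illustration.

```r
# Hypothetical repeated measurements of a uniform target
signal <- c(102, 98, 105, 99, 101, 97, 104, 100)

# SNR (Equation 13.2): mean signal divided by its standard deviation
snr <- mean(signal) / sd(signal)
snr
```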
Photon shot noise
The first type of noise we will address is due to inconsistencies in the number of photons observed at different time intervals. Since photons are quantum particles (they are the smallest measurable amount of radiation), a sensor can only observe them in whole numbers. This, coupled with other random fluctuations in radiation, means that a sensor directed at the same location may observe 10 photons during one time interval and 12 during a later interval. Over time it is possible to determine the average number of photons observed by a sensor, but this variation creates inconsistencies across individual observations. Photon shot noise can be calculated as the square root of the signal, which means that a stronger signal leads to a relative decrease in shot noise.
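Because shot noise grows only as the square root of the signal, the shot-noise-limited SNR also grows as the square root of the signal. The short sketch below illustrates this with hypothetical photon counts.

```r
# Shot noise grows as the square root of the signal, so the
# shot-noise-limited SNR (signal / sqrt(signal)) improves as signal grows
signal     <- c(10, 100, 1000, 10000)   # hypothetical mean photon counts
shot_noise <- sqrt(signal)
data.frame(signal, shot_noise, snr = signal / shot_noise)
```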
Pattern noise
Also known as fixed-pattern noise, pattern noise arises in instruments that have multiple sensors along a single array. Pattern noise occurs when one sensor consistently collects relatively higher values than the others, resulting in heightened brightness in the pixels it creates.
Readout noise
Readout noise is created by inconsistencies arising from the interactions of the multiple electronic devices involved in making and recording a measurement. Since it is impossible to build a sensor without physical devices, readout noise is inherent in all sensors. Readout noise is therefore equal to any difference in pixel values when all sensors are exposed to identical levels of illumination. There are a variety of technical methods used to correct for this error, but the concepts and mathematics supporting them are perhaps beyond the scope of this book. If you are interested in learning more about readout noise, check out this webpage created by Michael Richmond.
Thermal noise
Another inherent type of sensor noise is thermal noise. Thermal noise occurs in any device using electricity and is caused by the vibrations of the device’s charge carriers. This means that thermal noise can never be fully removed from an image, although it can be reduced by lowering the temperature of the environment in which the sensor operates.
Your turn!
Calculate SNR or its associated values for the following Landsat sensors:
- OLI: µ = 5288.1, 𝛔 = 18.7
- TM: µ = 5.8, 𝛔 = 0.4
- ETM+: SNR = 22.3, µ = 13.4
13.4.2 Radiometric normalization
As discussed in Chapter 11, the normalization of radiometric values collected by a sensor enables the comparison of data within and across images. A common normalization output in remote sensing studies evaluating the environment is reflectance, which can be calculated simply by dividing the radiance value of a pixel observing the object(s) of interest by the radiance value of a pixel observing a surface with 100% reflectance.
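As a minimal numerical sketch of this ratio, with radiance values invented purely for illustration:

```r
# Reflectance as the ratio of target radiance to the radiance of a
# surface with 100% reflectance (hypothetical values)
target_radiance    <- 42.7    # radiance of a pixel observing the target
reference_radiance <- 155.2   # radiance of a pixel observing a 100% reflector

reflectance <- target_radiance / reference_radiance
reflectance   # unitless, between 0 and 1
```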
Case Study: An overview of Landsat Processing
The Landsat program has proven to be a great success for NASA, providing global datasets that can be used to observe Earth.
13.5 Image enhancement
So far in this chapter, we have discussed methods of correcting the spatial, atmospheric and radiometric errors that are commonly present in remotely sensed images. While the removal of these artifacts is necessary, it is also important to explore some common methods of enhancing an image once those corrections have been made. Both image stretching and sharpening have roots in spatial and radiometric correction, so keep your mind open to the inherent links that arise.
13.5.1 Stretching
Image stretching refers to the adjustment of the radiometric values of an input image to better exploit its radiometric resolution. In principle, the distribution of radiometric values within an image is altered in a manner that improves its capacity to serve a desired task. For instance, if an image collected with an 8-bit radiometric resolution (256 radiometric values; 0-255, where 0 is displayed as black and 255 as white) appears too dark, it is likely that the majority of pixel values fall well below 127 (the middle of the 8-bit scale). In this case, the lowest observed value can be mapped to 0 and the highest to 255, and the relationship defined by this change can then be applied to all other observed values, spreading them across the full radiometric range.
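A minimal sketch of this kind of linear (min-max) stretch, applied to a hypothetical raster with the terra package in R:

```r
library(terra)

# A dark 8-bit image: values only span 60-140 of the available 0-255 range
img <- rast(nrows = 50, ncols = 50, vals = sample(60:140, 2500, replace = TRUE))

# Map the lowest observed value to 0 and the highest to 255, rescaling
# every other value linearly between them
v  <- values(img)
mn <- min(v)
mx <- max(v)
stretched <- (img - mn) / (mx - mn) * 255
```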
13.6 Summary
This chapter provided an overview of common image processing techniques and discussed the logic that supports their usage. Overall, each technique strives to create imagery that is consistent across time and space in order for individual pixel values to be evaluated and/or compared. Although necessary, these processes can take time and need to be applied in accordance with the desired application. Understanding the general workflow of image processing will allow you to determine what steps should be taken to create the highest quality imagery for your research.
Reflection Questions
- Explain the difference between georeferencing and georegistration.
- Define the signal-to-noise ratio and describe two types of noise that can affect it.
- What is the role of atmospheric windows in passive remote sensing?
- How does resampling assign values to the cells of a new, spatially uniform raster?