Everything you need to know about raster files, georeferencing, metadata and the Rasterio Python library
Most aerial photos and satellite imagery are raster files.
This format is often used to represent real-world phenomena. If you work with geographic data, there is a high chance you will have to deal with it.
To use geographical raster files with Python, several theoretical concepts are required. Before jumping to the programming part, I highly recommend you follow the introductory sections.
Table of contents:
- Introduction: first concepts.
- Applications: where are rasters used?
- Colormap: discrete and continuous colormaps to visualize rasters.
- Georeferencing: CRS and Affine Transformations.
- Raster’s metadata: all the information associated with rasters.
- Rasterio: read, save, georeference and visualize raster files in Python.
Introduction
A raster consists of a matrix of cells (or pixels) organized into rows and columns, where each cell contains a value representing information
Each pixel in a geographical raster is associated with a specific geographical location. This means that if the raster has a 1 m/px resolution, each pixel covers an area of 1 m². More details about this are given in the Georeferencing section.
Moreover,
A raster contains one or more layers of the same size, called bands
Any type of numerical value can potentially be stored in a cell. Depending on the context, cells may contain integer or floating-point values in different ranges.
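As a rough illustration, a multiband raster can be pictured as a 3D NumPy array holding one 2D matrix per band. The sizes and values below are made up for the example:

```python
import numpy as np

# Hypothetical raster: 3 bands, 4 rows, 5 columns, 8-bit integer cells
n_bands, height, width = 3, 4, 5
raster_array = np.zeros((n_bands, height, width), dtype=np.uint8)

# Each band is a separate 2D matrix of the same size
raster_array[0, 1, 2] = 200   # first band, row 1, column 2
first_band = raster_array[0]

print(raster_array.shape)  # (3, 4, 5)
print(first_band.shape)    # (4, 5)
```

The bands-first layout is the same one Rasterio uses when it reads a file into an array.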
JPGs, PNGs, and Bitmaps are all raster files, but they are not considered in this guide as they are non-geographical. Geographical rasters are usually saved in the TIFF format.
Applications
As rasters can be used in various ways, here are the most common applications:
- Satellite images
- Thematic maps
- Digital elevation models (DEM)
Satellite images
Images from satellites are usually saved as multiband rasters. The electromagnetic spectrum is divided into several portions that can be sensed by a satellite. Not all of them belong to the visible spectrum; often some are in the infrared and invisible to the human eye.
A raster file perfectly suits this kind of imagery because each portion of the electromagnetic spectrum sensed by the satellite can be stored in a band.
Sentinel-2, which is one of the most popular satellites, takes images using 13 spectral bands, some from the visible spectrum and the others from the infrared. As a result, each output file is a raster with 13 bands (3 of them are Red, Green and Blue).
The following is a photo taken by Sentinel-2 (only the RGB bands are shown):
Thematic map
A thematic map is used to classify a geographical area. Each zone is associated with a specific class sharing some characteristics. For example, we can classify an agricultural area according to the type of plantations. Rasters are ideal for this task because each cell can store the integer value representing the class to which the area covered by the pixel belongs.
Below is an example of a thematic map of Lombardia, an Italian region. Each pixel stores a value between 0 and 6 depending on the class:
Digital elevation model (DEM)
A digital elevation model is used to represent surface relief. A DEM is a type of raster whose pixels contain float values: the elevation values of the surface.
What is represented here is a DEM of an area of the surface of Mars:
To allow the visualization of the first file, only the visible bands were shown, but the second and third files were not directly visualizable because the value stored in each cell is not a color but a piece of information.
The next section focuses on the workaround required to visualize this kind of raster file.
Colormap
As rasters have no constraints on the type and range of numerical values you can store, it is not always possible to show them visually. For example, the last two images shown above are single-band rasters: in the former, each pixel is an integer between 0 and 6, while in the latter, a float between -4611 and -3871. Their information does not represent a color.
To visualize these kinds of rasters we use a colormap, i.e.
A function which maps cell values to colors
Thus, when visualizing a raster through a colormap, its values are replaced by colors.
There are 2 main types of colormaps: continuous and non-continuous.
Non-continuous colormaps:
They are made by defining a piecewise function using value-color pairs.
In the thematic map example, I defined 7 value-color pairs: <0, black>, <1, red>, <2, green>, <3, orange>, <4, yellow>, <5, blue>, <6, gray>.
This method is usually used when the raster includes a small set of values.
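One possible way to apply such a colormap is a lookup table indexed by the cell value. The sketch below follows the seven pairs listed above; the exact RGB triplets and the tiny class raster are assumptions made for illustration:

```python
import numpy as np

# One RGB triplet per class value (row index = cell value).
# The shades are illustrative assumptions, not the ones used for the figure.
palette = np.array([
    [0, 0, 0],        # 0 -> black
    [255, 0, 0],      # 1 -> red
    [0, 128, 0],      # 2 -> green
    [255, 165, 0],    # 3 -> orange
    [255, 255, 0],    # 4 -> yellow
    [0, 0, 255],      # 5 -> blue
    [128, 128, 128],  # 6 -> gray
], dtype=np.uint8)

# Hypothetical 2x3 thematic raster with class values between 0 and 6
classes = np.array([[0, 1, 2],
                    [6, 5, 3]])

# Indexing the palette with the class raster replaces each value by its color
rgb_image = palette[classes]   # shape = (H, W, 3)
```

The resulting array has the bands-last layout that image viewers such as Matplotlib's imshow expect.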
Continuous colormaps:
They are made by associating the interval of the raster values with an interval of colors using a continuous function.
Usually, before applying this kind of colormap, all the raster values are scaled to the range [0, 1] using the following formula:
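A common way to perform this scaling, assumed here, is min-max normalization: subtract the minimum value and divide by the value range. A minimal NumPy sketch, where the function name and the sample array are hypothetical:

```python
import numpy as np

def min_max_scale(values):
    """Scale an array linearly so its minimum maps to 0 and its maximum to 1."""
    values = values.astype(np.float64)
    v_min, v_max = np.nanmin(values), np.nanmax(values)
    return (values - v_min) / (v_max - v_min)

# Small array spanning the elevation range of the DEM example
dem = np.array([[-4611.0, -4241.0],
                [-4000.0, -3871.0]])
scaled = min_max_scale(dem)
```

The sketch does not handle a raster whose cells all share the same value, since the value range would then be zero.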
In the grayscale colormap, values in the [0, 1] range are associated with gray values in the [0, 255] range by using a linear function:
A colorbar shows the result properly:
In this case the value 0 is represented by black, the value 1 by white and all the values between them by different shades of gray.
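The grayscale mapping can be sketched as a multiplication by 255. The rounding to 8-bit integers is an assumption made so that the result can be stored as an ordinary image:

```python
import numpy as np

def to_grayscale(scaled):
    """Map values in [0, 1] linearly to 8-bit gray levels in [0, 255]."""
    return np.round(scaled * 255).astype(np.uint8)

scaled = np.array([0.0, 0.25, 1.0])
gray = to_grayscale(scaled)   # 0 -> black, 255 -> white
```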
To better visualize a raster, we can also define an RGB colormap, in which each raster value is associated with a red, green and blue value.
The following is a popular colormap in the literature, known as Turbo:
This colormap starts with blue shades for the lowest values and ends with red shades for the highest:
In the DEM example, I used this kind of colormap to convert elevation information into colors. This is the kind of colormap used when mapping a range of values into colors (instead of a small set as in non-continuous colormaps).
Georeferencing
Each cell in a geographical raster covers a specific geographical area, and its coordinates, represented by the row and the column, can be converted into real-world geographic coordinates. The translation process uses two components: the Coordinate Reference System (CRS) and the Affine Transformation.
Before going forward, it is important to know that the Earth’s shape is approximated by a geometric figure, known as the ellipsoid of revolution or spheroid. As this figure is an approximation, several spheroids have been defined over the years using axes with different sizes.
Coordinate Reference System (CRS)
A CRS is a framework used to precisely measure locations on the surface of the Earth as coordinates
Each CRS is based on a specific spheroid; thus, if two CRSs use different spheroids, the same coordinates refer to two different locations.
CRSs can be divided into:
- Geographic coordinate systems, which use angular units (degrees). The angular distances are measured from defined origins.
- Projected coordinate systems, which are based on a geographic coordinate system. They use a spatial projection, which is a set of mathematical calculations performed to flatten the 3D data onto a 2D plane, to project the spheroid. They use linear units (feet, meters, etc.) to measure the distance (on both axes) of the location from the origin of the plane.
One of the most popular geographic coordinate systems is WGS84, also known as EPSG:4326. It is based on a spheroid with a semi-major axis (the equatorial radius) equal to 6378137 m and a semi-minor axis (the polar radius) equal to approximately 6356752 m. WGS84 uses latitude to express how far north or south a place is from the Equator and longitude to express how far east or west a place is from the Prime Meridian.
e.g. 40° 43′ 50.1960″ N, 73° 56′ 6.8712″ W is the position of New York City using latitude and longitude.
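In code, latitude and longitude are usually handled as signed decimal degrees rather than degrees, minutes and seconds. A small sketch of the conversion, using a hypothetical helper name and the common convention that South and West are negative:

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds to signed decimal degrees.

    South and West are returned as negative values.
    """
    value = degrees + minutes / 60 + seconds / 3600
    return -value if hemisphere in ("S", "W") else value

# New York City position from the example above
lat = dms_to_decimal(40, 43, 50.1960, "N")
lon = dms_to_decimal(73, 56, 6.8712, "W")
print(round(lat, 6), round(lon, 6))
```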
One of the most popular projected coordinate systems is UTM/WGS84; its zone 32N is identified by EPSG:32632. It projects the WGS84 spheroid onto a plane, and coordinates are then defined using (x, y) in meters.
The CRS used in a raster file depends on several factors, such as when the data was collected, the geographic extent of the data and the purpose of the data. Keep in mind that you can convert coordinates from one CRS to another. There are also CRSs used to georeference surfaces outside the Earth, such as the Moon and Mars.
Affine transformations
Georeferenced rasters use affine transformations to map from image coordinates to real-world coordinates (in the format defined by the CRS).
Affine transformations are used to map pixel positions into the chosen CRS coordinates
An affine transformation is any transformation that preserves collinearity (three or more points are said to be collinear if they all lie on the same straight line) and the ratios of distances between points on a line.
These are all of the affine transformations:
In georeferencing, most of the time, only scaling and translation transformations are required. Applying them with the right coefficients is what allows translating raster cell coordinates into real-world coordinates. When reading a geographical raster, these coefficients are already defined inside the metadata.
The relation used for the conversion, where col and row are the column and row indexes of a cell and (x, y) are its CRS coordinates, is:
x = A ∙ col + B ∙ row + C
y = D ∙ col + E ∙ row + F
If scale_x and scale_y are the pixel width and pixel height in CRS units (degrees, meters, feet, etc.), r is the rotation of the image in the real world, and x_origin and y_origin are the coordinates of the top-left pixel of the raster in CRS units, the parameters are:
- A = scale_x ∙ cos(r)
- B = scale_y ∙ sin(r)
- C = x_origin ∙ cos(r) + y_origin ∙ sin(r)
- D = scale_x ∙ sin(r)
- E = scale_y ∙ cos(r)
- F = x_origin ∙ sin(r) + y_origin ∙ cos(r)
Keep in mind that one or more of scale_x, scale_y, x_origin and y_origin can be negative depending on the CRS used.
As most images are north-up, and thus r = 0, the parameters can be simplified:
- A = scale_x
- B = 0
- C = x_origin
- D = 0
- E = scale_y
- F = y_origin
A and E define the scaling ratio, while C and F define the translation from the origin.
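The conversion can be sketched in plain Python, assuming the usual convention (the one followed by Rasterio's Affine objects) in which the column index multiplies A and D and the row index multiplies B and E. The function name is hypothetical, and the coefficients are those of the example raster read later in this guide:

```python
def pixel_to_crs(col, row, coefficients):
    """Apply x = A*col + B*row + C and y = D*col + E*row + F."""
    a, b, c, d, e, f = coefficients
    x = a * col + b * row + c
    y = d * col + e * row + f
    return x, y

# North-up example: 10 m pixels, top-left corner at (604410, 5016150).
# E is negative because y decreases while the row index grows.
coefficients = (10.0, 0.0, 604410.0, 0.0, -10.0, 5016150.0)

top_left = pixel_to_crs(0, 0, coefficients)
one_down_right = pixel_to_crs(1, 1, coefficients)
```

With integer indexes the result is the top-left corner of the cell; adding 0.5 to both indexes gives its center.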
Metadata
Each geographical raster has associated metadata. Here are the most important fields:
CRS
This field stores the information about the Coordinate Reference System, such as the name, unit of measurement, spheroid axes and the coordinates of the origin.
Transformation
It shops the coefficients A, B, C, D, E, F, used to map raster pixel coordinates to CRS coordinates.
Data type
It is usually known as dtype. It defines the type of data stored in the raster, such as Float32, Float64, Int32, etc.
NoData Value
Each cell of a raster must hold a value, and rasters do not support null values. If the source that generated the raster could not provide a value for some cells, they are filled using the nodata value. If the nodata value is set to 0, this means that when 0 is read it is not a value to consider, because it indicates that the source could not provide a valid value. Usually, rasters with the Float32 data type set the nodata value to ≈ -3.4028235 ∙ 10³⁸.
Width, Height and Band count
They are respectively the width of each band, the height of each band and the number of bands of the raster.
Driver
A driver provides more features to raster files. Most of the time, geographical rasters use the GTiff driver, which allows georeferencing information to be integrated into the file.
Rasterio
The best-known library for accessing geographical raster files is the Geospatial Data Abstraction Library (GDAL). It was first developed in C and C++ and then extended to Python. As a result, the Python version provides only a little abstraction over the C version. Rasterio, which is based on GDAL, tries to solve this problem by providing an easier and higher-level interface.
To install the latest Rasterio version from PyPI use:
pip install rasterio
These are the required imports:
import rasterio
from rasterio.crs import CRS
from rasterio.transform import Affine
from rasterio.enums import Resampling
from matplotlib import pyplot as plt
from mpl_toolkits.axes_grid1 import make_axes_locatable
import numpy as np
from pprint import pprint
Reading a raster
To open a raster file use:
raster = rasterio.open('raster.tiff')
To print the number of bands use:
print(raster.count)
To read the whole raster as a NumPy array use:
raster_array = raster.read() # shape = (n_bands x H x W)
Alternatively, to read only a specific band use:
first_band = raster.read(1) # shape = (H x W)
Keep in mind that the band index starts from 1.
To read all the metadata associated with a raster use:
metadata = raster.meta
pprint(metadata)
Output:
{'count': 1,
 'crs': CRS.from_epsg(32632),
 'driver': 'GTiff',
 'dtype': 'uint8',
 'height': 2496,
 'nodata': 255.0,
 'transform': Affine(10.0, 0.0, 604410.0,
       0.0, -10.0, 5016150.0),
 'width': 3072}
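When reading a raster, there may be nodata values, and it is advisable to replace them with NaN values. As NaN can only be stored in floating-point arrays, integer bands must be converted first. To do this use:
first_band = first_band.astype('float32')
first_band[first_band == metadata['nodata']] = np.nan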
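The following self-contained sketch shows the same replacement on a small made-up band and why it is useful: NaN-aware NumPy functions then skip the missing cells. The array and the nodata value of 255 are assumptions mirroring the uint8 example above:

```python
import numpy as np

# Hypothetical uint8 band where 255 marks cells without data
band = np.array([[10, 20, 255],
                 [30, 255, 40]], dtype=np.uint8)
nodata = 255

# NaN only exists for floating-point types, so cast before replacing
band = band.astype(np.float32)
band[band == nodata] = np.nan

# NaN-aware functions ignore the missing cells
mean_value = np.nanmean(band)   # mean of 10, 20, 30 and 40
```

Without the replacement, the two nodata cells would be counted as real values of 255 and distort the statistics.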
To resize a raster by a given factor, first define the output shape:
out_shape = (raster.count, int(raster.height * 1.5), int(raster.width * 1.5))
Then use:
scaled_raster = raster.read(out_shape=out_shape,
                            resampling=Resampling.bilinear)
Keep in mind that after scaling a raster, the A and E coefficients of the affine transformation must be changed to the new pixel resolution; otherwise, georeferencing will give the wrong coordinates.
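For a north-up raster, the scale coefficients (A and E) can be updated by multiplying them by the ratio between the old and the new size, since the area covered by the raster does not change. A plain-Python sketch with a hypothetical helper, using the size and coefficients of the example raster:

```python
def rescale_coefficients(coefficients, old_shape, new_shape):
    """Adapt north-up affine coefficients after resizing a raster.

    old_shape and new_shape are (height, width) tuples.
    """
    a, b, c, d, e, f = coefficients
    old_height, old_width = old_shape
    new_height, new_width = new_shape
    # The covered area is unchanged, so each pixel shrinks or grows
    new_a = a * old_width / new_width
    new_e = e * old_height / new_height
    return (new_a, b, c, d, new_e, f)

coefficients = (10.0, 0.0, 604410.0, 0.0, -10.0, 5016150.0)
old_shape = (2496, 3072)
new_shape = (int(2496 * 1.5), int(3072 * 1.5))   # (3744, 4608)

new_coefficients = rescale_coefficients(coefficients, old_shape, new_shape)
```

With Rasterio objects, the same result should be obtainable with raster.transform * raster.transform.scale(raster.width / new_width, raster.height / new_height), which is the pattern shown in the Rasterio resampling documentation.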
Visualization
To show a raster band with a colormap and a colorbar use:
fig, ax = plt.subplots()
im = ax.imshow(raster.read(1), cmap='viridis')
divider = make_axes_locatable(ax)
cax = divider.append_axes('right', size='5%', pad=0.10)
fig.colorbar(im, cax=cax, orientation='vertical')
plt.savefig('cmap_viz')
Output:
In the case of multiband satellite rasters, to show the first 3 visible bands use:
rgb_bands = raster.read([1, 2, 3]) # shape = (3, H, W)
plt.imshow(np.moveaxis(rgb_bands, 0, -1)) # imshow expects shape = (H, W, 3)
This example assumes that the RGB bands are stored, in this order, in the first three bands. If this is not the case, the indexes must be changed according to the order of the bands. Note also that imshow expects RGB values as floats in the [0, 1] range or integers in the [0, 255] range.
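Matplotlib's imshow expects RGB images with the band axis last and with values either as floats in [0, 1] or as integers in [0, 255]. The sketch below prepares a made-up bands-first array accordingly; the random data and the simple min-max stretch are assumptions for illustration:

```python
import numpy as np

# Hypothetical 3-band raster in the (bands, H, W) layout returned by read()
rng = np.random.default_rng(0)
rgb_bands = rng.integers(0, 3000, size=(3, 4, 5)).astype(np.float32)

# Move the band axis to the end: (3, H, W) -> (H, W, 3)
image = np.moveaxis(rgb_bands, 0, -1)

# Stretch the values to [0, 1], the range imshow expects for float RGB data
image = (image - image.min()) / (image.max() - image.min())
```

The resulting array can be passed directly to plt.imshow(image).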
Georeferencing
The coordinates returned by, and passed to, the following methods are expressed in the units of the raster's CRS.
To find the real-world coordinates of a pixel (i, j), where i is the row and j the column, use:
x, y = raster.xy(i, j)
To do the opposite use:
i, j = raster.index(x, y)
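By default, xy returns the coordinates of the center of the pixel. The plain-Python sketch below reproduces the arithmetic for a north-up raster with the coefficients of the example; it is an approximation for illustration, not Rasterio's actual implementation, and the function names are hypothetical:

```python
import math

# Coefficients of the north-up example raster (B = D = 0)
A, C = 10.0, 604410.0
E, F = -10.0, 5016150.0

def pixel_center(row, col):
    """CRS coordinates of the center of pixel (row, col)."""
    return A * (col + 0.5) + C, E * (row + 0.5) + F

def pixel_index(x, y):
    """Row and column of the pixel that contains the point (x, y)."""
    return math.floor((y - F) / E), math.floor((x - C) / A)

x, y = pixel_center(0, 0)
row, col = pixel_index(x, y)
```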
To show the bounds of the file:
print(raster.bounds)
Output:
BoundingBox(left=604410.0, bottom=4991190.0, right=635130.0, top=5016150.0)
The raster in this example was georeferenced by using EPSG:32632 as CRS, therefore the output coordinates are in meters.
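These bounds follow from the affine coefficients and the raster size reported in the metadata. A short check in plain Python, assuming a north-up raster and using a hypothetical helper:

```python
def raster_bounds(coefficients, width, height):
    """Bounding box (left, bottom, right, top) of a north-up raster."""
    a, _, c, _, e, f = coefficients
    left, top = c, f
    right = c + a * width
    bottom = f + e * height   # e is negative, so bottom < top
    return left, bottom, right, top

coefficients = (10.0, 0.0, 604410.0, 0.0, -10.0, 5016150.0)
bounds = raster_bounds(coefficients, width=3072, height=2496)
```

The result matches the BoundingBox printed above: 3072 columns and 2496 rows of 10 m pixels cover 30720 m by 24960 m.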
Save a raster
Step 1: Find the EPSG code of the desired CRS, then retrieve its information:
crs = CRS.from_epsg(32632)
In this example, EPSG:32632 is used, which was mentioned in the fourth section of this guide.
Step 2: Define an Affine transformation:
transformation = Affine(10.0, 0.0, 604410.0, 0.0, -10.0, 5016150.0)
The purpose of the coefficients A, B, C, D, E, F was explained in the fourth section of this guide.
Step 3: Save the raster:
# a NumPy array representing a 13-band raster
array = np.random.rand(13, 3000, 2000)
with rasterio.open(
    'output.tiff',
    'w',
    driver='GTiff',
    count=array.shape[0], # number of bands
    height=array.shape[1],
    width=array.shape[2],
    dtype=array.dtype,
    crs=crs,
    transform=transformation
) as dst:
    dst.write(array)
In case georeferencing is not required, set crs and transform to None.
There is another syntax for the write method, which writes one band at a time:
dst.write(array_1, 1)
...
dst.write(array_13, 13)
But usually, it is easier to write a single 3D array.
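If the bands are available as separate 2D arrays, they can be combined into a single 3D array with np.stack before writing. The band arrays below are made up for illustration:

```python
import numpy as np

# Hypothetical per-band arrays, all with the same (H, W) shape
height, width = 4, 5
band_arrays = [np.full((height, width), i, dtype=np.float32) for i in range(1, 14)]

# Stack them along a new first axis: 13 x (H, W) -> (13, H, W)
array = np.stack(band_arrays)

# Band k of the file corresponds to array[k - 1], as band indexes start from 1
```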
Conclusion
This guide has shown how in a raster, a set of one or more same-size matrices called bands, each cell contains a piece of information. This information changes according to the task, such as satellite imagery, thematic maps, and digital elevation models. Furthermore, depending on the application, you might need colormaps to visualize it. You also found out that by using a coordinate reference system and affine transformations, it is possible to map each cell position to real-world coordinates. Finally, you saw that Rasterio makes it easy to perform read, write, visualization and georeferencing operations. In case you need a program to open rasters, QGIS and ArcGIS are good options.
Thanks for reading, I hope you found this useful.