RSGISLib Plotting Tools

Statistical Plots

rsgislib.tools.plotting.residual_plot(y_true, residuals, out_file, out_format='PNG', title=None)

A function to create a residual plot to investigate the normality and homoscedasticity of model residuals.

Parameters
  • y_true – A numpy 1D array containing true/observed values.

  • residuals – A numpy 1D array containing model residuals.

  • out_file – Path to the output file.

  • out_format – Output format supported by matplotlib (e.g. “PNG” or “PDF”). Default: PNG

  • title – A title for the plot. Optional, if None then ignored. (Default: None)
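
The inputs can be prepared as in the minimal sketch below. It assumes residuals are defined as observed minus predicted values, a sign convention the description above does not state. The synthetic data, the `y_pred` array, the `save_residual_plot` helper and the output file name are all hypothetical.

```python
import numpy as np

# Synthetic observed values and hypothetical model predictions (illustration only).
rng = np.random.default_rng(42)
y_true = np.linspace(0.0, 100.0, 200)
y_pred = y_true + rng.normal(loc=0.0, scale=5.0, size=y_true.shape)

# Assumed convention: residual = observed - predicted.
residuals = y_true - y_pred


def save_residual_plot(out_file="residuals.png"):
    """Write the residual plot; requires RSGISLib and matplotlib to be installed."""
    from rsgislib.tools import plotting

    plotting.residual_plot(y_true, residuals, out_file, out_format="PNG",
                           title="Model residuals")
```

Calling `save_residual_plot()` writes the plot to disk. Both arrays must be one-dimensional and of equal length.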

rsgislib.tools.plotting.quantile_plot(residuals, ylabel, out_file, out_format='PNG', title=None)

A function to create a Quantile-Quantile plot to investigate the normality of model residuals.

Parameters
  • residuals – A numpy 1D array containing model residuals.

  • ylabel – A string defining a label for the y axis

  • out_file – Path to the output file.

  • out_format – Output format supported by matplotlib (e.g. “PNG” or “PDF”). Default: PNG

  • title – A title for the plot. Optional, if None then ignored. (Default: None)
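
To illustrate what a Quantile-Quantile plot compares, the sketch below computes the point pairs for a normal Q-Q plot by hand. It is not the implementation used by `quantile_plot`. The `qq_points` helper and the plotting-position formula `(i - 0.5) / n` are assumptions chosen for illustration.

```python
import numpy as np
from statistics import NormalDist


def qq_points(residuals):
    """Return (theoretical, sample) quantile pairs for a normal Q-Q plot."""
    sample = np.sort(np.asarray(residuals, dtype=float))
    n = sample.size
    # Plotting positions (i - 0.5) / n keep the probabilities strictly inside (0, 1).
    probs = (np.arange(1, n + 1) - 0.5) / n
    std_normal = NormalDist()
    theoretical = np.array([std_normal.inv_cdf(p) for p in probs])
    return theoretical, sample


theoretical, sample = qq_points([0.3, -1.2, 0.8, -0.1, 1.5])
```

If the residuals are close to normally distributed, plotting `sample` against `theoretical` gives points that lie near a straight line.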

Image Plots

rsgislib.tools.plotting.plot_image_spectra(input_img, vec_file, vec_lyr, output_plot_file, wavelengths, plot_title, scale_factor=0.1, show_refl_std=True, refl_max=None)

A utility function to extract and plot image spectra.

Parameters
  • input_img – is the input image

  • vec_file – is the region of interest file as a vector file - if multiple polygons are defined the spectra for each will be added to the plot.

  • vec_lyr – is the name of the layer within the vector file.

  • output_plot_file – is the path of the output PDF file to which the plot is written.

  • wavelengths – is a list of numbers giving the wavelength of each band (must have the same number of wavelengths as image bands)

  • plot_title – is a string with the title for the plot

  • scale_factor – is a float which converts the pixel values to percentage reflectance (0 - 100). The default is 0.1, which assumes pixel values are scaled between 0 and 1000 (the ARCSI default).

  • show_refl_std – is a boolean (default: True) specifying whether a shaded region showing 1 standard deviation from the mean is drawn on the plot alongside the mean spectra.

  • refl_max – is a parameter for setting the maximum reflectance value on the Y axis (if None the maximum value in the dataset is used).

from rsgislib.tools import plotting

inputImage = 'injune_p142_casi_sub_utm.kea'
roiFile = 'spectraROI.shp'
roiLyr = 'spectraROI'  # layer name; assumed here to match the shapefile's base name
outputPlotFile = 'SpectraPlot.pdf'
wavelengths = [446.0, 530.0, 549.0, 569.0, 598.0, 633.0, 680.0, 696.0, 714.0, 732.0, 741.0, 752.0, 800.0, 838.0]
plotTitle = "Image Spectra from CASI Image"

plotting.plot_image_spectra(inputImage, roiFile, roiLyr, outputPlotFile, wavelengths, plotTitle)
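
The effect of scale_factor and show_refl_std can be sketched with plain numpy. This is an illustration under stated assumptions and not the function's own code. It assumes the scale factor is applied by multiplication (consistent with 0.1 mapping 0-1000 onto 0-100) and that the population standard deviation is used. The `roi_pixels` values are hypothetical.

```python
import numpy as np

# Hypothetical ROI pixel values: 4 pixels x 3 bands, stored as reflectance * 1000.
roi_pixels = np.array([
    [120, 340, 560],
    [100, 360, 540],
    [140, 320, 580],
    [120, 340, 560],
], dtype=float)

scale_factor = 0.1  # 0-1000 pixel values -> 0-100 %

refl_pct = roi_pixels * scale_factor
mean_spectrum = refl_pct.mean(axis=0)  # one value per band
std_spectrum = refl_pct.std(axis=0)    # spread that a shaded region would show

# Lower and upper limits of the 1 standard deviation envelope.
lower = mean_spectrum - std_spectrum
upper = mean_spectrum + std_spectrum
```

Here the mean spectrum is 12 %, 34 % and 56 % for the three bands, and each band would be plotted against its entry in wavelengths.
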
rsgislib.tools.plotting.plot_image_comparison(inputImage1, inputImage2, img1Band, img2Band, outputPlotFile, numBins=100, img1Min=None, img1Max=None, img2Min=None, img2Max=None, img1Scale=1, img2Scale=1, img1Off=0, img2Off=0, normOutput=False, plotTitle='2D Histogram', xLabel='X Axis', yLabel='Y Axis', ctable='jet', interp='nearest')

A function to plot two images against each other.

Parameters
  • inputImage1 – is a string with the path to the first image.

  • inputImage2 – is a string with the path to the second image.

  • img1Band – is an int specifying the band in the first image to be plotted.

  • img2Band – is an int specifying the band in the second image to be plotted.

  • outputPlotFile – is a string specifying the output PDF for the plot.

  • numBins – is an int specifying the number of bins within each axis of the histogram (default: 100)

  • img1Min – is a double specifying the minimum value to be used in the histogram from image 1. If value is None then taken from the image.

  • img1Max – is a double specifying the maximum value to be used in the histogram from image 1. If value is None then taken from the image.

  • img2Min – is a double specifying the minimum value to be used in the histogram from image 2. If value is None then taken from the image.

  • img2Max – is a double specifying the maximum value to be used in the histogram from image 2. If value is None then taken from the image.

  • img1Scale – is a double specifying the scale for image 1 (Default 1).

  • img2Scale – is a double specifying the scale for image 2 (Default 1).

  • img1Off – is a double specifying the offset for image 1 (Default 0).

  • img2Off – is a double specifying the offset for image 2 (Default 0).

  • normOutput – is a boolean specifying whether the histogram should be normalised (Default: False).

  • plotTitle – is a string specifying the title of the plot (Default: ‘2D Histogram’).

  • xLabel – is a string specifying the x axis label (Default: ‘X Axis’)

  • yLabel – is a string specifying the y axis label (Default: ‘Y Axis’)

  • ctable – is a string specifying the colour table to be used (Default: ‘jet’). The available colour tables are those defined by matplotlib: http://matplotlib.org/examples/color/colormaps_reference.html

  • interp – is a string specifying the interpolation algorithm to be used (Default: ‘nearest’). Available values are ‘none’, ‘nearest’, ‘bilinear’, ‘bicubic’, ‘spline16’, ‘spline36’, ‘hanning’, ‘hamming’, ‘hermite’, ‘kaiser’, ‘quadric’, ‘catrom’, ‘gaussian’, ‘bessel’, ‘mitchell’, ‘sinc’, ‘lanczos’.

from rsgislib.tools import plotting

inputImage1 = 'LS5TM_20000613_lat10lon6217_r67p231_rad_sref_ndvi.kea'
inputImage2 = 'LS5TM_20000613_lat10lon6217_r67p231_rad_ndvi.kea'
outputPlotFile = 'ARCSI_RAD_SREF_NDVI.pdf'

plotting.plot_image_comparison(inputImage1, inputImage2, 1, 1, outputPlotFile, img1Min=-0.5, img1Max=1, img2Min=-0.5, img2Max=1, plotTitle='ARCSI SREF NDVI vs ARCSI RAD NDVI', xLabel='ARCSI SREF NDVI', yLabel='ARCSI RAD NDVI')
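
The kind of 2D histogram described by numBins, the min/max limits, the scale/offset values and normOutput can be sketched with `numpy.histogram2d`. This is an illustration only. The order of operations (scale, then offset) and the form of normalisation are assumptions, and the random arrays stand in for band values read from two images.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical pixel values standing in for one band from each image.
band1 = rng.uniform(-0.5, 1.0, size=10_000)
band2 = np.clip(band1 + rng.normal(0.0, 0.05, size=band1.size), -0.5, 1.0)

num_bins = 100
img1_scale, img1_off = 1.0, 0.0
img2_scale, img2_off = 1.0, 0.0

# Assumed order of operations: scale, then offset.
x = band1 * img1_scale + img1_off
y = band2 * img2_scale + img2_off

# Bin both axes over the same fixed range, as img1Min/img1Max and img2Min/img2Max would.
hist, x_edges, y_edges = np.histogram2d(
    x, y, bins=num_bins, range=[[-0.5, 1.0], [-0.5, 1.0]]
)

# One possible normalisation: divide by the total count so the bins sum to 1.
hist_norm = hist / hist.sum()
```

The resulting `hist` array has one cell per pair of bins and is what would be displayed as an image using the chosen colour table and interpolation.
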
rsgislib.tools.plotting.plot_image_histogram(input_img, imgBand, outputPlotFile, numBins=100, imgMin=None, imgMax=None, normOutput=False, plotTitle='Histogram', xLabel='X Axis', colour='blue', edgecolour='black', linewidth=None)

A function to plot the histogram of an image.

Parameters
  • input_img – is a string with the path to the image.

  • imgBand – is an int specifying the band in the image to be plotted.

  • outputPlotFile – is a string specifying the output PDF for the plot.

  • numBins – is an int specifying the number of bins in the histogram (default: 100)

  • imgMin – is a double specifying the minimum value to be used in the histogram from the image. If value is None then taken from the image.

  • imgMax – is a double specifying the maximum value to be used in the histogram from the image. If value is None then taken from the image.

  • normOutput – is a boolean specifying whether the histogram should be normalised (Default: False).

  • plotTitle – is a string specifying the title of the plot (Default: ‘Histogram’).

  • xLabel – is a string specifying the x axis label (Default: ‘X Axis’)

  • colour – is the colour of the bars in the plot (see the matplotlib documentation for how to specify, either a keyword or RGB values, e.g., [1.0,0,0]).

  • edgecolour – is the colour of the edges of the bars

  • linewidth – is the thickness of the edges of the bars in the plot.

from rsgislib.tools import plotting

plotting.plot_image_histogram("Baccini_Manaus_AGB_30.kea", 1, "BacciniHistogram.pdf", numBins=100, imgMin=0, imgMax=400, normOutput=True, plotTitle='Histogram of Baccini Biomass', xLabel='Baccini Biomass', colour=[1.0,0.2,1.0], edgecolour='red', linewidth=0)
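
The binning controlled by numBins, imgMin, imgMax and normOutput can be sketched with `numpy.histogram`. This is a hedged illustration. Treating normOutput as a density normalisation is an assumption, and the random `values` array stands in for pixel values read from an image band.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical biomass values standing in for a single image band.
values = rng.gamma(shape=2.0, scale=60.0, size=5_000)

img_min, img_max, num_bins = 0.0, 400.0, 100

# Raw counts: values outside (img_min, img_max) are not counted.
counts, bin_edges = np.histogram(values, bins=num_bins, range=(img_min, img_max))

# Density form: the bar areas sum to 1.
density, _ = np.histogram(values, bins=num_bins, range=(img_min, img_max), density=True)

bin_width = bin_edges[1] - bin_edges[0]
```

With a range of 0 to 400 and 100 bins, each bar covers 4 units of the x axis.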

Visualise Raster Data

rsgislib.tools.plotting.get_gdal_raster_mpl_imshow(input_img: str, bands: Optional[List[int]] = None, bbox: Optional[List[float]] = None) -> Tuple[numpy.array, List[float]]

A function which retrieves image data as an array in an appropriate structure for use within the matplotlib imshow function. The extent is also returned. Note, this function assumes that the image pixel values are within an appropriate range for display.

Parameters
  • input_img – The input image file path.

  • bands – Optional list of image bands to be selected and returned. If not provided then all bands will be read. However, note that only 3 or 1 band(s) are valid for visualisation and an error will be thrown if the number of bands is not 3 or 1.

  • bbox – Optional bbox (xmin, xmax, ymin, ymax) used to subset the input image so only data for the subset are returned.

Returns

A numpy.array of shape either [n,m,3] or [n,m], and a bbox (xmin, xmax, ymin, ymax) specifying the extent of the image data.

from rsgislib.tools.plotting import get_gdal_raster_mpl_imshow

img_sub_bbox = [554756, 577168, 9903924, 9944315]
input_img = "sen2_img_strch.kea"

img_data_arr, coords_bbox = get_gdal_raster_mpl_imshow(input_img,
                                                       bands=[8,9,3],
                                                       bbox=img_sub_bbox)


import matplotlib.pyplot as plt
fig, ax = plt.subplots()
im = ax.imshow(img_data_arr, extent=coords_bbox)
plt.show()
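
The returned bbox order (xmin, xmax, ymin, ymax) matches the (left, right, bottom, top) order that the imshow `extent` argument expects, which is why it can be passed straight through. The sketch below illustrates the array layout involved. GDAL returns multi-band data band-first while imshow needs the band axis last. Whether the function rearranges the data in exactly this way is an assumption, and the pixel size and array contents are hypothetical.

```python
import numpy as np

# Hypothetical band-first block, the layout GDAL's ReadAsArray gives for multi-band images.
n_bands, n_rows, n_cols = 3, 4, 5
band_first = np.arange(n_bands * n_rows * n_cols, dtype=float).reshape(
    n_bands, n_rows, n_cols
)

# imshow expects the band axis last: (rows, cols, bands).
band_last = np.moveaxis(band_first, 0, -1)

# Extent derived from a hypothetical north-up image origin and pixel size.
x_min, y_max, pixel_size = 554756.0, 9944315.0, 10.0
x_max = x_min + n_cols * pixel_size
y_min = y_max - n_rows * pixel_size
extent = [x_min, x_max, y_min, y_max]  # (left, right, bottom, top) for imshow
```
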
rsgislib.tools.plotting.get_gdal_thematic_raster_mpl_imshow(input_img: str, band: int = 1, bbox: Optional[List[float]] = None, out_patches=False, cls_names_lut=None) -> Tuple[numpy.array, List[float], list]

A function which retrieves thematic image data with a colour table as an array in an appropriate structure for use within the matplotlib imshow function. The image pixel values are converted from their thematic integer values to a three band array using the RGB values from the colour table. If the pixel values are required then use the get_gdal_raster_mpl_imshow function. The extent is also returned and, optionally, a list of matplotlib patches which can be used to create a legend.

Parameters
  • input_img – The input image file path.

  • band – The image band to be used for the visualisation (Default = 1).

  • bbox – Optional bbox (xmin, xmax, ymin, ymax) used to subset the input image so only data for the subset are returned.

  • out_patches – Boolean to specify whether patches should be returned to create a legend.

  • cls_names_lut – A dictionary LUT with labels for the classes. The dict key is the pixel value for the class and the value is the class label.

Returns

A numpy.array of shape [n,m,3], a bbox (xmin, xmax, ymin, ymax) specifying the extent of the image data, and a list of matplotlib patches. If out_patches=False then None is returned in place of the list of patches.

from rsgislib.tools.plotting import get_gdal_thematic_raster_mpl_imshow

img_sub_bbox = [554756, 577168, 9903924, 9944315]
input_img = "class_img.kea"

cls_names_lut = dict()
cls_names_lut[1] = "Vegetation"
cls_names_lut[2] = "Non-Veg"
cls_names_lut[3] = "Productive Veg"

(img_data_arr,
coords_bbox,
lgd_patches) = get_gdal_thematic_raster_mpl_imshow(input_img,
                                                   band=1,
                                                   bbox=img_sub_bbox,
                                                   out_patches=True,
                                                   cls_names_lut=cls_names_lut)


import matplotlib.pyplot as plt
fig, ax = plt.subplots()
ax.imshow(img_data_arr, extent=coords_bbox)
ax.legend(handles=lgd_patches)
plt.show()
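
The conversion from thematic pixel values to an RGB array can be sketched with numpy indexing. This illustrates the idea only and is not the function's own code. The class array, the colour values and the layout of `colour_table` are hypothetical.

```python
import numpy as np

# Hypothetical thematic raster with class values 0-3.
cls_arr = np.array([[0, 1, 1],
                    [2, 3, 1]], dtype=np.uint8)

# Hypothetical colour table: row index = pixel value, columns = R, G, B (0-255).
colour_table = np.array([
    [0, 0, 0],        # 0: background
    [0, 160, 0],      # 1: Vegetation
    [200, 200, 200],  # 2: Non-Veg
    [0, 255, 120],    # 3: Productive Veg
], dtype=np.uint8)

# Integer-array indexing maps every pixel value to its RGB triplet: (n, m) -> (n, m, 3).
rgb_arr = colour_table[cls_arr]
```

An array of this shape and type can be passed directly to imshow, which treats unsigned 8-bit values as 0-255 colour components.
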
rsgislib.tools.plotting.linear_stretch_np_arr(arr_data: numpy.array, no_data_val: Optional[float] = None, out_off: float = 0, out_gain: float = 1, out_int_type=False, min_out_val: float = 0, max_out_val: float = 1) -> numpy.array

A function which performs a linear stretch using the min-max values on a per band basis for a numpy array representing an image dataset. This function is useful in combination with get_gdal_raster_mpl_imshow for displaying raster data from an input image as a plot. By default this function returns values in a range 0 - 1 but if you prefer 0 - 255 then set the out_gain to 255 and the out_int_type to be True to get an 8bit unsigned integer value.

Parameters
  • arr_data – The numpy array as either [n,m,b] or [n,m] where n and m are the number of image pixels in the x and y axes and b is the number of image bands.

  • no_data_val – the no data value for the input data. If there isn’t a no data value then leave as None (default)

  • out_off – Output offset value (value * gain) + offset. Default: 0

  • out_gain – Output gain value (value * gain) + offset. Default: 1

  • out_int_type – If False (default) the output type will be float; if True the output type will be integer.

  • min_out_val – Minimum output value within the output array (default: 0)

  • max_out_val – Maximum output value within the output array (default: 1)

Returns

A numpy array with the rescaled values and the same dimensions as the input numpy array.

from rsgislib.tools.plotting import get_gdal_raster_mpl_imshow, linear_stretch_np_arr

img_sub_bbox = [554756, 577168, 9903924, 9944315]
input_img = "sen2_img_strch.kea"

img_data_arr, coords_bbox = get_gdal_raster_mpl_imshow(input_img,
                                                       bands=[8,9,3],
                                                       bbox=img_sub_bbox)

img_data_arr = linear_stretch_np_arr(img_data_arr, no_data_val=0.0)


import matplotlib.pyplot as plt
fig, ax = plt.subplots()
im = ax.imshow(img_data_arr, extent=coords_bbox)
plt.show()
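
The per-band min-max stretch can be sketched as follows. This is a minimal illustration and not the code of `linear_stretch_np_arr`. The `linear_stretch_sketch` helper is hypothetical, and setting no-data pixels to 0 in the output is an assumption, since the description above does not say how they are treated.

```python
import numpy as np


def linear_stretch_sketch(arr, no_data_val=None):
    """Rescale each band of an [n, m, b] array to 0-1 using its own min and max."""
    arr = np.asarray(arr, dtype=float)
    out = np.zeros_like(arr)
    for b in range(arr.shape[2]):
        band = arr[..., b]
        # Pixels equal to the no-data value are excluded from the statistics.
        if no_data_val is None:
            valid = np.ones(band.shape, dtype=bool)
        else:
            valid = band != no_data_val
        if not valid.any():
            continue
        b_min, b_max = band[valid].min(), band[valid].max()
        if b_max > b_min:
            out[..., b] = np.where(valid, (band - b_min) / (b_max - b_min), 0.0)
    return out


img = np.array([[[0, 10], [50, 20]],
                [[100, 30], [150, 40]]], dtype=float)
stretched = linear_stretch_sketch(img, no_data_val=0.0)

# Comparable to out_gain=255 with out_int_type=True: scale and cast to 8-bit.
stretched_8bit = (stretched * 255).astype(np.uint8)
```

Because the limits are the absolute minimum and maximum, a single extreme pixel can compress the rest of the range. The percentile and standard deviation stretches are less sensitive to this.
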
rsgislib.tools.plotting.cumulative_stretch_np_arr(arr_data: numpy.array, no_data_val: Optional[float] = None, lower: int = 2, upper: int = 98, out_off: float = 0, out_gain: float = 1, out_int_type=False, min_out_val: float = 0, max_out_val: float = 1) -> numpy.array

A function which performs a cumulative stretch using an upper and lower percentile to define the min-max values. This analysis is on a per band basis for a numpy array representing an image dataset. This function is useful in combination with get_gdal_raster_mpl_imshow for displaying raster data from an input image as a plot. By default this function returns values in a range 0 - 1 but if you prefer 0 - 255 then set the out_gain to 255 and the out_int_type to be True to get an 8bit unsigned integer value.

Parameters
  • arr_data – The numpy array as either [n,m,b] or [n,m] where n and m are the number of image pixels in the x and y axes and b is the number of image bands.

  • no_data_val – the no data value for the input data. If there isn’t a no data value then leave as None (default)

  • lower – lower percentile (default: 2)

  • upper – upper percentile (default: 98)

  • out_off – Output offset value (value * gain) + offset. Default: 0

  • out_gain – Output gain value (value * gain) + offset. Default: 1

  • out_int_type – If False (default) the output type will be float; if True the output type will be integer.

  • min_out_val – Minimum output value within the output array (default: 0)

  • max_out_val – Maximum output value within the output array (default: 1)

Returns

A numpy array with the rescaled values and the same dimensions as the input numpy array.

from rsgislib.tools.plotting import get_gdal_raster_mpl_imshow, cumulative_stretch_np_arr

img_sub_bbox = [554756, 577168, 9903924, 9944315]
input_img = "sen2_img_strch.kea"

img_data_arr, coords_bbox = get_gdal_raster_mpl_imshow(input_img,
                                                       bands=[8,9,3],
                                                       bbox=img_sub_bbox)

img_data_arr = cumulative_stretch_np_arr(img_data_arr, no_data_val=0.0)

import matplotlib.pyplot as plt
fig, ax = plt.subplots()
im = ax.imshow(img_data_arr, extent=coords_bbox)
plt.show()
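
A percentile-based stretch for a single band can be sketched with `numpy.percentile`. This is a hedged illustration and not the code of `cumulative_stretch_np_arr`. The `percentile_stretch_sketch` helper is hypothetical, and clipping values outside the percentile range to 0 and 1 is an assumption based on the min_out_val and max_out_val defaults.

```python
import numpy as np


def percentile_stretch_sketch(band, lower=2, upper=98, no_data_val=None):
    """Stretch a single [n, m] band to 0-1 between two percentiles."""
    band = np.asarray(band, dtype=float)
    if no_data_val is None:
        valid = np.ones(band.shape, dtype=bool)
    else:
        valid = band != no_data_val
    p_low, p_high = np.percentile(band[valid], [lower, upper])
    scaled = (band - p_low) / (p_high - p_low)
    # Values outside the percentile range are clipped to the output limits.
    return np.clip(scaled, 0.0, 1.0)


rng = np.random.default_rng(7)
band = rng.normal(500.0, 100.0, size=(50, 50))
band[0, 0] = 10_000.0  # a single outlier that would dominate a min-max stretch

stretched = percentile_stretch_sketch(band)
```

The outlier is clipped to 1 and roughly 96 % of the pixels keep distinct values between 0 and 1, which is why this stretch usually gives better contrast than a plain min-max stretch.
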
rsgislib.tools.plotting.stdev_stretch_np_arr(arr_data: numpy.array, no_data_val: Optional[float] = None, n_stdevs: float = 2.0, out_off: float = 0, out_gain: float = 1, out_int_type=False, min_out_val: float = 0, max_out_val: float = 1) -> numpy.array

A function which performs a standard deviation stretch using an upper limit (mean + n*std) and a lower limit (mean - n*std) to define the min-max values. This analysis is on a per band basis for a numpy array representing an image dataset. This function is useful in combination with get_gdal_raster_mpl_imshow for displaying raster data from an input image as a plot. By default this function returns values in a range 0 - 1 but if you prefer 0 - 255 then set the out_gain to 255 and the out_int_type to be True to get an 8-bit unsigned integer value.

Parameters
  • arr_data – The numpy array as either [n,m,b] or [n,m] where n and m are the number of image pixels in the x and y axes and b is the number of image bands.

  • no_data_val – the no data value for the input data. If there isn’t a no data value then leave as None (default)

  • n_stdevs – number of standard deviations to be used for the stretch. Default: 2.0

  • out_off – Output offset value (value * gain) + offset. Default: 0

  • out_gain – Output gain value (value * gain) + offset. Default: 1

  • out_int_type – If False (default) the output type will be float; if True the output type will be integer.

  • min_out_val – Minimum output value within the output array (default: 0)

  • max_out_val – Maximum output value within the output array (default: 1)

Returns

A numpy array with the rescaled values and the same dimensions as the input numpy array.

from rsgislib.tools.plotting import get_gdal_raster_mpl_imshow, stdev_stretch_np_arr

img_sub_bbox = [554756, 577168, 9903924, 9944315]
input_img = "sen2_img_strch.kea"

img_data_arr, coords_bbox = get_gdal_raster_mpl_imshow(input_img,
                                                       bands=[8,9,3],
                                                       bbox=img_sub_bbox)

img_data_arr = stdev_stretch_np_arr(img_data_arr, no_data_val=0.0)

import matplotlib.pyplot as plt
fig, ax = plt.subplots()
im = ax.imshow(img_data_arr, extent=coords_bbox)
plt.show()
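
The limits used by a standard deviation stretch can be sketched for a single band as below. This is an illustration and not the code of `stdev_stretch_np_arr`. The `stdev_stretch_sketch` helper is hypothetical, no-data handling is omitted for brevity, and the use of the population standard deviation is an assumption.

```python
import numpy as np


def stdev_stretch_sketch(band, n_stdevs=2.0):
    """Stretch a single [n, m] band to 0-1 between mean - n*std and mean + n*std."""
    band = np.asarray(band, dtype=float)
    mean, std = band.mean(), band.std()
    low = mean - n_stdevs * std
    high = mean + n_stdevs * std
    # Values beyond the limits are clipped to the ends of the output range.
    return np.clip((band - low) / (high - low), 0.0, 1.0)


band = np.array([[10.0, 20.0], [30.0, 40.0]])
stretched = stdev_stretch_sketch(band, n_stdevs=2.0)
```

The band mean always maps to the middle of the output range (0.5 here), and a smaller n_stdevs gives a harder stretch with more pixels clipped.
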
rsgislib.tools.plotting.manual_stretch_np_arr(arr_data: numpy.array, min_max_vals: Union[Dict, List[Dict]], no_data_val: Optional[float] = None, out_off: float = 0, out_gain: float = 1, out_int_type=False, min_out_val: float = 0, max_out_val: float = 1) -> numpy.array

A function which performs a linear stretch using the min-max values provided on a per band basis for a numpy array representing an image dataset. This function is useful in combination with get_gdal_raster_mpl_imshow for displaying raster data from an input image as a plot. By default this function returns values in a range 0 - 1 but if you prefer 0 - 255 then set the out_gain to 255 and the out_int_type to be True to get an 8bit unsigned integer value.

Parameters
  • arr_data – The numpy array as either [n,m,b] or [n,m] where n and m are the number of image pixels in the x and y axes and b is the number of image bands.

  • min_max_vals – either a list of dicts, each with a ‘min’ and ‘max’ key specifying the min and max value for the stretch of each band, or, if there is just a single band, a single dict rather than a list. The number of items in the list must equal the number of bands within the arr_data.

  • no_data_val – the no data value for the input data. If there isn’t a no data value then leave as None (default)

  • out_off – Output offset value (value * gain) + offset. Default: 0

  • out_gain – Output gain value (value * gain) + offset. Default: 1

  • out_int_type – If False (default) the output type will be float; if True the output type will be integer.

  • min_out_val – Minimum output value within the output array (default: 0)

  • max_out_val – Maximum output value within the output array (default: 1)

Returns

A numpy array with the rescaled values and the same dimensions as the input numpy array.

from rsgislib.tools.plotting import get_gdal_raster_mpl_imshow, manual_stretch_np_arr

img_sub_bbox = [554756, 577168, 9903924, 9944315]
input_img = "sen2_img_strch.kea"

img_data_arr, coords_bbox = get_gdal_raster_mpl_imshow(input_img,
                                                       bands=[8,9,3],
                                                       bbox=img_sub_bbox)

min_max_vals = list()
min_max_vals.append({'min':10, 'max':400})
min_max_vals.append({'min':22, 'max':300})
min_max_vals.append({'min':1, 'max':120})

img_data_arr = manual_stretch_np_arr(img_data_arr,
                                     min_max_vals,
                                     no_data_val=0.0)


import matplotlib.pyplot as plt
fig, ax = plt.subplots()
im = ax.imshow(img_data_arr, extent=coords_bbox)
plt.show()
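
How the min_max_vals structure maps onto the bands can be sketched as below. This is an illustration under stated assumptions and not the code of `manual_stretch_np_arr`. The `manual_stretch_sketch` helper is hypothetical, no-data handling is omitted, and clipping to 0-1 is assumed from the min_out_val and max_out_val defaults.

```python
import numpy as np


def manual_stretch_sketch(arr, min_max_vals):
    """Stretch each band to 0-1 using user-supplied {'min', 'max'} limits."""
    arr = np.asarray(arr, dtype=float)
    single_band = isinstance(min_max_vals, dict)
    if single_band:
        # A single dict is paired with a single-band [n, m] array.
        arr = arr[..., np.newaxis]
        min_max_vals = [min_max_vals]
    if len(min_max_vals) != arr.shape[2]:
        raise ValueError("One {'min', 'max'} dict is needed per band.")
    out = np.empty_like(arr)
    for b, limits in enumerate(min_max_vals):
        scaled = (arr[..., b] - limits["min"]) / (limits["max"] - limits["min"])
        out[..., b] = np.clip(scaled, 0.0, 1.0)
    return out[..., 0] if single_band else out


img = np.array([[[10.0, 22.0, 1.0], [400.0, 300.0, 120.0]]])  # shape (1, 2, 3)
limits = [{"min": 10, "max": 400}, {"min": 22, "max": 300}, {"min": 1, "max": 120}]
stretched = manual_stretch_sketch(img, limits)
```

Fixed limits are useful when several images need to be displayed with an identical stretch so that their colours are directly comparable.
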
rsgislib.tools.plotting.limit_range_np_arr(arr_data: numpy.array, min_thres: float = 0, min_out_val: float = 0, max_thres: float = 1, max_out_val: float = 1) -> numpy.array

A function which can be used to limit the range of the numpy array. For example, to set values less than 0 to 0 and values greater than 1 to 1.

Parameters
  • arr_data – input numpy array.

  • min_thres – the threshold for the minimum value.

  • min_out_val – the value assigned to values below the min_thres

  • max_thres – the threshold for the maximum value.

  • max_out_val – the value assigned to the values above the max_thres

Returns

numpy array with output values.
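
The thresholding described above can be sketched with `numpy.where`. This is an illustration only and not the code of `limit_range_np_arr`. The `limit_range_sketch` helper is hypothetical, and the use of strict comparisons at the thresholds is an assumption.

```python
import numpy as np


def limit_range_sketch(arr, min_thres=0.0, min_out_val=0.0, max_thres=1.0, max_out_val=1.0):
    """Replace values below min_thres and above max_thres with fixed output values."""
    arr = np.asarray(arr, dtype=float)
    out = np.where(arr < min_thres, min_out_val, arr)
    out = np.where(arr > max_thres, max_out_val, out)
    return out


data = np.array([-0.2, 0.0, 0.4, 1.0, 1.3])
limited = limit_range_sketch(data)  # values: 0.0, 0.0, 0.4, 1.0, 1.0
```

With the default arguments the result is the same as `np.clip(data, 0, 1)`. The separate output values allow out-of-range pixels to be replaced with something other than the threshold itself.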