RSGISLib Raster GIS Module

Utilities

rsgislib.rastergis.pop_rat_img_stats(clumps_img=string, add_clr_tab=boolean, calc_pyramids=boolean, ignore_zero=boolean, rat_band=int)

Populates header statistics (e.g., builds the histogram) and pyramids for thematic images. Note that this function expects the image file format to support a raster attribute table (RAT), so it will not work with formats such as GTIFF.

Parameters:
  • clumps_img – is a string containing the name of the input clump file

  • add_clr_tab – is a boolean to specify whether a colour table should be created and added (colours will be random) (Optional, default = True)

  • calc_pyramids – is a boolean to specify whether overview images (pyramids) should be created (Optional, default = True)

  • ignore_zero – is a boolean specifying whether zero should be ignored (i.e., set as a no data value). (Optional, default = True)

  • rat_band – is an optional (default = 1) integer parameter specifying the image band to which the RAT is associated.

from rsgislib import rastergis
clumps='injune_p142_casi_sub_utm_segs_nostats_addstats.kea'
pyramids=True
colourtable=True
rastergis.pop_rat_img_stats(clumps, colourtable, pyramids)

rsgislib.rastergis.collapse_rat(clumps_img, select_col, output_img, gdalformat, rat_band)

Collapses the image and rat to a set of selected rows (defined with a value of 1 in the selected column).

Parameters:
  • clumps_img – is a string containing the name of the input clump file

  • select_col – is a string containing the name of the binary column used to select the rows to which the RAT is to be collapsed.

  • output_img – is a string with the output file name

  • gdalformat – is a string with the output image file format - note only KEA and HFA support RATs.

  • rat_band – is the image band with which the RAT is associated.

rsgislib.rastergis.get_rat_length(clumps_img: str, rat_band: int = 1) -> int

A function which returns the length (i.e., number of rows) within the RAT.

Parameters:
  • clumps_img – path to the image file with the RAT

  • rat_band – the band within the image file for which the RAT is to be read.

Returns:

an int with the number of rows.

rsgislib.rastergis.get_rat_columns(clumps_img: str, rat_band: int = 1) -> List[str]

A function which returns a list of column names within the RAT.

Parameters:
  • clumps_img – path to the image file with the RAT

  • rat_band – the band within the image file for which the RAT is to be read.

Returns:

list of column names.

rsgislib.rastergis.get_rat_columns_info(clumps_img: str, rat_band: int = 1)

A function which returns a dictionary of column names with type (GFT_Integer, GFT_Real, GFT_String) and usage (e.g., GFU_Generic, GFU_PixelCount, GFU_Name, etc.) within the RAT.

Parameters:
  • clumps_img – path to the image file with the RAT

  • rat_band – the band within the image file for which the RAT is to be read.

Returns:

dict of column information.

rsgislib.rastergis.check_string_col_valid(clumps_img: str, str_col: str, rm_punc: bool = False, rm_spaces: bool = False, rm_non_ascii: bool = False, rm_dashs: bool = False)

A function which checks a string column to ensure nothing is invalid.

Parameters:
  • clumps_img – input clumps image.

  • str_col – the column to check

  • rm_punc – If True removes punctuation from column name other than dashes and underscores.

  • rm_spaces – If True removes spaces from the column name, replacing them with underscores.

  • rm_non_ascii – If True removes characters which are not in the ASCII range of characters.

  • rm_dashs – If True then dashes are removed from the column name.
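
The four flags above can be illustrated with a small pure-Python sketch. The helper clean_string is hypothetical and is not part of RSGISLib; it only mirrors the behaviour described in the parameter list, and the order in which the flags are applied is an assumption.

```python
import string


def clean_string(value, rm_punc=False, rm_spaces=False,
                 rm_non_ascii=False, rm_dashs=False):
    """Hypothetical helper mirroring the four flags described above."""
    if rm_punc:
        # Drop punctuation, but keep dashes and underscores at this stage.
        drop = set(string.punctuation) - {"-", "_"}
        value = "".join(ch for ch in value if ch not in drop)
    if rm_spaces:
        # Spaces are replaced with underscores rather than deleted.
        value = value.replace(" ", "_")
    if rm_non_ascii:
        # Keep only characters in the 7-bit ASCII range.
        value = "".join(ch for ch in value if ord(ch) < 128)
    if rm_dashs:
        value = value.replace("-", "")
    return value


print(clean_string("Mixed Forest (wet)", rm_punc=True, rm_spaces=True))
print(clean_string("semi-arid café", rm_non_ascii=True, rm_dashs=True))
```

The first call prints Mixed_Forest_wet and the second prints semiarid caf.
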

Attribute Clumps

rsgislib.rastergis.populate_rat_with_stats(input_img=string, clumps_img=string, band_stats=rsgislib.rastergis.BandAttStats, rat_band=int)

Populates an attribute table with statistics from an input values image.

Parameters:
  • input_img – is a string containing the name of the input image file from which the clumps are to be populated.

  • clumps_img – is a string containing the name of the input clumps image file

  • band_stats – is a sequence of rsgislib.rastergis.BandAttStats objects that have attributes in line with rsgis.cmds.RSGISBandAttStatsCmds:
      • band: int defining the image band to process
      • min_field: string defining the name of the field for min value
      • max_field: string defining the name of the field for max value
      • sum_field: string defining the name of the field for sum value
      • mean_field: string defining the name of the field for mean value
      • std_dev_field: string defining the name of the field for standard deviation value

  • rat_band – is an optional (default = 1) integer parameter specifying the image band to which the RAT is associated.

from rsgislib import rastergis
clumps='./TestOutputs/RasterGIS/injune_p142_casi_sub_utm_segs_popstats.kea'
input='./Rasters/injune_p142_casi_sub_utm.kea'
bs = []
bs.append(rastergis.BandAttStats(band=1, min_field='b1Min', max_field='b1Max', mean_field='b1Mean', sum_field='b1Sum', std_dev_field='b1StdDev'))
bs.append(rastergis.BandAttStats(band=2, min_field='b2Min', max_field='b2Max', mean_field='b2Mean', sum_field='b2Sum', std_dev_field='b2StdDev'))
bs.append(rastergis.BandAttStats(band=3, min_field='b3Min', max_field='b3Max', mean_field='b3Mean', sum_field='b3Sum', std_dev_field='b3StdDev'))
rastergis.populate_rat_with_stats(input, clumps, bs)

rsgislib.rastergis.populate_rat_with_cat_proportions(in_cats_img=string, clumps_img=string, out_cols_name=string, maj_col_name=string, cp_cls_names=boolean, maj_cls_name_field=string, cls_name_field=string, rat_band_clumps=int, rat_band_cats=int)

Populates the attribute table with the proportions of intersecting categories

Parameters:
  • in_cats_img – is a string containing the name of the categories (classification) image file from which the proportions are calculated

  • clumps_img – is a string containing the name of the input clump file to which the proportions are to be populated.

  • out_cols_name – is a string representing the base name for the output columns containing the proportions.

  • maj_col_name – is a string for name of the field which will hold the majority class.

  • cp_cls_names – is a boolean defining whether class names should be copied (Optional, Default = False).

  • maj_cls_name_field – is a string for the output column within the clumps image with the majority class names field (Optional, only used if cp_cls_names == True)

  • cls_name_field – is a string with the name of the column within the categories image for the class names (Optional, only used if cp_cls_names == True)

  • rat_band_clumps – is an optional (default = 1) integer parameter specifying the image band to which the RAT is associated in the clumps image.

  • rat_band_cats – is an optional (default = 1) integer parameter specifying the image band to which the RAT is associated in the categories image.
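
As an illustration of what "proportions of intersecting categories" means, the following pure-Python sketch counts, for each clump, the categories of the pixels it covers and derives the proportions and the majority category. The two toy rasters are hypothetical data, and this is not RSGISLib's implementation.

```python
from collections import Counter, defaultdict

# Toy 3 x 4 rasters on the same pixel grid (hypothetical data).
clumps = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [3, 3, 3, 2],
]
cats = [
    [1, 1, 2, 2],
    [1, 2, 2, 2],
    [3, 3, 1, 2],
]

# Count the category of every pixel falling inside each clump.
counts = defaultdict(Counter)
for clump_row, cat_row in zip(clumps, cats):
    for clump_id, cat_id in zip(clump_row, cat_row):
        counts[clump_id][cat_id] += 1

proportions = {}
majority = {}
for clump_id, cat_counts in counts.items():
    n_pxls = sum(cat_counts.values())
    # One proportion per category, analogous to one output column per category.
    proportions[clump_id] = {c: n / n_pxls for c, n in cat_counts.items()}
    # The most frequent category is the majority class for the clump.
    majority[clump_id] = cat_counts.most_common(1)[0][0]

print(proportions[1])  # {1: 0.75, 2: 0.25}
print(majority)        # {1: 1, 2: 2, 3: 3}
```
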

rsgislib.rastergis.populate_rat_with_percentiles(input_img=string, clumps_img=string, img_band=int, band_stats=rsgislib.rastergis.BandAttStats, n_hist_bins=int, rat_band=int)

Populates an attribute table with a percentile of the pixel values from an image.

Parameters:
  • input_img – is a string containing the name of the input image file

  • clumps_img – is a string containing the name of the input clump file

  • img_band – is an int which specifies the image band (from input_img) for which the stats are to be calculated

  • band_stats – is a sequence of objects that have attributes matching rsgislib.rastergis.BandAttPercentiles:
      • percentile: float defining the percentile to calculate (Valid range is 0 - 100)
      • field_name: string defining the name of the field to use for this percentile

  • n_hist_bins – is an optional (default = 200) integer specifying the number of bins within the histogram (note this governs the accuracy to which percentile can be calculated).

  • rat_band – is an optional (default = 1) integer parameter specifying the image band to which the RAT is associated.
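
The note on n_hist_bins can be illustrated with a sketch of a histogram-based percentile. The function hist_percentile is hypothetical and only demonstrates why the number of bins limits the accuracy; RSGISLib's own calculation may differ (for example, it may interpolate within a bin).

```python
def hist_percentile(values, percentile, n_bins=200):
    """Approximate a percentile (0 - 100) from a fixed-width histogram."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return float(lo)
    width = (hi - lo) / n_bins
    # Build the histogram; the maximum value goes into the last bin.
    hist = [0] * n_bins
    for v in values:
        hist[min(int((v - lo) / width), n_bins - 1)] += 1
    # Walk the cumulative counts until the target rank is reached.
    target = percentile / 100.0 * len(values)
    cumulative = 0
    for i, count in enumerate(hist):
        cumulative += count
        if cumulative >= target:
            # Report the bin centre; the result cannot be finer than a bin.
            return lo + (i + 0.5) * width
    return float(hi)


values = list(range(0, 1001))  # hypothetical pixel values 0..1000
coarse = hist_percentile(values, 50.0, n_bins=10)
fine = hist_percentile(values, 50.0, n_bins=200)
print(coarse, fine)  # 550.0 502.5
```

The true median of the toy data is 500, so 10 bins give an error of 50 while 200 bins reduce it to 2.5.
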

from rsgislib import rastergis
inputImage = './Rasters/injune_p142_casi_sub_utm.kea'
clumpsImage = './TestOutputs/RasterGIS/injune_p142_casi_sub_utm_segs_popstats.kea'
band=1
bandPercentiles = []
bandPercentiles.append(rastergis.BandAttPercentiles(percentile=25.0, field_name='B1Per25'))
bandPercentiles.append(rastergis.BandAttPercentiles(percentile=50.0, field_name='B1Per50'))
bandPercentiles.append(rastergis.BandAttPercentiles(percentile=75.0, field_name='B1Per75'))
rastergis.populate_rat_with_percentiles(inputImage, clumpsImage, band, bandPercentiles)

rsgislib.rastergis.populate_rat_with_mode(input_img=string, clumps_img=string, out_cols_name=string, use_no_data=boolean, no_data_val=long, out_no_data=boolean, mode_band=uint, rat_band=uint)

Populates the attribute table with the mode from a single band in the input image. Note this only makes sense if the input pixel values are integers.

Parameters:
  • input_img – is a string containing the name of the input image file from which the mode is calculated

  • clumps_img – is a string containing the name of the input clump file to which the mode will be populated.

  • out_cols_name – is a string representing the name for the output column containing the mode.

  • use_no_data – is a boolean defining whether the no data value should be ignored (Optional, Default = False).

  • no_data_val – is a long defining the no data value to be used (Optional, Default = 0)

  • out_no_data – is a boolean to specify that although the no data value should be used for the calculation it should not be outputted to the RAT as an output value unless there is no valid data within the clump. (Default = True)

  • mode_band – is an optional (default = 1) integer parameter specifying the image band for which the mode is to be calculated.

  • rat_band – is an optional (default = 1) integer parameter specifying the image band to which the RAT is associated in the clumps image.

rsgislib.rastergis.populate_rat_with_prop_valid_pxls(input_img=string, clumps_img=string, out_col=string, no_data_val=float, rat_band=uint)

Populates the attribute table with the proportion of valid pixels within the clump.

Parameters:
  • input_img – is a string containing the name of the input image file from which the valid pixels are to be identified

  • clumps_img – is a string containing the name of the input clump file to which the proportion will be populated.

  • out_col – is a string representing the name for the output column containing the proportion.

  • no_data_val – is a float defining the no data value to be used.

  • rat_band – is an optional (default = 1) integer parameter specifying the image band to which the RAT is associated in the clumps image.
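
A minimal sketch of the quantity being calculated, using two toy rasters on the same pixel grid. The data and the no data value are hypothetical, and this is not RSGISLib's implementation.

```python
from collections import defaultdict

NO_DATA = -999.0  # hypothetical no data value

clumps = [
    [1, 1, 2],
    [1, 2, 2],
]
values = [
    [0.2, NO_DATA, 0.7],
    [0.4, NO_DATA, NO_DATA],
]

total = defaultdict(int)
valid = defaultdict(int)
for clump_row, value_row in zip(clumps, values):
    for clump_id, value in zip(clump_row, value_row):
        total[clump_id] += 1
        if value != NO_DATA:
            valid[clump_id] += 1

# Proportion of valid pixels per clump, i.e. the value written to out_col.
prop_valid = {cid: valid[cid] / total[cid] for cid in total}
print(prop_valid)
```

Here clump 1 has two valid pixels out of three and clump 2 has one out of three.
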

rsgislib.rastergis.populate_rat_with_meanlit_stats(in_vals_img=string, clumps_img=string, mean_lit_img=string, mean_lit_band=int, mean_lit_col=string, pxl_count_col=string, band_stats=rsgislib.rastergis.BandAttStats, rat_band=int)

Populates an attribute table with statistics from an input values image where only the pixels with a band value above a defined threshold are used. This is sometimes referred to as the mean-lit statistics, i.e., the sunlit pixels within the object.

Parameters:
  • in_vals_img – is a string containing the name of the input image file from which the clumps are to be populated.

  • clumps_img – is a string containing the name of the input clumps image file

  • mean_lit_img – is a string containing the name of the input image containing the band to be used for the mean-lit stats.

  • mean_lit_band – is an unsigned integer specifying the image band to be used within mean_lit_img.

  • mean_lit_col – is a string specifying the column to be used for the ‘mean’ for each object in the mean-lit calculation

  • pxl_count_col – is a string specifying the output column in the RAT where the count for the number of pixels within each clump used for the stats is outputted.

  • band_stats – is a sequence of rsgislib.rastergis.BandAttStats objects that have attributes in line with rsgis.cmds.RSGISBandAttStatsCmds:
      • band: int defining the image band to process
      • min_field: string defining the name of the field for min value
      • max_field: string defining the name of the field for max value
      • sum_field: string defining the name of the field for sum value
      • mean_field: string defining the name of the field for mean value
      • std_dev_field: string defining the name of the field for standard deviation value

  • rat_band – is an optional (default = 1) integer parameter specifying the image band to which the RAT is associated.

from rsgislib import rastergis
inputImage = "RapidEye_20130625_lat53lon389_tid3063312_oid167771_rad_toa.kea"
segmentClumps = "RapidEye_20130625_lat53lon389_tid3063312_oid167771_rad_toa_segs.kea"
ndviImage = "RapidEye_20130625_lat53lon389_tid3063312_oid167771_rad_toa_ndvi.kea"
bandStats = []
bandStats.append(rastergis.BandAttStats(band=1, mean_field='BlueMeanML', std_dev_field='BlueStdDevML'))
bandStats.append(rastergis.BandAttStats(band=2, mean_field='GreenMeanML', std_dev_field='GreenStdDevML'))
bandStats.append(rastergis.BandAttStats(band=3, mean_field='RedMeanML', std_dev_field='RedStdDevML'))
bandStats.append(rastergis.BandAttStats(band=4, mean_field='RedEdgeMeanML', std_dev_field='RedEdgeStdDevML'))
bandStats.append(rastergis.BandAttStats(band=5, mean_field='NIRMeanML', std_dev_field='NIRStdDevML'))
rastergis.populate_rat_with_meanlit_stats(in_vals_img=inputImage, clumps_img=segmentClumps, mean_lit_img=ndviImage, mean_lit_band=1, mean_lit_col='NDVIMean', pxl_count_col='MLPxlCount', band_stats=bandStats, rat_band=1)

rsgislib.rastergis.populate_rat_with_cat_vec_lyr(clumps_img: str, out_col_name: str, vec_file: str, vec_lyr: str, vec_att_col: str = None, no_data_val: int = 0, mode_use_no_data: bool = True, tmp_dir: str = 'tmp_dir')

A function which populates a raster attribute table from a vector layer. This function rasterises the vector layer to the same pixel grid as the clumps file and then uses the mode to populate the clumps. If a vector attribute column is provided then that column will be rasterised; otherwise a binary mask will be used.

Parameters:
  • clumps_img – Input clumps file path.

  • out_col_name – the output column name.

  • vec_file – input vector file path

  • vec_lyr – input vector layer

  • vec_att_col – optional attribute within the vector layer to be used to populate the clumps image. This must be an integer variable.

  • no_data_val – the no data value to use.

  • mode_use_no_data – use the no data value when calculating the mode.

  • tmp_dir – a temporary directory for outputs. If the tmp_dir path does not exist it will be created and then deleted at the end of the function.

rsgislib.rastergis.str_class_majority(base_clumps_img, info_clumps_img, base_class_col, info_class_col, ignore_zero=True, rat_band_base=1, rat_band_info=1)

Finds the majority class (from a string field) of a set of small objects and assigns it to the large objects.

Parameters:
  • base_clumps_img – is the base clumps file, to be attributed.

  • info_clumps_img – is the file to take attributes from.

  • base_class_col – the output column name in the base clumps file.

  • info_class_col – is the column name in the info clumps file.

  • ignore_zero – is a boolean specifying if zeros should be ignored in input layer. If set to false values of 0 will be included when calculating the class majority, otherwise the majority calculation will only consider objects with a value greater than 0.

  • rat_band_base – is an optional (default = 1) integer parameter specifying the image band to which the RAT is associated in the base clumps.

  • rat_band_info – is an optional (default = 1) integer parameter specifying the image band to which the RAT is associated in the info clumps.

from rsgislib import rastergis
clumps='./TestOutputs/RasterGIS/injune_p142_casi_sub_utm_segs_popstats.kea'
classRAT='./TestOutputs/RasterGIS/reInt_rat.kea'
rastergis.str_class_majority(clumps, classRAT, 'class_dst', 'class_src')

rsgislib.rastergis.define_class_names(clumps_img: str, class_num_col: str, class_name_col: str, class_names_dict: dict)

A function to create a class names column in a RAT based on segmented clumps, where a number of clumps have the same class number.

Parameters:
  • clumps_img – input clumps image.

  • class_num_col – column specifying the class number (e.g., where clumps are segments in a segmentation)

  • class_name_col – the output column name where a string will be created if it doesn’t already exist.

  • class_names_dict – Dictionary to look up the class names. The key needs to be the integer number for the class.
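
The shape of class_names_dict, and the lookup it implies, can be sketched without RSGISLib as below. The class names, the column values, the file name and the column names in the commented call are all hypothetical, and leaving unknown class numbers as an empty string is an assumption of this sketch.

```python
# Hypothetical lookup: integer class number -> class name.
class_names_dict = {1: "Forest", 2: "Water", 3: "Grassland"}

# Hypothetical class number column, one entry per clump (row 0 is no data).
class_num_col = [0, 1, 1, 3, 2, 1, 3]

# Several clumps share a class number, so they receive the same name.
# Numbers missing from the dictionary are left as an empty string here,
# which is an assumption of this sketch.
class_name_col = [class_names_dict.get(num, "") for num in class_num_col]
print(class_name_col)

# With RSGISLib installed, the documented call would take the same dictionary:
# rsgislib.rastergis.define_class_names(
#     "clumps.kea", "class_num", "class_names", class_names_dict)
```
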

rsgislib.rastergis.set_column_data(clumps_img: str, col_name: str, col_data: array)

A function to write a column of data to a RAT.

Parameters:
  • clumps_img – Input clumps image

  • col_name – Name of the column to be written.

  • col_data – Data to be written to the column.

rsgislib.rastergis.create_uid_col(clumps_img: str, col_name: str = 'UID')

A function which adds a unique ID value (starting at 0) to each clump within a RAT.

Parameters:
  • clumps_img – Input clumps image

  • col_name – The output column name (default is UID).

Calculate Spatial Relationships

rsgislib.rastergis.calc_dist_to_classes(clumps_img: str, class_col: str, out_img_base: str, tmp_dir: str = 'tmp', tile_size: int = 2000, max_dist: int = 1000, no_data_val: int = 1000, n_cores: int = -1)

A function which will calculate proximity rasters for a set of classes defined within the RAT.

Parameters:
  • clumps_img – is a string specifying the input image with the associated RAT

  • class_col – is the column in the RAT which has the classification

  • out_img_base – is the base name of the output image - output files will be KEA files.

  • tmp_dir – is a directory to be used for storing the image tiles and other temporary files - if the directory does not exist it will be created and deleted on completion (Default: tmp).

  • tile_size – is an int specifying in pixels the size of the image tiles used for processing (Default: 2000)

  • max_dist – is the maximum distance in units of the geographic units of the projection of the input image (Default: 1000).

  • no_data_val – is the value applied to the pixels outside of the max_dist threshold (Default: 1000; i.e., the same as max_dist).

  • n_cores – is the number of processing cores which are available to be used for this processing. If -1 all available cores will be used. (Default: -1)

rsgislib.rastergis.calc_dist_between_clumps(clumps_img: str, out_col_name: str, tmp_dir: str = 'tmp', use_idx: bool = False, max_dist_thres: float = 10)

Calculate the distance between all clumps

Parameters:
  • clumps_img – image clumps for which the distance will be calculated.

  • out_col_name – output column within the clumps image.

  • tmp_dir – directory to which temporary files will be written.

  • use_idx – use a spatial index when calculating the distance between clumps (needed for a large number of clumps).

  • max_dist_thres – if using an index then an upper limit on the distance between clumps can be defined.

rsgislib.rastergis.calc_dist_to_large_clumps(clumps_img: str, out_col_name: str, size_thres: float, tmp_dir: str = 'tmp', use_idx: bool = False, max_dist_thres: float = 10)

Calculate the distance from each small clump to a large clump. Split defined by the threshold provided.

Parameters:
  • clumps_img – image clumps for which the distance will be calculated.

  • out_col_name – output column within the clumps image.

  • size_thres – is a threshold to separate the sets of large and small clumps.

  • tmp_dir – directory to which temporary files will be written.

  • use_idx – use a spatial index when calculating the distance between clumps (needed for a large number of clumps).

  • max_dist_thres – if using an index then an upper limit on the distance between clumps can be defined.

rsgislib.rastergis.calc_border_length(clumps_img, out_col, ignore_zero_edges)

Calculate the border length of clumps

Parameters:
  • clumps_img – is a string containing the name of the input image file

  • out_col – is a string with the output column name

  • ignore_zero_edges – is a boolean specifying whether zero edges (i.e., no data) should be ignored

rsgislib.rastergis.calc_rel_border(clumps_img, out_col, class_names_col, class_name, ignore_zero_edges)

Calculates the relative border length of the clumps to a class

Parameters:
  • clumps_img – is a string containing the name of the input image file

  • out_col – is a string specifying the output column name

  • class_names_col – is a string specifying the column which holds the class names

  • class_name – is a string specifying the class for which the relative border is to be calculated.

  • ignore_zero_edges – is a boolean specifying whether zero edges (i.e., no data) should be ignored
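
One way to read "relative border length" is the fraction of a clump's border that is shared with clumps of the given class. The sketch below works on a toy clumps raster under that reading. The function rel_border_to_class is hypothetical, and counting the border in pixel edges with 4-connectivity, as well as treating the image edge like a zero edge, are assumptions rather than confirmed RSGISLib behaviour.

```python
def rel_border_to_class(clumps, clump_class, class_name, ignore_zero_edges=True):
    """Fraction of each clump's border that is shared with `class_name`."""
    n_rows, n_cols = len(clumps), len(clumps[0])
    total = {}
    shared = {}
    for r in range(n_rows):
        for c in range(n_cols):
            cid = clumps[r][c]
            if cid == 0:
                continue
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                inside = 0 <= rr < n_rows and 0 <= cc < n_cols
                nid = clumps[rr][cc] if inside else 0
                if nid == cid:
                    continue  # interior edge, not part of the border
                if nid == 0 and ignore_zero_edges:
                    continue  # edge against no data or the image edge
                total[cid] = total.get(cid, 0) + 1
                if clump_class.get(nid) == class_name:
                    shared[cid] = shared.get(cid, 0) + 1
    return {cid: shared.get(cid, 0) / n for cid, n in total.items()}


clumps = [
    [1, 1, 2],
    [1, 3, 2],
    [3, 3, 2],
]
clump_class = {1: "Forest", 2: "Water", 3: "Forest"}
rel_border = rel_border_to_class(clumps, clump_class, "Water")
print(rel_border)  # {1: 0.25, 2: 0.0, 3: 0.4}
```
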

rsgislib.rastergis.calc_rel_diff_neigh_stats(clumps_img, field_stats, use_abs_diff, rat_band)

Calculates the difference (relative or absolute) between each clump and its neighbours. The differences can be summarised as min, max, mean, std dev or sum.

Parameters:
  • clumps_img – is a string containing the name of the input clump file

  • field_stats – has the following fields:
      • field: string defining the field in the RAT to compare to.
      • min_field: string defining the name of the field for min value
      • max_field: string defining the name of the field for max value
      • sum_field: string defining the name of the field for sum value
      • mean_field: string defining the name of the field for mean value
      • std_dev_field: string defining the name of the field for standard deviation value

  • use_abs_diff – calculate the absolute difference.

  • rat_band – is the image band with which the RAT is associated.

import rsgislib.rastergis
inputImage = './RapidEye_20130625_lat53lon389_tid3063312_oid167771_rad_toa_segs_neigh.kea'
ratBand = 1
rsgislib.rastergis.find_neighbours(inputImage, ratBand)
fieldInfo = rsgislib.rastergis.FieldAttStats(field='NIRMean', min_field='MinNIRMeanDiff', max_field='MaxNIRMeanDiff')
rsgislib.rastergis.calc_rel_diff_neigh_stats(inputImage, fieldInfo, False, ratBand)

rsgislib.rastergis.define_border_clumps(clumps_img, out_col)

Defines, using a mask, the clumps which are on the border of the clumps file.

Parameters:
  • clumps_img – is a string containing the name of the input clump file

  • out_col – is a string containing the name of the output column where a value of 1 indicates a border clump

rsgislib.rastergis.define_clump_tile_positions(clumps_img, tile_img, out_col, tile_overlap, tile_boundary, tile_body)

Defines the position of the clumps within the file.

Parameters:
  • clumps_img – is a string containing the name of the input clump file

  • tile_img – is a string containing the name of the input tile image

  • out_col – is a string containing the name of the output column

  • tile_overlap – is an unsigned int defining the overlap between tiles

  • tile_boundary – is an unsigned int

  • tile_body – is an unsigned int

rsgislib.rastergis.find_boundary_pixels(clumps_img, output_img, gdalformat, rat_band)

Identifies the pixels on the boundary of the clumps

Parameters:
  • clumps_img – is a string containing the name of the input image file

  • output_img – is a string containing the name of the output file

  • gdalformat – is a string containing the GDAL format for the output file - (Optional, Default = ‘KEA’)

  • rat_band – is an int containing the band for which the neighbours are to be calculated (Optional, Default = 1)

rsgislib.rastergis.find_neighbours(clumps_img, rat_band)

Finds the clump neighbours from an image

Parameters:
  • clumps_img – is a string containing the name of the input image file

  • rat_band – is an int containing the band for which the neighbours are to be calculated (Optional, Default = 1)
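
To make the idea of clump neighbours concrete, the sketch below derives an adjacency list from a toy clumps raster. The function find_clump_neighbours is hypothetical, and 4-connectivity (shared pixel edges) with clump 0 treated as no data is an assumption of this sketch, not a statement about how RSGISLib defines neighbours.

```python
def find_clump_neighbours(clumps):
    """Return {clump_id: sorted list of adjacent clump ids}."""
    n_rows, n_cols = len(clumps), len(clumps[0])
    neighbours = {}
    for r in range(n_rows):
        for c in range(n_cols):
            cid = clumps[r][c]
            if cid == 0:
                continue
            neighbours.setdefault(cid, set())
            # Looking right and down visits every shared pixel edge once.
            for rr, cc in ((r, c + 1), (r + 1, c)):
                if rr < n_rows and cc < n_cols:
                    nid = clumps[rr][cc]
                    if nid != 0 and nid != cid:
                        neighbours[cid].add(nid)
                        neighbours.setdefault(nid, set()).add(cid)
    return {cid: sorted(ids) for cid, ids in neighbours.items()}


clumps = [
    [1, 1, 2],
    [1, 1, 2],
    [3, 3, 0],
]
print(find_clump_neighbours(clumps))  # {1: [2, 3], 2: [1], 3: [1]}
```
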

rsgislib.rastergis.clumps_spatial_location(clumps_img=string, eastings=string, northings=string, rat_band=int)

Adds spatial location columns to the attribute table

Parameters:
  • clumps_img – is a string containing the name of the input image file

  • eastings – is a string containing the name of the eastings field

  • northings – is a string containing the name of the northings field

  • rat_band – is an integer containing the band number for the RAT (Optional, default = 1)

from rsgislib import rastergis
image = 'injune_p142_casi_sub_utm_segs_spatloc_eucdist.kea'
eastings = 'Easting'
northings = 'Northing'
rastergis.clumps_spatial_location(image, eastings, northings)

rsgislib.rastergis.clumps_spatial_extent(clumps_img=string, min_xx=string, min_xy=string, max_xx=string, max_xy=string, min_yx=string, min_yy=string, max_yx=string, max_yy=string, rat_band=int)

Adds spatial extent for each clump to the attribute table

Parameters:
  • clumps_img – is a string containing the name of the input image file

  • min_xx – is a string containing the name of the min X X field

  • min_xy – is a string containing the name of the min X Y field

  • max_xx – is a string containing the name of the max X X field

  • max_xy – is a string containing the name of the max X Y field

  • min_yx – is a string containing the name of the min Y X field

  • min_yy – is a string containing the name of the min Y Y field

  • max_yx – is a string containing the name of the max Y X field

  • max_yy – is a string containing the name of the max Y Y field

  • rat_band – is an integer containing the band number for the RAT (Optional, default = 1)

from rsgislib import rastergis
image = 'injune_p142_casi_sub_utm_segs_spatloc_eucdist.kea'
minX_X = 'minXX'
minX_Y = 'minXY'
maxX_X = 'maxXX'
maxX_Y = 'maxXY'
minY_X = 'minYX'
minY_Y = 'minYY'
maxY_X = 'maxYX'
maxY_Y = 'maxYY'
rastergis.clumps_spatial_extent(image, minX_X, minX_Y, maxX_X, maxX_Y, minY_X, minY_Y, maxY_X, maxY_Y)

Read RAT

rsgislib.rastergis.get_column_data(clumps_img: str, col_name: str) -> array

A function to read a column of data from a RAT.

Parameters:
  • clumps_img – Input clumps image

  • col_name – Name of the column to be read.

Returns:

numpy array with values from the clumps_img

rsgislib.rastergis.read_rat_neighbours(clumps_img: str, start_row: int = None, end_row: int = None, rat_band: int = 1) -> List[List[int]]

A function which returns a list of clump neighbours from a KEA RAT. Note, the neighbours are populated using the function rsgislib.rastergis.find_neighbours. By default the whole dataset of neighbours is read into memory, but the start_row and end_row variables can be used to read a subset of the RAT.

Parameters:
  • clumps_img – path to the image file with the RAT

  • start_row – the row within the RAT to start reading, if None will start at 0 (Default: None).

  • end_row – the row within the RAT to end reading, if None will end at n_rows within the RAT. (Default: None)

  • rat_band – the band within the image file for which the RAT is to be read.

Returns:

list of lists with neighbour indexes.
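
For a large RAT, the start_row and end_row parameters suggest reading the neighbours in blocks. The sketch below only generates the row ranges; row_chunks is a hypothetical helper, and treating end_row as exclusive is an assumption based on the default of "end at n_rows". The RSGISLib calls are shown as comments so the block runs without the library.

```python
def row_chunks(n_rows, chunk_size):
    """Yield (start_row, end_row) pairs covering rows 0 .. n_rows."""
    for start_row in range(0, n_rows, chunk_size):
        yield start_row, min(start_row + chunk_size, n_rows)


chunks = list(row_chunks(n_rows=10, chunk_size=4))
print(chunks)  # [(0, 4), (4, 8), (8, 10)]

# With RSGISLib installed, each pair could be passed to the documented call,
# with n_rows taken from rsgislib.rastergis.get_rat_length(clumps_img):
# for start_row, end_row in chunks:
#     neighbours = rsgislib.rastergis.read_rat_neighbours(
#         clumps_img, start_row=start_row, end_row=end_row)
```
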

Sampling

rsgislib.rastergis.histo_sampling(clumps_img=string, val_col=string, out_sel_col=string, prop_sample=float, bin_width=float, cls_col=string, class_val=string, rat_band=int)

This function performs a histogram based sampling of the RAT for a specific column. The output is a binary column within the RAT where rows with a value of 1 are the selected clumps.

Parameters:
  • clumps_img – is a string containing the name of the input clumps image file

  • val_col – is a string containing the name of the field with the values used for the sampling.

  • out_sel_col – is a string containing the name of the field where the binary output will be written (1 for selected clumps).

  • prop_sample – is a float specifying the proportion of the dataset which should be within the outputted sample. Values range from 0 to 1; 0.5 would be a 50% sample.

  • bin_width – is a float specifying the width of each histogram bin.

  • cls_col – is a string specifying a field within which classes have been defined. This can be used to only apply the sampling to a thematic subset of the RAT. If set as None then this is ignored. (Default = None)

  • class_val – is a string specifying the class it will be limited to.

  • rat_band – is an optional (default = 1) integer parameter specifying the image band to which the RAT is associated.

from rsgislib import rastergis

rastergis.histo_sampling(clumps_img='N00E103_10_grid_knn.kea', val_col='HH', out_sel_col='HHSampling', prop_sample=0.25, bin_width=0.01, cls_col='Class', class_val='2')

rsgislib.rastergis.take_random_sample(clumps_img: str, in_col_name: str, in_col_val: float, out_col_name: str, sample_ratio: float, rnd_seed: int = 0)

A function to take a random sample of an input column.

Parameters:
  • clumps_img – clumps image.

  • in_col_name – input column name.

  • in_col_val – numeric value for which the random sample is to be taken.

  • out_col_name – output column where value of 1 is selected within the random sample and 0 is not selected.

  • sample_ratio – the size of the sample (0 - 1.0; i.e., 10% = 0.1) to be taken of the number of rows within input value.

  • rnd_seed – is the seed for the random number generation (optional; default is 0).
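
The sampling described above can be sketched in pure Python as follows. The helper random_sample_column is hypothetical, the sample size is truncated to an integer as an assumption, and the rows chosen for a given seed will not match those chosen by RSGISLib.

```python
import random


def random_sample_column(in_col, in_col_val, sample_ratio, rnd_seed=0):
    """Return a 0/1 column flagging a random subset of the matching rows."""
    matching = [i for i, v in enumerate(in_col) if v == in_col_val]
    n_sample = int(len(matching) * sample_ratio)
    rng = random.Random(rnd_seed)  # a fixed seed makes the sample repeatable
    selected = set(rng.sample(matching, n_sample))
    return [1 if i in selected else 0 for i in range(len(in_col))]


in_col = [1, 2, 1, 1, 2, 1, 1, 1, 2, 1, 1, 1]  # hypothetical class column
out_col = random_sample_column(in_col, in_col_val=1, sample_ratio=0.5)
print(sum(out_col))  # 4 of the 9 rows with value 1 are selected
```
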

rsgislib.rastergis.select_clumps_on_grid(clumps_img, in_sel_col, out_sel_col, eastings_col, northings_col, metric_col, method, rows, cols)

Selects a segment within a regular grid pattern across the scene. The clump is selected based on the minimum, maximum or closest to the mean.

Parameters:
  • clumps_img – is a string containing the name of the input clump file

  • in_sel_col – is a string which defines the column name where a value of 1 defines the clumps which will be included in the analysis.

  • out_sel_col – is a string which defines the column name where a value of 1 defines the clumps selected by the analysis.

  • eastings_col – is a string which defines a column with an easting for each clump.

  • northings_col – is a string which defines a column with a northing for each clump.

  • metric_col – is a string which defines a column with a value for each clump which will be used for the distance, min, or max analysis.

  • method – is a string which defines whether the minimum, maximum or mean method of selecting a clump will be used (values can be either min, max or mean).

  • rows – is an unsigned integer which defines the number of rows within which a clump will be selected.

  • cols – is an unsigned integer which defines the number of columns within which a clump will be selected.
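
The grid-based selection can be sketched as below: clumps are grouped by the grid cell their location falls in, and one clump is chosen per cell according to the method. The function select_on_grid and the dictionary keys are hypothetical, the in_sel_col filter is omitted, and deriving the grid from the extent of the clump locations is an assumption of this sketch.

```python
def select_on_grid(clumps, rows, cols, method="min"):
    """clumps: list of dicts with 'easting', 'northing' and 'metric' keys.

    Returns the indexes of the clumps selected, one per non-empty grid cell.
    """
    xs = [c["easting"] for c in clumps]
    ys = [c["northing"] for c in clumps]
    min_x, min_y = min(xs), min(ys)
    cell_w = (max(xs) - min_x) / cols or 1.0
    cell_h = (max(ys) - min_y) / rows or 1.0

    # Group clump indexes by the grid cell their location falls in.
    cells = {}
    for i, c in enumerate(clumps):
        col = min(int((c["easting"] - min_x) / cell_w), cols - 1)
        row = min(int((c["northing"] - min_y) / cell_h), rows - 1)
        cells.setdefault((row, col), []).append(i)

    selected = []
    for members in cells.values():
        metrics = [clumps[i]["metric"] for i in members]
        if method == "min":
            target = min(metrics)
        elif method == "max":
            target = max(metrics)
        else:  # "mean": pick the clump closest to the cell mean
            target = sum(metrics) / len(metrics)
        selected.append(
            min(members, key=lambda i: abs(clumps[i]["metric"] - target)))
    return sorted(selected)


clumps = [
    {"easting": 0, "northing": 0, "metric": 5.0},
    {"easting": 10, "northing": 5, "metric": 2.0},
    {"easting": 90, "northing": 10, "metric": 7.0},
    {"easting": 100, "northing": 0, "metric": 9.0},
    {"easting": 5, "northing": 100, "metric": 1.0},
    {"easting": 95, "northing": 90, "metric": 4.0},
    {"easting": 80, "northing": 100, "metric": 6.0},
]
print(select_on_grid(clumps, rows=2, cols=2, method="min"))  # [1, 2, 4, 5]
print(select_on_grid(clumps, rows=2, cols=2, method="max"))  # [0, 3, 4, 6]
```
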

Classification

rsgislib.rastergis.identify_small_units(clumps_img: str, class_col: str, tmp_dir: str, out_col_name: str, small_clumps_thres: float, use_tiled_clump: bool = False, n_cores: int = 1, tile_width: int = 2000, tile_height: int = 2000)

Identify small connected units within a classification. The threshold to define small is provided by the user in pixels. Note, the out_col_name and small_clumps_thres variables can be provided as lists to identify a number of thresholds of small units.

Parameters:
  • clumps_img – string for the clumps image file containing input classification

  • class_col – string for the column name representing the classification as integer values

  • tmp_dir – directory path where temporary layers are stored (if directory is created within the function it will be deleted once function is complete).

  • out_col_name – a list of output column names (i.e., one for each threshold)

  • small_clumps_thres – a list of thresholds for identifying small clumps.

  • use_tiled_clump – a boolean to specify whether the tiled clumping algorithm should be used (Default is False; select True for large datasets)

  • n_cores – if the tiled version of the clumping algorithm is being used then there is an option to use multiple processing cores; specify the number to be used (Default is 1).

  • tile_width – is the width of the image tile (in pixels) if tiled clumping is used.

  • tile_height – is the height of the image tile (in pixels) if tiled clumping is used.

Example:

from rsgislib import rastergis

clumpsImg = "LS2MSS_19750620_lat10lon6493_r67p250_rad_srefdem_30m_clumps.kea"
tmpPath = "tmp/"
classCol = "OutClass"
outColName = ["SmallUnits25", "SmallUnits50", "SmallUnits100"]
smallClumpsThres = [25, 50, 100]
rastergis.identify_small_units(clumpsImg, classCol, tmpPath,
                               outColName, smallClumpsThres)
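
The effect of the thresholds can be illustrated without RSGISLib. The following is a minimal pure-Python sketch, not the library's implementation: it labels connected units of equal class in a small grid and flags those smaller than each threshold. The function name flag_small_units, the use of 4-connectivity and the strictly-less-than comparison are all assumptions made for this example.

```python
from collections import deque


def flag_small_units(grid, thresholds):
    """Return ({threshold: ids of units smaller than it}, {unit id: size}).

    grid is a list of rows of integer class values. A unit is a 4-connected
    group of pixels sharing a class value (an assumption for this sketch).
    """
    n_rows, n_cols = len(grid), len(grid[0])
    labels = [[0] * n_cols for _ in range(n_rows)]
    sizes = {}
    next_id = 1
    for r in range(n_rows):
        for c in range(n_cols):
            if labels[r][c]:
                continue
            # Breadth-first flood fill of one connected unit.
            labels[r][c] = next_id
            queue = deque([(r, c)])
            count = 0
            while queue:
                y, x = queue.popleft()
                count += 1
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < n_rows and 0 <= nx < n_cols
                            and not labels[ny][nx]
                            and grid[ny][nx] == grid[r][c]):
                        labels[ny][nx] = next_id
                        queue.append((ny, nx))
            sizes[next_id] = count
            next_id += 1
    flagged = {t: {uid for uid, n in sizes.items() if n < t} for t in thresholds}
    return flagged, sizes


grid = [
    [1, 1, 2, 2],
    [1, 1, 2, 1],
    [3, 1, 2, 2],
]
flagged, sizes = flag_small_units(grid, [2, 6])
```

Each threshold produces its own set of flagged units, which mirrors the way one output column is written per threshold.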
rsgislib.rastergis.class_split_fit_hist_gausian_mixture_model(clumps_img=string, out_col=string, val_col=string, bin_width=float, cls_col=string, cls_val=string, rat_band=int)

This function fits a Gaussian mixture model to the histogram for a variable in the RAT and uses it to split the class into a series of subclasses.

Parameters:
  • clumps_img – is a string containing the name of the input clumps image file

  • out_col – is a string containing the name of the output column to which the subclasses will be written.

  • val_col – is a string containing the name of the field with the values used for the sampling.

  • bin_width – is a float specifying the width of each histogram bin.

  • cls_col – is a string specifying a field within which classes have been defined.

  • cls_val – is a string specifying the class it will be limited to.

  • rat_band – is an optional (default = 1) integer parameter specifying the image band to which the RAT is associated.

from rsgislib import rastergis

# Keyword names follow the function signature given above.
rastergis.class_split_fit_hist_gausian_mixture_model(
    clumps_img='FrenchGuiana_10_ALL_sl_HH_lee_UTM_mosaic_dB_segs.kea',
    out_col='MangroveSubClass', val_col='HVdB', bin_width=0.1,
    cls_col='Classes', cls_val='Mangroves')
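
Once a mixture has been fitted, splitting a class amounts to assigning each value to one of the Gaussians. The sketch below is a hedged illustration of that step only: how RSGISLib makes the assignment is not stated here, so choosing the component with the highest weighted density is an assumption, and the names gaussian_pdf, assign_subclass and the component values are hypothetical.

```python
import math


def gaussian_pdf(x, mean, std_dev):
    """Density of a normal distribution at x."""
    z = (x - mean) / std_dev
    return math.exp(-0.5 * z * z) / (std_dev * math.sqrt(2.0 * math.pi))


def assign_subclass(value, components):
    """Return the 1-based index of the component with the highest weighted density.

    components is a list of (weight, mean, std_dev) tuples.
    """
    scores = [w * gaussian_pdf(value, m, s) for w, m, s in components]
    return scores.index(max(scores)) + 1


# Two hypothetical fitted components for a backscatter variable in dB.
components = [(0.6, -14.0, 1.0), (0.4, -9.0, 1.5)]
subclasses = [assign_subclass(v, components) for v in (-15.2, -13.1, -9.5, -7.0)]
```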

Extrapolation

rsgislib.rastergis.apply_rat_knn(clumps_img=string, in_extrap_col=string, out_extrap_col=string, train_regions_col=string, apply_regions_col=string, val_cols=list<string>, k_feat=uint, dist_knn=int, summerise_knn=int, dist_thres=float, rat_band=int)

This function uses the KNN algorithm to allow data values to be extrapolated to segments.

Parameters:
  • clumps_img – is a string containing the name of the input clumps image file

  • in_extrap_col – is a string containing the name of the field with the values used for the extrapolation.

  • out_extrap_col – is a string containing the name of the field where the extrapolated values will be written to.

  • train_regions_col – is a string containing the name of the field specifying the clumps to be used as training - binary column (1 == training region).

  • apply_regions_col – is a string containing the name of the field specifying the regions for which KNN is to be applied - binary column (1 == regions to be calculated). If None then ignored and applied to all.

  • val_cols – is a list of strings specifying the fields which will be used to calculate distance.

  • k_feat – is an unsigned integer specifying the number of nearest features (i.e., K) to be used (Default: 12)

  • dist_knn – specifies how the distance to identify NN is calculated (rsgislib.DIST_EUCLIDEAN, rsgislib.DIST_MANHATTEN, rsgislib.DIST_MAHALANOBIS, rsgislib.DIST_MINKOWSKI, rsgislib.DIST_CHEBYSHEV; Default: rsgislib.DIST_MAHALANOBIS).

  • summerise_knn – specifies how the extrapolation value is calculated (rsgislib.SUMTYPE_MODE, rsgislib.SUMTYPE_MEAN, rsgislib.SUMTYPE_MEDIAN, rsgislib.SUMTYPE_MIN, rsgislib.SUMTYPE_MAX, rsgislib.SUMTYPE_STDDEV; Default: rsgislib.SUMTYPE_MEDIAN). Mode is used for classification.

  • dist_thres – is a maximum distance threshold; features further away than this will not be included within the ‘k’.

  • rat_band – is an optional (default = 1) integer parameter specifying the image band to which the RAT is associated.

from rsgislib import rastergis
from rsgislib import imageutils
import rsgislib

forestClumpsImg='./LS5TM_20110428_forestclumps.kea'

# Keyword names follow the function signature given above.
rastergis.apply_rat_knn(clumps_img=forestClumpsImg, in_extrap_col='HP95',
                        out_extrap_col='HP95Pred', train_regions_col='LiDARForest',
                        apply_regions_col=None,
                        val_cols=['RedRefl','GreenRefl','BlueRefl'], k_feat=12,
                        dist_knn=rsgislib.DIST_EUCLIDEAN,
                        summerise_knn=rsgislib.SUMTYPE_MEDIAN, dist_thres=25)

# Export predicted column to GDAL image
forestHeightImg='./LS5TM_20110428_forest95Height.kea'
rastergis.export_col_to_gdal_img(forestClumpsImg, forestHeightImg, 'KEA', rsgislib.TYPE_32FLOAT, 'HP95Pred')
imageutils.pop_img_stats(forestHeightImg, True, 0., True)
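
The extrapolation step itself can be sketched in a few lines of plain Python. This is an illustration of the general KNN idea with a Euclidean distance and a median summary, not RSGISLib's implementation; the function name knn_extrapolate, the default k and the handling of the distance threshold (dropping neighbours beyond it and returning None when none remain) are assumptions.

```python
import math
import statistics


def knn_extrapolate(train_rows, target_features, k=3, dist_thres=None):
    """Estimate a value for one segment from its k nearest training segments.

    train_rows is a list of (features, value) pairs. Neighbours further away
    than dist_thres are dropped; None is returned if no neighbour remains.
    """
    dists = sorted(
        (math.dist(features, target_features), value)
        for features, value in train_rows
    )
    nearest = dists[:k]
    if dist_thres is not None:
        nearest = [(d, v) for d, v in nearest if d <= dist_thres]
    if not nearest:
        return None
    return statistics.median(v for _, v in nearest)


# Hypothetical training segments: (reflectance features, height value).
train = [
    ((0.10, 0.20), 12.0),
    ((0.12, 0.22), 14.0),
    ((0.50, 0.60), 30.0),
    ((0.11, 0.19), 13.0),
]
pred = knn_extrapolate(train, (0.11, 0.21), k=3)
too_far = knn_extrapolate(train, (5.0, 5.0), k=3, dist_thres=0.5)
```

A median summary is robust to a single outlying neighbour, which is one reason to prefer it over the mean for continuous variables; a mode summary would be the analogue for class labels.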

Change Detection

rsgislib.rastergis.get_global_class_stats(clumps_img, class_field, attributes, cls_chg_cols, rat_band)

Similar to ‘findChangeClumpsFromStdDev’ but, rather than applying a threshold to identify change clumps, this function adds the global (over all objects) class mean and standard deviation to the RAT.

Parameters:
  • clumps_img – is a string containing the name of the input clump file

  • class_field – is a string providing the name of the column containing classes.

  • attributes – is a sequence of strings containing the columns to use when detecting change.

  • cls_chg_cols – is a sequence of Python objects having the following attribute: cls_name, the name of the class in which change is to be searched for.

  • rat_band – is an optional (default = 1) integer parameter specifying the image band to which the RAT is associated.

from rsgislib import rastergis
clumpsImage='injune_p142_casi_sub_utm_segs_popstats.kea'
changeFeatVals = []
changeFeatVals.append(rastergis.ChangeFeat(cls_name='Forest'))
changeFeatVals.append(rastergis.ChangeFeat(cls_name='Scrub-Shrub'))
rastergis.get_global_class_stats(clumpsImage, 'ClassName', ['NDVI'], changeFeatVals)
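
The statistics written to the RAT are per-class summaries over all objects. The sketch below shows that calculation on plain lists; it is not the library's code, the name global_class_stats is hypothetical, and the use of the population (rather than sample) standard deviation is an assumption.

```python
import statistics
from collections import defaultdict


def global_class_stats(class_names, values, classes_of_interest):
    """Return {class: (mean, std_dev)} over all objects assigned to each class."""
    grouped = defaultdict(list)
    for name, value in zip(class_names, values):
        if name in classes_of_interest:
            grouped[name].append(value)
    return {
        name: (statistics.fmean(vals), statistics.pstdev(vals))
        for name, vals in grouped.items()
    }


# One entry per object: its class name and its NDVI value.
class_names = ["Forest", "Forest", "Scrub-Shrub", "Water", "Scrub-Shrub", "Forest"]
ndvi = [0.8, 0.7, 0.4, -0.1, 0.5, 0.9]
stats = global_class_stats(class_names, ndvi, {"Forest", "Scrub-Shrub"})
```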

Statistics

rsgislib.rastergis.fit_hist_gausian_mixture_model(clumps_img=string, out_h5_file=string, out_hist_file=string, val_col=string, bin_width=float, cls_col=string, cls_val=string, rat_band=int)

This function fits a Gaussian mixture model to the histogram for a variable in the RAT.

Parameters:
  • clumps_img – is a string containing the name of the input clumps image file

  • out_h5_file – is a string for a HDF5 with the fitted Gaussians.

  • out_hist_file – is a string to output the histogram as a HDF5 file.

  • val_col – is a string containing the name of the field with the values used for the sampling.

  • bin_width – is a float specifying the width of each histogram bin.

  • cls_col – is a string specifying a field within which classes have been defined.

  • cls_val – is a string specifying the class it will be limited to.

  • rat_band – is an optional (default = 1) integer parameter specifying the image band to which the RAT is associated.

from rsgislib import rastergis

# Keyword names follow the function signature given above.
rastergis.fit_hist_gausian_mixture_model(
    clumps_img='FrenchGuiana_10_ALL_sl_HH_lee_UTM_mosaic_dB_segs.kea',
    out_h5_file='gaufit.h5', out_hist_file='histfile.h5', val_col='HVdB',
    bin_width=0.1, cls_col='Classes', cls_val='Mangrove')
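
The bin_width parameter controls how the values are grouped before the model is fitted. As a rough illustration of what a fixed-width histogram looks like (the function name and the floor-based binning convention are assumptions, not taken from RSGISLib):

```python
import math
from collections import Counter


def histogram(values, bin_width):
    """Return sorted (bin_lower_edge, count) pairs using fixed-width bins."""
    counts = Counter(math.floor(v / bin_width) for v in values)
    return [(idx * bin_width, n) for idx, n in sorted(counts.items())]


hist = histogram([-14.2, -14.4, -13.9, -13.6, -12.1], 0.5)
```

A smaller bin width resolves narrower peaks but leaves fewer samples per bin, so the choice is a trade-off that depends on the number of clumps in the class.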
rsgislib.rastergis.calc_1d_jm_distance(clumps_img=string, val_col=string, bin_width=float, cls_col=string, class1=string, class2=string, rat_band=uint)

Calculate the Jeffries and Matusita distance for a single variable between two classes.

Parameters:
  • clumps_img – is a string containing the name of the input clump file

  • val_col – is a string specifying the name of the variable column.

  • bin_width – is a float specifying the bin width for the histogram.

  • cls_col – is a string specifying the column name with the class names.

  • class1 – is a string specifying the first class.

  • class2 – is a string specifying the second class.

  • rat_band – is an optional (default = 1) integer parameter specifying the image band to which the RAT is associated in the clumps image.

Returns:

double for distance
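
For orientation, a common histogram-based definition of the distance is JM = sqrt(2 (1 - BC)), where BC is the Bhattacharyya coefficient of the two normalised histograms. The sketch below implements that definition in plain Python. Whether RSGISLib uses exactly this form (some authors omit the square root) is an assumption, and the function names are hypothetical.

```python
import math
from collections import Counter


def normalised_hist(values, bin_width):
    """Return {bin index: proportion of values} for fixed-width bins."""
    counts = Counter(math.floor(v / bin_width) for v in values)
    total = sum(counts.values())
    return {idx: n / total for idx, n in counts.items()}


def jm_distance_1d(values1, values2, bin_width):
    """Jeffries-Matusita distance between two samples of one variable."""
    p = normalised_hist(values1, bin_width)
    q = normalised_hist(values2, bin_width)
    # Bhattacharyya coefficient: overlap between the two distributions.
    bc = sum(math.sqrt(p[idx] * q[idx]) for idx in p.keys() & q.keys())
    # Clamp so rounding error cannot produce a tiny negative value.
    return math.sqrt(max(0.0, 2.0 * (1.0 - bc)))


same = jm_distance_1d([1.0, 1.2, 2.1], [1.0, 1.2, 2.1], 1.0)
disjoint = jm_distance_1d([1.0, 1.2, 2.1], [5.0, 5.5], 1.0)
```

Under this definition the distance is 0 for identical histograms and sqrt(2) for histograms with no overlapping bins.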

rsgislib.rastergis.calc_2d_jm_distance(clumps_img=string, val1_col=string, val2_col=string, val1_bin_width=float, val2_bin_width=float, cls_col=string, class1=string, class2=string, rat_band=uint)

Calculate the Jeffries and Matusita distance for two variables between two classes.

Parameters:
  • clumps_img – is a string containing the name of the input clump file

  • val1_col – is a string specifying the name of the first variable column.

  • val2_col – is a string specifying the name of the second variable column.

  • val1_bin_width – is a float specifying the bin width for the histogram for variable 1.

  • val2_bin_width – is a float specifying the bin width for the histogram for variable 2.

  • cls_col – is a string specifying the column name with the class names.

  • class1 – is a string specifying the first class.

  • class2 – is a string specifying the second class.

  • rat_band – is an optional (default = 1) integer parameter specifying the image band to which the RAT is associated in the clumps image.

Returns:

double for distance

rsgislib.rastergis.calc_bhattacharyya_distance(clumps_img=string, val_col=string, cls_col=string, class1=string, class2=string, rat_band=uint)

Calculate the Bhattacharyya distance for a single variable between two classes.

Parameters:
  • clumps_img – is a string containing the name of the input clump file

  • val_col – is a string specifying the name of the variable column.

  • cls_col – is a string specifying the column name with the class names.

  • class1 – is a string specifying the first class.

  • class2 – is a string specifying the second class.

  • rat_band – is an optional (default = 1) integer parameter specifying the image band to which the RAT is associated in the clumps image.

Returns:

double for distance
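
Because this function takes no bin width, it is plausible that it works from class means and variances rather than histograms, but that is an assumption. Under that assumption, the standard closed form for two univariate normal distributions can be sketched as follows (the function name is hypothetical and population variances are used):

```python
import math
import statistics


def bhattacharyya_gaussian(values1, values2):
    """Bhattacharyya distance between two samples, each modelled as a Gaussian.

    Both samples must have non-zero variance.
    """
    m1, m2 = statistics.fmean(values1), statistics.fmean(values2)
    v1, v2 = statistics.pvariance(values1), statistics.pvariance(values2)
    # Separation of the means relative to the combined variance.
    term_mean = 0.25 * (m1 - m2) ** 2 / (v1 + v2)
    # Penalty for the two variances differing from each other.
    term_var = 0.5 * math.log((v1 + v2) / (2.0 * math.sqrt(v1 * v2)))
    return term_mean + term_var


d_same = bhattacharyya_gaussian([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
d_shift = bhattacharyya_gaussian([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
```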

Copy & Export

rsgislib.rastergis.export_rat_cols_to_ascii(clumps_img, out_file, fields, rat_band=1)

Exports selected columns from a GDAL RAT to ASCII file (comma separated). The first column is the object ID (FID).

Parameters:
  • clumps_img – is a string containing the name of the input RAT.

  • out_file – is a string containing the name of the output file.

  • fields – is a sequence of strings containing the field names.

  • rat_band – is an optional (default = 1) integer parameter specifying the image band to which the RAT is associated.

from rsgislib import rastergis
clumps='./RATS/injune_p142_casi_sub_utm_clumps_elim_final_clumps_elim_final.kea'
outfile='./TestOutputs/RasterGIS/injune_p142_casi_rgb_exportascii.txt'
fields = ['BlueAvg', 'GreenAvg', 'RedAvg']
rastergis.export_rat_cols_to_ascii(clumps, outfile, fields)
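
The layout of the output file, with the FID first followed by the selected fields, can be sketched with the standard csv module. This only illustrates the described layout using in-memory columns: the helper name write_rat_cols is hypothetical, and whether the real output includes a header row is not stated here, so none is written.

```python
import csv
import io


def write_rat_cols(out_stream, columns, fields):
    """Write one comma-separated row per RAT row: FID, then each field value."""
    writer = csv.writer(out_stream, lineterminator="\n")
    n_rows = len(columns[fields[0]])
    for fid in range(n_rows):
        writer.writerow([fid] + [columns[name][fid] for name in fields])


# Hypothetical RAT columns held as plain lists.
columns = {"BlueAvg": [10.5, 11.0], "GreenAvg": [20.0, 21.5]}
buf = io.StringIO()
write_rat_cols(buf, columns, ["BlueAvg", "GreenAvg"])
text = buf.getvalue()
```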
rsgislib.rastergis.export_col_to_gdal_img(clumps_img, output_img, gdalformat, datatype, field, rat_band=1)

Exports a column of the raster attribute table as a band in a GDAL image.

Parameters:
  • clumps_img – is a string containing the name of the input image file with RAT

  • output_img – is a string containing the name of the output gdal file

  • gdalformat – is a string containing the GDAL format for the output file - eg ‘KEA’

  • datatype – is an int containing one of the values from rsgislib.TYPE_*

  • field – is a string, providing the name of the column to be exported.

  • rat_band – is an optional (default = 1) integer parameter specifying the image band to which the RAT is associated.

import rsgislib
from rsgislib import rastergis
clumps='./RATS/injune_p142_casi_sub_utm_clumps_elim_final_clumps_elim_final.kea'
output_img='./TestOutputs/RasterGIS/injune_p142_casi_rgb_export.kea'
gdalformat = 'KEA'
datatype = rsgislib.TYPE_32FLOAT
field = 'RedAvg'
rastergis.export_col_to_gdal_img(clumps, output_img, gdalformat, datatype, field)
rsgislib.rastergis.export_cols_to_gdal_img(clumps_img: str, output_img: str, gdalformat: str, datatype: int, fields: List[str], rat_band: int = 1, tmp_dir: str = None)

Exports columns of the raster attribute table as bands in a GDAL image. Utility function, exports each column individually then stacks them.

Parameters:
  • clumps_img – is a string containing the name of the input image file with RAT

  • output_img – is a string containing the name of the output gdal file

  • gdalformat – is a string containing the GDAL format for the output file - eg ‘KEA’

  • datatype – is an int containing one of the values from rsgislib.TYPE_*

  • fields – is a list of strings, providing the names of the columns to be exported

  • rat_band – is an optional (default = 1) integer parameter specifying the image band to which the RAT is associated.

Example:

import rsgislib
from rsgislib import rastergis
clumps='RATS/injune_p142_casi_sub_utm_clumps_elim_final_clumps_elim_final.kea'
outimage='TestOutputs/RasterGIS/injune_p142_casi_rgb_export.kea'
gdalformat = 'KEA'
datatype = rsgislib.TYPE_32FLOAT
fields = ['RedAvg','GreenAvg','BlueAvg']
rastergis.export_cols_to_gdal_img(clumps, outimage, gdalformat,
                                  datatype, fields)
rsgislib.rastergis.export_clumps_to_images(clumps_img, out_img_base, bin_out, out_img_ext, gdalformat, rat_band=1)

Exports each clump to a separate raster which is the minimum extent for the clump.

Parameters:
  • clumps_img – is a string containing the name of the input image file with RAT

  • out_img_base – is a string containing the base name of the output image file (C + FID will be added to identify files).

  • bin_out – is a boolean specifying whether the output images should be binary or if the pixel value should be the FID of the clump.

  • out_img_ext – is a string with the output file extension (e.g., kea) without the preceding dot to be appended to the file name.

  • gdalformat – is a string containing the GDAL format for the output file - eg ‘KEA’

  • rat_band – is an optional (default = 1) integer parameter specifying the image band to which the RAT is associated.

import rsgislib
from rsgislib import rastergis
clumps='./DefineTiles.kea'
outimgbase='./Tiles/OutputImgTile_'
outimgext='kea'
gdalformat = 'KEA'
binaryOut = False
rastergis.export_clumps_to_images(clumps, outimgbase, binaryOut, outimgext, gdalformat, rat_band=1)
rsgislib.rastergis.copy_gdal_rat_columns(clumps_img, output_img, fields, copy_colours=True, copy_hist=True, rat_band=1)

Copies GDAL RAT columns from one image to another

Parameters:
  • clumps_img – is a string containing the name and path for the image with RAT from which columns are to be copied.

  • output_img – is a string containing the name of the file to which the columns are to be copied.

  • fields – is a sequence of strings containing the names of the fields to copy

  • copy_colours – is a bool specifying if the colour columns should be copied (default = True)

  • copy_hist – is a bool specifying if the histogram should be copied (default = True)

  • rat_band – is an optional (default = 1) integer parameter specifying the image band to which the RAT is associated.

from rsgislib import rastergis
table = './RATS/injune_p142_casi_sub_utm_clumps_elim_final_clumps_elim_final.kea'
image = './TestOutputs/RasterGIS/injune_p142_casi_sub_utm_segs_cpcols.kea'
fields = ['NIRAvg', 'BlueAvg', 'GreenAvg', 'RedAvg']
# Source image with the RAT first, then the image to copy the columns to.
rastergis.copy_gdal_rat_columns(table, image, fields)

To copy a subset of columns from one RAT to a new file the following can be used:

import rsgislib
import rsgislib.imageutils
from rsgislib import rastergis
rat_band=1
table='inRAT.kea'
output='outRAT_nir_only.kea'
bands = [rat_band]
rsgislib.imageutils.select_img_bands(table, output, 'KEA', rsgislib.TYPE_32INT, bands)
fields = ['NIRAvg']
rastergis.copy_gdal_rat_columns(table, output, fields, copy_colours=True, copy_hist=True, rat_band=rat_band)
rsgislib.rastergis.copy_rat(clumps_img, output_img, rat_band=1)

Copies a GDAL RAT from one image to another

Parameters:
  • clumps_img – is a string containing the name and path for the image with the RAT which is to be copied.

  • output_img – is a string containing the name of the file to which the RAT is to be copied.

  • rat_band – is an optional (default = 1) integer parameter specifying the image band to which the RAT is associated.

from rsgislib import rastergis
clumps = './RATS/injune_p142_casi_sub_utm_clumps_elim_final_clumps_elim_final.kea'
output_img = './TestOutputs/RasterGIS/injune_p142_casi_sub_utm_segs_cptab.kea'
rastergis.copy_rat(clumps, output_img)
rsgislib.rastergis.import_vec_atts(clumps_img, vec_file, vec_lyr, fid_col, col_names, rat_band=1)

Copies the attributes from an input shapefile to the RAT.

Parameters:
  • clumps_img – is a string containing the name of the input file with RAT

  • vec_file – is a string containing the file path of the input vector file

  • vec_lyr – is a string containing the layer name within the input vector file

  • fid_col – is a string with the name of a column which has the clumps pixel value associated with the vector feature.

  • col_names – is a list of strings specifying the columns to be copied to the RAT. If ‘None’ then all attributes will be copied.

  • rat_band – is an optional (default = 1) integer parameter specifying the image band to which the RAT is associated.

from rsgislib import rastergis
clumps = 'clumpsFiles.kea'
vectorFile = 'vectorFile.shp'
veclyr = 'vectorFile'
rastergis.import_vec_atts(clumps, vectorFile, veclyr, 'pxlval', None)

Colour Tables

rsgislib.rastergis.colour_rat_classes(clumps_img, field, class_colours, rat_band)

Sets a colour table for a set of classes within the attribute table

Parameters:
  • clumps_img – is a string containing the name of the input file

  • field – is a string containing the name of the input class field (class can be a string or integer).

  • class_colours – is a dict mapping int class ids to an object having the following attributes:
      • red: int defining the red colour component (0 - 255)
      • green: int defining the green colour component (0 - 255)
      • blue: int defining the blue colour component (0 - 255)
      • alpha: int defining the alpha colour component (0 - 255)

  • rat_band – is an optional (default = 1) integer parameter specifying the image band to which the RAT is associated.

import collections
from rsgislib import rastergis
clumps='./TestOutputs/RasterGIS/injune_p142_casi_sub_utm_segs_col.kea'
field = 'outClass'
classcolours = {}
colourCat = collections.namedtuple('ColourCat', ['red', 'green', 'blue', 'alpha'])
classcolours[0] = colourCat(red=200, green=50, blue=50, alpha=255)
classcolours[1] = colourCat(red=200, green=240, blue=50, alpha=255)
rastergis.colour_rat_classes(clumps, field, classcolours)
rsgislib.rastergis.set_class_names_colours(clumps_img: str, class_names_col: str, class_info_dict: dict)

A function to define a class names column and define the class colours.

classInfoDict = dict()
classInfoDict[1] = {'classname':'Forest', 'red':0, 'green':255, 'blue':0}
classInfoDict[2] = {'classname':'Water', 'red':0, 'green':0, 'blue':255}

Parameters:
  • clumps_img – Input clumps image - expecting a classification (rather than segmentation) where the number is the pixel value.

  • class_names_col – The output column for the class names.

  • class_info_dict – a dict where the key is the pixel value for the class.
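
When the classes come from a list or a configuration file, it can be convenient to build and validate the dict before passing it on. The helper below is a hypothetical convenience, not part of RSGISLib: it produces the structure described above and rejects colour components outside 0 - 255.

```python
def build_class_info(classes):
    """Build a class info dict from (pixel_value, name, (red, green, blue)) tuples."""
    info = {}
    for pxl_val, name, rgb in classes:
        if any(not 0 <= c <= 255 for c in rgb):
            raise ValueError(f"colour out of range for class {name!r}: {rgb}")
        red, green, blue = rgb
        info[pxl_val] = {"classname": name, "red": red, "green": green, "blue": blue}
    return info


class_info = build_class_info([
    (1, "Forest", (0, 255, 0)),
    (2, "Water", (0, 0, 255)),
])
```

The resulting dict is keyed by pixel value, so it could be passed as the class_info_dict argument.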

rsgislib.rastergis.get_rat_colours(clumps_img: str, cls_column: str = None) Dict[str, Dict]

A function which gets the colour table and optionally the class names from the rat and returns it as a dict which can be inputted into the rsgislib.rastergis.set_class_names_colours function. This is useful for copying the class names and colours between files. Output dict will have the following structure:

class_clr_info[1] = {'classname':'Forest', 'red':0, 'green':255, 'blue':0}
class_clr_info[2] = {'classname':'Water', 'red':0, 'green':0, 'blue':255}

Parameters:
  • clumps_img – Input clumps image

  • cls_column – Optionally a class names column can be provided. If None then ignored and output dict doesn’t have a ‘classname’ field.

Returns:

dict of dicts

Data Structures / Enums

rsgislib.rastergis.BandAttStats(band, min_field=None, max_field=None, sum_field=None, std_dev_field=None, mean_field=None)

This is passed to the populate_rat_with_stats function

rsgislib.rastergis.FieldAttStats(field, min_field=None, max_field=None, sum_field=None, std_dev_field=None, mean_field=None)

This is passed to the calc_rel_diff_neigh_stats function

rsgislib.rastergis.BandAttPercentiles(percentile: float, field_name: str)

This is passed to the populate_rat_with_percentiles function

rsgislib.rastergis.ShapeIndex(col_name: str, idx: int, col_idx: int = 0)

This is passed to the calc_shape_indices function