RSGISLib Raster GIS Module
Utilities
- rsgislib.rastergis.pop_rat_img_stats(clumps_img=string, add_clr_tab=boolean, calc_pyramids=boolean, ignore_zero=boolean, rat_band=int)
Populates header statistics (e.g., builds the histogram) and pyramids for thematic images. Note that this function expects the image file format to support a raster attribute table (RAT), so it will not work with formats such as GTIFF.
- Parameters:
clumps_img – is a string containing the name of the input clump file
add_clr_tab – is a boolean to specify whether a colour table should be created and added (colours will be random) (Optional, default = True)
calc_pyramids – is a boolean to specify whether overview images (pyramids) should be created (Optional, default = True)
ignore_zero – is a boolean specifying whether zero should be ignored (i.e., set as a no data value). (Optional, default = True)
rat_band – is an optional (default = 1) integer parameter specifying the image band to which the RAT is associated.
from rsgislib import rastergis
clumps='injune_p142_casi_sub_utm_segs_nostats_addstats.kea'
pyramids=True
colourtable=True
rastergis.pop_rat_img_stats(clumps, colourtable, pyramids)
- rsgislib.rastergis.collapse_rat(clumps_img, select_col, output_img, gdalformat, rat_band)
Collapses the image and RAT to a set of selected rows (defined with a value of 1 in the selected column).
- Parameters:
clumps_img – is a string containing the name of the input clump file
select_col – is a string containing the name of the binary column used to select the rows to which the RAT is to be collapsed.
output_img – is a string with the output file name
gdalformat – is a string with the output image file format - note only KEA and HFA support RATs.
rat_band – is the image band with which the RAT is associated.
- rsgislib.rastergis.get_rat_length(clumps_img: str, rat_band: int = 1) -> int
A function which returns the length (i.e., number of rows) within the RAT.
- Parameters:
clumps_img – path to the image file with the RAT
rat_band – the band within the image file for which the RAT is to be read.
- Returns:
an int with the number of rows.
- rsgislib.rastergis.get_rat_columns(clumps_img: str, rat_band: int = 1) -> List[str]
A function which returns a list of column names within the RAT.
- Parameters:
clumps_img – path to the image file with the RAT
rat_band – the band within the image file for which the RAT is to be read.
- Returns:
list of column names.
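A common use of these two functions is to confirm that a RAT is populated and holds the expected columns before further processing. The sketch below assumes only the signatures documented above; `missing_columns` and `check_rat_columns` are hypothetical helper names, not part of RSGISLib.

```python
from typing import Iterable, List


def missing_columns(required: Iterable[str], present: Iterable[str]) -> List[str]:
    """Return the required column names that are absent, in the order given."""
    present_set = set(present)
    return [name for name in required if name not in present_set]


def check_rat_columns(clumps_img: str, required: Iterable[str], rat_band: int = 1) -> List[str]:
    """Read the RAT column names of clumps_img and report any that are missing."""
    # Imported lazily so missing_columns() can be used without rsgislib installed.
    import rsgislib.rastergis

    n_rows = rsgislib.rastergis.get_rat_length(clumps_img, rat_band)
    if n_rows == 0:
        raise ValueError(f"The RAT in band {rat_band} of {clumps_img} has no rows")
    columns = rsgislib.rastergis.get_rat_columns(clumps_img, rat_band)
    return missing_columns(required, columns)
```

For example, `check_rat_columns('segs.kea', ['b1Mean', 'b2Mean'])` would return an empty list when both columns are present (the file name here is a placeholder).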
- rsgislib.rastergis.get_rat_columns_info(clumps_img: str, rat_band: int = 1)
A function which returns a dictionary of column names with type (GFT_Integer, GFT_Real, GFT_String) and usage (e.g., GFU_Generic, GFU_PixelCount, GFU_Name, etc.) within the RAT.
- Parameters:
clumps_img – path to the image file with the RAT
rat_band – the band within the image file for which the RAT is to be read.
- Returns:
dict of column information.
- rsgislib.rastergis.check_string_col_valid(clumps_img: str, str_col: str, rm_punc: bool = False, rm_spaces: bool = False, rm_non_ascii: bool = False, rm_dashs: bool = False)
A function which checks a string column to ensure nothing is invalid.
- Parameters:
clumps_img – input clumps image.
str_col – the column to check
rm_punc – If True, removes punctuation from the column name other than dashes and underscores.
rm_spaces – If True, removes spaces from the column name, replacing them with underscores.
rm_non_ascii – If True, removes characters which are not in the ASCII range of characters.
rm_dashs – If True, dashes are removed from the column name.
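The effect of the four flags can be illustrated on a single string. This is a minimal pure-Python sketch of the behaviour the parameter descriptions suggest, not the library's implementation; `clean_string` is a hypothetical name and the order in which the steps are applied is an assumption.

```python
import string


def clean_string(value, rm_punc=False, rm_spaces=False, rm_non_ascii=False, rm_dashs=False):
    """Apply the clean-up steps described by the flags to one string."""
    if rm_non_ascii:
        # Drop any character outside the ASCII range.
        value = value.encode("ascii", "ignore").decode("ascii")
    if rm_spaces:
        # Spaces are replaced with underscores rather than deleted.
        value = value.replace(" ", "_")
    if rm_punc:
        # Remove punctuation, but keep dashes and underscores.
        keep = "-_"
        value = "".join(
            ch for ch in value if ch not in string.punctuation or ch in keep
        )
    if rm_dashs:
        value = value.replace("-", "")
    return value
```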
Attribute Clumps
- rsgislib.rastergis.populate_rat_with_stats(input_img=string, clumps_img=string, band_stats=rsgislib.rastergis.BandAttStats, rat_band=int)
Populates an attribute table with statistics from an input values image.
- Parameters:
input_img – is a string containing the name of the input image file from which the clumps are to be populated.
clumps_img – is a string containing the name of the input clumps image file
band_stats – is a sequence of rsgislib.rastergis.BandAttStats objects that have attributes in line with rsgis.cmds.RSGISBandAttStatsCmds:
    * band: int defining the image band to process
    * min_field: string defining the name of the field for min value
    * max_field: string defining the name of the field for max value
    * sum_field: string defining the name of the field for sum value
    * mean_field: string defining the name of the field for mean value
    * std_dev_field: string defining the name of the field for standard deviation value
rat_band – is an optional (default = 1) integer parameter specifying the image band to which the RAT is associated.
from rsgislib import rastergis
clumps='./TestOutputs/RasterGIS/injune_p142_casi_sub_utm_segs_popstats.kea'
input_img='./Rasters/injune_p142_casi_sub_utm.kea'
bs = []
bs.append(rastergis.BandAttStats(band=1, min_field='b1Min', max_field='b1Max', mean_field='b1Mean', sum_field='b1Sum', std_dev_field='b1StdDev'))
bs.append(rastergis.BandAttStats(band=2, min_field='b2Min', max_field='b2Max', mean_field='b2Mean', sum_field='b2Sum', std_dev_field='b2StdDev'))
bs.append(rastergis.BandAttStats(band=3, min_field='b3Min', max_field='b3Max', mean_field='b3Mean', sum_field='b3Sum', std_dev_field='b3StdDev'))
rastergis.populate_rat_with_stats(input_img, clumps, bs)
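When an image has many bands, the repeated BandAttStats calls can be generated in a loop. The sketch below follows the field naming used in the example above (b1Min, b1Max, and so on); `band_stat_fields` and `populate_stats_for_bands` are hypothetical helper names, and only the BandAttStats keyword arguments shown in this section are assumed.

```python
def band_stat_fields(band: int, prefix: str = "b") -> dict:
    """Build the BandAttStats keyword arguments for one image band."""
    stem = f"{prefix}{band}"
    return {
        "band": band,
        "min_field": f"{stem}Min",
        "max_field": f"{stem}Max",
        "mean_field": f"{stem}Mean",
        "sum_field": f"{stem}Sum",
        "std_dev_field": f"{stem}StdDev",
    }


def populate_stats_for_bands(input_img: str, clumps_img: str, bands) -> None:
    """Populate min/max/mean/sum/std dev columns for each listed band."""
    # Imported lazily so band_stat_fields() can be used without rsgislib installed.
    from rsgislib import rastergis

    band_stats = [rastergis.BandAttStats(**band_stat_fields(b)) for b in bands]
    rastergis.populate_rat_with_stats(input_img, clumps_img, band_stats)
```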
- rsgislib.rastergis.populate_rat_with_cat_proportions(in_cats_img=string, clumps_img=string, out_cols_name=string, maj_col_name=string, cp_cls_names=boolean, maj_cls_name_field=string, cls_name_field=string, rat_band_clumps=int, rat_band_cats=int)
Populates the attribute table with the proportions of intersecting categories
- Parameters:
in_cats_img – is a string containing the name of the categories (classification) image file from which the proportions are calculated
clumps_img – is a string containing the name of the input clump file to which the proportions are to be populated.
out_cols_name – is a string representing the base name for the output columns containing the proportions.
maj_col_name – is a string for name of the field which will hold the majority class.
cp_cls_names – is a boolean defining whether class names should be copied (Optional, Default = False).
maj_cls_name_field – is a string for the output column within the clumps image with the majority class names field (Optional, only used if cp_cls_names == True)
cls_name_field – is a string with the name of the column within the categories image for the class names (Optional, only used if cp_cls_names == True)
rat_band_clumps – is an optional (default = 1) integer parameter specifying the image band to which the RAT is associated in the clumps image.
rat_band_cats – is an optional (default = 1) integer parameter specifying the image band to which the RAT is associated in the categories image.
- rsgislib.rastergis.populate_rat_with_percentiles(input_img=string, clumps_img=string, img_band=int, band_stats=rsgislib.rastergis.BandAttStats, n_hist_bins=int, rat_band=int)
Populates an attribute table with a percentile of the pixel values from an image.
- Parameters:
input_img – is a string containing the name of the input image file
clumps_img – is a string containing the name of the input clump file
img_band – is an int which specifies the image band (from input_img) for which the stats are to be calculated
band_stats – is a sequence of objects that have attributes matching rsgislib.rastergis.BandAttPercentiles:
    * percentile: float defining the percentile to calculate (Valid range is 0 - 100)
    * field_name: string defining the name of the field to use for this percentile
n_hist_bins – is an optional (default = 200) integer specifying the number of bins within the histogram (note this governs the accuracy to which percentile can be calculated).
rat_band – is an optional (default = 1) integer parameter specifying the image band to which the RAT is associated.
from rsgislib import rastergis
inputImage = './Rasters/injune_p142_casi_sub_utm.kea'
clumpsImage = './TestOutputs/RasterGIS/injune_p142_casi_sub_utm_segs_popstats.kea'
band=1
bandPercentiles = []
bandPercentiles.append(rastergis.BandAttPercentiles(percentile=25.0, field_name='B1Per25'))
bandPercentiles.append(rastergis.BandAttPercentiles(percentile=50.0, field_name='B1Per50'))
bandPercentiles.append(rastergis.BandAttPercentiles(percentile=75.0, field_name='B1Per75'))
rastergis.populate_rat_with_percentiles(inputImage, clumpsImage, band, bandPercentiles)
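The note that n_hist_bins governs the accuracy of the percentile can be seen with a small histogram-based estimate. This is an illustrative sketch only, assuming equal-width bins between the minimum and maximum value and reporting the centre of the bin in which the percentile falls; it is not the library's implementation and `histogram_percentile` is a hypothetical name.

```python
def histogram_percentile(values, percentile, n_bins=200):
    """Estimate a percentile (0 - 100) from an equal-width histogram."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return float(lo)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # The maximum value is placed in the last bin.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    target = percentile / 100.0 * len(values)
    cumulative = 0
    for idx, count in enumerate(counts):
        cumulative += count
        if cumulative >= target:
            # The estimate can only be as precise as the bin width.
            return lo + (idx + 0.5) * width
    return float(hi)
```

With the integers 0 to 100 as input, 200 bins place the median within half a unit of 50, whereas 4 bins can only report the centre of a 25-unit-wide bin.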
- rsgislib.rastergis.populate_rat_with_mode(input_img=string, clumps_img=string, out_cols_name=string, use_no_data=boolean, no_data_val=long, out_no_data=boolean, mode_band=uint, rat_band=uint)
Populates the attribute table with the mode from a single band in the input image. Note this only makes sense if the input pixel values are integers.
- Parameters:
input_img – is a string containing the name of the input image file from which the mode is calculated
clumps_img – is a string containing the name of the input clump file to which the mode will be populated.
out_cols_name – is a string representing the name for the output column containing the mode.
use_no_data – is a boolean defining whether the no data value should be ignored (Optional, Default = False).
no_data_val – is a long defining the no data value to be used (Optional, Default = 0)
out_no_data – is a boolean to specify that although the no data value should be used for the calculation it should not be written to the RAT as an output value unless there is no valid data within the clump. (Default = True)
mode_band – is an optional (default = 1) integer parameter specifying the image band for which the mode is to be calculated.
rat_band – is an optional (default = 1) integer parameter specifying the image band to which the RAT is associated in the clumps image.
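The per-clump mode can be sketched in pure Python to show how the no data handling affects the result. The sketch assumes the clumps and pixel values have been flattened to two parallel lists and follows the description of use_no_data above (True means the no data value is ignored); `clump_modes` is a hypothetical name and this is not the library's implementation.

```python
from collections import Counter, defaultdict


def clump_modes(clump_ids, pixel_values, no_data_val=0, use_no_data=False):
    """Return {clump id: most frequent pixel value} for parallel pixel lists."""
    per_clump = defaultdict(Counter)
    for cid, val in zip(clump_ids, pixel_values):
        per_clump[cid][val] += 1

    modes = {}
    for cid, counts in per_clump.items():
        if use_no_data:
            # Ignore the no data value when looking for the mode.
            counts = Counter({v: n for v, n in counts.items() if v != no_data_val})
        if counts:
            modes[cid] = counts.most_common(1)[0][0]
        else:
            # No valid pixels in this clump, so only the no data value remains.
            modes[cid] = no_data_val
    return modes
```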
- rsgislib.rastergis.populate_rat_with_prop_valid_pxls(input_img=string, clumps_img=string, out_col=string, no_data_val=float, rat_band=uint)
Populates the attribute table with the proportion of valid pixels within the clump.
- Parameters:
input_img – is a string containing the name of the input image file from which the valid pixels are to be identified
clumps_img – is a string containing the name of the input clump file to which the proportion will be populated.
out_col – is a string representing the name for the output column containing the proportion.
no_data_val – is a float defining the no data value to be used.
rat_band – is an optional (default = 1) integer parameter specifying the image band to which the RAT is associated in the clumps image.
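The value written to out_col can be thought of as the number of pixels in a clump that differ from no_data_val divided by the total number of pixels in that clump. A minimal sketch, assuming flattened parallel lists of clump IDs and pixel values (`prop_valid_pixels` is a hypothetical name, not a library function):

```python
from collections import defaultdict


def prop_valid_pixels(clump_ids, pixel_values, no_data_val):
    """Return {clump id: proportion of pixels not equal to no_data_val}."""
    totals = defaultdict(int)
    valid = defaultdict(int)
    for cid, val in zip(clump_ids, pixel_values):
        totals[cid] += 1
        if val != no_data_val:
            valid[cid] += 1
    return {cid: valid[cid] / totals[cid] for cid in totals}
```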
- rsgislib.rastergis.populate_rat_with_meanlit_stats(in_vals_img=string, clumps_img=string, mean_lit_img=string, mean_lit_band=int, mean_lit_col=string, pxl_count_col=string, band_stats=rsgislib.rastergis.BandAttStats, rat_band=int)
Populates an attribute table with statistics from an input values image where only the pixels with a band value above a defined threshold are used. These are sometimes referred to as the mean-lit statistics, i.e., the statistics of the sunlit pixels within the object.
- Parameters:
in_vals_img – is a string containing the name of the input image file from which the clumps are to be populated.
clumps_img – is a string containing the name of the input clumps image file
mean_lit_img – is a string containing the name of the input image containing the band to be used for the mean-lit stats.
mean_lit_band – is an unsigned integer specifying the image band to be used within mean_lit_img.
mean_lit_col – is a string specifying the column to be used for the ‘mean’ for each object in the mean-lit calculation
pxl_count_col – is a string specifying the output column in the RAT where the count for the number of pixels within each clump used for the stats is outputted.
band_stats – is a sequence of rsgislib.rastergis.BandAttStats objects that have attributes in line with rsgis.cmds.RSGISBandAttStatsCmds:
    * band: int defining the image band to process
    * min_field: string defining the name of the field for min value
    * max_field: string defining the name of the field for max value
    * sum_field: string defining the name of the field for sum value
    * mean_field: string defining the name of the field for mean value
    * std_dev_field: string defining the name of the field for standard deviation value
rat_band – is an optional (default = 1) integer parameter specifying the image band to which the RAT is associated.
from rsgislib import rastergis
inputImage = "RapidEye_20130625_lat53lon389_tid3063312_oid167771_rad_toa.kea"
segmentClumps = "RapidEye_20130625_lat53lon389_tid3063312_oid167771_rad_toa_segs.kea"
ndviImage = "RapidEye_20130625_lat53lon389_tid3063312_oid167771_rad_toa_ndvi.kea"
bandStats = []
bandStats.append(rastergis.BandAttStats(band=1, mean_field='BlueMeanML', std_dev_field='BlueStdDevML'))
bandStats.append(rastergis.BandAttStats(band=2, mean_field='GreenMeanML', std_dev_field='GreenStdDevML'))
bandStats.append(rastergis.BandAttStats(band=3, mean_field='RedMeanML', std_dev_field='RedStdDevML'))
bandStats.append(rastergis.BandAttStats(band=4, mean_field='RedEdgeMeanML', std_dev_field='RedEdgeStdDevML'))
bandStats.append(rastergis.BandAttStats(band=5, mean_field='NIRMeanML', std_dev_field='NIRStdDevML'))
rastergis.populate_rat_with_meanlit_stats(in_vals_img=inputImage, clumps_img=segmentClumps, mean_lit_img=ndviImage, mean_lit_band=1, mean_lit_col='NDVIMean', pxl_count_col='MLPxlCount', band_stats=bandStats, rat_band=1)
- rsgislib.rastergis.populate_rat_with_cat_vec_lyr(clumps_img: str, out_col_name: str, vec_file: str, vec_lyr: str, vec_att_col: str = None, no_data_val: int = 0, mode_use_no_data: bool = True, tmp_dir: str = 'tmp_dir')
A function which populates a raster attribute table column from a vector layer. This function rasterises the vector layer to the same pixel grid as the clumps file and then uses the mode to populate the clumps. If a vector attribute column is provided then that column will be rasterised, otherwise a binary mask will be used.
- Parameters:
clumps_img – Input clumps file path.
out_col_name – the output column name.
vec_file – input vector file path
vec_lyr – input vector layer
vec_att_col – optional attribute within the vector layer to be used to populate the clumps image. This must be an integer variable.
no_data_val – the no data value to use.
mode_use_no_data – use the no data value when calculating the mode.
tmp_dir – a temporary directory for outputs. If the tmp_dir path does not exist it will be created and then deleted at the end of the function.
- rsgislib.rastergis.str_class_majority(base_clumps_img, info_clumps_img, base_class_col, info_class_col, ignore_zero=True, rat_band_base=1, rat_band_info=1)
Finds the majority class (from a string field) from a set of small objects and assigns it to the large objects.
- Parameters:
base_clumps_img – is the base clumps file, to be attributed.
info_clumps_img – is the file to take attributes from.
base_class_col – is the output column name in the base clumps file.
info_class_col – is the column name in the info clumps file.
ignore_zero – is a boolean specifying if zeros should be ignored in input layer. If set to false values of 0 will be included when calculating the class majority, otherwise the majority calculation will only consider objects with a value greater than 0.
rat_band_base – is an optional (default = 1) integer parameter specifying the image band to which the RAT is associated in the base clumps.
rat_band_info – is an optional (default = 1) integer parameter specifying the image band to which the RAT is associated in the info clumps.
from rsgislib import rastergis
clumps='./TestOutputs/RasterGIS/injune_p142_casi_sub_utm_segs_popstats.kea'
classRAT='./TestOutputs/RasterGIS/reInt_rat.kea'
rastergis.str_class_majority(clumps, classRAT, 'class_dst', 'class_src')
- rsgislib.rastergis.define_class_names(clumps_img: str, class_num_col: str, class_name_col: str, class_names_dict: dict)
A function to create a class names column in a RAT based on segmented clumps where a number of clumps have the same class number.
- Parameters:
clumps_img – input clumps image.
class_num_col – column specifying the class number (e.g., where clumps are segments in a segmentation)
class_name_col – the output column name where a string will be created if it doesn’t already exist.
class_names_dict – Dictionary to look up the class names. The key needs to be the integer number for the class
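The lookup from class number to class name can be pictured as a plain dictionary access per RAT row. The sketch below is illustrative only: `class_names_for_rows` is a hypothetical name, and the use of a default for class numbers missing from the dictionary is an assumption rather than documented library behaviour.

```python
def class_names_for_rows(class_numbers, class_names_dict, default=""):
    """Map each row's integer class number to its class name."""
    # Keys are looked up as integers, matching the requirement that the
    # dictionary keys are the integer class numbers.
    return [class_names_dict.get(int(num), default) for num in class_numbers]
```

A dictionary such as `{1: 'Forest', 2: 'Water'}` would be passed as class_names_dict.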
- rsgislib.rastergis.set_column_data(clumps_img: str, col_name: str, col_data: array)
A function to write a column of data to a RAT.
- Parameters:
clumps_img – Input clumps image
col_name – Name of the column to be written.
col_data – Data to be written to the column.
- rsgislib.rastergis.create_uid_col(clumps_img: str, col_name: str = 'UID')
A function which adds a unique ID value (starting at 0) to each clump within a RAT.
- Parameters:
clumps_img – Input clumps image
col_name – The output column name (default is UID).
Calculate Spatial Relationships
- rsgislib.rastergis.calc_dist_to_classes(clumps_img: str, class_col: str, out_img_base: str, tmp_dir: str = 'tmp', tile_size: int = 2000, max_dist: int = 1000, no_data_val: int = 1000, n_cores: int = -1)
A function which will calculate proximity rasters for a set of classes defined within the RAT.
- Parameters:
clumps_img – is a string specifying the input image with the associated RAT
class_col – is the column in the RAT which has the classification
out_img_base – is the base name of the output image - output files will be KEA files.
tmp_dir – is a directory to be used for storing the image tiles and other temporary files - if the directory does not exist it will be created and deleted on completion (Default: tmp).
tile_size – is an int specifying in pixels the size of the image tiles used for processing (Default: 2000)
max_dist – is the maximum distance in the geographic units of the projection of the input image (Default: 1000).
no_data_val – is the value applied to the pixels outside of the max_dist threshold (Default: 1000; i.e., the same as max_dist).
n_cores – is the number of processing cores which are available to be used for this processing. If -1 all available cores will be used. (Default: -1)
- rsgislib.rastergis.calc_dist_between_clumps(clumps_img: str, out_col_name: str, tmp_dir: str = 'tmp', use_idx: bool = False, max_dist_thres: float = 10)
Calculate the distance between all clumps
- Parameters:
clumps_img – image clumps for which the distance will be calculated.
out_col_name – output column within the clumps image.
tmp_dir – directory to which temporary files will be written.
use_idx – use a spatial index when calculating the distance between clumps (needed for large number of clumps).
max_dist_thres – if using an index then an upper limit on the distance between clumps can be defined.
- rsgislib.rastergis.calc_dist_to_large_clumps(clumps_img: str, out_col_name: str, size_thres: float, tmp_dir: str = 'tmp', use_idx: bool = False, max_dist_thres: float = 10)
Calculate the distance from each small clump to a large clump. Split defined by the threshold provided.
- Parameters:
clumps_img – image clumps for which the distance will be calculated.
out_col_name – output column within the clumps image.
size_thres – is a threshold to separate the sets of large and small clumps.
tmp_dir – directory to which temporary files will be written.
use_idx – use a spatial index when calculating the distance between clumps (needed for large number of clumps).
max_dist_thres – if using an index then an upper limit on the distance between clumps can be defined.
- rsgislib.rastergis.calc_border_length(clumps_img, out_col, ignore_zero_edges)
Calculate the border length of clumps
- Parameters:
clumps_img – is a string containing the name of the input image file
out_col – is a string with the output column name
ignore_zero_edges – is a boolean specifying whether zero edges (i.e., no data) should be ignored
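The role of ignore_zero_edges can be illustrated by counting pixel edges on a small clump grid. This sketch makes several assumptions that are not confirmed by the text above: border length is counted in pixel edges using 4-connectivity, clump ID 0 is no data, and the area outside the image is treated as zero. `border_lengths` is a hypothetical name and this is not the library's implementation.

```python
def border_lengths(grid, ignore_zero_edges=True):
    """Return {clump id: number of pixel edges shared with a different clump}."""
    rows, cols = len(grid), len(grid[0])
    lengths = {}
    for r in range(rows):
        for c in range(cols):
            cid = grid[r][c]
            if cid == 0:
                continue  # zero is treated as no data, not as a clump
            lengths.setdefault(cid, 0)
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                inside = 0 <= nr < rows and 0 <= nc < cols
                neighbour = grid[nr][nc] if inside else 0
                if neighbour == cid:
                    continue  # interior edge
                if neighbour == 0 and ignore_zero_edges:
                    continue  # edge against no data is skipped
                lengths[cid] += 1
    return lengths
```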
- rsgislib.rastergis.calc_rel_border(clumps_img, out_col, class_names_col, class_name, ignore_zero_edges)
Calculates the relative border length of the clumps to a class
- Parameters:
clumps_img – is a string containing the name of the input image file
out_col – is a string specifying the output column name
class_names_col – is a string specifying the column which holds the class names
class_name – is a string specifying the class for which the relative border is to be calculated.
ignore_zero_edges – is a boolean specifying whether zero edges (i.e., no data) should be ignored
- rsgislib.rastergis.calc_rel_diff_neigh_stats(clumps_img, field_stats, use_abs_diff, rat_band)
Calculates the difference (relative or absolute) between each clump and its neighbours. The differences can be summarised as min, max, mean, std dev or sum.
- Parameters:
clumps_img – is a string containing the name of the input clump file
field_stats – has the following fields:
    * field: string defining the field in the RAT to compare to.
    * min_field: string defining the name of the field for min value
    * max_field: string defining the name of the field for max value
    * sum_field: string defining the name of the field for sum value
    * mean_field: string defining the name of the field for mean value
    * std_dev_field: string defining the name of the field for standard deviation value
use_abs_diff – calculate the absolute difference.
rat_band – is the image band with which the RAT is associated.
import rsgislib.rastergis
inputImage = './RapidEye_20130625_lat53lon389_tid3063312_oid167771_rad_toa_segs_neigh.kea'
ratBand = 1
rsgislib.rastergis.find_neighbours(inputImage, ratBand)
fieldInfo = rsgislib.rastergis.FieldAttStats(field='NIRMean', min_field='MinNIRMeanDiff', max_field='MaxNIRMeanDiff')
rsgislib.rastergis.calc_rel_diff_neigh_stats(inputImage, fieldInfo, False, ratBand)
- rsgislib.rastergis.define_border_clumps(clumps_img, out_col)
Defines the clumps which are on the border within the file of the clumps using a mask
- Parameters:
clumps_img – is a string containing the name of the input clump file
out_col – is a string containing the name of the output column where a value of 1 indicates a border clump
- rsgislib.rastergis.define_clump_tile_positions(clumps_img, tile_img, out_col, tile_overlap, tile_boundary, tile_body)
Defines the position within the file of the clumps.
- Parameters:
clumps_img – is a string containing the name of the input clump file
tile_img – is a string containing the name of the input tile image
out_col – is a string containing the name of the output column
tile_overlap – is an unsigned int defining the overlap between tiles
tile_boundary – is an unsigned int
tile_body – is an unsigned int
- rsgislib.rastergis.find_boundary_pixels(clumps_img, output_img, gdalformat, rat_band)
Identifies the pixels on the boundary of the clumps
- Parameters:
clumps_img – is a string containing the name of the input image file
output_img – is a string containing the name of the output file
gdalformat – is a string containing the GDAL format for the output file - (Optional, Default = ‘KEA’)
rat_band – is an int specifying the band for which the neighbours are to be calculated (Optional, Default = 1)
- rsgislib.rastergis.find_neighbours(clumps_img, rat_band)
Finds the clump neighbours from an image
- Parameters:
clumps_img – is a string containing the name of the input image file
rat_band – is an int specifying the band for which the neighbours are to be calculated (Optional, Default = 1)
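Conceptually, two clumps are neighbours when at least one pair of their pixels touches. The sketch below derives neighbour lists from a small clump grid, assuming 4-connectivity and that clump ID 0 is background; both are assumptions, `find_clump_neighbours` is a hypothetical name, and this is not the library's implementation.

```python
def find_clump_neighbours(grid):
    """Return {clump id: sorted list of adjacent clump ids} for a 2D grid."""
    rows, cols = len(grid), len(grid[0])
    neighbours = {}
    for r in range(rows):
        for c in range(cols):
            cid = grid[r][c]
            if cid == 0:
                continue  # background pixels have no neighbour relationships
            neighbours.setdefault(cid, set())
            # Checking only the pixel below and to the right visits every
            # adjacent pixel pair exactly once.
            for dr, dc in ((1, 0), (0, 1)):
                nr, nc = r + dr, c + dc
                if nr < rows and nc < cols:
                    other = grid[nr][nc]
                    if other != 0 and other != cid:
                        neighbours[cid].add(other)
                        neighbours.setdefault(other, set()).add(cid)
    return {cid: sorted(ids) for cid, ids in neighbours.items()}
```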
- rsgislib.rastergis.clumps_spatial_location(clumps_img=string, eastings=string, northings=string, rat_band=int)
Adds spatial location columns to the attribute table
- Parameters:
clumps_img – is a string containing the name of the input image file
eastings – is a string containing the name of the eastings field
northings – is a string containing the name of the northings field
rat_band – is an integer containing the band number for the RAT (Optional, default = 1)
from rsgislib import rastergis
image = 'injune_p142_casi_sub_utm_segs_spatloc_eucdist.kea'
eastings = 'Easting'
northings = 'Northing'
rastergis.clumps_spatial_location(image, eastings, northings)
- rsgislib.rastergis.clumps_spatial_extent(clumps_img=string, min_xx=string, min_xy=string, max_xx=string, max_xy=string, min_yx=string, min_yy=string, max_yx=string, max_yy=string, rat_band=int)
Adds spatial extent for each clump to the attribute table
- Parameters:
clumps_img – is a string containing the name of the input image file
min_xx – is a string containing the name of the min X X field
min_xy – is a string containing the name of the min X Y field
max_xx – is a string containing the name of the max X X field
max_xy – is a string containing the name of the max X Y field
min_yx – is a string containing the name of the min Y X field
min_yy – is a string containing the name of the min Y Y field
max_yx – is a string containing the name of the max Y X field
max_yy – is a string containing the name of the max Y Y field
rat_band – is an integer containing the band number for the RAT (Optional, default = 1)
from rsgislib import rastergis
image = 'injune_p142_casi_sub_utm_segs_spatloc_eucdist.kea'
minX_X = 'minXX'
minX_Y = 'minXY'
maxX_X = 'maxXX'
maxX_Y = 'maxXY'
minY_X = 'minYX'
minY_Y = 'minYY'
maxY_X = 'maxYX'
maxY_Y = 'maxYY'
rastergis.clumps_spatial_extent(image, minX_X, minX_Y, maxX_X, maxX_Y, minY_X, minY_Y, maxY_X, maxY_Y)
Read RAT
- rsgislib.rastergis.get_column_data(clumps_img: str, col_name: str) -> array
A function to read a column of data from a RAT.
- Parameters:
clumps_img – Input clumps image
col_name – Name of the column to be read.
- Returns:
numpy array with values from clumps_img
- rsgislib.rastergis.read_rat_neighbours(clumps_img: str, start_row: int = None, end_row: int = None, rat_band: int = 1) -> List[List[int]]
A function which returns a list of clump neighbours from a KEA RAT. Note, the neighbours are populated using the function rsgislib.rastergis.find_neighbours. By default the whole dataset of neighbours is read into memory, but the start_row and end_row variables can be used to read a subset of the RAT.
- Parameters:
clumps_img – path to the image file with the RAT
start_row – the row within the RAT to start reading, if None will start at 0 (Default: None).
end_row – the row within the RAT to end reading, if None will end at n_rows within the RAT. (Default: None)
rat_band – the band within the image file for which the RAT is to be read.
- Returns:
list of lists with neighbour indexes.
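For a large RAT, start_row and end_row allow the neighbours to be processed in blocks rather than read in one go. The sketch below assumes that end_row is exclusive (suggested by the default of ending at n_rows, but not stated explicitly); `row_chunks` and `count_neighbour_links` are hypothetical helper names.

```python
def row_chunks(n_rows, chunk_size):
    """Yield (start_row, end_row) pairs that cover rows 0 .. n_rows - 1."""
    for start in range(0, n_rows, chunk_size):
        yield start, min(start + chunk_size, n_rows)


def count_neighbour_links(clumps_img, chunk_size=100000, rat_band=1):
    """Sum the number of neighbour entries across the RAT, block by block."""
    # Imported lazily so row_chunks() can be used without rsgislib installed.
    import rsgislib.rastergis

    n_rows = rsgislib.rastergis.get_rat_length(clumps_img, rat_band)
    total = 0
    for start, end in row_chunks(n_rows, chunk_size):
        neighbours = rsgislib.rastergis.read_rat_neighbours(
            clumps_img, start_row=start, end_row=end, rat_band=rat_band
        )
        total += sum(len(row) for row in neighbours)
    return total
```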
Sampling
- rsgislib.rastergis.histo_sampling(clumps_img=string, val_col=string, out_sel_col=string, prop_sample=float, bin_width=float, cls_col=string, class_val=string, rat_band=int)
This function performs a histogram based sampling of the RAT for a specific column. The output is a binary column within the RAT where rows with a value of 1 are the selected clumps.
- Parameters:
clumps_img – is a string containing the name of the input clumps image file
val_col – is a string containing the name of the field with the values used for the sampling.
out_sel_col – is a string containing the name of the field where the binary output will be written (1 for selected clumps).
prop_sample – is a float specifying the proportion of the dataset which should be within the outputted sample. Values range from 0 to 1; 0.5 would be a 50% sample.
bin_width – is a float specifying the width of each histogram bin.
cls_col – is a string specifying a field within which classes have been defined. This can be used to only apply the sampling to a thematic subset of the RAT. If set as None then this is ignored. (Default = None)
class_val – is a string specifying the class it will be limited to.
rat_band – is an optional (default = 1) integer parameter specifying the image band to which the RAT is associated.
from rsgislib import rastergis
rastergis.histo_sampling(clumps_img='N00E103_10_grid_knn.kea', val_col='HH', out_sel_col='HHSampling', prop_sample=0.25, bin_width=0.01, cls_col='Class', class_val='2')
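The idea behind histogram based sampling is that the same proportion is drawn from every histogram bin, so the sample follows the distribution of the column values. The sketch below is a pure-Python illustration under that assumption; the `seed` argument and the name `histogram_sample` are hypothetical, and the library's actual selection rule within each bin may differ.

```python
import math
import random
from collections import defaultdict


def histogram_sample(values, prop_sample, bin_width, seed=0):
    """Return a 0/1 list selecting prop_sample of the rows in each bin."""
    rng = random.Random(seed)
    bins = defaultdict(list)
    for row, value in enumerate(values):
        # Rows are grouped by the histogram bin their value falls in.
        bins[math.floor(value / bin_width)].append(row)

    selected = [0] * len(values)
    for rows in bins.values():
        n_pick = round(len(rows) * prop_sample)
        for row in rng.sample(rows, n_pick):
            selected[row] = 1
    return selected
```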
- rsgislib.rastergis.take_random_sample(clumps_img: str, in_col_name: str, in_col_val: float, out_col_name: str, sample_ratio: float, rnd_seed: int = 0)
A function to take a random sample of an input column.
- Parameters:
clumps_img – clumps image.
in_col_name – input column name.
in_col_val – numeric value for which the random sample is to be taken for.
out_col_name – output column where value of 1 is selected within the random sample and 0 is not selected.
sample_ratio – the size of the sample (0 - 1.0; i.e., 10% = 0.1) to be taken of the number of rows within input value.
rnd_seed – is the seed for the random number generation (optional; default is 0).
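The selection can be pictured as choosing a fixed fraction of the rows whose input column equals in_col_val, with the seed making the choice repeatable. This is a minimal sketch using Python's standard random module, not the library's implementation; `random_sample_column` is a hypothetical name and truncating the sample size to an integer is an assumption.

```python
import random


def random_sample_column(in_col, in_col_val, sample_ratio, rnd_seed=0):
    """Return a 0/1 list marking a random sample of rows equal to in_col_val."""
    candidates = [row for row, value in enumerate(in_col) if value == in_col_val]
    n_pick = int(len(candidates) * sample_ratio)
    rng = random.Random(rnd_seed)  # the same seed gives the same sample
    out_col = [0] * len(in_col)
    for row in rng.sample(candidates, n_pick):
        out_col[row] = 1
    return out_col
```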
- rsgislib.rastergis.select_clumps_on_grid(clumps_img, in_sel_col, out_sel_col, eastings_col, northings_col, metric_col, method, rows, cols)
Selects a segment within a regular grid pattern across the scene. The clump is selected based on the minimum, maximum or closest to the mean.
- Parameters:
clumps_img – is a string containing the name of the input clump file
in_sel_col – is a string which defines the column name where a value of 1 defines the clumps which will be included in the analysis.
out_sel_col – is a string which defines the column name where a value of 1 defines the clumps selected by the analysis.
eastings_col – is a string which defines a column with an easting for each clump.
northings_col – is a string which defines a column with a northing for each clump.
metric_col – is a string which defines a column with a value for each clump which will be used for the distance, min, or max analysis.
method – is a string which defines whether the minimum, maximum or mean method of selecting a clump will be used (values can be either min, max or mean).
rows – is an unsigned integer which defines the number of rows within which a clump will be selected.
cols – is an unsigned integer which defines the number of columns within which a clump will be selected.
Classification
- rsgislib.rastergis.identify_small_units(clumps_img: str, class_col: str, tmp_dir: str, out_col_name: str, small_clumps_thres: float, use_tiled_clump: bool = False, n_cores: int = 1, tile_width: int = 2000, tile_height: int = 2000)
Identify small connected units within a classification. The threshold to define small is provided by the user in pixels. Note, the out_col_name and small_clumps_thres variables can be provided as lists to identify a number of thresholds of small units.
- Parameters:
clumps_img – string for the clumps image file containing input classification
class_col – string for the column name representing the classification as integer values
tmp_dir – directory path where temporary layers are stored (if directory is created within the function it will be deleted once function is complete).
out_col_name – a list of output column names (i.e., one for each threshold)
small_clumps_thres – a list of thresholds for identifying small clumps.
use_tiled_clump – a boolean to specify whether the tiled clumping algorithm should be used (Default is False; select True for large datasets)
n_cores – if the tiled version of the clumping algorithm is being used then there is an option to use multiple processing cores; specify the number to be used (Default is 1).
tile_width – is the width of the image tile (in pixels) if tiled clumping is used.
tile_height – is the height of the image tile (in pixels) if tiled clumping is used.
Example:
import rsgislib.rastergis
clumpsImg = "LS2MSS_19750620_lat10lon6493_r67p250_rad_srefdem_30m_clumps.kea"
tmpPath = "tmp/"
classCol = "OutClass"
outColName = ["SmallUnits25", "SmallUnits50", "SmallUnits100"]
smallClumpsThres = [25, 50, 100]
rsgislib.rastergis.identify_small_units(clumpsImg, classCol, tmpPath, outColName, smallClumpsThres)
- rsgislib.rastergis.class_split_fit_hist_gausian_mixture_model(clumps_img=string, out_col=string, val_col=string, bin_width=float, cls_col=string, cls_val=string, rat_band=int)
This function fits a Gaussian mixture model to the histogram for a variable in the RAT and uses it to split the class into a series of subclasses.
- Parameters:
clumps_img – is a string containing the name of the input clumps image file
out_col – is a string for a HDF5 with the fitted Gaussians.
val_col – is a string containing the name of the field with the values used for the sampling.
bin_width – is a float specifying the width of each histogram bin.
cls_col – is a string specifying a field within which classes have been defined.
cls_val – is a string specifying the class it will be limited to.
rat_band – is an optional (default = 1) integer parameter specifying the image band to which the RAT is associated.
from rsgislib import rastergis
rastergis.class_split_fit_hist_gausian_mixture_model(clumps_img='FrenchGuiana_10_ALL_sl_HH_lee_UTM_mosaic_dB_segs.kea', out_col='MangroveSubClass', val_col='HVdB', bin_width=0.1, cls_col='Classes', cls_val='Mangroves')
Extrapolation
- rsgislib.rastergis.apply_rat_knn(clumps_img=string, in_extrap_col=string, out_extrap_col=string, train_regions_col=string, apply_regions_col=string, val_cols=list<string>, k_feat=uint, dist_knn=int, summerise_knn=int, dist_thres=float, rat_band=int)
This function uses the KNN algorithm to allow data values to be extrapolated to segments.
- Parameters:
clumps_img – is a string containing the name of the input clumps image file
in_extrap_col – is a string containing the name of the field with the values used for the extrapolation.
out_extrap_col – is a string containing the name of the field where the extrapolated values will be written to.
train_regions_col – is a string containing the name of the field specifying the clumps to be used as training - binary column (1 == training region).
apply_regions_col – is a string containing the name of the field specifying the regions for which KNN is to be applied - binary column (1 == regions to be calculated). If None then ignored and applied to all.
val_cols – is a list of strings specifying the fields which will be used to calculate distance.
k_feat – is an unsigned integer specifying the number of nearest features (i.e., K) to be used (Default: 12)
dist_knn – specifies how the distance to identify NN is calculated (rsgislib.DIST_EUCLIDEAN, rsgislib.DIST_MANHATTEN, rsgislib.DIST_MAHALANOBIS, rsgislib.DIST_MINKOWSKI, rsgislib.DIST_CHEBYSHEV; Default: rsgislib.DIST_MAHALANOBIS).
summerise_knn – specifies how the extrapolation value is calculated (rsgislib.SUMTYPE_MODE, rsgislib.SUMTYPE_MEAN, rsgislib.SUMTYPE_MEDIAN, rsgislib.SUMTYPE_MIN, rsgislib.SUMTYPE_MAX, rsgislib.SUMTYPE_STDDEV; Default: rsgislib.SUMTYPE_MEDIAN). Mode is used for classification.
dist_thres – is a maximum distance threshold over which features will not be included within the ‘k’.
rat_band – is an optional (default = 1) integer parameter specifying the image band to which the RAT is associated.
import rsgislib
from rsgislib import rastergis
from rsgislib import imageutils
forestClumpsImg = './LS5TM_20110428_forestclumps.kea'
rastergis.apply_rat_knn(clumps_img=forestClumpsImg, in_extrap_col='HP95', out_extrap_col='HP95Pred', train_regions_col='LiDARForest', apply_regions_col=None, val_cols=['RedRefl', 'GreenRefl', 'BlueRefl'], k_feat=12, dist_knn=rsgislib.DIST_EUCLIDEAN, summerise_knn=rsgislib.SUMTYPE_MEDIAN, dist_thres=25)
# Export predicted column to GDAL image
forestHeightImg = './LS5TM_20110428_forest95Height.kea'
rastergis.export_col_to_gdal_img(forestClumpsImg, forestHeightImg, 'KEA', rsgislib.TYPE_32FLOAT, 'HP95Pred')
imageutils.popImageStats(forestHeightImg, True, 0., True)
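The roles of k_feat, dist_thres and the summary type can be illustrated with a small pure-Python sketch. It is not the library's implementation: it assumes Euclidean distance and a median summary, and the helper name `knn_extrapolate` and the toy data are hypothetical.

```python
import math
import statistics


def knn_extrapolate(train_rows, target_feats, k, dist_thres=None):
    """Median value of the k nearest training rows (Euclidean distance).

    train_rows is a list of (feature_list, value) pairs. Rows further away
    than dist_thres are excluded; None is returned if nothing remains.
    """
    candidates = []
    for feats, value in train_rows:
        dist = math.dist(feats, target_feats)
        if dist_thres is None or dist <= dist_thres:
            candidates.append((dist, value))
    if not candidates:
        return None
    candidates.sort(key=lambda pair: pair[0])
    nearest_values = [value for _, value in candidates[:k]]
    return statistics.median(nearest_values)


# Toy training segments: two feature values and a known height each.
training = [
    ([0.0, 0.0], 10.0),
    ([1.0, 0.0], 12.0),
    ([0.0, 2.0], 20.0),
    ([10.0, 10.0], 50.0),
]
pred = knn_extrapolate(training, [0.0, 1.0], k=3, dist_thres=5.0)
```

Here the fourth training segment lies beyond the distance threshold, so the prediction is the median of the remaining three values. Swapping `statistics.median` for `statistics.mode` would correspond to the classification case mentioned for summerise_knn.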
Change Detection
- rsgislib.rastergis.get_global_class_stats(clumps_img, class_field, attributes, cls_chg_cols, rat_band)
Similar to ‘findChangeClumpsFromStdDev’ but, rather than applying a threshold to calculate change clumps, adds the global (over all objects) class mean and standard deviation to the RAT.
- Parameters:
clumps_img – is a string containing the name of the input clump file
class_field – is a string providing the name of the column containing classes.
attributes – is a sequence of strings containing the columns to use when detecting change.
cls_chg_cols – is a sequence of Python objects having the following attribute: * cls_name - the name of the class in which change is to be searched for
rat_band – is an int specifying the image band with which the RAT is associated (Optional, Default = 1)
from rsgislib import rastergis
clumpsImage = 'injune_p142_casi_sub_utm_segs_popstats.kea'
changeFeatVals = []
changeFeatVals.append(rastergis.ChangeFeat(cls_name='Forest'))
changeFeatVals.append(rastergis.ChangeFeat(cls_name='Scrub-Shrub'))
rastergis.get_global_class_stats(clumpsImage, 'ClassName', ['NDVI'], changeFeatVals)
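As a rough illustration of what "global class mean and standard deviation" means, the sketch below groups toy RAT rows by class and summarises one attribute per class. The helper name `global_class_stats`, the row layout, and the use of the population standard deviation are assumptions; the library may compute or store these values differently.

```python
import statistics
from collections import defaultdict


def global_class_stats(rows, class_field, attribute):
    """Return {class_name: (mean, std_dev)} over all rows of each class."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[class_field]].append(row[attribute])
    # Population standard deviation is an assumption of this sketch.
    return {
        cls: (statistics.fmean(vals), statistics.pstdev(vals))
        for cls, vals in grouped.items()
    }


# Each dict stands in for one RAT row (one clump).
rat_rows = [
    {"ClassName": "Forest", "NDVI": 0.8},
    {"ClassName": "Forest", "NDVI": 0.6},
    {"ClassName": "Scrub-Shrub", "NDVI": 0.4},
]
stats = global_class_stats(rat_rows, "ClassName", "NDVI")
```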
Statistics
- rsgislib.rastergis.fit_hist_gausian_mixture_model(clumps_img=string, out_h5_file=string, out_hist_file=string, val_col=string, bin_width=float, cls_col=string, cls_val=string, rat_band=int)
This function fits a Gaussian mixture model to the histogram for a variable in the RAT.
- Parameters:
clumps_img – is a string containing the name of the input clumps image file
out_h5_file – is a string for an HDF5 file with the fitted Gaussians.
out_hist_file – is a string to output the histogram as an HDF5 file.
val_col – is a string containing the name of the field with the values used for the sampling.
bin_width – is a float specifying the width of each histogram bin.
cls_col – is a string specifying a field within which classes have been defined.
cls_val – is a string specifying the class it will be limited to.
rat_band – is an optional (default = 1) integer parameter specifying the image band to which the RAT is associated.
from rsgislib import rastergis
rastergis.fit_hist_gausian_mixture_model(clumps_img='FrenchGuiana_10_ALL_sl_HH_lee_UTM_mosaic_dB_segs.kea', out_h5_file='gaufit.h5', out_hist_file='histfile.h5', val_col='HVdB', bin_width=0.1, cls_col='Classes', cls_val='Mangrove')
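The interplay of val_col, bin_width, cls_col and cls_val can be sketched as a class-restricted histogram, which is the input a mixture model would then be fitted to. This is only an illustration of the binning step under assumed conventions (bin edges at multiples of the bin width); the helper name `class_histogram` and the toy rows are hypothetical, and no model fitting is shown.

```python
import math
from collections import Counter


def class_histogram(rows, val_col, bin_width, cls_col, cls_val):
    """Count val_col values, for rows of class cls_val, into fixed-width bins.

    Returns {bin_lower_edge: count}, with edges at multiples of bin_width.
    """
    counts = Counter()
    for row in rows:
        if row[cls_col] != cls_val:
            continue  # only the selected class contributes
        counts[math.floor(row[val_col] / bin_width)] += 1
    return {idx * bin_width: n for idx, n in sorted(counts.items())}


rows = [
    {"Classes": "Mangrove", "HVdB": -12.3},
    {"Classes": "Mangrove", "HVdB": -12.1},
    {"Classes": "Mangrove", "HVdB": -11.6},
    {"Classes": "Water", "HVdB": -3.0},
]
hist = class_histogram(rows, "HVdB", 0.5, "Classes", "Mangrove")
```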
- rsgislib.rastergis.calc_1d_jm_distance(clumps_img=string, val_col=string, bin_width=float, cls_col=string, class1=string, class2=string, rat_band=uint)
Calculate the Jeffries-Matusita (JM) distance for a single variable between two classes.
- Parameters:
clumps_img – is a string containing the name of the input clump file
val_col – is a string specifying the name of the variable column.
bin_width – is a float specifying the bin width for the histogram.
cls_col – is a string specifying the column name with the class names.
class1 – is a string specifying the first class.
class2 – is a string specifying the second class.
rat_band – is an optional (default = 1) integer parameter specifying the image band to which the RAT is associated in the clumps image.
- Returns:
double for distance
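For orientation, one common formulation of the Jeffries-Matusita distance between two binned samples is JM = 2(1 - BC), where BC is the Bhattacharyya coefficient of the two normalised histograms. The sketch below implements that formulation in pure Python; whether RSGISLib uses exactly this form (some authors take a square root) is not stated here, and the helper names are hypothetical.

```python
import math
from collections import Counter


def normalised_hist(values, bin_width):
    """Map bin index -> proportion of values falling in that bin."""
    counts = Counter(math.floor(v / bin_width) for v in values)
    total = sum(counts.values())
    return {idx: n / total for idx, n in counts.items()}


def jm_distance(values1, values2, bin_width):
    """JM = 2 * (1 - BC), with BC = sum(sqrt(p_i * q_i)) over shared bins."""
    p = normalised_hist(values1, bin_width)
    q = normalised_hist(values2, bin_width)
    bc = sum(math.sqrt(p[idx] * q[idx]) for idx in p.keys() & q.keys())
    return 2.0 * (1.0 - bc)


same = jm_distance([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0], 1.0)
disjoint = jm_distance([0.1, 0.2], [5.1, 5.2], 1.0)
```

Under this formulation the distance ranges from 0 for identical histograms to 2 for histograms with no overlapping bins, which is why the bin width matters: wider bins make classes look more similar.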
- rsgislib.rastergis.calc_2d_jm_distance(clumps_img=string, val1_col=string, val2_col=string, val1_bin_width=float, val2_bin_width=float, cls_col=string, class1=string, class2=string, rat_band=uint)
Calculate the Jeffries-Matusita (JM) distance for two variables between two classes.
- Parameters:
clumps_img – is a string containing the name of the input clump file
val1_col – is a string specifying the name of the first variable column.
val2_col – is a string specifying the name of the second variable column.
val1_bin_width – is a float specifying the bin width for the histogram for variable 1.
val2_bin_width – is a float specifying the bin width for the histogram for variable 2.
cls_col – is a string specifying the column name with the class names.
class1 – is a string specifying the first class.
class2 – is a string specifying the second class.
rat_band – is an optional (default = 1) integer parameter specifying the image band to which the RAT is associated in the clumps image.
- Returns:
double for distance
- rsgislib.rastergis.calc_bhattacharyya_distance(clumps_img=string, val_col=string, cls_col=string, class1=string, class2=string, rat_band=uint)
Calculate the Bhattacharyya distance for a single variable between two classes.
- Parameters:
clumps_img – is a string containing the name of the input clump file
val_col – is a string specifying the name of the variable column.
cls_col – is a string specifying the column name with the class names.
class1 – is a string specifying the first class.
class2 – is a string specifying the second class.
rat_band – is an optional (default = 1) integer parameter specifying the image band to which the RAT is associated in the clumps image.
- Returns:
double for distance
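Because this function takes no bin width, one plausible reading is that each class is summarised by its mean and variance. Under that assumption (which the text above does not confirm), the Bhattacharyya distance between two univariate normal distributions has a closed form, sketched below with a hypothetical helper name.

```python
import math
import statistics


def bhattacharyya_gaussian(values1, values2):
    """Bhattacharyya distance, modelling each sample as a normal distribution.

    Both samples need non-zero variance, otherwise the formula is undefined.
    """
    m1, m2 = statistics.fmean(values1), statistics.fmean(values2)
    v1, v2 = statistics.pvariance(values1), statistics.pvariance(values2)
    # Term driven by the separation of the means.
    mean_term = 0.25 * (m1 - m2) ** 2 / (v1 + v2)
    # Term driven by the difference in spread.
    var_term = 0.5 * math.log((v1 + v2) / (2.0 * math.sqrt(v1 * v2)))
    return mean_term + var_term


forest_ndvi = [0.7, 0.8, 0.9]
scrub_ndvi = [0.3, 0.4, 0.5]
dist = bhattacharyya_gaussian(forest_ndvi, scrub_ndvi)
```

The distance is zero for identical distributions and grows as the class means separate relative to their spread.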
Copy & Export
- rsgislib.rastergis.export_rat_cols_to_ascii(clumps_img, out_file, fields, rat_band=1)
Exports selected columns from a GDAL RAT to ASCII file (comma separated). The first column is the object ID (FID).
- Parameters:
clumps_img – is a string containing the name of the input RAT.
out_file – is a string containing the name of the output file.
fields – is a sequence of strings containing the field names.
rat_band – is an optional (default = 1) integer parameter specifying the image band to which the RAT is associated.
from rsgislib import rastergis
clumps = './RATS/injune_p142_casi_sub_utm_clumps_elim_final_clumps_elim_final.kea'
outfile = './TestOutputs/RasterGIS/injune_p142_casi_rgb_exportascii.txt'
fields = ['BlueAvg', 'GreenAvg', 'RedAvg']
rastergis.export_rat_cols_to_ascii(clumps, outfile, fields)
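The shape of the exported text (comma separated, object ID first) can be sketched with the standard library. This is an illustration only: the helper name `rat_cols_to_csv`, the header row, and the use of the row index as the FID are assumptions, and the real output may differ in detail.

```python
import csv
import io


def rat_cols_to_csv(columns, fields):
    """Return the selected fields as comma-separated text with the FID first.

    columns maps a field name to a list of values, one per RAT row.
    """
    n_rows = len(columns[fields[0]])
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["FID"] + list(fields))  # header row is an assumption
    for fid in range(n_rows):
        writer.writerow([fid] + [columns[name][fid] for name in fields])
    return buffer.getvalue()


rat = {"BlueAvg": [10.5, 11.0], "GreenAvg": [20.0, 21.5], "RedAvg": [30.0, 31.0]}
text = rat_cols_to_csv(rat, ["BlueAvg", "GreenAvg", "RedAvg"])
```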
- rsgislib.rastergis.export_col_to_gdal_img(clumps_img, output_img, gdalformat, datatype, field, rat_band=1)
Exports a column of the raster attribute table as a band in a GDAL image.
- Parameters:
clumps_img – is a string containing the name of the input image file with RAT
output_img – is a string containing the name of the output gdal file
gdalformat – is a string containing the GDAL format for the output file (e.g., ‘KEA’)
datatype – is an int containing one of the values from rsgislib.TYPE_*
field – is a string, providing the name of the column to be exported.
rat_band – is an optional (default = 1) integer parameter specifying the image band to which the RAT is associated.
import rsgislib
from rsgislib import rastergis
clumps = './RATS/injune_p142_casi_sub_utm_clumps_elim_final_clumps_elim_final.kea'
output_img = './TestOutputs/RasterGIS/injune_p142_casi_rgb_export.kea'
gdalformat = 'KEA'
datatype = rsgislib.TYPE_32FLOAT
field = 'RedAvg'
rastergis.export_col_to_gdal_img(clumps, output_img, gdalformat, datatype, field)
- rsgislib.rastergis.export_cols_to_gdal_img(clumps_img: str, output_img: str, gdalformat: str, datatype: int, fields: List[str], rat_band: int = 1, tmp_dir: str = None)
Exports columns of the raster attribute table as bands in a GDAL image. Utility function, exports each column individually then stacks them.
- Parameters:
clumps_img – is a string containing the name of the input image file with RAT
output_img – is a string containing the name of the output gdal file
gdalformat – is a string containing the GDAL format for the output file (e.g., ‘KEA’)
datatype – is an int containing one of the values from rsgislib.TYPE_*
fields – is a list of strings, providing the names of the columns to be exported
rat_band – is an optional (default = 1) integer parameter specifying the image band to which the RAT is associated.
Example:
import rsgislib
from rsgislib import rastergis
clumps = 'RATS/injune_p142_casi_sub_utm_clumps_elim_final_clumps_elim_final.kea'
outimage = 'TestOutputs/RasterGIS/injune_p142_casi_rgb_export.kea'
gdalformat = 'KEA'
datatype = rsgislib.TYPE_32FLOAT
fields = ['RedAvg', 'GreenAvg', 'BlueAvg']
rastergis.export_cols_to_gdal_img(clumps, outimage, gdalformat, datatype, fields)
- rsgislib.rastergis.export_clumps_to_images(clumps_img, out_img_base, bin_out, out_img_ext, gdalformat, rat_band=1)
Exports each clump to a separate raster which has the minimum extent for the clump.
- Parameters:
clumps_img – is a string containing the name of the input image file with RAT
out_img_base – is a string containing the base name of the output image file (C + FID will be added to identify files).
bin_out – is a boolean specifying whether the output images should be binary or if the pixel value should be the FID of the clump.
out_img_ext – is a string with the output file extension (e.g., kea), without the preceding dot, to be appended to the file name.
gdalformat – is a string containing the GDAL format for the output file (e.g., ‘KEA’)
rat_band – is an optional (default = 1) integer parameter specifying the image band to which the RAT is associated.
from rsgislib import rastergis
clumps = './DefineTiles.kea'
outimgbase = './Tiles/OutputImgTile_'
outimgext = 'kea'
gdalformat = 'KEA'
binaryOut = False
# rat_band is left at its default of 1
rastergis.export_clumps_to_images(clumps, outimgbase, binaryOut, outimgext, gdalformat)
- rsgislib.rastergis.copy_gdal_rat_columns(clumps_img, output_img, fields, copy_colours=True, copy_hist=True, rat_band=1)
Copies GDAL RAT columns from one image to another
- Parameters:
clumps_img – is a string containing the name and path of the image with the RAT from which columns are to be copied.
output_img – is a string containing the name of the file to which the columns are to be copied.
fields – is a sequence of strings containing the names of the fields to copy
copy_colours – is a bool specifying if the colour columns should be copied (default = True)
copy_hist – is a bool specifying if the histogram should be copied (default = True)
rat_band – is an optional (default = 1) integer parameter specifying the image band to which the RAT is associated.
from rsgislib import rastergis
table = './RATS/injune_p142_casi_sub_utm_clumps_elim_final_clumps_elim_final.kea'
image = './TestOutputs/RasterGIS/injune_p142_casi_sub_utm_segs_cpcols.kea'
fields = ['NIRAvg', 'BlueAvg', 'GreenAvg', 'RedAvg']
# Source image (with the RAT) first, then the image the columns are copied to
rastergis.copy_gdal_rat_columns(table, image, fields)
To copy a subset of columns from one RAT to a new file the following can be used:
import rsgislib
import rsgislib.imageutils
from rsgislib import rastergis
rat_band = 1
table = 'inRAT.kea'
output = 'outRAT_nir_only.kea'
bands = [rat_band]
rsgislib.imageutils.selectImageBands(table, output, 'KEA', rsgislib.TYPE_32INT, bands)
fields = ['NIRAvg']
rastergis.copy_gdal_rat_columns(table, output, fields, copy_colours=True, copy_hist=True, rat_band=rat_band)
- rsgislib.rastergis.copy_rat(clumps_img, output_img, rat_band=1)
Copies a GDAL RAT from one image to another
- Parameters:
clumps_img – is a string containing the name and path of the image with the RAT which is to be copied.
output_img – is a string containing the name of the file to which the RAT is to be copied.
rat_band – is an optional (default = 1) integer parameter specifying the image band to which the RAT is associated.
from rsgislib import rastergis
clumps = './RATS/injune_p142_casi_sub_utm_clumps_elim_final_clumps_elim_final.kea'
output_img = './TestOutputs/RasterGIS/injune_p142_casi_sub_utm_segs_cptab.kea'
rastergis.copy_rat(clumps, output_img)
- rsgislib.rastergis.import_vec_atts(clumps_img, vec_file, vec_lyr, fid_col, col_names, rat_band=1)
Copies the attributes from an input vector layer (e.g., a shapefile) to the RAT.
- Parameters:
clumps_img – is a string containing the name of the input file with RAT
vec_file – is a string containing the file path of the input vector file
vec_lyr – is a string containing the layer name within the input vector file
fid_col – is a string with the name of a column which has the clumps pixel value associated with the vector feature.
col_names – is a list of strings specifying the columns to be copied to the RAT. If ‘None’ then all attributes will be copied.
rat_band – is an optional (default = 1) integer parameter specifying the image band to which the RAT is associated.
from rsgislib import rastergis
clumps = 'clumpsFiles.kea'
vectorFile = 'vectorFile.shp'
veclyr = 'vectorFile'
rastergis.import_vec_atts(clumps, vectorFile, veclyr, 'pxlval', None)
Colour Tables
- rsgislib.rastergis.colour_rat_classes(clumps_img, field, class_colours, rat_band)
Sets a colour table for a set of classes within the attribute table
- Parameters:
clumps_img – is a string containing the name of the input file
field – is a string containing the name of the input class field (class can be a string or integer).
class_colours – is a dict mapping int class ids to an object having the following attributes:
* red: int defining the red colour component (0 - 255)
* green: int defining the green colour component (0 - 255)
* blue: int defining the blue colour component (0 - 255)
* alpha: int defining the alpha colour component (0 - 255)
rat_band – is an optional (default = 1) integer parameter specifying the image band to which the RAT is associated.
import collections
from rsgislib import rastergis
clumps = './TestOutputs/RasterGIS/injune_p142_casi_sub_utm_segs_col.kea'
field = 'outClass'
classcolours = {}
colourCat = collections.namedtuple('ColourCat', ['red', 'green', 'blue', 'alpha'])
classcolours[0] = colourCat(red=200, green=50, blue=50, alpha=255)
classcolours[1] = colourCat(red=200, green=240, blue=50, alpha=255)
rastergis.colour_rat_classes(clumps, field, classcolours)
- rsgislib.rastergis.set_class_names_colours(clumps_img: str, class_names_col: str, class_info_dict: dict)
A function to define a class names column and define the class colours.
classInfoDict = dict()
classInfoDict[1] = {'classname': 'Forest', 'red': 0, 'green': 255, 'blue': 0}
classInfoDict[2] = {'classname': 'Water', 'red': 0, 'green': 0, 'blue': 255}
- Parameters:
clumps_img – Input clumps image - expecting a classification (rather than segmentation) where the number is the pixel value.
class_names_col – The output column for the class names.
class_info_dict – a dict where the key is the pixel value for the class.
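When there are many classes, the dict can be built from a compact list rather than written out by hand. The helper below is a hypothetical convenience, not part of RSGISLib; it produces the structure shown above and checks that each colour component is within 0 - 255. The image and column names in the final comment are placeholders.

```python
def build_class_info(classes):
    """Build a dict keyed by pixel value from
    (pixel_value, name, red, green, blue) tuples."""
    class_info = {}
    for pxl_val, name, red, green, blue in classes:
        for component in (red, green, blue):
            if not 0 <= component <= 255:
                raise ValueError(
                    f"Colour component {component} outside 0-255 for class {name!r}"
                )
        class_info[pxl_val] = {
            "classname": name,
            "red": red,
            "green": green,
            "blue": blue,
        }
    return class_info


class_info_dict = build_class_info([(1, "Forest", 0, 255, 0), (2, "Water", 0, 0, 255)])
# The result could then be passed on, for example:
# rastergis.set_class_names_colours("classification.kea", "class_names", class_info_dict)
```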
- rsgislib.rastergis.get_rat_colours(clumps_img: str, cls_column: str = None) Dict[str, Dict]
A function which gets the colour table and optionally the class names from the rat and returns it as a dict which can be inputted into the rsgislib.rastergis.set_class_names_colours function. This is useful for copying the class names and colours between files. Output dict will have the following structure:
class_clr_info[1] = {'classname': 'Forest', 'red': 0, 'green': 255, 'blue': 0}
class_clr_info[2] = {'classname': 'Water', 'red': 0, 'green': 0, 'blue': 255}
- Parameters:
clumps_img – Input clumps image
cls_column – Optionally a class names column can be provided. If None then ignored and output dict doesn’t have a ‘classname’ field.
- Returns:
dict of dicts
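The copy-between-files use mentioned above amounts to reading the dict from one image and writing it to another. The sketch below wraps those two calls in a hypothetical helper; the module is passed in as an argument purely so the sketch can be read and tested without RSGISLib installed, and the file and column names in the usage comment are placeholders.

```python
def copy_class_names_colours(rastergis_mod, src_img, dst_img, cls_column):
    """Copy class names and colours from src_img to dst_img.

    rastergis_mod is expected to be the rsgislib.rastergis module (or any
    object providing get_rat_colours and set_class_names_colours).
    """
    class_clr_info = rastergis_mod.get_rat_colours(src_img, cls_column)
    rastergis_mod.set_class_names_colours(dst_img, cls_column, class_clr_info)
    return class_clr_info


# Typical use, assuming RSGISLib is available:
# from rsgislib import rastergis
# copy_class_names_colours(rastergis, "src_cls.kea", "dst_cls.kea", "class_names")
```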
Data Structures / Enums
- rsgislib.rastergis.BandAttStats(band, min_field=None, max_field=None, sum_field=None, std_dev_field=None, mean_field=None)
This is passed to the populate_rat_with_stats function
- rsgislib.rastergis.FieldAttStats(field, min_field=None, max_field=None, sum_field=None, std_dev_field=None, mean_field=None)
This is passed to the calc_rel_diff_neigh_stats function
- rsgislib.rastergis.BandAttPercentiles(percentile: float, field_name: str)
This is passed to the populate_rat_with_percentiles function
- rsgislib.rastergis.ShapeIndex(col_name: str, idx: int, col_idx: int = 0)
This is passed to the calc_shape_indices function