RSGISLib Vector Utils Module

Vector Attributes

rsgislib.vectorutils.vector_maths(vec_file: str, vec_lyr: str, out_vec_file: str, out_vec_lyr: str, out_format: str, out_col: str, exp: str, vars: list, del_exist_vec: bool)

A function to calculate a numeric column from data in existing columns. The expression syntax is that of the muparser library; see the muparser documentation for the available operations and syntax.

Parameters
  • vec_file – is a string containing the input vector file path

  • vec_lyr – is a string containing the name of the input vector layer

  • out_vec_file – is a string containing the output vector file path

  • out_vec_lyr – is a string containing the name of the output vector layer

  • out_format – is a string containing the output file format

  • out_col – is a string containing the name of the output column

  • exp – is a string containing the muparser expression to be calculated.

  • vars – is a list of rsgislib.vectorutils.VecColVar objects defining the names of the variables used within the expression and the columns in vec_file which they refer to.

  • del_exist_vec – is a bool, specifying whether to force removal of the output vector if it exists
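A minimal sketch of how the expression and the vars list fit together. The column names (Population, AreaKm2), the output column name (pop_dens) and the helper function names are hypothetical and only illustrate the pattern. The rsgislib import sits inside the function so the expression helper can be inspected without the library installed.

```python
def build_density_expression():
    """Return a muparser expression and the variable-to-column mapping it uses.

    The column names are hypothetical; replace them with columns that
    exist in your own attribute table.
    """
    exp = "pop / area"
    var_cols = {"pop": "Population", "area": "AreaKm2"}
    return exp, var_cols


def add_density_column(vec_file, vec_lyr, out_vec_file, out_vec_lyr):
    """Write a copy of the layer with a new 'pop_dens' column."""
    # Imported here so this module loads even where rsgislib is absent.
    import rsgislib.vectorutils as vectorutils

    exp, var_cols = build_density_expression()
    # One VecColVar per variable named in the expression.
    vars_lst = [
        vectorutils.VecColVar(name=var, field_name=col)
        for var, col in var_cols.items()
    ]
    # Positional order follows the signature documented above.
    vectorutils.vector_maths(
        vec_file, vec_lyr, out_vec_file, out_vec_lyr,
        "GPKG", "pop_dens", exp, vars_lst, True,
    )
```

Each variable name in the expression needs a matching VecColVar entry, which is why the list is built from the same mapping as the expression.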

rsgislib.vectorutils.copy_rat_cols_to_vector_lyr(vec_file: str, vec_lyr: str, rat_row_col: str, clumps_img: str, ratcols: list, out_col_names: Optional[list] = None, out_col_types: Optional[list] = None)

A function to copy columns from RAT to a vector layer. Note, the vector layer needs a column, which already exists, that specifies the row from the RAT the feature is related to. If you created the vector using the polygonise function then that column will have been created and called ‘PXLVAL’.

Parameters
  • vec_file – The vector file to be used.

  • vec_lyr – The name of the layer within the vector file.

  • rat_row_col – The column in the layer which specifies the RAT row the feature corresponds with.

  • clumps_img – The clumps image with the RAT from which information should be taken.

  • ratcols – The names of the columns in the RAT to be copied.

  • out_col_names – If you do not want the same column names as the RAT then you can specify alternatives. If None then the names will be the same as the RAT. (Default = None)

  • out_col_types – The data types used for the columns in vector layer. If None then matched to RAT. Default is None

rsgislib.vectorutils.perform_spatial_join(vec_base_file: str, vec_base_lyr: str, vec_join_file: str, vec_join_lyr: str, out_vec_file: str, out_vec_lyr: str, out_format: str = 'GPKG', join_how: str = 'inner', join_op: str = 'within')

A function to perform a spatial join between two vector layers. This function uses geopandas so this needs to be installed. You also need to have the rtree package to generate the index used to perform the intersection.

For more information see: http://geopandas.org/mergingdata.html#spatial-joins

Parameters
  • vec_base_file – the base vector file with the geometries which will be outputted.

  • vec_base_lyr – the layer name for the base vector.

  • vec_join_file – the vector with the attributes which will be joined to the base vector geometries.

  • vec_join_lyr – the layer name for the join vector.

  • out_vec_file – the output vector file.

  • out_vec_lyr – the layer name for the output vector.

  • out_format – The output vector file format (Default GPKG)

  • join_how – Specifies the type of join that will occur and which geometry is retained. The options are [left, right, inner]. The default is ‘inner’

  • join_op – Defines whether or not to join the attributes of one object to another. The options are [intersects, within, contains] and default is ‘within’

class rsgislib.vectorutils.VecColVar(name: str, field_name: str)

A class used with the vector_maths function to specify the input columns and the variable names to be used in the expression.

Parameters
  • name – the name of the variable to be used within the expression

  • field_name – the name of the column in the attribute table.

Create Vectors

rsgislib.vectorutils.createvectors.polygonise_raster_to_vec_lyr(out_vec_file: str, out_vec_lyr: str, out_format: str, input_img: str, img_band: int = 1, mask_img: Optional[str] = None, mask_band: int = 1, replace_file: bool = True, replace_lyr: bool = True, pxl_val_fieldname: str = 'PXLVAL', use_8_conn: bool = False)

A utility to polygonise a raster to an OGR vector layer.

Parameters
  • out_vec_file – is a string specifying the output vector file path. If it exists it will be deleted and overwritten.

  • out_vec_lyr – is a string with the name of the vector layer.

  • out_format – is a string with the output vector format, given as the OGR driver name (e.g., GPKG)

  • input_img – is a string specifying the input image file to be polygonised

  • img_band – is an int specifying the image band to be polygonised. (default = 1)

  • mask_img – is an optional string mask file specifying a no data mask (default = None)

  • mask_band – is an int specifying the image band to be used as the mask (default = 1)

  • replace_file – is a boolean specifying whether the vector file should be replaced (i.e., overwritten). Default=True.

  • replace_lyr – is a boolean specifying whether the vector layer should be replaced (i.e., overwritten). Default=True.

  • pxl_val_fieldname – is a string to specify the name of the output column representing the pixel value within the input image.

  • use_8_conn – is a bool specifying whether 8 connectedness or 4 connectedness should be used (4 is RSGISLib/GDAL default)

rsgislib.vectorutils.createvectors.vectorise_pxls_to_pts(input_img: str, img_band: int, img_msk_val: int, out_vec_file: str, out_vec_lyr: Optional[str] = None, out_format: str = 'GPKG', out_epsg_code: Optional[int] = None, del_exist_vec: bool = False)

Function which creates a new output vector file of points for the pixels within the input image file which have the specified value. Each point is located at the centroid of its pixel.

Parameters
  • input_img – the input image

  • img_band – the band within the image to use

  • img_msk_val – the image value selecting the pixels to be converted to points

  • out_vec_file – Output vector file

  • out_vec_lyr – output vector layer name.

  • out_format – output file format (default GPKG).

  • out_epsg_code – optionally provide an EPSG code for the output layer. If None then taken from input image.

  • del_exist_vec – remove output file if it exists.

rsgislib.vectorutils.createvectors.extract_image_footprint(input_img: str, out_vec_file: str, out_vec_lyr: str, out_format: str = 'GPKG', tmp_dir: str = 'tmp', reproj_to: Optional[str] = None, no_data_val=None)

A function to extract an image footprint as a vector.

Parameters
  • input_img – the input image file for which the footprint will be extracted.

  • out_vec_file – output vector file path and name.

  • out_vec_lyr – output vector layer name.

  • tmp_dir – temp directory which will be used during processing. It will be created and then deleted once processing is complete.

  • reproj_to – optional; if not None then an ogr2ogr command will be run and the value provided here is what goes into the ogr2ogr command after -t_srs. E.g., -t_srs epsg:4326

rsgislib.vectorutils.createvectors.create_poly_vec_for_lst_bboxs(csv_file, out_vec_file, out_vec_lyr, out_format, epsg_code, min_x_col=0, max_x_col=1, min_y_col=2, max_y_col=3, ignore_rows=0, del_exist_vec=False)

This function takes a CSV file of bounding boxes (1 per line) and creates a polygon vector layer.

Parameters
  • csv_file – input CSV file.

  • out_vec_file – output vector file

  • out_vec_lyr – output vector layer

  • out_format – output vector file format (e.g., GPKG)

  • epsg_code – EPSG code specifying the projection of the data (4326 is WGS84 Lat/Long).

  • min_x_col – The index (starting at 0) for the column within the CSV file for the minimum X coordinate.

  • max_x_col – The index (starting at 0) for the column within the CSV file for the maximum X coordinate.

  • min_y_col – The index (starting at 0) for the column within the CSV file for the minimum Y coordinate.

  • max_y_col – The index (starting at 0) for the column within the CSV file for the maximum Y coordinate.

  • ignore_rows – The number of rows to ignore from the start of the CSV file (i.e., column headings)

  • del_exist_vec – If the output file already exists delete it before proceeding.
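A short sketch of preparing a suitable CSV file and passing it to the function. It assumes a comma-delimited file with one header row and the default column order (min X, max X, min Y, max Y); the helper names, file names and coordinates are hypothetical.

```python
import csv
import os
import tempfile


def write_bbox_csv(csv_file, bboxs):
    """Write bounding boxes (min_x, max_x, min_y, max_y), one per line."""
    with open(csv_file, "w", newline="") as f:
        writer = csv.writer(f)
        # A single header row, skipped later with ignore_rows=1.
        writer.writerow(["min_x", "max_x", "min_y", "max_y"])
        writer.writerows(bboxs)


def bboxs_csv_to_polys(csv_file, out_vec_file, out_vec_lyr):
    """Convert the CSV written above to a polygon layer in EPSG:4326."""
    # Imported here so the CSV helper works without rsgislib installed.
    from rsgislib.vectorutils import createvectors

    createvectors.create_poly_vec_for_lst_bboxs(
        csv_file, out_vec_file, out_vec_lyr, "GPKG", 4326,
        min_x_col=0, max_x_col=1, min_y_col=2, max_y_col=3,
        ignore_rows=1, del_exist_vec=True,
    )


# Hypothetical boxes in degrees, used only to exercise the CSV writer.
example_bboxs = [(-4.0, -3.0, 52.0, 53.0), (-3.0, -2.0, 52.0, 53.0)]
example_csv = os.path.join(tempfile.mkdtemp(), "bboxs.csv")
write_bbox_csv(example_csv, example_bboxs)
```

If the CSV columns are in a different order, the four *_col indexes can be changed rather than rewriting the file.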

rsgislib.vectorutils.createvectors.define_grid(bbox, x_size, y_size, in_epsg_code, out_vec, out_vec_lyr, out_format='GPKG', out_epsg_code=None, utm_grid=False, utm_hemi=False)

Define a grid of bounding boxes for a specified bounding box. The output grid can be in a different projection to the input bounding box. Where a UTM grid is required and there are multiple UTM zones, the layer name will be appended with utmXX[n|s]. Note: this only works with formats such as GPKG which support multiple layers. A shapefile, which supports only one layer, will not work.

Parameters
  • bbox – a bounding box (xMin, xMax, yMin, yMax)

  • x_size – Output grid size in X axis. If out_epsg_code or utm_grid defined then the grid size needs to be in the output unit.

  • y_size – Output grid size in Y axis. If out_epsg_code or utm_grid defined then the grid size needs to be in the output unit.

  • in_epsg_code – EPSG code for the projection of the bbox

  • out_vec – output vector file.

  • out_vec_lyr – output vector layer name.

  • out_format – output vector file format (see OGR codes). Default is GPKG.

  • out_epsg_code – if provided the output grid is reprojected to the projection defined by this EPSG code (note: the grid size needs to be in the unit of this projection). Default is None.

  • utm_grid – provide the output grid in UTM projection where the grid might go across multiple UTM zones. Default is False. The grid size unit should be metres.

  • utm_hemi – if outputting a UTM projected grid then decide whether to use hemispheres or otherwise. If False then everything will be projected as northern hemisphere (e.g., as with Landsat or Sentinel-2). Default is False.
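To illustrate what a grid of bounding boxes over a bounding box means, here is a pure-Python sketch of one way to tile a (xMin, xMax, yMin, yMax) extent. It is not the library's implementation: the grid origin (lower-left corner) and the handling of partial cells are assumptions, and grid_bboxs is a hypothetical helper.

```python
import math


def grid_bboxs(bbox, x_size, y_size):
    """Tile a bbox (x_min, x_max, y_min, y_max) into cells of x_size by y_size.

    Cells start at the lower-left corner; the last row/column may extend
    past the input bbox when the sizes do not divide it exactly.
    """
    x_min, x_max, y_min, y_max = bbox
    n_x = math.ceil((x_max - x_min) / x_size)
    n_y = math.ceil((y_max - y_min) / y_size)
    cells = []
    for iy in range(n_y):
        for ix in range(n_x):
            cx = x_min + ix * x_size
            cy = y_min + iy * y_size
            # Same ordering as the bbox parameter: x_min, x_max, y_min, y_max.
            cells.append((cx, cx + x_size, cy, cy + y_size))
    return cells


# A 10 x 5 unit extent split into four 5 x 2.5 unit cells.
cells = grid_bboxs((0.0, 10.0, 0.0, 5.0), 5.0, 2.5)
```

The sketch also shows why x_size and y_size must be given in the unit of the output projection: the cell edges are computed directly in those coordinates.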

rsgislib.vectorutils.createvectors.create_wgs84_vector_grid(out_vec_file: str, out_vec_lyr: str, out_format: str, grid_x: int, grid_y: int, bbox: List[float], overlap: Optional[float] = None, tile_names_col: str = 'tile_names', tile_name_prefix: str = '')

A function which creates a regular grid across a defined area using the WGS84 (EPSG:4326) projection.

Parameters
  • out_vec_file – output vector file

  • out_vec_lyr – output vector layer name

  • out_format – the output vector file format.

  • grid_x – the size in the x axis of the grid cells.

  • grid_y – the size in the y axis of the grid cells.

  • bbox – the area for which cells will be defined (MinX, MaxX, MinY, MaxY).

  • overlap – the overlap added to each grid cell. If None then no overlap applied.

  • tile_names_col – The output column name for the tile names.

  • tile_name_prefix – A prefix for the tile names.

rsgislib.vectorutils.createvectors.create_poly_vec_bboxs(vec_file, vec_lyr, out_format, epsg_code, bboxs, atts=None, att_types=None, overwrite=True)

This function creates a set of polygons for a set of bounding boxes. When creating an attribute the available data types are ogr.OFTString, ogr.OFTInteger, ogr.OFTReal

Parameters
  • vec_file – output vector file/path

  • vec_lyr – output vector layer

  • out_format – the output vector layer type.

  • epsg_code – EPSG code specifying the projection of the data (e.g., 4326 is WGS84 Lat/Long).

  • bboxs – is a list of bounding boxes ([xMin, xMax, yMin, yMax]) to be saved to the output vector.

  • atts – is a dict of lists of attributes, each with the same length as the bboxs list. The dict keys should be the same as the att_types['names'] list.

  • att_types – is a dict with a list of attribute names (att_types['names']) and types (att_types['types']). The lists must be the same length as one another and match the number of atts. Example type: ogr.OFTString

  • overwrite – overwrite the vector file specified if it exists. Use False for GPKG where you want to add multiple layers.
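A sketch of assembling the atts and att_types inputs so that they stay consistent with the list of bounding boxes. It assumes the keys of atts are the attribute names listed in att_types['names']; the helper names, attribute names and values are hypothetical.

```python
def build_att_types(n_feats, atts, att_names, att_type_lst):
    """Check the attribute inputs agree and return the att_types dict.

    n_feats is the number of bounding boxes; atts maps each attribute
    name to a list holding one value per bounding box.
    """
    if len(att_names) != len(att_type_lst):
        raise ValueError("names and types must be the same length")
    if set(att_names) != set(atts):
        raise ValueError("atts keys must match the attribute names")
    for name in att_names:
        if len(atts[name]) != n_feats:
            raise ValueError(f"'{name}' needs one value per feature")
    return {"names": list(att_names), "types": list(att_type_lst)}


def write_tiles(vec_file, vec_lyr):
    """Write two hypothetical tiles with a string and an integer attribute."""
    # Imported here so the checker above works without GDAL or rsgislib.
    from osgeo import ogr
    from rsgislib.vectorutils import createvectors

    bboxs = [[0.0, 1.0, 50.0, 51.0], [1.0, 2.0, 50.0, 51.0]]
    atts = {"tile": ["a", "b"], "idx": [1, 2]}
    att_types = build_att_types(
        len(bboxs), atts, ["tile", "idx"], [ogr.OFTString, ogr.OFTInteger]
    )
    createvectors.create_poly_vec_bboxs(
        vec_file, vec_lyr, "GPKG", 4326, bboxs, atts=atts, att_types=att_types
    )
```

Checking lengths up front gives a clearer error than a failure part-way through writing the layer.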

rsgislib.vectorutils.createvectors.write_pts_to_vec(out_vec_file, out_vec_lyr, out_format, epsg_code, pts_x, pts_y, atts=None, att_types=None, replace=True, file_opts=[], lyr_opts=[])

This function creates a point vector layer from lists of x and y coordinates. When creating an attribute the available data types are ogr.OFTString, ogr.OFTInteger, ogr.OFTReal

Parameters
  • out_vec_file – output vector file/path

  • out_vec_lyr – output vector layer

  • out_format – the output vector layer type.

  • epsg_code – EPSG code specifying the projection of the data (e.g., 4326 is WGS84 Lat/Long).

  • pts_x – is a list of x coordinates.

  • pts_y – is a list of y coordinates.

  • atts – is a dict of lists of attributes, each with the same length as the pts_x and pts_y lists. The dict keys should be the same as the att_types['names'] list.

  • att_types – is a dict with a list of attribute names (att_types['names']) and types (att_types['types']). The lists must be the same length as one another and match the number of atts. Example type: ogr.OFTString

  • replace – if the output vector file exists overwrite.

  • file_opts – Options passed when creating the file. Default: []. Common value might be [“OVERWRITE=YES”]

  • lyr_opts – Options passed when creating the layer. Default: []. Common value might be [“OVERWRITE=YES”]

rsgislib.vectorutils.createvectors.create_bboxs_for_pts(vec_file: str, vec_lyr: str, bbox_width: float, bbox_height: float, out_vec_file: str, out_vec_lyr: str, out_format: str = 'GPKG', del_exist_vec: bool = False, epsg_code: Optional[int] = None)

A function which takes a set of points (from the input vector layer) and creates a set of boxes with the same height and width, one for each point.

Note, the geometry type for the input vector layer must be points.

Parameters
  • vec_file – the input vector file/path

  • vec_lyr – the name of the input vector layer.

  • bbox_width – width (in the units of the projection) for the output boxes

  • bbox_height – height (in the units of the projection) for the output boxes

  • out_vec_file – output vector file/path

  • out_vec_lyr – output vector layer name

  • out_format – output vector format.

  • del_exist_vec – If the output file already exists delete it before proceeding.

  • epsg_code – if not well defined specify the EPSG code for the projection.
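The geometry involved can be sketched in a few lines. This assumes each box is centred on its point, which the description above does not state explicitly; bbox_for_pt is a hypothetical helper and not part of the library.

```python
def bbox_for_pt(x, y, bbox_width, bbox_height):
    """Return (x_min, x_max, y_min, y_max) for a box centred on (x, y).

    Width and height are in the units of the layer's projection.
    """
    half_w = bbox_width / 2.0
    half_h = bbox_height / 2.0
    return (x - half_w, x + half_w, y - half_h, y + half_h)


# A 20 x 10 unit box around the point (100, 200).
box = bbox_for_pt(100.0, 200.0, 20.0, 10.0)
```

Because the sizes are in projection units, a layer in degrees would need widths in degrees, whereas a UTM layer would need metres.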

rsgislib.vectorutils.create_lines_of_points(vec_file: str, vec_lyr: str, out_vec_file: str, out_vec_lyr: str, out_format: str, step: float, del_exist_vec: bool)

A function to create a regularly spaced set of points following a set of lines.

Parameters
  • vec_file – is a string containing the input vector file path (must be lines)

  • vec_lyr – is a string containing the name of the input vector layer

  • out_vec_file – is a string containing the output vector file path (will be points)

  • out_vec_lyr – is a string containing the name of the output vector layer

  • out_format – is a string containing the output file format

  • step – is a float specifying the distance between points along the line.

  • del_exist_vec – is a bool, specifying whether to force removal of the output vector if it exists
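The idea of regularly spaced points along a line can be sketched in pure Python as below. This is an illustration of the technique with planar distances, not the library's implementation; whether the library places a point at the start of each line is an assumption, and points_along_line is a hypothetical helper.

```python
import math


def points_along_line(coords, step):
    """Return points every `step` units along a polyline, from its start.

    coords is a list of (x, y) vertices; distances are planar.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    pts = [coords[0]]
    dist_to_next = step  # distance still to travel before the next point
    for (x0, y0), (x1, y1) in zip(coords[:-1], coords[1:]):
        seg_len = math.hypot(x1 - x0, y1 - y0)
        pos = 0.0  # distance already covered along this segment
        while seg_len - pos >= dist_to_next:
            pos += dist_to_next
            frac = pos / seg_len
            pts.append((x0 + frac * (x1 - x0), y0 + frac * (y1 - y0)))
            dist_to_next = step
        # Carry the unused remainder of this segment into the next one.
        dist_to_next -= seg_len - pos
    return pts


# An L-shaped line of total length 8 sampled every 3 units.
example_pts = points_along_line([(0, 0), (4, 0), (4, 4)], 3)
```

The remainder carried between segments keeps the spacing constant along the whole line rather than restarting at each vertex.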

rsgislib.vectorutils.create_copy_vector_lyr(vec_file: str, vec_lyr: str, out_vec_file: str, out_vec_lyr: str, out_format: str, options: list = [], replace: bool = False, in_memory: bool = False)

A function which creates a copy of the input vector layer.

Parameters
  • vec_file – the file path to the vector file.

  • vec_lyr – the name of the vector layer. If None then first layer is returned.

  • out_vec_file – output vector file

  • out_vec_lyr – output vector layer within the output file.

  • out_format – the OGR driver for the output file.

  • options – provide a list of driver specific options (e.g., ‘OVERWRITE=YES’); see https://www.gdal.org/ogr_formats.html

  • replace – if true the output file is replaced (i.e., overwritten, so anything in an existing file will be lost).

  • in_memory – If true vector layer will be read into memory and then outputted.

Vector I/O

rsgislib.vectorutils.open_gdal_vec_lyr(vec_file: str, vec_lyr: Optional[str] = None, readonly: bool = True) -> (<class 'osgeo.ogr.DataSource'>, <class 'osgeo.ogr.Layer'>)

A function which opens a GDAL/OGR vector layer and returns the Dataset and Layer objects. Note, the file must be closed by setting the dataset to None.

Parameters
  • vec_file – the file path to the vector file.

  • vec_lyr – the name of the vector layer. If None then first layer is returned.

  • readonly – if False then the layer will be opened and allow editing of the layer while if True (default) then it will be read only.

Returns

GDAL dataset, GDAL Layer
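A small usage sketch showing the open, use, release pattern described above. The open_fn argument is a hypothetical hook, added only so the function can be exercised without GDAL installed; by default the documented open_gdal_vec_lyr function is used.

```python
def count_features(vec_file, vec_lyr=None, open_fn=None):
    """Return the number of features in a vector layer."""
    if open_fn is None:
        # Imported lazily so the sketch loads without rsgislib installed.
        import rsgislib.vectorutils as vectorutils
        open_fn = vectorutils.open_gdal_vec_lyr

    # Third argument is readonly=True.
    vec_ds, lyr = open_fn(vec_file, vec_lyr, True)
    try:
        return lyr.GetFeatureCount()
    finally:
        # Dropping the references closes the file.
        lyr = None
        vec_ds = None
```

Keeping the dataset variable alive for as long as the layer is in use matters, since the layer depends on its parent dataset.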

rsgislib.vectorutils.read_vec_lyr_to_mem(vec_file: str, vec_lyr: str) -> (<class 'osgeo.ogr.DataSource'>, <class 'osgeo.ogr.Layer'>)

A function which reads a vector layer to an OGR in memory layer.

Parameters
  • vec_file – input vector file

  • vec_lyr – input vector layer within the input file.

Returns

ogr_dataset, ogr_layer

rsgislib.vectorutils.get_mem_vec_lyr_subset(vec_file: str, vec_lyr: str, bbox: list) -> (<class 'osgeo.ogr.DataSource'>, <class 'osgeo.ogr.Layer'>)

Function to get an OGR vector layer for the defined bounding box. The layer is returned as an in-memory OGR Layer object.

Parameters
  • vec_file – vector file from which the attribute data comes.

  • vec_lyr – the layer name from which the attribute data comes.

  • bbox – region of interest (bounding box). Define as [xMin, xMax, yMin, yMax].

Returns

OGR Dataset and Layer objects.

rsgislib.vectorutils.write_vec_lyr_to_file(vec_lyr_obj: osgeo.ogr.Layer, out_vec_file: str, out_vec_lyr: str, out_format: str, options: list = [], replace: bool = False)

A function which writes an OGR vector layer object to an output vector file.

Parameters
  • vec_lyr_obj – OGR vector layer object

  • out_vec_file – output vector file

  • out_vec_lyr – output vector layer within the output file.

  • out_format – the OGR driver for the output file.

  • options – provide a list of driver specific options (e.g., ‘OVERWRITE=YES’); see https://www.gdal.org/ogr_formats.html

  • replace – if true the output file is replaced (i.e., overwritten, so anything in an existing file will be lost).

rsgislib.vectorutils.vector_translate(vec_file: str, vec_lyr: str, out_vec_file: str, out_vec_lyr: Optional[str] = None, out_format: str = 'GPKG', drv_create_opts: list = [], lyr_create_opts: list = [], access_mode: Optional[str] = None, src_srs: Optional[osgeo.osr.SpatialReference] = None, dst_srs: Optional[osgeo.osr.SpatialReference] = None, del_exist_vec: bool = False)

A function which translates a vector file to another format, similar to ogr2ogr. If you wish to reproject the input file then provide a destination srs (e.g., “EPSG:27700”, or wkt string, or proj4 string).

Parameters
  • vec_file – the input vector file.

  • vec_lyr – the input vector layer name

  • out_vec_file – the output vector file.

  • out_vec_lyr – the name of the output vector layer (if None then the same as the input).

  • out_format – the output vector file format (e.g., GPKG, GEOJSON, etc.)

  • drv_create_opts – a list of options for the creation of the output file.

  • lyr_create_opts – a list of options for the creation of the output layer.

  • access_mode – default is None for creation but other options are: [None (creation), ‘update’, ‘append’, ‘overwrite’]

  • src_srs – provide a source spatial reference for the input vector file. Default=None. Can be used to provide a projection where none has been specified or the information has gone missing. Can be used without performing a reprojection.

  • dst_srs – provide a spatial reference for the output vector file to be reprojected to. (Default=None) If specified then the file will be reprojected.

  • del_exist_vec – remove output file if it exists.

rsgislib.vectorutils.reproj_vector_layer(vec_file: str, out_vec_file: str, out_proj_wkt: str, out_format: str = 'GPKG', out_vec_lyr: Optional[str] = None, vec_lyr: Optional[str] = None, in_proj_wkt: Optional[str] = None, del_exist_vec: bool = False)

A function which reprojects a vector layer. You might also consider using rsgislib.vectorutils.vector_translate, particularly if you are reprojecting the data and changing between coordinate units (e.g., degrees to meters)

Parameters
  • vec_file – is a string with name and path to input vector file.

  • out_vec_file – is a string with name and path to output vector file.

  • out_proj_wkt – is a string with the WKT string for the output vector file.

  • out_format – is the output vector file format. Default is GPKG.

  • out_vec_lyr – is a string for the output layer name. If None then ignored and assume there is just a single layer in the vector and layer name is the same as the file name.

  • vec_lyr – is a string for the input layer name. If None then ignored and assume there is just a single layer in the vector.

  • in_proj_wkt – is a string with the WKT string for the input vector file (Optional; taken from input file if not specified).

rsgislib.vectorutils.reproj_vec_lyr_obj(vec_lyr_obj: osgeo.ogr.Layer, out_vec_file: str, out_epsg: int, out_format: str = 'MEMORY', out_vec_lyr: Optional[str] = None, in_epsg: Optional[int] = None, print_feedback: bool = True)

A function which reprojects a vector layer. You might also consider using rsgislib.vectorutils.vector_translate, particularly if you are reprojecting the data and changing between coordinate units (e.g., degrees to meters)

Parameters
  • vec_lyr_obj – is a GDAL vector layer object.

  • out_vec_file – is a string with name and path to output vector file - is created.

  • out_epsg – is an int with the EPSG code to which the input vector layer is to be reprojected.

  • out_format – is the output vector file format. Default is MEMORY - i.e., nothing written to disk.

  • out_vec_lyr – is a string for the output layer name. If None then ignored and assume there is just a single layer in the vector and layer name is the same as the file name.

  • inLyrName – is a string for the input layer name. If None then ignored and assume there is just a single layer in the vector.

  • in_epsg – is an int with the EPSG code for the input vector file (Optional; taken from input file if not specified).

  • print_feedback – is a boolean (Default True) specifying whether feedback should be printed to the console.

Returns

Returns the output datasource and layer objects (result_ds, result_lyr). The datasource needs to be set to None once you have finished using it, both to free memory and, if written to disk, to ensure the whole dataset is written.

rsgislib.vectorutils.reproj_wgs84_vec_to_utm(vec_file: str, vec_lyr: str, out_vec_file: str, out_vec_lyr: Optional[str] = None, use_hemi: bool = True, out_format: str = 'GPKG', drv_create_opts: list = [], lyr_create_opts: list = [], access_mode: str = 'overwrite', del_exist_vec: bool = False)

A function which reprojects an input file projected in WGS84 (EPSG:4326) to UTM, where the UTM zone is automatically identified using the mean x and y.

Parameters
  • vec_file – the input vector file.

  • vec_lyr – the input vector layer name

  • out_vec_file – the output vector file.

  • out_vec_lyr – the name of the output vector layer (if None then the same as the input).

  • use_hemi – if True, differentiate between the Southern and Northern hemispheres; if False, use the Northern hemisphere for everything.

  • out_format – the output vector file format (e.g., GPKG, GEOJSON, etc.)

  • drv_create_opts – a list of options for the creation of the output file.

  • lyr_create_opts – a list of options for the creation of the output layer.

  • access_mode – by default the function overwrites the output file but other options are: [‘update’, ‘append’, ‘overwrite’]

  • del_exist_vec – remove output file if it exists.
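For reference, the standard way of deriving a UTM zone and its WGS84 EPSG code from a longitude and latitude is sketched below. This uses the regular 6 degree zones (ignoring the special zones around Norway and Svalbard) and is not necessarily the exact logic used by the library; utm_epsg_for_lon_lat is a hypothetical helper.

```python
def utm_epsg_for_lon_lat(lon, lat, use_hemi=True):
    """Return (zone, epsg) for a WGS84 longitude/latitude in degrees.

    With use_hemi=False the northern hemisphere code is always returned.
    """
    zone = int((lon + 180.0) // 6.0) + 1
    zone = min(max(zone, 1), 60)  # lon == 180 would otherwise give 61
    south = use_hemi and lat < 0
    # EPSG 326xx are WGS84 / UTM north zones and 327xx the south zones.
    epsg = (32700 if south else 32600) + zone
    return zone, epsg


# Mean coordinates of a hypothetical layer in Wales and one in Queensland.
wales = utm_epsg_for_lon_lat(-3.0, 52.0)
queensland = utm_epsg_for_lon_lat(146.5, -25.8)
```

Using the mean x and y, as the function description states, gives a single zone for the whole layer, so layers spanning several zones will be distorted towards their edges.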

Create Rasters

rsgislib.vectorutils.createrasters.rasterise_vec_lyr(vec_file: str, vec_lyr: str, input_img: str, output_img: str, gdalformat: str = 'KEA', burn_val: int = 1, datatype=5, att_column=None, use_vec_extent=False, thematic=True, no_data_val=0)

A utility to rasterise a vector layer to an image covering the same region and at the same resolution as the input image.

Parameters
  • vec_file – is a string specifying the input vector file

  • vec_lyr – is a string specifying the input vector layer name.

  • input_img – is a string specifying the input image defining the grid, pixel resolution and area for the rasterisation (if None and use_vec_extent is False then the output image is assumed to already exist and the vector is burnt into it as is)

  • output_img – is a string specifying the output image for the rasterised vector file

  • gdalformat – is the output image format (Default: KEA).

  • burn_val – is the value for the output image pixels if no attribute is provided.

  • datatype – is the data type of the output file; default is rsgislib.TYPE_8UINT

  • att_column – is a string specifying the attribute to be rasterised, value of None creates a binary mask and “FID” creates a temp vector file with a “FID” column and rasterises that column.

  • use_vec_extent – is a boolean specifying that the output image should be cut to the same extent as the input vector layer (Default is False and therefore the output image will have the same extent as the input image).

  • thematic – is a boolean (default True) specifying that the output image is a thematic dataset so a colour table will be populated.

  • no_data_val – is a float specifying the no data value associated with a continuous output image.

from rsgislib.vectorutils import createrasters

inputVector = 'crowns.shp'
inputVectorLyr = 'crowns'
inputImage = 'injune_p142_casi_sub_utm.kea'
outputImage = 'psu142_crowns.kea'
# Rasterise the FID column onto the grid defined by the input image.
createrasters.rasterise_vec_lyr(inputVector,
                                inputVectorLyr,
                                inputImage,
                                outputImage,
                                'KEA',
                                att_column='FID')

rsgislib.vectorutils.createrasters.rasterise_vec_lyr_obj(vec_lyr_obj: osgeo.ogr.Layer, output_img: str, burn_val: int = 1, att_column: Optional[str] = None, calc_stats: bool = True, thematic: bool = True, no_data_val: float = 0)

A utility to rasterise a vector layer to an image covering the same region.

Parameters
  • vec_lyr_obj – is an OGR Vector Layer Object

  • output_img – is a string specifying the output image; this image must already exist and intersect with the input vector layer.

  • burn_val – is the value for the output image pixels if no attribute is provided.

  • att_column – is a string specifying the attribute to be rasterised, value of None creates a binary mask and “FID” creates a temp vector layer with a “FID” column and rasterises that column.

  • calc_stats – is a boolean specifying whether image stats and pyramids should be calculated.

  • thematic – is a boolean (default True) specifying that the output image is a thematic dataset so a colour table will be populated.

  • no_data_val – is a float specifying the no data value associated with a continuous output image.

rsgislib.vectorutils.createrasters.copy_vec_to_rat(vec_file: str, vec_lyr: str, input_img: str, output_img: str, fid_col: str = 'FID')

A utility to create a raster copy of a polygon vector layer. The output image is a KEA file and the attribute table has the attributes from the vector layer.

Parameters
  • vec_file – is a string specifying the input vector file

  • vec_lyr – is a string specifying the layer within the input vector file

  • input_img – is a string specifying the input image defining the grid, pixel resolution and area for the rasterisation

  • output_img – is a string specifying the output KEA image for the rasterised vector layer

from rsgislib.vectorutils import createrasters

inputVector = 'crowns.shp'
inputImage = 'injune_p142_casi_sub_utm.kea'
outputImage = 'psu142_crowns.kea'

# Copy the 'crowns' layer and its attributes into a KEA image with a RAT.
createrasters.copy_vec_to_rat(inputVector, 'crowns', inputImage, outputImage)

Merge Vectors

rsgislib.vectorutils.merge_vectors_to_gpkg(in_vec_files: list, out_vec_file: str, out_vec_lyr: str, exists: bool = False)

Function which will merge a list of vector files into a single output GeoPackage (GPKG) file using ogr2ogr.

Parameters
  • in_vec_files – is a list of input files.

  • out_vec_file – is the output GPKG database (*.gpkg)

  • out_vec_lyr – is the layer name in the output database (i.e., you can merge layers into single layer or write a number of layers to the same database).

  • exists – boolean which specifies whether the database file exists or not.

rsgislib.vectorutils.merge_vector_lyrs_to_gpkg(vec_file: str, out_vec_file: str, out_vec_lyr: str, exists: bool = False)

Function which will merge all the layers in the input vector file into a single output GeoPackage (GPKG) file using ogr2ogr.

Parameters
  • vec_file – is a vector file which contains multiple layers which are to be merged

  • out_vec_file – is the output GPKG database (*.gpkg)

  • out_vec_lyr – is the layer name in the output database (i.e., you can merge layers into single layer or write a number of layers to the same database).

  • exists – boolean which specifies whether the database file exists or not.

rsgislib.vectorutils.merge_vectors_to_gpkg_ind_lyrs(in_vec_files: list, out_vec_file: str, rename_dup_lyrs: bool = False, geom_type: Optional[str] = None)

Function which will merge a list of vector files into a single output GPKG file where each input file forms a new layer using the existing layer name. This function wraps the ogr2ogr command.

Parameters
  • in_vec_files – is a list of input files.

  • out_vec_file – is the output GPKG database (*.gpkg)

  • rename_dup_lyrs – If False an exception will be thrown if any input layers have the same name. If True a duplicate layer will be renamed, with a random set of letters/numbers appended to the name.

  • geom_type – Force the output vector to have a particular geometry type (e.g., ‘POLYGON’). Same options as ogr2ogr.

rsgislib.vectorutils.merge_vector_layers(vecs_dict: list, out_vec_file: str, out_vec_lyr: Optional[str] = None, out_format: str = 'GPKG', out_epsg: Optional[int] = None)

A function which merges the input vector layers into a single output file using geopandas.

Parameters
  • vecs_dict – list of dicts with keys [{‘file’: ‘/file/path/to/file.gpkg’, ‘layer’: ‘layer_name’}] providing the file paths and layer names.

  • out_vec_file – output vector file.

  • out_vec_lyr – output vector layer.

  • out_format – output file format.

  • out_epsg – if the input layers are in different projections then this option can be used to define the output projection.
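A sketch of building the vecs_dict list from a list of file paths. It assumes each file holds a layer named after the file (without its extension), which is typical for shapefiles but not guaranteed for other formats; the helper names and paths are hypothetical.

```python
import os


def build_vecs_dict(vec_files):
    """Build the list of {'file': ..., 'layer': ...} dicts for merge_vector_layers.

    The layer name is taken from the file name, which is an assumption
    that should be checked for multi-layer formats such as GPKG.
    """
    vecs = []
    for vec_file in vec_files:
        lyr_name = os.path.splitext(os.path.basename(vec_file))[0]
        vecs.append({"file": vec_file, "layer": lyr_name})
    return vecs


def merge_tiles(vec_files, out_vec_file, out_vec_lyr):
    """Merge the given files into one GPKG layer."""
    # Imported here so build_vecs_dict works without rsgislib installed.
    import rsgislib.vectorutils as vectorutils

    vectorutils.merge_vector_layers(
        build_vecs_dict(vec_files), out_vec_file,
        out_vec_lyr=out_vec_lyr, out_format="GPKG",
    )


example_vecs = build_vecs_dict(["/data/tile_a.gpkg", "tile_b.shp"])
```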

rsgislib.vectorutils.merge_vector_files(vec_files: list, out_vec_file: str, out_vec_lyr: Optional[str] = None, out_format: str = 'GPKG', out_epsg: Optional[int] = None)

A function which merges the input files into a single output file using geopandas. If the input files have multiple layers they are all merged into the output file.

Parameters
  • vec_files – list of input files

  • out_vec_file – output vector file.

  • out_vec_lyr – output vector layer.

  • out_format – output file format.

  • out_epsg – if the input layers are in different projections then this option can be used to define the output projection.

rsgislib.vectorutils.merge_utm_vecs_wgs84(in_vec_files: list, out_vec_file: str, out_vec_lyr: Optional[str] = None, out_format: str = 'GPKG', n_hemi_utm_file: Optional[str] = None, s_hemi_utm_file: Optional[str] = None, width_thres: float = 350)

A function which merges input files in UTM projections to the WGS84 projection, cutting polygons which wrap from one side of the world to the other (i.e., the 180/-180 boundary).

Parameters
  • in_vec_files – list of input files

  • out_vec_file – output vector file.

  • out_vec_lyr – output vector layer - only used if output format is GPKG

  • out_format – output file format.

  • n_hemi_utm_file – GPKG file with a layer per zone (layer names: 01, 02, … 59, 60), each projected in the corresponding northern hemisphere UTM projection.

  • s_hemi_utm_file – GPKG file with a layer per zone (layer names: 01, 02, … 59, 60), each projected in the corresponding southern hemisphere UTM projection.

  • width_thres – The width threshold (default 350 degrees): polygons wider than this are checked by looping through all of their coordinates.

rsgislib.vectorutils.merge_to_multi_layer_vec(input_file_lyrs: list, out_vec_file: str, out_format: str = 'GPKG', overwrite: bool = True)

A function which takes a list of vector files and layers (as VecLayersInfoObj objects) and merges them into a multi-layer vector file.

Parameters
  • input_file_lyrs – list of VecLayersInfoObj objects.

  • out_vec_file – output vector file.

  • out_format – output format (default: ’GPKG’).

  • overwrite – bool (default = True) specifying whether the output file should be overwritten if it already exists.

class rsgislib.vectorutils.VecLayersInfoObj(vec_file: Optional[str] = None, vec_lyr: Optional[str] = None, vec_out_lyr: Optional[str] = None)

This is a class to store the information used by the rsgislib.vectorutils.merge_to_multi_layer_vec function.

Parameters
  • vec_file – input vector file.

  • vec_lyr – input vector layer name

  • vec_out_lyr – output vector layer name

Vector Select / Subset

rsgislib.vectorutils.get_att_lst_select_feats(vec_file: str, vec_lyr: str, att_names: list, vec_sel_file: str, vec_sel_lyr: str) -> list

Function to get a list of attribute values from features which intersect with the select layer.

Parameters
  • vec_file – the vector file from which the attribute data comes.

  • vec_lyr – the layer name from which the attribute data comes.

  • att_names – a list of attribute names to be outputted.

  • vec_sel_file – the vector file containing the select layer which will be intersected with the input vector layer.

  • vec_sel_lyr – the name of the select layer which will be intersected with the input vector layer.

Returns

list of dictionaries with the output values.

rsgislib.vectorutils.get_att_lst_select_feats_lyr_objs(vec_lyr_obj: osgeo.ogr.Layer, att_names: list, vec_sel_lyr_obj: osgeo.ogr.Layer) -> list

Function to get a list of attribute values from features which intersect with the select layer.

Parameters
  • vec_lyr_obj – the OGR layer object from which the attribute data comes.

  • att_names – a list of attribute names to be outputted.

  • vec_sel_lyr_obj – the OGR layer object for the select layer which will be intersected with the input layer.

Returns

list of dictionaries with the output values.

rsgislib.vectorutils.get_att_lst_select_bbox_feats(vec_file: str, vec_lyr: str, att_names: list, bbox: list, bbox_epsg: Optional[int] = None) -> list

Function to get a list of attribute values from features which intersect with the bounding box.

Parameters
  • vec_file – the vector file from which the attribute data comes.

  • vec_lyr – the layer name within the file from which the attribute data comes.

  • att_names – a list of attribute names to be outputted.

  • bbox – the bounding box for the region of interest (xMin, xMax, yMin, yMax).

  • bbox_epsg – the projection of the BBOX (if None then ignore).

Returns

list of dictionaries with the output values.
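Note that the bbox ordering used here is (xMin, xMax, yMin, yMax), which differs from the (xMin, yMin, xMax, yMax) ordering used by some other libraries. The following is a minimal pure-Python sketch of an intersection test using that ordering; bboxes_intersect is an illustrative helper and is not part of RSGISLib, which performs the spatial filtering through OGR.

```python
from typing import Sequence


def bboxes_intersect(bbox_a: Sequence[float], bbox_b: Sequence[float]) -> bool:
    """Return True if two (xMin, xMax, yMin, yMax) boxes overlap or touch."""
    a_xmin, a_xmax, a_ymin, a_ymax = bbox_a
    b_xmin, b_xmax, b_ymin, b_ymax = bbox_b
    # Two intervals overlap when each one starts before the other ends.
    x_overlap = a_xmin <= b_xmax and b_xmin <= a_xmax
    y_overlap = a_ymin <= b_ymax and b_ymin <= a_ymax
    return x_overlap and y_overlap


# Region of interest and two hypothetical feature envelopes.
roi = [0.0, 10.0, 0.0, 10.0]
inside = [5.0, 15.0, 5.0, 15.0]
outside = [11.0, 20.0, 0.0, 10.0]
hits = [bboxes_intersect(roi, inside), bboxes_intersect(roi, outside)]
```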

rsgislib.vectorutils.get_att_lst_select_bbox_feats_lyr_objs(vec_lyr_obj: osgeo.ogr.Layer, att_names: list, bbox: list, bbox_epsg: Optional[int] = None) -> list

Function to get a list of attribute values from features which intersect with the bounding box.

Parameters
  • vec_lyr_obj – the OGR layer object from which the attribute data comes.

  • att_names – a list of attribute names to be outputted.

  • bbox – the bounding box for the region of interest (xMin, xMax, yMin, yMax).

  • bbox_epsg – the projection of the BBOX (if None then ignore).

Returns

list of dictionaries with the output values.

rsgislib.vectorutils.select_intersect_feats(vec_file: str, vec_lyr: str, vec_roi_file: str, vec_roi_lyr: str, out_vec_file: str, out_vec_lyr: str, out_format: str = 'GPKG')

Function to select the features which intersect with the region of interest (ROI) features; the selected features are outputted into a new vector layer.

Parameters
  • vec_file – the vector file from which the features are selected.

  • vec_lyr – the layer name from which the features are selected.

  • vec_roi_file – the ROI vector file which will be intersected with the input vector layer.

  • vec_roi_lyr – the ROI layer name which will be intersected with the input vector layer.

  • out_vec_file – the vector file which will be outputted.

  • out_vec_lyr – the layer name which will be outputted.

  • out_format – output vector format (default GPKG)

rsgislib.vectorutils.export_spatial_select_feats(vec_file: str, vec_lyr: str, vec_sel_file: str, vec_sel_lyr: str, out_vec_file: str, out_vec_lyr: str, out_format: str)

Function to export the features which intersect with the select layer to a new vector layer.

Parameters
  • vec_file – the vector file from which the features are selected.

  • vec_lyr – the layer name from which the features are selected.

  • vec_sel_file – the vector file containing the select layer which will be intersected with the input vector layer.

  • vec_sel_lyr – the name of the select layer which will be intersected with the input vector layer.

  • out_vec_file – output vector file/path

  • out_vec_lyr – output vector layer

  • out_format – the output vector layer type.

rsgislib.vectorutils.subset_envs_vec_lyr_obj(vec_lyr_obj: osgeo.ogr.Layer, bbox: list, epsg: Optional[int] = None) -> (osgeo.ogr.DataSource, osgeo.ogr.Layer)

Function to get an OGR vector layer for the defined bounding box. The layer is returned as an in-memory OGR Layer object.

Parameters
  • vec_lyr_obj – OGR Layer Object.

  • bbox – region of interest (bounding box). Define as [xMin, xMax, yMin, yMax].

  • epsg – provide an EPSG code for the layer if not well defined by the input layer.

Returns

OGR DataSource and Layer objects.

rsgislib.vectorutils.subset_veclyr_to_featboxs(vec_file_bbox: str, vec_lyr_bbox: str, vec_file_tosub: str, vec_lyr_tosub: str, out_lyr_name: str, out_file_base: str, out_file_end: str = 'gpkg', out_format: str = 'GPKG')

A function which subsets an input vector layer using the BBOXs of the features within another vector layer.

Parameters
  • vec_file_bbox – The vector file for the features which define the BBOXs

  • vec_lyr_bbox – The vector layer for the features which define the BBOXs

  • vec_file_tosub – The vector file for the layer which is to be subset.

  • vec_lyr_tosub – The vector layer for the layer which is to be subset.

  • out_lyr_name – The layer name for the output files - all output files will have the same layer name.

  • out_file_base – The base name for the output files. A numeric count 0-n will be inserted between this and the ending.

  • out_file_end – The output file ending (e.g., gpkg).

  • out_format – The output file driver (e.g., GPKG).

rsgislib.vectorutils.spatial_select(vec_file: str, vec_lyr: str, vec_roi_file: str, vec_roi_lyr: str, out_vec_file: str, out_vec_lyr: str, out_format: str = 'GPKG')

A function to perform a spatial selection within the input vector using a ROI vector layer. This function uses geopandas so ensure that is installed.

Parameters
  • vec_file – Input vector file from which features are selected.

  • vec_lyr – Input vector file layer from which features are selected.

  • vec_roi_file – The ROI vector file used to select features within the input file.

  • vec_roi_lyr – The ROI vector layer used to select features within the input file.

  • out_vec_file – The output vector file with the selected features.

  • out_vec_lyr – The output vector layer with the selected features.

  • out_format – the output vector format

rsgislib.vectorutils.subset_by_attribute(vec_file: str, vec_lyr: str, sub_col: str, sub_vals: list, out_vec_file: str, out_vec_lyr: str, out_format: str = 'GPKG', match_type: str = 'equals')

A function which subsets an input vector layer based on a list of values.

Parameters
  • vec_file – Input vector file.

  • vec_lyr – Input vector layer

  • sub_col – The column used to subset the layer.

  • sub_vals – A list of values used to subset the layer. If using contains or start then regular expressions supported by the re library can be provided.

  • out_vec_file – The output vector file

  • out_vec_lyr – The output vector layer

  • out_format – The output vector format.

  • match_type – The type of match for the subset. Options: equals (default) - the same value. contains - string is anywhere within attribute value. start - string matches the start of the attribute value.
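The three match_type options can be illustrated in pure Python as below. This is a sketch under stated assumptions rather than RSGISLib's implementation: value_matches is a hypothetical helper, and it assumes that contains corresponds to re.search and start to re.match.

```python
import re
from typing import Iterable


def value_matches(att_val: str, sub_vals: Iterable[str], match_type: str = "equals") -> bool:
    """Return True if att_val matches any entry of sub_vals for the given match type."""
    if match_type not in ("equals", "contains", "start"):
        raise ValueError(f"Unknown match_type: {match_type}")
    for sub_val in sub_vals:
        if match_type == "equals" and att_val == sub_val:
            return True
        # re.search finds the pattern anywhere within the attribute value.
        if match_type == "contains" and re.search(sub_val, att_val):
            return True
        # re.match only matches at the start of the attribute value.
        if match_type == "start" and re.match(sub_val, att_val):
            return True
    return False


# Hypothetical attribute values for four features.
names = ["forest_north", "forest_south", "urban_north", "water"]
kept = [n for n in names if value_matches(n, ["forest"], match_type="start")]
```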

rsgislib.vectorutils.drop_rows_by_attribute(vec_file: str, vec_lyr: str, sub_col: str, sub_vals: list, out_vec_file: str, out_vec_lyr: str, out_format: str = 'GPKG')

A function which drops the rows (features) from an input vector layer whose attribute value matches one of a list of values.

Parameters
  • vec_file – Input vector file.

  • vec_lyr – Input vector layer

  • sub_col – The column used to select the rows to be dropped.

  • sub_vals – A list of values used to subset the layer. If using contains or start then regular expressions supported by the re library can be provided.

  • out_vec_file – The output vector file

  • out_vec_lyr – The output vector layer

  • out_format – The output vector format.

  • match_type – The type of match for the subset. Options: equals (default) - the same value. contains - string is anywhere within attribute value. start - string matches the start of the attribute value.

rsgislib.vectorutils.rm_feat_att_duplicates(vec_file: str, vec_lyr: str, col_name: str, out_vec_file: str, out_vec_lyr: str, out_format: str = 'GPKG')

A function which uses the values within an attribute column to remove duplicate features from the vector layer.

Parameters
  • vec_file – Input vector file.

  • vec_lyr – Input vector layer within the input file.

  • col_name – The column used to define unique features

  • out_vec_file – Output vector file

  • out_vec_lyr – output vector layer name.

  • out_format – output file format (default GPKG).
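The idea of using one column to define uniqueness can be sketched on plain dictionaries as below. This is an illustration only: drop_duplicate_feats is a hypothetical helper, the feature records are made up, and keeping the first occurrence of each value is an assumption rather than documented behaviour.

```python
from typing import Dict, List


def drop_duplicate_feats(feats: List[Dict], col_name: str) -> List[Dict]:
    """Keep only the first feature seen for each value of col_name."""
    seen = set()
    unique_feats = []
    for feat in feats:
        key = feat[col_name]
        if key not in seen:
            seen.add(key)
            unique_feats.append(feat)
    return unique_feats


# Hypothetical attribute records; 'tile_id' defines unique features.
feats = [
    {"tile_id": "T01", "area": 10.0},
    {"tile_id": "T02", "area": 12.5},
    {"tile_id": "T01", "area": 10.0},
]
unique_feats = drop_duplicate_feats(feats, "tile_id")
```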

Vector Split

rsgislib.vectorutils.split_vec_lyr(vec_file: str, vec_lyr: str, n_feats: int, out_format: str, out_dir: str, out_vec_base: str, out_vec_ext: str)

A function which splits the input vector layer into a number of output layers.

Parameters
  • vec_file – input vector file.

  • vec_lyr – input layer name.

  • n_feats – number of features within each output file.

  • out_format – output file driver.

  • out_dir – output directory for the created output files.

  • out_vec_base – base name for the output files; the output layer name will be the same as the base file name.

  • out_vec_ext – file ending (e.g., gpkg). Note don’t include the dot, so input gpkg rather than .gpkg.
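How n_feats determines the number of output files can be sketched as follows. The helper plan_splits and the output naming pattern (base name followed by a running number) are assumptions made for illustration; the exact file names produced by split_vec_lyr may differ.

```python
import os
from typing import List, Tuple


def plan_splits(
    n_total: int, n_feats: int, out_dir: str, out_vec_base: str, out_vec_ext: str
) -> List[Tuple[str, range]]:
    """Pair each (assumed) output file name with the feature indices it would hold."""
    if n_feats < 1:
        raise ValueError("n_feats must be at least 1.")
    plan = []
    for i, start in enumerate(range(0, n_total, n_feats)):
        end = min(start + n_feats, n_total)
        # The extension is given without the dot, so it is added here.
        out_file = os.path.join(out_dir, f"{out_vec_base}{i + 1}.{out_vec_ext}")
        plan.append((out_file, range(start, end)))
    return plan


# 25 features split into files of up to 10 features each.
plan = plan_splits(25, 10, "out_dir", "chunk_", "gpkg")
sizes = [len(idx) for _, idx in plan]
```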

rsgislib.vectorutils.split_by_attribute(vec_file: str, vec_lyr: str, split_col_name: str, multi_layers: bool = True, out_vec_file: Optional[str] = None, out_file_path: Optional[str] = None, out_file_ext: Optional[str] = None, out_format: str = 'GPKG', dissolve: bool = False, chk_lyr_names: bool = True)

A function which splits a vector layer by an attribute value into either different layers or different output files.

Parameters
  • vec_file – Input vector file

  • vec_lyr – Input vector layer

  • split_col_name – The column name by which the vector layer will be split.

  • multi_layers – Boolean (default True). If True then a multiple layer output file will be created (e.g., GPKG). If False then individual files will be outputted.

  • out_vec_file – Output vector file - only used if multi_layers = True

  • out_file_path – Output file path (directory) if multi_layers = False.

  • out_file_ext – Output file extension if multi_layers = False

  • out_format – The output format (e.g., GPKG, ESRI Shapefile).

  • dissolve – Boolean (Default=False) if True then a dissolve on the specified variable will be run as layers are separated.

  • chk_lyr_names – If True (default) layer names (from split_col_name) will be checked, which means punctuation removed and all characters being ascii characters.
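The kind of clean-up described for chk_lyr_names can be sketched with the standard library as below. This is not RSGISLib's implementation: clean_layer_name is a hypothetical helper, and both the accent stripping and the replacement of whitespace with underscores are assumptions.

```python
import string
import unicodedata


def clean_layer_name(name: str) -> str:
    """Reduce an attribute value to an ASCII layer name without punctuation."""
    # Decompose accented characters, then drop anything that is not ASCII.
    ascii_name = (
        unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode("ascii")
    )
    # Remove all ASCII punctuation characters.
    no_punct = ascii_name.translate(str.maketrans("", "", string.punctuation))
    # Assumption: join the remaining words with underscores.
    return "_".join(no_punct.split())


lyr_name = clean_layer_name("Côte d'Ivoire")
```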

rsgislib.vectorutils.split_feats_to_mlyrs(vec_file: str, vec_lyr: str, out_vec_file: str, out_format: str = 'GPKG')

A function which splits an existing vector layer into multiple layers

Parameters
  • vec_file – input vector file

  • vec_lyr – input vector layer

  • out_vec_file – output file, note the format must be one which supports multiple layers (e.g., GPKG).

  • out_format – The output format of the output file.

rsgislib.vectorutils.split_vec_lyr_random_subset(vec_file: str, vec_lyr: str, out_rmain_vec_file: str, out_rmain_vec_lyr: str, out_smpl_vec_file: str, out_smpl_vec_lyr: str, n_smpl: int, out_format: str = 'GPKG', rnd_seed: Optional[int] = None)

A function to split a vector layer into two subsets by randomly sampling the input file. This function uses geopandas so that library must therefore be installed.

Parameters
  • vec_file – Input vector file.

  • vec_lyr – Input vector layer.

  • out_rmain_vec_file – Output vector file with the ‘remain’ outputs (i.e., the remainder once the sample is taken)

  • out_rmain_vec_lyr – Output vector layer with the ‘remain’ outputs (i.e., the remainder once the sample is taken)

  • out_smpl_vec_file – Output vector file with the sampled outputs

  • out_smpl_vec_lyr – Output vector layer with the sampled outputs

  • n_smpl – the number of samples to be randomly selected

  • out_format – The output format of the output file. (Default: GPKG)

  • rnd_seed – A seed for the random number generator.
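The split into ‘sample’ and ‘remain’ subsets can be illustrated on feature indices with the standard library. This is a sketch only: split_random_subset is a hypothetical helper using Python's random module, whereas the function above works on a geopandas dataframe, so the features selected for a given seed will not be the same.

```python
import random
from typing import List, Optional, Tuple


def split_random_subset(
    n_feats: int, n_smpl: int, rnd_seed: Optional[int] = None
) -> Tuple[List[int], List[int]]:
    """Return (remain_idx, smpl_idx): two disjoint lists of feature indices."""
    if n_smpl > n_feats:
        raise ValueError("n_smpl cannot exceed the number of features.")
    rng = random.Random(rnd_seed)  # seeded generator for reproducible samples
    smpl_idx = sorted(rng.sample(range(n_feats), n_smpl))
    smpl_set = set(smpl_idx)
    remain_idx = [i for i in range(n_feats) if i not in smpl_set]
    return remain_idx, smpl_idx


remain_idx, smpl_idx = split_random_subset(100, 20, rnd_seed=42)
```

Using the same rnd_seed again returns the same two index lists, which is the purpose of the seed parameter.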

rsgislib.vectorutils.create_train_test_smpls(vec_file: str, vec_lyr: str, out_train_vec_file: str, out_train_vec_lyr: str, out_test_vec_file: str, out_test_vec_lyr: str, out_format: str = 'GPKG', prop_test: float = 0.2, tmp_dir: str = 'tmp', rnd_seed: Optional[int] = None)

A function for splitting a vector dataset into training and testing datasets.

Parameters
  • vec_file – Input vector file.

  • vec_lyr – Input vector layer.

  • out_train_vec_file – Output vector file with the training data.

  • out_train_vec_lyr – Output vector layer with the training data.

  • out_test_vec_file – Output vector file with the testing data.

  • out_test_vec_lyr – Output vector layer with the testing data.

  • out_format – The output format of the output file. (Default: GPKG)

  • prop_test – Proportion of the dataset to be defined as the test data

  • tmp_dir – a temporary directory for intermediate outputs.

  • rnd_seed – A seed for the random number generator.
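The effect of prop_test on the size of the two outputs can be sketched as below. The helper n_test_from_prop is hypothetical, and truncating to an integer is an assumption; the function itself may round differently.

```python
def n_test_from_prop(n_feats: int, prop_test: float = 0.2) -> int:
    """Number of features assigned to the test set for a given proportion."""
    if not 0.0 < prop_test < 1.0:
        raise ValueError("prop_test must be between 0 and 1 (exclusive).")
    # Assumption: the fractional part is truncated.
    return int(n_feats * prop_test)


n_feats = 53
n_test = n_test_from_prop(n_feats)  # default proportion of 0.2
n_train = n_feats - n_test
```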

Vector Geometry

rsgislib.vectorutils.geopd_check_polys_wgs84_bounds_geometry(data_gdf, width_thres: float = 350)

A function which checks the polygons within the geometry of a geopandas dataframe for the specific case where they are on the east/west edge (i.e., 180 / -180) and are therefore being wrapped around the world. For example, this function would change a longitude of -179.91 to 180.09. The geopandas dataframe will be edited in place.

This function will import the shapely library.

Parameters
  • data_gdf – geopandas dataframe.

  • width_thres – The width threshold (default 350 degrees): polygons wider than this are checked by looping through all of their coordinates.

Returns

geopandas dataframe
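The wrap-around correction can be sketched on a plain list of coordinates. This is an illustration under stated assumptions, not the library's code: fix_wrapped_ring is a hypothetical helper, and it assumes the correction adds 360 degrees to negative longitudes of any ring wider than width_thres.

```python
from typing import List, Tuple

Coord = Tuple[float, float]  # (longitude, latitude)


def fix_wrapped_ring(coords: List[Coord], width_thres: float = 350) -> List[Coord]:
    """Shift negative longitudes by 360 degrees for rings wider than width_thres."""
    lons = [lon for lon, _ in coords]
    if (max(lons) - min(lons)) <= width_thres:
        # A ring of ordinary width is returned unchanged.
        return list(coords)
    return [(lon + 360.0, lat) if lon < 0 else (lon, lat) for lon, lat in coords]


# A small polygon straddling the antimeridian appears almost 360 degrees wide.
ring = [(179.9, 10.0), (-179.91, 10.0), (-179.91, 11.0), (179.9, 11.0), (179.9, 10.0)]
fixed = fix_wrapped_ring(ring)
fixed_width = max(lon for lon, _ in fixed) - min(lon for lon, _ in fixed)
```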

Vector / Raster Tests

rsgislib.vectorutils.does_vmsk_img_intersect(input_vmsk_img: str, vec_roi_file: str, vec_roi_lyr: str, tmp_dir: str, vec_epsg: Optional[int] = None)

This function checks whether the input binary raster mask intersects with the input vector layer. A check is first done as to whether the bounding boxes intersect, if they do then the intersection between the images is then calculated. The input image and vector can be in different projections but the projection needs to be well defined.

Parameters
  • input_vmsk_img – Input binary mask image file.

  • vec_roi_file – The input vector file.

  • vec_roi_lyr – The name of the input layer.

  • tmp_dir – a temporary directory for files generated during processing.

  • vec_epsg – If projection is poorly defined by the vector layer then it can be specified.

Vector Info

rsgislib.vectorutils.get_proj_wkt_from_vec(vec_file: str, vec_lyr: Optional[str] = None) -> str

A function which gets the WKT projection from the inputted vector file.

Parameters
  • vec_file – is a string with the input vector file name and path.

  • vec_lyr – is a string with the input vector layer name, if None then first layer read. (default: None)

Returns

WKT representation of projection

rsgislib.vectorutils.get_proj_epsg_from_vec(vec_file: str, vec_lyr: Optional[str] = None) -> int

A function which gets the EPSG projection from the inputted vector file.

Parameters
  • vec_file – is a string with the input vector file name and path.

  • vec_lyr – is a string with the input vector layer name, if None then first layer read. (default: None)

Returns

EPSG representation of projection

rsgislib.vectorutils.get_vec_feat_count(vec_file: str, vec_lyr: Optional[str] = None, compute_count: bool = True) -> int

Get a count of the number of features in the vector layer.

Parameters
  • vec_file – is a string with the input vector file name and path.

  • vec_lyr – is the layer for which the feature count is to be calculated (Default: None). If None, assume there is only one layer and that will be read.

  • compute_count – is a boolean which specifies whether the feature count should be calculated (rather than estimated from header) even if that operation is computationally expensive.

Returns

the number of features (nfeats)

rsgislib.vectorutils.count_feats_per_att_val(vec_file: str, vec_lyr: str, col_name: str, out_df_dict: bool = False) -> Dict

A function which returns the count of features for each attribute value.

Parameters
  • vec_file – Input vector file.

  • vec_lyr – Input vector layer within the input file.

  • col_name – The column used to count the number of features per value.

  • out_df_dict – if true then dict will be formatted to import into a pandas dataframe. Otherwise, the output dict will use the attribute values as the key and count as value.

Returns

either dict with keys of vals and count for import into pandas or with attribute value and number of features
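The two output layouts can be illustrated with collections.Counter. This is a sketch on a plain list of attribute values: count_per_value is a hypothetical helper, and the key names ‘vals’ and ‘count’ are taken from the Returns description above.

```python
from collections import Counter
from typing import Dict, Iterable


def count_per_value(att_vals: Iterable, out_df_dict: bool = False) -> Dict:
    """Count features per attribute value in either of the two layouts."""
    counts = Counter(att_vals)
    if out_df_dict:
        # Column-oriented layout, suitable for pandas.DataFrame(data=...).
        return {"vals": list(counts.keys()), "count": list(counts.values())}
    # Attribute value as the key and the number of features as the value.
    return dict(counts)


# Hypothetical values read from a 'class' column.
att_vals = ["forest", "water", "forest", "urban", "forest"]
counts = count_per_value(att_vals)
counts_df = count_per_value(att_vals, out_df_dict=True)
```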

rsgislib.vectorutils.get_vec_lyrs_lst(vec_file: str) -> List[str]

A function which returns a list of available layers within the inputted vector file.

Parameters

vec_file – file name and path to input vector layer.

Returns

list of layer names (can be used with gdal.Dataset.GetLayerByName()).

rsgislib.vectorutils.get_vec_layer_extent(vec_file: str, vec_lyr: Optional[str] = None, compute_if_exp: bool = True) -> list

Get the extent of the vector layer.

Parameters
  • vec_file – is a string with the input vector file name and path.

  • vec_lyr – is the layer for which extent is to be calculated (Default: None) if None assume there is only one layer and that will be read.

  • compute_if_exp – is a boolean which specifies whether the layer extent should be calculated (rather than estimated from header) even if that operation is computationally expensive.

Returns

the bounding box is returned (MinX, MaxX, MinY, MaxY)
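Because the extent is ordered (MinX, MaxX, MinY, MaxY), it usually needs reordering before being passed to libraries that expect (minx, miny, maxx, maxy), such as shapely.geometry.box. A minimal sketch follows; both helper names and the example extent are hypothetical.

```python
from typing import Sequence, Tuple


def to_minx_miny_maxx_maxy(bbox: Sequence[float]) -> Tuple[float, float, float, float]:
    """Reorder (MinX, MaxX, MinY, MaxY) to (minx, miny, maxx, maxy)."""
    min_x, max_x, min_y, max_y = bbox
    return (min_x, min_y, max_x, max_y)


def bbox_size(bbox: Sequence[float]) -> Tuple[float, float]:
    """Width and height of a (MinX, MaxX, MinY, MaxY) bounding box."""
    min_x, max_x, min_y, max_y = bbox
    return (max_x - min_x, max_y - min_y)


# Hypothetical extent in the ordering returned by get_vec_layer_extent.
extent = [100.0, 250.0, 40.0, 90.0]
corner_order = to_minx_miny_maxx_maxy(extent)
width, height = bbox_size(extent)
```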

rsgislib.vectorutils.get_vec_lyr_cols(vec_file: str, vec_lyr: str) -> List[str]

A function which returns a list of columns from the input vector layer.

Parameters
  • vec_file – input vector file.

  • vec_lyr – input vector layer

Returns

list of column names

rsgislib.vectorutils.get_ogr_vec_col_datatype_from_gdal_rat_col_datatype(rat_datatype: int) -> int

Returns the data type needed to create a column in an OGR vector layer which is equivalent to rat_datatype.

Parameters

rat_datatype – the datatype (GFT_Integer, GFT_Real, GFT_String) for the RAT column.

Returns

OGR datatype (OFTInteger, OFTReal, OFTString)
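The mapping between the two sets of constants can be sketched as a lookup table. To keep the snippet runnable without GDAL installed, the integer values of the GDAL/OGR enumerations are written out by hand; in real code they would be taken from osgeo.gdal and osgeo.ogr. The helper name is hypothetical, and raising an error for other RAT types is an assumption about how unsupported types might be handled.

```python
# Integer values mirroring the GDAL enumerations (osgeo.gdal.GFT_* and osgeo.ogr.OFT*).
GFT_INTEGER, GFT_REAL, GFT_STRING = 0, 1, 2
OFT_INTEGER, OFT_REAL, OFT_STRING = 0, 2, 4

RAT_TO_OGR = {
    GFT_INTEGER: OFT_INTEGER,
    GFT_REAL: OFT_REAL,
    GFT_STRING: OFT_STRING,
}


def ogr_type_for_rat_type(rat_datatype: int) -> int:
    """Look up the OGR field type equivalent to a RAT column type."""
    try:
        return RAT_TO_OGR[rat_datatype]
    except KeyError:
        # Assumption: types outside the three listed are rejected.
        raise ValueError(f"Unsupported RAT data type: {rat_datatype}") from None


ogr_real = ogr_type_for_rat_type(GFT_REAL)
```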

Vector Utilities

rsgislib.vectorutils.check_validate_geometries(vec_file: str, vec_lyr: str, out_vec_file: str, out_vec_lyr: str, out_format: str, print_err_geoms: bool, del_exist_vec: bool)

A command to check and validate the geometries within the input vector layer.

Parameters
  • vec_file – is a string containing the input vector file path

  • vec_lyr – is a string containing the name of the input vector layer name

  • out_vec_file – is a string containing the output vector file path

  • out_vec_lyr – is a string containing the name of the output vector layer name

  • out_format – is a string specifying the output vector GDAL/OGR driver (e.g., GPKG).

  • print_err_geoms – is a bool, specifying whether, where errors are found, they are printed to the console.

  • del_exist_vec – is a bool, specifying whether to force removal of the output vector if it exists

rsgislib.vectorutils.delete_vector_file(vec_file: str, feedback: bool = True)

Function to delete an existing vector file.

Parameters
  • vec_file – vector file path

  • feedback – Boolean specifying whether the function should print feedback to the console as files are deleted.