RSGISLib Vector Utils Module
Vector Attributes
- rsgislib.vectorutils.vector_maths(vec_file: str, vec_lyr: str, out_vec_file: str, out_vec_lyr: str, out_format: str, out_col: str, exp: str, vars: list, del_exist_vec: bool)
A command to calculate a numeric column from data in existing columns. The expression syntax is that of the muparser library; see the muparser documentation for the available operations and syntax.
- Parameters:
vec_file – is a string containing the input vector file path
vec_lyr – is a string containing the name of the input vector layer
out_vec_file – is a string containing the output vector file path
out_vec_lyr – is a string containing the name of the output vector layer
out_format – is a string containing the output file format
out_col – is a string containing the name of the output column
exp – is a string containing the muparser expression to be calculated.
vars – is a list of rsgislib.vectorutils.VecColVar objects defining the names of the variables used within the expression and defining which columns they are in the vec_file.
del_exist_vec – is a bool, specifying whether to force removal of the output vector if it exists
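As a rough illustration of how the vars list ties expression variables to attribute columns, the sketch below mimics the idea in plain Python. It does not call RSGISLib or muparser: ColVar is a hypothetical stand-in for rsgislib.vectorutils.VecColVar, calc_column is an illustrative helper, and Python's eval (with an empty builtins table) stands in for the muparser evaluator, so only simple arithmetic written the same way in both syntaxes carries over.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ColVar:
    """Hypothetical stand-in for rsgislib.vectorutils.VecColVar."""
    name: str        # variable name used within the expression
    field_name: str  # attribute column the variable reads from


def calc_column(rows: List[Dict], out_col: str, exp: str,
                col_vars: List[ColVar]) -> List[Dict]:
    """Evaluate exp for each row and store the result in out_col."""
    for row in rows:
        # Bind each expression variable to the value in its column.
        env = {v.name: row[v.field_name] for v in col_vars}
        # eval() is only a stand-in here; the library uses muparser syntax.
        row[out_col] = float(eval(exp, {"__builtins__": {}}, env))
    return rows


# Two features with a population and an area column.
rows = [{"pop": 50.0, "area_km2": 10.0}, {"pop": 9.0, "area_km2": 3.0}]
col_vars = [ColVar("p", "pop"), ColVar("a", "area_km2")]
calc_column(rows, "density", "p / a", col_vars)
```

The point of the indirection is that the expression can use short variable names (p, a) while the attribute columns keep their own names.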
- rsgislib.vectorutils.copy_rat_cols_to_vector_lyr(vec_file: str, vec_lyr: str, rat_row_col: str, clumps_img: str, ratcols: List, out_col_names: List = None, out_col_types: List = None)
A function to copy columns from RAT to a vector layer. Note, the vector layer needs a column, which already exists, that specifies the row from the RAT the feature is related to. If you created the vector using the polygonise function then that column will have been created and called ‘PXLVAL’.
- Parameters:
vec_file – The vector file to be used.
vec_lyr – The name of the layer within the vector file.
rat_row_col – The column in the layer which specifies the RAT row the feature corresponds with.
clumps_img – The clumps image with the RAT from which information should be taken.
ratcols – The names of the columns in the RAT to be copied.
out_col_names – If you do not want the same column names as the RAT then you can specify alternatives. If None then the names will be the same as the RAT. (Default = None)
out_col_types – The data types used for the columns in vector layer. If None then matched to RAT. Default is None
- rsgislib.vectorutils.match_closest_vec_pts(vec_base_file: str, vec_base_lyr: str, vec_match_file: str, vec_match_lyr: str, out_vec_file: str, out_vec_lyr: str, out_format: str = 'GeoJSON', tolerance=None, cp_match_atts=False, out_x_col='x_match', out_y_col='y_match', out_dist_col='dist_match', out_att_prefix='match_')
A function which finds the closest point between two vectors of point geometry. There is an option to copy the attributes from the matched points to the output in which case this can be used as a form of spatial join.
Note: this function is not intended to be used with large datasets because the full distance matrix (i.e., every point to every other point) is calculated.
- Parameters:
vec_base_file – Input base vector file which will be outputted
vec_base_lyr – Input base vector layer name.
vec_match_file – Input match vector file which will be matched to the base vector
vec_match_lyr – Input match vector layer name.
out_vec_file – the output vector file path
out_vec_lyr – the output vector layer name
out_format – the output format (Default: GeoJSON)
tolerance – a tolerance threshold where matches over this distance will not be outputted (i.e., the input vector will be subsetted).
cp_match_atts – Copy attributes from the matching vector layer.
out_x_col – the output column name for the matched x coordinate from the vec_match_lyr. Default: x_match
out_y_col – the output column name for the matched y coordinate from the vec_match_lyr. Default: y_match
out_dist_col – the output column name for the distances between the base point and the matched point.
out_att_prefix – A prefix for matched attributes, if outputted (i.e., cp_match_atts = True)
- Returns:
boolean where True is outputted if some points are matched and False if no points are matched.
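The matching described above can be sketched in plain Python as a brute-force search over the full distance matrix. This is only an illustration under stated assumptions, not the library's implementation: match_closest and its return format are hypothetical, and points are plain (x, y) tuples rather than vector layers.

```python
import math
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def match_closest(base_pts: List[Point], match_pts: List[Point],
                  tolerance: Optional[float] = None) -> List[Tuple[int, int, float]]:
    """Return (base index, match index, distance) for each matched base point."""
    matches = []
    for i, (bx, by) in enumerate(base_pts):
        # One row of the full distance matrix: this base point to every match point.
        dists = [math.hypot(bx - mx, by - my) for mx, my in match_pts]
        if not dists:
            continue
        j = min(range(len(dists)), key=dists.__getitem__)
        # Base points whose closest match is beyond the tolerance are dropped.
        if tolerance is None or dists[j] <= tolerance:
            matches.append((i, j, dists[j]))
    return matches


base = [(0.0, 0.0), (10.0, 10.0)]
cand = [(3.0, 4.0), (100.0, 100.0)]
all_matches = match_closest(base, cand)
near_matches = match_closest(base, cand, tolerance=6.0)
```

Because every base point is compared with every match point, the cost grows with the product of the two layer sizes, which is why the function is not suited to large datasets.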
- class rsgislib.vectorutils.VecColVar(name: str, field_name: str)
A class for use with the vector_maths function, specifying the input column and the variable name to be used in the expression.
- Parameters:
name – the name of the variable to be used within the expression
field_name – the name of the column in the attribute table.
Vector Projections
- rsgislib.vectorutils.get_proj_wkt_from_vec(vec_file: str, vec_lyr: str = None) -> str
A function which gets the WKT projection from the inputted vector file.
- Parameters:
vec_file – is a string with the input vector file name and path.
vec_lyr – is a string with the input vector layer name, if None then first layer read. (default: None)
- Returns:
WKT representation of projection
- rsgislib.vectorutils.get_proj_epsg_from_vec(vec_file: str, vec_lyr: str = None) -> int
A function which gets the EPSG code of the projection from the inputted vector file.
- Parameters:
vec_file – is a string with the input vector file name and path.
vec_lyr – is a string with the input vector layer name, if None then first layer read. (default: None)
- Returns:
EPSG representation of projection
- rsgislib.vectorutils.redefine_vec_lyr_proj(vec_file: str, vec_lyr: str, epsg_code: int, out_vec_file: str, out_vec_lyr: str, out_format: str = 'GPKG')
A function which (re-)defines the projection of a vector layer without reprojecting. This is useful if the projection is either incorrectly represented or not properly defined.
- Parameters:
vec_file – Input vector file
vec_lyr – Input vector layer
epsg_code – the epsg code for the projection you are defining the layer to.
out_vec_file – the output vector file
out_vec_lyr – the output vector layer
out_format – the output vector format (Default: GPKG)
- rsgislib.vectorutils.reproj_vec_lyr_gp(vec_file: str, vec_lyr: str, epsg_code: int, out_vec_file: str, out_vec_lyr: str, out_format: str = 'GPKG')
A function which reprojects a vector layer to a new projection using GeoPandas.
Note: this function loads the layer into memory; you can also use vector_translate for reprojection if you do not want that behaviour.
- Parameters:
vec_file – Input vector file
vec_lyr – Input vector layer
epsg_code – the EPSG code for the projection the layer is to be reprojected to.
out_vec_file – the output vector file
out_vec_lyr – the output vector layer
out_format – the output vector format (Default: GPKG)
Create Vectors
- rsgislib.vectorutils.createvectors.polygonise_raster_to_vec_lyr(out_vec_file: str, out_vec_lyr: str, out_format: str, input_img: str, img_band: int = 1, mask_img: str = None, mask_band: int = 1, replace_file: bool = True, replace_lyr: bool = True, pxl_val_fieldname: str = 'PXLVAL', use_8_conn: bool = False)
A utility to polygonise a raster to an OGR vector layer.
- Parameters:
out_vec_file – is a string specifying the output vector file path. If it exists it will be deleted and overwritten.
out_vec_lyr – is a string with the name of the vector layer.
out_format – is a string with the name of the OGR driver for the output format (e.g., GPKG)
input_img – is a string specifying the input image file to be polygonised
img_band – is an int specifying the image band to be polygonised. (default = 1)
mask_img – is an optional string mask file specifying a no data mask (default = None)
mask_band – is an int specifying the image band to be used as the mask (default = 1)
replace_file – is a boolean specifying whether the vector file should be replaced (i.e., overwritten). Default=True.
replace_lyr – is a boolean specifying whether the vector layer should be replaced (i.e., overwritten). Default=True.
pxl_val_fieldname – is a string to specify the name of the output column representing the pixel value within the input image.
use_8_conn – is a bool specifying whether 8 connectedness or 4 connectedness should be used (4 is RSGISLib/GDAL default)
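To make the effect of use_8_conn concrete, the sketch below counts connected regions of a pixel value in a tiny grid using either 4- or 8-connectedness. It is a plain-Python illustration only; count_regions is a hypothetical helper and does not reflect how RSGISLib or GDAL implement polygonisation.

```python
from typing import List


def count_regions(grid: List[List[int]], value: int, use_8_conn: bool = False) -> int:
    """Count connected regions of `value` using 4- or 8-connectedness."""
    steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if use_8_conn:
        # Diagonal neighbours also join pixels into the same region.
        steps += [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    n_rows, n_cols = len(grid), len(grid[0])
    seen = set()
    n_regions = 0
    for r in range(n_rows):
        for c in range(n_cols):
            if grid[r][c] != value or (r, c) in seen:
                continue
            n_regions += 1
            # Flood fill from this pixel to mark the whole region as visited.
            stack = [(r, c)]
            seen.add((r, c))
            while stack:
                cr, cc = stack.pop()
                for dr, dc in steps:
                    nr, nc = cr + dr, cc + dc
                    if (0 <= nr < n_rows and 0 <= nc < n_cols
                            and grid[nr][nc] == value and (nr, nc) not in seen):
                        seen.add((nr, nc))
                        stack.append((nr, nc))
    return n_regions


# Two pixels of value 1 that touch only at a corner.
img = [[1, 0],
       [0, 1]]
```

With 4-connectedness the two diagonal pixels form two separate regions (and so would become two polygons); with 8-connectedness they form one.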
- rsgislib.vectorutils.createvectors.vectorise_pxls_to_pts(input_img: str, img_band: int, img_msk_val: int, out_vec_file: str, out_vec_lyr: str = None, out_format: str = 'GPKG', out_epsg_code: int = None, del_exist_vec: bool = False)
Function which creates a new output vector file containing a point for each pixel within the input image that has the value specified. Point locations will be the centroid of the pixel.
- Parameters:
input_img – the input image
img_band – the band within the image to use
img_msk_val – the image value selecting the pixels to be converted to points
out_vec_file – Output vector file
out_vec_lyr – output vector layer name.
out_format – output file format (default GPKG).
out_epsg_code – optionally provide an EPSG code for the output layer. If None then taken from input image.
del_exist_vec – remove output file if it exists.
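The pixel-centroid placement can be sketched as follows. This is a plain-Python illustration that assumes a north-up image described by its top-left corner and positive pixel sizes; pixel_centres and its parameters are hypothetical and not part of RSGISLib.

```python
from typing import List, Tuple


def pixel_centres(pxl_vals: List[List[int]], msk_val: int,
                  tl_x: float, tl_y: float,
                  res_x: float, res_y: float) -> List[Tuple[float, float]]:
    """Return the centre coordinate of every pixel equal to msk_val."""
    pts = []
    for row, line in enumerate(pxl_vals):
        for col, val in enumerate(line):
            if val == msk_val:
                # Offset by half a pixel from the pixel's top-left corner.
                x = tl_x + (col + 0.5) * res_x
                y = tl_y - (row + 0.5) * res_y
                pts.append((x, y))
    return pts


band = [[0, 1],
        [1, 0]]
pts = pixel_centres(band, 1, tl_x=100.0, tl_y=200.0, res_x=10.0, res_y=10.0)
```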
- rsgislib.vectorutils.createvectors.extract_image_footprint(input_img: str, out_vec_file: str, out_vec_lyr: str, out_format: str = 'GPKG', tmp_dir: str = 'tmp', reproj_to: str = None, no_data_val: float = None)
A function to extract an image footprint as a vector.
- Parameters:
input_img – the input image file for which the footprint will be extracted.
out_vec_file – output vector file path and name.
out_vec_lyr – output vector layer name.
tmp_dir – temp directory which will be used during processing. It will be created and deleted once processing complete.
reproj_to – optional; if not None then an ogr2ogr command will be run and the value given here is what goes into the ogr2ogr command after -t_srs. E.g., -t_srs epsg:4326
- rsgislib.vectorutils.createvectors.create_poly_vec_for_lst_bboxs(csv_file, out_vec_file, out_vec_lyr, out_format, epsg_code, min_x_col=0, max_x_col=1, min_y_col=2, max_y_col=3, ignore_rows=0, del_exist_vec=False)
This function takes a CSV file of bounding boxes (1 per line) and creates a polygon vector layer.
- Parameters:
csv_file – input CSV file.
out_vec_file – output vector file
out_vec_lyr – output vector layer
out_format – output vector file format (e.g., GPKG)
epsg_code – EPSG code specifying the projection of the data (4326 is WGS84 Lat/Long).
min_x_col – The index (starting at 0) for the column within the CSV file for the minimum X coordinate.
max_x_col – The index (starting at 0) for the column within the CSV file for the maximum X coordinate.
min_y_col – The index (starting at 0) for the column within the CSV file for the minimum Y coordinate.
max_y_col – The index (starting at 0) for the column within the CSV file for the maximum Y coordinate.
ignore_rows – The number of rows to ignore from the start of the CSV file (i.e., column headings)
del_exist_vec – If the output file already exists delete it before proceeding.
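The way the column indices and ignore_rows select values from the CSV can be sketched in plain Python. This is an illustration under stated assumptions rather than the library's code: read_bboxs and bbox_to_wkt are hypothetical helpers, and the polygons are emitted as WKT strings instead of being written to a vector layer.

```python
import csv
import io
from typing import List, Tuple

BBox = Tuple[float, float, float, float]  # (xMin, xMax, yMin, yMax)


def read_bboxs(csv_text: str, min_x_col: int = 0, max_x_col: int = 1,
               min_y_col: int = 2, max_y_col: int = 3,
               ignore_rows: int = 0) -> List[BBox]:
    """Read (xMin, xMax, yMin, yMax) bounding boxes, one per CSV line."""
    bboxs = []
    for i, row in enumerate(csv.reader(io.StringIO(csv_text))):
        if i < ignore_rows or not row:
            continue  # skip heading rows and blank lines
        bboxs.append((float(row[min_x_col]), float(row[max_x_col]),
                      float(row[min_y_col]), float(row[max_y_col])))
    return bboxs


def bbox_to_wkt(bbox: BBox) -> str:
    """Build a closed WKT polygon ring for a bounding box."""
    x_min, x_max, y_min, y_max = bbox
    ring = [(x_min, y_max), (x_max, y_max), (x_max, y_min),
            (x_min, y_min), (x_min, y_max)]
    return "POLYGON ((" + ", ".join(f"{x:g} {y:g}" for x, y in ring) + "))"


csv_text = "min_x,max_x,min_y,max_y\n0,10,50,60\n"
bboxs = read_bboxs(csv_text, ignore_rows=1)
wkt = bbox_to_wkt(bboxs[0])
```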
- rsgislib.vectorutils.createvectors.define_grid(bbox: Tuple[float, float, float, float] | List[float], x_size: int, y_size: int, in_epsg_code: int, out_vec: str, out_vec_lyr: str, out_format: str = 'GPKG', out_epsg_code: int = None, utm_grid: bool = False, utm_hemi: bool = False)
Define a grid of bounding boxes for a specified bounding box. The output grid can be in a different projection to the inputted bounding box. Where a UTM grid is required and there are multiple UTM zones then the layer name will be appended with utmXX[n|s]. Note. this only works with formats such as GPKG which support multiple layers. A shapefile which only supports 1 layer will not work.
- Parameters:
bbox – a bounding box (xMin, xMax, yMin, yMax)
x_size – Output grid size in X axis. If out_epsg_code or utm_grid defined then the grid size needs to be in the output unit.
y_size – Output grid size in Y axis. If out_epsg_code or utm_grid defined then the grid size needs to be in the output unit.
in_epsg_code – EPSG code for the projection of the bbox
out_vec – output vector file.
out_vec_lyr – output vector layer name.
out_format – output vector file format (see OGR codes). Default is GPKG.
out_epsg_code – if provided the output grid is reprojected to the projection defined by this EPSG code (note: the grid size needs to be in the unit of this projection). Default is None.
utm_grid – provide the output grid in UTM projection where the grid might go across multiple UTM zones. Default is False. The grid size unit should be metres.
utm_hemi – if outputting a UTM projected grid then decide whether to use hemispheres or otherwise. If False then everything will be projected as northern hemisphere (e.g., as with Landsat or Sentinel-2). Default is False.
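The basic idea of tiling a bounding box into grid cells can be sketched in plain Python. The helper below is hypothetical and ignores reprojection and UTM zones; starting from the top-left corner and letting edge cells overrun the bbox are assumptions made for the illustration, not documented behaviour of define_grid.

```python
from typing import List, Tuple

BBox = Tuple[float, float, float, float]  # (xMin, xMax, yMin, yMax)


def grid_cells(bbox: BBox, x_size: float, y_size: float) -> List[BBox]:
    """Split bbox into cells of x_size by y_size, starting at the top-left."""
    if x_size <= 0 or y_size <= 0:
        raise ValueError("grid sizes must be positive")
    x_min, x_max, y_min, y_max = bbox
    cells = []
    y_top = y_max
    while y_top > y_min:
        x_left = x_min
        while x_left < x_max:
            # Cells on the right and bottom edges may extend past the bbox.
            cells.append((x_left, x_left + x_size, y_top - y_size, y_top))
            x_left += x_size
        y_top -= y_size
    return cells


cells = grid_cells((0.0, 20.0, 0.0, 10.0), x_size=10.0, y_size=10.0)
```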
- rsgislib.vectorutils.createvectors.create_wgs84_vector_grid(out_vec_file: str, out_vec_lyr: str, out_format: str, grid_x: float, grid_y: float, bbox: List[float], overlap: float = None, tile_names_col: str = 'tile_names', tile_name_prefix: str = '')
A function which creates a regular grid across a defined area using the WGS84 (EPSG:4326) projection.
- Parameters:
out_vec_file – output vector file
out_vec_lyr – output vector layer name
out_format – the output vector file format.
grid_x – the size in the x axis of the grid cells (in degrees).
grid_y – the size in the y axis of the grid cells (in degrees).
bbox – the area for which cells will be defined (MinX, MaxX, MinY, MaxY).
overlap – the overlap added to each grid cell. If None then no overlap applied.
tile_names_col – The output column name for the tile names.
tile_name_prefix – A prefix for the tile names.
- rsgislib.vectorutils.createvectors.create_poly_vec_bboxs(vec_file: str, vec_lyr: str, out_format: str, epsg_code: int, bboxs: List[Tuple[float, float, float, float] | List[float]], atts: Dict[str, List] = None, att_types: Dict[str, List] = None, overwrite: bool = True)
This function creates a set of polygons for a set of bounding boxes. When creating an attribute the available data types are ogr.OFTString, ogr.OFTInteger, ogr.OFTReal
- Parameters:
vec_file – output vector file/path
vec_lyr – output vector layer
out_format – the output vector layer type.
epsg_code – EPSG code specifying the projection of the data (e.g., 4326 is WGS84 Lat/Long).
bboxs – is a list of bounding boxes ([xMin, xMax, yMin, yMax]) to be saved to the output vector.
atts – is a dict of lists of attributes, each with the same length as the bboxs list. The dict keys should match the names in the att_types['names'] list.
att_types – is a dict with a list of attribute names (att_types['names']) and a list of types (att_types['types']). The two lists must be the same length as one another and as the number of atts. Example type: ogr.OFTString
overwrite – overwrite the vector file specified if it exists. Use False for GPKG where you want to add multiple layers.
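The relationship between bboxs, atts and att_types described above can be sketched as a small consistency check. check_atts is a hypothetical helper, and the type entries are plain strings standing in for ogr.OFTString and ogr.OFTInteger so that the sketch has no GDAL dependency.

```python
from typing import Dict, List


def check_atts(n_feats: int, atts: Dict[str, List], att_types: Dict[str, List]) -> bool:
    """Check atts and att_types are consistent for n_feats geometries."""
    names, types = att_types["names"], att_types["types"]
    if len(names) != len(types):
        return False  # one type is needed per attribute name
    if set(names) != set(atts.keys()):
        return False  # atts must be keyed by the names in att_types
    # Every attribute list needs one value per bounding box.
    return all(len(atts[name]) == n_feats for name in names)


bboxs = [[0.0, 10.0, 0.0, 10.0], [10.0, 20.0, 0.0, 10.0]]
atts = {"tile": ["a", "b"], "rank": [1, 2]}
# Placeholder strings; real calls would pass ogr.OFTString, ogr.OFTInteger, etc.
att_types = {"names": ["tile", "rank"], "types": ["OFTString", "OFTInteger"]}
```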
- rsgislib.vectorutils.createvectors.write_pts_to_vec(out_vec_file: str, out_vec_lyr: str, out_format: str, epsg_code: int, pts_x: List[float], pts_y: List[float], atts: Dict[str, List] = None, att_types: Dict[str, List] = None, replace: bool = True, file_opts: List[str] = [], lyr_opts: List[str] = [])
This function creates a set of points from lists of x and y coordinates. When creating an attribute the available data types are ogr.OFTString, ogr.OFTInteger, ogr.OFTReal
- Parameters:
out_vec_file – output vector file/path
out_vec_lyr – output vector layer
out_format – the output vector layer type.
epsg_code – EPSG code specifying the projection of the data (e.g., 4326 is WGS84 Lat/Long).
pts_x – is a list of x coordinates.
pts_y – is a list of y coordinates.
atts – is a dict of lists of attributes, each with the same length as the pts_x and pts_y lists. The dict keys should match the names in the att_types['names'] list.
att_types – is a dict with a list of attribute names (att_types['names']) and a list of types (att_types['types']). The two lists must be the same length as one another and as the number of atts. Example type: ogr.OFTString
replace – if True, the output vector file is overwritten if it exists.
file_opts – Options passed when creating the file. Default: []. A common value might be ["OVERWRITE=YES"]
lyr_opts – Options passed when creating the layer. Default: []. A common value might be ["OVERWRITE=YES"]
- rsgislib.vectorutils.createvectors.create_bboxs_for_pts(vec_file: str, vec_lyr: str, bbox_width: float, bbox_height: float, out_vec_file: str, out_vec_lyr: str, out_format: str = 'GPKG', del_exist_vec: bool = False, epsg_code: int = None)
A function which takes a set of points (from the input vector layer) and creates a set of boxes with the same height and width, one for each point.
Note, the geometry type for the input vector layer must be points.
- Parameters:
vec_file – the input vector file/path
vec_lyr – the name of the input vector layer.
bbox_width – width (in the units of the projection) for the output boxes
bbox_height – height (in the units of the projection) for the output boxes
out_vec_file – output vector file/path
out_vec_lyr – output vector layer name
out_format – output vector format.
del_exist_vec – If the output file already exists delete it before proceeding.
epsg_code – if not well defined specify the EPSG code for the projection.
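A minimal sketch of building a fixed-size box for a point is shown below. It assumes the box is centred on the point, which the description above does not state explicitly; bbox_for_pt is a hypothetical helper and the output uses the (xMin, xMax, yMin, yMax) ordering used elsewhere in this module.

```python
from typing import Tuple


def bbox_for_pt(x: float, y: float, bbox_width: float,
                bbox_height: float) -> Tuple[float, float, float, float]:
    """Return (xMin, xMax, yMin, yMax) for a box centred on (x, y)."""
    half_w = bbox_width / 2.0
    half_h = bbox_height / 2.0
    return (x - half_w, x + half_w, y - half_h, y + half_h)


box = bbox_for_pt(100.0, 50.0, bbox_width=20.0, bbox_height=10.0)
```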
- rsgislib.vectorutils.create_lines_of_points(vec_file: str, vec_lyr: str, out_vec_file: str, out_vec_lyr: str, out_format: str, step: float, del_exist_vec: bool)
A function to create a regularly spaced set of points following a set of lines.
- Parameters:
vec_file – is a string containing the input vector file path (must be lines)
vec_lyr – is a string containing the name of the input vector layer
out_vec_file – is a string containing the output vector file path (will be points)
out_vec_lyr – is a string containing the name of the output vector layer
out_format – is a string containing the output file format
step – is a float specifying the distance between points along the line.
del_exist_vec – is a bool, specifying whether to force removal of the output vector if it exists
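The role of the step parameter can be illustrated by walking along a polyline and emitting a point every step units. The sketch is plain Python with a hypothetical helper; whether the library also emits a point at the first vertex is an assumption made here.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def points_along_line(line: List[Point], step: float) -> List[Point]:
    """Return points every `step` units along a polyline, from its first vertex."""
    if step <= 0:
        raise ValueError("step must be positive")
    pts = [line[0]]
    next_dist = step  # distance along the line of the next point to emit
    walked = 0.0      # distance along the line at the start of the current segment
    for (x0, y0), (x1, y1) in zip(line[:-1], line[1:]):
        seg_len = math.hypot(x1 - x0, y1 - y0)
        while seg_len > 0 and next_dist <= walked + seg_len:
            # Interpolate the point's position within this segment.
            frac = (next_dist - walked) / seg_len
            pts.append((x0 + frac * (x1 - x0), y0 + frac * (y1 - y0)))
            next_dist += step
        walked += seg_len
    return pts


pts = points_along_line([(0.0, 0.0), (10.0, 0.0), (10.0, 5.0)], step=2.5)
```

The distance is measured along the whole line, so the spacing carries across vertices instead of restarting at each segment.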
- rsgislib.vectorutils.createvectors.create_random_pts_in_radius(centre_x: float, centre_y: float, radius: float, n_pts: int, epsg_code: int, out_vec_file: str, out_vec_lyr: str, out_format: str = 'GPKG', rnd_seed: int = None, n_pts_multi_bbox: float = 3)
A function which generates a set of random points within a radius from the defined centre point and exports them to a vector file. The output vector is populated with the distance and angle from the centre to the individual points. Note that the distance and angle calculation is only valid for a projected coordinate system (i.e., it is not valid for lat/lon).
- Parameters:
centre_x – The x coordinate of the centre point
centre_y – The y coordinate of the centre point
radius – the radius (in unit of coordinate system) defining the region of interest
n_pts – the number of points to be generated.
epsg_code – the EPSG code for the projection of the points.
out_vec_file – the output file path and name.
out_vec_lyr – the output layer name.
out_format – the output file format (Default: GPKG)
rnd_seed – the seed for the random generator.
n_pts_multi_bbox – the multiplier used to define the number of points generated within the bbox of the circle, which is then subset. 3 should always be enough but lowering to 2 will reduce the memory footprint and speed up runtime. In rare cases you might need to increase this if an insufficient number of points were found within the radius specified.
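The role of n_pts_multi_bbox can be sketched as follows: candidate points are drawn uniformly within the bounding box of the circle, those outside the radius are discarded, and the first n_pts survivors are kept. This is a plain-Python illustration with a hypothetical helper; the angle convention (degrees, counter-clockwise from the positive x axis) is an assumption and may differ from the library's output.

```python
import math
import random
from typing import List, Optional, Tuple


def random_pts_in_radius(centre_x: float, centre_y: float, radius: float,
                         n_pts: int, rnd_seed: Optional[int] = None,
                         n_pts_multi_bbox: float = 3.0
                         ) -> List[Tuple[float, float, float, float]]:
    """Return n_pts tuples of (x, y, distance, angle) within the radius."""
    rng = random.Random(rnd_seed)
    n_cand = int(n_pts * n_pts_multi_bbox)
    kept = []
    for _ in range(n_cand):
        # Candidate drawn uniformly within the bbox of the circle.
        x = rng.uniform(centre_x - radius, centre_x + radius)
        y = rng.uniform(centre_y - radius, centre_y + radius)
        dist = math.hypot(x - centre_x, y - centre_y)
        if dist <= radius:
            angle = math.degrees(math.atan2(y - centre_y, x - centre_x))
            kept.append((x, y, dist, angle))
    if len(kept) < n_pts:
        raise RuntimeError("too few points in radius; increase n_pts_multi_bbox")
    return kept[:n_pts]


pts = random_pts_in_radius(500.0, 500.0, 100.0, n_pts=50, rnd_seed=42)
```

A circle covers roughly 79% of its bounding box, so a multiplier of 3 leaves a wide margin, while 2 leaves a smaller one.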
- rsgislib.vectorutils.createvectors.create_random_pts_in_bbox(bbox: Tuple[float, float, float, float] | List[float], n_pts: int, epsg_code: int, out_vec_file: str, out_vec_lyr: str, out_format: str = 'GPKG', rnd_seed: int = None)
A function which generates a set of random points within a bounding box.
- Parameters:
bbox – The bounding box for the points ([xMin, xMax, yMin, yMax])
n_pts – the number of points to be generated.
epsg_code – the EPSG code for the projection of the points.
out_vec_file – the output file path and name.
out_vec_lyr – the output layer name.
out_format – the output file format (Default: GPKG)
rnd_seed – the seed for the random generator.
- rsgislib.vectorutils.create_copy_vector_lyr(vec_file: str, vec_lyr: str, out_vec_file: str, out_vec_lyr: str, out_format: str, options: List = [], replace: bool = False, in_memory: bool = False)
A function which creates a copy of the input vector layer.
- Parameters:
vec_file – the file path to the vector file.
vec_lyr – the name of the vector layer. If None then first layer is returned.
out_vec_file – output vector file
out_vec_lyr – output vector layer within the output file.
out_format – the OGR driver for the output file.
options – provide a list of driver specific options (e.g., ‘OVERWRITE=YES’); see https://www.gdal.org/ogr_formats.html
replace – if true the output file is replaced (i.e., overwritten, so anything in an existing file will be lost).
in_memory – If true vector layer will be read into memory and then outputted.
- rsgislib.vectorutils.createvectors.create_vec_for_image(input_imgs: List, output_dir: str, out_format: str = 'GeoJSON', geometry_type: int = 1, out_name_replace: Dict = None, out_file_ext: str = None, del_exist_vec: bool = False)
A function which creates a simple (dummy) vector layer for each input image. This function is intended to save time where a vector layer is needed for each of a set of images, for example as a starting point for digitising some information.
A single geometry is added to the layer, for a point this is the image centre, for a line this is from the TL to BR and for a polygon this is the bbox.
- Parameters:
input_imgs – a list of input images.
output_dir – a directory where the output vector layers will be created
out_format – the output format for the vector layers (Default: GeoJSON)
geometry_type – the geometry type of the vector layers (rsgislib.GEOM_PT, rsgislib.GEOM_LINE or rsgislib.GEOM_POLY) Default: rsgislib.GEOM_PT
out_name_replace – a dictionary of replacement values for editing the input image file names. If None (default) then ignored. For example, {"_ortho": ""} will remove '_ortho' from the input file names.
out_file_ext – the output extension for the output files (e.g., geojson) If None (Default) then this will be created.
del_exist_vec – delete the vector files if they already exist (Default: False)
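The three dummy geometries described above (image centre, top-left to bottom-right line, and bounding-box polygon) can be sketched from an image extent. The helper and its string geometry names are hypothetical; in RSGISLib the geometry type is selected with the rsgislib.GEOM_PT, rsgislib.GEOM_LINE and rsgislib.GEOM_POLY constants.

```python
from typing import List, Tuple

BBox = Tuple[float, float, float, float]  # (xMin, xMax, yMin, yMax)


def dummy_geom(bbox: BBox, geom_type: str) -> List[Tuple[float, float]]:
    """Return the vertex list of the dummy geometry for an image extent."""
    x_min, x_max, y_min, y_max = bbox
    if geom_type == "point":
        # The image centre.
        return [((x_min + x_max) / 2.0, (y_min + y_max) / 2.0)]
    if geom_type == "line":
        # From the top-left (TL) to the bottom-right (BR) corner.
        return [(x_min, y_max), (x_max, y_min)]
    if geom_type == "polygon":
        # Closed ring around the image bounding box.
        return [(x_min, y_max), (x_max, y_max), (x_max, y_min),
                (x_min, y_min), (x_min, y_max)]
    raise ValueError("geom_type must be 'point', 'line' or 'polygon'")


img_bbox = (0.0, 100.0, 0.0, 50.0)
```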
- rsgislib.vectorutils.createvectors.create_hex_grid_bbox(bbox: Tuple[float, float, float, float] | List[float], bbox_epsg: int, hex_scale: int, out_vec_file: str, out_vec_lyr: str, out_format: str)
A function which uses the h3 library (https://uber.github.io/h3-py/intro.html) to create a hexagon grid for the region of interest specified by the bbox.
- Parameters:
bbox – the bbox (xMin, xMax, yMin, yMax) defining the region of interest.
bbox_epsg – the epsg code for the bbox.
hex_scale – the scale of the hexagons produced. A lower number will produce fewer hexagons. The scale is an integer value.
out_vec_file – The output vector file name and path
out_vec_lyr – The output vector layer name.
out_format – The output vector file format (e.g., GPKG or GeoJSON).
- rsgislib.vectorutils.createvectors.create_hex_grid_polys(vec_in_file: str, vec_in_lyr: str, hex_scale: int, out_vec_file: str, out_vec_lyr: str, out_format: str)
A function which uses the h3 library (https://uber.github.io/h3-py/intro.html) to create a hexagon grid for the region of interest provided by polygon(s) in the input vector layer. If the input layer is not EPSG:4326 (WGS84) it will be reprojected and the resulting hexagon grid reprojected back to the projection of the input vector layer.
- Parameters:
vec_in_file – Input vector file
vec_in_lyr – Input vector layer name
hex_scale – the scale of the hexagons produced. A lower number will produce fewer hexagons. The scale is an integer value.
out_vec_file – The output vector file name and path
out_vec_lyr – The output vector layer name.
out_format – The output vector file format (e.g., GPKG or GeoJSON).
- rsgislib.vectorutils.createvectors.create_lines_vec(vec_file: str, vec_lyr: str, out_format: str, epsg_code: int, lines: List[List[Tuple[float, float]]], overwrite: bool = True)
This function creates a set of lines from a list of points.
- Parameters:
vec_file – output vector file/path
vec_lyr – output vector layer
out_format – the output vector layer type.
epsg_code – EPSG code specifying the projection of the data (e.g., 4326 is WGS84 Lat/Long).
lines – is a list of lines where each line is defined as a list of tuples (X, Y).
overwrite – overwrite the vector file specified if it exists. Use False for GPKG where you want to add multiple layers.
- rsgislib.vectorutils.createvectors.create_img_transects(input_img: str, out_vec_file: str, out_vec_lyr: str, x_intervals: List[float] = None, y_intervals: List[float] = None, out_format: str = 'GPKG')
A function which will create transects across an image along the X and/or Y axes. The transects are specified using the x_intervals and y_intervals parameters. These are lists of values between 0 and 1, where a value of 0.5 is halfway along the axis. Therefore, an input of [0.25, 0.5, 0.75] will create transects for the axis specified at a quarter, half and three-quarters of the way along the axis. You must specify at least one of x_intervals or y_intervals.
- Parameters:
input_img – The input image path
out_vec_file – the output vector path
out_vec_lyr – the output vector layer name
x_intervals – the list of intervals for the X axis (values between 0-1). If None then the axis is ignored.
y_intervals – the list of intervals for the Y axis (values between 0-1). If None then the axis is ignored.
out_format – the output vector format (Default: GPKG)
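How interval fractions map to transect positions can be sketched from an image extent. The helper is hypothetical and makes two assumptions not stated in the description: an x interval produces a line at a fixed x position running the full height of the image, and y fractions are measured from the minimum y.

```python
from typing import List, Optional, Tuple

BBox = Tuple[float, float, float, float]  # (xMin, xMax, yMin, yMax)
Line = List[Tuple[float, float]]


def img_transects(bbox: BBox, x_intervals: Optional[List[float]] = None,
                  y_intervals: Optional[List[float]] = None) -> List[Line]:
    """Return transect lines across bbox at the given fractional positions."""
    if x_intervals is None and y_intervals is None:
        raise ValueError("specify at least one of x_intervals or y_intervals")
    x_min, x_max, y_min, y_max = bbox
    lines = []
    for frac in (x_intervals or []):
        # Fraction of the way along the X axis; the line spans the Y extent.
        x = x_min + frac * (x_max - x_min)
        lines.append([(x, y_max), (x, y_min)])
    for frac in (y_intervals or []):
        # Fraction of the way along the Y axis; the line spans the X extent.
        y = y_min + frac * (y_max - y_min)
        lines.append([(x_min, y), (x_max, y)])
    return lines


lines = img_transects((0.0, 100.0, 0.0, 50.0),
                      x_intervals=[0.25, 0.5, 0.75], y_intervals=[0.5])
```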
Vector I/O
- rsgislib.vectorutils.open_gdal_vec_lyr(vec_file: str, vec_lyr: str = None, readonly: bool = True) -> (<class 'osgeo.ogr.DataSource'>, <class 'osgeo.ogr.Layer'>)
A function which opens a GDAL/OGR vector layer and returns the Dataset and Layer objects. Note, the file must be closed by setting the dataset to None.
- Parameters:
vec_file – the file path to the vector file.
vec_lyr – the name of the vector layer. If None then first layer is returned.
readonly – if False then the layer will be opened and allow editing of the layer while if True (default) then it will be read only.
- Returns:
GDAL dataset, GDAL Layer
- rsgislib.vectorutils.read_vec_lyr_to_mem(vec_file: str, vec_lyr: str) -> (<class 'osgeo.ogr.DataSource'>, <class 'osgeo.ogr.Layer'>)
A function which reads a vector layer to an OGR in memory layer.
- Parameters:
vec_file – input vector file
vec_lyr – input vector layer within the input file.
- Returns:
ogr_dataset, ogr_layer
- rsgislib.vectorutils.get_mem_vec_lyr_subset(vec_file: str, vec_lyr: str, bbox: List) -> (<class 'osgeo.ogr.DataSource'>, <class 'osgeo.ogr.Layer'>)
Function to get an OGR vector layer for the defined bounding box. The layer is returned as an in-memory OGR Layer object.
- Parameters:
vec_file – vector layer from which the attribute data comes from.
vec_lyr – the layer name from which the attribute data comes from.
bbox – region of interest (bounding box). Define as [xMin, xMax, yMin, yMax].
- Returns:
OGR Dataset and Layer objects.
- rsgislib.vectorutils.write_vec_lyr_to_file(vec_lyr_obj: Layer, out_vec_file: str, out_vec_lyr: str, out_format: str, options: List = [], replace: bool = False)
A function which writes an OGR vector layer object to a file.
- Parameters:
vec_lyr_obj – OGR vector layer object
out_vec_file – output vector file
out_vec_lyr – output vector layer within the output file.
out_format – the OGR driver for the output file.
options – provide a list of driver specific options (e.g., ‘OVERWRITE=YES’); see https://www.gdal.org/ogr_formats.html
replace – if true the output file is replaced (i.e., overwritten, so anything in an existing file will be lost).
- rsgislib.vectorutils.vector_translate(vec_file: str, vec_lyr: str, out_vec_file: str, out_vec_lyr: str = None, out_format: str = 'GPKG', drv_create_opts: List = [], lyr_create_opts: List = [], access_mode: str = None, src_srs: SpatialReference = None, dst_srs: SpatialReference = None, del_exist_vec: bool = False)
A function which translates a vector file to another format, similar to ogr2ogr. If you wish to reproject the input file then provide a destination srs (e.g., “EPSG:27700”, or wkt string, or proj4 string).
- Parameters:
vec_file – the input vector file.
vec_lyr – the input vector layer name
out_vec_file – the output vector file.
out_vec_lyr – the name of the output vector layer (if None then the same as the input).
out_format – the output vector file format (e.g., GPKG, GEOJSON, etc.)
drv_create_opts – a list of options for the creation of the output file.
lyr_create_opts – a list of options for the creation of the output layer.
access_mode – default is None for creation but other options are: [None (creation), 'update', 'append', 'overwrite']
src_srs – provide a source spatial reference for the input vector file. Default=None. can be used to provide a projection where none has been specified or the information has gone missing. Can be used without performing a reprojection.
dst_srs – provide a spatial reference for the output vector file to be reprojected to. (Default=None) If specified then the file will be reprojected.
del_exist_vec – remove output file if it exists.
- rsgislib.vectorutils.reproj_vector_layer(vec_file: str, out_vec_file: str, out_proj_wkt: str, out_format: str = 'GPKG', out_vec_lyr: str = None, vec_lyr: str = None, in_proj_wkt: str = None, del_exist_vec: bool = False)
A function which reprojects a vector layer. You might also consider using rsgislib.vectorutils.vector_translate, particularly if you are reprojecting the data and changing between coordinate units (e.g., degrees to meters)
- Parameters:
vec_file – is a string with name and path to input vector file.
out_vec_file – is a string with name and path to output vector file.
out_proj_wkt – is a string with the WKT string for the output vector file.
out_format – is the output vector file format. Default is GPKG.
out_vec_lyr – is a string for the output layer name. If None then ignored and assume there is just a single layer in the vector and layer name is the same as the file name.
vec_lyr – is a string for the input layer name. If None then ignored and assume there is just a single layer in the vector.
in_proj_wkt – is a string with the WKT string for the input vector file (Optional; taken from input file if not specified).
- rsgislib.vectorutils.reproj_vec_lyr_obj(vec_lyr_obj: Layer, out_vec_file: str, out_epsg: int, out_format: str = 'MEMORY', out_vec_lyr: str = None, in_epsg: int = None, print_feedback: bool = True)
A function which reprojects a vector layer. You might also consider using rsgislib.vectorutils.vector_translate, particularly if you are reprojecting the data and changing between coordinate units (e.g., degrees to meters)
- Parameters:
vec_lyr_obj – is a GDAL vector layer object.
out_vec_file – is a string with name and path to output vector file - is created.
out_epsg – is an int with the EPSG code to which the input vector layer is to be reprojected.
out_format – is the output vector file format. Default is MEMORY - i.e., nothing written to disk.
out_vec_lyr – is a string for the output layer name. If None then ignored and assume there is just a single layer in the vector and layer name is the same as the file name.
inLyrName – is a string for the input layer name. If None then ignored and assume there is just a single layer in the vector.
in_epsg – is an int with the EPSG code for the input vector file (Optional; taken from input file if not specified).
print_feedback – is a boolean (Default True) specifying whether feedback should be printed to the console.
- Returns:
Returns the output datasource and layer objects (result_ds, result_lyr). datasource needs to be set to None once you have finished using to free memory and if written to disk to ensure the whole dataset is written.
- rsgislib.vectorutils.reproj_wgs84_vec_to_utm(vec_file: str, vec_lyr: str, out_vec_file: str, out_vec_lyr: str = None, use_hemi: bool = True, out_format: str = 'GPKG', drv_create_opts: List = [], lyr_create_opts: List = [], access_mode: str = 'overwrite', del_exist_vec: bool = False)
A function which reprojects an input file projected in WGS84 (EPSG:4326) to UTM, where the UTM zone is automatically identified using the mean x and y.
- Parameters:
vec_file – the input vector file.
vec_lyr – the input vector layer name
out_vec_file – the output vector file.
out_vec_lyr – the name of the output vector layer (if None then the same as the input).
use_hemi – if True, differentiate between the Southern and Northern hemispheres; if False, always use the Northern hemisphere.
out_format – the output vector file format (e.g., GPKG, GEOJSON, etc.)
drv_create_opts – a list of options for the creation of the output file.
lyr_create_opts – a list of options for the creation of the output layer.
access_mode – by default the function overwrites the output file but other options are: [‘update’, ‘append’, ‘overwrite’]
del_exist_vec – remove output file if it exists.
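The automatic zone identification described above can be illustrated with the standard UTM arithmetic. This is a minimal sketch rather than the library's own implementation: the helper name utm_epsg_from_lon_lat is hypothetical, and the sketch ignores the special zone exceptions around Norway and Svalbard.

```python
import math


def utm_epsg_from_lon_lat(mean_lon: float, mean_lat: float, use_hemi: bool = True) -> int:
    """Return the WGS84 / UTM EPSG code for a mean longitude/latitude in degrees.

    Hypothetical helper for illustration only.
    """
    # Standard 6-degree zones numbered 1..60, starting at 180 degrees west.
    zone = int(math.floor((mean_lon + 180.0) / 6.0)) + 1
    zone = min(max(zone, 1), 60)  # keep a longitude of exactly 180 inside zone 60
    # EPSG 326xx are the northern hemisphere zones, 327xx the southern ones.
    southern = use_hemi and mean_lat < 0
    return (32700 if southern else 32600) + zone


print(utm_epsg_from_lon_lat(-3.0, 52.0))
print(utm_epsg_from_lon_lat(147.0, -25.0))
print(utm_epsg_from_lon_lat(147.0, -25.0, use_hemi=False))
```

With use_hemi set to False the southern-hemisphere point is still assigned a northern-hemisphere code, which matches the behaviour described for that parameter.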
Create Rasters
- rsgislib.vectorutils.createrasters.rasterise_vec_lyr(vec_file: str, vec_lyr: str, input_img: str, output_img: str, gdalformat: str = 'KEA', burn_val: int = 1, datatype: int = 5, att_column: str = None, use_vec_extent: bool = False, thematic: bool = True, no_data_val: float = 0)
A utility to rasterise a vector layer to an image covering the same region and at the same resolution as the input image.
- Parameters:
vec_file – is a string specifying the input vector file
vec_lyr – is a string specifying the input vector layer name.
input_img – is a string specifying the input image defining the grid, pixel resolution and area for the rasterisation (if None and use_vec_extent is False then the function assumes the output image already exists and burns the vector into it as is).
output_img – is a string specifying the output image for the rasterised vector file
gdalformat – is the output image format (Default: KEA).
burn_val – is the value for the output image pixels if no attribute is provided.
datatype – of the output file, default is rsgislib.TYPE_8UINT
att_column – is a string specifying the attribute to be rasterised, value of None creates a binary mask and “FID” creates a temp vector file with a “FID” column and rasterises that column.
use_vec_extent – is a boolean specifying that the output image should be cut to the same extent as the input shapefile (Default is False and therefore output image will be the same as the input).
thematic – is a boolean (default True) specifying that the output image is a thematic dataset so a colour table will be populated.
no_data_val – is a float specifying the no data value associated with a continuous output image.
# Rasterise the 'FID' attribute of the crowns layer onto the grid of the input image.
from rsgislib.vectorutils import createrasters
inputVector = 'crowns.shp'
inputVectorLyr = 'crowns'
inputImage = 'injune_p142_casi_sub_utm.kea'
outputImage = 'psu142_crowns.kea'
createrasters.rasterise_vec_lyr(inputVector, inputVectorLyr, inputImage, outputImage, gdalformat='KEA', att_column='FID')
- rsgislib.vectorutils.createrasters.rasterise_vec_lyr_obj(vec_lyr_obj: Layer, output_img: str, burn_val: int = 1, att_column: str = None, calc_stats: bool = True, thematic: bool = True, no_data_val: float = 0)
A utility to rasterise a vector layer to an image covering the same region.
- Parameters:
vec_lyr_obj – is an OGR Vector Layer Object
output_img – is a string specifying the output image; this image must already exist and intersect with the input vector layer.
burn_val – is the value for the output image pixels if no attribute is provided.
att_column – is a string specifying the attribute to be rasterised, value of None creates a binary mask and “FID” creates a temp vector layer with a “FID” column and rasterises that column.
calc_stats – is a boolean specifying whether image stats and pyramids should be calculated.
thematic – is a boolean (default True) specifying that the output image is a thematic dataset so a colour table will be populated.
no_data_val – is a float specifying the no data value associated with a continuous output image.
- rsgislib.vectorutils.createrasters.copy_vec_to_rat(vec_file: str, vec_lyr: str, input_img: str, output_img: str, fid_col: str = 'FID')
A utility to create a raster copy of a polygon vector layer. The output image is a KEA file and the attribute table has the attributes from the vector layer.
- Parameters:
vec_file – is a string specifying the input vector file
vec_lyr – is a string specifying the layer within the input vector file
input_img – is a string specifying the input image defining the grid, pixel resolution and area for the rasterisation
output_img – is a string specifying the output KEA image for the rasterised vector layer
# Copy the crowns polygons, with their attributes, into a KEA image with a RAT.
from rsgislib.vectorutils import createrasters
inputVector = 'crowns.shp'
inputImage = 'injune_p142_casi_sub_utm.kea'
outputImage = 'psu142_crowns.kea'
createrasters.copy_vec_to_rat(inputVector, 'crowns', inputImage, outputImage)
- rsgislib.vectorutils.createrasters.create_vector_range_lut_score_img(vec_file: str, vec_lyr: str, vec_col: str, tmp_vec_file: str, tmp_vec_lyr: str, tmp_vec_col: str, input_img: str, output_img: str, scrs_lut: Dict[int, Tuple[float, float]], out_format: str = 'GPKG', gdalformat: str = 'KEA')
A function which uses a look up table (LUT) with ranges, defined by lower (>=) and upper (<) values to recode columns within a vector layer and export the column as a raster layer.
- Parameters:
vec_file – Input vector file.
vec_lyr – Input vector layer within the input file.
vec_col – The column within which the unique values will be identified.
tmp_vec_file – Intermediate vector file
tmp_vec_lyr – Intermediate vector layer name.
tmp_vec_col – The intermediate vector output numeric column
input_img – is a string specifying the input image defining the grid, pixel resolution and area for the rasterisation.
output_img – is a string specifying the output image for the rasterised vector file
scrs_lut – the LUT for defining the output values. Features outside of the values defined by the LUT will be set as zero. The LUT should define an int as the key which will be the output value and a tuple specifying the lower (>=) and upper (<) values within the vec_col for setting the key value.
out_format – output vector file format (default GPKG).
gdalformat – is the output image format (Default: KEA).
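The range-based recoding described for scrs_lut can be sketched in plain Python. This is an illustrative sketch of the documented semantics (lower bound inclusive, upper bound exclusive, zero for values outside every range), not the library's code; the function name score_from_range_lut and the example ranges are hypothetical.

```python
from typing import Dict, Tuple


def score_from_range_lut(value: float, scrs_lut: Dict[int, Tuple[float, float]]) -> int:
    """Return the LUT key whose [lower, upper) range contains value, else 0."""
    for out_val, (lower, upper) in scrs_lut.items():
        if lower <= value < upper:
            return out_val
    # Features outside all of the ranges are set to zero.
    return 0


# Hypothetical LUT: output value -> (lower, upper) range in vec_col.
scrs_lut = {1: (0.0, 10.0), 2: (10.0, 50.0), 3: (50.0, 100.0)}
scores = [score_from_range_lut(v, scrs_lut) for v in [5.0, 10.0, 49.9, 100.0, -1.0]]
print(scores)
```

Note that a value equal to an upper bound falls into the next range (10.0 scores 2 here), and a value equal to the final upper bound (100.0) falls outside the LUT.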
- rsgislib.vectorutils.createrasters.create_vector_lst_lut_score_img(vec_file: str, vec_lyr: str, vec_col: str, tmp_vec_file: str, tmp_vec_lyr: str, tmp_vec_col: str, input_img: str, output_img: str, scrs_lut: List[Tuple[str | int, int]], out_format: str = 'GPKG', gdalformat: str = 'KEA')
A function which uses a look up table (LUT), given as a list of tuples, to recode the values within a column of a vector layer and export the column as a raster layer. Example LUT tuples: (“Hello”, 1) or (“World”, 2)
- Parameters:
vec_file – Input vector file.
vec_lyr – Input vector layer within the input file.
vec_col – The column within which the unique values will be identified.
tmp_vec_file – Intermediate vector file
tmp_vec_lyr – Intermediate vector layer name.
tmp_vec_col – The intermediate vector output numeric column
input_img – is a string specifying the input image defining the grid, pixel resolution and area for the rasterisation.
output_img – is a string specifying the output image for the rasterised vector file
scrs_lut – the LUT defined as a list which should be a list of tuples (LookUp, OutValue).
out_format – output vector file format (default GPKG).
gdalformat – is the output image format (Default: KEA).
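The list-based LUT can be thought of as a simple lookup from attribute value to output value. The sketch below is illustrative only: the function name score_from_list_lut is hypothetical, and giving unmatched values a score of 0 is an assumption made here to mirror the range-based variant, since the text above does not say how unmatched values are handled.

```python
from typing import List, Tuple, Union


def score_from_list_lut(
    value: Union[str, int],
    scrs_lut: List[Tuple[Union[str, int], int]],
    no_match: int = 0,
) -> int:
    """Return the output value paired with value in the LUT.

    no_match is an assumed fallback for values missing from the LUT.
    """
    lut = dict(scrs_lut)  # (LookUp, OutValue) tuples -> {LookUp: OutValue}
    return lut.get(value, no_match)


scrs_lut = [("Hello", 1), ("World", 2)]
scores = [score_from_list_lut(v, scrs_lut) for v in ["Hello", "World", "Other"]]
print(scores)
```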
- rsgislib.vectorutils.createrasters.create_dist_zones_to_vec_layer(vec_file: str, vec_lyr: str, input_img: str, tmp_vec_img: str, tmp_dist_img: str, output_img: str, recode_lut: List[Tuple[int, Tuple[float, float]]], gdalformat: str = 'KEA', datatype: int = 5, max_dist_thres: float = None, backgrd_val: int = 0)
A function which calculates the distance to vector features and then recodes the distance into categories based on a look up table (LUT) provided. The LUT should be a list specifying the output value and lower (>=) and upper (<) thresholds for that category. For example, (1, (10, 20)). If you do not want to specify a lower or upper value then use math.nan. For example, (2, (math.nan, 10)) or (3, (20, math.nan)).
- Parameters:
vec_file – Input vector file.
vec_lyr – Input vector layer within the input file.
input_img – an input image which will be used as a reference for the pixel grid for rasterising the vector layer and calculating distance.
tmp_vec_img – a temporary image generated during the analysis which is a rasterised version of the vector layer.
tmp_dist_img – a temporary image generated during the analysis which is the distance to the rasterised vector features.
output_img – the output image where the distance has been recoded to categories using the recode_lut.
recode_lut – The recoding LUT specifying the categories to split the distance layer into.
gdalformat – the output image file format (default: KEA)
datatype – the output image file data type (default: rsgislib.TYPE_8UINT)
max_dist_thres – A threshold limiting the maximum distance to be calculated from the vector layer. Limiting this distance can speed up the analysis.
backgrd_val – The background value used when recoding the distance image. i.e., if a pixel does not fall into any of the categories specified then it will be given this value.
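The recode_lut format, including the use of math.nan for an open lower or upper bound, can be sketched as follows. This is an illustration of the documented thresholds (lower inclusive, upper exclusive) applied to a single distance value, not the library's raster implementation; the function name recode_distance is hypothetical.

```python
import math
from typing import List, Tuple


def recode_distance(
    dist: float,
    recode_lut: List[Tuple[int, Tuple[float, float]]],
    backgrd_val: int = 0,
) -> int:
    """Return the category for dist, treating a NaN bound as unbounded."""
    for out_val, (lower, upper) in recode_lut:
        above_lower = math.isnan(lower) or dist >= lower
        below_upper = math.isnan(upper) or dist < upper
        if above_lower and below_upper:
            return out_val
    # No category matched, so the background value is used.
    return backgrd_val


recode_lut = [(2, (math.nan, 10)), (1, (10, 20)), (3, (20, math.nan))]
zones = [recode_distance(d, recode_lut) for d in [5.0, 10.0, 19.9, 20.0, 500.0]]
print(zones)
```

Because NaN never compares true with any number, the open bounds have to be tested explicitly with math.isnan rather than with an ordinary comparison.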
Merge Vectors
- rsgislib.vectorutils.merge_vectors_to_gpkg(in_vec_files: List[str], out_vec_file: str, out_vec_lyr: str, exists: bool = False)
Function which will merge a list of vector files into a single output GeoPackage (GPKG) file using ogr2ogr.
- Parameters:
in_vec_files – is a list of input files.
out_vec_file – is the output GPKG database (*.gpkg)
out_vec_lyr – is the layer name in the output database (i.e., you can merge layers into single layer or write a number of layers to the same database).
exists – boolean which specifies whether the database file exists or not.
- rsgislib.vectorutils.merge_vector_lyrs_to_gpkg(vec_file: str, out_vec_file: str, out_vec_lyr: str, exists: bool = False)
Function which will merge all the layers in the input vector file into a single output GeoPackage (GPKG) file using ogr2ogr.
- Parameters:
vec_file – is a vector file which contains multiple layers which are to be merged
out_vec_file – is the output GPKG database (*.gpkg)
out_vec_lyr – is the layer name in the output database (i.e., you can merge layers into single layer or write a number of layers to the same database).
exists – boolean which specifies whether the database file exists or not.
- rsgislib.vectorutils.merge_vectors_to_gpkg_ind_lyrs(in_vec_files: List, out_vec_file: str, rename_dup_lyrs: bool = False, geom_type: str = None)
Function which will merge a list of vector files into a single output GPKG file where each input file forms a new layer using the existing layer name. This function wraps the ogr2ogr command.
- Parameters:
in_vec_files – is a list of input files.
out_vec_file – is the output GPKG database (*.gpkg)
rename_dup_lyrs – If False an exception will be thrown if any input layers have the same name. If True a duplicated layer will be renamed, with a random set of letters/numbers added to the end.
geom_type – Force the output vector to have a particular geometry type (e.g., ‘POLYGON’). Same options as ogr2ogr.
- rsgislib.vectorutils.merge_vector_layers(vecs_dict: List, out_vec_file: str, out_vec_lyr: str = None, out_format: str = 'GPKG', out_epsg: int = None, remove_cols: List[str] = None)
A function which merges the input vector layers into a single output file using geopandas.
- Parameters:
vecs_dict – List of dicts with keys [{‘file’: ‘/file/path/to/file.gpkg’, ‘layer’: ‘layer_name’}] providing the file paths and layer names.
out_vec_file – output vector file.
out_vec_lyr – output vector layer.
out_format – output file format.
out_epsg – if the input layers are in different projections then this option can be used to define the output projection.
remove_cols – a list of columns to be removed during the merge.
- rsgislib.vectorutils.merge_vector_files(vec_files: List[str], out_vec_file: str, out_vec_lyr: str = None, out_format: str = 'GPKG', out_epsg: int = None, remove_cols: List[str] = None)
A function which merges the input files into a single output file using geopandas. If the input files have multiple layers they are all merged into the output file.
- Parameters:
vec_files – List of input files
out_vec_file – output vector file.
out_vec_lyr – output vector layer.
out_format – output file format.
out_epsg – if the input layers are in different projections then this option can be used to define the output projection.
remove_cols – a list of columns to be removed during the merge.
- rsgislib.vectorutils.merge_utm_vecs_wgs84(in_vec_files: List, out_vec_file: str, out_vec_lyr: str = None, out_format: str = 'GPKG', n_hemi_utm_file: str = None, s_hemi_utm_file: str = None, width_thres: float = 350)
A function which merges input files in UTM projections to the WGS84 projection, cutting polygons which wrap from one side of the world to the other (i.e., the 180/-180 boundary).
- Parameters:
in_vec_files – list of input files
out_vec_file – output vector file.
out_vec_lyr – output vector layer - only used if output format is GPKG
out_format – output file format.
n_hemi_utm_file – GPKG file with a layer per zone (layer names: 01, 02, … 59, 60), each projected in the northern hemisphere UTM projections.
s_hemi_utm_file – GPKG file with a layer per zone (layer names: 01, 02, … 59, 60), each projected in the southern hemisphere UTM projections.
width_thres – The threshold (default 350 degrees) for the width of a polygon for which the polygons will be checked, looping through all the coordinates
- rsgislib.vectorutils.merge_to_multi_layer_vec(input_file_lyrs: List, out_vec_file: str, out_format: str = 'GPKG', overwrite: bool = True)
A function which takes a list of vector files and layers (as VecLayersInfoObj objects) and merges them into a multi-layer vector file.
- Parameters:
input_file_lyrs – list of VecLayersInfoObj objects.
out_vec_file – output vector file.
out_format – output format Default=’GPKG’.
overwrite – bool (default = True) specifying whether the output file should be overwritten if it already exists.
- class rsgislib.vectorutils.VecLayersInfoObj(vec_file: str = None, vec_lyr: str = None, vec_out_lyr: str = None)
This is a class to store the information used by the rsgislib.vectorutils.merge_to_multi_layer_vec function.
- Parameters:
vec_file – input vector file.
vec_lyr – input vector layer name
vec_out_lyr – output vector layer name
Vector Select / Subset
- rsgislib.vectorutils.spatial_select_gp(vec_in_file: str, vec_in_lyr: str, vec_roi_file: str, vec_roi_lyr: str, out_vec_file: str, out_vec_lyr: str, out_format: str = 'GPKG', tmp_col_name: str = 'tmp_sel_join_fid', vec_in_epsg: int = None, vec_roi_epsg: int = None)
A function which spatially selects features from the input vector layer which intersect the ROI vector layer. This function is implemented using geopandas and is generally faster than the export_spatial_select_feats or select_intersect_feats functions.
Note, defining epsg codes for the datasets does not reproject the datasets but just makes sure that correct projection is being used.
- Parameters:
vec_in_file – the input vector file path.
vec_in_lyr – the input vector layer name
vec_roi_file – the roi vector file path
vec_roi_lyr – the roi vector layer name
out_vec_file – the output vector file path
out_vec_lyr – the output vector layer name
out_format – the output vector format (e.g., GPKG).
tmp_col_name – The name of a temporary column added to the input layer used to ensure there are no duplicated features in the output layer. The default name is: “tmp_sel_join_fid”.
vec_in_epsg – Optionally provide the epsg code for the input vector layer.
vec_roi_epsg – Optionally provide the epsg code for the roi vector layer.
- rsgislib.vectorutils.get_att_lst_select_feats(vec_file: str, vec_lyr: str, att_names: List, vec_sel_file: str, vec_sel_lyr: str) List[Dict]
Function to get a list of attribute values from features which intersect with the select layer.
- Parameters:
vec_file – the vector file from which the attribute data comes.
vec_lyr – the layer name from which the attribute data comes.
att_names – a list of attribute names to be outputted.
vec_sel_file – the vector file which will be intersected with the input vector file.
vec_sel_lyr – the layer name which will be intersected with the input vector layer.
- Returns:
list of dictionaries with the output values.
- rsgislib.vectorutils.get_att_lst_select_feats_lyr_objs(vec_lyr_obj: Layer, att_names: List, vec_sel_lyr_obj: Layer) List[Dict]
Function to get a list of attribute values from features which intersect with the select layer.
- Parameters:
vec_lyr_obj – the OGR layer object from which the attribute data comes.
att_names – a list of attribute names to be outputted.
vec_sel_lyr_obj – the OGR layer object which will be intersected with the input layer.
- Returns:
list of dictionaries with the output values.
- rsgislib.vectorutils.get_att_lst_select_bbox_feats(vec_file: str, vec_lyr: str, att_names: List, bbox: Tuple[float, float, float, float] | List[float], bbox_epsg: int = None) List[Dict]
Function to get a list of attribute values from features which intersect with the bounding box.
- Parameters:
vec_file – the OGR file from which the attribute data comes.
vec_lyr – the layer name within the file from which the attribute data comes.
att_names – a list of attribute names to be outputted.
bbox – the bounding box for the region of interest (xMin, xMax, yMin, yMax).
bbox_epsg – the projection of the BBOX (if None then ignore).
- Returns:
list of dictionaries with the output values.
- rsgislib.vectorutils.get_att_lst_select_bbox_feats_lyr_objs(vec_lyr_obj: Layer, att_names: List, bbox: Tuple[float, float, float, float] | List[float], bbox_epsg: int = None) List[Dict]
Function to get a list of attribute values from features which intersect with the bounding box.
- Parameters:
vec_lyr_obj – the OGR layer object from which the attribute data comes.
att_names – a list of attribute names to be outputted.
bbox – the bounding box for the region of interest (xMin, xMax, yMin, yMax).
bbox_epsg – the projection of the BBOX (if None then ignore).
- Returns:
list of dictionaries with the output values.
- rsgislib.vectorutils.select_intersect_feats(vec_file: str, vec_lyr: str, vec_roi_file: str, vec_roi_lyr: str, out_vec_file: str, out_vec_lyr: str, out_format: str = 'GPKG')
Function to select the features which intersect with region of interest (ROI) features which will be outputted into a new vector layer.
- Parameters:
vec_file – the vector file from which the features are selected.
vec_lyr – the layer name from which the features are selected.
vec_roi_file – the ROI vector file which will be intersected with the input vector file.
vec_roi_lyr – the ROI layer name which will be intersected with the input vector layer.
out_vec_file – the vector file which will be outputted.
out_vec_lyr – the layer name which will be outputted.
out_format – output vector format (default GPKG)
- rsgislib.vectorutils.export_spatial_select_feats(vec_file: str, vec_lyr: str, vec_sel_file: str, vec_sel_lyr: str, out_vec_file: str, out_vec_lyr: str, out_format: str)
Function to export the features which intersect with the select layer to a new vector layer.
- Parameters:
vec_file – the vector file from which the features are selected.
vec_lyr – the layer name from which the features are selected.
vec_sel_file – the vector file which will be intersected with the input vector file.
vec_sel_lyr – the layer name which will be intersected with the input vector layer.
out_vec_file – output vector file/path
out_vec_lyr – output vector layer
out_format – the output vector layer type.
- rsgislib.vectorutils.subset_envs_vec_lyr_obj(vec_lyr_obj: Layer, bbox: List, epsg: int = None) Tuple[DataSource, Layer]
Function to get an OGR vector layer for the defined bounding box. The layer is returned as an in-memory OGR Layer object.
- Parameters:
vec_lyr_obj – OGR Layer Object.
bbox – region of interest (bounding box). Define as [xMin, xMax, yMin, yMax].
epsg – provide an EPSG code for the layer if not well defined by the input layer.
- Returns:
OGR Layer and Dataset objects.
- rsgislib.vectorutils.subset_veclyr_to_featboxs(vec_file_bbox: str, vec_lyr_bbox: str, vec_file_tosub: str, vec_lyr_tosub: str, out_lyr_name: str, out_file_base: str, out_file_end: str = 'gpkg', out_format: str = 'GPKG')
A function which subsets an input vector layer using the BBOXs of the features within another vector layer.
- Parameters:
vec_file_bbox – The vector file for the features which define the BBOXs
vec_lyr_bbox – The vector layer for the features which define the BBOXs
vec_file_tosub – The vector file for the layer which is to be subset.
vec_lyr_tosub – The vector layer for the layer which is to be subset.
out_lyr_name – The layer name for the output files - all output files will have the same layer name.
out_file_base – The base name for the output files. A numeric count 0-n will be inserted between this and the ending.
out_file_end – The output file ending (e.g., gpkg).
out_format – The output file driver (e.g., GPKG).
- rsgislib.vectorutils.spatial_select(vec_file: str, vec_lyr: str, vec_roi_file: str, vec_roi_lyr: str, out_vec_file: str, out_vec_lyr: str, out_format: str = 'GPKG')
A function to perform a spatial selection within the input vector using a ROI vector layer. This function uses geopandas so ensure that is installed.
- Parameters:
vec_file – Input vector file from which features are selected.
vec_lyr – Input vector file layer from which features are selected.
vec_roi_file – The ROI vector file used to select features within the input file.
vec_roi_lyr – The ROI vector layer used to select features within the input file.
out_vec_file – The output vector file with the selected features.
out_vec_lyr – The output layer file with the selected features.
out_format – the output vector format
- rsgislib.vectorutils.subset_by_attribute(vec_file: str, vec_lyr: str, sub_col: str, sub_vals: List, out_vec_file: str, out_vec_lyr: str, out_format: str = 'GPKG', match_type: str = 'equals')
A function which subsets an input vector layer based on a list of values.
- Parameters:
vec_file – Input vector file.
vec_lyr – Input vector layer
sub_col – The column used to subset the layer.
sub_vals – A list of values used to subset the layer. If using contains or start then regular expressions supported by the re library can be provided.
out_vec_file – The output vector file
out_vec_lyr – The output vector layer
out_format – The output vector format.
match_type – The type of match for the subset. Options: equals (default) - the same value. contains - string is anywhere within attribute value. start - string matches the start of the attribute value.
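The three match_type options can be illustrated with the re library mentioned above. This is a sketch under assumptions: the function name value_matches is hypothetical, and mapping contains to re.search and start to re.match is one reasonable reading of the descriptions, not a statement of how the library applies the patterns.

```python
import re
from typing import List


def value_matches(att_val: str, sub_vals: List[str], match_type: str = "equals") -> bool:
    """Return True if att_val matches any entry in sub_vals."""
    if match_type == "equals":
        return att_val in sub_vals
    if match_type == "contains":
        # Pattern may occur anywhere within the attribute value.
        return any(re.search(pattern, att_val) for pattern in sub_vals)
    if match_type == "start":
        # Pattern must match from the first character of the attribute value.
        return any(re.match(pattern, att_val) for pattern in sub_vals)
    raise ValueError(f"Unknown match_type: {match_type}")


names = ["River Amazon", "Amazon Basin", "Lake Victoria"]
print([value_matches(n, ["Amazon"], "contains") for n in names])
print([value_matches(n, ["Amazon"], "start") for n in names])
print([value_matches(n, ["River|Lake"], "start") for n in names])
```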
- rsgislib.vectorutils.select_feats_str_search(vec_file: str, vec_lyr: str, select_col: str, select_val: str, out_vec_file: str, out_vec_lyr: str, out_format: str = 'GPKG')
A function which selects features from a vector layer based on a string value within an attribute column. For example, providing a select value ‘River’ would select all features which contain ‘River’ within the column specified, such as ‘River Amazon’, ‘River Seven’ etc. Note, it is case-sensitive.
- Parameters:
vec_file – The input vector file path
vec_lyr – the input vector layer name
select_col – the column which is searched
select_val – the value used to select features
out_vec_file – the output file path
out_vec_lyr – the output layer name
out_format – the output format (Default: GPKG)
- rsgislib.vectorutils.drop_rows_by_attribute(vec_file: str, vec_lyr: str, sub_col: str, sub_vals: List, out_vec_file: str, out_vec_lyr: str, out_format: str = 'GPKG')
A function which drops the rows of an input vector layer which match a list of values.
- Parameters:
vec_file – Input vector file.
vec_lyr – Input vector layer
sub_col – The column used to subset the layer.
sub_vals – A list of values used to subset the layer. If using contains or start then regular expressions supported by the re library can be provided.
out_vec_file – The output vector file
out_vec_lyr – The output vector layer
out_format – The output vector format.
- rsgislib.vectorutils.rm_feat_att_duplicates(vec_file: str, vec_lyr: str, col_name: str, out_vec_file: str, out_vec_lyr: str, out_format: str = 'GPKG')
A function which uses the values within an attribute column to remove duplicate features from the vector layer.
- Parameters:
vec_file – Input vector file.
vec_lyr – Input vector layer within the input file.
col_name – The column used to define unique features
out_vec_file – Output vector file
out_vec_lyr – output vector layer name.
out_format – output file format (default GPKG).
- rsgislib.vectorutils.spatial_select_bbox(vec_file: str, vec_lyr: str, bbox: List[float], out_vec_file: str, out_vec_lyr: str, out_format: str = 'GPKG', vec_in_epsg: int = None)
A function which spatially subsets the vector layer to the bbox [xMin, xMax, yMin, yMax] provided. The function uses geopandas.
- Parameters:
vec_file – Input vector file
vec_lyr – Input vector layer
bbox – region of interest (bounding box). Define as [xMin, xMax, yMin, yMax].
out_vec_file – Output vector file
out_vec_lyr – Output vector layer
out_format – output vector format (Default: GPKG)
vec_in_epsg – Optionally, the EPSG code of the input vector layer can be specified to ensure the output file has the correct projection.
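The bounding box order used throughout this module, [xMin, xMax, yMin, yMax], differs from the (minx, miny, maxx, maxy) order used by geopandas and shapely bounds, which is an easy source of mistakes when mixing the two. The helpers below are a hypothetical sketch for converting and testing boxes; they are not part of rsgislib.

```python
from typing import Sequence, Tuple


def rsgis_bbox_to_bounds(bbox: Sequence[float]) -> Tuple[float, float, float, float]:
    """Convert [xMin, xMax, yMin, yMax] to (minx, miny, maxx, maxy)."""
    x_min, x_max, y_min, y_max = bbox
    if x_min > x_max or y_min > y_max:
        raise ValueError("bbox must be ordered [xMin, xMax, yMin, yMax]")
    return (x_min, y_min, x_max, y_max)


def bboxes_intersect(a: Sequence[float], b: Sequence[float]) -> bool:
    """Test two [xMin, xMax, yMin, yMax] boxes for overlap (touching counts)."""
    return a[0] <= b[1] and b[0] <= a[1] and a[2] <= b[3] and b[2] <= a[3]


roi = [0.0, 10.0, 20.0, 30.0]
print(rsgis_bbox_to_bounds(roi))
print(bboxes_intersect(roi, [5.0, 15.0, 25.0, 35.0]))
print(bboxes_intersect(roi, [11.0, 15.0, 25.0, 35.0]))
```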
Vector Split
- rsgislib.vectorutils.split_vec_lyr(vec_file: str, vec_lyr: str, n_feats: int, out_format: str, out_dir: str, out_vec_base: str, out_vec_ext: str)
A function which splits the input vector layer into a number of output layers.
- Parameters:
vec_file – input vector file.
vec_lyr – input layer name.
n_feats – number of features within each output file.
out_format – output file driver.
out_dir – output directory for the created output files.
out_vec_base – output layer name will be the same as the base file name.
out_vec_ext – file ending (e.g., gpkg). Note don’t include the dot, so input gpkg rather than .gpkg.
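The effect of n_feats can be pictured as chunking the features into consecutive groups, one group per output file. The sketch below is illustrative only: the function name chunk_features is hypothetical, and placing the remainder in a final, smaller group is an assumption rather than documented behaviour.

```python
from typing import List


def chunk_features(feat_ids: List[int], n_feats: int) -> List[List[int]]:
    """Split feature ids into consecutive groups of at most n_feats."""
    if n_feats < 1:
        raise ValueError("n_feats must be at least 1")
    return [feat_ids[i : i + n_feats] for i in range(0, len(feat_ids), n_feats)]


chunks = chunk_features(list(range(10)), n_feats=4)
print([len(c) for c in chunks])
```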
- rsgislib.vectorutils.split_by_attribute(vec_file: str, vec_lyr: str, split_col_name: str, multi_layers: bool = True, out_vec_file: str = None, out_file_path: str = None, out_file_ext: str = None, out_format: str = 'GPKG', dissolve: bool = False, chk_lyr_names: bool = True)
A function which splits a vector layer by an attribute value into either different layers or different output files.
- Parameters:
vec_file – Input vector file
vec_lyr – Input vector layer
split_col_name – The column name by which the vector layer will be split.
multi_layers – Boolean (default True). If True then a multiple layer output file will be created (e.g., GPKG). If False then individual files will be outputted.
out_vec_file – Output vector file - only used if multi_layers = True
out_file_path – Output file path (directory) if multi_layers = False.
out_file_ext – Output file extension if multi_layers = False
out_format – The output format (e.g., GPKG, ESRI Shapefile).
dissolve – Boolean (Default=False) if True then a dissolve on the specified variable will be run as layers are separated.
chk_lyr_names – If True (default) layer names (from split_col_name) will be checked, which means punctuation removed and all characters being ascii characters.
- rsgislib.vectorutils.split_feats_to_mlyrs(vec_file: str, vec_lyr: str, out_vec_file: str, out_format: str = 'GPKG')
A function which splits an existing vector layer into multiple layers
- Parameters:
vec_file – input vector file
vec_lyr – input vector layer
out_vec_file – output file, note the format must be one which supports multiple layers (e.g., GPKG).
out_format – The output format of the output file.
- rsgislib.vectorutils.create_n_random_subsets(vec_file: str, vec_lyr: str, out_vec_dir: str, out_vec_base: str, out_vec_ext: str, out_format: str = 'GPKG', n_subs: int = 10, smpl_frac: float = 0.5, n_smpl: int = None, replacement: bool = False, rnd_seed: int = None)
A function which creates n random subsets of the features within a vector layer. This is useful when running a bootstrapping process or similar.
- Parameters:
vec_file – input vector file path
vec_lyr – input vector layer name
out_vec_dir – output directory path
out_vec_base – output base file name
out_vec_ext – output file extension for the vector files (e.g., gpkg)
out_format – output vector file format (Default: GPKG)
n_subs – the number of subsets to generate (Default: 10).
smpl_frac – the fraction of the whole data to take as the subset. Note, if n_smpl is defined it will be used over smpl_frac. (Default: 0.5)
n_smpl – the number of samples to take from the input layer for each subset (Default: None). Note, if n_smpl is defined (i.e., not None) then smpl_frac will be ignored.
replacement – Boolean specifying whether the random subset is selected with replacement (Default: False)
rnd_seed – A seed for the random selection. Default: None.
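The interaction between smpl_frac, n_smpl, replacement and rnd_seed can be sketched with the standard random module. This is a minimal illustration operating on a list of feature ids, not the library's implementation; the function name random_subsets is hypothetical, and truncating the fractional sample size with int() is an assumption.

```python
import random
from typing import List, Optional


def random_subsets(
    feat_ids: List[int],
    n_subs: int = 10,
    smpl_frac: float = 0.5,
    n_smpl: Optional[int] = None,
    replacement: bool = False,
    rnd_seed: Optional[int] = None,
) -> List[List[int]]:
    """Draw n_subs random subsets; n_smpl takes precedence over smpl_frac."""
    rng = random.Random(rnd_seed)
    size = n_smpl if n_smpl is not None else int(len(feat_ids) * smpl_frac)
    subsets = []
    for _ in range(n_subs):
        if replacement:
            # Sampling with replacement: a feature may appear more than once.
            subsets.append(rng.choices(feat_ids, k=size))
        else:
            subsets.append(rng.sample(feat_ids, size))
    return subsets


feat_ids = list(range(100))
subsets = random_subsets(feat_ids, n_subs=3, smpl_frac=0.1, rnd_seed=42)
print([len(s) for s in subsets])
```

Fixing rnd_seed makes the subsets reproducible between runs, which is useful when a bootstrapping experiment needs to be repeated.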
- rsgislib.vectorutils.split_vec_lyr_random_subset(vec_file: str, vec_lyr: str, out_rmain_vec_file: str, out_rmain_vec_lyr: str, out_smpl_vec_file: str, out_smpl_vec_lyr: str, n_smpl: int, out_format: str = 'GPKG', rnd_seed: int = None)
A function to split a vector layer into two subsets by randomly sampling the input file. This function uses geopandas so that library must therefore be installed.
- Parameters:
vec_file – Input vector file.
vec_lyr – Input vector layer.
out_rmain_vec_file – Output vector file with the ‘remain’ outputs (i.e., the remainder once the sample is taken)
out_rmain_vec_lyr – Output vector layer with the ‘remain’ outputs (i.e., the remainder once the sample is taken)
out_smpl_vec_file – Output vector file with the sampled outputs
out_smpl_vec_lyr – Output vector layer with the sampled outputs
n_smpl – the number of samples to be randomly selected
out_format – The output format of the output file. (Default: GPKG)
rnd_seed – A seed for the random number generator.
- rsgislib.vectorutils.create_train_test_smpls(vec_file: str, vec_lyr: str, out_train_vec_file: str, out_train_vec_lyr: str, out_test_vec_file: str, out_test_vec_lyr: str, out_format: str = 'GPKG', prop_test: float = 0.2, tmp_dir: str = 'tmp', rnd_seed: int = None)
A function for splitting a vector dataset into training and testing datasets.
- Parameters:
vec_file – Input vector file.
vec_lyr – Input vector layer.
out_train_vec_file – Output vector file with the training data.
out_train_vec_lyr – Output vector layer with the training data.
out_test_vec_file – Output vector file with the testing data.
out_test_vec_lyr – Output vector layer with the testing data.
out_format – The output format of the output file. (Default: GPKG)
prop_test – Proportion of the dataset to be defined as the test data
tmp_dir – a temporary directory for intermediate outputs.
rnd_seed – A seed for the random number generator.
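The role of prop_test can be sketched as a shuffled split of the feature ids into two disjoint groups. This is an illustration only: the function name train_test_split_ids is hypothetical, and rounding the test set size to the nearest integer is an assumption, since the rounding rule is not stated above.

```python
import random
from typing import List, Optional, Tuple


def train_test_split_ids(
    feat_ids: List[int], prop_test: float = 0.2, rnd_seed: Optional[int] = None
) -> Tuple[List[int], List[int]]:
    """Return (train_ids, test_ids) with roughly prop_test of the ids in test."""
    rng = random.Random(rnd_seed)
    shuffled = list(feat_ids)  # copy so the caller's list is not reordered
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * prop_test))
    return shuffled[n_test:], shuffled[:n_test]


train_ids, test_ids = train_test_split_ids(list(range(50)), prop_test=0.2, rnd_seed=7)
print(len(train_ids), len(test_ids))
```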
Vector Geometry
- rsgislib.vectorutils.geopd_check_polys_wgs84_bounds_geometry(data_gdf, width_thres: float = 350)
A function which checks the polygons within the geometry of a geopandas dataframe for the specific case where they are on the east/west edge (i.e., 180 / -180) and are therefore being wrapped around the world. For example, this function would change a longitude -179.91 to 180.01. The geopandas dataframe will be edited in place.
This function will import the shapely library.
- Parameters:
data_gdf – geopandas dataframe.
width_thres – The threshold (default 350 degrees) for the width of a polygon for which the polygons will be checked, looping through all the coordinates
- Returns:
geopandas dataframe
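The width test and longitude shift can be illustrated on a plain list of coordinates. This is a sketch of the general idea under assumptions, not the library's code: the function name unwrap_ring is hypothetical, and shifting negative longitudes by +360 degrees is inferred from the example in the description.

```python
from typing import List, Tuple

Coord = Tuple[float, float]


def unwrap_ring(coords: List[Coord], width_thres: float = 350.0) -> List[Coord]:
    """Shift negative longitudes by 360 if the ring spans more than width_thres."""
    lons = [x for x, _ in coords]
    if max(lons) - min(lons) <= width_thres:
        # Narrow enough: the polygon is not wrapped around the world.
        return list(coords)
    return [(x + 360.0 if x < 0 else x, y) for x, y in coords]


# A small polygon straddling the 180 / -180 boundary.
ring = [(179.5, 10.0), (-179.5, 10.0), (-179.5, 11.0), (179.5, 11.0), (179.5, 10.0)]
fixed = unwrap_ring(ring)
print(fixed)
```

After the shift the ring is one degree wide instead of 359 degrees, so it no longer appears to stretch across the whole globe.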
Vector / Raster Tests
- rsgislib.vectorutils.does_vmsk_img_intersect(input_vmsk_img: str, vec_roi_file: str, vec_roi_lyr: str, tmp_dir: str, vec_epsg: int = None)
This function checks whether the input binary raster mask intersects with the input vector layer. A check is first done as to whether the bounding boxes intersect, if they do then the intersection between the images is then calculated. The input image and vector can be in different projections but the projection needs to be well defined.
- Parameters:
input_vmsk_img – Input binary mask image file.
vec_roi_file – The input vector file.
vec_roi_lyr – The name of the input layer.
tmp_dir – a temporary directory for files generated during processing.
vec_epsg – If projection is poorly defined by the vector layer then it can be specified.
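The bounding-box pre-check mentioned in the description can be sketched as follows. This is an illustration only: the helper bboxes_intersect is hypothetical, both boxes are assumed to already be in the same projection, and the (MinX, MaxX, MinY, MaxY) ordering is assumed to match the one returned by get_vec_layer_extent.

```python
def bboxes_intersect(bbox_a, bbox_b):
    """Return True if two (MinX, MaxX, MinY, MaxY) bounding boxes overlap or touch."""
    a_min_x, a_max_x, a_min_y, a_max_y = bbox_a
    b_min_x, b_max_x, b_min_y, b_max_y = bbox_b
    x_overlap = a_min_x <= b_max_x and b_min_x <= a_max_x
    y_overlap = a_min_y <= b_max_y and b_min_y <= a_max_y
    return x_overlap and y_overlap


img_bbox = (0.0, 10.0, 0.0, 10.0)
roi_bbox = (5.0, 15.0, 5.0, 15.0)
far_bbox = (20.0, 30.0, 0.0, 10.0)
overlaps = bboxes_intersect(img_bbox, roi_bbox)
disjoint = bboxes_intersect(img_bbox, far_bbox)
```

A cheap test like this avoids the more expensive pixel-level comparison whenever the two datasets are clearly apart.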
Vector Info
- rsgislib.vectorutils.get_vec_feat_count(vec_file: str, vec_lyr: str = None, compute_count: bool = True) → int
Get a count of the number of features in the vector layer.
- Parameters:
vec_file – is a string with the input vector file name and path.
vec_lyr – is the layer for which the features are to be counted (Default: None). If None, assume there is only one layer and that will be read.
compute_count – is a boolean which specifies whether the feature count should be calculated (rather than estimated from the header) even if that operation is computationally expensive.
- Returns:
nfeats
- rsgislib.vectorutils.count_feats_per_att_val(vec_file: str, vec_lyr: str, col_name: str, out_df_dict: bool = False) → Dict
A function which returns the count of features for each variable value.
- Parameters:
vec_file – Input vector file.
vec_lyr – Input vector layer within the input file.
col_name – The column used to count the number of features per value.
out_df_dict – If True, the dict will be formatted for import into a pandas dataframe. Otherwise, the output dict will use the attribute values as the keys and the counts as the values.
- Returns:
A dict, either formatted for import into a pandas dataframe (with keys for the values and counts) or mapping each attribute value to its number of features.
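The two output layouts can be illustrated with a short sketch that works on a plain list of attribute values. The helper count_per_value is hypothetical, and the key names "vals" and "count" in the dataframe-style layout are assumptions rather than confirmed library output.

```python
from collections import Counter


def count_per_value(att_vals, out_df_dict=False):
    """Hypothetical helper: count how many features share each attribute value."""
    counts = Counter(att_vals)
    if out_df_dict:
        # Column-oriented layout; the key names "vals" and "count" are assumed.
        return {"vals": list(counts.keys()), "count": list(counts.values())}
    # Attribute value -> number of features.
    return dict(counts)


classes = ["forest", "water", "forest", "urban", "forest", "water"]
by_value = count_per_value(classes)
df_ready = count_per_value(classes, out_df_dict=True)
```

A column-oriented dict of this shape can be passed directly to pandas.DataFrame(), whereas the value-to-count dict is more convenient for direct lookups.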
- rsgislib.vectorutils.get_vec_lyrs_lst(vec_file: str) → List[str]
A function which returns a list of available layers within the inputted vector file.
- Parameters:
vec_file – file name and path to the input vector file.
- Returns:
list of layer names (can be used with gdal.Dataset.GetLayerByName()).
- rsgislib.vectorutils.get_vec_layer_extent(vec_file: str, vec_lyr: str = None, compute_if_exp: bool = True) → Tuple[float, float, float, float]
Get the extent of the vector layer.
- Parameters:
vec_file – is a string with the input vector file name and path.
vec_lyr – is the layer for which the extent is to be calculated (Default: None). If None, assume there is only one layer and that will be read.
compute_if_exp – is a boolean which specifies whether the layer extent should be calculated (rather than estimated from header) even if that operation is computationally expensive.
- Returns:
The bounding box (MinX, MaxX, MinY, MaxY).
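Note that the (MinX, MaxX, MinY, MaxY) ordering differs from the (minx, miny, maxx, maxy) ordering used by shapely's bounds and geopandas' total_bounds. A minimal sketch of converting between the two is given below. The helper name is hypothetical and the extent values are made up for illustration.

```python
def to_minx_miny_maxx_maxy(extent):
    """Reorder (MinX, MaxX, MinY, MaxY) into (minx, miny, maxx, maxy)."""
    min_x, max_x, min_y, max_y = extent
    return (min_x, min_y, max_x, max_y)


# Illustrative extent in the (MinX, MaxX, MinY, MaxY) order described above.
lyr_extent = (-4.5, -3.0, 52.0, 53.5)
bounds = to_minx_miny_maxx_maxy(lyr_extent)
width = lyr_extent[1] - lyr_extent[0]
height = lyr_extent[3] - lyr_extent[2]
```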
- rsgislib.vectorutils.get_vec_lyr_cols(vec_file: str, vec_lyr: str) → List[str]
A function which returns a list of columns from the input vector layer.
- Parameters:
vec_file – input vector file.
vec_lyr – input vector layer
- Returns:
list of column names
- rsgislib.vectorutils.get_ogr_vec_col_datatype_from_gdal_rat_col_datatype(rat_datatype: int) → int
Returns the data type needed to create a column in an OGR vector layer that is equivalent to rat_datatype.
- Parameters:
rat_datatype – the datatype (GFT_Integer, GFT_Real, GFT_String) for the RAT column.
- Returns:
OGR datatype (OFTInteger, OFTReal, OFTString)
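The mapping described above can be sketched without importing GDAL by mirroring the relevant enum values as local constants. This is an assumption-laden illustration: in real code the constants would come from osgeo.gdal and osgeo.ogr, the helper rat_to_ogr_datatype is hypothetical, and raising ValueError for unknown types is a choice made for this sketch.

```python
# Local copies of the GDAL RAT field type values (gdal.GFT_*).
GFT_INTEGER, GFT_REAL, GFT_STRING = 0, 1, 2
# Local copies of the OGR field type values (ogr.OFT*).
OFT_INTEGER, OFT_REAL, OFT_STRING = 0, 2, 4

RAT_TO_OGR = {
    GFT_INTEGER: OFT_INTEGER,
    GFT_REAL: OFT_REAL,
    GFT_STRING: OFT_STRING,
}


def rat_to_ogr_datatype(rat_datatype):
    """Hypothetical helper: look up the OGR field type for a RAT column type."""
    try:
        return RAT_TO_OGR[rat_datatype]
    except KeyError:
        raise ValueError(f"Unrecognised RAT datatype: {rat_datatype}") from None


ogr_type = rat_to_ogr_datatype(GFT_REAL)
```

Because the integer codes differ between the two enums, the RAT value should not be passed straight through when creating an OGR field.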
- rsgislib.vectorutils.get_vec_lyr_geom_type(vec_file: str, vec_lyr: str = None) → int
A function which returns an rsgislib.GEOM_XX value related to the vector geometry type.
- Parameters:
vec_file – is a string with the input vector file name and path.
vec_lyr – is the layer for which the geometry type is to be returned (Default: None). If None, assume there is only one layer and that will be read.
- Returns:
int representing the geometry type (rsgislib.GEOM_XX)
- rsgislib.vectorutils.get_geom_type_name(geom_type: int) → str
A function which returns a string with a human-readable name of the geometry type.
- Parameters:
geom_type – the numerical type (e.g., rsgislib.GEOM_POLY)
- Returns:
the name of the geometry type
Vectors Utilities
- rsgislib.vectorutils.check_validate_geometries(vec_file: str, vec_lyr: str, out_vec_file: str, out_vec_lyr: str, out_format: str, print_err_geoms: bool, del_exist_vec: bool)
A command to check and validate the geometries of the input vector layer.
- Parameters:
vec_file – is a string containing the input vector file path
vec_lyr – is a string containing the name of the input vector layer
out_vec_file – is a string containing the output vector file path
out_vec_lyr – is a string containing the name of the output vector layer
out_format – is a string specifying the output vector GDAL/OGR driver (e.g., GPKG).
print_err_geoms – is a bool, specifying whether geometries found to contain errors are printed to the console.
del_exist_vec – is a bool, specifying whether to force removal of the output vector if it exists
- rsgislib.vectorutils.delete_vector_file(vec_file: str, feedback: bool = True)
Function to delete an existing vector file.
- Parameters:
vec_file – vector file path
feedback – Boolean specifying whether the function should print feedback to the console as files are deleted.
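Some vector formats, such as ESRI Shapefiles, store one dataset across several files that share a base name. The sketch below shows one way such a dataset could be removed using only the standard library. The helper delete_with_sidecars is hypothetical, and matching on the shared base name is an assumption of this sketch, not a description of how delete_vector_file is implemented.

```python
import glob
import os
import tempfile


def delete_with_sidecars(vec_file, feedback=True):
    """Hypothetical helper: delete a file and all files sharing its base name."""
    base = os.path.splitext(vec_file)[0]
    removed = []
    # glob.escape protects any wildcard characters present in the path itself.
    for path in sorted(glob.glob(glob.escape(base) + ".*")):
        if feedback:
            print(f"Deleting: {path}")
        os.remove(path)
        removed.append(path)
    return removed


# Build a throwaway shapefile-like set of files to demonstrate the helper.
tmp_dir = tempfile.mkdtemp()
for ext in (".shp", ".shx", ".dbf", ".prj"):
    open(os.path.join(tmp_dir, "roads" + ext), "w").close()

removed = delete_with_sidecars(os.path.join(tmp_dir, "roads.shp"), feedback=False)
```

Matching on the base name also removes any unrelated file that happens to start with the same name followed by a dot, so a sketch like this should only be pointed at directories where that is acceptable.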