RSGISLib Vector Attributes Module
Basic Reading and Writing of Columns
- rsgislib.vectorattrs.write_vec_column(out_vec_file: str, out_vec_lyr: str, att_column: str, att_col_datatype: int, att_col_data: List)
A function which will write a column to a vector file
- Parameters:
out_vec_file – The file / path to the vector data ‘file’.
out_vec_lyr – The layer to which the data is to be added.
att_column – Name of the output column
att_col_datatype – ogr data type for the output column (e.g., ogr.OFTString, ogr.OFTInteger, ogr.OFTReal)
att_col_data – A list of the same length as the number of features in the vector file.
- rsgislib.vectorattrs.write_vec_column_to_layer(out_vec_lyr_obj: Layer, att_column: str, att_col_datatype: int, att_col_data: List)
A function which will write a column to a vector layer.
- Parameters:
out_vec_lyr_obj – GDAL/OGR vector layer object
att_column – Name of the output column
att_col_datatype – ogr data type for the output column (e.g., ogr.OFTString, ogr.OFTInteger, ogr.OFTReal)
att_col_data – A list of the same length as the number of features in the vector layer.
- rsgislib.vectorattrs.read_vec_column(vec_file: str, vec_lyr: str, att_column: str) List
A function which will read a column from a vector file.
- Parameters:
vec_file – The file / path to the vector data ‘file’.
vec_lyr – The layer from which the data is to be read.
att_column – Name of the input column
- Returns:
a list with the column values.
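As a rough illustration of how the read and write functions above might be combined, the sketch below reads a numeric column, scales it in plain Python and writes the result back as a new column. The helper names `scale_values` and `write_scaled_column`, and the idea of scaling a column, are hypothetical and only for illustration; the RSGISLib calls follow the signatures listed above. The GDAL and RSGISLib imports are deferred so that the pure-Python helper can be tried without either installed.

```python
from typing import List


def scale_values(values: List[float], gain: float) -> List[float]:
    """Multiply every value by a gain (pure Python, no GDAL needed)."""
    return [val * gain for val in values]


def write_scaled_column(
    vec_file: str, vec_lyr: str, in_col: str, out_col: str, gain: float
) -> None:
    """Read in_col, scale it and write the result to out_col in the same layer."""
    # Deferred imports: only needed when a vector file is actually touched.
    from osgeo import ogr
    import rsgislib.vectorattrs

    vals = rsgislib.vectorattrs.read_vec_column(vec_file, vec_lyr, in_col)
    # One output value per feature, as write_vec_column requires.
    new_vals = scale_values(vals, gain)
    rsgislib.vectorattrs.write_vec_column(
        vec_file, vec_lyr, out_col, ogr.OFTReal, new_vals
    )
```

Because the output list is derived element by element from the list that was read, it has the same length as the number of features, which is what `att_col_data` expects.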
- rsgislib.vectorattrs.read_vec_columns(vec_file: str, vec_lyr: str, att_columns: List[str]) List[Dict]
A function which will read a number of columns from a vector file.
- Parameters:
vec_file – The file / path to the vector data ‘file’.
vec_lyr – The layer from which the data is to be read.
att_columns – List of input attribute column names to be read in.
- Returns:
list of dicts with the column names as keys
- rsgislib.vectorattrs.get_vec_cols_as_array(vec_file: str, vec_lyr: str, cols: List[str], lower_limit: float = None, upper_limit: float = None) array
A function which returns an n x m numpy array with the values for the columns specified.
- Parameters:
vec_file – Input vector file.
vec_lyr – Input vector layer within the input file.
cols – list of columns to be read and returned.
no_data_val – no data value used within the column values. Rows with a no data value will be dropped. If None then ignored (Default: None)
lower_limit – Optional lower limit to define valid values. Note the same value is used for all the columns listed. If a value is found to be outside of the threshold the whole row is removed.
upper_limit – Optional upper limit to define valid values. Note the same value is used for all the columns listed. If a value is found to be outside of the threshold the whole row is removed.
- Returns:
a numpy array with the column values.
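The row-dropping behaviour described for lower_limit and upper_limit can be sketched in plain Python. This is not the library's implementation (which returns a numpy array); the function name `filter_rows` is hypothetical, and treating values exactly equal to a limit as valid is an assumption, since the text above does not say whether the limits are inclusive.

```python
from typing import List, Optional


def filter_rows(
    rows: List[List[float]],
    lower_limit: Optional[float] = None,
    upper_limit: Optional[float] = None,
) -> List[List[float]]:
    """Keep only the rows in which every value lies within the limits.

    The same limits apply to all columns. Assumption: values equal to a
    limit are kept.
    """
    kept = []
    for row in rows:
        if lower_limit is not None and any(val < lower_limit for val in row):
            continue  # one value below the limit removes the whole row
        if upper_limit is not None and any(val > upper_limit for val in row):
            continue  # one value above the limit removes the whole row
        kept.append(row)
    return kept


rows = [[0.1, 0.5], [0.2, 1.7], [-0.3, 0.4]]
print(filter_rows(rows, lower_limit=0.0, upper_limit=1.0))  # [[0.1, 0.5]]
```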
Add Columns
- rsgislib.vectorattrs.add_fid_col(vec_file: str, vec_lyr: str, out_vec_file: str, out_vec_lyr: str, out_format: str = 'GPKG', out_col: str = 'fid')
A function which adds a numeric feature ID (FID) column with unique values per feature within the file.
- Parameters:
vec_file – Input vector file.
vec_lyr – Input vector layer within the input file.
out_vec_file – Output vector file
out_vec_lyr – output vector layer name.
out_format – output file format (default GPKG).
out_col – The output FID column name (Default: fid)
- rsgislib.vectorattrs.add_numeric_col_lut(vec_file: str, vec_lyr: str, ref_col: str, val_lut: Dict, out_col: str, out_vec_file: str, out_vec_lyr: str, out_format: str = 'GPKG')
A function which adds a numeric column based on an existing column in the vector file, using a dict LUT to define the values.
- Parameters:
vec_file – Input vector file.
vec_lyr – Input vector layer within the input file.
ref_col – The column within which the unique values will be identified.
val_lut – A dict LUT (key should be value in ref_col and value be the value outputted to out_col).
out_col – The output numeric column
out_vec_file – Output vector file
out_vec_lyr – output vector layer name.
out_format – output file format (default GPKG).
- rsgislib.vectorattrs.add_numeric_col(vec_file: str, vec_lyr: str, out_col: str, out_vec_file: str, out_vec_lyr: str, out_val: float = 1, out_format: str = 'GPKG', out_col_int: bool = False)
A function which adds a numeric column with the same value for all the features.
- Parameters:
vec_file – Input vector file.
vec_lyr – Input vector layer within the input file.
out_col – The output numeric column
out_vec_file – Output vector file
out_vec_lyr – output vector layer name.
out_val – output numeric value
out_format – output file format (default GPKG).
out_col_int – Specify whether the output column should be an int datatype. If True (default: False) then the output column will be of type int. If False then it will be type float.
- rsgislib.vectorattrs.add_string_col(vec_file: str, vec_lyr: str, out_col: str, out_vec_file: str, out_vec_lyr: str, out_val: str = 'str_val', out_format: str = 'GPKG')
A function which adds a string column with the same value for all the features.
- Parameters:
vec_file – Input vector file.
vec_lyr – Input vector layer within the input file.
out_col – The output string column
out_vec_file – Output vector file
out_vec_lyr – output vector layer name.
out_val – output string value
out_format – output file format (default GPKG).
- rsgislib.vectorattrs.add_string_col_lut(vec_file: str, vec_lyr: str, ref_col: str, val_lut: Dict, out_col: str, out_vec_file: str, out_vec_lyr: str, out_format: str = 'GPKG')
A function which adds a string (text) column based on an existing column in the vector file, using a dict LUT to define the values.
- Parameters:
vec_file – Input vector file.
vec_lyr – Input vector layer within the input file.
ref_col – The column within which the unique values will be identified.
val_lut – A dict LUT (key should be value in ref_col and value be the value outputted to out_col).
out_col – The output string column
out_vec_file – Output vector file
out_vec_lyr – output vector layer name.
out_format – output file format (default GPKG).
- rsgislib.vectorattrs.add_numeric_col_range_lut(vec_file: str, vec_lyr: str, vec_col: str, out_vec_file: str, out_vec_lyr: str, out_vec_col: str, val_lut: Dict[int, Tuple[float, float]], out_format: str = 'GPKG')
A function which adds a numerical column to the vector layer using an LUT of lower (>=) and upper (<) values, compared against the input column, to define the output value, which will be the LUT key.
- Parameters:
vec_file – Input vector file.
vec_lyr – Input vector layer within the input file.
vec_col – The input column whose values are compared against the lower and upper values in the LUT.
out_vec_file – Output vector file
out_vec_lyr – Output vector layer name.
out_vec_col – The output numeric column
val_lut – the LUT for defining the output values. Features outside of the values defined by the LUT will be set as zero. The LUT should define an int as the key which will be the output value and a tuple specifying the lower (>=) and upper (<) values within the vec_col for setting the key value.
out_format – output file format (default GPKG).
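The range LUT rule described above (lower bound inclusive, upper bound exclusive, zero for anything outside every range) can be sketched in plain Python. The function name `apply_range_lut` is hypothetical and this is not the library's code; where ranges overlap, taking the first matching key is an assumption.

```python
from typing import Dict, List, Tuple


def apply_range_lut(
    values: List[float], val_lut: Dict[int, Tuple[float, float]]
) -> List[int]:
    """Return the LUT key whose (lower, upper) range contains each value."""
    out_vals = []
    for val in values:
        out_val = 0  # values outside every range are set to zero
        for key, (lower, upper) in val_lut.items():
            if lower <= val < upper:  # lower is inclusive, upper is exclusive
                out_val = key
                break  # assumption: the first matching range wins
        out_vals.append(out_val)
    return out_vals


val_lut = {1: (0.0, 10.0), 2: (10.0, 20.0)}
print(apply_range_lut([5.0, 10.0, 19.9, 20.0, -1.0], val_lut))  # [1, 2, 2, 0, 0]
```

Note that 10.0 falls in the second range and 20.0 falls in neither, which follows directly from the `>=` and `<` comparisons.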
- rsgislib.vectorattrs.add_numeric_col_from_lst_lut(vec_file: str, vec_lyr: str, ref_col: str, vals_lut: List[Tuple[str | int, int]], out_col: str, out_vec_file: str, out_vec_lyr: str, out_format: str = 'GPKG')
A function which adds a numeric column based on an existing column in the vector file, using a list-based LUT to define the values. The LUT should be defined as a list of tuples with the value to match as the first value and the value to be outputted as the second. For example, (“Hello”, 1) or (“World”, 2).
- Parameters:
vec_file – Input vector file.
vec_lyr – Input vector layer within the input file.
ref_col – The column within which the unique values will be identified.
vals_lut – A list LUT which should be a list of tuples (LookUp, OutValue).
out_col – The output numeric column
out_vec_file – Output vector file
out_vec_lyr – output vector layer name.
out_format – output file format (default GPKG).
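A minimal sketch of how a list-based LUT of (LookUp, OutValue) tuples maps reference values to output values. The function names are hypothetical and this is not the library's code. How the library treats values missing from the LUT is not stated above, so this sketch simply raises a `KeyError`; matching in list order is also an assumption.

```python
from typing import List, Tuple, Union

LutEntry = Tuple[Union[str, int], int]


def lookup_value(ref_val: Union[str, int], vals_lut: List[LutEntry]) -> int:
    """Return the output value of the first LUT tuple matching ref_val."""
    for match_val, out_val in vals_lut:
        if ref_val == match_val:
            return out_val
    # Assumption: unmatched values are treated as an error in this sketch.
    raise KeyError(ref_val)


def apply_list_lut(
    ref_vals: List[Union[str, int]], vals_lut: List[LutEntry]
) -> List[int]:
    """Map every reference value through the list LUT."""
    return [lookup_value(ref_val, vals_lut) for ref_val in ref_vals]


vals_lut = [("Hello", 1), ("World", 2)]
print(apply_list_lut(["World", "Hello", "World"], vals_lut))  # [2, 1, 2]
```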
- rsgislib.vectorattrs.create_name_col(vec_file: str, vec_lyr: str, out_vec_file: str, out_vec_lyr: str, out_format: str = 'GPKG', out_col: str = 'names', x_col: str = 'MinX', y_col: str = 'MaxY', prefix: str = '', postfix: str = '', coords_lat_lon: bool = True, int_coords: bool = True, coord_gain: float = 0.0, zero_x_pad: int = 0, zero_y_pad: int = 0, round_n_digts: int = 0, non_neg: bool = False, replace_dec_pt: bool = True, dec_pt_val: str = '')
A function which creates a column in the vector layer which can define a name using coordinates associated with the feature. This is often useful if, for example, a tiling has been created and a set of images is to be generated from it.
- Parameters:
vec_file – input vector file
vec_lyr – input vector layer name
out_vec_file – output vector file
out_vec_lyr – output vector layer name
out_format – The output format of the output file. (Default: GPKG)
out_col – The name of the output column
x_col – The column with the x coordinate
y_col – The column with the y coordinate
prefix – A prefix to the name
postfix – A postfix to the name
coords_lat_lon – A boolean specifying if the coordinates are lat / long
int_coords – A boolean specifying whether to integerise the coordinates.
coord_gain – Apply a gain to the coordinate before integerising. Default = 0.0 (i.e., no gain)
zero_x_pad – If larger than zero then the X coordinate will be zero padded.
zero_y_pad – If larger than zero then the Y coordinate will be zero padded.
round_n_digts – If larger than zero then the coordinates will be rounded to n significant digits
non_neg – boolean specifying whether any negative coordinates should be made positive. (Default: False)
replace_dec_pt – replace the decimal point with another string. Default: True
dec_pt_val – the value used instead of a decimal point. Default: “” i.e., empty string so decimal point is removed.
- rsgislib.vectorattrs.create_date_col(vec_file: str, vec_lyr: str, year_col: str, month_col: str, day_col: str, out_date_col: str, out_vec_file: str, out_vec_lyr: str, out_format: str = 'GPKG')
Column Utilities
- rsgislib.vectorattrs.drop_vec_cols(vec_file: str, vec_lyr: str, drop_cols: List[str], out_vec_file: str, out_vec_lyr: str, out_format: str = 'GPKG', chk_cols_present: bool = True)
A function which allows vector columns to be removed from the layer.
- Parameters:
vec_file – Input vector file
vec_lyr – Input vector layer
drop_cols – List of columns to remove from layer
out_vec_file – the output vector file
out_vec_lyr – the output vector layer
out_format – the output vector format (Default: GPKG)
chk_cols_present – boolean (default: True) to check that the columns to be removed are present and remove those from the list which are not present.
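The effect of chk_cols_present can be sketched in plain Python: columns listed for removal which are not in the layer are dropped from the list before anything is removed. The function name `resolve_drop_cols` is hypothetical and this is only a sketch of the behaviour described above, not the library's code.

```python
from typing import List


def resolve_drop_cols(
    lyr_cols: List[str], drop_cols: List[str], chk_cols_present: bool = True
) -> List[str]:
    """Return the columns that would actually be removed from the layer."""
    if chk_cols_present:
        # Ignore requested columns that do not exist in the layer.
        return [col for col in drop_cols if col in lyr_cols]
    return list(drop_cols)


lyr_cols = ["fid", "name", "area"]
to_drop = resolve_drop_cols(lyr_cols, ["area", "perimeter"])
kept_cols = [col for col in lyr_cols if col not in to_drop]
print(to_drop)    # ['area']
print(kept_cols)  # ['fid', 'name']
```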
- rsgislib.vectorattrs.rename_vec_cols(vec_file: str, vec_lyr: str, rname_cols_lut: Dict[str, str], out_vec_file: str, out_vec_lyr: str, out_format: str = 'GPKG')
A function which allows vector columns to be renamed.
- Parameters:
vec_file – Input vector file
vec_lyr – Input vector layer
rname_cols_lut – dict look up for the columns to be renamed. Format: {“orig_name”: “new_name”}
out_vec_file – the output vector file
out_vec_lyr – the output vector layer
out_format – the output vector format (Default: GPKG)
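The rname_cols_lut format can be illustrated with a small pure-Python sketch which applies the look up to a list of column names. The function name `rename_cols` is hypothetical; leaving columns without a LUT entry unchanged is an assumption about the intended behaviour.

```python
from typing import Dict, List


def rename_cols(cols: List[str], rname_cols_lut: Dict[str, str]) -> List[str]:
    """Apply a {"orig_name": "new_name"} look up to a list of column names."""
    # Assumption: columns that are not in the LUT keep their original name.
    return [rname_cols_lut.get(col, col) for col in cols]


print(rename_cols(["fid", "nm", "area"], {"nm": "name"}))  # ['fid', 'name', 'area']
```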
Joins
- rsgislib.vectorattrs.perform_spatial_join(vec_base_file: str, vec_base_lyr: str, vec_join_file: str, vec_join_lyr: str, out_vec_file: str, out_vec_lyr: str, out_format: str = 'GPKG', join_how: str = 'inner', join_op: str = 'within', vec_base_epsg: int = None, vec_join_epsg: int = None)
A function to perform a spatial join between two vector layers. This function uses geopandas so this needs to be installed. You also need to have the rtree package to generate the index used to perform the intersection.
Note, defining epsg codes for the datasets does not reproject the datasets but just makes sure that correct projection is being used.
For more information see: http://geopandas.org/mergingdata.html#spatial-joins
- Parameters:
vec_base_file – the base vector file with the geometries which will be outputted.
vec_base_lyr – the layer name for the base vector.
vec_join_file – the vector with the attributes which will be joined to the base vector geometries.
vec_join_lyr – the layer name for the join vector.
out_vec_file – the output vector file.
out_vec_lyr – the layer name for the output vector.
out_format – The output vector file format (Default GPKG)
join_how – Specifies the type of join that will occur and which geometry is retained. The options are [left, right, inner]. The default is ‘inner’
join_op – Defines whether or not to join the attributes of one object to another. The options are [intersects, within, contains] and default is ‘within’
vec_base_epsg – Optionally provide the epsg code for the base vector layer.
vec_join_epsg – Optionally provide the epsg code for the join vector layer.
Calculate Column Values
- rsgislib.vectorattrs.pop_bbox_cols(vec_file: str, vec_lyr: str, x_min_col: str = 'xmin', x_max_col: str = 'xmax', y_min_col: str = 'ymin', y_max_col: str = 'ymax')
A function which adds the bounding box (bbox) of each polygon as attributes to each feature.
- Parameters:
vec_file – vector file.
vec_lyr – layer within the vector file.
x_min_col – output column name.
x_max_col – output column name.
y_min_col – output column name.
y_max_col – output column name.
- rsgislib.vectorattrs.add_geom_bbox_cols(vec_file: str, vec_lyr: str, out_vec_file: str, out_vec_lyr: str, out_format: str = 'GPKG', min_x_col: str = 'MinX', max_x_col: str = 'MaxX', min_y_col: str = 'MinY', max_y_col: str = 'MaxY')
A function which adds columns to the vector layer with the bbox of each geometry.
- Parameters:
vec_file – input vector file
vec_lyr – input vector layer name
out_vec_file – output vector file
out_vec_lyr – output vector layer name
out_format – The output format of the output file. (Default: GPKG)
min_x_col – Name of the MinX column (Default: MinX)
max_x_col – Name of the MaxX column (Default: MaxX)
min_y_col – Name of the MinY column (Default: MinY)
max_y_col – Name of the MaxY column (Default: MaxY)
- rsgislib.vectorattrs.add_unq_numeric_col(vec_file: str, vec_lyr: str, unq_col: str, out_col: str, out_vec_file: str, out_vec_lyr: str, out_format: str = 'GPKG', lut_json_file: str = None)
A function which adds a numeric column based off an existing column in the vector file.
- Parameters:
vec_file – Input vector file.
vec_lyr – Input vector layer within the input file.
unq_col – The column within which the unique values will be identified.
out_col – The output numeric column
out_vec_file – Output vector file
out_vec_lyr – output vector layer name.
out_format – output file format (default GPKG).
lut_json_file – an optional output LUT file.
- rsgislib.vectorattrs.calc_npts_in_radius(vec_in_file: str, vec_in_lyr: str, radius: float, out_vec_file: str, out_vec_lyr: str, out_format: str = 'GPKG', out_col_name: str = 'n_pts_r', n_cores: int = 1)
A function which calculates the number of points within a radius of each point.
- Parameters:
vec_in_file – Input vector file path (must be points geometry)
vec_in_lyr – Input vector layer (must be points geometry)
radius – the search radius
out_vec_file – Output vector file path
out_vec_lyr – Output vector layer
out_format – output vector format (Default: GPKG)
out_col_name – output column name (Default: n_pts_r)
n_cores – the number of cores to be used for the query. If -1 is passed then all available cores will be used.
- rsgislib.vectorattrs.create_angle_sets(vec_file: str, vec_lyr: str, angle_col: str, start_angle: int, angle_set_width: int, out_vec_file: str, out_vec_lyr: str, out_format: str = 'GPKG', out_angle_set_col: str = 'angle_set')
A function which creates sets of features based on an angle column. The assumption is that the angle is from a fixed centre point. The angle sets are mirrored so you can look at patterns along an angle.
- Parameters:
vec_file – Input vector file path
vec_lyr – The input vector layer name.
angle_col – The name of the column within the vector layer with the angles, which must be in degrees (0-360)
start_angle – The angle to start the angle sets from.
angle_set_width – The width of the angle sets, which must divide exactly into 180.
out_vec_file – The output vector file path.
out_vec_lyr – The output vector layer name
out_format – The output vector file format (Default: GPKG)
out_angle_set_col – The column in the output file with the column sets. The column sets are specified by an integer ID (1 - n)
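One way to picture mirrored angle sets is to fold every angle into a 180 degree range measured from start_angle and then bin it by angle_set_width, so that an angle and its opposite (angle + 180) share a set. The sketch below does this in plain Python. The function name `angle_set_id` is hypothetical, and the exact way the library numbers the sets and treats set boundaries is an assumption here.

```python
def angle_set_id(angle: float, start_angle: int, angle_set_width: int) -> int:
    """Return a 1-based set ID for an angle in degrees, mirrored through 180."""
    if 180 % angle_set_width != 0:
        raise ValueError("angle_set_width must divide exactly into 180")
    # Fold into [0, 180) relative to the start so opposite angles coincide.
    rel_angle = (angle - start_angle) % 180
    # Assumption: sets are numbered 1..n in order from the start angle.
    return int(rel_angle // angle_set_width) + 1


print(angle_set_id(10, 0, 30))   # 1
print(angle_set_id(190, 0, 30))  # 1 (mirror of 10 degrees)
print(angle_set_id(45, 0, 30))   # 2
```

With a width of 30 degrees there are 180 / 30 = 6 sets, which matches the integer IDs (1 - n) described for out_angle_set_col.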
- rsgislib.vectorattrs.create_orthogonal_angle_sets(vec_file: str, vec_lyr: str, angle_col: str, start_angle: int, angle_half_width: int, out_vec_file: str, out_vec_lyr: str, out_format: str = 'GPKG', out_angle_set_col: str = 'angle_set')
A function which creates a pair of angle sets which are orthogonal to one another. For example, if the start angle is 0 and the angle half width is 20 then the width of each set will be 40 degrees and the first set will be from 340 to 20 degrees. If the start angle is 60 then the first set would be 40-80.
- Parameters:
vec_file – Input vector file path
vec_lyr – The input vector layer name.
angle_col – The name of the column within the vector layer with the angles, which must be in degrees (0-360)
start_angle – The angle to start the angle sets from.
angle_half_width – The half width of each of the angle sets.
out_vec_file – The output vector file path.
out_vec_lyr – The output vector layer name
out_format – The output vector file format (Default: GPKG)
out_angle_set_col – The column in the output file with the column set
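The worked example above (start angle 0 and half width 20 giving a set from 340 to 20 degrees) depends on handling the wrap-around at 360 degrees. The sketch below shows one way to test whether an angle falls inside such a window and to assign it to one of two orthogonal sets. The function names, the set IDs 1 and 2, the value 0 for angles in neither set and the inclusive boundaries are all assumptions for illustration; any mirroring the library may apply is not modelled.

```python
def in_angle_window(angle: float, centre: float, half_width: float) -> bool:
    """True if angle is within half_width degrees of centre, with wrap-around."""
    # Smallest absolute difference between the two angles, in [0, 180].
    diff = abs((angle - centre + 180) % 360 - 180)
    return diff <= half_width  # assumption: boundaries are inclusive


def orthogonal_set_id(angle: float, start_angle: int, angle_half_width: int) -> int:
    """Return 1 or 2 for the two orthogonal sets, or 0 if in neither."""
    if in_angle_window(angle, start_angle, angle_half_width):
        return 1
    if in_angle_window(angle, start_angle + 90, angle_half_width):
        return 2
    return 0


print(orthogonal_set_id(350, 0, 20))  # 1 (inside 340 to 20)
print(orthogonal_set_id(90, 0, 20))   # 2 (inside 70 to 110)
print(orthogonal_set_id(45, 0, 20))   # 0 (in neither set)
```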
- rsgislib.vectorattrs.calc_vec_area(vec_file: str, vec_lyr: str, out_col: str, out_vec_file: str, out_vec_lyr: str, out_format: str = 'GPKG')
A function which adds a column to the attribute table with the area of each polygon. Geometry is expected to be polygon.
- Parameters:
vec_file – the input vector file
vec_lyr – the input vector layer
out_col – the output column name
out_vec_file – output vector file path
out_vec_lyr – output vector layer name
out_format – the output vector format (Default: GPKG)
- rsgislib.vectorattrs.calc_vec_length(vec_file: str, vec_lyr: str, out_col: str, out_vec_file: str, out_vec_lyr: str, out_format: str = 'GPKG')
A function which adds a column to the attribute table with the length of each vector feature. Geometry is expected to be polygon or line.
- Parameters:
vec_file – the input vector file
vec_lyr – the input vector layer
out_col – the output column name
out_vec_file – output vector file path
out_vec_lyr – output vector layer name
out_format – the output vector format (Default: GPKG)
- rsgislib.vectorattrs.calc_vec_pt_dist_angle(vec_file: str, vec_lyr: str, out_vec_file: str, out_vec_lyr: str, out_format: str = 'GPKG', x_centre: float = None, y_centre: float = None, angle_col: str = 'angle', dist_col: str = 'dist')
A function which adds columns to the attribute table with the distance and angle for each point from a centre point (x_centre, y_centre). If x_centre and y_centre are not provided then they are calculated as the mean of all the points.
- Parameters:
vec_file – Input vector file path
vec_lyr – Input vector layer name
out_vec_file – Output vector file path
out_vec_lyr – Output vector layer name
out_format – the output vector format (Default: GPKG)
x_centre – Optionally the X centre point (Default: None). If None then calculated as the mean of all the points.
y_centre – Optionally the Y centre point (Default: None). If None then calculated as the mean of all the points.
angle_col – The output angle column name
dist_col – The output distance column name
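The default centre and the distance calculation can be sketched in plain Python. The function names are hypothetical and this is not the library's implementation. The angle is left out on purpose because the text above does not state which angular convention (zero direction and rotation sense) is used.

```python
import math
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def mean_centre(pts: List[Point]) -> Point:
    """Return the mean X and mean Y of all the points."""
    xs = [x for x, _ in pts]
    ys = [y for _, y in pts]
    return sum(xs) / len(xs), sum(ys) / len(ys)


def dists_from_centre(
    pts: List[Point],
    x_centre: Optional[float] = None,
    y_centre: Optional[float] = None,
) -> List[float]:
    """Euclidean distance of each point from the centre point."""
    mean_x, mean_y = mean_centre(pts)
    # A centre coordinate left as None falls back to the mean of the points.
    if x_centre is None:
        x_centre = mean_x
    if y_centre is None:
        y_centre = mean_y
    return [math.hypot(x - x_centre, y - y_centre) for x, y in pts]


pts = [(0.0, 0.0), (4.0, 0.0), (4.0, 6.0), (0.0, 6.0)]
print(mean_centre(pts))                  # (2.0, 3.0)
print(dists_from_centre(pts, 0.0, 0.0))
```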
Get Column Summaries
- rsgislib.vectorattrs.get_unq_col_values(vec_file: str, vec_lyr: str, col_name: str) array
A function which returns the unique values within a column of a vector layer.
- Parameters:
vec_file – Input vector file
vec_lyr – Input vector layer
col_name – The column name for which a list of unique values will be returned.
- Returns:
a numpy array of the unique values within the column.
Sort By Attributes
- rsgislib.vectorattrs.sort_vec_lyr(vec_file: str, vec_lyr: str, out_vec_file: str, out_vec_lyr: str, sort_by: str | List[str], ascending: bool | List[bool], out_format: str = 'GPKG')
A function which sorts a vector layer based on the attributes of the layer. You can sort by either a single attribute or by multiple attributes if a list is provided. This function is implemented using geopandas.
- Parameters:
vec_file – the input vector file.
vec_lyr – the input vector layer name.
out_vec_file – the output vector file.
out_vec_lyr – the output vector layer name.
sort_by – either a string with the name of a single attribute or a list of strings if multiple attributes are used for the sort.
ascending – either a bool (True: ascending; False: descending) or list of bools if a list of attributes was given.
out_format – The output vector file format (Default: GPKG)
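How sort_by and ascending pair up when lists are given can be sketched in plain Python on a list of attribute dicts. The documented function is implemented with geopandas; this sketch (with the hypothetical name `sort_rows`) only mimics the ordering semantics and assumes that a single bool applies to every sort attribute.

```python
from typing import Dict, List, Union


def sort_rows(
    rows: List[Dict],
    sort_by: Union[str, List[str]],
    ascending: Union[bool, List[bool]],
) -> List[Dict]:
    """Sort attribute rows by one or more keys, each ascending or descending."""
    if isinstance(sort_by, str):
        sort_by = [sort_by]
    if isinstance(ascending, bool):
        # Assumption: a single bool applies to all the sort attributes.
        ascending = [ascending] * len(sort_by)
    if len(sort_by) != len(ascending):
        raise ValueError("sort_by and ascending must have the same length")
    out_rows = list(rows)
    # Python's sort is stable, so sort by the least significant key first.
    for col, asc in reversed(list(zip(sort_by, ascending))):
        out_rows.sort(key=lambda row: row[col], reverse=not asc)
    return out_rows


rows = [
    {"cls": "b", "area": 2.0},
    {"cls": "a", "area": 1.0},
    {"cls": "a", "area": 3.0},
]
# Sort by class ascending, then by area descending within each class.
print(sort_rows(rows, ["cls", "area"], [True, False]))
```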
Change Attribute Values
- rsgislib.vectorattrs.find_replace_str_vec_lyr(vec_file: str, vec_lyr: str, out_vec_file: str, out_vec_lyr: str, cols: List[str], find_replace: Dict[str, str], out_format: str = 'GPKG')
A function which performs a find and replace on one or more string columns within the vector layer. For example, replacing a no data value (e.g., NA) with something more useful. This function is implemented using geopandas.
- Parameters:
vec_file – the input vector file.
vec_lyr – the input vector layer name.
out_vec_file – the output vector file.
out_vec_lyr – the output vector layer name.
cols – a list of strings with the names of the columns to which the find and replace is to be applied.
find_replace – the value pairs where the dict keys are the values to be replaced and the value is the replacement value.
out_format – The output vector file format (Default: GPKG)
- rsgislib.vectorattrs.check_str_col(vec_file: str, vec_lyr: str, vec_col: str, out_vec_file: str, out_vec_lyr: str, out_format: str = 'GPKG', rm_non_ascii: bool = True, rm_dashs: bool = False, rm_spaces: bool = False, rm_punc: bool = False)
A function which checks the values in a string column removing non-ascii characters and optionally removing spaces, dashes and punctuation.
- Parameters:
vec_file – the input vector file.
vec_lyr – the input vector layer name.
vec_col – the name of the column to be checked.
out_vec_file – the output vector file.
out_vec_lyr – the output vector layer name.
out_format – The output vector file format (Default: GPKG)
rm_non_ascii – If True (default True) remove any non-ascii characters from the string
rm_dashs – If True (default False) remove any dashes from the string and replace with underscores.
rm_spaces – If True (default False) remove any spaces from the string.
rm_punc – If True (default False) remove any punctuation (other than ‘_’ or ‘-’) from the string.
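The cleaning options described above can be sketched for a single string value in plain Python. The function name `clean_str` is hypothetical and this is not the library's code; in particular, the order in which the options are applied is an assumption.

```python
import string


def clean_str(
    val: str,
    rm_non_ascii: bool = True,
    rm_dashs: bool = False,
    rm_spaces: bool = False,
    rm_punc: bool = False,
) -> str:
    """Clean one string value using the options described for check_str_col."""
    if rm_non_ascii:
        # Drop any character that cannot be encoded as ASCII.
        val = val.encode("ascii", "ignore").decode("ascii")
    if rm_dashs:
        # Dashes are replaced with underscores rather than deleted.
        val = val.replace("-", "_")
    if rm_spaces:
        val = val.replace(" ", "")
    if rm_punc:
        # Remove punctuation other than '_' and '-'.
        punc_chars = set(string.punctuation) - {"_", "-"}
        val = "".join(char for char in val if char not in punc_chars)
    return val


print(clean_str("Caf\u00e9 del-Mar!"))
print(clean_str("Caf\u00e9 del-Mar!", rm_dashs=True, rm_spaces=True, rm_punc=True))
```

With the defaults only the non-ASCII character is removed, giving `Caf del-Mar!`; with all options enabled the result is `Cafdel_Mar`.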
Geometry Intersections
- rsgislib.vectorattrs.count_pt_intersects(vec_in_file: str, vec_in_lyr: str, vec_pts_file: str, vec_pts_lyr: str, out_vec_file: str, out_vec_lyr: str, out_format: str = 'GPKG', out_count_col: str = 'n_points', tmp_col_name: str = 'tmp_join_fid', vec_in_epsg: int = None, vec_pts_epsg: int = None)
A function which counts the number of points intersecting a set of polygons adding the count to each polygon as a new column.
Note, defining epsg codes for the datasets does not reproject the datasets but just makes sure that correct projection is being used.
- Parameters:
vec_in_file – the input polygons vector file path.
vec_in_lyr – the input polygons vector layer name
vec_pts_file – the points vector file path
vec_pts_lyr – the points vector layer name
out_vec_file – the output vector file path
out_vec_lyr – the output vector layer name
out_format – the output vector format (e.g., GPKG).
out_count_col – the output column name (default: n_points)
tmp_col_name – The name of a temporary column added to the input layer used to ensure there are no duplicated features in the output layer. The default name is: “tmp_join_fid”.
vec_in_epsg – Optionally provide the epsg code for the input vector layer.
vec_pts_epsg – Optionally provide the epsg code for the points vector layer.
- rsgislib.vectorattrs.annotate_vec_selection(vec_in_file: str, vec_in_lyr: str, vec_sel_file: str, vec_sel_lyr: str, out_vec_file: str, out_vec_lyr: str, out_col_name: str = 'sel_feats', out_format: str = 'GPKG', tmp_col_name: str = 'tmp_sel_join_fid', vec_in_epsg: int = None, vec_sel_epsg: int = None)
A function which spatially selects features from the input vector layer which intersect the selection vector layer, populating a column within the output vector layer specifying which features intersect.
Note, defining epsg codes for the datasets does not reproject the datasets but just makes sure that correct projection is being used.
- Parameters:
vec_in_file – the input vector file path.
vec_in_lyr – the input vector layer name
vec_sel_file – the selection vector file path
vec_sel_lyr – the selection vector layer name
out_vec_file – the output vector file path
out_vec_lyr – the output vector layer name
out_col_name – the output boolean column specifying those features which intersect with the vec_sel_lyr layer.
out_format – the output vector format (e.g., GPKG).
tmp_col_name – The name of a temporary column added to the input layer used to ensure there are no duplicated features in the output layer. The default name is: “tmp_sel_join_fid”.
vec_in_epsg – Optionally provide the epsg code for the input vector layer.
vec_sel_epsg – Optionally provide the epsg code for the selection vector layer.
Export Attribute Table
- rsgislib.vectorattrs.export_vec_attrs_to_csv(vec_file: str, vec_lyr: str, output_file: str)
A function which exports the attribute table from a vector layer to a CSV file.
- Parameters:
vec_file – The input vector file path
vec_lyr – The input vector layer name
output_file – The output file path.
- rsgislib.vectorattrs.export_vec_attrs_to_excel(vec_file: str, vec_lyr: str, output_file: str, out_sheet_name: str = 'Sheet1')
A function which exports the attribute table from a vector layer to an Excel file (*.xlsx).
- Parameters:
vec_file – The input vector file path
vec_lyr – The input vector layer name
output_file – The output file path.
- rsgislib.vectorattrs.export_vec_attrs_to_parquet(vec_file: str, vec_lyr: str, output_file: str, gzip_output: bool = True)
A function which exports the attribute table from a vector layer to a parquet file.
- Parameters:
vec_file – The input vector file path
vec_lyr – The input vector layer name
output_file – The output file path.