RSGISLib Check Datasets Tools
Images
- rsgislib.tools.checkdatasets.run_check_gdal_image_file(input_img: str, check_bands: bool = True, n_bands: int = 0, chk_proj: bool = False, epsg_code: int = 0, read_img: bool = False, smpl_n_pxls: int = 10, calc_chk_sum: bool = False, max_file_size: int = None, rm_err: bool = False, print_err: bool = True, timeout: int = 4)
Checks a GDAL-compatible image file by running check_gdal_image_file through a multiprocessing object, so that errors which would otherwise crash Python are caught and the calling Python environment keeps running.
You probably want to call this function rather than calling check_gdal_image_file directly.
- Parameters:
input_img – the file path to the GDAL image file.
check_bands – boolean specifying whether individual image bands should be opened and checked (Default: True)
n_bands – int specifying the number of expected image bands. Ignored if 0; Default is 0.
chk_proj – boolean specifying whether to check that the projection has been defined.
epsg_code – int for the EPSG code for the projection. Error raised if the image is not in that projection.
read_img – boolean specifying whether to try reading some pixel values from the image. If True, smpl_n_pxls (e.g., 10) randomly located pixel values are read from a randomly selected band.
smpl_n_pxls – the number of pixel values to be randomly selected (Default: 10). Sampling more values increases the runtime.
calc_chk_sum – boolean specifying whether a checksum should be calculated for each band to check validity
max_file_size – int specifying the maximum file size for the input file. If None then ignored.
rm_err – boolean specifying whether to delete the file if an error is found
print_err – print any errors associated with the file to the console
timeout – a timeout in seconds (Default = 4) for the tests to be undertaken.
- Returns:
boolean whether the file is OK (i.e., passed tests) or not.
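The isolation idea behind the run_ wrappers can be illustrated with a minimal sketch. This is not RSGISLib's implementation: to stay self-contained it swaps in the standard-library subprocess module in place of a multiprocessing object, and run_isolated is a hypothetical helper name. The principle is the same: the risky work happens in a child process, so a crash or hang there is reported as a failed check instead of taking down the calling interpreter.

```python
import subprocess
import sys


def run_isolated(py_code: str, timeout: float = 4.0) -> bool:
    """Run py_code in a child interpreter.

    Returns True only if the child exits with status 0 within `timeout` seconds.
    A crash, a non-zero exit or a hang in the child all yield False, while the
    calling interpreter carries on.
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-c", py_code],
            timeout=timeout,                # the child is killed once this expires
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        return False
    return proc.returncode == 0
```

With RSGISLib itself none of this is needed: run_check_gdal_image_file(input_img, timeout=...) returns the same kind of boolean.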
- rsgislib.tools.checkdatasets.run_check_gdal_image_files(input_imgs: list, check_bands: bool = True, n_bands: int = 0, chk_proj: bool = False, epsg_code: int = 0, read_img: bool = False, smpl_n_pxls: int = 10, calc_chk_sum: bool = False, max_file_size: int = None, rm_err: bool = False, print_err: bool = True, print_file_names: bool = False, timeout: int = 4)
Checks a list of GDAL-compatible image files by running check_gdal_image_file through a multiprocessing object, so that errors which would otherwise crash Python are caught and the calling Python environment keeps running.
You probably want to call this function rather than calling check_gdal_image_file directly.
- Parameters:
input_imgs – a list of input images.
check_bands – boolean specifying whether individual image bands should be opened and checked (Default: True)
n_bands – int specifying the number of expected image bands. Ignored if 0; Default is 0.
chk_proj – boolean specifying whether to check that the projection has been defined.
epsg_code – int for the EPSG code for the projection. Error raised if the image is not in that projection.
read_img – boolean specifying whether to try reading some pixel values from the image. If True, smpl_n_pxls (e.g., 10) randomly located pixel values are read from a randomly selected band.
smpl_n_pxls – the number of pixel values to be randomly selected (Default: 10). Sampling more values increases the runtime.
calc_chk_sum – boolean specifying whether a checksum should be calculated for each band to check validity
max_file_size – int specifying the maximum file size for the input file. If None then ignored.
rm_err – boolean specifying whether to delete the file if an error is found
print_err – print any errors associated with the file to the console
print_file_names – print the name of each file before it is tested.
timeout – a timeout in seconds (Default = 4) for the tests to be undertaken.
- Returns:
boolean whether all the files are OK (i.e., passed tests) or not.
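Because run_check_gdal_image_files returns a single boolean for the whole list, it does not say which files failed. One possible way to recover that, sketched below, is to apply a single-file check to each path in turn. find_failed_files is a hypothetical helper, and check_fn stands in for any callable that returns True for a good file.

```python
from typing import Callable, Iterable, List


def find_failed_files(paths: Iterable[str], check_fn: Callable[[str], bool]) -> List[str]:
    """Return the paths for which check_fn reports a problem (returns False)."""
    return [path for path in paths if not check_fn(path)]


# Demonstration with a stand-in check that only accepts '.kea' files.
# With RSGISLib installed, check_fn could instead be something like:
#   lambda p: rsgislib.tools.checkdatasets.run_check_gdal_image_file(p, print_err=False)
failed = find_failed_files(["a.kea", "b.tif", "c.kea"], lambda p: p.endswith(".kea"))
print(failed)  # prints ['b.tif']
```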
- rsgislib.tools.checkdatasets.check_gdal_image_file(input_img: str, check_bands: bool = True, n_bands: int = 0, chk_proj: bool = False, epsg_code: int = 0, read_img: bool = False, smpl_n_pxls: int = 10, calc_chk_sum: bool = False, max_file_size: int = None)
A function which checks a GDAL compatible image file and returns an error message if appropriate.
- Parameters:
input_img – the file path to the GDAL image file.
check_bands – boolean specifying whether individual image bands should be opened and checked (Default: True)
n_bands – int specifying the number of expected image bands. Ignored if 0; Default is 0.
chk_proj – boolean specifying whether to check that the projection has been defined.
epsg_code – int for the EPSG code for the projection. Error raised if the image is not in that projection.
read_img – boolean specifying whether to try reading some pixel values from the image. If True, smpl_n_pxls (e.g., 10) randomly located pixel values are read from a randomly selected band.
smpl_n_pxls – the number of pixel values to be randomly selected (Default: 10). Sampling more values increases the runtime.
calc_chk_sum – boolean specifying whether a checksum should be calculated for each band to check validity
max_file_size – int specifying the maximum file size for the input file. If None then ignored.
- Returns:
boolean (True: file ok; False: Error found), string (error message if required otherwise empty string)
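Unlike the run_ wrappers, check_gdal_image_file returns a pair, so callers need to unpack it. The sketch below formats such a pair into a one-line report. format_check_result is a hypothetical helper, and the sample tuples are made up for illustration; they are not real RSGISLib output.

```python
from typing import Tuple


def format_check_result(file_path: str, result: Tuple[bool, str]) -> str:
    """Turn a (file_ok, err_msg) pair into a one-line report."""
    file_ok, err_msg = result
    if file_ok:
        return f"{file_path}: OK"
    return f"{file_path}: FAILED ({err_msg})"


# Made-up results in the documented (boolean, string) shape.
print(format_check_result("scene.kea", (True, "")))
print(format_check_result("broken.kea", (False, "hypothetical error message")))
```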
Vectors
- rsgislib.tools.checkdatasets.run_check_gdal_vector_file(vec_file: str, chk_proj: bool = True, epsg_code: int = 0, max_file_size: int = None, rm_err: bool = False, print_err: bool = True, multi_file: bool = False, timeout: int = 4)
Checks a GDAL-compatible vector file by running check_gdal_vector_file through a multiprocessing object, so that errors which would otherwise crash Python are caught and the calling Python environment keeps running.
You probably want to call this function rather than calling check_gdal_vector_file directly.
- Parameters:
vec_file – the file path to the GDAL vector file.
chk_proj – boolean specifying whether to check that the projection has been defined.
epsg_code – int for the EPSG code for the projection. Error raised if the vector file is not in that projection.
max_file_size – int specifying the maximum file size for the input file. If None then ignored.
rm_err – boolean specifying whether to delete the file if an error is found
print_err – print any errors associated with the file to the console
multi_file – if True (Default: False), then when a file is removed (see rm_err) all files with the same basename are removed as well. Useful for ESRI Shapefiles, which are made up of multiple files.
timeout – a timeout in seconds (Default = 4) for the tests to be undertaken.
- Returns:
boolean specifying whether the file is OK (i.e., tests passed) or not.
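A call for an ESRI Shapefile might look like the following sketch. check_shapefile is a hypothetical wrapper; the keyword arguments are taken from the signature above, the timeout of 10 seconds is an arbitrary illustrative value, and the import sits inside the function only so that the snippet can be loaded on a machine without RSGISLib.

```python
def check_shapefile(shp_path: str, epsg: int = 0, remove_bad: bool = False) -> bool:
    """Check one Shapefile; optionally delete it on error."""
    # Imported here so that defining the function does not require RSGISLib.
    import rsgislib.tools.checkdatasets as chk

    return chk.run_check_gdal_vector_file(
        shp_path,
        chk_proj=True,       # require a defined projection
        epsg_code=epsg,      # 0 is the documented default value
        rm_err=remove_bad,   # delete the file if an error is found
        multi_file=True,     # Shapefiles consist of several files sharing a basename
        timeout=10,          # seconds; the documented default is 4
    )
```

Leaving remove_bad at False keeps the check read-only, which is the safer choice while the thresholds are still being tuned.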
- rsgislib.tools.checkdatasets.run_check_gdal_vector_files(vec_files: list, chk_proj: bool = True, epsg_code: int = 0, max_file_size: int = None, rm_err: bool = False, print_err: bool = True, multi_file: bool = False, print_file_names: bool = False, timeout: int = 4)
Checks a list of GDAL-compatible vector files by running check_gdal_vector_file through a multiprocessing object, so that errors which would otherwise crash Python are caught and the calling Python environment keeps running.
You probably want to call this function rather than calling check_gdal_vector_file directly.
- Parameters:
vec_files – list of input file paths.
chk_proj – boolean specifying whether to check that the projection has been defined.
epsg_code – int for the EPSG code for the projection. Error raised if a vector file is not in that projection.
max_file_size – int specifying the maximum file size for the input file. If None then ignored.
rm_err – boolean specifying whether to delete the file if an error is found
print_err – print any errors associated with the file to the console
multi_file – if True (Default: False), then when a file is removed (see rm_err) all files with the same basename are removed as well. Useful for ESRI Shapefiles, which are made up of multiple files.
print_file_names – print the name of each file before it is tested.
timeout – a timeout in seconds (Default = 4) for the tests to be undertaken.
- Returns:
boolean specifying whether all the files are OK (i.e., tests passed) or not.
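The list passed as vec_files has to be assembled by the caller. One possible way, using only the standard library, is sketched below. collect_vector_files and its default extensions are illustrative assumptions, not part of RSGISLib.

```python
from pathlib import Path
from typing import Iterable, List


def collect_vector_files(
    in_dir: str, extensions: Iterable[str] = (".shp", ".gpkg", ".geojson")
) -> List[str]:
    """Return sorted paths of files directly inside in_dir with a wanted extension."""
    wanted = {ext.lower() for ext in extensions}
    return sorted(
        str(path)
        for path in Path(in_dir).iterdir()
        if path.is_file() and path.suffix.lower() in wanted
    )
```

The result could then be handed on, for example as run_check_gdal_vector_files(collect_vector_files("/path/to/vectors"), multi_file=True), where the directory is a placeholder. Only the .shp member of each Shapefile is collected, so the sidecar files are not checked as if they were separate datasets.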
- rsgislib.tools.checkdatasets.check_gdal_vector_file(vec_file: str, chk_proj: bool = True, epsg_code: int = 0, max_file_size: int = None)
A function which checks a GDAL compatible vector file and returns an error message if appropriate.
- Parameters:
vec_file – the file path to the GDAL vector file.
chk_proj – boolean specifying whether to check that the projection has been defined.
epsg_code – int for the EPSG code for the projection. Error raised if the vector file is not in that projection.
max_file_size – int specifying the maximum file size for the input file. If None then ignored.
- Returns:
boolean (True: file OK; False: Error found), string (error message if required otherwise empty string)
HDF5 Files
- rsgislib.tools.checkdatasets.run_check_hdf5_file(input_file: str, rm_err: bool = False, print_err: bool = True, timeout: int = 4)
Checks an HDF5 file by running check_hdf5_file through a multiprocessing object, so that errors which would otherwise crash Python are caught and the calling Python environment keeps running.
You probably want to call this function rather than calling check_hdf5_file directly.
- Parameters:
input_file – the file path to the HDF5 file.
rm_err – boolean specifying whether to delete the file if an error is found
print_err – print any errors associated with the file to the console
timeout – a timeout in seconds (Default = 4) for the tests to be undertaken.
- Returns:
boolean specifying whether the file is OK (i.e., tests passed) or not.
- rsgislib.tools.checkdatasets.run_check_hdf5_files(input_files: list, rm_err: bool = False, print_err: bool = True, print_file_names: bool = False, timeout: int = 4)
Checks a list of HDF5 files by running check_hdf5_file through a multiprocessing object, so that errors which would otherwise crash Python are caught and the calling Python environment keeps running.
You probably want to call this function rather than calling check_hdf5_file directly.
- Parameters:
input_files – a list of input HDF5 file paths.
rm_err – boolean specifying whether to delete the file if an error is found
print_err – print any errors associated with the file to the console
print_file_names – print the name of each file before it is tested.
timeout – a timeout in seconds (Default = 4) for the tests to be undertaken.
- Returns:
boolean specifying whether all the files are OK (i.e., tests passed) or not.
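Setting rm_err=True deletes failing files. Where that is too drastic, one alternative (an assumption of this sketch, not an RSGISLib feature) is to leave rm_err at False and move failing files aside for inspection. quarantine_bad_files is a hypothetical helper, and check_fn stands in for a single-file check such as run_check_hdf5_file.

```python
import shutil
from pathlib import Path
from typing import Callable, Iterable, List


def quarantine_bad_files(
    paths: Iterable[str], check_fn: Callable[[str], bool], quarantine_dir: str
) -> List[str]:
    """Move every file that fails check_fn into quarantine_dir; return the new paths."""
    target = Path(quarantine_dir)
    target.mkdir(parents=True, exist_ok=True)
    moved = []
    for path in paths:
        if not check_fn(path):
            dest = target / Path(path).name
            shutil.move(path, str(dest))
            moved.append(str(dest))
    return moved
```

The sketch keeps only the file name when moving, so two failing files with the same name from different directories would collide; a real workflow would need to handle that case.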
- rsgislib.tools.checkdatasets.check_hdf5_file(input_file: str)
A function which checks whether an HDF5 file is valid.
- Parameters:
input_file – the file path to the input file.
- Returns:
boolean (True: file is valid; False: file is not valid)