RSGISLib Data Sources

This module has functions to help with accessing and downloading data.

USGS Earth Explorer

rsgislib.dataaccess.usgs_m2m.usgs_login(username: str = None, password: str = None) str

A function to login to the USGS m2m service.

Parameters:
  • username – Your username for USGS EarthExplorer. If None is passed and the RSGIS_USGS_USER environment variable is set, the username is read from there. (Default: None)

  • password – Your password for USGS EarthExplorer. If None is passed and the RSGIS_USGS_PASS environment variable is set, the password is read from there. (Default: None)

Returns:

the API key for the USGS session.
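
The environment-variable fallback described for username and password can be pictured with the sketch below. It only illustrates the documented behaviour and is not the library's own code; resolve_usgs_credentials is a hypothetical helper name, and raising ValueError when a value is missing is an assumption.

```python
import os


def resolve_usgs_credentials(username=None, password=None):
    """Hypothetical helper: fall back to the documented environment variables."""
    if username is None:
        username = os.environ.get("RSGIS_USGS_USER")
    if password is None:
        password = os.environ.get("RSGIS_USGS_PASS")
    if username is None or password is None:
        # Assumption: treat a missing value as an error rather than prompting.
        raise ValueError("USGS username and password are both required")
    return username, password
```

Explicit arguments take priority, so the environment variables are only consulted for values left as None.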

rsgislib.dataaccess.usgs_m2m.usgs_logout(api_key: str)

Log out of the USGS m2m system using the api_key created at login.

Parameters:

api_key – The API key created at login to authenticate.
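
Because every login should be paired with a logout, one option is to wrap the two calls in a context manager. The sketch below is an illustration only: usgs_session is a hypothetical helper, and the login and logout callables are passed in so that the block does not depend on RSGISLib being installed.

```python
import contextlib


@contextlib.contextmanager
def usgs_session(login, logout, username=None, password=None):
    """Hypothetical helper: yield an API key and always log out afterwards."""
    api_key = login(username=username, password=password)
    try:
        yield api_key
    finally:
        # Runs even if the body of the with-block raises an exception.
        logout(api_key)
```

With RSGISLib available, the callables would presumably be usgs_m2m.usgs_login and usgs_m2m.usgs_logout, used as `with usgs_session(usgs_m2m.usgs_login, usgs_m2m.usgs_logout) as api_key:`.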

rsgislib.dataaccess.usgs_m2m.can_user_dwnld(api_key: str) bool

Checks whether the user logged in with the api_key has permission to download data.

Parameters:

api_key – The API key created at login to authenticate.

Returns:

boolean - True if the user has permission.

rsgislib.dataaccess.usgs_m2m.can_user_order(api_key: str) bool

Checks whether the user logged in with the api_key has permission to order data.

Parameters:

api_key – The API key created at login to authenticate.

Returns:

boolean - True if the user has permission.

rsgislib.dataaccess.usgs_m2m.get_wrs_pt(api_key: str, row: int, path: int, grid_version: int = 2) -> (<class 'float'>, <class 'float'>)

Get a point for the WRS row/path which can be used for a query.

Parameters:
  • api_key – The API key created at login to authenticate.

  • row – integer for row

  • path – integer for path

  • grid_version – Whether the row/path is WRS1 or WRS2. Default: WRS2.

Returns:

longitude, latitude

rsgislib.dataaccess.usgs_m2m.get_wrs_bbox(api_key: str, row: int, path: int, grid_version: int = 2) -> (<class 'float'>, <class 'float'>, <class 'float'>, <class 'float'>)

Get a bbox for the WRS row/path which can be used for a query.

Parameters:
  • api_key – The API key created at login to authenticate.

  • row – integer for row

  • path – integer for path

  • grid_version – Whether the row/path is WRS1 or WRS2. Default: WRS2.

Returns:

BBOX in lon/lat (x_min, x_max, y_min, y_max)
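
The bounding box is ordered (x_min, x_max, y_min, y_max), which differs from the (x_min, y_min, x_max, y_max) order that some other tools expect. The sketch below shows two small conversions; both function names are hypothetical and neither is part of RSGISLib.

```python
def to_min_max_pairs(bbox):
    """Reorder (x_min, x_max, y_min, y_max) to (x_min, y_min, x_max, y_max)."""
    x_min, x_max, y_min, y_max = bbox
    return (x_min, y_min, x_max, y_max)


def bbox_centre(bbox):
    """Return the (x, y) centre of a (x_min, x_max, y_min, y_max) bbox."""
    x_min, x_max, y_min, y_max = bbox
    return ((x_min + x_max) / 2.0, (y_min + y_max) / 2.0)
```

Unpacking by name, rather than indexing, makes it harder to mix up the two orderings.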

A function to search for Landsat imagery from the USGS.

Parameters:
  • dataset – The name of the dataset to query.

  • api_key – The API key created at login to authenticate.

  • start_date – Start date as a datetime object. (Earlier date)

  • end_date – End date as a datetime object. (Later date)

  • cloud_min – Minimum cloud cover (Default: 0)

  • cloud_max – Maximum cloud cover.

  • bbox – (MinX, MaxX, MinY, MaxY)

  • pt – (X, Y)

  • poly_geom – NOT IMPLEMENTED YET!

  • months – List of months as ints (1-12) you want to limit the search for.

  • full_meta – Full metadata returned (Default: False)

  • max_n_rslts – The maximum number of scenes to be returned. This cannot be larger than 100; if you need more than 100 scenes, use the get_all_usgs_search function.

  • start_n – The scene number to start the data retrieval from. Note that you probably do not want to set this parameter directly; use the get_all_usgs_search function instead.

Returns:

List of scenes found and Dict of meta-data for the number of scenes available.
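
The parameter list above can be assembled and sanity-checked before a search is submitted. The sketch below builds a keyword-argument dictionary using only the parameter names listed above; build_search_kwargs is a hypothetical helper, the checks are assumptions drawn from the parameter descriptions, and the order of positional arguments in the real function is not shown here, so keywords are used throughout.

```python
import datetime


def build_search_kwargs(dataset, api_key, start_date, end_date,
                        cloud_max=None, bbox=None, months=None,
                        max_n_rslts=100):
    """Hypothetical helper: validate and collect search arguments."""
    if start_date > end_date:
        raise ValueError("start_date must be the earlier date")
    if months is not None and any(m < 1 or m > 12 for m in months):
        raise ValueError("months must be integers between 1 and 12")
    if max_n_rslts > 100:
        raise ValueError("max_n_rslts cannot exceed 100 for a single search")
    kwargs = {
        "dataset": dataset,
        "api_key": api_key,
        "start_date": start_date,
        "end_date": end_date,
        "max_n_rslts": max_n_rslts,
    }
    # Only pass optional filters that were actually provided.
    optional = {"cloud_max": cloud_max, "bbox": bbox, "months": months}
    kwargs.update({k: v for k, v in optional.items() if v is not None})
    return kwargs
```

The resulting dictionary could then be unpacked into the search call with `**kwargs`, assuming the function accepts these names as keywords.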

Uses the usgs_search function to retrieve multiple ‘pages’ of search results. If you need more than 100 scenes, use this function to run the multiple queries required and merge the results into a single list.

Parameters:
  • dataset – The name of the dataset to query.

  • api_key – The API key created at login to authenticate.

  • max_n_rslts – The maximum number of scenes you want returned.

  • start_date – Start date as a datetime object. (Earlier date)

  • end_date – End date as a datetime object. (Later date)

  • cloud_min – Minimum cloud cover (Default: 0)

  • cloud_max – Maximum cloud cover.

  • bbox – (MinX, MaxX, MinY, MaxY)

  • pt – (X, Y)

  • poly_geom – NOT IMPLEMENTED YET!

  • months – List of months as ints (1-12) you want to limit the search for.

  • full_meta – Full metadata returned (Default: False)

Returns:

List of scenes found through the query.
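
The paging idea can be sketched independently of the USGS service. In the block below, search_page is a hypothetical callable taking a start number and a page size and returning a list of scenes; starting the numbering at 1 is an assumption, not something stated above. This is an illustration of the approach, not the library's implementation.

```python
def collect_pages(search_page, max_total, page_size=100, first_start=1):
    """Hypothetical sketch: merge successive pages into a single list."""
    scenes = []
    start = first_start  # Assumption: scene numbering starts at 1.
    while len(scenes) < max_total:
        # Never ask for more than the page limit or more than is still needed.
        want = min(page_size, max_total - len(scenes))
        page = search_page(start, want)
        scenes.extend(page)
        if len(page) < want:
            break  # A short (or empty) page means there are no more results.
        start += len(page)
    return scenes
```

The loop stops either when the requested total has been collected or when the service returns fewer scenes than were asked for.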

rsgislib.dataaccess.usgs_m2m.get_download_ids(scns, bulk=False)

A function for extracting a list of display IDs and a list of entity IDs from a list of scenes, as returned by a search query.

Parameters:
  • scns – a list of the scenes

  • bulk – If True then only scenes available for bulk download will be returned.

Returns:

List of display IDs, List of Entity IDs

rsgislib.dataaccess.usgs_m2m.create_scene_list(api_key: str, dataset: str, scn_ent_ids: List[str], lst_name: str, lst_period: str = 'P1W') int

A function which creates a list of scenes on the system which could be downloaded.

ISO 8601 duration format: P(n)Y(n)M(n)DT(n)H(n)M(n)S

Where:
  • P is the duration designator (referred to as “period”), and is always placed at the beginning of the duration.

  • Y is the year designator that follows the value for the number of years.

  • M is the month designator that follows the value for the number of months.

  • W is the week designator that follows the value for the number of weeks.

  • D is the day designator that follows the value for the number of days.

  • T is the time designator that precedes the time components.

  • H is the hour designator that follows the value for the number of hours.

  • M is the minute designator that follows the value for the number of minutes.

  • S is the second designator that follows the value for the number of seconds.

For example: “P3Y6M4DT12H30M5S” = A duration of three years, six months, four days, twelve hours, thirty minutes, and five seconds.

Parameters:
  • api_key – The API key created at login to authenticate user.

  • dataset – name of the dataset

  • scn_ent_ids – list of entity IDs

  • lst_name – a name for the list - can be anything you want but should be meaningful to you.

  • lst_period – Period the list will exist for in ISO 8601 duration format. Default is P1W (i.e., 1 week).

Returns:

Number of scenes added.
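
A value for lst_period can be assembled from its components rather than typed by hand. The sketch below is a small, hypothetical helper (not part of RSGISLib) that follows the designator rules described above; returning "PT0S" for an all-zero duration is an assumption.

```python
def iso8601_duration(years=0, months=0, weeks=0, days=0,
                     hours=0, minutes=0, seconds=0):
    """Hypothetical helper: build an ISO 8601 duration string."""
    values = (years, months, weeks, days, hours, minutes, seconds)
    if any(v < 0 for v in values):
        raise ValueError("duration components must not be negative")
    date_fields = ((years, "Y"), (months, "M"), (weeks, "W"), (days, "D"))
    time_fields = ((hours, "H"), (minutes, "M"), (seconds, "S"))
    # Zero-valued components are omitted from the string.
    date_part = "".join(f"{v}{d}" for v, d in date_fields if v)
    time_part = "".join(f"{v}{d}" for v, d in time_fields if v)
    if not date_part and not time_part:
        return "PT0S"  # Assumption: represent an empty duration as zero seconds.
    # The T designator only appears when there is a time component.
    return "P" + date_part + ("T" + time_part if time_part else "")
```

For instance, `iso8601_duration(weeks=1)` gives the default "P1W", and the worked example above corresponds to `iso8601_duration(years=3, months=6, days=4, hours=12, minutes=30, seconds=5)`.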

rsgislib.dataaccess.usgs_m2m.remove_scene_list(api_key: str, lst_name: str)

A function to remove a scene list from the system.

Parameters:
  • api_key – The API key created at login to authenticate user.

  • lst_name – a name for the list. Defined by create_scene_list.

rsgislib.dataaccess.usgs_m2m.check_dwnld_opts(api_key: str, lst_name: str, dataset: str, dwnld_filetype: str = 'bundle', rm_lst: bool = True) List[Dict[str, str]]
Parameters:
  • api_key – The API key created at login to authenticate user.

  • lst_name – A name for the list - Defined by create_scene_list.

  • dataset – name of the dataset

  • dwnld_filetype – What you want to download. Options: bundle, band or all. Default: bundle, which will be a tar.gz with all the files for the scene.

  • rm_lst – bool specifying whether the list should be deleted once the processing has finished.

Returns:

A list of dicts with the entityId and productId.
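
The functions get_download_ids, create_scene_list and check_dwnld_opts are naturally chained. The sketch below shows one possible ordering using only the signatures documented above; list_download_options is a hypothetical helper, and the module is passed in as the m2m argument (expected to be rsgislib.dataaccess.usgs_m2m) so the example does not require RSGISLib to be importable.

```python
def list_download_options(m2m, api_key, dataset, scns, lst_name,
                          lst_period="P1W", dwnld_filetype="bundle"):
    """Hypothetical sketch: scenes -> scene list -> download options."""
    _display_ids, entity_ids = m2m.get_download_ids(scns, bulk=False)
    if not entity_ids:
        return []  # Nothing to add, so no list is created.
    m2m.create_scene_list(api_key, dataset, entity_ids, lst_name, lst_period)
    # rm_lst=True asks for the scene list to be deleted once processing ends.
    return m2m.check_dwnld_opts(
        api_key, lst_name, dataset, dwnld_filetype, rm_lst=True
    )
```

Each returned dict carries an entityId and a productId, as described in the return value above.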

NASA Common Metadata Repository

rsgislib.dataaccess.nasa_cmr.get_prods_info(prod_short_name: str) List[Dict]

A function which returns information for a product available from the CMR.

Available products can be found here: https://earthdata.nasa.gov/eosdis/science-system-description/eosdis-standard-products

Parameters:

prod_short_name – The name of the product you are interested in.

Returns:

A list of products (probably different versions).

rsgislib.dataaccess.nasa_cmr.check_prod_version_avail(prod_short_name: str, version: str) bool

A function which checks if a version is available.

Parameters:
  • prod_short_name – the product short name for the product of interest.

  • version – the version of the product to be retrieved.

Returns:

Boolean specifying whether the version is available.

rsgislib.dataaccess.nasa_cmr.get_max_prod_version(prod_short_name: str) str

A function which attempts to find the highest (latest) version for a product.

Parameters:

prod_short_name – the product short name for the product of interest.

Returns:

string representation of the highest version.

rsgislib.dataaccess.nasa_cmr.find_granules(prod_short_name: str, version: str, only_dnwld: bool = True, bbox: List[float] = None, pt: List[float] = None, start_date: datetime = None, end_date: datetime = None, cloud_min: int = 0, cloud_max: int = None, sort_date: bool = True, sort_desc: bool = True, page_size: int = 100, page_num: int = 1, other_params: Dict[str, str] = None) List[Dict]

A function which will find granules from the CMR system for the product of interest using the search parameters provided.

https://cmr.earthdata.nasa.gov/search/site/docs/search/api.html#granule-search-by-parameters

Parameters:
  • prod_short_name – the product short name for the product of interest.

  • version – the version of the product to be retrieved.

  • only_dnwld – If true (default)

  • bbox – (MinX, MaxX, MinY, MaxY)

  • pt – (X, Y)

  • start_date – Start date as a datetime object. (Earlier date)

  • end_date – End date as a datetime object. (Later date)

  • cloud_min – Minimum cloud cover (Default: 0)

  • cloud_max – Maximum cloud cover.

  • sort_date – Sort the response by the acquisition date

  • sort_desc – Sort order (ascending or descending). Ascending: oldest version. Descending: newest version.

  • page_size – The number of records to be returned by a single query as a ‘page’.

  • page_num – The page number to be retrieved allowing results greater than the number which will fit on a single page to be retrieved.

  • other_params – A dict of other parameters where the key is the search parameter name and the value is the value to search with.

Returns:

A list of dictionaries, with one dictionary for each item.
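
The bbox, start_date and end_date arguments are often derived from a single location and date of interest. The sketch below shows one way to prepare them; both helper names are hypothetical, and the bbox follows the (MinX, MaxX, MinY, MaxY) order given in the parameter list.

```python
import datetime


def bbox_around_point(lon, lat, half_width_deg):
    """Hypothetical helper: (MinX, MaxX, MinY, MaxY) box centred on a point."""
    return [lon - half_width_deg, lon + half_width_deg,
            lat - half_width_deg, lat + half_width_deg]


def date_window(centre, days_either_side):
    """Hypothetical helper: (earlier, later) datetimes around a centre date."""
    delta = datetime.timedelta(days=days_either_side)
    return centre - delta, centre + delta
```

The outputs could then be passed as the bbox, start_date and end_date keyword arguments of find_granules.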

rsgislib.dataaccess.nasa_cmr.find_all_granules(prod_short_name: str, version: str, only_dnwld: bool = True, bbox: List[float] = None, pt: List[float] = None, start_date: datetime = None, end_date: datetime = None, cloud_min: int = 0, cloud_max: int = None, sort_date: bool = True, sort_desc: bool = True, page_size: int = 100, max_n_pages: int = 100, other_params: Dict[str, str] = None) List[Dict]

A function which will find granules from the CMR system for the product of interest using the search parameters provided using the find_granules function but iterates through all the pages available to return all the available granules rather than just a single page.

Parameters:
  • prod_short_name – the product short name for the product of interest.

  • version – the version of the product to be retrieved.

  • only_dnwld – If true (default)

  • bbox – (MinX, MaxX, MinY, MaxY)

  • pt – (X, Y)

  • start_date – Start date as a datetime object. (Earlier date)

  • end_date – End date as a datetime object. (Later date)

  • cloud_min – Minimum cloud cover (Default: 0)

  • cloud_max – Maximum cloud cover.

  • sort_date – Sort the response by the acquisition date

  • sort_desc – Sort order (ascending or descending). Ascending: oldest version. Descending: newest version.

  • page_size – The number of records to be returned by a single query as a ‘page’. (Default: 100)

  • max_n_pages – the maximum number of pages returned (Default: 100)

  • other_params – A dict of other parameters where the key is the search parameter name and the value is the value to search with.

Returns:

A list of dictionaries, with one dictionary for each item.
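
Iterating over pages can be sketched without contacting the CMR. In the block below, fetch_page is a hypothetical callable taking a page number and a page size and returning a list of granules; in practice it might wrap find_granules with its page_num and page_size arguments. This illustrates the idea and is not the library's implementation.

```python
def iterate_pages(fetch_page, page_size=100, max_n_pages=100):
    """Hypothetical sketch: gather granules page by page."""
    granules = []
    for page_num in range(1, max_n_pages + 1):
        page = fetch_page(page_num, page_size)
        granules.extend(page)
        if len(page) < page_size:
            break  # A short (or empty) page means the results are exhausted.
    return granules
```

The max_n_pages limit bounds the number of requests, so at most page_size × max_n_pages granules are returned.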

rsgislib.dataaccess.nasa_cmr.get_total_file_size(granule_lst: List[Dict]) float

A function which uses the list of granules to sum the total file size of the granules in the list. The file size units are whatever has been used for the product, but this usually seems to be megabytes (MB).

Parameters:

granule_lst – List of granules from find_granules or find_all_granules

Returns:

float for the total file size.
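
The summation can be sketched as below. The structure of a granule dictionary is not described in this section, so the "granule_size" key is an assumption (made configurable through size_key), and sum_granule_sizes is a hypothetical name rather than the library's own code.

```python
def sum_granule_sizes(granule_lst, size_key="granule_size"):
    """Hypothetical sketch: total the size field across granule dicts."""
    total = 0.0
    for granule in granule_lst:
        size = granule.get(size_key)
        if size is None:
            continue  # Assumption: granules without a size are skipped.
        total += float(size)  # Sizes may be stored as strings or numbers.
    return total
```

If the sizes are in megabytes, dividing the total by 1024 gives an approximate figure in gigabytes.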

rsgislib.dataaccess.nasa_cmr.cmr_download_file_http(input_url: str, out_file_path: str, username: str, password: str, no_except: bool = True) bool
Parameters:
  • input_url – The input remote URL to be downloaded.

  • out_file_path – the local file path and file name

  • username – the username for the server

  • password – the password for the server

Returns:

boolean as to whether the file was successfully downloaded or not.
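
When several files are downloaded in turn, the boolean return value can be used to collect failures for a later retry. The sketch below is an illustration under stated assumptions: both helper names are hypothetical, the output file name is taken from the URL path, and the download callable is passed in (it would be cmr_download_file_http in practice) so the block runs without network access.

```python
import os
from urllib.parse import urlparse


def out_path_for_url(input_url, out_dir):
    """Hypothetical helper: local path named after the remote file."""
    file_name = os.path.basename(urlparse(input_url).path)
    if not file_name:
        raise ValueError("URL does not end in a file name: " + input_url)
    return os.path.join(out_dir, file_name)


def download_all(download, urls, out_dir, username, password):
    """Hypothetical helper: return the URLs that failed to download."""
    failed = []
    for url in urls:
        ok = download(url, out_path_for_url(url, out_dir), username, password)
        if not ok:
            failed.append(url)
    return failed
```

Using urlparse drops any query string, so a URL ending in `granule_001.h5?token=abc` is saved as `granule_001.h5`.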