RSGISLib Web Tools
- rsgislib.tools.webtools.find_web_imgs(base_url: str, file_ext: str = 'jpg') → List[str]
Extracts the URLs of all images with a specified file extension on a given web page. Only images specified using the <img> tag are returned, and only those on the page itself: links to other pages are not followed. Image sources which do not have the specified file extension are ignored.
- Parameters:
base_url – A string representing the URL of the website to search for images.
file_ext – A string representing the file extension of the images to search for. Defaults to “jpg”.
- Returns:
A list of strings containing the URLs of images found on the website with the specified file extension.
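The behaviour described above can be illustrated with a minimal standard-library sketch. This is not the RSGISLib implementation: the helper name extract_img_urls, the parser class, the sample HTML and the case-insensitive extension match are all assumptions made for illustration, and the sketch works on an HTML string rather than fetching a page.

```python
from html.parser import HTMLParser
from typing import List
from urllib.parse import urljoin, urlparse


class _ImgSrcParser(HTMLParser):
    """Collects the src attribute of every <img> tag (hypothetical helper)."""

    def __init__(self) -> None:
        super().__init__()
        self.srcs: List[str] = []

    def handle_starttag(self, tag, attrs):
        # Only <img> tags are considered; <a> links are deliberately ignored.
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.srcs.append(src)


def extract_img_urls(html: str, base_url: str, file_ext: str = "jpg") -> List[str]:
    """Return absolute URLs of <img> sources ending in file_ext (illustrative only)."""
    parser = _ImgSrcParser()
    parser.feed(html)
    # Assumption of this sketch: the extension is matched case-insensitively.
    suffix = "." + file_ext.lower().lstrip(".")
    urls = []
    for src in parser.srcs:
        full_url = urljoin(base_url, src)  # resolve relative paths against the page URL
        if urlparse(full_url).path.lower().endswith(suffix):
            urls.append(full_url)
    return urls


sample_html = """
<html><body>
  <img src="photos/a.jpg">
  <img src="/static/logo.png">
  <a href="gallery/b.jpg">linked, not embedded</a>
</body></html>
"""
img_urls = extract_img_urls(sample_html, "https://example.com/page/")
print(img_urls)
```

In this sample only photos/a.jpg is reported: logo.png has a different extension, and gallery/b.jpg is referenced through an <a> tag rather than an <img> tag.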
- rsgislib.tools.webtools.find_web_files(base_url: str, file_ext: str = 'pdf') → List[str]
Gets a list of files which are linked (using the <a> tag) on the web page and have the specified file extension. Only files linked on the page itself are returned: links to other pages are not followed. Links which do not have the specified file extension are ignored.
- Parameters:
base_url – The base URL of the website to search for files.
file_ext – The file extension to filter for (default is “pdf”).
- Returns:
A list of URLs for files with the specified extension found on the website.
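As a rough illustration of the link filtering described above, the following standard-library sketch collects <a> targets from an HTML string and keeps those with the requested extension. It is not the RSGISLib implementation: extract_linked_files, the parser class and the sample HTML are hypothetical, and how the library treats letter case or relative links is an assumption here.

```python
from html.parser import HTMLParser
from typing import List
from urllib.parse import urljoin, urlparse


class _LinkParser(HTMLParser):
    """Collects the href attribute of every <a> tag (hypothetical helper)."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: List[str] = []

    def handle_starttag(self, tag, attrs):
        # Only <a> tags are considered; embedded resources such as <img> are skipped.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def extract_linked_files(html: str, base_url: str, file_ext: str = "pdf") -> List[str]:
    """Return absolute URLs of <a> links ending in file_ext (illustrative only)."""
    parser = _LinkParser()
    parser.feed(html)
    # Assumption of this sketch: the extension is matched case-insensitively.
    suffix = "." + file_ext.lower().lstrip(".")
    # Links are only listed, never fetched, so other pages are not followed.
    resolved = (urljoin(base_url, href) for href in parser.hrefs)
    return [url for url in resolved if urlparse(url).path.lower().endswith(suffix)]


sample_html = """
<html><body>
  <a href="docs/report.pdf">Report</a>
  <a href="about.html">About</a>
  <img src="scan.pdf">
</body></html>
"""
file_urls = extract_linked_files(sample_html, "https://example.com/data/")
print(file_urls)
```

Here about.html is dropped because its extension does not match, and scan.pdf is dropped because it is not referenced through an <a> tag.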