RSGISLib Timeseries Analysis Module
Model Fitting
The following functions convert a stack of time-series raster images into a single output image containing per-band season-trend model coefficients, RMSE, and an overall value per band. The outputs and model fitting are based on the following paper:
Zhu, Z.; Woodcock, C.E.; Holden, C.; Yang, Z. Generating synthetic Landsat images based on all available Landsat data: Predicting Landsat surface reflectance at any given time. Remote Sensing of Environment 2015, 162, 67–83. doi:10.1016/j.rse.2015.02.009.
Models are fitted over the entire provided time series, i.e. the script does not look for breaks/changes.
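To make the idea of a season-trend model concrete, the following is a minimal sketch, not RSGISLib's implementation. It fits an intercept, a linear trend and a single annual harmonic to one pixel's time series using ordinary least squares in NumPy. The 365.25-day period, the single harmonic, the function names and the synthetic data are all assumptions made for illustration; the library's own models may use more harmonic terms and Lasso rather than OLS.

```python
import numpy as np

DAYS_PER_YEAR = 365.25  # assumed period of the seasonal cycle


def design_matrix(days):
    """Build the columns [1, t, cos(wt), sin(wt)] for times given in days."""
    t = np.asarray(days, dtype=float)
    w = 2.0 * np.pi / DAYS_PER_YEAR
    return np.column_stack([np.ones_like(t), t, np.cos(w * t), np.sin(w * t)])


def fit_season_trend(days, values):
    """Ordinary least-squares fit; returns (coefficients, rmse)."""
    X = design_matrix(days)
    y = np.asarray(values, dtype=float)
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    rmse = float(np.sqrt(np.mean((y - X @ coeffs) ** 2)))
    return coeffs, rmse


# Synthetic single-pixel series (hypothetical values): one observation
# every 16 days for three years, generated from known coefficients.
days = np.arange(0, 3 * 365, 16)
true_coeffs = np.array([1000.0, 0.05, 200.0, -150.0])
values = design_matrix(days) @ true_coeffs

coeffs, rmse = fit_season_trend(days, values)
```

Because the synthetic series contains no noise, the fit should recover the generating coefficients with an RMSE close to zero. With real imagery the RMSE indicates how well the model describes each pixel.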
The input is a JSON file containing date: filepath pairs as strings, e.g.:
{
"YYYY-MM-DD": "/path/to/image/file/1.tif",
"YYYY-MM-DD": "/path/to/image/file/2.tif",
"YYYY-MM-DD": "/path/to/image/file/3.tif"
}
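A file in this layout can be produced with Python's standard json module. The sketch below is one possible approach: the helper name build_date_mapping and the dates are hypothetical, and the paths are the placeholders from the example above. It checks that each key is a valid YYYY-MM-DD date and rejects repeated dates, since a JSON parser would otherwise keep only one entry per key.

```python
import json
from datetime import datetime


def build_date_mapping(pairs):
    """Return a {"YYYY-MM-DD": path} dict, rejecting bad or repeated dates."""
    mapping = {}
    for date_str, path in pairs:
        # Raises ValueError if the string is not a valid calendar date.
        parsed = datetime.strptime(date_str, "%Y-%m-%d")
        key = parsed.strftime("%Y-%m-%d")
        if key in mapping:
            raise ValueError(f"duplicate date: {key}")
        mapping[key] = str(path)
    return mapping


# Hypothetical acquisition dates paired with placeholder image paths.
pairs = [
    ("2018-01-05", "/path/to/image/file/1.tif"),
    ("2018-02-06", "/path/to/image/file/2.tif"),
    ("2018-03-10", "/path/to/image/file/3.tif"),
]

mapping = build_date_mapping(pairs)
json_text = json.dumps(mapping, indent=4)
```

The resulting json_text can then be written to a file such as example.json and passed as the JSON input.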
To fit the model use the following function:
rsgislib.timeseries.modelfitting.get_ST_model_coeffs('example.json', 'coeffs.kea', bands=[3,4,5,6,7], num_processes=4)
The output image can then be used directly (e.g., for classification) or used to predict an output image for a particular date:
rsgislib.timeseries.modelfitting.predict_for_date('2019-01-15', 'coeffs.kea', 'predicted.kea')
- rsgislib.timeseries.modelfitting.get_ST_model_coeffs(json_fp, output_fp, gdalformat='KEA', bands=None, num_processes=1, model_type='Lasso', alpha=20, cv=False)
Main function to run to generate the output image. Given an input JSON file and an output file path, generates a multi-band output image where each pixel contains the model details for that pixel. Opening/closing of files, generation of blocks and use of multiprocessing are all handled by RIOS. The no data value should be defined in the image headers and be the same across all the images.
- Parameters:
json_fp – Path to JSON file of date/filepath pairs.
output_fp – Path for output file.
gdalformat – Short driver name for GDAL, e.g. KEA, GTiff.
bands – List of GDAL band numbers to use in the analysis, e.g. [2, 5, 7].
num_processes – Number of concurrent processes to use.
model_type – Either ‘Lasso’ or ‘OLS’. The type of model fitting to use. OLS will be faster, but more likely to overfit. Both types will adjust the number of model coefficients depending on the number of observations.
alpha – If using Lasso fitting, the alpha value controls the degree of penalization of the coefficients. The lower the value, the closer the model will fit the data. For surface reflectance, a value of around 20 (the default) is usually OK.
cv – If using Lasso fitting, you can use cross validation to choose the value of alpha by setting cv=True. However, this is not recommended and will substantially increase run time.
- rsgislib.timeseries.modelfitting.predict_for_date(date, input_path, output_path, gdalformat='KEA', num_processes=1)
Main function to generate the predicted image. Given an input image containing per-band model coefficients, outputs a multi-band predicted image over the same area. Opening/closing of files, generation of blocks and use of multiprocessing is all handled by RIOS.
- Parameters:
date – The date to predict in YYYY-MM-DD format.
input_path – Path to the input image generated by get_ST_model_coeffs.
output_path – Path for the output image.
gdalformat – Short driver name for GDAL, e.g. KEA, GTiff.
num_processes – Number of concurrent processes to use.
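Conceptually, predicting for a date means evaluating the fitted model at that point in time. The sketch below illustrates this for a single pixel and band using only the standard library. The coefficient layout (intercept, slope, cosine amplitude, sine amplitude), the use of the date's ordinal as the time axis and the 365.25-day period are assumptions for illustration and may not match how the coefficient image is actually organised.

```python
import math
from datetime import date

DAYS_PER_YEAR = 365.25  # assumed period of the seasonal cycle


def predict_value(date_str, coeffs):
    """Evaluate intercept + slope*t + a*cos(wt) + b*sin(wt) at a date.

    t is the proleptic Gregorian ordinal of the date (an assumed time axis).
    coeffs is a hypothetical (intercept, slope, cos_amp, sin_amp) tuple.
    """
    year, month, day = (int(part) for part in date_str.split("-"))
    t = date(year, month, day).toordinal()
    w = 2.0 * math.pi / DAYS_PER_YEAR
    intercept, slope, cos_amp, sin_amp = coeffs
    seasonal = cos_amp * math.cos(w * t) + sin_amp * math.sin(w * t)
    return intercept + slope * t + seasonal


# Hypothetical coefficients for one pixel in one band.
example_coeffs = (1000.0, 0.05, 200.0, -150.0)
predicted = predict_value("2019-01-15", example_coeffs)
```

predict_for_date performs the equivalent calculation for every pixel and band of the coefficient image, so a function like this is only useful for checking individual values by hand.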