RSGISLib Classification Module

The classification module provides classification functionality within RSGISLib.

Image Pixel Classification

class rsgislib.classification.classimgutils.ClassInfoObj(id=None, fileH5=None, red=None, green=None, blue=None)

This is a class to store the information associated with a class within the classification.

  • id - Output pixel value for this class
  • fileH5 - hdf5 file (from rsgislib.imageutils.extractZoneImageBandValues2HDF) with the training data for the class
  • red - Red colour for visualisation (0-255)
  • green - Green colour for visualisation (0-255)
  • blue - Blue colour for visualisation (0-255)
class rsgislib.classification.classimgutils.SamplesInfoObj(className=None, classID=None, maskImg=None, maskPxlVal=None, outSampImgFile=None, numSamps=None, samplesH5File=None, red=None, green=None, blue=None)

This is a class to store the information used to sample training data for a class within the classification.

  • className - The name of the class
  • classID - Is the classification numeric ID (i.e., output pixel value)
  • maskImg - The input image mask from which samples are taken
  • maskPxlVal - The pixel value within the mask for the class
  • outSampImgFile - Temporary file which will store the sampled pixels.
  • numSamps - The number of samples required.
  • samplesH5File - File location for the HDF5 file with the input image values for training.
  • red - Red colour for visualisation (0-255)
  • green - Green colour for visualisation (0-255)
  • blue - Blue colour for visualisation (0-255)
rsgislib.classification.classimgutils.applyClassifer(classTrainInfo, skClassifier, imgMask, imgMaskVal, imgFileInfo, outputImg, gdalFormat, classClrNames=True)

This function uses a trained classifier and applies it to the provided input image.

  • classTrainInfo - dict (where the key is the class name) of ClassInfoObj objects, as used to train the classifier (i.e., trainClassifier()), which provide the output pixel value (id) and RGB colour for each class.
  • skClassifier - a trained instance of a scikit-learn classifier (e.g., use trainClassifier or findClassifierParametersAndTrain)
  • imgMask - is an image file providing a mask specifying the region to be classified. The simplest mask is all the valid data regions (rsgislib.imageutils.genValidMask)
  • imgMaskVal - the pixel value within the imgMask to limit the region to which the classification is applied. Can be used to create a hierarchical classification.
  • imgFileInfo - a list of rsgislib.imageutils.ImageBandInfo objects (also used within rsgislib.imageutils.extractZoneImageBandValues2HDF) to identify which images and bands are to be used for the classification so it adheres to the training data.
  • outputImg - output image file with the classification. Note that by default a colour table and class names column are added to the image. If an error is produced use HFA or KEA formats.
  • gdalFormat - is the output image format; any GDAL-supported format can be used.
  • classClrNames - default is True and therefore a colour table with the colours specified in classTrainInfo and a ClassName column (from imgFileInfo) will be added to the output file.
rsgislib.classification.classimgutils.findClassifierParametersAndTrain(classTrainInfo, paramSearchSampNum=0, gridSearch=GridSearchCV(cv=None, error_score='raise', estimator=RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini', max_depth=None, max_features='auto', max_leaf_nodes=None, min_impurity_split=1e-07, min_samples_leaf=1, min_samples_split=2, min_weight_fraction_leaf=0.0, n_estimators=10, n_jobs=1, oob_score=False, random_state=None, verbose=0, warm_start=False), fit_params={}, iid=True, n_jobs=1, param_grid={}, pre_dispatch='2*n_jobs', refit=True, return_train_score=True, scoring=None, verbose=0))

A function to find the optimal parameters for classification using a Grid Search (http://scikit-learn.org/stable/modules/grid_search.html). The returned classifier instance will be trained using the input data.

  • classTrainInfo - list of ClassInfoObj objects which will be used to train the classifier.
  • paramSearchSampNum - the number of samples randomly drawn from the training data for each class when applying the grid search (a small sample is typically used because the search can take a long time). A value of 500 would use 500 samples per class.
  • gridSearch - is an instance of sklearn.model_selection.GridSearchCV with an instance of the chosen classifier and the parameters to be searched.
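The training and application functions in this section have no worked example, so the following is a minimal sketch of how they might be chained together. It assumes that classTrainInfo is a dict keyed by class name (as described for applyClassifer), that the HDF5 training files have already been created with rsgislib.imageutils.extractZoneImageBandValues2HDF, and that the function name, file names, band list and parameter grid are all hypothetical placeholders.

```python
def classify_scene(input_img, valid_mask_img, output_img):
    """Sketch: grid-search an ExtraTrees classifier, then apply it to an image."""
    # Imports are local so the sketch can be loaded without the libraries installed.
    from sklearn.ensemble import ExtraTreesClassifier
    from sklearn.model_selection import GridSearchCV
    from rsgislib import imageutils
    from rsgislib.classification import classimgutils

    # Bands used for the classification; these must match the training data.
    img_file_info = [imageutils.ImageBandInfo(input_img, 'img', [1, 2, 3, 4])]

    # Hypothetical HDF5 files holding the training pixel values for each class.
    class_train_info = dict()
    class_train_info['Forest'] = classimgutils.ClassInfoObj(
        id=1, fileH5='./ForestTrain.h5', red=0, green=255, blue=0)
    class_train_info['Non-Forest'] = classimgutils.ClassInfoObj(
        id=2, fileH5='./NonForestTrain.h5', red=100, green=100, blue=100)

    # Search a small, illustrative parameter grid using 500 samples per class.
    grid_search = GridSearchCV(
        ExtraTreesClassifier(),
        {'n_estimators': [10, 50, 100], 'max_features': [2, 3, 4]})
    sk_classifier = classimgutils.findClassifierParametersAndTrain(
        class_train_info, paramSearchSampNum=500, gridSearch=grid_search)

    # Classify only the pixels where the mask image has the value 1.
    classimgutils.applyClassifer(
        class_train_info, sk_classifier, valid_mask_img, 1,
        img_file_info, output_img, 'KEA', classClrNames=True)
```

Passing a different imgMaskVal to applyClassifer would restrict the classification to another region of the mask, which is how a hierarchical classification could be built up in stages.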
rsgislib.classification.classimgutils.performPerPxlMLClassShpTrain(imageBandInfo=[], classInfo={}, outputImg='classImg.kea', gdalFormat='KEA', tmpPath='./tmp', skClassifier=RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini', max_depth=None, max_features='auto', max_leaf_nodes=None, min_impurity_split=1e-07, min_samples_leaf=1, min_samples_split=2, min_weight_fraction_leaf=0.0, n_estimators=10, n_jobs=1, oob_score=False, random_state=None, verbose=0, warm_start=False), gridSearch=None, paramSearchSampNum=100)

A function which performs a per-pixel based classification of a scene using a machine learning classifier from the scikit-learn library where a single polygon shapefile per class is required to represent the training data.

  • imageBandInfo is a list of rsgislib.imageutils.ImageBandInfo objects specifying the images which should be used.
  • classInfo is a dict of rsgislib.classification.classimgutils.ClassInfoObj objects where the key is the class name. The fileH5 field is used to define the file path to the shapefile with the training data.
  • outputImg is the name and path to the output image file.
  • gdalFormat is the output image file format (e.g., KEA).
  • tmpPath is a temporary file path which can be used during processing.
  • skClassifier is an instance of a scikit-learn classifier appropriately parameterised. If None then the gridSearch object must not be None.
  • gridSearch is an instance of a scikit-learn sklearn.model_selection.GridSearchCV object with the classifier and parameter search space specified. (If None then skClassifier will be used; if both not None then skClassifier will be used in preference to gridSearch)

Example:

from rsgislib.classification import classimgutils
from rsgislib import imageutils

from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import GridSearchCV

imageBandInfo=[imageutils.ImageBandInfo('./LS2MSS_19750620_lat10lon6493_r67p250_rad_srefdem_30m.kea', 'Landsat', [1,2,3,4])]
classInfo=dict()
classInfo['Forest'] = classimgutils.ClassInfoObj(id=1, fileH5='./ForestRegions.shp', red=0, green=255, blue=0)
classInfo['Non-Forest'] = classimgutils.ClassInfoObj(id=2, fileH5='./NonForestRegions.shp', red=100, green=100, blue=100)


skClassifier=ExtraTreesClassifier(n_estimators=20)
classimgutils.performPerPxlMLClassShpTrain(imageBandInfo, classInfo, outputImg='classImg.kea', gdalFormat='KEA', tmpPath='./tmp', skClassifier=skClassifier)
rsgislib.classification.classimgutils.performVotingClassification(skClassifiers, trainSamplesInfo, imgFileInfo, classAreaMask, classMaskPxlVal, tmpDIR, tmpImgBase, outClassImg, gdalFormat='KEA', numCores=-1)

A function which performs a number of classifications and combines them into a single classification by a simple vote. The classifier parameters can differ between votes because a list of classifiers is provided (the length of the list is equal to the number of votes), and the training data is resampled for each classifier. The analysis can be performed using multiple processing cores.

Where:

  • skClassifiers - a list of classifiers (from scikit-learn); the number of classifiers defined will be equal to the number of votes.
  • trainSamplesInfo - a list of rsgislib.classification.classimgutils.SamplesInfoObj objects used to parameterise the classifier and extract training data.
  • imgFileInfo - a list of rsgislib.imageutils.ImageBandInfo objects (also used within rsgislib.imageutils.extractZoneImageBandValues2HDF) to identify which images and bands are to be used for the classification so it adheres to the training data.
  • classAreaMask - a mask image which is used to specify the areas of the scene which are to be classified.
  • classMaskPxlVal - is the pixel value within the classAreaMask image for the areas of the image which are to be classified.
  • tmpDIR - a temporary file location which will be created and removed during processing.
  • tmpImgBase - the base name of the files written to tmpDIR.
  • outClassImg - the final output image file.
  • gdalFormat - the output file format for outClassImg
  • numCores - is the number of processing cores to be used for the analysis (if -1 then all cores on the machine will be used).

Example:

import os

import rsgislib.imageutils
from rsgislib.classification import classimgutils
from sklearn.ensemble import ExtraTreesClassifier

# imgTmp, img2010dB, imgSRTM, classTrainRegionsMask and classWithinMask are
# file paths which are expected to have been defined by the user.
classVoteTemp = os.path.join(imgTmp, 'ClassVoteTemp')

imgFileInfo = [rsgislib.imageutils.ImageBandInfo(img2010dB, 'sardb', [1,2]), rsgislib.imageutils.ImageBandInfo(imgSRTM, 'srtm', [1])]
trainSamplesInfo = []
trainSamplesInfo.append(classimgutils.SamplesInfoObj(className='Water', classID=1, maskImg=classTrainRegionsMask, maskPxlVal=1, outSampImgFile='WaterSamples.kea', numSamps=500, samplesH5File='WaterSamples_pxlvals.h5', red=0, green=0, blue=255))
trainSamplesInfo.append(classimgutils.SamplesInfoObj(className='Land', classID=2, maskImg=classTrainRegionsMask, maskPxlVal=2, outSampImgFile='LandSamples.kea', numSamps=500, samplesH5File='LandSamples_pxlvals.h5', red=150, green=150, blue=150))
trainSamplesInfo.append(classimgutils.SamplesInfoObj(className='Mangroves', classID=3, maskImg=classTrainRegionsMask, maskPxlVal=3, outSampImgFile='MangroveSamples.kea', numSamps=500, samplesH5File='MangroveSamples_pxlvals.h5', red=0, green=153, blue=0))

# Each classifier in the list contributes one vote (20 votes in total).
skClassifiers = []
for i in range(5):
    skClassifiers.append(ExtraTreesClassifier(n_estimators=50))

for i in range(5):
    skClassifiers.append(ExtraTreesClassifier(n_estimators=100))

for i in range(5):
    skClassifiers.append(ExtraTreesClassifier(n_estimators=50, max_depth=2))

for i in range(5):
    skClassifiers.append(ExtraTreesClassifier(n_estimators=100, max_depth=2))

mangroveRegionClassImg = 'MangroveRegionClass.kea'
classimgutils.performVotingClassification(skClassifiers, trainSamplesInfo, imgFileInfo, classWithinMask, 1, classVoteTemp, 'ClassImgSample', mangroveRegionClassImg, gdalFormat='KEA', numCores=-1)
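
To make the "simple vote" concrete, the following is a minimal sketch of how per-pixel outputs from several classifiers could be combined. It is an illustration only and not the RSGISLib implementation: the function name, the nested-list input and the tie handling (the first class to reach the top count wins) are all assumptions.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier class IDs for each pixel by a simple vote.

    predictions: list of equal-length lists, one list per classifier.
    Returns one class ID per pixel.
    """
    combined = []
    # zip(*predictions) groups the votes cast by every classifier for one pixel.
    for pixel_votes in zip(*predictions):
        # most_common(1) returns [(class_id, count)] for the top class.
        combined.append(Counter(pixel_votes).most_common(1)[0][0])
    return combined


votes = [
    [1, 2, 3, 3],  # classifier 1
    [1, 2, 2, 3],  # classifier 2
    [2, 2, 3, 3],  # classifier 3
]
print(majority_vote(votes))  # [1, 2, 3, 3]
```

Using an odd number of classifiers reduces the chance of ties in a two-class problem, although ties remain possible with three or more classes.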
rsgislib.classification.classimgutils.trainClassifier(classTrainInfo, skClassifier)

This function trains the classifier.

Raster GIS

rsgislib.classification.classratutils.balanceSampleTrainingRandom(clumpsImg, trainCol, outTrainCol, minNoSamples, maxNoSamples)

A function to balance the number of training samples for classification so the number is above a minimum threshold (minNoSamples) and all equal to the class with the smallest number of samples unless that is above a set maximum (maxNoSamples).

  • clumpsImg is a string with the file path to the input image with RAT
  • trainCol is a string for the name of the input column specifying the training samples (zero is no data)
  • outTrainCol is a string with the name of the output column for the balanced training samples.
  • minNoSamples is an int specifying the minimum number of training samples for a class (if below threshold class is removed).
  • maxNoSamples is an int specifying the maximum number of training samples per class.
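
The balancing rule described above can be sketched in plain Python as follows. This is one reading of the description rather than the library's code: the function name, the dict of clump IDs per class and the seed argument are hypothetical.

```python
import random


def balance_samples(samples_by_class, min_samples, max_samples, seed=None):
    """Return an equal-sized random subset of sample IDs for each retained class."""
    rng = random.Random(seed)
    # Classes with fewer than min_samples samples are removed.
    kept = {c: ids for c, ids in samples_by_class.items() if len(ids) >= min_samples}
    if not kept:
        return {}
    # The target is the size of the smallest retained class, capped at max_samples.
    target = min(min(len(ids) for ids in kept.values()), max_samples)
    return {c: sorted(rng.sample(ids, target)) for c, ids in kept.items()}


# Three classes with 40, 25 and 5 training clumps respectively.
clumps = {1: list(range(0, 40)), 2: list(range(40, 65)), 3: list(range(65, 70))}
balanced = balance_samples(clumps, min_samples=10, max_samples=20, seed=1)
print({c: len(ids) for c, ids in balanced.items()})  # {1: 20, 2: 20}
```

Here class 3 is dropped because it has fewer than 10 samples, and the remaining classes are reduced to 20 samples each because the smallest retained class (25) exceeds the maximum.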
rsgislib.classification.classratutils.classifyWithinRAT(clumpsImg, classesIntCol, classesNameCol, variables, classifier=RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini', max_depth=None, max_features=3, max_leaf_nodes=None, min_impurity_split=1e-07, min_samples_leaf=1, min_samples_split=2, min_weight_fraction_leaf=0.0, n_estimators=100, n_jobs=-1, oob_score=True, random_state=None, verbose=0, warm_start=False), outColInt='OutClass', outColStr='OutClassName', roiCol=None, roiVal=1, classColours=None, preProcessor=None, justFit=False)

A function which will perform a classification within the RAT using a classifier from scikit-learn

  • clumpsImg is the clumps image on which the classification is to be performed
  • classesIntCol is the column with the training data as int values
  • classesNameCol is the column with the training data as string class names
  • variables is an array of column names which are to be used for the classification
  • classifier is an instance of a scikit-learn classifier (e.g., RandomForests which is Default)
  • outColInt is the output column name for the int class representation (Default: ‘OutClass’)
  • outColStr is the output column name for the class names column (Default: ‘OutClassName’)
  • roiCol is a column name for a column which specifies the region to be classified. If None ignored (Default: None)
  • roiVal is an int value used within the roiCol to select a region to be classified (Default: 1)
  • classColours is a python dict using the class name as the key along with arrays of length 3 specifying the RGB colours for the class.
  • preProcessor is a scikit-learn preprocessor such as sklearn.preprocessing.MaxAbsScaler() which can rescale the input variables independently as they are read in (Default: None; i.e., not in use).
  • justFit is a boolean specifying that the classifier should just be fitted to the data and not applied (Default: False; i.e., apply classification)

Example:

from sklearn.ensemble import ExtraTreesClassifier
from rsgislib.classification import classratutils

classifier = ExtraTreesClassifier(n_estimators=100, max_features=3, n_jobs=-1, verbose=0)

classColours = dict()
classColours['Forest'] = [0,138,0]
classColours['NonForest'] = [200,200,200]

variables = ['GreenAvg', 'RedAvg', 'NIR1Avg', 'NIR2Avg', 'NDVI']
# clumpsImg, classesIntCol and classesNameCol are expected to have been defined by the user.
classratutils.classifyWithinRAT(clumpsImg, classesIntCol, classesNameCol, variables, classifier=classifier, classColours=classColours)

from sklearn.preprocessing import MaxAbsScaler

# With pre-processor
classratutils.classifyWithinRAT(clumpsImg, classesIntCol, classesNameCol, variables, classifier=classifier, classColours=classColours, preProcessor=MaxAbsScaler())
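
For readers unfamiliar with max-abs scaling, the sketch below shows the arithmetic that a preprocessor such as MaxAbsScaler applies: each variable (column) is divided by its own maximum absolute value. It is a plain-Python illustration with hypothetical names and data, not the scikit-learn implementation.

```python
def max_abs_scale(rows):
    """Scale each column of a list of rows by that column's maximum absolute value."""
    n_cols = len(rows[0])
    # Largest absolute value per column; fall back to 1.0 for all-zero columns.
    scales = [max(abs(r[j]) for r in rows) or 1.0 for j in range(n_cols)]
    return [[r[j] / scales[j] for j in range(n_cols)] for r in rows]


# Two hypothetical RAT columns with very different value ranges.
rat_rows = [
    [120.0, 0.25],
    [60.0, -0.5],
    [30.0, 0.125],
]
print(max_abs_scale(rat_rows))  # [[1.0, 0.5], [0.5, -1.0], [0.25, 0.25]]
```

After scaling, every variable lies within -1 to 1, so columns with large magnitudes no longer dominate classifiers that are sensitive to the scale of their inputs.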
rsgislib.classification.classratutils.classifyWithinRATTiled(clumpsImg, classesIntCol, classesNameCol, variables, classifier=RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini', max_depth=None, max_features=3, max_leaf_nodes=None, min_impurity_split=1e-07, min_samples_leaf=1, min_samples_split=2, min_weight_fraction_leaf=0.0, n_estimators=100, n_jobs=-1, oob_score=True, random_state=None, verbose=0, warm_start=False), outColInt='OutClass', outColStr='OutClassName', roiCol=None, roiVal=1, classColours=None, scaleVarsRange=False, justFit=False)

A function which will perform a classification within the RAT using a classifier from scikit-learn using the rios ratapplier interface allowing very large RATs to be processed.

  • clumpsImg is the clumps image on which the classification is to be performed
  • classesIntCol is the column with the training data as int values
  • classesNameCol is the column with the training data as string class names
  • variables is an array of column names which are to be used for the classification
  • classifier is an instance of a scikit-learn classifier (e.g., RandomForests which is Default)
  • outColInt is the output column name for the int class representation (Default: ‘OutClass’)
  • outColStr is the output column name for the class names column (Default: ‘OutClassName’)
  • roiCol is a column name for a column which specifies the region to be classified. If None ignored (Default: None)
  • roiVal is an int value used within the roiCol to select a region to be classified (Default: 1)
  • classColours is a python dict using the class name as the key along with arrays of length 3 specifying the RGB colours for the class.
  • scaleVarsRange will rescale each variable independently to a range of 0-1 (default: False).
  • justFit is a boolean specifying that the classifier should just be fitted to the data and not applied (Default: False; i.e., apply classification)

Example:

from sklearn.ensemble import ExtraTreesClassifier
from rsgislib.classification import classratutils

classifier = ExtraTreesClassifier(n_estimators=100, max_features=3, n_jobs=-1, verbose=0)

classColours = dict()
classColours['Forest'] = [0,138,0]
classColours['NonForest'] = [200,200,200]

variables = ['GreenAvg', 'RedAvg', 'NIR1Avg', 'NIR2Avg', 'NDVI']
# clumpsImg, classesIntCol and classesNameCol are expected to have been defined by the user.
classratutils.classifyWithinRATTiled(clumpsImg, classesIntCol, classesNameCol, variables, classifier=classifier, classColours=classColours)

# With range scaling.
classratutils.classifyWithinRATTiled(clumpsImg, classesIntCol, classesNameCol, variables, classifier=classifier, classColours=classColours, scaleVarsRange=True)
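
The effect of scaleVarsRange can be illustrated with a small min-max scaling sketch, in which each variable is rescaled independently to the range 0-1. The function name and data are hypothetical, and how the library handles a constant column is not stated in this documentation; the sketch simply maps such a column to 0.

```python
def scale_to_unit_range(rows):
    """Rescale each column of a list of rows independently to the range 0-1."""
    n_cols = len(rows[0])
    mins = [min(r[j] for r in rows) for j in range(n_cols)]
    maxs = [max(r[j] for r in rows) for j in range(n_cols)]
    scaled = []
    for r in rows:
        scaled.append([
            # Constant columns are mapped to 0.0 to avoid dividing by zero.
            (r[j] - mins[j]) / (maxs[j] - mins[j]) if maxs[j] > mins[j] else 0.0
            for j in range(n_cols)
        ])
    return scaled


rows = [[10.0, -0.5], [20.0, 0.25], [30.0, 0.5]]
print(scale_to_unit_range(rows))  # [[0.0, 0.0], [0.5, 0.75], [1.0, 1.0]]
```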
rsgislib.classification.classratutils.clusterWithinRAT(clumpsImg, variables, clusterer=MiniBatchKMeans(batch_size=100, compute_labels=True, init='k-means++', init_size=None, max_iter=100, max_no_improvement=10, n_clusters=8, n_init=3, random_state=None, reassignment_ratio=0.01, tol=0.0, verbose=0), outColInt='OutCluster', roiCol=None, roiVal=1, clrClusters=True, clrSeed=10, addConnectivity=False, preProcessor=None)

A function which will perform a clustering within the RAT using a clustering algorithm from scikit-learn

  • clumpsImg is the clumps image on which the classification is to be performed.
  • variables is an array of column names which are to be used for the clustering.
  • clusterer is an instance of a scikit-learn clusterer (e.g., MiniBatchKMeans which is Default; Note with 8 clusters).
  • outColInt is the output column name identifying the clusters (Default: ‘OutCluster’).
  • roiCol is a column name for a column which specifies the region to be clustered. If None ignored (Default: None).
  • roiVal is an int value used within the roiCol to select a region to be clustered (Default: 1).
  • clrClusters is a boolean specifying whether the colour table should be updated to correspond to the clusters (Default: True).
  • clrSeed is an integer seeding the random generator used to generate the colours (Default=10; if None provided system time used).
  • addConnectivity is a boolean which adds a kneighbors_graph to the clusterer (just an option for the AgglomerativeClustering algorithm)
  • preProcessor is a scikit-learn preprocessor such as sklearn.preprocessing.MaxAbsScaler() which can rescale the input variables independently as they are read in (Default: None; i.e., not in use).

Example:

from rsgislib.classification import classratutils
from sklearn.cluster import DBSCAN

sklearnClusterer = DBSCAN(eps=1, min_samples=50)
classratutils.clusterWithinRAT('MangroveClumps.kea', ['MinX', 'MinY'], clusterer=sklearnClusterer, outColInt="OutCluster", roiCol=None, roiVal=1, clrClusters=True, clrSeed=10, addConnectivity=False)

# With pre-processor
from sklearn.preprocessing import MaxAbsScaler
classratutils.clusterWithinRAT('MangroveClumps.kea', ['MinX', 'MinY'], clusterer=sklearnClusterer, outColInt="OutCluster", roiCol=None, roiVal=1, clrClusters=True, clrSeed=10, addConnectivity=False, preProcessor=MaxAbsScaler())
rsgislib.classification.classratutils.findClassifierParameters(clumpsImg, classesIntCol, variables, preProcessor=None, gridSearch=GridSearchCV(cv=None, error_score='raise', estimator=RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini', max_depth=None, max_features='auto', max_leaf_nodes=None, min_impurity_split=1e-07, min_samples_leaf=1, min_samples_split=2, min_weight_fraction_leaf=0.0, n_estimators=10, n_jobs=1, oob_score=False, random_state=None, verbose=0, warm_start=False), fit_params={}, iid=True, n_jobs=1, param_grid={}, pre_dispatch='2*n_jobs', refit=True, return_train_score=True, scoring=None, verbose=0))

Find the optimal parameters for a classifier using a grid search and return a classifier instance with those optimal parameters.

  • clumpsImg is the clumps image on which the classification is to be performed
  • classesIntCol is the column with the training data as int values
  • variables is an array of column names which are to be used for the classification
  • preProcessor is a scikit-learn preprocessor such as sklearn.preprocessing.MaxAbsScaler() which can rescale the input variables independently as they are read in (Default: None; i.e., not in use).
  • gridSearch is an instance of GridSearchCV parameterised with a classifier and parameters to be searched.

return:

  • Instance of the classifier with optimal parameters defined.

Example:

from rsgislib.classification import classratutils
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV
from sklearn.preprocessing import MaxAbsScaler

clumpsImg = "./LS8_20150621_lat10lon652_r67p233_clumps.kea"
classesIntCol = 'ClassInt'

classParameters = {'kernel':['linear', 'rbf', 'poly', 'sigmoid'], 'C':[1, 2, 3, 4, 5, 10, 100, 400, 500, 1e3, 5e3, 1e4, 5e4, 1e5], 'gamma':[0.0001, 0.0005, 0.001, 0.005, 0.01, 0.1, 'auto'], 'degree':[2, 3, 4, 5, 6, 7, 8], 'class_weight':[None, 'balanced'], 'decision_function_shape':['ovo', 'ovr', None]}
variables = ['BlueRefl', 'GreenRefl', 'RedRefl', 'NIRRefl', 'SWIR1Refl', 'SWIR2Refl']

gSearch = GridSearchCV(SVC(), classParameters)
classifier = classratutils.findClassifierParameters(clumpsImg, classesIntCol, variables, preProcessor=MaxAbsScaler(), gridSearch=gSearch)
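
A grid search evaluates every combination of the listed parameter values, so the size of the search grows multiplicatively. The sketch below expands a grid of the same shape as the one in the example above using only the standard library; the helper name expand_grid is hypothetical and this is not how scikit-learn is implemented internally.

```python
from itertools import product

param_grid = {
    'kernel': ['linear', 'rbf', 'poly', 'sigmoid'],
    'C': [1, 2, 3, 4, 5, 10, 100, 400, 500, 1e3, 5e3, 1e4, 5e4, 1e5],
    'gamma': [0.0001, 0.0005, 0.001, 0.005, 0.01, 0.1, 'auto'],
    'degree': [2, 3, 4, 5, 6, 7, 8],
    'class_weight': [None, 'balanced'],
    'decision_function_shape': ['ovo', 'ovr', None],
}


def expand_grid(grid):
    """Yield one dict of parameter values for every combination in the grid."""
    names = list(grid)
    for values in product(*(grid[n] for n in names)):
        yield dict(zip(names, values))


combinations = list(expand_grid(param_grid))
print(len(combinations))  # 16464
```

Each of these 16464 combinations is fitted once per cross-validation fold, which is why a grid of this size can take a long time to search and why a reduced set of training samples or a smaller grid may be preferable.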
rsgislib.classification.collapseClasses(inputimage, outputimage, gdalformat, classColumn, classIntCol)

Collapses an attribute table with a large number of classified clumps (segments) to an attribute table with a single row per class (i.e., a classification rather than a segmentation).

Where:

  • inputImage is a string containing the name and path of the input file with attribute table.
  • outputImage is a string containing the name and path of the output file.
  • gdalformat is a string with the output image format for the GDAL driver.
  • classColumn is a string with the name of the column with the class names - internally this will be treated as a string column even if a numerical column is specified.
  • classIntCol is a string specifying the name of a column with the integer class representation. This is an optional parameter but if specified then the int representation of the classes will be preserved.
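
Conceptually, collapsing a segmentation to a classification means replacing many clump rows with one row per class. The following sketch shows that idea on a plain list of class names; the function name, the list-based attribute table and the ID numbering (from 1, in order of first appearance) are assumptions made for illustration and do not describe the RSGISLib internals.

```python
def collapse_classes(clump_classes):
    """Map each clump's class name to a single integer ID per class.

    clump_classes: list where index i holds the class name of clump i.
    Returns (class_ids, clump_to_class_id).
    """
    class_ids = {}
    clump_to_class_id = []
    for name in clump_classes:
        # Assign IDs from 1 in order of first appearance (an arbitrary choice).
        if name not in class_ids:
            class_ids[name] = len(class_ids) + 1
        clump_to_class_id.append(class_ids[name])
    return class_ids, clump_to_class_id


clumps = ['Forest', 'Water', 'Forest', 'Urban', 'Water']
ids, lookup = collapse_classes(clumps)
print(ids)     # {'Forest': 1, 'Water': 2, 'Urban': 3}
print(lookup)  # [1, 2, 1, 3, 2]
```

The five clump rows collapse to three class rows, and the lookup list gives the new pixel value for each original clump.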
rsgislib.classification.colour3bands(inputimage, outputimage, gdalformat)

Generates a 3 band colour image from the colour table in the input file.

Where:

  • inputImage is a string containing the name and path of the input file with attribute table.
  • outputImage is a string containing the name and path of the output file.
  • gdalformat is a string with the output image format for the GDAL driver.
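
As a rough illustration of what generating a 3 band colour image from a colour table involves, the sketch below maps each pixel value to its red, green and blue entries. The colour table dict and the nested-list image are hypothetical stand-ins for the raster data; this is not the RSGISLib implementation.

```python
def colour_table_to_bands(class_img, colour_table):
    """Split a classified image into red, green and blue bands via a colour table."""
    red, green, blue = [], [], []
    for row in class_img:
        # Look up the (R, G, B) tuple for every pixel in the row.
        colours = [colour_table[value] for value in row]
        red.append([c[0] for c in colours])
        green.append([c[1] for c in colours])
        blue.append([c[2] for c in colours])
    return red, green, blue


colour_table = {1: (0, 255, 0), 2: (100, 100, 100)}
class_img = [[1, 1], [2, 1]]
red, green, blue = colour_table_to_bands(class_img, colour_table)
print(red)    # [[0, 0], [100, 0]]
print(green)  # [[255, 255], [100, 255]]
print(blue)   # [[0, 0], [100, 0]]
```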

Accuracy Assessment

rsgislib.classification.generateRandomAccuracyPts(inputImage, outputShp, classImgCol, classImgVecCol, classRefVecCol, numPts, seed, force)

Generates a set of random points for accuracy assessment.

Where:

  • inputImage is a string containing the name and path of the input image with attribute table.
  • outputShp is a string containing the name and path of the output shapefile.
  • classImgCol is a string specifying the name of the column in the image file containing the class names.
  • classImgVecCol is a string specifying the output column in the shapefile for the classified class names.
  • classRefVecCol is a string specifying an output column in the shapefile which can be used in the accuracy assessment for the reference data.
  • numPts is an int specifying the total number of points which should be created.
  • seed is an int specifying the seed for the random number generator. (Optional: Default 10)
  • force is a bool, specifying whether to force removal of the output vector if it exists. (Optional: Default False)
rsgislib.classification.generateStratifiedRandomAccuracyPts(inputImage, outputShp, classImgCol, classImgVecCol, classRefVecCol, numPts, seed, force)

Generates a set of stratified random points for accuracy assessment.

Where:

  • inputImage is a string containing the name and path of the input image with attribute table.
  • outputShp is a string containing the name and path of the output shapefile.
  • classImgCol is a string specifying the name of the column in the image file containing the class names.
  • classImgVecCol is a string specifying the output column in the shapefile for the classified class names.
  • classRefVecCol is a string specifying an output column in the shapefile which can be used in the accuracy assessment for the reference data.
  • numPts is an int specifying the number of points for each class which should be created.
  • seed is an int specifying the seed for the random number generator. (Optional: Default 10)
  • force is a bool, specifying whether to force removal of the output vector if it exists. (Optional: Default False)
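
The difference between the two sampling functions is that numPts is a total for random sampling but a per-class count for stratified random sampling. The sketch below illustrates the stratified case on a small nested-list image; the function name and data are hypothetical, and capping the draw at the number of available pixels is an assumption of this sketch rather than documented behaviour.

```python
import random


def stratified_random_points(class_img, num_pts, seed=10):
    """Draw up to num_pts random pixel locations (x, y) for each class."""
    rng = random.Random(seed)
    # Group pixel coordinates by the class found at that pixel.
    pixels_by_class = {}
    for y, row in enumerate(class_img):
        for x, cls in enumerate(row):
            pixels_by_class.setdefault(cls, []).append((x, y))
    # Sample without replacement, capped at the number of pixels available.
    return {
        cls: rng.sample(pxls, min(num_pts, len(pxls)))
        for cls, pxls in pixels_by_class.items()
    }


class_img = [
    ['Forest', 'Forest', 'Water'],
    ['Forest', 'Water', 'Water'],
    ['Urban', 'Forest', 'Forest'],
]
points = stratified_random_points(class_img, num_pts=2)
print({cls: len(pts) for cls, pts in points.items()})
```

Stratifying in this way ensures that rare classes, such as the single Urban pixel here, are still represented in the accuracy assessment.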
rsgislib.classification.generateTransectAccuracyPts(inputImage, inputLinesShp, outputPtsShp, classImgCol, classImgVecCol, classRefVecCol, lineStep, force=False)

A tool for converting a set of lines into point transects and populating them with the information needed to undertake an accuracy assessment.

Where:

  • inputImage is a string specifying the input image file with classification.
  • inputLinesShp is a string specifying the input lines shapefile path.
  • outputPtsShp is a string specifying the output points shapefile path.
  • classImgCol is a string specifying the name of the column in the image file containing the class names.
  • classImgVecCol is a string specifying the output column in the shapefile for the classified class names.
  • classRefVecCol is an optional string specifying an output column in the shapefile which can be used in the accuracy assessment for the reference data.
  • lineStep is a double specifying the step along the lines between the points.
  • force is an optional boolean specifying whether the output shapefile should be deleted if it already exists (if True it will be deleted; Default is False).
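
The role of lineStep can be illustrated with a small geometric sketch that places a point every fixed distance along a polyline. The function name and the choice to start at the first vertex are assumptions; the documentation above does not state exactly where the library places the first point.

```python
import math


def points_along_line(vertices, step):
    """Return points every `step` units along a polyline, starting at its first vertex."""
    points = [vertices[0]]
    dist_to_next = step  # distance remaining until the next point is due
    for (x0, y0), (x1, y1) in zip(vertices, vertices[1:]):
        seg_len = math.hypot(x1 - x0, y1 - y0)
        pos = 0.0  # distance already travelled along this segment
        while seg_len - pos >= dist_to_next:
            pos += dist_to_next
            t = pos / seg_len
            points.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
            dist_to_next = step
        # Carry the unused part of the segment over to the next one.
        dist_to_next -= seg_len - pos
    return points


line = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0)]
transect = points_along_line(line, 4.0)
print(len(transect))  # 6
```

For this L-shaped line of length 20 and a step of 4, points fall at distances 0, 4, 8, 12, 16 and 20 along the line, with the spacing carried correctly around the corner.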
rsgislib.classification.popClassInfoAccuracyPts(inputImage, inputShp, classImgCol, classImgVecCol, classRefVecCol)

Populates an existing set of points with the class information required for accuracy assessment.

Where:

  • inputImage is a string containing the name and path of the input image with attribute table.
  • inputShp is a string containing the name and path of the input shapefile.
  • classImgCol is a string specifying the name of the column in the image file containing the class names.
  • classImgVecCol is a string specifying the output column in the shapefile for the classified class names.
  • classRefVecCol is an optional string specifying an output column in the shapefile which can be used in the accuracy assessment for the reference data.
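
Once the classified and reference columns of the points have been filled in, they can be compared to summarise accuracy. The following is a minimal sketch of that comparison using dicts in place of shapefile records; the function name and the column names 'ClassImg' and 'ClassRef' are hypothetical.

```python
from collections import Counter


def accuracy_summary(points, class_col, ref_col):
    """Return (overall_accuracy, confusion_counts) for a list of point records."""
    # Count each (reference class, classified class) pair.
    confusion = Counter((p[ref_col], p[class_col]) for p in points)
    correct = sum(n for (ref, cls), n in confusion.items() if ref == cls)
    return correct / len(points), confusion


points = [
    {'ClassImg': 'Forest', 'ClassRef': 'Forest'},
    {'ClassImg': 'Forest', 'ClassRef': 'Water'},
    {'ClassImg': 'Water', 'ClassRef': 'Water'},
    {'ClassImg': 'Water', 'ClassRef': 'Water'},
]
overall, confusion = accuracy_summary(points, 'ClassImg', 'ClassRef')
print(overall)  # 0.75
```

The confusion counts keyed by (reference, classified) pairs hold the entries needed to build a full confusion matrix, from which per-class accuracies could also be derived.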