8 Thematic Accuracy Assessment
Module Overview
This module performs a comprehensive thematic accuracy assessment of the land cover map using independent ground reference data. The framework for this module is based on the publication Good Practices for Estimating Area and Assessing Accuracy of Land Change by Olofsson et al. (2014). This framework separates accuracy assessment into three major components: sampling design, response design, and analysis. Luma-GE adopts only the sampling design and analysis components. The response design, which consists of the protocols and guidelines for obtaining ground reference data, is currently not implemented.
Input
| Name of input | Input type | Details |
|---|---|---|
| Validation map | User’s input | In shapefile format. |
| Classified LULC map | Input from Other Modules | Module 6 |
Output
- Confusion matrix.
- Thematic accuracy metrics: Overall Accuracy (OA), Kappa Coefficient, Producer’s Accuracy (PA), User’s Accuracy (UA).
- Reference data sites (optional): for users who do not have reference data, the system generates reference data sites for them to label outside Luma-GE.
- Spatial distribution of error.
Process
8.1 Checking Prerequisites from Previous Modules
This is a part of System Response 7.1: Verification data
8.1.1 Checking Prerequisites from Previous Modules
Luma User Journey
The user is shown a verification of the required inputs stored in the system.
Important (Error Handling Notification): This module cannot be accessed if any of the required inputs from other modules is missing from the system.
Luma Geospatial Engine
- The system validates the availability of the required inputs.
8.2 Thematic Accuracy Assessment
8.2.1 Selecting The Workflow
Two workflows are available for Module 7. The first workflow is designed for users who do not have reference data. The second workflow is for users who already have reference data.
Luma User Journey
If the user does not have reference samples, the user selects the “generate reference sample” workflow.
The user specifies the desired standard error (margin of error) of the map. The valid range is 0.1% to 10%; smaller values result in a larger sample requirement.
The user sets the minimum expected accuracy for each class. This setting is optional and defaults to 85% for all classes.
The user generates the samples and checks the allocation and the spatial distribution on the map canvas.
The user downloads the reference data and labels it using visual interpretation of higher-resolution imagery, a field survey, or a combination of both. This process is conducted outside Luma-GE.
If the user already has reference samples, the user selects the “compute accuracy” workflow.
The user is prompted to upload the reference samples. Currently, Luma-GE supports only shapefile data.
The user selects the column headers that correspond to the class ID and class name.
The user runs the accuracy assessment and decides whether the classification meets their accuracy needs.
Luma Geospatial Engine
For each workflow, the geospatial engine performs the following operations.
8.2.2 Generate Reference Sample Workflow
This workflow consists of two steps: sample size calculation and sample allocation. Together they produce the minimum number of samples required for each class and the locations of those samples.
Sample Size Calculation
The sample size for stratified random sampling is calculated using the formula provided by Cochran (1977, Eq. (5.25)). The samples are allocated to each stratum (class) in proportion to the product of its weight and standard deviation, which results in a balanced sample allocation. The key steps of the sample size calculation are as follows:
Calculate stratum standard deviation:
Si = √(Ui × (1 - Ui))
Where:
Si = standard deviation for stratum i
Ui = expected accuracy for class i
Calculate stratum weight:
Wi = Ni / N
Where:
Wi = weight of stratum i
Ni = number of pixels in class i
N = total number of pixels
Calculate total sample size:
n = (Σ(Wi × Si) / SE)²
Where:
n = total required samples
SE = desired standard error
Allocate samples per stratum:
ni = n × (Wi × Si) / Σ(Wi × Si)
Where:
ni = number of samples for stratum i
sample_size_calculator.get_pixel_count_per_class: calculates the pixel count for each class using an Earth Engine reducer
sample_size_calculator.validate_sample_size_inputs: validates the inputs before the main sample size calculation
sample_size_calculator.calculate_strata_sample: main function that calculates the sample size for each class
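
The four calculation steps above can be sketched in plain Python. This is a minimal illustration, not the Luma-GE implementation: the function name `sketch_strata_sample`, its parameters, and the choice to round every result up to a whole sample are assumptions made for the example. The 0.1% to 10% range check and the 85% default follow the values stated in the user journey.

```python
import math


def _ceil(value, tol=1e-9):
    """Round up while ignoring floating-point noise just above an integer."""
    return math.ceil(value - tol)


def sketch_strata_sample(pixel_counts, target_se, expected_accuracy=None,
                         default_accuracy=0.85):
    """Return (n, {class_id: ni}) for a stratified random sample.

    pixel_counts: {class_id: Ni}, number of pixels per class (hypothetical input shape).
    target_se: desired standard error as a fraction (0.01 means 1%).
    expected_accuracy: optional {class_id: Ui}; missing classes use the default.
    """
    if not 0.001 <= target_se <= 0.10:
        raise ValueError("target_se must be between 0.001 (0.1%) and 0.10 (10%)")
    if not pixel_counts or any(count <= 0 for count in pixel_counts.values()):
        raise ValueError("every class needs a positive pixel count")

    expected_accuracy = expected_accuracy or {}
    total_pixels = sum(pixel_counts.values())  # N

    weighted_sd = {}  # Wi * Si per class
    for class_id, count in pixel_counts.items():
        u_i = expected_accuracy.get(class_id, default_accuracy)
        if not 0.0 < u_i < 1.0:
            raise ValueError("expected accuracy must lie strictly between 0 and 1")
        s_i = math.sqrt(u_i * (1.0 - u_i))  # Si = sqrt(Ui * (1 - Ui))
        w_i = count / total_pixels          # Wi = Ni / N
        weighted_sd[class_id] = w_i * s_i

    sum_ws = sum(weighted_sd.values())
    n_total = _ceil((sum_ws / target_se) ** 2)  # n = (sum(Wi * Si) / SE)^2

    # ni = n * (Wi * Si) / sum(Wi * Si); rounding up is an assumption of this sketch
    allocation = {class_id: _ceil(n_total * ws / sum_ws)
                  for class_id, ws in weighted_sd.items()}
    return n_total, allocation
```

For example, with three classes of 600, 300, and 100 pixels, the default expected accuracy of 0.85, and a standard error of 0.01, the total is n = 0.1275 / 0.0001 = 1275 samples. Because each stratum is rounded up, the per-class counts can add up to slightly more than n.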
Sample Allocation
The required samples for each class are allocated within the area of interest using ee.Image.stratifiedSample(). The sample sizes are based on the sample size calculation result. The feature collection produced by this process contains only the class ID of each land cover class; class names will be added in a future update. The user can download the samples and label them outside Luma-GE. The current export function supports only shapefiles.
sample_size_calculator.generate_stratified_samples: generates the reference samples based on the sample size calculation
sample_size_calculator.export_samples_to_shp: exports the generated samples as a shapefile
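
As a rough illustration of the allocation step, the sketch below turns a per-class allocation into the arguments of ee.Image.stratifiedSample(). The helper names, the `allocation` dictionary shape, and the decision to keep point geometries are assumptions for this example; they are not taken from the Luma-GE source. The image, region, and scale must be supplied by the caller from an initialized Earth Engine session.

```python
def build_stratified_sample_kwargs(allocation, class_band, region, scale, seed=0):
    """Map {class_id: sample_count} onto stratifiedSample() keyword arguments."""
    class_values = sorted(allocation)
    class_points = [int(allocation[class_id]) for class_id in class_values]
    return {
        "numPoints": 0,               # fallback count for classes not listed below
        "classBand": class_band,      # band that holds the land cover class ID
        "region": region,             # area of interest (ee.Geometry)
        "scale": scale,               # pixel size in metres
        "seed": seed,                 # fixed seed so the draw is repeatable
        "classValues": class_values,  # class IDs, in the same order as classPoints
        "classPoints": class_points,  # required samples per class
        "geometries": True,           # keep point locations for export and labeling
    }


def sketch_generate_samples(classified_image, allocation, class_band, region, scale,
                            seed=0):
    """Call stratifiedSample() on an ee.Image (or any object exposing that method)."""
    kwargs = build_stratified_sample_kwargs(allocation, class_band, region, scale, seed)
    return classified_image.stratifiedSample(**kwargs)
```

Keeping the argument construction separate from the Earth Engine call makes the class-to-count mapping easy to check without a live session.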
8.2.3 Accuracy Computation Workflow
Luma User Journey
This is a part of User Journey 7.2 Upload data
The user is prompted to upload the validation map in a .zip file. The user is reminded that the validation map should have the same ID class as the one used for the classification.
Note (Success Notification): The system shows a confirmation that the validation data has been validated.
Important (Error Handling Notification): The system shows an error message if the uploaded validation map fails the validation process.
The system displays the uploaded validation map on a canvas map along with the tabular data.
Luma Geospatial Engine
The system verifies that the .shp file of the validation map is included inside the uploaded .zip file.
Important (Error Handling Notification): The system provides an error notification if the .shp file is not found inside the .zip file.
The system conducts geometry fixes on the uploaded validation map.
Tip (Related Function): input_utils.shapefile_validator.validate_and_fix_geometry(): validates and fixes geometry issues.
Important (Error Handling Notification): An error message is displayed if the system fails to run validate_and_fix_geometry().
The system converts the validation map from GeoDataFrame (gdf) data into the Earth Engine FeatureCollection format.
Tip (Related Function): input_utils.shapefile_validator.convert_roi_gdf(): converts a GeoDataFrame into an EE FeatureCollection. The multi-geometry types supported by this conversion are MultiPoint and MultiPolygon.
Important (Error Handling Notification): An error message is displayed if the system fails to run convert_roi_gdf().
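
The first check in this list, confirming that the upload contains a .shp file, can be sketched with the Python standard library. The function name `find_shapefiles_in_zip` and the error messages are hypothetical and only illustrate the idea; the geometry fixes and the Earth Engine conversion are not covered here.

```python
import io
import zipfile


def find_shapefiles_in_zip(zip_source):
    """Return the .shp member names of a .zip upload, or raise ValueError.

    zip_source may be a file path or a binary file-like object.
    """
    try:
        with zipfile.ZipFile(zip_source) as archive:
            names = archive.namelist()
    except zipfile.BadZipFile as exc:
        raise ValueError("The uploaded file is not a valid .zip archive.") from exc

    # Match the extension case-insensitively, so POINTS.SHP also counts
    shp_members = [name for name in names if name.lower().endswith(".shp")]
    if not shp_members:
        raise ValueError("No .shp file was found inside the .zip archive.")
    return shp_members


# Usage with an in-memory archive standing in for a user upload
upload = io.BytesIO()
with zipfile.ZipFile(upload, "w") as archive:
    archive.writestr("reference/points.shp", b"")
    archive.writestr("reference/points.dbf", b"")
upload.seek(0)
shapefiles = find_shapefiles_in_zip(upload)
```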
8.2.4 Starting the thematic accuracy assessment
Luma User Journey
This is a part of User Journey 7.3: Thematic accuracy assessment
The user is prompted to fill in the parameters for the thematic accuracy assessment, including specifying the column that refers to the LULC ID class and the pixel size of the classified LULC map.
The user can optionally set the confidence level (for example, 95%) used to compute the confidence interval of the thematic accuracy assessment.
The user is prompted to start the thematic accuracy assessment process.
Important (Error Handling Notification): The system shows an error message if the system fails to run the thematic accuracy assessment.
Luma Geospatial Engine
This is a part of System Response 7.3: Thematic accuracy assessment
The system performs the thematic accuracy assessment by validating the inputs, sampling the classified LULC map at the reference points, and computing the accuracy metrics from the resulting error matrix.
Tip (Related Functions):
- accuracy.thematic_accuracy.validate_assessment_inputs(): checks whether the classified LULC map, validation map, LULC ID class field, and pixel size parameter meet the required conditions before the assessment runs. This function is used in run_accuracy_assessment().
- accuracy.thematic_accuracy._extract_confusion_matrix_data(): extracts overall accuracy, kappa, per-class accuracies, and the confusion matrix array from an ee.ConfusionMatrix object. This function is used in run_accuracy_assessment().
- accuracy.thematic_accuracy._calculate_confidence_interval(): computes the confidence interval for overall accuracy using a normal approximation based on the number of correct and total samples. This function is used in run_accuracy_assessment().
- accuracy.thematic_accuracy._calculate_f1_scores(): calculates the F1 score for each class using the producer’s and user’s accuracy values. This function is used in run_accuracy_assessment().
- accuracy.thematic_accuracy.run_accuracy_assessment(): executes the full thematic accuracy workflow, including validation, sampling (sampleRegions()), metric extraction (errorMatrix()), confidence interval calculation, and compilation of final results.
Thematic Accuracy Metrics
Several accuracy metrics are calculated to provide a comprehensive report of the map’s thematic quality.
Overall Accuracy
Overall Accuracy (OA): the sum of the major diagonal (correctly classified samples) divided by the total number of samples in the confusion matrix.
OA = Σ(n_ii) / N
Kappa Coefficient
The Kappa coefficient is a statistic derived from the error matrix. It shows how well the classification performed compared with randomly assigning classes. Kappa values range from -1 to 1. A value of 0 indicates that the classification is no better than a random classification, a negative value indicates that it is worse than a random classification, and a value closer to 1 indicates that it is better than a random classification.
κ = (Po - Pe) / (1 - Pe)
Po = Σ(n_ii) / N
Pe = Σ(n_i+ × n_+i) / N²
Where:
Po = observed agreement (OA)
Pe = expected agreement by chance
N = total number of samples in the confusion matrix
Producer’s Accuracy (Recall/Sensitivity)
A class-level accuracy metric that answers the question, “What fraction of the actual class i was correctly mapped?”
PA_i = n_ii / n_i+
Where:
n_ii = correctly classified samples for class i
n_i+ = total reference samples for class i (row sum)
User’s Accuracy (Precision)
A class-level accuracy metric that answers the question, “What fraction of the predicted class i is actually class i?”
UA_i = n_ii / n_+i
Where:
n_ii = correctly classified samples for class i
n_+i = total predicted samples for class i (column sum)
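
To make the formulas above concrete, the following sketch computes the four metrics from a confusion matrix given as a nested list, with rows as reference classes and columns as predicted classes, as in the definitions above. It is an illustration only: Luma-GE reads these values from an ee.ConfusionMatrix object, and the function name `sketch_accuracy_metrics` and the returned dictionary keys are assumptions of this example.

```python
def sketch_accuracy_metrics(matrix):
    """Compute OA, kappa, PA, and UA from a square confusion matrix.

    matrix[i][j] = number of samples whose reference class is i (row)
    and whose predicted class is j (column).
    """
    size = len(matrix)
    if any(len(row) != size for row in matrix):
        raise ValueError("the confusion matrix must be square")
    total = sum(sum(row) for row in matrix)  # N
    if total == 0:
        raise ValueError("the confusion matrix contains no samples")

    diagonal = [matrix[i][i] for i in range(size)]                           # n_ii
    row_sums = [sum(row) for row in matrix]                                  # n_i+
    col_sums = [sum(matrix[i][j] for i in range(size)) for j in range(size)] # n_+i

    p_o = sum(diagonal) / total  # observed agreement, equal to OA
    p_e = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2
    kappa = (p_o - p_e) / (1 - p_e) if p_e != 1 else float("nan")

    # Classes with no reference or no predicted samples get NaN instead of an error
    producers = [d / r if r else float("nan") for d, r in zip(diagonal, row_sums)]
    users = [d / c if c else float("nan") for d, c in zip(diagonal, col_sums)]

    return {
        "overall_accuracy": p_o,
        "kappa": kappa,
        "producers_accuracy": producers,
        "users_accuracy": users,
    }


# Two-class example: 45 + 40 correct samples out of 100
metrics = sketch_accuracy_metrics([[45, 5], [10, 40]])
```

In this example OA = 85 / 100 = 0.85 and Pe = (50 × 55 + 50 × 45) / 100² = 0.5, so κ = (0.85 - 0.5) / (1 - 0.5) = 0.7.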
8.2.5 Spatial Distribution of Error
Luma Geospatial Engine
This is part of System Response 7.3: Thematic accuracy assessment
This function flags the reference data so that the spatial distribution of error can be visualized. It supports error analysis by tagging each point according to whether the classifier labeled it correctly. Currently, this function works only if the reference data units are points (single pixels); it does not work if the reference data consist of polygons.
The system checks whether the reference data GeoDataFrame is available.
The system adds the actual and predicted classes to the GeoDataFrame; these are the key attributes for determining the correctness of the map.
The system generates an HTML pop-up with comprehensive information on the status of the corresponding reference site.
accuracy.validation_error_flag.classify_validation_points(): core function that flags the correctness of the classification map at each validation point
accuracy.validation_error_flag.generate_popup_html(): creates an HTML pop-up for visualizing the error flag
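
The flagging idea can be sketched without any geospatial dependency. In this example each reference point is a plain dictionary rather than a GeoDataFrame row, and the field names (`actual`, `predicted`, `is_correct`), the function names, and the pop-up layout are all hypothetical choices made for illustration.

```python
from html import escape


def flag_validation_points(points, actual_key="actual", predicted_key="predicted"):
    """Return copies of the points with an added boolean 'is_correct' flag."""
    flagged = []
    for point in points:
        record = dict(point)  # copy, so the caller's data is left unchanged
        record["is_correct"] = record[actual_key] == record[predicted_key]
        flagged.append(record)
    return flagged


def build_popup_html(record, actual_key="actual", predicted_key="predicted"):
    """Build a small HTML snippet describing one flagged reference point."""
    status = "Correct" if record["is_correct"] else "Misclassified"
    return (
        "<b>{}</b><br>Reference class: {}<br>Predicted class: {}".format(
            status,
            escape(str(record[actual_key])),     # escape user-supplied values
            escape(str(record[predicted_key])),
        )
    )


# Two reference points: the first agrees with the map, the second does not
reference_points = [
    {"id": 1, "actual": 2, "predicted": 2},
    {"id": 2, "actual": 1, "predicted": 3},
]
flagged_points = flag_validation_points(reference_points)
popups = [build_popup_html(record) for record in flagged_points]
```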
8.2.6 Reviewing the thematic accuracy result
Luma User Journey
This is a part of User Journey 7.4: Preview thematic accuracy result
The system displays the overall accuracy metrics result:
- Overall Accuracy (OA): Percentage of correctly classified validation samples
- Kappa Coefficient: Agreement beyond chance, accounting for class distribution
- Confidence interval at 95%: The lower and upper bounds that describe the uncertainty range of the overall accuracy estimate at the 95% confidence level
- Reference Data: The total number of validation samples used in the thematic accuracy assessment
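
The confidence interval in the list above is described in this module as a normal approximation based on the number of correct and total samples. A minimal sketch of that calculation follows; the function name `sketch_confidence_interval` and the clipping of the bounds to the range 0 to 1 are assumptions of this example, not documented behavior.

```python
import math
from statistics import NormalDist


def sketch_confidence_interval(n_correct, n_total, confidence=0.95):
    """Normal-approximation interval for overall accuracy, as (lower, upper)."""
    if n_total <= 0 or not 0 <= n_correct <= n_total:
        raise ValueError("need 0 <= n_correct <= n_total and n_total > 0")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")

    accuracy = n_correct / n_total
    # Two-sided critical value, about 1.96 for a 95% confidence level
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    half_width = z * math.sqrt(accuracy * (1.0 - accuracy) / n_total)

    # Clipping to [0, 1] is an assumption made for this sketch
    return max(0.0, accuracy - half_width), min(1.0, accuracy + half_width)


# 85 correct samples out of 100 at the 95% confidence level
lower, upper = sketch_confidence_interval(85, 100)
```

For 85 correct samples out of 100, the interval is roughly 0.78 to 0.92. The interval narrows as the number of reference samples grows.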
The system displays the accuracy metrics result at the class level:
- Producer’s Accuracy: Measure of classification completeness (1 - omission error)
- User’s Accuracy: Measure of classification reliability (1 - commission error)
- F1 score: Summarizes Producer’s and User’s Accuracy metrics
- The user can inspect where the misclassification happens using the spatial distribution of error map
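
The F1 score in the list above combines the two class-level metrics. The sketch below assumes the usual definition, the harmonic mean of Producer’s and User’s Accuracy; the function name `sketch_f1_scores` and the choice to return 0 when both accuracies are 0 are assumptions of this example.

```python
def sketch_f1_scores(producers_accuracy, users_accuracy):
    """Per-class F1 = 2 * PA * UA / (PA + UA), the harmonic mean of PA and UA."""
    scores = []
    for pa, ua in zip(producers_accuracy, users_accuracy):
        denominator = pa + ua
        # Return 0.0 rather than dividing by zero when both accuracies are 0
        scores.append(2.0 * pa * ua / denominator if denominator > 0 else 0.0)
    return scores


# Class 1: PA = 0.90, UA = 45/55; class 2: PA = 0.80, UA = 40/45
f1 = sketch_f1_scores([0.90, 0.80], [45 / 55, 40 / 45])
```

Because the harmonic mean is pulled toward the smaller of the two values, a class scores well only when both its omission and commission errors are low.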
The system displays the confusion matrix using a heatmap visualization.
The user is offered the option to download the overall accuracy metrics result summary or the class-level accuracy result in .csv format.
The user is provided with several options for improving the map quality or repeating the accuracy assessment. They can return to Module 6 to change the classification parameters, to Module 5 to add predictors, to Module 3 to add more training data, or to Module 2 to change the classification scheme. The options are ordered from the easiest modification (Module 6) to the hardest (Module 2).
Luma Geospatial Engine
This sub-step does not involve any operations from Luma Geospatial Engine.