-
Notifications
You must be signed in to change notification settings - Fork 3
/
P8_rasterPackageAdvanced_II-v1.html
823 lines (737 loc) · 30.6 KB
/
P8_rasterPackageAdvanced_II-v1.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta charset="utf-8" />
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="generator" content="pandoc" />
<meta name="author" content="João Gonçalves" />
<title>P8 Advanced techniques with raster data (part-2) - Supervised Classification</title>
<script src="P8_rasterPackageAdvanced_II-v1_files/jquery-1.11.3/jquery.min.js"></script>
<meta name="viewport" content="width=device-width, initial-scale=1" />
<link href="P8_rasterPackageAdvanced_II-v1_files/bootstrap-3.3.5/css/bootstrap.min.css" rel="stylesheet" />
<script src="P8_rasterPackageAdvanced_II-v1_files/bootstrap-3.3.5/js/bootstrap.min.js"></script>
<script src="P8_rasterPackageAdvanced_II-v1_files/bootstrap-3.3.5/shim/html5shiv.min.js"></script>
<script src="P8_rasterPackageAdvanced_II-v1_files/bootstrap-3.3.5/shim/respond.min.js"></script>
<script src="P8_rasterPackageAdvanced_II-v1_files/navigation-1.1/tabsets.js"></script>
<link href="P8_rasterPackageAdvanced_II-v1_files/highlightjs-9.12.0/default.css" rel="stylesheet" />
<script src="P8_rasterPackageAdvanced_II-v1_files/highlightjs-9.12.0/highlight.js"></script>
<style type="text/css">code{white-space: pre;}</style>
<style type="text/css">
pre:not([class]) {
background-color: white;
}
</style>
<script type="text/javascript">
if (window.hljs) {
hljs.configure({languages: []});
hljs.initHighlightingOnLoad();
if (document.readyState && document.readyState === "complete") {
window.setTimeout(function() { hljs.initHighlighting(); }, 0);
}
}
</script>
<style type="text/css">
h1 {
font-size: 34px;
}
h1.title {
font-size: 38px;
}
h2 {
font-size: 30px;
}
h3 {
font-size: 24px;
}
h4 {
font-size: 18px;
}
h5 {
font-size: 16px;
}
h6 {
font-size: 12px;
}
.table th:not([align]) {
text-align: left;
}
</style>
</head>
<body>
<style type = "text/css">
.main-container {
max-width: 940px;
margin-left: auto;
margin-right: auto;
}
code {
color: inherit;
background-color: rgba(0, 0, 0, 0.04);
}
img {
max-width:100%;
height: auto;
}
.tabbed-pane {
padding-top: 12px;
}
button.code-folding-btn:focus {
outline: none;
}
</style>
<div class="container-fluid main-container">
<!-- tabsets -->
<script>
$(document).ready(function () {
window.buildTabsets("TOC");
});
</script>
<!-- code folding -->
<div class="fluid-row" id="header">
<h1 class="title toc-ignore">P8 Advanced techniques with raster data (part-2) - Supervised Classification</h1>
<h4 class="author"><em>João Gonçalves</em></h4>
<h4 class="date"><em>28 de Novembro de 2017</em></h4>
</div>
<div id="background" class="section level3">
<h3>Background</h3>
<hr />
<p>In <strong>supervised classification</strong>, contrarily to the unsupervised version, <em>a priori</em> defined reference classes are used as additional information. This initial process, determines which classes are the result of the classification. Usually a statistical or machine-learning algorithm is used to obtain a classification function which maps every instance (pixel or object, depending on the approach used) to its corresponding class. The following workflow is commonly used for deploying a supervised classifier:</p>
<ul>
<li><p>Definition of the thematic classes of land cover/use (e.g., coniferous forest, deciduous forest, water, agriculture, urban);</p></li>
<li><p>Classification of suitable training areas (reference areas/pixels for each class);</p></li>
<li><p>Calibration of a classification algorithm (for the training set(s));</p></li>
<li><p>Classification of the entire image;</p></li>
<li><p>Classifier performance evaluation, verification and inspection of the results (for the testing set(s)).</p></li>
</ul>
<p>Usually, in supervised classification, spectral data from each of the sensor bands is used to obtain a statistical or rule-based <em>spectral signature</em> for each class. Besides “direct” spectral data, other kinds of information or features can be used for classifier training. These include, band ratios, spectral indices, texture features, temporal variation features (e.g., green-up and senescence changes) as well as ancillary data (e.g., elevation, slope, built-up masks, roads).</p>
<p>Combining the training data and the spectral (or other) features in a classifier algorithm allows to classify the entire image, outside the training space. Usually a form of train/test split set strategy (holdout cross-validation, k-fold CV, etc) is used. The training set is used for classifier calibration while the testing set is used for evaluating the classification performance. This process is usually repeated a few times and then an average value of validation indices is calculated.</p>
<p>Because R currently provides a very-large set of classification algorithms (a good package to access them is <code>caret</code>), it is particularly well-equipped to handle this kind of problem. For developing the examples, the Random Forest (RF) algorithm will be used. RF is implemented in the (conveniently named 😉) <code>randomForest</code> package. Although packages such as <code>caret</code> provide many useful functions to handle classification (training, tuning and evaluation processes), I will not use it them here. My objective in this post, is to explore and show the basic and <em>“under the wood”</em> workflow in <em>pixel-based</em> classification of raster data.</p>
<p>In a nutshell, RF is an ensemble learning method for classification, regression and other tasks, that operate by constructing multiple decision trees during the training stage (‘bagging’) and outputting the class that is the mode of the classes (classification) or the average prediction (regression) of the individual trees. This way, RF correct for decision trees’ habit of over-fitting to their training set. See more <a href="https://en.wikipedia.org/wiki/Random_forest">here</a> and <a href="https://www.kdnuggets.com/2017/10/random-forests-explained.html">here</a>.</p>
<p>Sample data from the optical <strong>Sentinel-2a</strong> (S2) satellite platform will be used in the examples below (see <a href="https://earth.esa.int/web/sentinel/user-guides/sentinel-2-msi">here</a> for more details). This data was made available in the 2017 IEEE GRSS Data Fusion Contest and provided by the <em>GRSS Data and Algorithm Standard Evaluation</em> (DASE) <a href="http://dase.ticinumaerospace._com/index.php">website</a> (you have to register to access the sample datasets currently available). More specifically, we will use one Sentinel-2 scene for Berlin containing <a href="https://earth.esa.int/web/sentinel/user-guides/sentinel-2-msi/resolutions/spatial">10 spectral bands</a>, originally at 10m and 20m of spatial resolution but, re-sampled to 100m in DASE.<br />
Along with S2 spectral data, DASE also provides training samples for calibrating classifiers. The legend encompasses a total of 12 land cover/use classes that is presented in the table below (NOTE: only 12 out of the 17 classes actually appear in the Berlin area).</p>
<pre class="r"><code>legBerlin <- read.csv(url("https://raw.githubusercontent.com/joaofgoncalves/R_exercises_raster_tutorial/master/data/legend_berlin.csv"))
knitr::kable(legBerlin)</code></pre>
<table>
<thead>
<tr class="header">
<th align="left">Type</th>
<th align="right">Code</th>
<th align="left">Land.cover.class.description</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td align="left">Artificial</td>
<td align="right">1</td>
<td align="left">Compact high-rise</td>
</tr>
<tr class="even">
<td align="left">Artificial</td>
<td align="right">2</td>
<td align="left">Compact midrise</td>
</tr>
<tr class="odd">
<td align="left">Artificial</td>
<td align="right">3</td>
<td align="left">Compact low-rise</td>
</tr>
<tr class="even">
<td align="left">Artificial</td>
<td align="right">4</td>
<td align="left">Open high-rise</td>
</tr>
<tr class="odd">
<td align="left">Artificial</td>
<td align="right">5</td>
<td align="left">Open midrise</td>
</tr>
<tr class="even">
<td align="left">Artificial</td>
<td align="right">6</td>
<td align="left">Open low-rise</td>
</tr>
<tr class="odd">
<td align="left">Artificial</td>
<td align="right">7</td>
<td align="left">Lightweight low-rise</td>
</tr>
<tr class="even">
<td align="left">Artificial</td>
<td align="right">8</td>
<td align="left">Large low-rise</td>
</tr>
<tr class="odd">
<td align="left">Artificial</td>
<td align="right">9</td>
<td align="left">Sparsely built</td>
</tr>
<tr class="even">
<td align="left">Artificial</td>
<td align="right">10</td>
<td align="left">Heavy industry</td>
</tr>
<tr class="odd">
<td align="left">Vegetated</td>
<td align="right">11</td>
<td align="left">Dense trees</td>
</tr>
<tr class="even">
<td align="left">Vegetated</td>
<td align="right">12</td>
<td align="left">Scattered trees</td>
</tr>
<tr class="odd">
<td align="left">Vegetated</td>
<td align="right">13</td>
<td align="left">Bush and scrub</td>
</tr>
<tr class="even">
<td align="left">Vegetated</td>
<td align="right">14</td>
<td align="left">Low plants</td>
</tr>
<tr class="odd">
<td align="left">Vegetated</td>
<td align="right">15</td>
<td align="left">Bare rock or paved</td>
</tr>
<tr class="even">
<td align="left">Vegetated</td>
<td align="right">16</td>
<td align="left">Bare soil or sand</td>
</tr>
<tr class="odd">
<td align="left">Vegetated</td>
<td align="right">17</td>
<td align="left">Water</td>
</tr>
</tbody>
</table>
<div id="data-loading-and-visualization" class="section level4">
<h4>Data loading and visualization</h4>
<hr />
<p>Now that we have defined some useful concepts, the workflow and, the data we can start coding! 👍 👍 The first step is to download and uncompress the spectral data for Sentinel-2. These will later be used as training input features for the classification algorithm is able to identify the spectral signatures for each class.</p>
<pre class="r"><code>## Create a folder named data-raw inside the working directory to place downloaded data
if(!dir.exists("./data-raw")) dir.create("./data-raw")
## If you run into download problems try changing: method = "wget"
download.file("https://raw.githubusercontent.com/joaofgoncalves/R_exercises_raster_tutorial/master/data/berlin.zip", "./data-raw/berlin.zip", method = "auto")
## Uncompress the zip file
unzip("./data-raw/berlin.zip", exdir = "./data-raw")</code></pre>
<p>Load the required packages for the post:</p>
<pre class="r"><code>library(raster)
library(randomForest)</code></pre>
<p>Now that we have the data files available, let’s create a <code>RasterStack</code> object from it. We will also change layer names for more convenient ones.</p>
<pre class="r"><code>fl <- list.files("./data-raw/berlin/S2", pattern = ".tif$", full.names = TRUE)
rst <- stack(fl)
names(rst) <- c(paste("b",2:8,sep=""),"b8a","b11","b12")</code></pre>
<p>Now, let’s use the <code>plotRGB</code> function to visually explore the spectral data from Sentinel-2. RGB composites made from different band combinations allow us to highlight different aspects of land cover and see different layers of the Earth surface. Note: band numbers in the Sentinel-2 satellite differ from its position (integer index) in the <code>RasterStack</code> object.</p>
<p>Let’s start by making a Natural Color RGB composite from S2 bands: 4,3,2.</p>
<pre class="r"><code>plotRGB(rst, r=3, g=2, b=1, scale=1E5, stretch="lin")</code></pre>
<div class="figure">
<img src="https://raw.githubusercontent.com/joaofgoncalves/R_exercises_raster_tutorial/master/img/P8_plot_rgb_S2-1.png" />
</div>
<p>Next, let’s see an Healthy vegetation composite from S2 bands: 8,11,2.</p>
<pre class="r"><code>plotRGB(rst, r=7, g=9, b=1, scale=1E5, stretch="lin")</code></pre>
<div class="figure">
<img src="https://raw.githubusercontent.com/joaofgoncalves/R_exercises_raster_tutorial/master/img/P8_plot_b8_b11_b2_S2-1.png" />
</div>
<p>Finally a false color urban using S2 bands: 12,11,4.</p>
<pre class="r"><code>plotRGB(rst, r=10, g=9, b=3, scale=1E5, stretch="lin")</code></pre>
<div class="figure">
<img src="https://raw.githubusercontent.com/joaofgoncalves/R_exercises_raster_tutorial/master/img/P8_plot_b12_b11_b4_S2-1.png" />
</div>
<p>Now, let’s load the training samples used in calibration. These data serves as a reference or examples from which the classifier algorithm will “learn” the spectral signatures of each class.</p>
<pre class="r"><code>rstTrain <- raster("./data-raw/berlin/train/berlin_lcz_GT.tif")
# Remove zeros from the train set (background NA)
rstTrain[rstTrain==0] <- NA
# Convert the data to factor/discrete representation
rstTrain <- ratify(rstTrain)
# Change the layer name
names(rstTrain) <- "trainClass"
# Visualize the data
plot(rstTrain, main="Train areas by class for Berlin")</code></pre>
<div class="figure">
<img src="https://raw.githubusercontent.com/joaofgoncalves/R_exercises_raster_tutorial/master/img/P8_load_train_data-1.png" />
</div>
<p>Let’s see how many training pixels do we have for each of the 12 classes (<span class="math inline">\(N_{total} = 24537\)</span>):</p>
<pre class="r"><code>tab <- table(values(rstTrain))
print(tab)</code></pre>
<pre><code>##
## 2 4 5 6 8 9 11 12 13 14 16 17
## 1534 577 2448 4010 1654 761 4960 1028 1050 4424 359 1732</code></pre>
<p>Although perhaps not the best approach in some cases, we will convert our raster dataset into a <code>data.frame</code> object so we can use the RF classifier. Take into consideration, that in some cases, depending on the size of your <code>RasterStack</code> and the available memory, using this approach will not be possible. One simple way to overcome this, would be to convert the training raster into a <code>SpatialPoints</code> object and then run the function <code>extract</code>. This way, only specific pixels from the stack are retrieved. In any case, let’s proceed to get pixel values into our calibration data frame:</p>
<pre class="r"><code>rstDF <- na.omit(values(stack(rstTrain, rst)))
rstDF[,"trainClass"] <- as.factor(as.character(rstDF[,"trainClass"]))</code></pre>
<p>As you probably noticed from the code above, <code>NA</code>’s were removed and the reference class column was converted to a categorical/factor variable. In practice, by “removing <code>NA</code>’s”, it means that we are restricting the data only to the set of training pixels in <code>rstTrain</code> (reducing from 428238 to 24537 rows 👍).</p>
<p>Next up, setting some parameters. In the example, we will use holdout cross-validation (HOCV) to evaluate the RF classifier performance. This means that we will use an iterative split set approach, with a training and a testing set. So, for this purpose, we need to define the proportion of instances for training (<code>pTrain</code>); the remaining will be set aside for evaluation. Here I took into consideration the fact that RF tends to take some time to calibrate with large numbers of observations (~ >10000) hence the relatively ‘large’ train proportion. We also need to define the number of repetitions in HOCV (<code>nEvalRounds</code>).</p>
<pre class="r"><code># Number of holdout evaluation rounds
nEvalRounds <- 20
# Proportion of the data used for training the classifier
pTrain <- 0.5</code></pre>
<p>Now let’s initialize some objects that will allow us to store some info on the the classification performance and validation:</p>
<pre class="r"><code>n <- nrow(rstDF)
nClasses <- length(unique(rstDF[,"trainClass"]))
# Initialize objects
confMats <- array(NA, dim = c(nClasses,nClasses,nEvalRounds))
evalMatrix<-matrix(NA, nrow=nEvalRounds, ncol=3,
dimnames=list(paste("R_",1:nEvalRounds,sep=""),
c("Accuracy","Kappa","PSS")))
pb <- txtProgressBar(1, nEvalRounds, style = 3)</code></pre>
<p>Now, with all set, let´s calibrate and evaluate our RF classifier (use comments to guide you through the code):</p>
<pre class="r"><code># Evaluation function
source(url("https://raw.githubusercontent.com/joaofgoncalves/Evaluation/master/eval.R"))
# Run the classifier
for(i in 1:nEvalRounds){
# Create the random index for row selection at each round
sampIdx <- sample(1:n, size = round(n*pTrain))
# Calibrate the RF classifier
rf <- randomForest(y = rstDF[sampIdx, "trainClass"],
x = rstDF[sampIdx, -1],
ntree = 200)
# Predict the class to the test set
testSetPred <- predict(rf, newdata = rstDF[-sampIdx,], type = "response")
# Get the observed class vector
testSetObs <- rstDF[-sampIdx,"trainClass"]
# Evaluate
evalData <- Evaluate(testSetObs, testSetPred)
evalMatrix[i,] <- c(evalData$Metrics["Accuracy",1],
evalData$Metrics["Kappa",1],
evalData$Metrics["PSS",1])
# Store the confusion matrices by eval round
confMats[,,i] <- evalData$ConfusionMatrix
# Classify the whole image with raster::predict function
rstPredClassTMP <- predict(rst, model = rf,
factors = levels(rstDF[,"trainClass"]))
if(i==1){
# Initiate the predicted raster
rstPredClass <- rstPredClassTMP
# Get precision and recall for each class
Precision <- evalData$Metrics["Precision",,drop=FALSE]
Recall <- evalData$Metrics["Recall",,drop=FALSE]
}else{
# Stack the predicted rasters
rstPredClass <- stack(rstPredClass, rstPredClassTMP)
# Get precision and recall for each class
Precision <- rbind(Precision,evalData$Metrics["Precision",,drop=FALSE])
Recall <- rbind(Recall,evalData$Metrics["Recall",,drop=FALSE])
}
setTxtProgressBar(pb,i)
}
# save.image(file = "./data-raw/P8-session.RData")</code></pre>
<p>Three classification evaluation measures will be used: (i) <strong>overall accuracy</strong>, (ii) <strong>Kappa</strong>, and, (iii) <strong>Peirce-skill score</strong> (PSS; aka true-skill statistic). Let’s print out the results by round:</p>
<pre class="r"><code>knitr::kable(evalMatrix, digits = 3)</code></pre>
<table>
<thead>
<tr class="header">
<th></th>
<th align="right">Accuracy</th>
<th align="right">Kappa</th>
<th align="right">PSS</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td>R_1</td>
<td align="right">0.788</td>
<td align="right">0.755</td>
<td align="right">0.749</td>
</tr>
<tr class="even">
<td>R_2</td>
<td align="right">0.789</td>
<td align="right">0.756</td>
<td align="right">0.751</td>
</tr>
<tr class="odd">
<td>R_3</td>
<td align="right">0.782</td>
<td align="right">0.748</td>
<td align="right">0.741</td>
</tr>
<tr class="even">
<td>R_4</td>
<td align="right">0.795</td>
<td align="right">0.763</td>
<td align="right">0.757</td>
</tr>
<tr class="odd">
<td>R_5</td>
<td align="right">0.789</td>
<td align="right">0.755</td>
<td align="right">0.749</td>
</tr>
<tr class="even">
<td>R_6</td>
<td align="right">0.789</td>
<td align="right">0.755</td>
<td align="right">0.750</td>
</tr>
<tr class="odd">
<td>R_7</td>
<td align="right">0.784</td>
<td align="right">0.751</td>
<td align="right">0.744</td>
</tr>
<tr class="even">
<td>R_8</td>
<td align="right">0.785</td>
<td align="right">0.751</td>
<td align="right">0.745</td>
</tr>
<tr class="odd">
<td>R_9</td>
<td align="right">0.785</td>
<td align="right">0.751</td>
<td align="right">0.745</td>
</tr>
<tr class="even">
<td>R_10</td>
<td align="right">0.789</td>
<td align="right">0.756</td>
<td align="right">0.750</td>
</tr>
<tr class="odd">
<td>R_11</td>
<td align="right">0.785</td>
<td align="right">0.752</td>
<td align="right">0.746</td>
</tr>
<tr class="even">
<td>R_12</td>
<td align="right">0.786</td>
<td align="right">0.752</td>
<td align="right">0.746</td>
</tr>
<tr class="odd">
<td>R_13</td>
<td align="right">0.786</td>
<td align="right">0.753</td>
<td align="right">0.747</td>
</tr>
<tr class="even">
<td>R_14</td>
<td align="right">0.785</td>
<td align="right">0.751</td>
<td align="right">0.745</td>
</tr>
<tr class="odd">
<td>R_15</td>
<td align="right">0.794</td>
<td align="right">0.762</td>
<td align="right">0.756</td>
</tr>
<tr class="even">
<td>R_16</td>
<td align="right">0.785</td>
<td align="right">0.752</td>
<td align="right">0.745</td>
</tr>
<tr class="odd">
<td>R_17</td>
<td align="right">0.791</td>
<td align="right">0.759</td>
<td align="right">0.753</td>
</tr>
<tr class="even">
<td>R_18</td>
<td align="right">0.782</td>
<td align="right">0.748</td>
<td align="right">0.741</td>
</tr>
<tr class="odd">
<td>R_19</td>
<td align="right">0.781</td>
<td align="right">0.746</td>
<td align="right">0.739</td>
</tr>
<tr class="even">
<td>R_20</td>
<td align="right">0.785</td>
<td align="right">0.751</td>
<td align="right">0.744</td>
</tr>
</tbody>
</table>
<p>Next, calculate some average and sd measures across all rounds:</p>
<pre class="r"><code>round(apply(evalMatrix,2,FUN = function(x,...) c(mean(x,...), sd(x,...))), 3)</code></pre>
<pre><code>## Accuracy Kappa PSS
## [1,] 0.787 0.753 0.747
## [2,] 0.004 0.004 0.005</code></pre>
<p>Overall measures seem to indicate that results are acceptable with very low variation between calibration rounds. Let’s check out some average <strong>precision</strong> (aka positive predictive value, PPV), <strong>recall</strong> (aka true positive rate, TPR) and F1 measures by class:</p>
<pre class="r"><code>avgPrecision <- apply(Precision,2,mean)
print(round(avgPrecision, 3))</code></pre>
<pre><code>## 11 12 13 14 16 17 2 4 5 6 8 9
## 0.963 0.705 0.717 0.906 0.789 0.998 0.657 0.345 0.509 0.698 0.664 0.484</code></pre>
<pre class="r"><code>avgRecall <- apply(Recall,2,mean)
print(round(avgRecall, 3))</code></pre>
<pre><code>## 11 12 13 14 16 17 2 4 5 6 8 9
## 0.979 0.685 0.648 0.960 0.434 1.000 0.611 0.093 0.481 0.869 0.647 0.268</code></pre>
<pre class="r"><code>avgF1 <- (2 * avgPrecision * avgRecall) / (avgPrecision+avgRecall)
print(round(avgF1, 3))</code></pre>
<pre><code>## 11 12 13 14 16 17 2 4 5 6 8 9
## 0.971 0.695 0.681 0.932 0.560 0.999 0.633 0.147 0.495 0.774 0.655 0.345</code></pre>
<p>Well, things are not so great here… 🙍♂️ Some classes, such as 4, 9, 5 (different artificial/urban types) and 16 (bare soil/sand) are not great. This may be a consequence of the loss of information detail due to the 100m re-sampling or some class intermixing, and train data generalization… 🤔 … this requires more investigation 😉</p>
<p>Now let’s check the confusion matrix for the best round:</p>
<pre class="r"><code># Best round for Kappa
evalMatrix[which.max(evalMatrix[,"Kappa"]), , drop=FALSE]</code></pre>
<pre><code>## Accuracy Kappa PSS
## R_4 0.7946858 0.7626567 0.7569426</code></pre>
<pre class="r"><code># Show confusion matrix for the best kappa
cm <- as.data.frame(confMats[,,which.max(evalMatrix[,"Kappa"])])
# Change row/col names
colnames(cm) <- rownames(cm) <- paste("c",levels(rstDF[,"trainClass"]),sep="_")
knitr::kable(cm)</code></pre>
<table>
<thead>
<tr class="header">
<th></th>
<th align="right">c_11</th>
<th align="right">c_12</th>
<th align="right">c_13</th>
<th align="right">c_14</th>
<th align="right">c_16</th>
<th align="right">c_17</th>
<th align="right">c_2</th>
<th align="right">c_4</th>
<th align="right">c_5</th>
<th align="right">c_6</th>
<th align="right">c_8</th>
<th align="right">c_9</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td>c_11</td>
<td align="right">2442</td>
<td align="right">30</td>
<td align="right">2</td>
<td align="right">2</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">1</td>
<td align="right">1</td>
<td align="right">8</td>
<td align="right">0</td>
<td align="right">3</td>
</tr>
<tr class="even">
<td>c_12</td>
<td align="right">48</td>
<td align="right">330</td>
<td align="right">57</td>
<td align="right">23</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">1</td>
<td align="right">5</td>
<td align="right">24</td>
<td align="right">1</td>
<td align="right">7</td>
</tr>
<tr class="odd">
<td>c_13</td>
<td align="right">7</td>
<td align="right">43</td>
<td align="right">358</td>
<td align="right">97</td>
<td align="right">1</td>
<td align="right">1</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">2</td>
<td align="right">6</td>
<td align="right">4</td>
<td align="right">3</td>
</tr>
<tr class="even">
<td>c_14</td>
<td align="right">0</td>
<td align="right">20</td>
<td align="right">48</td>
<td align="right">2096</td>
<td align="right">1</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">1</td>
<td align="right">1</td>
<td align="right">11</td>
<td align="right">0</td>
<td align="right">5</td>
</tr>
<tr class="odd">
<td>c_16</td>
<td align="right">0</td>
<td align="right">3</td>
<td align="right">4</td>
<td align="right">29</td>
<td align="right">80</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">4</td>
<td align="right">4</td>
<td align="right">51</td>
<td align="right">5</td>
</tr>
<tr class="even">
<td>c_17</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">914</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
</tr>
<tr class="odd">
<td>c_2</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">4</td>
<td align="right">1</td>
<td align="right">0</td>
<td align="right">507</td>
<td align="right">6</td>
<td align="right">199</td>
<td align="right">23</td>
<td align="right">73</td>
<td align="right">1</td>
</tr>
<tr class="even">
<td>c_4</td>
<td align="right">2</td>
<td align="right">4</td>
<td align="right">6</td>
<td align="right">5</td>
<td align="right">2</td>
<td align="right">0</td>
<td align="right">2</td>
<td align="right">31</td>
<td align="right">114</td>
<td align="right">105</td>
<td align="right">16</td>
<td align="right">9</td>
</tr>
<tr class="odd">
<td>c_5</td>
<td align="right">5</td>
<td align="right">3</td>
<td align="right">3</td>
<td align="right">10</td>
<td align="right">1</td>
<td align="right">0</td>
<td align="right">171</td>
<td align="right">29</td>
<td align="right">619</td>
<td align="right">262</td>
<td align="right">87</td>
<td align="right">14</td>
</tr>
<tr class="even">
<td>c_6</td>
<td align="right">14</td>
<td align="right">19</td>
<td align="right">5</td>
<td align="right">4</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">1</td>
<td align="right">6</td>
<td align="right">142</td>
<td align="right">1753</td>
<td align="right">13</td>
<td align="right">38</td>
</tr>
<tr class="odd">
<td>c_8</td>
<td align="right">2</td>
<td align="right">7</td>
<td align="right">6</td>
<td align="right">15</td>
<td align="right">7</td>
<td align="right">3</td>
<td align="right">78</td>
<td align="right">7</td>
<td align="right">136</td>
<td align="right">35</td>
<td align="right">516</td>
<td align="right">13</td>
</tr>
<tr class="even">
<td>c_9</td>
<td align="right">3</td>
<td align="right">9</td>
<td align="right">10</td>
<td align="right">17</td>
<td align="right">1</td>
<td align="right">0</td>
<td align="right">0</td>
<td align="right">2</td>
<td align="right">10</td>
<td align="right">190</td>
<td align="right">5</td>
<td align="right">104</td>
</tr>
</tbody>
</table>
<p>Since we obtained one classified map for each round we can pull all that information by ensembling it together through a majority vote (i.e., calculating the modal class). The <code>raster</code> package <code>modal</code> function makes it really easy to calculate this:</p>
<pre class="r"><code>rstModalClass <- modal(rstPredClass)
rstModalClassFreq <- modal(rstPredClass, freq=TRUE)
medFreq <- zonal(rstModalClassFreq, rstTrain, fun=median)</code></pre>
<p>Using the modal frequency of the 20 classification rounds, let’s check out which classes obtained the highest ‘uncertainty’:</p>
<pre class="r"><code>colnames(medFreq) <- c("ClassCode","MedianModalFrequency")
medFreq[order(medFreq[,2],decreasing = TRUE),]</code></pre>
<pre><code>## ClassCode MedianModalFrequency
## [1,] 6 20
## [2,] 11 20
## [3,] 12 20
## [4,] 14 20
## [5,] 17 20
## [6,] 2 19
## [7,] 8 19
## [8,] 13 19
## [9,] 5 15
## [10,] 16 14
## [11,] 9 13
## [12,] 4 11</code></pre>
<p>These results somewhat confirm those of the class-wise precision/recall. Classes most often shifting (lower frequency) are those with lower values for these performance measures.</p>
<p>Finally, let’s plot our results of the the final modal classification map (and modal frequency):</p>
<pre class="r"><code>par(mfrow=c(1,2), cex.main=0.8, cex.axis=0.8)
plot(rstModalClass, main = "RF modal land cover class")
plot(rstModalClassFreq, main = "Modal frequency")</code></pre>
<div class="figure">
<img src="https://raw.githubusercontent.com/joaofgoncalves/R_exercises_raster_tutorial/master/img/P8_plot_results-1.png" />
</div>
<p>The map on the right provides some insight to identify areas that are more problematic for the classification process. However, as you can see, the results are acceptable but improvable in many ways.</p>
<p>This concludes our exploration of the raster package and supervised classification for this post. Hope you find it useful! 😄 👍 👍</p>
</div>
</div>
</div>
<script>
// add bootstrap table styles to pandoc tables
function bootstrapStylePandocTables() {
$('tr.header').parent('thead').parent('table').addClass('table table-condensed');
}
$(document).ready(function () {
bootstrapStylePandocTables();
});
</script>
<!-- dynamically load mathjax for compatibility with self-contained -->
<script>
(function () {
var script = document.createElement("script");
script.type = "text/javascript";
script.src = "https://mathjax.rstudio.com/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML";
document.getElementsByTagName("head")[0].appendChild(script);
})();
</script>
</body>
</html>