-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathpost_explain_BERT_LIME.html
316 lines (287 loc) · 24.7 KB
/
post_explain_BERT_LIME.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
<!DOCTYPE html>
<html lang="en" dir="ltr">
<head>
<link rel="apple-touch-icon" sizes="180x180" href="images/favicon_io_BERT/apple-touch-icon.png">
<link rel="icon" type="image/png" sizes="32x32" href="images/favicon_io_BERT/favicon-32x32.png">
<link rel="icon" type="image/png" sizes="16x16" href="images/favicon_io_BERT/favicon-16x16.png">
<link rel="manifest" href="/site.webmanifest">
<meta charset="utf-8">
<meta name="author" content="Noor Aldeen Almusleh">
<meta name="keywords" content="Bidirectional Encoder Representation from Transformers, BERT, Kaggle, Tweets Classification, Local Interpretabale Model-Agnostic Explanations, LIME">
<meta name="description" content="Explain Bidirectional Encoder Representation from Transformers (BERT) predictions of Kaggle's competition (tweets classification) using Local Interpretabale Model-Agnostic Explanations (LIME)">
<title>Explaining BERT Model with LIME</title>
<!-- Google fonts -->
<link href="https://fonts.googleapis.com/css2?family=Montserrat:wght@900&family=Ubuntu&display=swap" rel="stylesheet">
<!-- CSS Stylesheets -->
<link href="https://cdn.jsdelivr.net/npm/[email protected]/dist/css/bootstrap.min.css" rel="stylesheet" integrity="sha384-0evHe/X+R7YkIZDRvuzKMRqM+OrBnVFBL6DOitfPri4tjfHxaWutUpFmBp4vmVor" crossorigin="anonymous">
<link rel="stylesheet" href="CSS/styles-posts.css">
<!-- Font Awesome -->
<script src="https://kit.fontawesome.com/ec35dfd770.js" crossorigin="anonymous"></script>
<!-- Bootstrap Scripts -->
<script src="https://cdn.jsdelivr.net/npm/[email protected]/dist/js/bootstrap.min.js" integrity="sha384-kjU+l4N0Yf4ZOJErLsIcvOU2qSb74wXpOhqTvwVx3OElZRweTnQ6d31fXEoRD1Jy" crossorigin="anonymous"></script>
<!-- MathJax scripts -->
<script type="text/x-mathjax-config">
MathJax.Hub.Config({tex2jax: {inlineMath: [['$','$'], ['\\(','\\)'], ['$$','$$']]}});
</script>
<script src="https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML" type="text/javascript"></script>
<!-- <script type="text/javascript"
src="http://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.1/MathJax.js?config=TeX-AMS-MML_HTMLorMML">
</script> -->
<meta name="viewport" content="width=device-width, initial-scale=1.0">
</head>
<body>
<header class="pb-3 mb-4 border-bottom">
<a href="index.html" class="d-flex align-items-center text-dark text-decoration-none">
<svg xmlns="#" width="0" height="32git" class="me-2" viewBox="0 0 118 94" role="img"><title>Noor Aldeen</title><path fill-rule="evenodd" clip-rule="evenodd" fill="currentColor"></path></svg>
<span class="fs-4">Noor Aldeen</span>
</a>
</header>
<h2>Explaining BERT Model Predictions with LIME</h2>
<hr class="bg-dark border-2 border-top border-dark">
<div class="row">
<div class="col-lg-2 col-md-3 col-sm-12">
<nav id="navbar-example3" class="h-100 flex-column align-items-stretch pe-4 border-end">
<nav class="nav nav-pills bd-sidebar flex-column">
<a class="nav-link" href="#Overview">Overview</a>
<a class="nav-link" href="#Introduction">Introduction</a>
<a class="nav-link" href="#Model-Architecture">Model Architecture</a>
<a class="nav-link" href="#Explain-Sample-Tweets">Explain Sample Tweets</a>
<a class="nav-link" href="#Conclusion">Conclusion</a>
</nav>
<hr style="border-top: dotted 1px;" />
</nav>
</div>
<div class="col-lg-10 col-md-9 col-sm-12">
<div data-bs-spy="scroll" data-bs-target="#navbar-example3" data-bs-smooth-scroll="true" class="scrollspy-example-2" tabindex="0">
<div id="Overview">
<h4>Overview</h4>
<p>
This page aims to demonstrate the capabilities of the well-known <strong>Bidirectional Encoder Representation from Transformers (BERT)</strong> model with two Bidirectional-LSTM layers, a fully connected layer, a drop out layer, and a classification layer for a classification task, specifically the popular <a href="https://www.kaggle.com/">Kaggle</a> Competition <a href="https://www.kaggle.com/c/nlp-getting-started">Natural Language Processing with Disaster Tweets</a>. The use of this architecture resulted in achieving a top 4% score on the leaderboard. To further illustrate the reliability of this model's predictions, we will present 10 randomly selected predictions, and explain their how did the model decided to their class using <strong>Local Interpretable Model-Agnostic Explanations (LIME)</strong>.
</p>
</div>
<div id="Introduction">
<h4>Introduction</h4>
<p>
In the pursuit of finding a model that achieved a top 4% score for the Kaggle's Natural Language Processing with Disaster Tweets competition, we conducted a search. The dataset utilized in this competition is a CSV file containing an <kbd>id</kbd> column, which serves as a unique identifier for each tweet, a <kbd>text</kbd> column with the tweet's text, and several other columns not utilized in the training process. Further information regarding the dataset can be found in the <a href="https://www.kaggle.com/competitions/nlp-getting-started/data">"Data" section</a> of the <a href="https://www.kaggle.com/competitions/nlp-getting-started">competition's page</a>. It is worth noting that the discussion on this page pertains specifically to the training dataset.
<br>
<br>
The current dataset consists of 7613 tweets, with a range in length from 1-31 words, and a character count range of 7-157 characters. These tweets also may contain hashtags, mentions, and links. The tweets are classified into two categories: <em>real disaster</em> and <em>not real disaster</em>, with the distribution as follows:
<br>
<figure class="figure">
<img class="figure-img img-fluid rounded" src="images/explain_BERT/tweets-class-distribution.png" alt="tweets-class-distribution">
<figcaption class="figure-caption text-center">Figure 1: Class distribution across dataset.</figcaption>
</figure>
<br>
Below, we can see the n-grams for both classes. An N-gram is a group of N consecutive words (three in <a href="#fig2">Figure 2</a>) that can be either a long chunk of text or a shorter collection of syllables. N-Gram models utilize sequence data as input and generate a probability distribution of all potential items. From this distribution, a prediction is made based on the likelihood of each item. In addition to next-word prediction, N-Grams can also be utilized in language identification, information retrieval, and DNA sequencing predictions.
<br>
<figure class="figure" id="fig2">
<div class="container">
<div class="row">
<div class="col-lg-6 col-md-12">
<img class="figure-img img-fluid rounded" src="images/explain_BERT/target_0.png" alt="n-gram-for-class-0">
</div>
<div class="col-lg-6 col-md-12">
<img class="figure-img img-fluid rounded" src="images/explain_BERT/target_1.png" alt="n-gram-for-class-0">
</div>
</div>
</div>
<figcaption class="figure-caption text-center">Figure 2: An N-Gram for both classes in the dataset is presented. The purpose of this N-Gram is to demonstrate the appearance of a phrase in each class. </figcaption>
</figure>
<br>
To classify the dataset, we employed the use of Bidirectional Encoder Representations from Transformers (BERT). BERT is a model that has been designed to achieve state-of-the-art results upon fine-tuning on various tasks including classification, question answering, and language inference. In the design of a language model, it is crucial to consider the tasks the model will be attempting to perform. In the case of BERT, the model is attempting to predict masked tokens as well as determining whether sentence B subsequent to sentence A. For a more comprehensive understanding of this model, one can refer to the original research <a href="https://arxiv.org/abs/1810.04805">paper</a>.
</p>
</div>
<div id="Model-Architecture">
<h4>Model Architecture</h4>
<p>
The pre-trained BERT model can be fine-tuned with the addition of a single output layer. However, in our case, a more complex architecture was chosen. First, the text was tokenized and masked as described in the <a href="https://arxiv.org/abs/1810.04805">original paper</a>, and then the sequence output of the model was captured and fed into a bidirectional-LSTM layer with 1024 units and a dropout rate of 90%. This layer produced a sequence output, which was then fed into another bidirectional-LSTM layer with the same hyperparameters. The pooled output of the later layer was passed through a dense layer with 64 hidden units and a ReLU activation function, followed by a dropout layer with a rate of 20%. Finally, the weights were passed to the output layer, which contained two units and utilized the Softmax activation function.
<br>
<br>
For this model, the uncased large BERT model from <a href="https://tfhub.dev/tensorflow/bert_en_uncased_L-24_H-1024_A-16/4">TensorFlow Hub</a> was utilized, featuring 24 Transformer blocks, a hidden size of 1024, and 16 Attention Heads. Additionally, the output layer used two units and a Softmax activation function rather than a single unit and a Sigmoid activation function, allowing for the use of the model's predict method in <a href="https://arxiv.org/abs/1602.04938">Local Interpretable Model-Agnostic Explanations (LIME)</a>.
</p>
</div>
<div id="Explain-Sample-Tweets">
<h4>Explain Sample Tweets</h4>
<p>
The subsequent carousel displays a number of randomly selected samples in which the model attempts to predict the class of the tweets. For each sample, the original tweet, the top words that influenced the model's decision, the predicted and true labels of the tweet are presented.
</p>
<div id="carouselExampleIndicators" class="carousel slide" data-bs-ride="true">
<div class="carousel-indicators">
<button type="button" data-bs-target="#carouselExampleIndicators" data-bs-slide-to="0" class="active" aria-current="true" aria-label="Slide 1"></button>
<button type="button" data-bs-target="#carouselExampleIndicators" data-bs-slide-to="1" aria-label="Slide 2"></button>
<button type="button" data-bs-target="#carouselExampleIndicators" data-bs-slide-to="2" aria-label="Slide 3"></button>
<button type="button" data-bs-target="#carouselExampleIndicators" data-bs-slide-to="3" aria-label="Slide 4"></button>
<button type="button" data-bs-target="#carouselExampleIndicators" data-bs-slide-to="4" aria-label="Slide 5"></button>
<button type="button" data-bs-target="#carouselExampleIndicators" data-bs-slide-to="5" aria-label="Slide 6"></button>
<button type="button" data-bs-target="#carouselExampleIndicators" data-bs-slide-to="6" aria-label="Slide 7"></button>
<button type="button" data-bs-target="#carouselExampleIndicators" data-bs-slide-to="7" aria-label="Slide 8"></button>
<button type="button" data-bs-target="#carouselExampleIndicators" data-bs-slide-to="8" aria-label="Slide 9"></button>
<button type="button" data-bs-target="#carouselExampleIndicators" data-bs-slide-to="9" aria-label="Slide 10"></button>
<button type="button" data-bs-target="#carouselExampleIndicators" data-bs-slide-to="10" aria-label="Slide 1"></button>
</div>
<div class="carousel-inner">
<div class="carousel-item active">
<img src="images/explain_BERT/tweet1.png" class="d-block LIME-tweet" alt="example-tweet">
<h6 class="lead">Tweet: Another Mac vuln!\n\nhttps://t.co/OxXRnaB8Un</h6>
<p>
It appears that the model classified this tweet due to the presence of phrases such as "vuln", and the a few latters in the link; "OxXRnaB8Un" which influenced the decision-making process. Additionally, the model's prediction appears to have been made with some uncertainty, as indicated by the presence of words such as "Another", and "Mac".
</p>
<div>
<div class="alert alert-warning alert-dismissible fade show" role="alert">
<strong>Predicted Class:</strong> Not real disaster
<br>
<strong>True Class:</strong> Not real disaster
</div>
</div>
</div>
<div class="carousel-item">
<img src="images/explain_BERT/tweet2.png" class="d-block LIME-tweet" alt="example-tweet">
<h6 class="lead">Tweet: Pic of 16yr old PKK suicide bomber who detonated bomb in Turkey Army trench released http://t.co/FVXHoPdf3W</h6>
<p>
It appears that the model classified this tweet correctly with 100% certainty.
</p>
<div>
<div class="alert alert-warning alert-dismissible fade show" role="alert">
<strong>Predicted Class:</strong> Disaster
<br>
<strong>True Class:</strong> Disaster
</div>
</div>
</div>
<div class="carousel-item">
<img src="images/explain_BERT/tweet3.png" class="d-block LIME-tweet" alt="example-tweet">
<h6 class="lead">Tweet: I'm on 2 blood pressure meds and it's still probably through the roof! Long before the #PPact story broke I was involved in animal rescue</h6>
<p>
It appears that the model classified this tweet correctly due to the presence of phrases such as "rescue", "roof", and "involved" which influenced the decision-making process. Additionally, the model's prediction appears to have been made with some uncertainty, as indicated by the presence of words such as "animal", and "I".
</p>
<div>
<div class="alert alert-warning alert-dismissible fade show" role="alert">
<strong>Predicted Class:</strong> Not real disaster
<br>
<strong>True Class:</strong> Not real disaster
</div>
</div>
</div>
<div class="carousel-item">
<img src="images/explain_BERT/tweet4.png" class="d-block LIME-tweet" alt="example-tweet">
<h6 class="lead">Tweet: @realDonaldTrump to obliterate notion & pattern political style seemly lead voter beliefs Sarcasm Narcissism #RichHomeyDon #Swag #USA #LIKES</h6>
<p>
It appears that the model classified this tweet correctly due to the presence of phrases such as "political", "obliterate", and "beliefs" which influenced the decision-making process. Additionally, the model's prediction appears to have been made with some uncertainty, as indicated by the presence of words such as "LIKES", and the mention "realDonaldTrump".
</p>
<div>
<div class="alert alert-warning alert-dismissible fade show" role="alert">
<strong>Predicted Class:</strong> Not real disaster
<br>
<strong>True Class:</strong> Not real disaster
</div>
</div>
</div>
<div class="carousel-item">
<img src="images/explain_BERT/tweet6.png" class="d-block LIME-tweet" alt="example-tweet">
<h6 class="lead">Tweet: Medieval airplane hijacker hard shell: casting the star deviating: dYxTmrYDu</h6>
<p>
It appears that the model has misclassified this tweet due to the presence of phrases such as "star", "hard", and "casting" which influenced the decision-making process. Additionally, the model's prediction appears to have been made with some uncertainty, as indicated by the presence of words such as "hijacker", and "airplane".
</p>
<div>
<div class="alert alert-warning alert-dismissible fade show" role="alert">
<strong>Predicted Class:</strong> Not real disaster
<br>
<strong>True Class:</strong> Disaster
</div>
</div>
</div>
<div class="carousel-item">
<img src="images/explain_BERT/tweet7.png" class="d-block LIME-tweet" alt="example-tweet">
<h6 class="lead">Tweet: # handbags Genuine Mulberry Antony Cross Body Messenger Bag Dark Oak Soft Buffalo Leather: å£279.00End Date: W... http://t.co/FTM4RKl8mN</h6>
<p>
It appears that the model classified this tweet correctly due to the presence of phrases such as "W", "Cross", and "Buffalo" which influenced the decision-making process. Additionally, the model's prediction appears to have been made with some uncertainty, as indicated by the presence of other phrases such as "handbags", and "Soft".
</p>
<div>
<div class="alert alert-warning alert-dismissible fade show" role="alert">
<strong>Predicted Class:</strong> Not real disaster
<br>
<strong>True Class:</strong> Not real disaster
</div>
</div>
</div>
<div class="carousel-item">
<img src="images/explain_BERT/tweet7.png" class="d-block LIME-tweet" alt="example-tweet">
<h6 class="lead">Tweet: 320 [IR] ICEMOON [AFTERSHOCK] | http://t.co/M4JDZMGJoW | @djicemoon | #Dubstep #TrapMusic #DnB #EDM #Dance #Ices\x89Û_ http://t.co/n0uhAsfkBv</h6>
<p>
It appears that the model classified this tweet correctly due to the presence of phrases such as "AFTERSHOCK", "320", and the links which influenced the decision-making process. Additionally, the model's prediction appears to have been made with some uncertainty, as indicated by the presence of phrases such as "Dance", and "TrapMusic".
</p>
<div>
<div class="alert alert-warning alert-dismissible fade show" role="alert">
<strong>Predicted Class:</strong> Not real disaster
<br>
<strong>True Class:</strong> Not real disaster
</div>
</div>
</div>
<div class="carousel-item">
<img src="images/explain_BERT/tweet8.png" class="d-block LIME-tweet" alt="example-tweet">
<h6 class="lead">Tweet: Just came back from camping and returned with a new song which gets recorded tomorrow. Can't wait! #Desolation #TheConspiracyTheory #NewEP</h6>
<p>
It appears that the model misclassified this tweet due to the presence of phrases such as "song", "wait", and "Just" which influenced the decision-making process. Additionally, the model's prediction appears to have been made with some uncertainty, as indicated by the presence of words such as "camping", and the hashtags. <em>Notice however, reading the tweet, it does not sound like a disaster, and thus we believe the model classified it correctly, and the label is incorrect</em>.
</p>
<div>
<div class="alert alert-warning alert-dismissible fade show" role="alert">
<strong>Predicted Class:</strong> Not real disaster
<br>
<strong>True Class:</strong> Disaster
</div>
</div>
</div>
<div class="carousel-item">
<img src="images/explain_BERT/tweet9.png" class="d-block LIME-tweet" alt="example-tweet">
<h6 class="lead">Tweet: Evacuation order lifted for town of Roosevelt - Washington Times http://t.co/Kue48Nmjxh</h6>
<p>
It appears that the model classified this tweet correctly mostly due to the presence of the phrases such as "Evacuation" which influenced the decision-making process.
</p>
<div>
<div class="alert alert-warning alert-dismissible fade show" role="alert">
<strong>Predicted Class:</strong> Disaster
<br>
<strong>True Class:</strong> Disaster
</div>
</div>
</div>
<div class="carousel-item">
<img src="images/explain_BERT/tweet10.png" class="d-block LIME-tweet" alt="example-tweet">
<h6 class="lead">Tweet: And that's because on my planet it's the lone audience of the apocalypse!</h6>
<p>
It appears that the model classified this tweet correctly due to the presence of phrases such as "apocalypse", "planet", and "lone" which influenced the decision-making process. Additionally, the model's prediction appears to have been made with some uncertainty, as indicated by the presence of words such as "audience", and "of".
</p>
<div>
<div class="alert alert-warning alert-dismissible fade show" role="alert">
<strong>Predicted Class:</strong> Not real disaster
<br>
<strong>True Class:</strong> Not real disaster
</div>
</div>
</div>
</div>
<button class="carousel-control-prev" type="button" data-bs-target="#carouselExampleIndicators" data-bs-slide="prev">
<span class="carousel-control-prev-icon" aria-hidden="true"></span>
<span class="visually-hidden">Previous</span>
</button>
<button class="carousel-control-next" type="button" data-bs-target="#carouselExampleIndicators" data-bs-slide="next">
<span class="carousel-control-next-icon" aria-hidden="true"></span>
<span class="visually-hidden">Next</span>
</button>
</div>
</div>
<div id="Conclusion">
<h4>Conclusion</h4>
<p>
Upon analyzing the model's predictions on a selection of randomly chosen tweets, we can consider this model to be a reliable tool for this task. To further increase the reliability of the model, we recommend removing URLs from the tweets as a potential strategy.
</p>
</div>
</div>
</div>
</div>
<footer class="pt-3 mt-4 text-muted border-top footer">
© 2023 Noor Almusleh
</footer>
</body>
</html>