Update evaluation.mdx
According to the definition given of Insertion and deletion operation, changed
practice-dump authored Aug 29, 2023
1 parent ab97af9 commit e035ff8
Showing 1 changed file with 7 additions and 7 deletions.
14 changes: 7 additions & 7 deletions chapters/en/chapter5/evaluation.mdx
@@ -36,20 +36,20 @@ insertions and deletions on the *word level*. This means errors are annotated on
| Reference: | the | cat | sat | on | the | mat |
|-------------|-----|-----|---------|-----|-----|-----|
| Prediction: | the | cat | **sit** | on | the | | |
-| Label: ||| S ||| D |
+| Label: ||| S ||| I |

Here, we have:
* 1 substitution ("sit" instead of "sat")
-* 0 insertions
-* 1 deletion ("mat" is missing)
+* 1 insertions ("mat" is missing)
+* 0 deletion

This gives 2 errors in total. To get our error rate, we divide the number of errors by the total number of words in our
reference (N), which for this example is 6:

$$
\begin{aligned}
WER &= \frac{S + I + D}{N} \\
-&= \frac{1 + 0 + 1}{6} \\
+&= \frac{1 + 1 + 0}{6} \\
&= 0.333
\end{aligned}
$$
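
The calculation in this hunk can be checked with a minimal Python sketch. It assumes plain whitespace tokenization and no text normalization, and the names `edit_distance` and `word_error_rate` are illustrative, not taken from the course or from any library. It computes only the total number of errors (S + I + D) as a word-level Levenshtein distance, so the result is the same whichever way the missing word "mat" is labelled.

```python
def edit_distance(reference, prediction):
    """Levenshtein distance between two sequences: the minimum S + I + D."""
    rows, cols = len(reference) + 1, len(prediction) + 1
    dist = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        dist[i][0] = i  # drop every remaining reference item
    for j in range(cols):
        dist[0][j] = j  # add every remaining prediction item
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if reference[i - 1] == prediction[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # reference item missing from prediction
                dist[i][j - 1] + 1,         # extra item in prediction
                dist[i - 1][j - 1] + cost,  # match (cost 0) or substitution (cost 1)
            )
    return dist[-1][-1]


def word_error_rate(reference, prediction):
    """Total word-level errors divided by the number of reference words (N)."""
    ref_words = reference.split()
    pred_words = prediction.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, pred_words) / len(ref_words)


wer = word_error_rate("the cat sat on the mat", "the cat sit on the")
print(round(wer, 3))  # 0.333
```

Because `edit_distance` works on any pair of sequences, passing lists of characters instead of lists of words gives a character-level count in the same way.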
@@ -116,17 +116,17 @@ individual characters, and annotate errors on a character-by-character basis:
| Reference: | t | h | e | | c | a | t | | s | a | t | | o | n | | t | h | e | | m | a | t |
|-------------|-----|-----|-----|-----|-----|-----|-----|-----|-----|-------|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|
| Prediction: | t | h | e | | c | a | t | | s | **i** | t | | o | n | | t | h | e | | | | |
-| Label: |||| |||| || S || ||| |||| | D | D | D |
+| Label: |||| |||| || S || ||| |||| | I | I | I |

We can see now that for the word "sit", the "s" and "t" are marked as correct. It's only the "i" which is labelled as a
substitution error (S). Thus, we reward our system for the partially correct prediction 🤝

-In our example, we have 1 character substitution, 0 insertions, and 3 deletions. In total, we have 14 characters. So, our CER is:
+In our example, we have 1 character substitution, 3 insertions, and 0 deletions. In total, we have 14 characters. So, our CER is:

$$
\begin{aligned}
CER &= \frac{S + I + D}{N} \\
-&= \frac{1 + 0 + 3}{14} \\
+&= \frac{1 + 3 + 0}{14} \\
&= 0.286
\end{aligned}
$$