validate.md
Lines changed: 33 additions & 33 deletions
@@ -31,11 +31,11 @@ This episode requires you to:
 In outbreak analysis, once you have completed the initial steps of reading and cleaning the case data,
 it's essential to establish an additional foundation layer to ensure the integrity and reliability of subsequent
 analyses. Otherwise you might find that your analysis suddenly stops working when specific variables appear or disappear, or their underlying data types (like `<date>` or `<chr>`) change. Specifically, this additional layer involves: 1) verifying the presence and correct data type of certain columns within
-your dataset, a process commonly referred to as "tagging"; 2) implementing measures to
-check that these tagged columns are not inadvertently deleted during further data processing steps, known as "validation".
+your dataset, a process commonly referred to as **tagging**; 2) implementing measures to
+check that these tagged columns are not inadvertently deleted during further data processing steps, known as **validation**.


-This episode focuses tagging and validate outbreak data using the [linelist](https://epiverse-trace.github.io/linelist/)
+This episode focuses on tagging and validate outbreak data using the [linelist](https://epiverse-trace.github.io/linelist/)
 package. Let's start by loading the package `{rio}` to read data and the package `{linelist}`
 to create a linelist object. We'll use the pipe `%>%` to connect some of their functions, including others from
 the package `{dplyr}`, so let's also call to the tidyverse package:
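
For orientation, the tagging and validation steps described in this hunk can be sketched roughly as follows. This is a minimal illustration, not the episode's own code: the `toy_cases` data frame and its values are invented here, and the column names are only borrowed from the output header shown further down in the diff. It assumes the `{linelist}` package is installed.

```r
library(linelist)

# Toy data invented for this sketch (NOT the episode's dataset)
toy_cases <- data.frame(
  case_id = c("c01", "c02", "c03"),
  age = c(34, 51, 8),
  gender = c("female", "male", "female"),
  date_onset = as.Date(c("2023-01-05", "2023-01-09", "2023-01-12"))
)

# Tagging: declare which columns hold the key epidemiological variables
toy_linelist <- linelist::make_linelist(
  x = toy_cases,
  id = "case_id",
  age = "age",
  gender = "gender",
  date_onset = "date_onset"
)

# Validation: check that the tagged columns exist and have an accepted type;
# an error is raised if, for example, `date_onset` had become a character column
validated <- linelist::validate_linelist(toy_linelist)

# Inspect which column is attached to each tag
linelist::tags(toy_linelist)
```

Keeping the tagged object separate from the raw data frame means later steps can rerun `validate_linelist()` after each transformation to catch a dropped or retyped column early.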
@@ -79,19 +79,19 @@ cleaned_data <- rio::import(


 ```output
-# A tibble: 15,000 × 10
-v1 case_id age gender status date_onset date_sample row_id