Merge pull request #125 from nutriverse/dev

candidate release v0.2.1
nutriverse · Dec 7, 2024 · 08ae58a
2 parents e447f17 + 67b3300
commit 08ae58a

Showing 193 changed files with 2,302 additions and 31,597 deletions.
diff --git a/.github/workflows/R-CMD-check.yaml b/.github/workflows/R-CMD-check.yaml
@@ -4,7 +4,7 @@ on:
   push:
     branches: [main, master]
   pull_request:
-    branches: [main, master]
+    branches: [main, master, dev]
 
 name: R-CMD-check
 

diff --git a/.gitignore b/.gitignore
@@ -7,7 +7,7 @@
 
 inst/doc
 docs
-/doc/
+/docs/
 /Meta/
 
 /README_files/
@@ -17,3 +17,6 @@ docs
 /vignettes/sample_size_files/
 
 /.quarto/
+
+
+data-raw/ENA_generated_zscores.csv
diff --git a/DESCRIPTION b/DESCRIPTION
@@ -1,9 +1,11 @@
 Type: Package
 Package: mwana
-Title: An Efficient Workflow for Plausibility Checks and Prevalence Analysis of Wasting in R
-Version: 0.2.0
+Title: An Efficient Workflow for Plausibility Checks and Prevalence Analysis of
+    Wasting in R
+Version: 0.2.1
 Authors@R: c(
-    person("Tomás", "Zaba", , "[email protected]", role = c("aut", "cre", "cph"),
+    person("Tomás", "Zaba", , "[email protected]", 
+           role = c("aut", "cre", "cph"),
            comment = c(ORCID = "0000-0002-7079-3574")),
     person("Ernest", "Guevarra", role = c("aut", "cph"),
            comment = c(ORCID = "0000-0002-4887-4415"))
@@ -13,7 +15,8 @@ Description: A simple and streamlined workflow for plausibility checks and
   Monitoring and Assessment of Relief and Transition (SMART) Methodology
   <https://smartmethodology.org/>, with application in R.
 License: GPL (>= 3)
-URL: https://github.com/nutriverse/mwana, https://nutriverse.io/mwana
+Depends: 
+    R (>= 4.1)
 Imports: 
     dplyr (>= 1.1.4),
     lubridate,
@@ -31,12 +34,12 @@ Suggests:
     quarto,
     spelling,
     testthat (>= 3.0.0)
-Config/testthat/edition: 3
 Encoding: UTF-8
 Language: en-GB
-Roxygen: list(markdown = TRUE)
-RoxygenNote: 7.3.2
-Depends: 
-    R (>= 4.1)
 LazyData: true
+RoxygenNote: 7.3.2
+Roxygen: list(markdown = TRUE)
+URL: https://github.com/nutriverse/mwana, https://nutriverse.io/mwana
+BugReports: https://github.com/nutriverse/mwana/issues
 VignetteBuilder: quarto
+Config/testthat/edition: 3
diff --git a/NEWS.md b/NEWS.md
@@ -1,4 +1,14 @@
-# mwana v0.2.0.9000 (development version)
+# mwana 0.2.1
+
+## General updates
+
+* Updated documentation in README, data documentation, function documentation, and vignettes to improve grammar, coherence, and consistency;
+
+* Enforced use of `::` to state external package dependencies;
+
+* Ensured that a code sequence started with a function statement rather than a data.frame piped into a function;
+
+* simplified specific code syntax
 
 <br/>
 
@@ -12,16 +22,16 @@ of wasting by MUAC from non survey data: screenings, sentinel sites, etc.
 ## Bug fixes
 
 * Resolved issues with `mw_neat_output_mfaz()`, `mw_neat_output_wfhz()` and 
-`mw_neat_output_muac()` not returning neat and tidy output for grouped `data.frame` 
-from their respective plausibility checkers.
+`mw_neat_output_muac()` not returning neat and tidy output for grouped `data.frame` from their respective plausibility checkers.
 
-* Resolved issue with `edema` argument in prevalence functions that was not working
-as expected when set to `NULL`.
+* Resolved issue with `edema` argument in prevalence functions that was not working as expected when set to `NULL`.
 
 ## General updates
 
 * Updated general package documentation, including references in vignettes. 
-* Built package using `R` version 4.4.2
+* Built package using R version 4.4.2
+
+<br/>
 
 # mwana v0.1.0
 

diff --git a/R/data.R b/R/data.R
@@ -10,17 +10,17 @@
 #'
 #' | **Variable** | **Description** |
 #' | :--- | :--- |
-#' | *area* | Location where the survey took place |
+#' | *area* | Survey location |
 #' | *dos* | Survey date |
 #' | *cluster* | Primary sampling unit |
 #' | *team* | Enumerator IDs |
-#' | *sex* | Sex, "m" = boys, "f" = girls |
+#' | *sex* | Sex; "m" = boys, "f" = girls |
 #' | *dob* | Date of birth |
 #' | *age* | Age in months, typically estimated using local event calendars |
-#' | *weight* | Weight (kg) |
-#' | *height* | Height (cm) |
-#' | *edema* | Edema, "n" = no, "y" = yes |
-#' | *muac* | Mid-upper arm circumference (mm) |
+#' | *weight* | Weight in kilograms |
+#' | *height* | Height in centimetres |
+#' | *edema* | Edema; "n" = no edema, "y" = with edema |
+#' | *muac* | Mid-upper arm circumference in millimetres |
 #'
 #' @source Anonymous
 #'
@@ -34,35 +34,35 @@
 #' A sample of an already wrangled survey data
 #'
 #' @description
-#' A household budget survey data conducted in Mozambique in
-#' 2019/2020, known as *IOF* (*Inquérito ao Orçamento Familiar* in Portuguese). *IOF*
-#' is a two-stage cluster-based survey, representative at province level (admin 2),
-#'  with probability of the selection of the clusters proportional to the size of
-#'  the population. Its data collection spans for a period of 12 months.
+#' A household budget survey data conducted in Mozambique in 2019/2020, known as
+#' *IOF* (*Inquérito ao Orçamento Familiar* in Portuguese). *IOF* is a two-stage 
+#' cluster-based survey, representative at province level (second administrative 
+#' level), with probability of the selection of the clusters proportional to the 
+#' size of the population. Its data collection spans for a period of 12 months.
 #'
 #' @format A tibble of 2,267 rows and 14 columns.
 #'
 #' |**Variable** | **Description** |
 #' | :--- | :---|
-#' | *province* | The administrative unit (admin 1) where data was collected. |
-#' | *strata* | Rural and Urban |
+#' | *province* | The administrative unit level 1 where data was collected |
+#' | *strata* | Rural or Urban |
 #' | *cluster* | Primary sampling unit |
-#' | *sex* | Sex, "m" = boys, "f" = girls |
-#' | *age* | calculated age in months with two decimal places |
-#' | *weight* | Weight (kg) |
-#' | *height* | Height (cm) |
-#' | *edema* | Edema, "n" = no, "y" = yes |
-#' | *muac* | Mid-upper arm circumference (mm) |
+#' | *sex* | Sex; "m" = boys, "f" = girls |
+#' | *age* | Calculated age in months with two decimal places |
+#' | *weight* | Weight in kilograms |
+#' | *height* | Height in centimetres |
+#' | *edema* | Edema; "n" = no edema, "y" = with edema |
+#' | *muac* | Mid-upper arm circumference in millimetres |
 #' | *wtfactor* | Survey weights |
 #' | *wfhz* | Weight-for-height z-scores with 3 decimal places |
-#' | *flag_wfhz* | Flagged observations. 1=flagged, 0=not flagged |
+#' | *flag_wfhz* | Flagged WFHZ value. 1 = flagged, 0 = not flagged |
 #' | *mfaz* | MUAC-for-age z-scores with 3 decimal places |
-#' | *flag_mfaz* | Flagged observations. 1=flagged, 0=not flagged |
+#' | *flag_mfaz* | Flagged MFAZ value. 1 = flagged, 0 = not flagged |
 #'
 #' @source Mozambique National Institute of Statistics. The data is publicly
 #' available at <https://mozdata.ine.gov.mz/index.php/catalog/88#metadata-data_access>.
 #' Data was wrangled using this package's wranglers. Details about survey design
-#' can be gotten from: <https://mozdata.ine.gov.mz/index.php/catalog/88#metadata-sampling>
+#' can be read from: <https://mozdata.ine.gov.mz/index.php/catalog/88#metadata-sampling>
 #'
 #' @examples
 #' anthro.02
@@ -75,28 +75,28 @@
 #'
 #' @description
 #' `anthro.03` contains survey data of four districts. Each district data set
-#' presents distinct data quality scenarios that requires tailored prevalence
-#' analysis approach: two districts show a problematic WFHZ standard deviation
-#' whilst the remaining are all within range.
+#' presents distinct data quality scenarios that require a specific prevalence
+#' analysis approach. Data from two districts have a problematic WFHZ standard 
+#' deviation. The data from the remaining two districts are all within range.
 #'
-#' This sample data is useful to demonstrate the use of the prevalence functions on
-#' a multiple-area survey data where there can be variations in the rating of
-#' acceptability of the standard deviation, hence require different analyses approaches
-#' for each area to ensure accurate estimation.
+#' This sample data is useful to demonstrate the use of the prevalence functions
+#' on a multiple-domain survey data where there can be variations in the rating
+#' of acceptability of the standard deviation, hence requiring different 
+#' analytical approach for each survey domain to ensure accurate estimation.
 #'
 #' @format A tibble of 943 x 9.
 #'
 #' |**Variable** | **Description** |
 #' | :--- | :---|
-#' | *district* | The location where data was collected |
+#' | *district* | Survey location |
 #' | *cluster* | Primary sampling unit |
 #' | *team* | Survey teams |
-#' | *sex* | Sex, "m" = boys, "f" = girls |
-#' | *age* | calculated age in months with two decimal places |
-#' | *weight* | Weight (kg) |
-#' | *height* | Height (cm) |
-#' | *edema* | Edema, "n" = no, "y" = yes |
-#' | *muac* | Mid-upper arm circumference (mm) |
+#' | *sex* | Sex; "m" = boys, "f" = girls |
+#' | *age* | Calculated age in months with two decimal places |
+#' | *weight* | Weight in kilograms |
+#' | *height* | Height in centimetres |
+#' | *edema* | Edema; "n" = no edema, "y" = with edema |
+#' | *muac* | Mid-upper arm circumference in millimetres |
 #'
 #' @source Anonymous
 #'
@@ -109,35 +109,36 @@
 
 #'
 #'
-#' A sample data of a community-based sentinel site from an anonymized location
+#' A sample data from a community-based sentinel site in an anonymized location
 #'
 #' @description
-#' Data was generated through a community-based sentinel site conducted
-#' across three provinces. Each province's data set presents distinct
-#' data quality scenarios, requiring tailored prevalence analysis:
-#'  + "Province 1" has MFAZ's standard deviation and age ratio test rating of
-#'  acceptability falling within range;
-#'  + "Province 2" has age ratio rated as problematic but with an acceptable
-#'  standard deviation of MFAZ;
-#'  + "Province 3" has both tests rated as problematic.
-#'
-#' This sample data is useful to demonstrate the use of prevalence functions on
-#' a multiple-area survey data where variations in the rating of acceptability of the
-#' standard deviation exist, hence require different analyses approaches for each
-#' area to ensure accurate estimation.
+#' Data was collected from community-based sentinel sites located across three 
+#' provinces. Each provincial data set presents distinct data quality scenarios, 
+#' requiring tailored prevalence analysis:
+#' 
+#' - *Province 1* has a MUAC-for-age z-score standard deviation and age ratio 
+#' test rating of acceptability falling within range
+#' - *Province 2* has age ratio rated as problematic but with an acceptable 
+#' standard deviation of MUAC-for-age z-score
+#' - *Province 3* has both tests rated as problematic
+#'
+#' This sample data is useful to demonstrate the use of the prevalence functions 
+#' on a multiple-domain survey data where variations in the rating of 
+#' acceptability of the standard deviation exist, hence require different 
+#' analytical approach for each domain to ensure accurate estimation.
 #'
 #' @format A tibble of 3,002 x 8.
 #'
 #' |**Variable** | **Description** |
 #' | :--- | :---|
-#' | *province* | location where data was collected |
+#' | *province* | Survey location |
 #' | *cluster* | Primary sampling unit |
-#' | *sex* | Sex, "m" = boys, "f" = girls |
-#' | *age* | calculated age in months with two decimal places |
-#' | *muac* | Mid-upper arm circumference (mm) |
-#' | *edema* | Edema, "n" = no, "y" = yes |
+#' | *sex* | Sex; "m" = boys, "f" = girls |
+#' | *age* | Calculated age in months with two decimal places |
+#' | *muac* | Mid-upper arm circumference in millimetres |
+#' | *edema* | Edema; "n" = no edema, "y" = with edema |
 #' | *mfaz* | MUAC-for-age z-scores with 3 decimal places |
-#' | *flag_mfaz* | Flagged observations. 1=flagged, 0=not flagged |
+#' | *flag_mfaz* | Flagged MUAC-for-age z-score value; 1 = flagged, 0 = not flagged |
 #'
 #' @source Anonymous
 #'
@@ -149,18 +150,19 @@
 
 
 #'
-#' A sample SMART survey data with WFHZ standard deviation rated as problematic
+#' A sample SMART survey data with weight-for-height z-score standard deviation 
+#' rated as problematic
 #'
 #' @format A tibble with 303 rows and 6 columns.
 #'
 #' | **Variable** | **Description** |
 #' | :--- | :---|
 #' | *cluster* | Primary sampling unit |
-#' | *sex* | Sex, "m" = boys, "f" = girls |
-#' | *age* | calculated age in months with two decimal places |
-#' | *edema* | Edema, "n" = no, "y" = yes |
+#' | *sex* | Sex; "m" = boys, "f" = girls |
+#' | *age* | Calculated age in months with two decimal places |
+#' | *edema* | Edema, "n" = no edema, "y" = with edema |
 #' | *wfhz* | Weight-for-height z-scores with 3 decimal places |
-#' | *flag_wfhz* | Flagged observations. 1=flagged, 0=not flagged |
+#' | *flag_wfhz* | Flagged weight-for-height z-score value; 1 = flagged, 0 = not flagged |
 #'
 #' @source Anonymous
 #'
@@ -172,16 +174,16 @@
 
 
 #'
-#' A sample MUAC screening data from an anonymized setting
+#' A sample mid-upper arm circumference (MUAC) screening data
 #'
 #' @format A tibble with 661 rows and 4 columns.
 #'
 #' |**Variable** | **Description** |
 #' | :--- | :---|
-#' | *sex* | Sex, "m" = boys, "f" = girls |
-#' | *months* | calculated age in months with two decimal places |
-#' | *edema* | Edema, "n" = no, "y" = yes |
-#' | *muac* | Mid-upper arm circumference (mm) |
+#' | *sex* | Sex; "m" = boys, "f" = girls |
+#' | *months* | Calculated age in months with two decimal places |
+#' | *edema* | Edema, "n" = no edema, "y" = with edema |
+#' | *muac* | Mid-upper arm circumference in millimetres |
 #'
 #' @source Anonymous
 #'
@@ -191,18 +193,18 @@
 "mfaz.01"
 
 #'
-#' A sample SMART survey data with MUAC
+#' A sample SMART survey data with mid-upper arm circumference measurements
 #'
 #' @format A tibble with 303 rows and 7 columns.
 #'
 #' |**Variable** | **Description** |
 #' | :--- | :---|
 #' | *cluster* | Primary sampling unit |
-#' | *sex* | Sex, "m" = boys, "f" = girls |
-#' | *age* | calculated age in months with two decimal places |
-#' | *edema* | Edema, "n" = no, "y" = yes |
+#' | *sex* | Sex; "m" = boys, "f" = girls |
+#' | *age* | Calculated age in months with two decimal places |
+#' | *edema* | Edema, "n" = no edema, "y" = with edema |
 #' | *mfaz* | MUAC-for-age z-scores with 3 decimal places |
-#' | *flag_mfaz* | Flagged observations. 1=flagged, 0=not flagged |
+#' | *flag_mfaz* | Flagged MUAC-for-age z-score value. 1 = flagged, 0 = not flagged |
 #'
 #' @source Anonymous
 #'