* addresses #177 & #49 & #47 for winsorizing based on the MAD
* forgot to push updated documentation
* new argument "method", updated NEWS, resolved failed test, #179
* update winsorize.numeric
  - added raw method
  - made the code easier to maintain by modularizing it
  - made doc more explicit about the methods
  - updated examples to visualize the effect
  - update NEWS
* minor modifications to docs
* removed tidyr from Suggests, replaced `tidyr::pivot_longer` with `datawizard::data_to_long` in vignette
* added new tests for new winsorization methods, insight::format_message(), data[] <- lapply...
Co-authored-by: RemPsyc <[email protected]>
Co-authored-by: Mattan S. Ben-Shachar <[email protected]>
NEWS.md: 1 addition & 0 deletions
@@ -19,6 +19,7 @@ CHANGES
 
 * Some of the text formatting helpers (like `text_concatenate()`) gain an
 `enclose` argument, to wrap text elements with surrounding characters.
+*`winsorize` now accepts "raw" and "zscore" methods (in addition to "percentile"). Additionally, when `robust` is set to `TRUE` together with `method = "zscore"`, winsorizes via the median and median absolute deviation (MAD); else via the mean and standard deviation. (@rempsyc, #177, #49, #47).
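To illustrate what the `robust` switch in this NEWS entry changes, here is a minimal base-R sketch of z-score bounds. It is not the package's code: `zscore_bounds()` and `clamp()` are hypothetical helpers, and the use of `stats::mad()` with its default scaling constant (1.4826) is an assumption about how the MAD is computed.

```r
# Hypothetical helper (not from the package): bounds for z-score winsorization.
zscore_bounds <- function(x, threshold = 3, robust = FALSE) {
  if (robust) {
    centre <- stats::median(x, na.rm = TRUE)
    spread <- stats::mad(x, na.rm = TRUE) # assumption: default constant 1.4826
  } else {
    centre <- mean(x, na.rm = TRUE)
    spread <- stats::sd(x, na.rm = TRUE)
  }
  c(lower = centre - threshold * spread, upper = centre + threshold * spread)
}

# Hypothetical helper: pull values outside the bounds back onto them.
clamp <- function(x, bounds) {
  pmin(pmax(x, bounds[["lower"]]), bounds[["upper"]])
}

x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 100) # one obvious outlier

bounds_classic <- zscore_bounds(x, threshold = 2, robust = FALSE)
bounds_robust <- zscore_bounds(x, threshold = 2, robust = TRUE)

# The outlier inflates the mean and SD, so the classic upper bound stays far
# from the bulk of the data; the median/MAD bound is much tighter.
max(clamp(x, bounds_classic))
max(clamp(x, bounds_robust))
```

With a single extreme value, the mean/SD bounds are themselves pulled toward the outlier, which is the usual motivation for a median/MAD variant.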
R/winsorize.R: 76 additions & 18 deletions
@@ -17,13 +17,33 @@
 #' A dataframe with winsorized columns or a winsorized vector.
 #'
 #' @param data Dataframe or vector.
-#' @param threshold The amount of winsorization.
+#' @param threshold The amount of winsorization, depends on the value of `method`:
+#' - For `method = "percentile"`: the amount to winsorize from *each* tail.
+#' - For `method = "zscore"`: the number of *SD*/*MAD*-deviations from the *mean*/*median* (see `robust`)
+#' - For `method = "raw"`: a vector of length 2 with the lower and upper bound for winsorization.
 #' @param verbose Toggle warnings.
+#' @param method One of "percentile" (default), "zscore", or "raw".
+#' @param robust Logical, if TRUE, winsorizing through the "zscore" method is done via the median and the median absolute deviation (MAD); if FALSE, via the mean and the standard deviation.
 #' @param ... Currently not used.
 #'
 #' @examples
-#' winsorize(iris$Sepal.Length, threshold = 0.2)
+#' hist(iris$Sepal.Length, main = "Original data")
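The documented meaning of `threshold` for the "percentile" and "raw" methods can be sketched in base R as follows. This is an illustration under stated assumptions, not the code in `R/winsorize.R`: `winsorize_sketch()` is a hypothetical name, and using `stats::quantile()` (default type) to find the percentile bounds is an assumption, so the package may compute its cut-offs differently. The "zscore" method is left out to keep the sketch short.

```r
# Hypothetical sketch (not the package implementation).
winsorize_sketch <- function(x, threshold = 0.2, method = "percentile") {
  method <- match.arg(method, c("percentile", "raw"))
  bounds <- switch(method,
    # "percentile": `threshold` is the share winsorized from *each* tail.
    # Assumption: bounds come from stats::quantile() with its default type.
    percentile = stats::quantile(
      x,
      probs = c(threshold, 1 - threshold),
      na.rm = TRUE,
      names = FALSE
    ),
    # "raw": `threshold` is already c(lower, upper) on the data's own scale.
    raw = {
      stopifnot(length(threshold) == 2)
      threshold
    }
  )
  # Values below the lower bound or above the upper bound are set to the bound.
  pmin(pmax(x, bounds[1]), bounds[2])
}

x <- as.numeric(1:10)
range(winsorize_sketch(x, threshold = 0.2, method = "percentile"))
range(winsorize_sketch(x, threshold = c(3, 7), method = "raw"))
```

In both cases values are replaced rather than dropped, so the result has the same length as the input.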