
Commit

Merge pull request #102 from m-clark/dev
read through update
m-clark authored Oct 14, 2024
2 parents 78adc40 + aa33cd6 commit 551b407
Showing 56 changed files with 9,172 additions and 3,235 deletions.
2 changes: 1 addition & 1 deletion .Rprofile
@@ -178,7 +178,7 @@ gt = function(..., decimals = 2, title = NULL, subtitle = NULL) {
gt::tab_style(
style = gt::cell_text(color = 'gray25'),
locations = gt::cells_body(
-columns = gt::vars( # TODO: update to c() or just drop vars
+columns = c( # TODO: update to c() or just drop vars
where(is.numeric)
)
)
8 changes: 4 additions & 4 deletions causal.qmd
@@ -396,7 +396,7 @@ As we can see from this simple demo, linear regression by itself cannot save us
[^biasedcause]: A reminder that a conclusion of 'no effect' is also a causal statement, and can be just as biased as any other statement. Also, you can come to the same *practical* conclusion with a biased estimate as with an unbiased one.


-:::{.callout type='note' title='Weighting and Sampling Methods' collapse='true'}
+:::{.callout-note title='Weighting and Sampling Methods' collapse='true'}
Common techniques for traditional statistical models used for causal inference include a variety of **weighting** or **sampling** methods. These methods adjust the data so that the treatment and control groups are more similar, allowing the treatment effect to be estimated more accurately. Sampling methods include techniques such as stratification and matching, which focus on the selection of the sample as a means to balance treatment and control groups. Weighting methods include inverse probability weighting and propensity score weighting, which adjust the weights of the observations to make the groups more similar.

These methods are not models themselves, and can potentially be used with just about any model that attempts to estimate the effect of a treatment. An excellent overview of using such methods vs. standard regression/ML can be found on Cross Validated (https://stats.stackexchange.com/a/544958).
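To make the weighting idea concrete, here is a minimal inverse probability weighting sketch in Python on simulated data. Everything here is illustrative rather than from the text: the data-generating process is invented, and the true propensity is used directly, whereas in practice it would be estimated (e.g., with logistic regression).

```python
# Illustrative IPW sketch on simulated data (all names/values are made up).
import numpy as np

rng = np.random.default_rng(42)
n = 5000

# A confounder that drives both treatment assignment and the outcome
x = rng.normal(size=n)
p_treat = 1 / (1 + np.exp(-x))                   # true propensity of treatment
treat = rng.binomial(1, p_treat)
y = 2.0 * treat + 3.0 * x + rng.normal(size=n)   # true treatment effect = 2

# Naive difference in means is confounded by x
naive = y[treat == 1].mean() - y[treat == 0].mean()

# IPW: weight each observation by the inverse probability of the treatment
# it actually received (true propensity here; estimated in practice)
w = np.where(treat == 1, 1 / p_treat, 1 / (1 - p_treat))
ipw = (np.average(y[treat == 1], weights=w[treat == 1])
       - np.average(y[treat == 0], weights=w[treat == 0]))
```

The naive difference in means absorbs the confounding from `x`, while the weighted estimate should land much closer to the true effect of 2.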
@@ -485,7 +485,7 @@ It is common in uplift modeling to distinguish certain types of individuals or i

We can generalize beyond the marketing context to just think about response to any treatment we might be interested in. It's worthwhile to think about which aspects of your data could correspond to these groups. One of the additional goals in uplift modeling is to identify persuadables for additional treatment efforts, and to avoid wasting money on the lost causes. But to get there, we have to think causally first!

-:::{.callout type='note' title='Uplift Modeling in R and Python' collapse='true'}
+:::{.callout-note title='Uplift Modeling in R and Python' collapse='true'}

There are more widely used tools for uplift modeling and meta-learners in Python than in R, but there are some options in R as well. In Python, you can check out [causalml](https://causalml.readthedocs.io/en/latest/index.html) and [scikit-uplift](https://www.uplift-modeling.com/en/v0.5.1/index.html) for some nice tutorials and documentation.
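As a rough sketch of the idea, rather than either library's actual API, a "two-model" (T-learner) approach can be put together with scikit-learn alone. The data and the rule defining "persuadables" below are invented for illustration.

```python
# Hypothetical T-learner ("two-model") uplift sketch; illustrative only,
# not the causalml or scikit-uplift API.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
n = 4000
X = rng.normal(size=(n, 3))
treat = rng.binomial(1, 0.5, size=n)          # randomized treatment
# Individual effect depends on the first feature:
# "persuadables" here are those with X[:, 0] > 0
tau = np.where(X[:, 0] > 0, 2.0, 0.0)
y = X[:, 1] + tau * treat + rng.normal(size=n)

# Fit separate outcome models on the treated and control groups
m1 = GradientBoostingRegressor().fit(X[treat == 1], y[treat == 1])
m0 = GradientBoostingRegressor().fit(X[treat == 0], y[treat == 0])

# Estimated uplift = predicted outcome under treatment minus under control
uplift = m1.predict(X) - m0.predict(X)
```

Individuals with high estimated uplift are candidates for the persuadable group, while those near zero look more like sure things or lost causes.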

@@ -649,7 +649,7 @@ Although it's often implied as such, *prediction is not just what we do with new
Here are some ways we might think about different modeling contexts:

- **Descriptive Analysis**: A description of data with no modeling focus. We'll use descriptive statistics and visualizations to understand the data. An end product may be an infographic or a report. Even here we may still use models to aid visualizations or otherwise to help us understand the data better.
-- **Exploratory Modeling**: Using models for exploration. Focus should be on both prediction and explanation. The former can help inform the strength of the results for future exploration.
+- **Exploratory Modeling**: When using models for exploration, focus should probably be on both prediction and explanation. The former can help inform the strength of the results for future exploration, while the latter will often provide useful insights.
- **Causal Modeling**: Using models to understand causal effects. The focus is on explanation, and on prediction for the current data. We may very well be interested in predictive performance as well, and often are in industry.
- **Generalization**: When our goal is generalizing to unseen data, the focus is always on predictive performance. This does not mean we can't use the model to understand the data, though, and explanation may be just as important.

@@ -692,6 +692,6 @@ We have only scratched the surface here, and there is a lot more to learn. Here



-## Exercise {#causal-exercise}
+## Guided Exploration {#causal-exercise}

If you look into causal modeling, you'll find mention of problematic covariates such as [colliders](https://en.wikipedia.org/wiki/Collider_(statistics)) or [confounders](https://en.wikipedia.org/wiki/Confounding). Can you think of a way to determine if something is a collider or confounder that would not involve a statistical approach or model?
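The exercise asks for a non-statistical answer, but for intuition about what makes a collider problematic, a quick simulation (purely illustrative, with made-up variables) shows how conditioning on one manufactures an association between otherwise independent variables:

```python
# Illustrative collider-bias simulation; x and y have no causal connection.
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# x and y are independent causes of a collider z
x = rng.normal(size=n)
y = rng.normal(size=n)
z = x + y + rng.normal(size=n)

# Marginally, x and y are uncorrelated...
r_all = np.corrcoef(x, y)[0, 1]

# ...but conditioning on the collider (e.g., selecting high z)
# induces a spurious negative association between them
sel = z > 1
r_sel = np.corrcoef(x[sel], y[sel])[0, 1]
```

The marginal correlation sits near zero, while within the selected subset a clearly negative correlation appears, even though `x` and `y` share no causal link.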
