-
Notifications
You must be signed in to change notification settings - Fork 91
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
reordered - moved Session 10 to 5 instead,
added installation info to Preamble
- Loading branch information
Riinu Ots
committed
Jan 22, 2018
1 parent
1c40f26
commit db9ce4c
Showing
359 changed files
with
6,586 additions
and
4 deletions.
There are no files selected for viewing
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,4 @@ | ||
# (PART\*) Part I: Always plot your data first {-} | ||
# Your first R plots | ||
|
||
```{r setup, include=FALSE} | ||
|
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,203 @@ | ||
# (PART\*) Part I: Always plot your data first {-} | ||
# Your first R plots | ||
|
||
|
||
|
||
In this session, we will create 5 beautiful and colourful barplots in less than an hour. Do not worry about understanding every single word or symbol (e.g. the pipe - `%>%`) in the R code you are about to see. The purpose of this session is merely to | ||
|
||
* gain familiarity with the RStudio interface: | ||
+ to know what a script looks like, | ||
+ what is the Environment tab, | ||
+ where do your plots appear. | ||
|
||
## Data | ||
|
||
Load the example dataset which is already saved as an R-Data file (recognisable by the file extension .rda/.RData): | ||
|
||
|
||
```r | ||
library(ggplot2) | ||
|
||
source("1_source_theme.R") | ||
|
||
load("global_burden_disease_long.rda") | ||
#loads two dataframes - mydata and mydata2013 which is a subset of mydata | ||
``` | ||
|
||
After loading the datasets, investigate your Environment tab (top-right): | ||
|
||
|
||
Click on the name `mydata` and it will pop up next to where your script is. Clicking on the blue button is not as useful, but it doesn't do any harm either. Try it. | ||
|
||
## First plot | ||
|
||
|
||
|
||
|
||
```r | ||
mydata %>% #press Control-Shift-M to insert this symbol (pipe) | ||
ggplot(aes(x = year, | ||
y = deaths_millions, | ||
fill = cause, | ||
colour = cause)) + | ||
geom_col() | ||
``` | ||
|
||
![](01_first_interaction_files/figure-epub3/unnamed-chunk-2-1.png)<!-- --> | ||
|
||
|
||
`ggplot()` stands for **grammar of graphics plot** - a user friendly yet flexible alternative to `plot()`. | ||
|
||
`aes()` stands for **aesthetics** - things we can see. | ||
|
||
`geom_()` stands for **geometric**. | ||
|
||
### Question | ||
|
||
Why are there two closing brackets - `))` - after the last aesthetic (colour)? | ||
|
||
|
||
### Exercise | ||
|
||
Plot the number of deaths in Developed and Developing countries for the year 2013: | ||
|
||
![](01_first_interaction_files/figure-epub3/unnamed-chunk-3-1.png)<!-- --> | ||
|
||
|
||
\newpage | ||
|
||
## Comparing bars of different height | ||
|
||
### Stretch each bar to 100% | ||
|
||
|
||
`position="fill"` stretches the bars to show relative contributions: | ||
|
||
|
||
```r | ||
mydata2013 %>% | ||
ggplot(aes(x = location, | ||
y = deaths_millions, | ||
fill = cause, | ||
colour = cause)) + | ||
geom_col(position = "fill") | ||
``` | ||
|
||
![](01_first_interaction_files/figure-epub3/unnamed-chunk-4-1.png)<!-- --> | ||
|
||
|
||
### Plot each bar next to each other | ||
|
||
`position="dodge"` puts the different causes next to each rather (the default is `position="stack"`): | ||
|
||
|
||
```r | ||
mydata2013 %>% | ||
ggplot(aes(x = location, | ||
y = deaths_millions, | ||
fill = cause, | ||
colour = cause)) + | ||
geom_col(position = "dodge") | ||
``` | ||
|
||
![](01_first_interaction_files/figure-epub3/unnamed-chunk-5-1.png)<!-- --> | ||
|
||
|
||
## Facets (panels) | ||
|
||
Going back to the dataframe with all years (1990 -- 2015), add `facet_wrap(~year)` to plot all years at once: | ||
|
||
|
||
|
||
```r | ||
mydata %>% | ||
ggplot(aes(x = location, | ||
y = deaths_millions, | ||
fill = cause, | ||
colour = cause)) + | ||
geom_col() + | ||
facet_wrap(~year) | ||
``` | ||
|
||
![](01_first_interaction_files/figure-epub3/unnamed-chunk-6-1.png)<!-- --> | ||
|
||
|
||
## Extra: using aethetics outside of the aes() | ||
|
||
### Setting a constant fill | ||
|
||
Using the `mydata2013` example again, what does the addition of `fill = "black"` in this code do? Note that putting the `ggplot(aes())` code all on one line not affect the result. | ||
|
||
|
||
```r | ||
mydata2013 %>% | ||
ggplot(aes(x = location, y = deaths_millions, fill = cause, colour = cause)) + | ||
geom_col(fill = "black") | ||
``` | ||
|
||
![](01_first_interaction_files/figure-epub3/unnamed-chunk-7-1.png)<!-- --> | ||
|
||
Setting aesthetics (x, y, fill, colour, etc.) outside of `aes()` sets them to a constant value. R can recognise of a lot of colour names, e.g., try "cornflowerblue", "firebrick", or just "red", "green", "blue", etc. For a full list, Google "Colours in R". R also knows HEX codes, e.g. `fill = "#fec3fc"` is pink. | ||
|
||
|
||
### Exercise | ||
|
||
What is the difference between colour and fill in the context of a barplot? | ||
|
||
Hint: Use `colour = "black"` instead of `fill = "black"` to investigate what `ggplot()` thinks a colour is. | ||
|
||
|
||
```r | ||
mydata2013 %>% | ||
ggplot(aes(x = location, y = deaths_millions, fill = cause, colour = cause))+ | ||
geom_col(colour = "black") | ||
``` | ||
|
||
![](01_first_interaction_files/figure-epub3/unnamed-chunk-8-1.png)<!-- --> | ||
|
||
### Exercise | ||
|
||
Why are some of the words in our code quoted (e.g. `fill = "black"`) whereas others are not (e.g. `x = location`)? | ||
|
||
## Two geoms for barplots: `geom_bar()` or `geom_col()` | ||
|
||
Both `geom_bar()` and `geom_col()` create barplots. If you: | ||
|
||
* Want to visualise the count of different lines in a dataset - use geom_bar() | ||
+ For example, if you are using a patient-level dataset (each line is a patient record): | ||
`mydata %>% | ||
ggplot(aes(x = sex)) + | ||
geom_bar()` | ||
|
||
* Your dataset is already summarised - use geom_col() | ||
+ For example, in the GBD dataset we use here, each line already includes a summarised value (`deaths_millions`) | ||
|
||
If you have used R before, you might have come across `geom_bar(stat = "identity")` which is the same as `geom_col()`. | ||
|
||
|
||
## Solutions | ||
|
||
**1.2.1:** | ||
There is a double closing bracket because `aes()` is wrapped inside `ggplot()` - `ggplot(aes())`. | ||
|
||
|
||
**1.2.2:** | ||
|
||
```r | ||
mydata2013 %>% | ||
ggplot(aes(x = location, | ||
y = deaths_millions, | ||
fill = cause, | ||
colour = cause)) + | ||
geom_col() | ||
``` | ||
|
||
|
||
**1.5.2:** | ||
|
||
On a barplot, the colour aesthetic outlines the fill. In a later session we will see, however, that for points and lines, colour is the main aesthetic to define. | ||
|
||
**1.5.3:** | ||
|
||
Words in quotes are generally something are setting to a constant value (e.g. make all outlines black, rather than colour them based on the cause they are representing). Unquoted words are generally variables (or functions). If the word "function" just threw you, Google "Jesse Maegan: What the h*ck is a function" | ||
|
Oops, something went wrong.