diff --git a/01_introduction.Rmd b/01_introduction.Rmd index b195ba3..09f0825 100644 --- a/01_introduction.Rmd +++ b/01_introduction.Rmd @@ -2,23 +2,29 @@ # Why we love R -We are extremely pleased that you have picked up this book to learn R for health data analysis. -Even if you're already familiar with the R language, you will probably find new approaches here as we make the most of the latest R packages and tools including some we've developed ourselves. -Those already familiar with R are encouraged to still skim through the first two chapters to familiarise with the style of R we recommend. +Thank you for choosing this book on using R for health data analysis. +Even if you're already familiar with the R language, we hope you will find some new approaches here as we make the most of the latest R tools including some we've developed ourselves. +Those already familiar with R are encouraged to still skim through the first few chapters to familiarise yourself with the style of R we recommend. + +R can be used for all the health data science applications we can think of. +From bioinformatics and computational biology, to administrative data analysis and natural language processing, through internet-of-things and wearable data, to machine learning and artificial intelligence, and even public health and epidemiology. +R has it all. Here are the main reasons we love R: * R is versatile and powerful - use it for - - graphics - - all the statistics you can dream of - - machine learning, deep learning - - automated reports - - websites - - books (yes, R can be used to make whole websites or books, this one is written in R) -* R scripts can be reused - gives you efficiency and reproducibility -* It is free to use by anyone, anywhere + - graphics; + - all the statistical tests you can dream of; + - machine learning and deep learning; + - automated reports; + - websites; + - and even books (yes, this book was written entirely in R). 
+* R scripts can be reused - gives you efficiency and reproducibility. +* It is free to use by anyone, anywhere. + + -![R logo, © 2016 The R Foundation.](images/chapter01/Rlogo.png){width=150px} +![](images/chapter01/Rlogo.png){width=150px} ## Help, what's a script? {#chap01-what-script} \index{RStudio@\textbf{RStudio}!script} @@ -40,9 +46,10 @@ These are notes of what we were doing, both for colleagues as well as our future knitr::include_graphics("images/chapter01/example_script.png") ``` -Lines that do not start with a # are R code. +Lines that do not start with # are R code. This is where the number crunching really happens. -We will cover the details of this R code in the next few chapters, the purpose of this chapter is to describe some of the terminology as well as the interface and tools we use. +We will cover the details of this R code in the next few chapters. +The purpose of this chapter is to describe some of the terminology as well as the interface and tools we use. For the impatient: @@ -52,34 +59,34 @@ For the impatient: Even though R is a language, don't think that after reading this book you should be able to open a blank file and just start typing in R code like an evil computer genius from a movie. This is not what real world programming looks like. -Firstly, you should be copy-pasting and adapting existing R code examples - whether from this book or later from your own previous work. +Firstly, you should be copy-pasting and adapting existing R code examples - whether from this book, the internet, or later from your existing work. Re-writing everything from scratch is not efficient. -Yes, you will understand and eventually remember a lot of it. -But to spend time memorising very specific things that can easily be looked up and copied is simply not necessary. +Yes, you will understand and eventually remember a lot of it, but to spend time memorising specific functions that can easily be looked up and copied is simply not necessary. 
Secondly, R is an interactive language. Meaning that we "run" R code line by line and get immediate feedback. -We would never write a whole script without trying everything out as we go along. +We do not write a whole script without trying each part out as we go along. Thirdly, do not worry about making mistakes. Celebrate them! -The whole point of R and reproducibility is that manipulations are not applied directly on a dataset but a copy of it. -And that everything is in a script - so if we do make a wrong move (e.g. accidentally overwrite or remove some data) we can always reload it, rerun the steps that worked well and continue figuring our where we went wrong at the end. +The whole point of R and reproducibility is that manipulations are not applied directly to a dataset, but to a copy of it. +Everything is in a script, so no mistake is permanent. +If you make a mistake like accidentally overwriting your data, you can just reload it, rerun the steps that worked well and continue figuring out what went wrong at the end. And since all of these steps are written down in a script, R will redo everything with a single push of a button. -You do not have to redo anything, that's what R is for. +You do not have to repeat a set of mouse clicks from dropdown menus as in other statistical packages, which quickly becomes a blessing. ## What is RStudio? \index{RStudio} RStudio is a free program that makes working with R easier. -An example screen shot of RStudio is shown in Figure \@ref(fig:chap01-fig-rstudio). +An example screenshot of RStudio is shown in Figure \@ref(fig:chap01-fig-rstudio). We have already introduced what is in the top-left pane - the **Script**. ```{r chap01-fig-rstudio, echo = FALSE, fig.cap = "We use RStudio to work with R."} knitr::include_graphics("images/chapter01/rstudio_interface.png") ``` -Now, look at the little Run and Source buttons at the top-right corner of the script pane. 
+Now, look at the little **Run** and **Source** buttons at the top-right corner of the script pane. Clicking **Run** executes a line of R code. Clicking **Source** executes all lines of R code in the script (it is essentially 'Run all lines'). When you run R code, it gets sent to the **Console** which is the bottom-left panel. @@ -90,13 +97,14 @@ This is where R really lives. > Run all lines (Source): Control+Shift+Enter > (On a Mac, both Control and Command work) -The Console is also where R speaks to us. +The Console is where R speaks to us. When we're lucky, we get results in there - in this example the results of a *t*-test (last line of the script). When we're less lucky, this is also where Errors or Warnings appear. R Errors are a lot less scary than they seem! Yes, if you're using a regular computer program where all you do is click on some buttons, then getting a proper red error that stops everything is quite unusual. But in programming, Errors are just a way for R to communicate with us. + We see Errors in our own work every single day, they are very normal and do not mean that everything is wrong or that you should give up. Try to re-frame the word Error to mean "feedback", as in "Hello, this is R. I can't continue, this is the feedback I am giving you." The most common Errors you'll see are along the lines of "Error: something not found". @@ -106,8 +114,8 @@ Furthermore, R is case sensitive so capitalisation matters (variable name `lifeE The Console can only print text, so any plots you create in your script appear in the **Plots** pane (bottom-right). Similarly, datasets that you've loaded or created appear in the **Environment** tab. -When you click on a dataset, it pops up in a really nice and fast viewer. -This means you can have a look and scroll through your rows and columns the same way you would in a spreadsheet. +When you click on a dataset, it pops up in a nice viewer that is fast even when there is a lot of data. 
+This means you can have a look and scroll through your rows and columns, the same way you would with a spreadsheet. ## Getting started \index{installation@\textbf{installation}!tidyverse} @@ -122,8 +130,10 @@ To start using R, you should do these two things: When you first open up RStudio, you'll also want to install some extra packages to extend the base R functionality. You can do this in the **Packages** tab (next to the Plots tab in the bottom-right in Figure \@ref(fig:chap01-fig-rstudio)). +A package is just a collection of functions (commands) that are not included in the standard R installation, which is known as base R. + A lot of the functionality introduced in this book comes from the `tidyverse` family of R packages (http://tidyverse.org). -So when you go to **Packages**, click **Install**, type in `tidyverse`, a whole collection of useful and modern packages will be installed. +So when you go to Packages, click **Install**, type in `tidyverse`, and a whole collection of useful and modern packages will be installed. Even though you've installed the `tidyverse` packages, you'll still need to tell R when you're about to use them. We include `library(tidyverse)` at the top of every script we write: @@ -151,41 +161,44 @@ If you are incredibly curious, take a peek at Chapter \@ref(install) for a full \index{errors} The best way to troubleshoot R Errors is to copy-paste them into a search engine (e.g., Google). -Furthermore, searching online is also a great way to learn how to do new specific things or to find code examples. +Searching online is also a great way to learn how to do new specific things or to find code examples. You should copy-paste solutions into your R script to then modify to match what you're trying to do. -We are constantly copying code from online forums or from other's scripts/our own prior scripts. +We are constantly copying code from online forums and our own existing scripts. However, there are many different ways to achieve the same thing in R. 
-Therefore, sometimes you'll search for help and come across R code that looks nothing like what you've seen in this book. -That's because the `tidyverse` packages as well as using the pipe (` %>%`) are a relatively new way of doing things, but search engines will often prioritise results that have had more views. +Sometimes you'll search for help and come across R code that looks nothing like what you've seen in this book. +The `tidyverse` packages are relatively new and use the pipe (` %>%`), something we'll come on to. +But search engines will often prioritise older results that use a more traditional approach. + So older solutions may come up at the top. Don't get discouraged if you see R code that looks completely different to what you were expecting. Just keep scrolling down or clicking through different answers until you find something that looks a little bit more familiar. -If you're working offline, then RStudio's built in Help tab can help. +If you're working offline, then RStudio's built in **Help** tab is useful. To use the Help tab, click your cursor on something in your code (e.g. `read_csv()`) and press F1. This will show you the definition and some examples. -F1 can be hard to find on some keyboards, an alternative is to type in, e.g., `?read_csv` - this will also open the Help tab for this function. +F1 can be hard to find on some keyboards, an alternative is to type, e.g. `?read_csv`. +This will also open the Help tab for this function. However, the Help tab is only useful if you already know what you are looking for but can't remember exactly how it works. For finding help on things you have not used before, it is best to Google it. -R has about 2 million users so someone somewhere has had the same question or problem. -In addition to the Help tab, RStudio also has a Help drop-down menu at the very top (same row where you find "File", "Edit, etc.). The most notable thing in the Help drop-down menu are the Cheatsheets. 
+R has about 2 million users so someone somewhere has probably had the same question or problem. + +RStudio also has a Help drop-down menu at the very top (same row where you find "File", "Edit", ...). +The most notable items in the Help drop-down menu are the Cheatsheets. These tightly packed two-pagers include many of the most useful functions from `tidyverse` packages. -They are not particularly useful to learn from, but invaluable as an *aide-mémoire*. +They are not particularly easy to learn from, but invaluable as an *aide-mémoire*. ## Notation throughout this book - -When in sentences, the names of R packages, functions, and variable names are printed with mono-spaced font, e.g.: `tidyverse`, `mean()`, `lifeExp`. +When mentioned in the text, the names of R packages, functions, and variables are printed in a mono-spaced font, e.g. `tidyverse`, `mean()`, `lifeExp`. Otherwise, R code lives in the grey areas known as 'code chunks'. -Lines of R output start with a double ## - this will be numbers or text that R gives us after executing the code. -When printing this output, R also adds a counter at the beginning of every new line, look at the numbers in the square brackets [] below: - +Lines of R *output* start with a double ## - this will be the numbers or text that R gives us after executing code. +R also adds a counter at the beginning of every new line; look at the numbers in the square brackets [] below: ```{r} # colon between two numbers creates a sequence @@ -194,6 +207,7 @@ When printing this output, R also adds a counter at the beginning of every new l Remember, lines of R code that start with # are called comments. We already introduced comments as notes about the R code earlier in this chapter (Section \@ref(chap01-what-script) "Help, what's a script?"), however, there is a second use case for comments. + When you make R code a comment, by adding a # in front of it, it gets 'commented out'. 
For example, let's say your R script does two things: it prints numbers from 1 to 4, and then numbers from 1001 to 1004: @@ -224,11 +238,11 @@ You may even want to add another real comment to explain why the latter was comm # 1001:1004 ``` -You could of course delete the line(s) altogether, but commenting out is useful if you might want to comment the lines back in later by removing the # from the beginning of the line. -Commenting in only works for R code, text comments within a code chunk or script (e.g., "# Now we're printing bigger numbers:") must always start with a # or you will get an error. +You could of course delete the line altogether, but commenting out is useful as you might want to include the lines later by removing the # from the beginning of the line. > Keyboard Shortcut for commenting out/commenting in multiple lines at a time: > Control+Shift+C > (On a Mac, both Control and Command work) -Finally, we use **bold** to highglight a new term we are introducing for the first time. The same term will not be bold in future occurances. +Finally, we use **bold** to highlight a new term we are introducing for the first time. The same term will not be bold in future occurrences. +**Bold** is also used when you need to click on a specific menu item, e.g. click **Install**. diff --git a/02_basics.Rmd b/02_basics.Rmd index 54cbdce..bc93ab6 100644 --- a/02_basics.Rmd +++ b/02_basics.Rmd @@ -5,61 +5,56 @@ editor_options: --- # R Basics - - - - - ```{r setup, include = FALSE} knitr::opts_chunk$set(fig.align = 'center') library(tidyverse) ``` Throughout this book, we are conscious of the balance between theory and practice. -For example, some learners may prefer to get all definitions laid out before they are shown an example making use of the new concepts. -Others, however, would much rather see piecemealed sections of practical examples and explanations. -We strike a balance between these two approaches that works well for most people in our audience. 
+Some learners may prefer to see all definitions laid out before being shown an example of a new concept. +Others would rather see practical examples and explanations build up to a full understanding over time. +We strike a balance between these two approaches that works well for most people in the audience. -This means that sometimes we will show you an example that may use words that have not been formally introduced yet. -For example, we start this chapter with data import - as R is nothing without data. -But doing so, we have to use the word "argument" that is only defined two sections later (in \@ref(chap02-objects-functions) "Objects and functions"). +Sometimes we will show you an example that may use words that have not been formally introduced yet. +For example, we start this chapter with data import - R is nothing without data. + +In so doing, we have to use the word "argument", which is only defined two sections later (in \@ref(chap02-objects-functions) "Objects and functions"). A few similar instances arise around statistical concepts in the Data Analysis part of the book. -You will come across sentences along the lines of "this concept will be explained further or become clearer in the next section/chapter". +You will come across sentences along the lines of "this concept will become clearer in the next section". Trust us and just go with it. - The aim of this chapter is to familiarise you with how R works. We will read in data and start basic manipulations. 
You may want to skip parts of this chapter if you already: -* have found the Import Dataset interface -* know what numbers, characters, factors, and dates look like in R -* are familiar with the terminology around objects, functions, arguments -* have used the pipe: `%>%` -* know how to filter data with operators such as `==, >, <, &, |` -* know how to handle NAs, and why they can behave weirdly in a filter -* have used `mutate()`, `c()`, `paste()`, `if_else()`, and the joins - +* have found the Import Dataset interface; +* know what numbers, characters, factors, and dates look like in R; +* are familiar with the terminology around objects, functions, arguments; +* have used the pipe: `%>%`; +* know how to filter data with operators such as `==, >, <, &, |`; +* know how to handle missing data (NAs), and why they can behave weirdly in a filter; +* have used `mutate()`, `c()`, `paste()`, `if_else()`, and the joins. ## Reading data into R{#chap02-h2-reading-data-into-r} \index{import data} \index{reading data} -We mentioned before that once a table (e.g. from spreadsheet or database) gets read into R we start calling it a `tibble`. -The most common format data comes to us in is CSV (comma separated values). -CSV is basically an uncomplicated spreadsheet with no formatting. +Data usually comes in the form of a table, such as a spreadsheet or database. +In the world of the `tidyverse`, a table read into R gets called a `tibble`. + +A common format in which to receive data is CSV (comma separated values). +CSV is an uncomplicated spreadsheet with no formatting. It is just a single table with rows and columns (no worksheets or formulas). Furthermore, you don't need special software to quickly view a CSV file - a text editor will do, and that includes RStudio. For example, look at "example_data.csv" in the healthyr project's folder in Figure \@ref(fig:chap2-fig-examplecsv) (this is the Files pane at the bottom-right corner of your RStudio). 
- - ```{r chap2-fig-examplecsv, echo = FALSE, fig.cap="View or import a data file.", out.width="70%"} knitr::include_graphics("images/chapter02/files_csv_example.png") ``` Clicking on a data file gives us two options: "View File" or "Import Dataset". + We will show you how to use the Import Dataset interface in a bit, but for standard CSV files, we don't usually bother with the Import interface and just type in (or copy from a previous script): \index{functions@\textbf{functions}!read\_csv} @@ -69,27 +64,31 @@ example_data <- read_csv("example_data.csv") View(example_data) ``` -There are a couple of things we can say about the first R code chunk of this book. First and foremost: do not panic. -Yes, if you're used to interacting with data by double-clicking on a Spreadsheet that then just opens up, then the above R code does seem a bit involved. +There are a couple of things to say about the first R code chunk of this book. +First and foremost: do not panic. +Yes, if you're used to interacting with data by double-clicking on a spreadsheet that just opens up, then the above R code does seem a bit involved. + However, running the example above also has an immediate visual effect. -As soon as you click Run (or press Control+Enter/Command+Enter) the code above, the dataset immediately shows up in your Environment and opens up - so you can have a look and scroll through the same way you would in Excel or similar. +As soon as you click Run (or press Ctrl+Enter/Command+Enter), the dataset immediately shows up in your Environment and opens in a Viewer. +You can have a look and scroll through the same way you would in Excel or similar. -So what's actually going on in the R code above is: +So what's actually going on in the R code above: -* We load the `tidyverse` packages (as covered in the first chapter of this book) -* We have a CSV file called "example_data.csv", we are using `read_csv()` to read it into R. 
+* We load the `tidyverse` packages (as covered in the first chapter of this book). +* We have a CSV file called "example_data.csv" and are using `read_csv()` to read it into R. * We are using the assignment arrow `<-` to save it into our Environment using the same name: `example_data`. * The `View(example_data)` line makes it pop up for us to view it. Alternatively, click on `example_data` in the Environment to achieve the exact same thing. -More about the assignment arrow (`<-`) and naming things in R are covered later in this Chapter. Do not worry if everything is not crystal clear just now. -Bear with. +More about the assignment arrow (`<-`) and naming things in R are covered later in this chapter. +Do not worry if everything is not crystal clear just now. ### Import Dataset interface -In the `read_csv()` example above, we read in a file that is in a specific (but common) CSV format. +In the `read_csv()` example above, we read in a file that was in a specific (but common) format. -If your file, however, uses semicolons instead of commas, or commas instead of dots, includes a special number such as 999 to denote missing values, or anything else that makes you think reading it into R is complicated, then...it is not! -RStudio's `Import Dataset` interface (Figure \@ref(fig:chap2-fig-examplecsv)) can handle all of these and more. +However, if your file uses semicolons instead of commas, or commas instead of dots, or a special number for missing values (e.g., 99), or anything else weird or complicated, then we need a different approach. + +RStudio's **Import Dataset** interface (Figure \@ref(fig:chap2-fig-examplecsv)) can handle all of these and more. 
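
The same options can also be written by hand. As a minimal sketch, assuming a hypothetical file that uses semicolons as separators, commas as decimal marks, and 99 for missing values (the file contents and the column names `id`, `weight`, and `group` are invented for illustration), `read_delim()` from `readr` might be called like this:

```{r}
library(readr)

# Hypothetical example file, written to a temporary location so the
# chunk is self-contained: semicolon-separated, decimal commas, 99 = missing
tmp <- tempfile(fileext = ".csv")
writeLines(c("id;weight;group",
             "1;70,5;Control",
             "2;99;Treatment",
             "3;65,2;Control"), tmp)

example_semicolon <- read_delim(
  tmp,
  delim  = ";",                          # columns are separated by semicolons
  locale = locale(decimal_mark = ","),   # 70,5 should be read as 70.5
  na     = c("", "NA", "99")             # treat 99 as a missing value
)

example_semicolon
```

Note that `na` applies to every column, so a code like 99 is only a safe choice when it cannot occur as a genuine value anywhere in the file.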
 ```{r chap02-fig-import-tool, echo = FALSE, fig.cap="Import: Some of the special settings your data file might have.", out.width="13cm"} knitr::include_graphics("images/chapter02/import_options.png") @@ -99,24 +98,28 @@ knitr::include_graphics("images/chapter02/import_options.png") knitr::include_graphics("images/chapter02/code_preview.png") ``` -After selecting the specific options to import file, a friendly preview window will show whether R understands the format of the your data. -DO NOT BE tempted to press the `Import` button. +After selecting the specific options to import a particular file, a friendly preview window will show whether R properly understands the format of your data. + +DO NOT be tempted to press the **Import** button. Yes, this will read in your dataset once, but it means you have to reselect the options every time you come back to RStudio. -Instead, copy-paste the code (e.g., Figure \@ref(fig:chap02-fig-import-code)) into your R script - this way you can use it over and over again. +Instead, copy-paste the code (e.g., Figure \@ref(fig:chap02-fig-import-code)) into your R script. +This way you can use it over and over again. -Ensuring all steps are recorded in scripts make your workflow reproducible by your future self, colleagues, supervisors, and extraterrestrials. +Ensuring that all steps of an analysis are recorded in scripts makes your workflow reproducible by your future self, colleagues, supervisors, and extraterrestrials. >The `Import Dataset` button can also help you to read in Excel, SPSS, Stata, or SAS files (instead of `read_csv()`, it will give you `read_excel()`, `read_sav()`, `read_stata()`, or `read_sas()`). If you've used R before or are using older scripts passed on by colleagues, you might see `read.csv()` rather than `read_csv()`. +Note the dot rather than the underscore. -In short, `read_csv()` is faster and more predictable and in all new scripts this is what you should use. 
-In existing scripts that work and are tested, do not just start replacing `read.csv()` with `read_csv()`. -`read_csv()` handles categorical variables differently ^[It does not silently convert strings to factors, i.e., it defaults to `stringsAsFactors = FALSE`. For those not familiar with the terminology here - don't worry, we will cover this in just a few sections.]. +In short, `read_csv()` is faster and more predictable and in all new scripts is to be recommended. + +In existing scripts that work and are tested, we do not recommend that you start replacing `read.csv()` with `read_csv()`. +For instance, `read_csv()` handles categorical variables differently ^[It does not silently convert strings to factors, i.e., it defaults to `stringsAsFactors = FALSE`. For those not familiar with the terminology here - don't worry, we will cover this in just a few sections.]. An R script written using the `read.csv()` might not work as expected any more if just replaced with `read_csv()`. -> Do not start updating and possibly breaking existing R scripts by replacing base R functions with the tidyverse ones we show here. Do use the modern functions in any new code you write. +> Do not start updating and possibly breaking existing R scripts by replacing base R functions with the tidyverse equivalents we show here. Do use the modern functions in any new code you write. ### Reading in the Global Burden of Disease example dataset @@ -130,7 +133,8 @@ Seattle, United States: Institute for Health Metrics and Evaluation (IHME), 2018 Available from http://ghdx.healthdata.org/gbd-results-tool.] GBD data are publicly available from the website. -Table \@ref(tab:chap2-tab-gbd) and Figure \@ref(fig:chap2-fig-gbd) show a high level version of the project data with just 3 variables: `cause`, `year`, `deaths_millions` (number of people who die of each cause every year). 
Later, we will be using a longer dataset with different subgroups and we will show you how to summarise comprehensive datasets yourself. +Table \@ref(tab:chap2-tab-gbd) and Figure \@ref(fig:chap2-fig-gbd) show a high level version of the project data with just 3 variables: `cause`, `year`, `deaths_millions` (number of people who die of each cause every year). +Later, we will be using a longer dataset with different subgroups and we will show you how to summarise comprehensive datasets yourself. ```{r, message=F} library(tidyverse) @@ -147,7 +151,7 @@ gbd_short %>% font_size = 10) ``` -```{r chap2-fig-gbd, echo = FALSE, fig.cap="Causes of death from the Global Burden of Disease dataset (Table \\@ref(tab:chap2-tab-gbd)). Data on (B) is the same as (A) but stacked to show the total (sum) of all causes.", fig.height=6, fig.width=6} +```{r chap2-fig-gbd, echo = FALSE, fig.cap="Line and bar charts: Cause of death by year (GBD). Data in (B) are the same as (A) but stacked to show the total of all causes.", fig.height=6, fig.width=6} source("1_source_theme.R") library(patchwork) p1 <- gbd_short %>% @@ -188,7 +192,7 @@ There are three broad types of data: Values within a column all have to be the same type, but a tibble can of course hold columns of different types. Generally, R is very good at figuring out what type of data you have (in programming, this 'figuring out' is called 'parsing'). -For example, when reading in data, it will tell you what it assumed for the columns: +For example, when reading in data, it will tell you what was assumed for each column: ```{r} library(tidyverse) @@ -197,10 +201,10 @@ typesdata <- read_csv("data/typesdata.csv") typesdata ``` -This means that a lot of the time you do not have to worry about those little `` vs `` vs `` labels, R knows what its doing. 
-But in cases of irregular or faulty input data, or when doing a lot of calculations and modifications your data, we need to be aware of these different types to be able to find and fix mistakes. +This means that a lot of the time you do not have to worry about those little `` vs `` vs `` labels. +But in cases of irregular or faulty input data, or when doing a lot of calculations and modifications to your data, we need to be aware of these different types to be able to find and fix mistakes. -For example, consider a very similar file as above but with a couple of data entry issues: +For example, consider a very similar file as above but with some data entry issues introduced: ```{r} typesdata_faulty <- read_csv("data/typesdata_faulty.csv") @@ -208,12 +212,12 @@ typesdata_faulty <- read_csv("data/typesdata_faulty.csv") typesdata_faulty ``` -Notice R parsed both measurement and date as characters. -The first one is a data entry issue: the person taking the measurement couldn't decide which value to note down (maybe the scale was shifting between the two values) so they included both values and text "or" in the cell. +Notice that R parsed both the measurement and date variables as characters. +Measurement has been parsed as a character because of a data entry issue: the person taking the measurement couldn't decide which value to note down (maybe the scale was shifting between the two values) so they included both values and text "or" in the cell. A numeric variable will also get parsed as a categorical variable if it contains certain typos, e.g., if entered as "3..7" instead of "3.7". -The reason R didn't automatically make sense of the date column is that it can't tell which is the date and which is the year: __02-Jan-17__ could stand for _02-Jan-2017_ as well as _2002-Jan-17_. 
+The reason R didn't automatically make sense of the date column is that it couldn't tell which is the day and which is the year: __02-Jan-17__ could stand for _02-Jan-2017_ as well as _2002-Jan-17_. Therefore, while a lot of the time you do not have to worry about variable types and can just get on with your analysis, it is important to understand what the different types are to be ready to deal with them when issues arise. @@ -225,7 +229,7 @@ So here we go. \index{variable types@\textbf{variable types}!continuous / numeric} Numbers are straightforward to handle and don't usually cause trouble. -R usually refers to numbers as `numeric` (or `num`), but sometimes it really gets its nerd on and also calls numbers `integer` or `double`. +R usually refers to numbers as `numeric` (or `num`), but sometimes it really gets its nerd on and calls numbers `integer` or `double`. Integers are numbers without decimal places (e.g., `1, 2, 3`), whereas `double` stands for "Double-precision floating-point" format (e.g., `1.234, 5.67890`). It doesn't usually matter whether R is classifying your continuous data as `numeric/num/double/int`, but it is good to be aware of these different terms as you will see them in R messages. @@ -272,7 +276,6 @@ So when using `round()` in the equality statement like this, we get the expected round(measurement_mean, 3) == 3.333 ``` - This is usually fine, especially if you've finished applying calculations on that number. But when you intend to use it in further calculations, then rounding should be left to the very end - to minimise rounding errors. This is where the `near()` function comes in handy: @@ -290,6 +293,8 @@ This means you get the expected result without having to round the numbers off. ### Character variables \index{variable types@\textbf{variable types}!character} + + **Characters** (sometimes referred to as *strings* or *character strings*) in R are letters, words, or even whole sentences (an example of this may be free text comments). 
Characters are displayed in-between `""` (or `''`). @@ -317,7 +322,7 @@ You can check everything by just eyeballing the `tibble` using the built in View But for larger datasets, you need to know how to check and then clean data programmatically - you can't go through thousands of values checking they are all as intended without unexpected duplicates or typos. -For most variables (categorical or numeric) we recommend always plotting your data before starting analysis. +For most variables (categorical or numeric), we recommend always plotting your data before starting analysis. But to check for duplicates in a unique identifier, use `count()` with `sort = TRUE`: ```{r} @@ -338,23 +343,23 @@ typesdata %>% \index{variable types@\textbf{variable types}!categorical / factor} **Factors** are fussy characters. -Factors are fussy because they have something called **levels**. -Levels are all the unique values a factor variable could take, e.g. like when we looked at `typesdata$group %>% unique()`. +Factors are fussy because they include something called **levels**. +Levels are all the unique values a factor variable could take, e.g. like when we looked at `typesdata %>% count(group)`. Using factors rather than just characters can be useful because: * The values factor levels can take is fixed. For example, once you tell R that `typesdata$group` is a factor with two levels: Control and Treatment, combining it with other datasets with different spellings or abbreviations for the same variable will generate a warning. This can be helpful but can also be a nuisance when you really do want to add in another level to a `factor` variable. * Levels have an order. -When running statistical tests on grouped data (e.g., Control vs Treatment, Adult vs Child) and the variable is just a character, not a factor, R will use the alphabetically first as the reference level. 
+When running statistical tests on grouped data (e.g., Control vs Treatment, Adult vs Child) and the variable is just a character, not a factor, R will use the alphabetically first as the reference (comparison) level. Converting a character column into a factor column enables us to define and change the order of its levels. Level order affects many things including regression results and plots: by default, categorical variables are ordered alphabetically. If we want a different order in say a bar plot, we need to convert to a factor and reorder before we plot it. -The plot will then know how the order it better. +The plot will then order the groups correctly. So overall, since health data is often categorical and has a reference (comparison) level, factors are an essential way to work with these data in R. Nevertheless, the fussiness of factors can sometimes be unhelpful or even frustrating. -A lot more about factor handling will be covered later in the book. +A lot more about factor handling will be covered later (\@ref(chap08-h1)). ### Date/time variables \index{variable types@\textbf{variable types}!date-time} @@ -364,7 +369,8 @@ A lot more about factor handling will be covered later in the book. R is very good for working with dates. For example, it can calculate the number of days/weeks/months between two dates, or it can be used to find what a future date is (e.g., "what's the date exactly 60 days from now?"). It also knows about time zones and is happy to parse dates in pretty much any format - as long as you tell R how your date is formatted (e.g., day before month, month name abbreviated, year in 2 or 4 digits, etc.). -Since R displays dates and times between quotes (""), they look similar to characters. However, it is important to know whether R has understood which of your columns contain date/time information, as which are just normal characters. +Since R displays dates and times between quotes (""), they look similar to characters. 
+However, it is important to know whether R has understood which of your columns contain date/time information, and which are just normal characters. ```{r, message = FALSE} library(lubridate) # lubridate makes working with dates easier @@ -375,7 +381,7 @@ my_datetime <- "2020-12-01 12:00" my_datetime ``` -When printed, the two objects - `current_datetime` and `my_datetime` seem to have the a very similar format. +When printed, the two objects - `current_datetime` and `my_datetime` - seem to have a similar format. But if we try to calculate the difference between these two dates, we get an error: ```{r, error = TRUE} @@ -390,7 +396,6 @@ current_datetime %>% class() my_datetime %>% class() ``` - So we need to tell R that `my_datetime` does indeed include date/time information so we can then use it in calculations: ```{r} @@ -419,7 +424,6 @@ ymd_hm("2021-01-02 12:00") + my_datesdiff But if we want to use the number of days in a normal calculation, e.g., what if a measurement increased by 560 arbitrary units during this time period. We might want to calculate the increase per day like this: - ```{r, error = TRUE} 560/my_datesdiff ``` @@ -431,7 +435,7 @@ We need to convert `my_datesdiff` (which is a difftime value) into a numeric val 560/as.numeric(my_datesdiff) ``` -The lubridate package comes with several convenient functions for parsing dates, e.g., `ymd()`, `mdy()`, `ymd_hm()`, etc. - for a full list see lubridate.tidyverse.org. +The lubridate package comes with several convenient functions for parsing dates, e.g., `ymd()`, `mdy()`, `ymd_hm()`, etc. - for a full list see [lubridate.tidyverse.org](https://lubridate.tidyverse.org). 
However, if your date/time variable comes in an extra special format, then use the `parse_date_time()` function where the second argument specifies the format using these helpers: @@ -467,20 +471,14 @@ You can even add plain text into the `format()` function, R will know to put the Sys.time() %>% format("Happy days, the current time is %H:%M %B-%d (%Y)!") ``` - ## Objects and functions {#chap02-objects-functions} \index{objects} \index{functions@\textbf{functions}} - - - There are two fundamental concepts in statistical programming that are important to get straight - objects and functions. The most common object you will be working with is a dataset. This is usually something with rows and columns much like the example in Table \@ref(tab:chap2-tab-examp1). - - ```{r chap2-tab-examp1, echo = FALSE} # TIBBLE hardcoded again in the next chunk, make sure to change in both places! @@ -502,6 +500,7 @@ mydata %>% To get the very small and made-up "dataset" into your Environment, copy and run this code^[`c()` stands for combine and will be introduced in more detail later in this chapter]: ```{r} +library(tidyverse) mydata <- tibble( id = 1:4, sex = c("Male", "Female", "Female", "Male"), @@ -511,18 +510,16 @@ mydata <- tibble( ) ``` - - -Data can live anywhere: on paper, in a Spreadsheet, in an SQL database, or it can live in your R Environment. -We usually initiate and interface R using RStudio, but everything we talk about here (objects, functions, environment) also work when RStudio is not available, but R is. -This can be the case if you are working on a supercomputer that can only serve the R Console, and not an RStudio IDE (reminder from first chapter: Integrated Development Environment). +Data can live anywhere: on paper, in a spreadsheet, in an SQL database, or in your R Environment. +We usually initiate and interface with R using RStudio, but everything we talk about here (objects, functions, environment) also work when RStudio is not available, but R is. 
+This can be the case if you are working on a supercomputer that can only serve the R Console and not RStudio. ### `data frame/tibble` -So, regularly shaped data in rows and columns is called a table when it lives outside R, but once you read it into R (import it) it gets called a tibble. +So, regularly shaped data in rows and columns is called a table when it lives outside R, but once you read/import it into R it gets called a tibble. If you've used R before, or get given a piece of code that uses `read.csv()` instead of `read_csv()`, you'll have come across the term `data frame`.^[`read.csv()` comes with base R, whereas `read_csv()` comes from the `readr` package within the `tidyverse`. We recommend using `read_csv()`.] -A `tibble` is the modern/`tidyverse` version of data/tables in R. +A `tibble` is the modern/`tidyverse` version of a data frame in R. In most cases, `data frames` and `tibbles` work interchangeably, but `tibbles` often work better. Another great alternative to base R `data frames` are `data tables`. In this book, and for most of our day-to-day work these days, we will use `tibbles`. @@ -550,7 +547,7 @@ mydata A function is a procedure which takes some information (input), does something to it, and passes back the modified information (output). -A simple function that can be applied to numeric data for instance is `mean()`. +A simple function that can be applied to numeric data is `mean()`. R functions always have round brackets after their name. This is for two reasons. @@ -558,7 +555,8 @@ First, it easily differentiates them as functions - you will get used to reading Second, and more importantly, we can put **arguments** in these brackets. Arguments can also be thought of as input. -In data analysis, the most common input for a function is data. For instance, we need to give `mean()` some data to average over. +In data analysis, the most common input for a function is data. +For instance, we need to give `mean()` some data to average over. 
It does not make sense (nor will it work) to feed `mean()` the whole tibble with multiple columns, including patient IDs and a categorical variable (`sex`). To quickly extract a single column, we use the `$` symbol like this: @@ -585,12 +583,13 @@ But what happens if we try to calculate the average value of `var2` (`r mydata$v mean(mydata$var2) ``` -So why does `mean(mydata$var2)` return `NA` ("Not applicable") rather than the mean of the values included in this column? +So why does `mean(mydata$var2)` return `NA` ("not available") rather than the mean of the values included in this column? That is because the column includes missing values (`NAs`), and R does not want to average over `NAs` implicitly. It is being cautious - what if you didn't know there were missing values for some patients? If you wanted to compare the means of `var1` and `var2` without any further filtering, you would be comparing samples of different sizes. -We might expect to see an `NA` if we tried to, for example, calculate the average of `sex`. And this is indeed the case: +We might expect to see an `NA` if we tried to, for example, calculate the average of `sex`. +And this is indeed the case: ```{r, error=TRUE} mean(mydata$sex) @@ -599,7 +598,7 @@ mean(mydata$sex) Furthermore, R also gives us a pretty clear Warning suggesting it can't compute the mean of an argument that is not numeric or logical. The sentence actually reads pretty fun, as if R was saying it was not logical to calculate the mean of something that is not numeric. -But what R is actually saying that it is happy to calculate the mean of two types of variables: numerics or logicals, but what you have passed is neither. +But, R is actually saying that it is happy to calculate the mean of two types of variables: numerics or logicals, but what you have passed is neither. 
If you decide to ignore the NAs and want to calculate the mean anyway, you can do so by adding this argument to `mean()`: @@ -631,7 +630,6 @@ Sys.time() ``` - ### Working with objects To save an object in our Environment we use the assignment arrow: @@ -643,7 +641,7 @@ a <- 103 ``` This reads: the object `a` is assigned value `r a`. -`<-` is called "the arrow assignment operator", or "assigment arrow" for short. +`<-` is called "the arrow assignment operator", or "assignment arrow" for short. > Keyboard shortcuts to insert `<-`: > Windows: Alt- @@ -671,7 +669,7 @@ example_sequence <- seq(15, 30) ``` Doing this creates `example_sequence` in our Environment, but it does not print it. -To get it printed, Run it on a separate line like this: +To get it printed, run it on a separate line like this: ```{r} example_sequence @@ -682,7 +680,7 @@ example_sequence > If you run a function without the assignment (`<-`), its results get printed, but not saved as an object. -Finally, R doesn't mind overwriting an existing object, for example (notice how we then include the variable on a new line to get it printed as well as overwritten): +Finally, R doesn't mind overwriting an existing object, for example: ```{r} example_sequence <- example_sequence/2 @@ -690,6 +688,8 @@ example_sequence <- example_sequence/2 example_sequence ``` +Notice how we then include the variable on a new line to get it printed as well as overwritten. + ### `<-` and `=` Note that many people use `=` instead of `<-`. @@ -710,14 +710,14 @@ Note how the example above uses both operators: the assignment arrow for saving * To summarise, objects and functions work hand in hand. Objects are both an input as well as the output of a function (what the function returns). -* When passing data to a function is is usually its first argument, with further arguments used to specify behaviour. +* When passing data to a function, it is usually the first argument, with further arguments used to specify behaviour. 
* When we say "the function returns", we are referring to its output (or an Error if it's one of those days). * The returned object can be different to its input object. In our `mean()` examples above, the input object was a column (`mydata$var1`: `r mydata$var1`), whereas the output was a single value: `r mean(mydata$var1)`. -* If you've written a line of code that doesn't include the assigment arrow (`<-`), its results would get printed. +* If you've written a line of code that doesn't include the assignment arrow (`<-`), its results would get printed. If you use the assignment arrow, an object holding the results will get saved into the Environment. ## Pipe - `%>%` @@ -736,36 +736,43 @@ library(tidyverse) mydata$var1 %>% mean() ``` -Which reads: "Working with `mydata`, we select a single column called `var1` (with the `$`) **and then** calculate the `mean()`." The pipe becomes especially useful once the analysis includes multiple steps applied one after another. +Which reads: "Working with `mydata`, we select a single column called `var1` (with the `$`) **and then** calculate the `mean()`." +The pipe becomes especially useful once the analysis includes multiple steps applied one after another. A good way to read and think of the pipe is "and then". + + + + + This piping business is not standard R functionality and before using it in a script, you need to tell R this is what you will be doing. The pipe comes from the `magrittr` package (Figure \@ref(fig:chap2-fig-pipe)), but loading the `tidyverse` will also load the pipe. -So library(tidyverse) initialises everything you need (no need to include library(magrittr) explicitly). +So `library(tidyverse)` initialises everything you need. >To insert a pipe `%>%`, use the keyboard shortcut `Ctrl+Shift+M`. With or without the pipe, the general rule "if the result gets printed it doesn't get saved" still applies. 
-To save the result of the function into a new object (so it shows up in the Environment), you need to add the name of the new object with the assignment arow (`<-`): +To save the result of the function into a new object (so it shows up in the Environment), you need to add the name of the new object with the assignment arrow (`<-`): ```{r} mean_result <- mydata$var1 %>% mean() ``` -```{r chap2-fig-pipe, out.width="70%", echo = FALSE, fig.cap="This is not a pipe. René Magritte inspired artwork by Stefan Milton Bache."} +```{r chap2-fig-pipe, out.width="70%", echo = FALSE, fig.cap="This is not a pipe. René Magritte inspired artwork, by Stefan Milton Bache."} knitr::include_graphics("images/chapter02/magrittr.png") ``` ### Using . to direct the pipe -The pipe usually sends data to the beginning of function brackets (as most of the functions we use expect a tibble as the first argument). -So `mydata %>% lm(dependent~explanatory)` is equivalent to `lm(mydata, dependent~explanatory)`. `lm()` - linear model - will be introduced in detail in Chapter \@ref(chap07-h1). +By default, the pipe sends data to the beginning of the function brackets (as most of the functions we use expect data as the first argument). +So `mydata %>% lm(dependent~explanatory)` is equivalent to `lm(mydata, dependent~explanatory)`. +`lm()` - linear model - will be introduced in detail in Chapter \@ref(chap07-h1). However, the `lm()` function does not expect data as its first argument. `lm()` wants us to specify the variables first (`dependent~explanatory`), and then wants the tibble these columns are in. -So we have to use the `.` to tell the pipe to send the data to the second argument of `lm()`, not the first, e.g. 
+So we have to use the `.` to tell the pipe to send the data to the second argument of `lm()`, not the first, e.g., ```{r, eval = FALSE} mydata %>% @@ -792,15 +799,18 @@ Other common operators are the ones we use for filtering data - these are arithm This may be for creating subgroups, or for excluding outliers or incomplete cases. The comparison operators that work with numeric data are relatively straightforward: `>, <, >=, <=`. -The first two check whether your values are greater or less than another value, the last two check for "greater than or equal to" and "less than or equal to. These operators are most commonly spotted inside the `filter()` function: +The first two check whether your values are greater or less than another value; the last two check for "greater than or equal to" and "less than or equal to". +These operators are most commonly spotted inside the `filter()` function: ```{r} gbd_short %>% filter(year < 1995) ``` + Here we send the data (`gbd_short`) to the `filter()` and ask it to retain all years that are less than 1995. The resulting tibble only includes the year 1990. Now, if we use the `<=` (less than or equal to) operator, both 1990 and 1995 pass the filter: + ```{r} gbd_short %>% filter(year <= 1995) @@ -817,14 +827,15 @@ gbd_short %>% This reads: take the GBD dataset, send it to the filter and keep rows where year is equal to 1995. -Accidentally using the single equals `=` when double equals is necessary `==` is a very common mistake and still happens to the best of us. It happens so often that the error the `filter()` function gives when using the wrong one also reminds us what the correct one was: +Accidentally using the single equals `=` when double equals is necessary `==` is a very common mistake and still happens to the best of us. 
+It happens so often that the error the `filter()` function gives when using the wrong one also reminds us what the correct one was: ```{r, error = TRUE} gbd_short %>% filter(year = 1995) ``` -> The answer to 'do you need ==?" is almost always "Yes R, I do, thank you". +> The answer to "do you need ==?" is almost always, "Yes R, I do, thank you". But that's just because `filter()` is a clever cookie and is used to this very common mistake. There are other useful functions we use these operators in, but they don't always know to tell us that we've just confused `=` for `==`. @@ -958,7 +969,7 @@ mydata %>% In R, the exclamation mark (!) means "not". -Sometimes you want to drop a specific value (e.g. an outlier) from the dataset like this. +Sometimes you want to drop a specific value (e.g., an outlier) from the dataset like this. The small example tibble `mydata` has 4 rows, with the values for `var2` as follows: `r mydata$var2`. We can exclude the row where `var2` is equal to 5 by using the "not equals" (`!=`)^[`filter(var2 != 5)` is equivalent to `filter(! var2 == 5)`]: @@ -1098,7 +1109,7 @@ typesdata %>% (We are then using the `select()` function to only choose the three relevant columns.) -Finally, the mutate function can be used to create a new column with a summarised value in it, e.g. the mean of another column: +Finally, the `mutate()` function can be used to create a new column with a summarised value in it, e.g., the mean of another column: ```{r} typesdata %>% @@ -1253,7 +1264,7 @@ right_join(patientdata, labsdata) ### Further notes about joins -* The joins functions (`full_join()`, `inner_join()`, `left_join()`, `right_join()`) will automatically look for matching column names. You can use the `by = ` argument to specify by hand. This is especially useful if the columns are named differently in the datasets, e.g. `left_join(data1, data2, by = c("id" = "patient_id"))`. 
+* The joins functions (`full_join()`, `inner_join()`, `left_join()`, `right_join()`) will automatically look for matching column names. You can use the `by = ` argument to specify by hand. This is especially useful if the columns are named differently in the datasets, e.g., `left_join(data1, data2, by = c("id" = "patient_id"))`. * The rows do not have to be ordered, the joins match on values within the rows, not the order of the rows within the tibble. diff --git a/docs/healthyr-book.pdf b/docs/healthyr-book.pdf index e79a27e..d467c68 100644 Binary files a/docs/healthyr-book.pdf and b/docs/healthyr-book.pdf differ diff --git a/healthyr-book.log b/healthyr-book.log index 67102d6..1f2aace 100644 --- a/healthyr-book.log +++ b/healthyr-book.log @@ -1,4 +1,4 @@ -This is XeTeX, Version 3.14159265-2.6-0.99992 (TeX Live 2015/Debian) (preloaded format=xelatex 2019.7.9) 6 FEB 2020 23:09 +This is XeTeX, Version 3.14159265-2.6-0.99992 (TeX Live 2015/Debian) (preloaded format=xelatex 2019.7.9) 23 FEB 2020 22:31 entering extended mode restricted \write18 enabled. %&-line parsing enabled. 
@@ -1663,43 +1663,43 @@ Underfull \hbox (badness 10000) has occurred while \output is active [] [2] -Underfull \vbox (badness 10000) detected at line 238 +Underfull \vbox (badness 10000) detected at line 240 [] -Overfull \hbox (395.75pt too wide) detected at line 238 +Overfull \hbox (395.75pt too wide) detected at line 240 [] [] -Underfull \vbox (badness 10000) detected at line 238 +Underfull \vbox (badness 10000) detected at line 240 [] -Overfull \hbox (395.75pt too wide) detected at line 238 +Overfull \hbox (395.75pt too wide) detected at line 240 [] [] -Underfull \vbox (badness 10000) detected at line 238 +Underfull \vbox (badness 10000) detected at line 240 [] -Overfull \hbox (395.75pt too wide) detected at line 238 +Overfull \hbox (395.75pt too wide) detected at line 240 [] [] -Underfull \vbox (badness 10000) detected at line 238 +Underfull \vbox (badness 10000) detected at line 240 [] -Overfull \hbox (395.75pt too wide) detected at line 238 +Overfull \hbox (395.75pt too wide) detected at line 240 [] [] -Underfull \hbox (badness 10000) detected at line 238 +Underfull \hbox (badness 10000) detected at line 240 [] [] @@ -1790,43 +1790,43 @@ Underfull \hbox (badness 10000) has occurred while \output is active [12 ] -Underfull \vbox (badness 10000) detected at line 240 +Underfull \vbox (badness 10000) detected at line 242 [] -Overfull \hbox (395.75pt too wide) detected at line 240 +Overfull \hbox (395.75pt too wide) detected at line 242 [] [] -Underfull \vbox (badness 10000) detected at line 240 +Underfull \vbox (badness 10000) detected at line 242 [] -Overfull \hbox (395.75pt too wide) detected at line 240 +Overfull \hbox (395.75pt too wide) detected at line 242 [] [] -Underfull \vbox (badness 10000) detected at line 240 +Underfull \vbox (badness 10000) detected at line 242 [] -Overfull \hbox (395.75pt too wide) detected at line 240 +Overfull \hbox (395.75pt too wide) detected at line 242 [] [] -Underfull \vbox (badness 10000) detected at line 240 +Underfull 
\vbox (badness 10000) detected at line 242 [] -Overfull \hbox (395.75pt too wide) detected at line 240 +Overfull \hbox (395.75pt too wide) detected at line 242 [] [] -Underfull \hbox (badness 10000) detected at line 240 +Underfull \hbox (badness 10000) detected at line 242 [] [] @@ -1845,43 +1845,43 @@ Underfull \hbox (badness 10000) has occurred while \output is active \openout7 = `healthyr-book.lot'. [15] [16] -Underfull \vbox (badness 10000) detected at line 241 +Underfull \vbox (badness 10000) detected at line 243 [] -Overfull \hbox (395.75pt too wide) detected at line 241 +Overfull \hbox (395.75pt too wide) detected at line 243 [] [] -Underfull \vbox (badness 10000) detected at line 241 +Underfull \vbox (badness 10000) detected at line 243 [] -Overfull \hbox (395.75pt too wide) detected at line 241 +Overfull \hbox (395.75pt too wide) detected at line 243 [] [] -Underfull \vbox (badness 10000) detected at line 241 +Underfull \vbox (badness 10000) detected at line 243 [] -Overfull \hbox (395.75pt too wide) detected at line 241 +Overfull \hbox (395.75pt too wide) detected at line 243 [] [] -Underfull \vbox (badness 10000) detected at line 241 +Underfull \vbox (badness 10000) detected at line 243 [] -Overfull \hbox (395.75pt too wide) detected at line 241 +Overfull \hbox (395.75pt too wide) detected at line 243 [] [] -Underfull \hbox (badness 10000) detected at line 241 +Underfull \hbox (badness 10000) detected at line 243 [] [] @@ -1894,7 +1894,7 @@ Underfull \hbox (badness 10000) has occurred while \output is active ] [18] -Underfull \hbox (badness 2781) in paragraph at lines 52--52 +Underfull \hbox (badness 2781) in paragraph at lines 51--51 [][] [][][]\EU1/lmr/m/n/12 (#fig:chap07-fig-bp-personality_type)Scatter and li ne plot. [] @@ -1904,75 +1904,75 @@ ne plot. \openout8 = `healthyr-book.lof'. 
[20] -Underfull \vbox (badness 10000) detected at line 243 +Underfull \vbox (badness 10000) detected at line 245 [] -Overfull \hbox (395.75pt too wide) detected at line 243 +Overfull \hbox (395.75pt too wide) detected at line 245 [] [] -Underfull \vbox (badness 10000) detected at line 243 +Underfull \vbox (badness 10000) detected at line 245 [] -Overfull \hbox (395.75pt too wide) detected at line 243 +Overfull \hbox (395.75pt too wide) detected at line 245 [] [] -Underfull \vbox (badness 10000) detected at line 243 +Underfull \vbox (badness 10000) detected at line 245 [] -Overfull \hbox (395.75pt too wide) detected at line 243 +Overfull \hbox (395.75pt too wide) detected at line 245 [] [] -Underfull \vbox (badness 10000) detected at line 243 +Underfull \vbox (badness 10000) detected at line 245 [] -Overfull \hbox (395.75pt too wide) detected at line 243 +Overfull \hbox (395.75pt too wide) detected at line 245 [] [] -Underfull \hbox (badness 10000) detected at line 243 +Underfull \hbox (badness 10000) detected at line 245 [] [] LaTeX Font Info: Font shape `EU1/latinmodern-math.otf(0)/m/n' will be -(Font) scaled to size 10.0pt on input line 251. +(Font) scaled to size 10.0pt on input line 253. LaTeX Font Info: Font shape `EU1/latinmodern-math.otf(0)/m/n' will be -(Font) scaled to size 7.0pt on input line 251. +(Font) scaled to size 7.0pt on input line 253. LaTeX Font Info: Font shape `EU1/latinmodern-math.otf(0)/m/n' will be -(Font) scaled to size 5.0pt on input line 251. +(Font) scaled to size 5.0pt on input line 253. LaTeX Font Info: Font shape `EU1/latinmodern-math.otf(1)/m/n' will be -(Font) scaled to size 10.00015pt on input line 251. +(Font) scaled to size 10.00015pt on input line 253. LaTeX Font Info: Font shape `EU1/latinmodern-math.otf(1)/m/n' will be -(Font) scaled to size 7.0001pt on input line 251. +(Font) scaled to size 7.0001pt on input line 253. 
LaTeX Font Info: Font shape `EU1/latinmodern-math.otf(1)/m/n' will be -(Font) scaled to size 5.00008pt on input line 251. +(Font) scaled to size 5.00008pt on input line 253. LaTeX Font Info: Font shape `EU1/latinmodern-math.otf(2)/m/n' will be -(Font) scaled to size 9.99985pt on input line 251. +(Font) scaled to size 9.99985pt on input line 253. LaTeX Font Info: Font shape `EU1/latinmodern-math.otf(2)/m/n' will be -(Font) scaled to size 6.9999pt on input line 251. +(Font) scaled to size 6.9999pt on input line 253. LaTeX Font Info: Font shape `EU1/latinmodern-math.otf(2)/m/n' will be -(Font) scaled to size 4.99992pt on input line 251. +(Font) scaled to size 4.99992pt on input line 253. LaTeX Font Info: Font shape `EU1/SourceCodePro(0)/m/n' will be -(Font) scaled to size 6.49994pt on input line 251. +(Font) scaled to size 6.49994pt on input line 253. Font mapping `tex-ansi.tec' for font `Source Code Pro' not found. -Underfull \hbox (badness 10000) in paragraph at lines 255--256 +Underfull \hbox (badness 10000) in paragraph at lines 257--258 []\EU1/lmr/m/n/12 This work is licensed under the Creative Commons Attribution- [] -Overfull \hbox (30.0pt too wide) in paragraph at lines 264--264 +Overfull \hbox (30.0pt too wide) in paragraph at lines 266--266 | [] @@ -1985,43 +1985,43 @@ Underfull \hbox (badness 10000) has occurred while \output is active ] [22] -Underfull \vbox (badness 10000) detected at line 296 +Underfull \vbox (badness 10000) detected at line 298 [] -Overfull \hbox (395.75pt too wide) detected at line 296 +Overfull \hbox (395.75pt too wide) detected at line 298 [] [] -Underfull \vbox (badness 10000) detected at line 296 +Underfull \vbox (badness 10000) detected at line 298 [] -Overfull \hbox (395.75pt too wide) detected at line 296 +Overfull \hbox (395.75pt too wide) detected at line 298 [] [] -Underfull \vbox (badness 10000) detected at line 296 +Underfull \vbox (badness 10000) detected at line 298 [] -Overfull \hbox (395.75pt too wide) detected at line 
296 +Overfull \hbox (395.75pt too wide) detected at line 298 [] [] -Underfull \vbox (badness 10000) detected at line 296 +Underfull \vbox (badness 10000) detected at line 298 [] -Overfull \hbox (395.75pt too wide) detected at line 296 +Overfull \hbox (395.75pt too wide) detected at line 298 [] [] -Underfull \hbox (badness 10000) detected at line 296 +Underfull \hbox (badness 10000) detected at line 298 [] [] @@ -2073,38 +2073,38 @@ Underfull \hbox (badness 10000) has occurred while \output is active [24 ] -Underfull \vbox (badness 10000) detected at line 314 +Underfull \vbox (badness 10000) detected at line 316 [] -Overfull \hbox (395.75pt too wide) detected at line 314 +Overfull \hbox (395.75pt too wide) detected at line 316 [] [] -Underfull \vbox (badness 10000) detected at line 314 +Underfull \vbox (badness 10000) detected at line 316 [] -Overfull \hbox (395.75pt too wide) detected at line 314 +Overfull \hbox (395.75pt too wide) detected at line 316 [] [] -Underfull \vbox (badness 10000) detected at line 314 +Underfull \vbox (badness 10000) detected at line 316 [] -Overfull \hbox (395.75pt too wide) detected at line 314 +Overfull \hbox (395.75pt too wide) detected at line 316 [] [] -Underfull \vbox (badness 10000) detected at line 314 +Underfull \vbox (badness 10000) detected at line 316 [] -Overfull \hbox (395.75pt too wide) detected at line 314 +Overfull \hbox (395.75pt too wide) detected at line 316 [] [] @@ -2185,38 +2185,38 @@ Underfull \hbox (badness 10000) has occurred while \output is active [] [2] -Underfull \vbox (badness 10000) detected at line 317 +Underfull \vbox (badness 10000) detected at line 319 [] -Overfull \hbox (395.75pt too wide) detected at line 317 +Overfull \hbox (395.75pt too wide) detected at line 319 [] [] -Underfull \vbox (badness 10000) detected at line 317 +Underfull \vbox (badness 10000) detected at line 319 [] -Overfull \hbox (395.75pt too wide) detected at line 317 +Overfull \hbox (395.75pt too wide) detected at line 319 [] [] 
-Underfull \vbox (badness 10000) detected at line 317 +Underfull \vbox (badness 10000) detected at line 319 [] -Overfull \hbox (395.75pt too wide) detected at line 317 +Overfull \hbox (395.75pt too wide) detected at line 319 [] [] -Underfull \vbox (badness 10000) detected at line 317 +Underfull \vbox (badness 10000) detected at line 319 [] -Overfull \hbox (395.75pt too wide) detected at line 317 +Overfull \hbox (395.75pt too wide) detected at line 319 [] [] @@ -2233,38 +2233,38 @@ Underfull \hbox (badness 10000) has occurred while \output is active ] File: images/chapter01/example_script.png Graphic file (type QTm) -Overfull \hbox (176.1623pt too wide) in paragraph at lines 376--376 +Overfull \hbox (176.1623pt too wide) in paragraph at lines 378--378 [][] [] [4] [5] File: images/chapter01/rstudio_interface.png Graphic file (type QTm) -Overfull \hbox (313.48265pt too wide) in paragraph at lines 422--422 +Overfull \hbox (313.48265pt too wide) in paragraph at lines 425--425 [][] [] [6] -Overfull \hbox (30.0pt too wide) in paragraph at lines 436--436 +Overfull \hbox (30.0pt too wide) in paragraph at lines 439--439 | [] [7] LaTeX Font Info: Font shape `EU1/SourceCodePro(0)/bx/n' will be -(Font) scaled to size 7.79993pt on input line 485. +(Font) scaled to size 7.79993pt on input line 491. Font mapping `tex-ansi.tec' for font `Source Code Pro Bold' not found. File: images/chapter01/tidyverse_loading_messages.png Graphic file (type QTm) -Overfull \hbox (502.42761pt too wide) in paragraph at lines 489--490 +Overfull \hbox (502.42761pt too wide) in paragraph at lines 495--496 [][] [] [8] [9] LaTeX Font Info: Font shape `EU1/SourceCodePro(0)/m/it' will be -(Font) scaled to size 7.79993pt on input line 542. +(Font) scaled to size 7.79993pt on input line 553. Font mapping `tex-ansi.tec' for font `Source Code Pro Italic' not found. 
[10] -Overfull \hbox (30.0pt too wide) in paragraph at lines 612--612 +Overfull \hbox (30.0pt too wide) in paragraph at lines 623--623 | [] @@ -2308,38 +2308,38 @@ Underfull \hbox (badness 10000) has occurred while \output is active [12 ] -Underfull \vbox (badness 10000) detected at line 617 +Underfull \vbox (badness 10000) detected at line 629 [] -Overfull \hbox (395.75pt too wide) detected at line 617 +Overfull \hbox (395.75pt too wide) detected at line 629 [] [] -Underfull \vbox (badness 10000) detected at line 617 +Underfull \vbox (badness 10000) detected at line 629 [] -Overfull \hbox (395.75pt too wide) detected at line 617 +Overfull \hbox (395.75pt too wide) detected at line 629 [] [] -Underfull \vbox (badness 10000) detected at line 617 +Underfull \vbox (badness 10000) detected at line 629 [] -Overfull \hbox (395.75pt too wide) detected at line 617 +Overfull \hbox (395.75pt too wide) detected at line 629 [] [] -Underfull \vbox (badness 10000) detected at line 617 +Underfull \vbox (badness 10000) detected at line 629 [] -Overfull \hbox (395.75pt too wide) detected at line 617 +Overfull \hbox (395.75pt too wide) detected at line 629 [] [] @@ -2358,135 +2358,138 @@ File: images/chapter02/import_options.png Graphic file (type QTm) File: images/chapter02/code_preview.png Graphic file (type QTm) [15] -Overfull \hbox (30.0pt too wide) in paragraph at lines 745--745 +Overfull \hbox (30.0pt too wide) in paragraph at lines 767--767 | [] -Overfull \hbox (30.0pt too wide) in paragraph at lines 756--756 +Overfull \hbox (30.0pt too wide) in paragraph at lines 780--780 | [] [16] File: 02_basics_files/figure-latex/chap2-fig-gbd-1.pdf Graphic file (type QTm) [17] [18] [19] -Overfull \hbox (30.0pt too wide) in paragraph at lines 932--932 +Overfull \hbox (30.0pt too wide) in paragraph at lines 957--957 | [] - -Overfull \hbox (30.0pt too wide) in paragraph at lines 950--950 +[20] +Overfull \hbox (30.0pt too wide) in paragraph at lines 975--975 | [] -[20] [21] [22] [23] 
-Overfull \hbox (28.31612pt too wide) in paragraph at lines 1202--1202 +[21] [22] [23] [24] +Overfull \hbox (28.31612pt too wide) in paragraph at lines 1228--1228 []\EU1/SourceCodePro(0)/m/n/12 ## Error in `-.POSIXt`(my_datetime, current_date time): can only subtract from "POSIXt" objects[] [] -[24] [25] -Overfull \hbox (28.31618pt too wide) in paragraph at lines 1289--1289 +[25] +Overfull \hbox (28.31618pt too wide) in paragraph at lines 1315--1315 []\EU1/SourceCodePro(0)/m/n/12 ## Error in `/.difftime`(560, my_datesdiff): sec ond argument of / cannot be a "difftime" object[] [] + +Overfull \vbox (6.55443pt too high) has occurred while \output is active [] + [26] LaTeX Font Info: Font shape `EU1/latinmodern-math.otf(0)/m/n' will be -(Font) scaled to size 8.0pt on input line 1387. +(Font) scaled to size 8.0pt on input line 1413. LaTeX Font Info: Font shape `EU1/latinmodern-math.otf(0)/m/n' will be -(Font) scaled to size 6.0pt on input line 1387. +(Font) scaled to size 6.0pt on input line 1413. LaTeX Font Info: Font shape `EU1/latinmodern-math.otf(1)/m/n' will be -(Font) scaled to size 8.00012pt on input line 1387. +(Font) scaled to size 8.00012pt on input line 1413. LaTeX Font Info: Font shape `EU1/latinmodern-math.otf(1)/m/n' will be -(Font) scaled to size 6.00009pt on input line 1387. +(Font) scaled to size 6.00009pt on input line 1413. LaTeX Font Info: Font shape `EU1/latinmodern-math.otf(2)/m/n' will be -(Font) scaled to size 7.99988pt on input line 1387. +(Font) scaled to size 7.99988pt on input line 1413. LaTeX Font Info: Font shape `EU1/latinmodern-math.otf(2)/m/n' will be -(Font) scaled to size 5.99991pt on input line 1387. +(Font) scaled to size 5.99991pt on input line 1413. [27] LaTeX Font Info: Font shape `EU1/SourceCodePro(0)/bx/n' will be -(Font) scaled to size 7.14993pt on input line 1420. +(Font) scaled to size 7.14993pt on input line 1447. Font mapping `tex-ansi.tec' for font `Source Code Pro Bold' not found. 
[28] [29] [30] -Overfull \hbox (30.0pt too wide) in paragraph at lines 1570--1570 +Overfull \hbox (30.0pt too wide) in paragraph at lines 1599--1599 | [] -Overfull \hbox (30.0pt too wide) in paragraph at lines 1609--1609 +Overfull \hbox (30.0pt too wide) in paragraph at lines 1638--1638 | [] [31] -Overfull \hbox (30.0pt too wide) in paragraph at lines 1662--1662 +Overfull \hbox (30.0pt too wide) in paragraph at lines 1691--1691 | [] [32] LaTeX Font Info: Font shape `EU1/SourceCodePro(0)/m/it' will be -(Font) scaled to size 6.49994pt on input line 1741. +(Font) scaled to size 6.49994pt on input line 1759. Font mapping `tex-ansi.tec' for font `Source Code Pro Italic' not found. [33] -Overfull \hbox (30.0pt too wide) in paragraph at lines 1748--1748 +Overfull \hbox (30.0pt too wide) in paragraph at lines 1780--1780 | [] File: images/chapter02/magrittr.png Graphic file (type QTm) [34] [35] -Overfull \hbox (30.0pt too wide) in paragraph at lines 1884--1884 +Overfull \hbox (30.0pt too wide) in paragraph at lines 1919--1919 | [] [36] [37] [38] [39] [40] [41] [42] [43] [44] [45] [46] -Overfull \hbox (30.0pt too wide) in paragraph at lines 2517--2517 +Overfull \hbox (30.0pt too wide) in paragraph at lines 2552--2552 | [] [47] [48] [49] [50] [51] [52] -Underfull \vbox (badness 10000) detected at line 2853 +Underfull \vbox (badness 10000) detected at line 2888 [] -Overfull \hbox (395.75pt too wide) detected at line 2853 +Overfull \hbox (395.75pt too wide) detected at line 2888 [] [] -Underfull \vbox (badness 10000) detected at line 2853 +Underfull \vbox (badness 10000) detected at line 2888 [] -Overfull \hbox (395.75pt too wide) detected at line 2853 +Overfull \hbox (395.75pt too wide) detected at line 2888 [] [] -Underfull \vbox (badness 10000) detected at line 2853 +Underfull \vbox (badness 10000) detected at line 2888 [] -Overfull \hbox (395.75pt too wide) detected at line 2853 +Overfull \hbox (395.75pt too wide) detected at line 2888 [] [] -Underfull \vbox (badness 10000) 
detected at line 2853 +Underfull \vbox (badness 10000) detected at line 2888 [] -Overfull \hbox (395.75pt too wide) detected at line 2853 +Overfull \hbox (395.75pt too wide) detected at line 2888 [] [] Chapter 3. -Overfull \hbox (30.0pt too wide) in paragraph at lines 2860--2860 +Overfull \hbox (30.0pt too wide) in paragraph at lines 2895--2895 | [] -Underfull \hbox (badness 1112) in paragraph at lines 2869--2871 +Underfull \hbox (badness 1112) in paragraph at lines 2904--2906 []\EU1/lmr/m/n/12 reshape data between the wide and long formats: \EU1/SourceCo dePro(0)/m/n/12 pivot_wider() \EU1/lmr/m/n/12 and [] @@ -2508,7 +2511,7 @@ File: 03_summarising_files/figure-latex/chap03-fig-gbd-1.pdf Graphic file (type [55 ] [56] [57] [58] [59] [60] [61] [62] [63] -Underfull \hbox (badness 1596) in paragraph at lines 3488--3489 +Underfull \hbox (badness 1596) in paragraph at lines 3523--3524 []\EU1/lmr/m/n/12 Let’s say you can remember, whether the deaths column was cal led [] @@ -2516,14 +2519,14 @@ led [64] File: images/wide_long.png Graphic file (type QTm) -Overfull \hbox (446.08545pt too wide) in paragraph at lines 3573--3573 +Overfull \hbox (446.08545pt too wide) in paragraph at lines 3608--3608 [][] [] [65] [66] [67 ] [68] [69] [70] [71] -Underfull \hbox (badness 1838) in paragraph at lines 3922--3923 +Underfull \hbox (badness 1838) in paragraph at lines 3957--3958 \EU1/lmr/m/n/12 Read in the full GBD dataset with variables \EU1/SourceCodePro( 0)/m/n/12 cause\EU1/lmr/m/n/12 , \EU1/SourceCodePro(0)/m/n/12 year\EU1/lmr/m/n/ 12 , \EU1/SourceCodePro(0)/m/n/12 sex\EU1/lmr/m/n/12 , \EU1/SourceCodePro(0)/m/ @@ -2531,7 +2534,7 @@ n/12 income\EU1/lmr/m/n/12 , [] [72] [73] [74] [75] -Overfull \hbox (30.0pt too wide) in paragraph at lines 4135--4135 +Overfull \hbox (30.0pt too wide) in paragraph at lines 4170--4170 | [] @@ -2575,44 +2578,44 @@ Underfull \hbox (badness 10000) has occurred while \output is active [78 ] -Underfull \vbox (badness 10000) detected at line 4176 +Underfull 
\vbox (badness 10000) detected at line 4211 [] -Overfull \hbox (395.75pt too wide) detected at line 4176 +Overfull \hbox (395.75pt too wide) detected at line 4211 [] [] -Underfull \vbox (badness 10000) detected at line 4176 +Underfull \vbox (badness 10000) detected at line 4211 [] -Overfull \hbox (395.75pt too wide) detected at line 4176 +Overfull \hbox (395.75pt too wide) detected at line 4211 [] [] -Underfull \vbox (badness 10000) detected at line 4176 +Underfull \vbox (badness 10000) detected at line 4211 [] -Overfull \hbox (395.75pt too wide) detected at line 4176 +Overfull \hbox (395.75pt too wide) detected at line 4211 [] [] -Underfull \vbox (badness 10000) detected at line 4176 +Underfull \vbox (badness 10000) detected at line 4211 [] -Overfull \hbox (395.75pt too wide) detected at line 4176 +Overfull \hbox (395.75pt too wide) detected at line 4211 [] [] Chapter 4. -Overfull \hbox (30.0pt too wide) in paragraph at lines 4184--4184 +Overfull \hbox (30.0pt too wide) in paragraph at lines 4219--4219 | [] @@ -2620,7 +2623,7 @@ File: 04_plotting_files/figure-latex/chap04-fig-steps-1.pdf Graphic file (type QTm) -LaTeX Warning: Float too large for page by 60.57353pt on input line 4210. +LaTeX Warning: Float too large for page by 60.57353pt on input line 4245. 
Underfull \hbox (badness 10000) has occurred while \output is active @@ -2685,13 +2688,13 @@ QTm) File: 04_plotting_files/figure-latex/unnamed-chunk-38-1.pdf Graphic file (type QTm) -Underfull \hbox (badness 6396) in paragraph at lines 5076--5078 +Underfull \hbox (badness 6396) in paragraph at lines 5111--5113 []\EU1/lmr/m/n/12 In the second example, we’re using \EU1/SourceCodePro(0)/m/n/ 12 group_by(continent) \EU1/lmr/m/n/12 followed by [] -Underfull \hbox (badness 1917) in paragraph at lines 5076--5078 +Underfull \hbox (badness 1917) in paragraph at lines 5111--5113 \EU1/SourceCodePro(0)/m/n/12 mutate(country_number = seq_along(country)) \EU1/l mr/m/n/12 to create a new column with [] @@ -2700,38 +2703,38 @@ mr/m/n/12 to create a new column with File: 04_plotting_files/figure-latex/unnamed-chunk-39-1.pdf Graphic file (type QTm) [103] [104] -Underfull \vbox (badness 10000) detected at line 5095 +Underfull \vbox (badness 10000) detected at line 5130 [] -Overfull \hbox (395.75pt too wide) detected at line 5095 +Overfull \hbox (395.75pt too wide) detected at line 5130 [] [] -Underfull \vbox (badness 10000) detected at line 5095 +Underfull \vbox (badness 10000) detected at line 5130 [] -Overfull \hbox (395.75pt too wide) detected at line 5095 +Overfull \hbox (395.75pt too wide) detected at line 5130 [] [] -Underfull \vbox (badness 10000) detected at line 5095 +Underfull \vbox (badness 10000) detected at line 5130 [] -Overfull \hbox (395.75pt too wide) detected at line 5095 +Overfull \hbox (395.75pt too wide) detected at line 5130 [] [] -Underfull \vbox (badness 10000) detected at line 5095 +Underfull \vbox (badness 10000) detected at line 5130 [] -Overfull \hbox (395.75pt too wide) detected at line 5095 +Overfull \hbox (395.75pt too wide) detected at line 5130 [] [] @@ -2781,38 +2784,38 @@ le (type QTm) File: 05_fine_tuning_plots_files/figure-latex/unnamed-chunk-30-1.pdf Graphic fi le (type QTm) [114] -Underfull \vbox (badness 10000) detected at line 5506 +Underfull 
\vbox (badness 10000) detected at line 5541 [] -Overfull \hbox (395.75pt too wide) detected at line 5506 +Overfull \hbox (395.75pt too wide) detected at line 5541 [] [] -Underfull \vbox (badness 10000) detected at line 5506 +Underfull \vbox (badness 10000) detected at line 5541 [] -Overfull \hbox (395.75pt too wide) detected at line 5506 +Overfull \hbox (395.75pt too wide) detected at line 5541 [] [] -Underfull \vbox (badness 10000) detected at line 5506 +Underfull \vbox (badness 10000) detected at line 5541 [] -Overfull \hbox (395.75pt too wide) detected at line 5506 +Overfull \hbox (395.75pt too wide) detected at line 5541 [] [] -Underfull \vbox (badness 10000) detected at line 5506 +Underfull \vbox (badness 10000) detected at line 5541 [] -Overfull \hbox (395.75pt too wide) detected at line 5506 +Overfull \hbox (395.75pt too wide) detected at line 5541 [] [] @@ -2969,44 +2972,44 @@ Underfull \hbox (badness 10000) has occurred while \output is active [120 ] -Underfull \vbox (badness 10000) detected at line 5521 +Underfull \vbox (badness 10000) detected at line 5556 [] -Overfull \hbox (395.75pt too wide) detected at line 5521 +Overfull \hbox (395.75pt too wide) detected at line 5556 [] [] -Underfull \vbox (badness 10000) detected at line 5521 +Underfull \vbox (badness 10000) detected at line 5556 [] -Overfull \hbox (395.75pt too wide) detected at line 5521 +Overfull \hbox (395.75pt too wide) detected at line 5556 [] [] -Underfull \vbox (badness 10000) detected at line 5521 +Underfull \vbox (badness 10000) detected at line 5556 [] -Overfull \hbox (395.75pt too wide) detected at line 5521 +Overfull \hbox (395.75pt too wide) detected at line 5556 [] [] -Underfull \vbox (badness 10000) detected at line 5521 +Underfull \vbox (badness 10000) detected at line 5556 [] -Overfull \hbox (395.75pt too wide) detected at line 5521 +Overfull \hbox (395.75pt too wide) detected at line 5556 [] [] Chapter 6. 
-Overfull \hbox (30.0pt too wide) in paragraph at lines 5528--5528 +Overfull \hbox (30.0pt too wide) in paragraph at lines 5563--5563 | [] @@ -3041,11 +3044,11 @@ Graphic file (type QTm) File: images/chapter06/ttest-object.png Graphic file (type QTm) [129] LaTeX Font Info: Font shape `EU1/latinmodern-math.otf(0)/m/n' will be -(Font) scaled to size 9.0pt on input line 5937. +(Font) scaled to size 9.0pt on input line 5972. LaTeX Font Info: Font shape `EU1/latinmodern-math.otf(1)/m/n' will be -(Font) scaled to size 9.00014pt on input line 5937. +(Font) scaled to size 9.00014pt on input line 5972. LaTeX Font Info: Font shape `EU1/latinmodern-math.otf(2)/m/n' will be -(Font) scaled to size 8.99986pt on input line 5937. +(Font) scaled to size 8.99986pt on input line 5972. [130] File: 06_working_continuous_files/figure-latex/chap06-fig-line-life-asia-1.pdf Graphic file (type QTm) @@ -3060,22 +3063,22 @@ File: 06_working_continuous_files/figure-latex/unnamed-chunk-18-1.pdf Graphic f ile (type QTm) [136] [137] [138] [139] -Underfull \hbox (badness 10000) detected at line 6365 +Underfull \hbox (badness 10000) detected at line 6400 [][] [] -Underfull \hbox (badness 10000) detected at line 6365 +Underfull \hbox (badness 10000) detected at line 6400 [][][] [] -Underfull \hbox (badness 10000) detected at line 6367 +Underfull \hbox (badness 10000) detected at line 6402 [][] [] -Underfull \hbox (badness 10000) detected at line 6367 +Underfull \hbox (badness 10000) detected at line 6402 [][][] [] @@ -3090,32 +3093,32 @@ ile (type QTm) LaTeX Warning: `!h' float specifier changed to `!ht'. 
[143] -Underfull \hbox (badness 10000) detected at line 6601 +Underfull \hbox (badness 10000) detected at line 6636 [][] [] -Underfull \hbox (badness 10000) detected at line 6601 +Underfull \hbox (badness 10000) detected at line 6636 [][][] [] -Underfull \hbox (badness 10000) detected at line 6604 +Underfull \hbox (badness 10000) detected at line 6639 [][] [] -Underfull \hbox (badness 10000) detected at line 6604 +Underfull \hbox (badness 10000) detected at line 6639 [][][] [] -Underfull \hbox (badness 10000) detected at line 6619 +Underfull \hbox (badness 10000) detected at line 6654 [][] [] -Underfull \hbox (badness 10000) detected at line 6619 +Underfull \hbox (badness 10000) detected at line 6654 [][][] [] @@ -3131,44 +3134,44 @@ File: 06_working_continuous_files/figure-latex/unnamed-chunk-31-1.pdf Graphic f ile (type QTm) [148] -Underfull \vbox (badness 10000) detected at line 6780 +Underfull \vbox (badness 10000) detected at line 6815 [] -Overfull \hbox (395.75pt too wide) detected at line 6780 +Overfull \hbox (395.75pt too wide) detected at line 6815 [] [] -Underfull \vbox (badness 10000) detected at line 6780 +Underfull \vbox (badness 10000) detected at line 6815 [] -Overfull \hbox (395.75pt too wide) detected at line 6780 +Overfull \hbox (395.75pt too wide) detected at line 6815 [] [] -Underfull \vbox (badness 10000) detected at line 6780 +Underfull \vbox (badness 10000) detected at line 6815 [] -Overfull \hbox (395.75pt too wide) detected at line 6780 +Overfull \hbox (395.75pt too wide) detected at line 6815 [] [] -Underfull \vbox (badness 10000) detected at line 6780 +Underfull \vbox (badness 10000) detected at line 6815 [] -Overfull \hbox (395.75pt too wide) detected at line 6780 +Overfull \hbox (395.75pt too wide) detected at line 6815 [] [] Chapter 7. 
-Overfull \hbox (30.0pt too wide) in paragraph at lines 6787--6787 +Overfull \hbox (30.0pt too wide) in paragraph at lines 6822--6822 | [] @@ -3183,7 +3186,7 @@ Underfull \hbox (badness 10000) has occurred while \output is active ] File: images/chapter07/1_regression_terms.pdf Graphic file (type QTm) -Underfull \hbox (badness 2181) in paragraph at lines 6858--6859 +Underfull \hbox (badness 2181) in paragraph at lines 6893--6894 []\EU1/lmr/m/n/12 The regression apps and example figures in this chapter have been [] @@ -3192,45 +3195,45 @@ been File: images/chapter07/2_residuals.pdf Graphic file (type QTm) -LaTeX Warning: Float too large for page by 8.84102pt on input line 6864. +LaTeX Warning: Float too large for page by 8.84102pt on input line 6899. -Underfull \hbox (badness 10000) detected at line 6875 +Underfull \hbox (badness 10000) detected at line 6910 [][] [] -Underfull \hbox (badness 10000) detected at line 6875 +Underfull \hbox (badness 10000) detected at line 6910 [][][] [] -Underfull \hbox (badness 10000) detected at line 6877 +Underfull \hbox (badness 10000) detected at line 6912 [][] [] -Underfull \hbox (badness 10000) detected at line 6877 +Underfull \hbox (badness 10000) detected at line 6912 [][][] [] -Underfull \hbox (badness 10000) detected at line 6879 +Underfull \hbox (badness 10000) detected at line 6914 [][] [] -Underfull \hbox (badness 10000) detected at line 6879 +Underfull \hbox (badness 10000) detected at line 6914 [][][] [] -Underfull \hbox (badness 10000) detected at line 6881 +Underfull \hbox (badness 10000) detected at line 6916 [][] [] -Underfull \hbox (badness 10000) detected at line 6881 +Underfull \hbox (badness 10000) detected at line 6916 [][][] [] @@ -3240,7 +3243,7 @@ File: images/chapter07/3_diags.pdf Graphic file (type QTm) File: images/chapter07/4_equation.pdf Graphic file (type QTm) -LaTeX Warning: Float too large for page by 3.41142pt on input line 6949. 
+LaTeX Warning: Float too large for page by 3.41142pt on input line 6984. [154] [155] [156] File: images/chapter07/5_dags.pdf Graphic file (type QTm) @@ -3248,7 +3251,7 @@ File: images/chapter07/5_dags.pdf Graphic file (type QTm) File: images/chapter07/6_types.pdf Graphic file (type QTm) -LaTeX Warning: Float too large for page by 15.24742pt on input line 7020. +LaTeX Warning: Float too large for page by 15.24742pt on input line 7055. [158] [159] File: images/chapter07/7_confounding.pdf Graphic file (type QTm) @@ -3261,7 +3264,7 @@ e (type QTm) LaTeX Warning: Reference `chap07-fig-types' on page 163 undefined on input line - 7167. + 7202. [163] [164] [165] [166] [167] File: 07_linear_regression_files/figure-latex/unnamed-chunk-14-1.pdf Graphic fi @@ -3278,62 +3281,62 @@ File: 07_linear_regression_files/figure-latex/chap07-diags-example-1.pdf Graphi c file (type QTm) [1 72] -Underfull \hbox (badness 10000) detected at line 7622 +Underfull \hbox (badness 10000) detected at line 7657 [][] [] -Underfull \hbox (badness 10000) detected at line 7622 +Underfull \hbox (badness 10000) detected at line 7657 [][][] [] -Underfull \hbox (badness 10000) detected at line 7624 +Underfull \hbox (badness 10000) detected at line 7659 [][] [] -Underfull \hbox (badness 10000) detected at line 7624 +Underfull \hbox (badness 10000) detected at line 7659 [][][] [] -Underfull \hbox (badness 10000) detected at line 7626 +Underfull \hbox (badness 10000) detected at line 7661 [][] [] -Underfull \hbox (badness 10000) detected at line 7626 +Underfull \hbox (badness 10000) detected at line 7661 [][][] [] -Underfull \hbox (badness 10000) detected at line 7628 +Underfull \hbox (badness 10000) detected at line 7663 [][] [] -Underfull \hbox (badness 10000) detected at line 7628 +Underfull \hbox (badness 10000) detected at line 7663 [][][] [] -Underfull \hbox (badness 10000) detected at line 7630 +Underfull \hbox (badness 10000) detected at line 7665 [][] [] -Underfull \hbox (badness 10000) detected 
at line 7630 +Underfull \hbox (badness 10000) detected at line 7665 [][][] [] -Underfull \hbox (badness 10000) detected at line 7632 +Underfull \hbox (badness 10000) detected at line 7667 [][] [] -Underfull \hbox (badness 10000) detected at line 7632 +Underfull \hbox (badness 10000) detected at line 7667 [][][] [] @@ -3347,7 +3350,7 @@ pdf Graphic file (type QTm) LaTeX Warning: `!h' float specifier changed to `!ht'. -Overfull \hbox (451.74408pt too wide) in paragraph at lines 8059--8075 +Overfull \hbox (451.74408pt too wide) in paragraph at lines 8094--8110 [] [] @@ -3359,44 +3362,44 @@ File: 07_linear_regression_files/figure-latex/unnamed-chunk-34-1.pdf Graphic fi le (type QTm) [181] [182] -Underfull \vbox (badness 10000) detected at line 8122 +Underfull \vbox (badness 10000) detected at line 8157 [] -Overfull \hbox (395.75pt too wide) detected at line 8122 +Overfull \hbox (395.75pt too wide) detected at line 8157 [] [] -Underfull \vbox (badness 10000) detected at line 8122 +Underfull \vbox (badness 10000) detected at line 8157 [] -Overfull \hbox (395.75pt too wide) detected at line 8122 +Overfull \hbox (395.75pt too wide) detected at line 8157 [] [] -Underfull \vbox (badness 10000) detected at line 8122 +Underfull \vbox (badness 10000) detected at line 8157 [] -Overfull \hbox (395.75pt too wide) detected at line 8122 +Overfull \hbox (395.75pt too wide) detected at line 8157 [] [] -Underfull \vbox (badness 10000) detected at line 8122 +Underfull \vbox (badness 10000) detected at line 8157 [] -Overfull \hbox (395.75pt too wide) detected at line 8122 +Overfull \hbox (395.75pt too wide) detected at line 8157 [] [] Chapter 8. 
-Overfull \hbox (30.0pt too wide) in paragraph at lines 8129--8129 +Overfull \hbox (30.0pt too wide) in paragraph at lines 8164--8164 | [] @@ -3409,7 +3412,7 @@ Underfull \hbox (badness 10000) has occurred while \output is active ] [184] -Overfull \vbox (0.93793pt too high) detected at line 8288 +Overfull \vbox (0.93793pt too high) detected at line 8323 [] [185] [186] @@ -3439,50 +3442,50 @@ Missing character: There is no ≤ in font [lmroman12-regular]:mapping=tex-text! LaTeX Warning: `!h' float specifier changed to `!ht'. -Underfull \hbox (badness 1642) in paragraph at lines 9007--9008 +Underfull \hbox (badness 1642) in paragraph at lines 9042--9043 []\EU1/lmr/m/n/12 Change the continuous variable summary statsics from to \EU1/ SourceCodePro(0)/m/n/12 median \EU1/lmr/m/n/12 and [] [199] [200] -Underfull \vbox (badness 10000) detected at line 9033 +Underfull \vbox (badness 10000) detected at line 9068 [] -Overfull \hbox (395.75pt too wide) detected at line 9033 +Overfull \hbox (395.75pt too wide) detected at line 9068 [] [] -Underfull \vbox (badness 10000) detected at line 9033 +Underfull \vbox (badness 10000) detected at line 9068 [] -Overfull \hbox (395.75pt too wide) detected at line 9033 +Overfull \hbox (395.75pt too wide) detected at line 9068 [] [] -Underfull \vbox (badness 10000) detected at line 9033 +Underfull \vbox (badness 10000) detected at line 9068 [] -Overfull \hbox (395.75pt too wide) detected at line 9033 +Overfull \hbox (395.75pt too wide) detected at line 9068 [] [] -Underfull \vbox (badness 10000) detected at line 9033 +Underfull \vbox (badness 10000) detected at line 9068 [] -Overfull \hbox (395.75pt too wide) detected at line 9033 +Overfull \hbox (395.75pt too wide) detected at line 9068 [] [] Chapter 9. 
-Overfull \hbox (30.0pt too wide) in paragraph at lines 9040--9040 +Overfull \hbox (30.0pt too wide) in paragraph at lines 9075--9075 | [] @@ -3502,18 +3505,18 @@ File: images/chapter09/1_or.pdf Graphic file (type QTm) File: images/chapter09/2_prob_logodds.pdf Graphic file (type QTm) -LaTeX Warning: Float too large for page by 41.34337pt on input line 9222. +LaTeX Warning: Float too large for page by 41.34337pt on input line 9257. [205] [206] File: images/chapter09/4_equation.pdf Graphic file (type QTm) -LaTeX Warning: Float too large for page by 16.9754pt on input line 9245. +LaTeX Warning: Float too large for page by 16.9754pt on input line 9280. File: images/chapter09/6_types.pdf Graphic file (type QTm) -LaTeX Warning: Float too large for page by 16.9754pt on input line 9270. +LaTeX Warning: Float too large for page by 16.9754pt on input line 9305. File: images/chapter09/7_interactions.pdf Graphic file (type QTm) [207] [208] [209] [210] [211] @@ -3528,42 +3531,42 @@ file (type QTm) LaTeX Warning: `!h' float specifier changed to `!ht'. 
[214] -Underfull \hbox (badness 10000) detected at line 9537 +Underfull \hbox (badness 10000) detected at line 9572 [][] [] -Underfull \hbox (badness 10000) detected at line 9537 +Underfull \hbox (badness 10000) detected at line 9572 [][][] [] -Underfull \hbox (badness 10000) detected at line 9539 +Underfull \hbox (badness 10000) detected at line 9574 [][] [] -Underfull \hbox (badness 10000) detected at line 9539 +Underfull \hbox (badness 10000) detected at line 9574 [][][] [] -Underfull \hbox (badness 10000) detected at line 9541 +Underfull \hbox (badness 10000) detected at line 9576 [][] [] -Underfull \hbox (badness 10000) detected at line 9541 +Underfull \hbox (badness 10000) detected at line 9576 [][][] [] -Underfull \hbox (badness 10000) detected at line 9543 +Underfull \hbox (badness 10000) detected at line 9578 [][] [] -Underfull \hbox (badness 10000) detected at line 9543 +Underfull \hbox (badness 10000) detected at line 9578 [][][] [] @@ -3587,62 +3590,62 @@ File: 09_logistic_regression_files/figure-latex/unnamed-chunk-17-1.pdf Graphic file (type QTm) [219 ] [220] [221] -Underfull \hbox (badness 10000) detected at line 9806 +Underfull \hbox (badness 10000) detected at line 9841 [][] [] -Underfull \hbox (badness 10000) detected at line 9806 +Underfull \hbox (badness 10000) detected at line 9841 [][][] [] -Underfull \hbox (badness 10000) detected at line 9808 +Underfull \hbox (badness 10000) detected at line 9843 [][] [] -Underfull \hbox (badness 10000) detected at line 9808 +Underfull \hbox (badness 10000) detected at line 9843 [][][] [] -Underfull \hbox (badness 10000) detected at line 9810 +Underfull \hbox (badness 10000) detected at line 9845 [][] [] -Underfull \hbox (badness 10000) detected at line 9810 +Underfull \hbox (badness 10000) detected at line 9845 [][][] [] -Underfull \hbox (badness 10000) detected at line 9812 +Underfull \hbox (badness 10000) detected at line 9847 [][] [] -Underfull \hbox (badness 10000) detected at line 9812 +Underfull \hbox 
(badness 10000) detected at line 9847 [][][] [] -Underfull \hbox (badness 10000) detected at line 9814 +Underfull \hbox (badness 10000) detected at line 9849 [][] [] -Underfull \hbox (badness 10000) detected at line 9814 +Underfull \hbox (badness 10000) detected at line 9849 [][][] [] -Underfull \hbox (badness 10000) detected at line 9816 +Underfull \hbox (badness 10000) detected at line 9851 [][] [] -Underfull \hbox (badness 10000) detected at line 9816 +Underfull \hbox (badness 10000) detected at line 9851 [][][] [] @@ -3667,12 +3670,12 @@ LaTeX Warning: `!h' float specifier changed to `!ht'. LaTeX Warning: `!h' float specifier changed to `!ht'. [226] -Overfull \hbox (314.6321pt too wide) in paragraph at lines 10192--10208 +Overfull \hbox (314.6321pt too wide) in paragraph at lines 10227--10243 [] [] [227] -Overfull \hbox (314.6321pt too wide) in paragraph at lines 10259--10275 +Overfull \hbox (314.6321pt too wide) in paragraph at lines 10294--10310 [] [] @@ -3689,12 +3692,12 @@ File: 09_logistic_regression_files/figure-latex/unnamed-chunk-41-1.pdf Graphic file (type QTm) [232 ] [233] -Overfull \hbox (314.6321pt too wide) in paragraph at lines 10641--10665 +Overfull \hbox (314.6321pt too wide) in paragraph at lines 10676--10700 [] [] -Overfull \hbox (37.85204pt too wide) in paragraph at lines 10641--10665 +Overfull \hbox (37.85204pt too wide) in paragraph at lines 10676--10700 [] [] @@ -3741,44 +3744,44 @@ Underfull \hbox (badness 10000) has occurred while \output is active [236 ] -Underfull \vbox (badness 10000) detected at line 10705 +Underfull \vbox (badness 10000) detected at line 10740 [] -Overfull \hbox (395.75pt too wide) detected at line 10705 +Overfull \hbox (395.75pt too wide) detected at line 10740 [] [] -Underfull \vbox (badness 10000) detected at line 10705 +Underfull \vbox (badness 10000) detected at line 10740 [] -Overfull \hbox (395.75pt too wide) detected at line 10705 +Overfull \hbox (395.75pt too wide) detected at line 10740 [] [] -Underfull 
\vbox (badness 10000) detected at line 10705 +Underfull \vbox (badness 10000) detected at line 10740 [] -Overfull \hbox (395.75pt too wide) detected at line 10705 +Overfull \hbox (395.75pt too wide) detected at line 10740 [] [] -Underfull \vbox (badness 10000) detected at line 10705 +Underfull \vbox (badness 10000) detected at line 10740 [] -Overfull \hbox (395.75pt too wide) detected at line 10705 +Overfull \hbox (395.75pt too wide) detected at line 10740 [] [] Chapter 10. -Overfull \hbox (30.0pt too wide) in paragraph at lines 10712--10712 +Overfull \hbox (30.0pt too wide) in paragraph at lines 10747--10747 | [] @@ -3801,7 +3804,7 @@ QTm) File: 10_survival_files/figure-latex/unnamed-chunk-24-1.pdf Graphic file (type QTm) [248] [249] -Overfull \hbox (0.29604pt too wide) in paragraph at lines 11422--11423 +Overfull \hbox (0.29604pt too wide) in paragraph at lines 11457--11458 []\EU1/lmr/m/n/12 Hierarchical structure in your data can be accommodated with cluster or frailty [] @@ -3810,38 +3813,38 @@ File: 10_survival_files/figure-latex/unnamed-chunk-27-1.pdf Graphic file (type QTm) [250] [251] [25 2] [253] -Underfull \vbox (badness 10000) detected at line 11585 +Underfull \vbox (badness 10000) detected at line 11620 [] -Overfull \hbox (395.75pt too wide) detected at line 11585 +Overfull \hbox (395.75pt too wide) detected at line 11620 [] [] -Underfull \vbox (badness 10000) detected at line 11585 +Underfull \vbox (badness 10000) detected at line 11620 [] -Overfull \hbox (395.75pt too wide) detected at line 11585 +Overfull \hbox (395.75pt too wide) detected at line 11620 [] [] -Underfull \vbox (badness 10000) detected at line 11585 +Underfull \vbox (badness 10000) detected at line 11620 [] -Overfull \hbox (395.75pt too wide) detected at line 11585 +Overfull \hbox (395.75pt too wide) detected at line 11620 [] [] -Underfull \vbox (badness 10000) detected at line 11585 +Underfull \vbox (badness 10000) detected at line 11620 [] -Overfull \hbox (395.75pt too wide) detected 
at line 11585 +Overfull \hbox (395.75pt too wide) detected at line 11620 [] [] @@ -3961,38 +3964,38 @@ Underfull \hbox (badness 10000) has occurred while \output is active [258 ] -Underfull \vbox (badness 10000) detected at line 11593 +Underfull \vbox (badness 10000) detected at line 11628 [] -Overfull \hbox (395.75pt too wide) detected at line 11593 +Overfull \hbox (395.75pt too wide) detected at line 11628 [] [] -Underfull \vbox (badness 10000) detected at line 11593 +Underfull \vbox (badness 10000) detected at line 11628 [] -Overfull \hbox (395.75pt too wide) detected at line 11593 +Overfull \hbox (395.75pt too wide) detected at line 11628 [] [] -Underfull \vbox (badness 10000) detected at line 11593 +Underfull \vbox (badness 10000) detected at line 11628 [] -Overfull \hbox (395.75pt too wide) detected at line 11593 +Overfull \hbox (395.75pt too wide) detected at line 11628 [] [] -Underfull \vbox (badness 10000) detected at line 11593 +Underfull \vbox (badness 10000) detected at line 11628 [] -Overfull \hbox (395.75pt too wide) detected at line 11593 +Overfull \hbox (395.75pt too wide) detected at line 11628 [] [] @@ -4011,31 +4014,31 @@ File: images/chapter11/3_help.pdf Graphic file (type QTm) File: images/chapter11/2_anatomy_rotated.pdf Graphic file (type QTm) -LaTeX Warning: Float too large for page by 16.9754pt on input line 11718. +LaTeX Warning: Float too large for page by 16.9754pt on input line 11753. [262] [263] [264] [265] File: images/chapter11/4_notebook_options_rotated.pdf Graphic file (type QTm) -LaTeX Warning: Float too large for page by 3.96342pt on input line 11852. +LaTeX Warning: Float too large for page by 3.96342pt on input line 11887. 
[266] [267] [268] [269] -Underfull \hbox (badness 10000) detected at line 12017 +Underfull \hbox (badness 10000) detected at line 12052 [][] [] -Underfull \hbox (badness 10000) detected at line 12017 +Underfull \hbox (badness 10000) detected at line 12052 [][][] [] -Underfull \hbox (badness 10000) detected at line 12033 +Underfull \hbox (badness 10000) detected at line 12068 [][] [] -Underfull \hbox (badness 10000) detected at line 12033 +Underfull \hbox (badness 10000) detected at line 12068 [][][] [] @@ -4079,38 +4082,38 @@ Underfull \hbox (badness 10000) has occurred while \output is active [272 ] -Underfull \vbox (badness 10000) detected at line 12079 +Underfull \vbox (badness 10000) detected at line 12114 [] -Overfull \hbox (395.75pt too wide) detected at line 12079 +Overfull \hbox (395.75pt too wide) detected at line 12114 [] [] -Underfull \vbox (badness 10000) detected at line 12079 +Underfull \vbox (badness 10000) detected at line 12114 [] -Overfull \hbox (395.75pt too wide) detected at line 12079 +Overfull \hbox (395.75pt too wide) detected at line 12114 [] [] -Underfull \vbox (badness 10000) detected at line 12079 +Underfull \vbox (badness 10000) detected at line 12114 [] -Overfull \hbox (395.75pt too wide) detected at line 12079 +Overfull \hbox (395.75pt too wide) detected at line 12114 [] [] -Underfull \vbox (badness 10000) detected at line 12079 +Underfull \vbox (badness 10000) detected at line 12114 [] -Overfull \hbox (395.75pt too wide) detected at line 12079 +Overfull \hbox (395.75pt too wide) detected at line 12114 [] [] @@ -4175,38 +4178,38 @@ Underfull \hbox (badness 10000) has occurred while \output is active [284 ] -Underfull \vbox (badness 10000) detected at line 12493 +Underfull \vbox (badness 10000) detected at line 12528 [] -Overfull \hbox (395.75pt too wide) detected at line 12493 +Overfull \hbox (395.75pt too wide) detected at line 12528 [] [] -Underfull \vbox (badness 10000) detected at line 12493 +Underfull \vbox (badness 10000) 
detected at line 12528 [] -Overfull \hbox (395.75pt too wide) detected at line 12493 +Overfull \hbox (395.75pt too wide) detected at line 12528 [] [] -Underfull \vbox (badness 10000) detected at line 12493 +Underfull \vbox (badness 10000) detected at line 12528 [] -Overfull \hbox (395.75pt too wide) detected at line 12493 +Overfull \hbox (395.75pt too wide) detected at line 12528 [] [] -Underfull \vbox (badness 10000) detected at line 12493 +Underfull \vbox (badness 10000) detected at line 12528 [] -Overfull \hbox (395.75pt too wide) detected at line 12493 +Overfull \hbox (395.75pt too wide) detected at line 12528 [] [] @@ -4222,7 +4225,7 @@ Underfull \hbox (badness 10000) has occurred while \output is active [285 ] [286] -Overfull \hbox (30.0pt too wide) in paragraph at lines 12565--12565 +Overfull \hbox (30.0pt too wide) in paragraph at lines 12600--12600 | [] @@ -4273,89 +4276,89 @@ Underfull \hbox (badness 10000) has occurred while \output is active [292 ] -Underfull \vbox (badness 10000) detected at line 12636 +Underfull \vbox (badness 10000) detected at line 12671 [] -Overfull \hbox (395.75pt too wide) detected at line 12636 +Overfull \hbox (395.75pt too wide) detected at line 12671 [] [] -Underfull \vbox (badness 10000) detected at line 12636 +Underfull \vbox (badness 10000) detected at line 12671 [] -Overfull \hbox (395.75pt too wide) detected at line 12636 +Overfull \hbox (395.75pt too wide) detected at line 12671 [] [] -Underfull \vbox (badness 10000) detected at line 12636 +Underfull \vbox (badness 10000) detected at line 12671 [] -Overfull \hbox (395.75pt too wide) detected at line 12636 +Overfull \hbox (395.75pt too wide) detected at line 12671 [] [] -Underfull \vbox (badness 10000) detected at line 12636 +Underfull \vbox (badness 10000) detected at line 12671 [] -Overfull \hbox (395.75pt too wide) detected at line 12636 +Overfull \hbox (395.75pt too wide) detected at line 12671 [] [] Chapter 14. 
-Underfull \hbox (badness 10000) detected at line 12655 +Underfull \hbox (badness 10000) detected at line 12690 [][] [] -Underfull \hbox (badness 10000) detected at line 12655 +Underfull \hbox (badness 10000) detected at line 12690 [][][] [] -Underfull \hbox (badness 10000) detected at line 12657 +Underfull \hbox (badness 10000) detected at line 12692 [][] [] -Underfull \hbox (badness 10000) detected at line 12657 +Underfull \hbox (badness 10000) detected at line 12692 [][][] [] -Underfull \hbox (badness 10000) detected at line 12659 +Underfull \hbox (badness 10000) detected at line 12694 [][] [] -Underfull \hbox (badness 10000) detected at line 12659 +Underfull \hbox (badness 10000) detected at line 12694 [][][] [] -Underfull \hbox (badness 10000) detected at line 12661 +Underfull \hbox (badness 10000) detected at line 12696 [][] [] -Underfull \hbox (badness 10000) detected at line 12661 +Underfull \hbox (badness 10000) detected at line 12696 [][][] [] -Underfull \hbox (badness 10000) detected at line 12663 +Underfull \hbox (badness 10000) detected at line 12698 [][] [] -Underfull \hbox (badness 10000) detected at line 12663 +Underfull \hbox (badness 10000) detected at line 12698 [][][] [] @@ -4367,32 +4370,32 @@ Underfull \hbox (badness 10000) has occurred while \output is active [293 ] [294] -Underfull \hbox (badness 10000) detected at line 12724 +Underfull \hbox (badness 10000) detected at line 12759 [][] [] -Underfull \hbox (badness 10000) detected at line 12724 +Underfull \hbox (badness 10000) detected at line 12759 [][][] [] -Underfull \hbox (badness 10000) detected at line 12726 +Underfull \hbox (badness 10000) detected at line 12761 [][] [] -Underfull \hbox (badness 10000) detected at line 12726 +Underfull \hbox (badness 10000) detected at line 12761 [][][] [] -Underfull \hbox (badness 10000) detected at line 12728 +Underfull \hbox (badness 10000) detected at line 12763 [][] [] -Underfull \hbox (badness 10000) detected at line 12728 +Underfull \hbox 
(badness 10000) detected at line 12763 [][][] [] @@ -4429,11 +4432,11 @@ bserved data: missing_pairs | missing_compare \EU1/lmr/m/n/10 301 [] [301] -Underfull \vbox (badness 10000) detected at line 13036 +Underfull \vbox (badness 10000) detected at line 13071 [] -Underfull \vbox (badness 10000) detected at line 13036 +Underfull \vbox (badness 10000) detected at line 13071 [] [302] @@ -4443,7 +4446,7 @@ bserved data: missing_pairs | missing_compare \EU1/lmr/m/n/10 303 [] [303] -Overfull \hbox (4.9164pt too wide) in paragraph at lines 13124--13124 +Overfull \hbox (4.9164pt too wide) in paragraph at lines 13159--13159 []\EU1/SourceCodePro(0)/m/n/12 ## Either the test of multivariate normality or homoscedasticity (or both) is rejected.[] [] @@ -4488,38 +4491,38 @@ Underfull \hbox (badness 10000) has occurred while \output is active [310 ] -Underfull \vbox (badness 10000) detected at line 13446 +Underfull \vbox (badness 10000) detected at line 13481 [] -Overfull \hbox (395.75pt too wide) detected at line 13446 +Overfull \hbox (395.75pt too wide) detected at line 13481 [] [] -Underfull \vbox (badness 10000) detected at line 13446 +Underfull \vbox (badness 10000) detected at line 13481 [] -Overfull \hbox (395.75pt too wide) detected at line 13446 +Overfull \hbox (395.75pt too wide) detected at line 13481 [] [] -Underfull \vbox (badness 10000) detected at line 13446 +Underfull \vbox (badness 10000) detected at line 13481 [] -Overfull \hbox (395.75pt too wide) detected at line 13446 +Overfull \hbox (395.75pt too wide) detected at line 13481 [] [] -Underfull \vbox (badness 10000) detected at line 13446 +Underfull \vbox (badness 10000) detected at line 13481 [] -Overfull \hbox (395.75pt too wide) detected at line 13446 +Overfull \hbox (395.75pt too wide) detected at line 13481 [] [] @@ -4571,38 +4574,38 @@ Underfull \hbox (badness 10000) has occurred while \output is active [318 ] -Underfull \vbox (badness 10000) detected at line 13766 +Underfull \vbox (badness 10000) 
detected at line 13801 [] -Overfull \hbox (395.75pt too wide) detected at line 13766 +Overfull \hbox (395.75pt too wide) detected at line 13801 [] [] -Underfull \vbox (badness 10000) detected at line 13766 +Underfull \vbox (badness 10000) detected at line 13801 [] -Overfull \hbox (395.75pt too wide) detected at line 13766 +Overfull \hbox (395.75pt too wide) detected at line 13801 [] [] -Underfull \vbox (badness 10000) detected at line 13766 +Underfull \vbox (badness 10000) detected at line 13801 [] -Overfull \hbox (395.75pt too wide) detected at line 13766 +Overfull \hbox (395.75pt too wide) detected at line 13801 [] [] -Underfull \vbox (badness 10000) detected at line 13766 +Underfull \vbox (badness 10000) detected at line 13801 [] -Overfull \hbox (395.75pt too wide) detected at line 13766 +Overfull \hbox (395.75pt too wide) detected at line 13801 [] [] @@ -4615,7 +4618,7 @@ Underfull \hbox (badness 10000) has occurred while \output is active [319 ] [320] -Underfull \hbox (badness 10000) in paragraph at lines 13870--13871 +Underfull \hbox (badness 10000) in paragraph at lines 13905--13906 []\EU1/lmr/m/n/12 If you already have LaTex installed on your computer, the [] @@ -4763,23 +4766,23 @@ Underfull \hbox (badness 10000) has occurred while \output is active ] [326] [327] [328]) -Package atveryend Info: Empty hook `BeforeClearDocument' on input line 13877. -Package atveryend Info: Empty hook `AfterLastShipout' on input line 13877. +Package atveryend Info: Empty hook `BeforeClearDocument' on input line 13912. +Package atveryend Info: Empty hook `AfterLastShipout' on input line 13912. (./healthyr-book.aux) -Package atveryend Info: Empty hook `AtVeryEndDocument' on input line 13877. -Package atveryend Info: Empty hook `AtEndAfterFileList' on input line 13877. +Package atveryend Info: Empty hook `AtVeryEndDocument' on input line 13912. +Package atveryend Info: Empty hook `AtEndAfterFileList' on input line 13912. LaTeX Warning: There were undefined references. 
LaTeX Warning: There were multiply-defined labels. -Package atveryend Info: Empty hook `AtVeryVeryEnd' on input line 13877. +Package atveryend Info: Empty hook `AtVeryVeryEnd' on input line 13912. ) Here is how much of TeX's memory you used: 32207 strings out of 495579 560459 string characters out of 6187323 - 790917 words of memory out of 5000000 + 789872 words of memory out of 5000000 34943 multiletter control sequences out of 15000+600000 13568 words of font info for 110 fonts, out of 8000000 for 9000 14 hyphenation exceptions out of 8191 diff --git a/index.Rmd b/index.Rmd index b3047fc..1566319 100755 --- a/index.Rmd +++ b/index.Rmd @@ -46,7 +46,7 @@ lapply(healthyr_notebooks, function(pkg) { # Preface {-} -**DRAFT** +**For draft version** This is the electronic version of the HealthyR book to be published by CRC Press/Chapman & Hall in summer 2020. The electronic version will always be freely available. @@ -63,28 +63,28 @@ This work is licensed under the Creative Commons Attribution-NonCommercial-NoDer > We are drowning in information but starved for knowledge. > John Naisbitt -In this age of information, the manipulation, analysis and interpretation of data has become paramount. +In this age of information, the manipulation, analysis and interpretation of data has become a fundamental part of professional life. Nowhere more so than in the delivery of healthcare. From the understanding of disease and the development of new treatments, to the diagnosis and management of individual patients, the use of data and technology is now an integral part of the business of healthcare. Those working in healthcare interact daily with data, often without realising it. The conversion of this avalanche of information to useful knowledge is essential for high quality patient care. An important part of this information revolution is the opportunity for everybody to become involved in data analysis. 
-This democratisation of data analysis is driven in part by the open source software movement – no longer do we require expensive specialised software to do this. +This democratisation is driven in part by the open source software movement – no longer do we require expensive specialised software to do this. -The statistical programming language, R, is firmly at the heart of this! +The statistical programming language, R, is firmly at the heart of this. -This book will take an individual with little or no experience in data analysis all the way through to performing sophisticated analyses. -We emphasise the importance of understanding the underlying data with liberal use of plotting, rather than relying on opaque and possibly poorly understand statistical tests. +This book will take an individual with little or no experience in data science, all the way through to the execution of sophisticated analyses. +We emphasise the importance of truly understanding the underlying data with liberal use of plotting, rather than relying on opaque and possibly poorly understood statistical tests. There are numerous examples included that can be adapted for your own data, together with our own R packages with easy-to-use functions. We have a lot of fun teaching this course and focus on making the material as accessible as possible. -We banish equations in favour of code and use examples rather than lengthy explanations. -We are grateful to the many individuals and students who have helped refine these and welcome suggestions and bug reports via https://github.com/SurgicalInformatics. +We keep equations to a minimum in favour of code, and use examples rather than lengthy explanations. +We are grateful to the many individuals and students who have helped refine this book and welcome suggestions and bug reports via https://github.com/SurgicalInformatics. 
-Ewen Harrison and Riinu Ots +Ewen Harrison and Riinu Pius -August 2019 +March 2020 @@ -93,7 +93,7 @@ August 2019 ## Acknowledgments {-} -A lot of people helped us when we were writing the book. +Katie Connor, Tom Drake, Cameron Fairfield, Peter Hall, Stephen Knight, Kenneth McLean, Lisa Norman, Katie Shaw, Michael Ramage, Einar Pius, Olivia Swann. @@ -107,10 +107,10 @@ Ewen is a surgeon and Riinu is a physicist. And they're both data scientists too. They dabble with a few programming languages and are generally all over technology. They are most enthusiastic about the R statistical programming language and have a combined experience of 25 years using it. -They work at The University of Edinburgh and have taught R to hundreds of healthcare professionals and researchers. +They work at the University of Edinburgh and have taught R to hundreds of healthcare professionals and researchers. They believe a first introduction to R and statistical programming should be relatively jargon-free and outcome-oriented (get those pretty plots out). -The understanding of complicated concepts will come over time with practise and experience, not through a re-telling of the history of computing bit-by-byte, or with the inclusion of the equations of every statistical test (although Ewen has sneaked a few equations in). +The understanding of complicated concepts will come over time with practice and experience, not through a re-telling of the history of computing bit-by-byte, or with the inclusion of the underlying equations for each statistical test (although Ewen has sneaked a few equations in). Overall, they hope to make the text fun and accessible. Just like them. diff --git a/latex/before_body.tex b/latex/before_body.tex index 6057406..17e6ba0 100644 --- a/latex/before_body.tex +++ b/latex/before_body.tex @@ -6,7 +6,9 @@ \thispagestyle{empty} \begin{center} -"The future is already here — it's just not very evenly distributed." 
- William Gibson +``The future is already here — it's just not evenly distributed.'' + +William Gibson %\includegraphics{images/dedication.pdf} \end{center}