Merge branch 'master' of github.com:datacarpentry/R-ecology-lesson

datacarpentry · Jul 1, 2019 · 1eaef4b
2 parents f947d84 + e8826b1
commit 1eaef4b

Showing 5 changed files with 95 additions and 60 deletions.
diff --git a/03-dplyr.Rmd b/03-dplyr.Rmd
@@ -68,7 +68,7 @@ Then, to load the package type:
 
 ```{r, message = FALSE, purl = FALSE}
 ## load the tidyverse packages, incl. dplyr
-library("tidyverse")
+library(tidyverse)
 ```
 
 ## What are **`dplyr`** and **`tidyr`**?

diff --git a/AUTHORS b/AUTHORS
@@ -31,7 +31,7 @@ Ethan White <[email protected]>
 Francisco Rodriguez-Sanchez <[email protected]>
 Francois Michonneau <[email protected]>
 Fred Boehm <[email protected]>
-GMoncrieff <[email protected]>
+Glenn Moncrieff <[email protected]>
 Hao Ye <[email protected]>
 Harriet Dashnow <[email protected]>
 Hilmar Lapp <[email protected]>
@@ -41,7 +41,7 @@ Jarrett Byrnes <[email protected]>
 Jeffrey W Hollister <[email protected]>
 Jieming Chen <[email protected]>
 Jillian Dunic <[email protected]>
-Jon <[email protected]>
+Jon Petters <[email protected]>
 Jonathan Keane <[email protected]>
 Joseph Stachelek <[email protected]>
 Josh Herr <[email protected]>
@@ -92,9 +92,9 @@ Will Furnass <[email protected]>
 Will Pearse <[email protected]>
 Ye Li <[email protected]>
 Zena Lapp <[email protected]>
-ab604 <[email protected]>
-ashander <[email protected]>
-cengel <[email protected]>
+Alistair Bailey <[email protected]>
+Jaime Ashander <[email protected]>
+Claudia Engel <[email protected]>
 Brian Seok <[email protected]>
 sfn_brt <[email protected]>
 suparee <[email protected]>
diff --git a/README.md b/README.md
@@ -51,4 +51,3 @@ maintainers, or come chat with us on the [Slack Channel for this lesson](https:/
 * Auriel Fournier
 * François Michonneau
 * Brian Seok
-* Shiva Guru
diff --git a/instructor-notes.md b/instructor-notes.md
@@ -6,7 +6,8 @@ root: .
 
 ## Dataset
 
-The data used for this lesson are in the figshare repository at: https://doi.org/10.6084/m9.figshare.1314459
+The data used for this lesson are in the figshare repository at: 
+https://doi.org/10.6084/m9.figshare.1314459
 
 This lesson uses mostly `combined.csv`. The 3 other csv files: `plots.csv`,
 `species.csv` and `surveys.csv` are only needed for the lesson on databases.
@@ -39,23 +40,25 @@ this file, so the participants can follow along.
 
 Some learners may have previous R installations. On Mac, if a new install
 is performed, the learner's system will create a symbolic link, pointing to the
-new install as 'Current.' Sometimes this process does not occur, and, even though
-a new R is installed and can be accessed via the R console, RStudio does not find it.
-The net result of this is that the learner's RStudio will be running an older R install.
-This will cause package installations to fail. This can be fixed at the terminal. First,
-check for the appropriate R installation in the library:
+new install as 'Current.' Sometimes this process does not occur, and, even
+though a new R is installed and can be accessed via the R console, RStudio does
+not find it. The net result of this is that the learner's RStudio will be
+running an older R install. This will cause package installations to fail. This
+can be fixed at the terminal. First, check for the appropriate R installation in
+the library:
 
 ```
 ls -l /Library/Frameworks/R.framework/Versions/
 ```
 
-We are currently using R 3.4.x. If it isn't there, they will need to install it. If it
-is present, you will need to set the symbolic link to Current to point to the 3.4.x 
-directory:
+We are currently using R 3.6.x. If it isn't there, they will need to install it. 
+If it is present, you will need to set the symbolic link to Current to point to 
+the 3.6.x directory:
 
 ```
-ln -s /Library/Frameworks/R.framework/Versions/3.4.x /Library/Frameworks/R.framework/Versions/Current
+ln -s /Library/Frameworks/R.framework/Versions/3.6.x /Library/Frameworks/R.framework/Versions/Current
 ```
+
 Then restart RStudio.
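
The relinking step described above can be rehearsed safely before touching a learner's machine. What follows is a minimal sketch and not part of the lesson: it assumes a scratch directory standing in for `/Library/Frameworks/R.framework/Versions`, reuses the placeholder version names `3.4.x` and `3.6.x` from the text, and uses `ln -sfn` rather than plain `ln -s` on the assumption that a stale `Current` link already exists.

```shell
#!/bin/sh
# Sketch only: rehearse the symlink fix in a scratch directory rather than in
# /Library/Frameworks/R.framework/Versions (the real location on macOS).
set -eu

# Hypothetical stand-in for the real Versions directory.
versions_dir="$(mktemp -d)/Versions"
mkdir -p "$versions_dir/3.4.x" "$versions_dir/3.6.x"

# Simulate the stale state: Current still points at the older install.
ln -s 3.4.x "$versions_dir/Current"

# Repoint Current at the newer install.
# -f replaces the existing link; -n keeps ln from following the old link and
# creating the new link inside the directory it points to.
ln -sfn 3.6.x "$versions_dir/Current"

# Show where Current points now.
current_target="$(readlink "$versions_dir/Current")"
echo "Current -> $current_target"
```

On a real machine the same `ln -sfn` call would target the actual `Versions` directory, probably with `sudo`; whether elevated permissions are needed depends on how R was installed, so treat that as an assumption.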
 
 ## Narrative
@@ -81,7 +84,6 @@ Then restart RStudio.
   point about how workshops are a great way to create community of learners that
   can help each others during and after the workshop.
 
-
 ### Intro to R
 
 * When going over the section on assignments, make
@@ -102,23 +104,22 @@ The two main goals for this lessons are:
   exposed to it. The content of the lesson should be enough for learners to
   avoid common mistakes with them.
 
-### Manipulating data with dplyr
+### Manipulating data
 
 * For this lesson make sure that learners are comfortable using pipes.
 * There is also sometimes some confusion on what the arguments of `group_by`
   should be.
-
-### Using tidyr to reshape data for plotting
 * This lesson uses the tidyr package to reshape data for plotting
-* After this lesson students should be familiar with the spread() and gather() functions available in tidyr
+* After this lesson students should be familiar with the spread() and gather() 
+  functions available in tidyr
 
-### Visualizing data with ggplot2
+### Visualizing data
 
 * This lesson is a broad overview of ggplot2 and focuses on (1) getting familiar
   with the layering system of ggplot2, (2) using the argument `group` in the
   `aes()` function, (3) basic customization of the plots.
 
-### Using databases from R
+### R and SQL
 
 * Ideally this lesson is best taught at the end of the workshop (as a capstone
   example) to illustrate how the tools covered can integrate with each
@@ -149,15 +150,25 @@ Alternatively you can go to CRAN and download the package and install from ZIP
 file
 -   Tools > Install Packages > set to 'from Zip/TAR'
 
-It is important that R and the R packages be installed locally, not on a network drive. If a learner is using a machine with multiple users where their account is not based locally, this can create a variety of issues (this often happens on university computers). Hopefully the learner will realize these issues beforehand, but depending on the machine and how the IT folks who service the computer have things set up, it may be very difficult or impossible to make R work without their help.
+It is important that R and the R packages be installed locally, not on a
+network drive. If a learner is using a machine with multiple users where their
+account is not based locally, this can create a variety of issues (this often
+happens on university computers). Hopefully the learner will realize these
+issues beforehand, but depending on the machine and how the IT folks who
+service the computer have things set up, it may be very difficult or impossible
+to make R work without their help.
 
-If learners are having issues with one package, they may have issues with another. It's often easier to make sure they have all the needed packages installed at one time, rather than deal with these issues over and over. [Here is a list of all necessary packages for these lessons.](https://github.com/datacarpentry/R-ecology-lesson/blob/master/needed_packages.R)
+If learners are having issues with one package, they may have issues with
+another. It's often easier to make sure they have all the needed packages
+installed at one time, rather than deal with these issues over and over.
+[Here is a list of all necessary packages for these lessons.](https://github.com/datacarpentry/R-ecology-lesson/blob/master/needed_packages.R)
 
 ## Other Resources
 
-If you encounter a problem during a workshop, feel free to contact the
-maintainers by email
-or
+If you encounter a problem during a workshop, feel free to contact the 
+maintainers by email or
 [open an issue](https://github.com/datacarpentry/R-ecology-lesson/issues/new).
 
-For a more in-depth coverage of topics of the workshops, you may want to read "[R for Data Science](http://r4ds.had.co.nz/)" by Hadley Wickham and Garrett Grolemund.
+For a more in-depth coverage of topics of the workshops, you may want to read
+"[R for Data Science](http://r4ds.had.co.nz/)" by Hadley Wickham and Garrett 
+Grolemund.
diff --git a/reference.md b/reference.md
@@ -1,8 +1,10 @@
 Cheat sheet of functions used in the lessons
 
-
 ## Lesson 1 -- Introduction to R
 
+  * `sqrt()`    # calculate the square root
+  * `round()`   # round a number
+  * `args()`    # find what arguments a function takes
   * `length()`  # how many elements are in a particular vector
   * `class() `  # the class (the type of element) of an object
   * `str() `    # an overview of the object and the elements it contains
@@ -15,57 +17,80 @@ Cheat sheet of functions used in the lessons
 
 ## Lesson 2 -- Starting with data
 
-  * `download.file() `          # download files from the internet to your computer
-  * `read.csv() `               # load CSV file into R memory
-  * `head() `                   # check the top (the first 6 lines) of an object including data frames
-  * `factor() `                 # create factors
-  * `levels() `                 # check levels of a factor
-  * `nlevels() `                # check number of levels of a factor
-  * `as.numeric(levels(x))[x] ` # convert factors where the levels appear as numbers  to a numeric vector
-
-## Lesson 3 -- Introducing data.frame
-
-  * `data.frame()`  # create a data frame
+  * `download.file() ` # download files from the internet to your computer
+  * `read.csv() `   # load CSV file into R memory
+  * `head() `       # shows the first 6 rows
+  * `View()`        # invoke a spreadsheet-style data viewer
+  * `read.table()`  # load a file in table format into R memory
+  * `str() `        # check structure of the object and information about the class, length and content of each column
   * `dim() `        # check dimension of data frame
   * `nrow() `       # returns the number of rows
   * `ncol() `       # returns the number of  columns
-  * `head() `       # shows the first 6 rows
   * `tail() `       # shows the last 6 rows
   * `names() `      # returns the column names (synonym of colnames() for data frame objects)
   * `rownames() `   # returns the row names
-  * `str() `        # check structure of the object and information about the class, length and content of each column
   * `summary() `    # summary statistics for each column
-  * `seq() `        # generates a sequence of numbers
+  * `factor() `      # create factors
+  * `levels() `      # check levels of a factor
+  * `nlevels() `     # check number of levels of a factor
+  * `as.character()` # convert an object to a character vector
+  * `as.numeric()`   # convert an object to a numeric vector
+  * `as.numeric(as.character(x))` # convert factors where the levels appear as characters to a numeric vector
+  * `as.numeric(levels(x))[x]` # convert factors where the levels appear as numbers  to a numeric vector
+  * `plot()`  # plot an object
+  * `data.frame()`  # create a data.frame object
+  * `ymd()` # convert a vector representing year, month, and day to a Date vector
+  * `paste()` # concatenate vectors after converting to character
 
-## Lesson 4 -- Aggregating and analyzing data with dplyr
+## Lesson 3 -- Manipulating, analyzing and exporting data with tidyverse
 
-  * `install.packages()` # install a CRAN package in R
-  * `library() `         # load installed package into the current session
+  * `read_csv()` # load a csv formatted file into R memory
+  * `str()` # check structure of the object and information about the class, length and content of each column
+  * `View()` # invoke a spreadsheet-style data viewer
   * `select() `          # select columns of a data frame
   * `filter() `          # allows you to select a subset of rows in a data frame
   * `%>% `               # pipes to select and filter at the same time
   * `mutate() `          # create new columns based on the values in existing columns
+  * `head() `       # shows the first 6 rows
   * `group_by() `        # split the data into groups, apply some analysis to each group, and then combine the results.
   * `summarize() `       # collapses each group into a single-row summary of that group
-  * `tally()`            # counts the total number of records for each category.
-  * `write.csv() `       # save CSV file
+  * `mean()` # calculate the mean value of a vector  
+  * `!is.na()`   # test if there are no missing values
+  * `print()` # print values to the console
+  * `min()` # return the minimum value of a vector
+  * `arrange()` # arrange rows by variables
+  * `desc()` # transform a vector into a format that will be sorted in descending order
+  * `count()` # counts the total number of records for each category
+  * `spread()` # reshape a data frame by a key-value pair across multiple columns
+  * `gather()` # reshape a data frame by collapsing into a key-value pair
+  * `n_distinct()` # get a count of unique values
+  * `write_csv()` # save to a csv formatted file
 
-## Lesson 5 -- Data visualization with ggplot2
+## Lesson 4 -- Data visualization with ggplot2
 
-  * `ggplot(data= , aes(x= , y= )) + geom_point( ) + facet_wrap () +
-    theme_bw() + theme() `
+  * `read_csv()` # load a csv formatted file into R memory
+  * `ggplot(data= , aes(x= , y= )) + geom_point( ) + facet_wrap () + theme_bw() + theme() `
   * `aes()` # by selecting the variables to be plotted and the variables to
     define the presentation such as plotting size, shape color, etc.
   * `geom_` # graphical representation of the data in the plot (points, lines, bars). To add a geom to the plot use + operator
   * `facet_wrap()` # allows to split one plot into multiple plots based on a factor included in the dataset
+  * `labs()` # set labels to plot
   * `theme_bw()`   # set the background to white
   * `theme()`      # used to locally modify one or more theme elements in a specific ggplot object
-  *
-## Lesson 6 -- R and SQL
+  * `grid.arrange()` # combine and arrange multiple ggplots into a single figure
+  * `ggsave()` # save a ggplot
+
+## Lesson 5 -- SQL databases and R
 
-  * `src_sqlite`  # connect dplyr to a SQLite database file
+  * `dir.create()` # create a directory
+  * `download.file() ` # download files from the internet to your computer
+  * `dbConnect()` # create a connection to a database
+  * `SQLite()` # connect to a SQLite database
+  * `src_dbi()` # connect dplyr to a DBI-compatible database file
   * `tbl`         # connect to a table within a database
-  * `collect`     # retrieve all the results from the database
-  * `explain`     # show the SQL translation of a dplyr query
-  * `inner_join`  # perform an inner join between two tables
-  * `copy_to`     # copy a data frame as a table into a database
+  * `sql()` # combine character vectors into a single SQL expression 
+  * `show_query()` # show which SQL commands are sent to the database
+  * `collect()`     # retrieve all the results from the database
+  * `inner_join()`  # perform an inner join between two tables
+  * `src_sqlite()` # connect dplyr to a SQLite database file
+  * `copy_to()`     # copy a data frame as a table into a database