extinction unit
cboettig committed Nov 14, 2017
1 parent 345f228 commit c7509ec
Showing 36 changed files with 1,985 additions and 6 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -3,3 +3,4 @@
.Rhistory
*.zip
*.xlsx
blogdown
182 changes: 182 additions & 0 deletions content/assignment/extinction.Rmd
@@ -0,0 +1,182 @@
---
title: "Extinctions Unit"
author: "Your name, partner name"
maketitle: true
output: github_document
---



```{r include=FALSE}
library("tidyverse")
library("httr")
library("jsonlite")
#library("printr")
knitr::opts_chunk$set(comment=NA)
```



## Extinctions Module

_Are we experiencing the sixth great extinction?_

What is the current pace of extinction? Is it accelerating? How does it compare to background extinction rates?

## Background

- [Section Intro Video](https://youtu.be/QsH6ytm89GI)
- [Ceballos et al (2015)](http://doi.org/10.1126/sciadv.1400253)

## Computational Topics

- Accessing data from a RESTful API
- Error handling
- JSON data format
- Regular expressions
- Working with missing values

## Additional references:

- http://www.hhmi.org/biointeractive/biodiversity-age-humans (Video)
- [Barnosky et al. (2011)](http://doi.org/10.1038/nature09678)
- [Pimm et al (2014)](http://doi.org/10.1126/science.1246752)
- [Sandom et al (2014)](http://dx.doi.org/10.1098/rspb.2013.3254)


# Getting started (based on live code session)




## CURL and REST

Use the `httr` package to make a single API query against the following endpoint: `http://api.iucnredlist.org/index/species/Acaena-exigua.json`

```{r}
```
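
A minimal sketch of such a query, assuming the legacy endpoint quoted above is still reachable (it may have been retired since this unit was written). The names `example_url` and `query_api` are hypothetical; `httr::GET` and `httr::status_code` are existing `httr` functions.

```{r}
# Endpoint quoted in the text above; it may no longer be live.
example_url <- "http://api.iucnredlist.org/index/species/Acaena-exigua.json"

# Hypothetical helper (not part of httr): issue the GET request and
# return the raw response object for inspection.
query_api <- function(url) {
  httr::GET(url)
}

# Needs network access, so it is left commented out here:
# resp <- query_api(example_url)
# resp                      # prints a summary of the response
# httr::status_code(resp)   # 200 means the call succeeded
```

Wrapping the call in a function keeps the network request out of the knit until you deliberately run it.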

Examine the response and its content. Can you tell whether the call was successful? What type of object was returned? Can you parse the response into an R object? Can you represent the returned data as a data.frame?

```{r}
```
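
One way to approach the parsing step is sketched below using a mock payload, so that it runs offline. The JSON string and its field names are illustrative assumptions, not the documented IUCN schema.

```{r}
# Mock payload standing in for the response body; the field names are
# illustrative assumptions, not the documented IUCN schema.
mock_body <- '[{"scientific_name": "Acaena exigua", "category": "EX"}]'

# fromJSON() simplifies a JSON array of objects into a data.frame.
parsed <- jsonlite::fromJSON(mock_body)
class(parsed)
parsed$scientific_name

# With a live response `resp` from httr::GET(), the equivalent would be:
# body <- httr::content(resp, as = "text", encoding = "UTF-8")
# parsed <- jsonlite::fromJSON(body)
```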



# Working with Regular Expressions

- [Self-guided Tutorial](http://regexone.com/)
- [Cheatsheet](http://www.cheatography.com/davechild/cheat-sheets/regular-expressions/)
- [stringr RegEx Vignette](https://cran.r-project.org/web/packages/stringr/vignettes/regular-expressions.html)


One of the fields in the response may contain information on when the species went extinct. Identify the appropriate column and extract this information using a *regular expression*, as discussed in the live code exercise.


```{r}
```
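
As a small illustration of the idea, the sketch below pulls a four-digit year out of a sentence. The sentence is invented for the example and is not taken from the API; the pattern is one plausible choice, not necessarily the one used in the live code session.

```{r}
# Example sentence is invented for illustration, not taken from the API.
rationale <- "The last confirmed record of this species dates from 1957."

# Extract the first standalone four-digit number (\\b marks a word boundary).
year <- stringr::str_extract(rationale, "\\b\\d{4}\\b")
as.integer(year)
```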




# Calculating Extinction Rates: Putting it all together

First, to know what queries to make to the IUCN REST API, we need a list of extinct species names. This information can be downloaded from the IUCN website, but unfortunately this is not easily automated. Thus we'll download the data file using a copy already prepared for the course:


```{r}
extinct = read_csv("https://espm-157.github.io/extinction-module/extinct.csv")
extinct
```


Write a function to extract the rationale for the extinction for all extinct species in the data set (see the file above).

```{r}
```
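
A hedged sketch of such a function, with basic error handling so that one failed request does not stop the whole run. The helper names `build_species_url` and `get_rationale` are hypothetical, the URL pattern is inferred from the single example endpoint above, and the field name `rationale` is an assumption about the response.

```{r}
# Hypothetical helper: build the query URL from a genus and species name,
# following the pattern of the example endpoint above.
build_species_url <- function(genus, species) {
  paste0("http://api.iucnredlist.org/index/species/",
         genus, "-", species, ".json")
}

# Hypothetical helper: return the rationale text, or NA if the request
# or the parsing fails. The field name `rationale` is an assumption.
get_rationale <- function(genus, species) {
  tryCatch({
    resp <- httr::GET(build_species_url(genus, species))
    httr::stop_for_status(resp)   # turn HTTP errors into R errors
    body <- httr::content(resp, as = "text", encoding = "UTF-8")
    out <- jsonlite::fromJSON(body)$rationale
    if (is.null(out)) NA_character_ else out[[1]]
  }, error = function(e) NA_character_)
}

build_species_url("Acaena", "exigua")
```

Returning `NA` on failure, rather than stopping, lets the missing cases be inspected later.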

Test your function on a subset of the data before attempting the full data set. Use our `dplyr` pipe syntax to iterate over your function.



```{r}
```
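
The iteration pattern can be sketched offline as follows. The two-row table and the `fake_rationale` function are stand-ins invented for this example; with the real data you would start from something like `extinct %>% head(5)` and call your own rationale function instead.

```{r}
library(dplyr)
library(purrr)

# Two-row stand-in for `extinct`; the second name is a placeholder.
subset_tbl <- tibble(Genus = c("Acaena", "Genusname"),
                     Species = c("exigua", "speciesname"))

# Offline stand-in for the rationale function, so the example runs
# without network access.
fake_rationale <- function(genus, species) {
  paste("Rationale text for", genus, species)
}

# map2_chr() calls the function once per row and returns a character vector.
result <- subset_tbl %>%
  mutate(rationale = map2_chr(Genus, Species, fake_rationale))
result
```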

Now create a function that can extract the date from the rationale, and include this function in your data analysis pipeline.


```{r}
```
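
A possible shape for the date-extraction step, assuming a column named `rationale` holds the text. The helper name `extract_year` and the example strings are invented for illustration.

```{r}
library(dplyr)

# Hypothetical helper: first standalone four-digit number in each string,
# NA where there is none.
extract_year <- function(text) {
  as.integer(stringr::str_extract(text, "\\b\\d{4}\\b"))
}

# Invented rationale strings, for illustration only.
demo <- tibble(rationale = c("Last collected in 1894.",
                             "Not seen for decades.",
                             NA))

demo <- demo %>% mutate(year = extract_year(rationale))
demo$year
```

Because the helper is vectorized, it slots directly into `mutate()` without an explicit loop.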




## Histogram of Extinction Dates

We can get a sense for the tempo of extinctions by plotting extinctions since 1500 in 25-year interval bins.

```{r}
```
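
A sketch of the binning, using invented years rather than real results. `binwidth` and `boundary` are existing arguments of `geom_histogram`.

```{r}
library(ggplot2)

# Invented extinction years, for illustration only.
years <- data.frame(year = c(1480, 1520, 1681, 1690, 1875,
                             1901, 1914, 1936, 1989))

# Keep extinctions since 1500.
since_1500 <- subset(years, year >= 1500)

# binwidth gives 25-year bins; boundary aligns the bin edges with 1500.
p <- ggplot(since_1500, aes(x = year)) +
  geom_histogram(binwidth = 25, boundary = 1500) +
  labs(x = "Year of extinction", y = "Number of species")
p
```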

# Exercises


# Question 1: Extinctions by group

A. Compute the number of extinctions from 1500 to 1900 and from 1900 to present for each of the following taxonomic groups:

- Vertebrates
- Mammals
- Birds
- Fish
- Amphibians
- Reptiles
- Insects
- Plants

Compare your estimates to Table 1 of [Ceballos et al (2015)](http://doi.org/10.1126/sciadv.1400253).
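
One way to structure the count is sketched below on a few illustrative rows. It assumes an extracted `year` column alongside the `Class` column of `extinct`. Mapping the groups in the list onto IUCN taxonomy (for example, which classes make up "Fish") is left as part of the exercise.

```{r}
library(dplyr)

# Illustrative rows mimicking the Class column of `extinct` plus a year.
demo <- tibble(
  Class = c("MAMMALIA", "MAMMALIA", "AVES", "AVES", "AVES"),
  year  = c(1768, 1936, 1681, 1852, 1914)
)

# Label each extinction with its period, then count per class and period.
counts <- demo %>%
  filter(year >= 1500) %>%
  mutate(period = if_else(year < 1900, "1500-1900", "1900-present")) %>%
  count(Class, period)
counts
```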


## Question 2: Weighing by number of species


The number of species going extinct per century in a given taxonomic group will be influenced by how many species are present in the group to begin with. (For an obvious example, the number of vertebrate extinctions will always be at least as high as the number of mammal extinctions, since mammals are vertebrates.) Overall, the number of species in each group does not change greatly over a period of a few hundred years, so we were able to make the relative comparisons between the roughly pre-industrial and post-industrial periods above.

As discussed by Tony Barnosky in the introductory video (or in the [Ceballos et al (2015)](http://doi.org/10.1126/sciadv.1400253) paper), if we want to compare these extinction rates against the long-term paleontological record, it is necessary to weigh the rates by the total number of species. That is, we compute the number of extinctions per million species per year (MSY; equivalently, the number of extinctions per 10,000 species per 100 years).
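
The equivalence of the two units can be checked directly. The symbols here are introduced only for this illustration: $E$ is the number of extinctions, $N$ the number of species, and $T$ the number of years.

$$
\text{MSY} = \frac{E}{N \times T} \times 10^{6},
\qquad
\frac{1}{10^{4} \times 10^{2}} \times 10^{6} = 1
$$

So one extinction per 10,000 species per 100 years corresponds to a rate of 1 MSY.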

A) First, we will compute how many species are present in each of the taxonomic groups. To do so, we need a table that has not only extinct species, but all assessed species. We will once again query this information from the IUCN API.


This is going to involve a lot of data -- more than the API can serve in a single chunk. Instead, the API breaks the returns up into groups of 10,000 species per page (see API docs: http://apiv3.iucnredlist.org/api/v3/docs#species). Luckily, the API also tells us the total number of species:

http://apiv3.iucnredlist.org/api/v3/speciescount?token=9bb4facb6d23f48efbf424bb05c0c1ef1cf6f468393bc745d42179ac4aca5fee

The code below queries the first page. How many pages will we need to get all the data? Modify the example below to collect all of the data into a single data.frame. Rows from each page can be added to an existing data.frame with matching column labels using `bind_rows` from `dplyr`; note that base R's `append` combines vectors and lists rather than data.frame rows.


```{r}
```
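
The paging logic can be sketched offline as follows. The species total used here is a made-up figure, `fake_page` is a stand-in for a function that downloads and parses one page, and numbering pages from 0 is an assumption to check against the API docs.

```{r}
library(dplyr)

# Pages needed when the API serves 10,000 species per page.
n_pages <- function(total, per_page = 10000) ceiling(total / per_page)

# 87000 is a made-up total, not the value reported by the API.
n_pages(87000)

# Stand-in for a function that downloads and parses one page; a real
# version would call the API with a valid token. Column names are
# illustrative.
fake_page <- function(page) tibble(page = page, taxonid = page * 10 + 1:2)

# Fetch every page and stack the rows into one table.
# Page numbering from 0 is an assumption; check the API docs.
all_pages <- bind_rows(lapply(0:(n_pages(87000) - 1), fake_page))
nrow(all_pages)
```

Collecting the pages in a list and binding them once avoids growing a data.frame inside a loop.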


B) Based on the complete data, write queries that count the number of species in each group. Then use these numbers to compute MSY, the number of extinctions per 10,000 species per 100 years, for each of the groups listed in Question 1. How do your estimates compare to the overall historical average of about 2 MSY?
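
The rate calculation itself is simple arithmetic. It is sketched here with a hypothetical helper `msy` and invented counts, not real results.

```{r}
# Hypothetical helper: extinctions per million species-years.
msy <- function(extinctions, n_species, years) {
  extinctions * 1e6 / (n_species * years)
}

# Invented counts for illustration: 50 extinctions among 10,000 assessed
# species over 100 years gives a rate of 50 MSY.
msy(50, 10000, 100)
```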

## Question 3: Improving our algorithm


In parsing the data with regular expressions, we encountered certain data that resulted in missing values. Identify and investigate the strings for which we were not able to extract a date value.

- Why did the date extraction fail?
- Can you deduce an approximate date by examining the text?
- Can you modify the regular expression to reduce the number of missing values?
- How do these missing values impact our overall estimate of the extinction rate? (In which direction, and by approximately what amount?)
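
A sketch of how the failed extractions might be inspected. The rationale strings are invented to show two ways a strict four-digit pattern can miss, and the relaxed pattern at the end is one possible adjustment, not a complete fix.

```{r}
library(dplyr)

# Invented rationale strings, for illustration only.
demo <- tibble(rationale = c("Last seen in 1894.",
                             "Not recorded since the 1950s.",
                             "Extinct by the late 19th century.")) %>%
  mutate(year = as.integer(stringr::str_extract(rationale, "\\b\\d{4}\\b")))

# Rows where extraction failed, and the share of missing values.
missing <- demo %>% filter(is.na(year))
missing$rationale
mean(is.na(demo$year))

# Dropping the word boundaries also catches decades such as "1950s".
relaxed <- stringr::str_extract("Not recorded since the 1950s.", "\\d{4}")
relaxed
```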


## Question 4: Looking forward (bonus)

Plot the MSY rates in intervals of 50 years for each of the groups as a line plot (compare to Figure 1a of [Ceballos et al (2015)](http://doi.org/10.1126/sciadv.1400253)). Compute the slope of these curves to forecast the extinction rate in 2100.
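
The slope-and-forecast step could be sketched with a linear fit, as below. The rates are invented and deliberately linear so the arithmetic is easy to follow; extrapolating a straight line to 2100 is a crude assumption, not a prediction.

```{r}
# Invented MSY values at three time points, for illustration only.
rates <- data.frame(year = c(1900, 1950, 2000),
                    msy  = c(10, 20, 30))

# Fit a straight line; the coefficient on `year` is the slope.
fit <- lm(msy ~ year, data = rates)
slope <- unname(coef(fit)["year"])
slope

# Extrapolate the fitted line to 2100.
forecast <- unname(predict(fit, newdata = data.frame(year = 2100)))
forecast
```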
147 changes: 147 additions & 0 deletions content/assignment/extinction.html
@@ -0,0 +1,147 @@
---
title: "Extinctions Unit"
author: "Your name, partner name"
maketitle: true
output: github_document
---



<div id="extinctions-module" class="section level2">
<h2>Extinctions Module</h2>
<p><em>Are we experiencing the sixth great extinction?</em></p>
<p>What is the current pace of extinction? Is it accelerating? How does it compare to background extinction rates?</p>
</div>
<div id="background" class="section level2">
<h2>Background</h2>
<ul>
<li><a href="https://youtu.be/QsH6ytm89GI">Section Intro Video</a></li>
<li><a href="http://doi.org/10.1126/sciadv.1400253">Ceballos et al (2015)</a></li>
</ul>
</div>
<div id="computational-topics" class="section level2">
<h2>Computational Topics</h2>
<ul>
<li>Accessing data from a RESTful API</li>
<li>Error handling</li>
<li>JSON data format</li>
<li>Regular expressions</li>
<li>Working with missing values</li>
</ul>
</div>
<div id="additional-references" class="section level2">
<h2>Additional references:</h2>
<ul>
<li><a href="http://www.hhmi.org/biointeractive/biodiversity-age-humans" class="uri">http://www.hhmi.org/biointeractive/biodiversity-age-humans</a> (Video)</li>
<li><a href="http://doi.org/10.1038/nature09678">Barnosky et al. (2011)</a></li>
<li><a href="http://doi.org/10.1126/science.1246752">Pimm et al (2014)</a></li>
<li><a href="http://dx.doi.org/10.1098/rspb.2013.3254">Sandom et al (2014)</a></li>
</ul>
</div>
<div id="getting-started-based-on-live-code-session" class="section level1">
<h1>Getting started (based on live code session)</h1>
<div id="curl-and-rest" class="section level2">
<h2>CURL and REST</h2>
<p>Use the <code>httr</code> package to make a single API query against the following endpoint: <code>http://api.iucnredlist.org/index/species/Acaena-exigua.json</code></p>
<p>Examine the response and its content. Can you tell whether the call was successful? What type of object was returned? Can you parse the response into an R object? Can you represent the returned data as a data.frame?</p>
</div>
</div>
<div id="working-with-regular-expressions" class="section level1">
<h1>Working with Regular Expressions</h1>
<ul>
<li><a href="http://regexone.com/">Self-guided Tutorial</a></li>
<li><a href="http://www.cheatography.com/davechild/cheat-sheets/regular-expressions/">Cheatsheet</a></li>
<li><a href="https://cran.r-project.org/web/packages/stringr/vignettes/regular-expressions.html">stringr RegEx Vignette</a></li>
</ul>
<p>One of the fields in the response may contain information on when the species went extinct. Identify the appropriate column and extract this information using a <em>regular expression</em>, as discussed in the live code exercise.</p>
</div>
<div id="calculating-extinction-rates-putting-it-all-together" class="section level1">
<h1>Calculating Extinction Rates: Putting it all together</h1>
<p>First, to know what queries to make to the IUCN REST API, we need a list of extinct species names. This information can be downloaded from the IUCN website, but unfortunately this is not easily automated. Thus we’ll download the data file using a copy already prepared for the course:</p>
<pre class="r"><code>extinct = read_csv(&quot;https://espm-157.github.io/extinction-module/extinct.csv&quot;)</code></pre>
<pre><code>Parsed with column specification:
cols(
.default = col_character(),
`Species ID` = col_integer(),
`Red List criteria version` = col_double(),
`Year assessed` = col_integer()
)</code></pre>
<pre><code>See spec(...) for full column specifications.</code></pre>
<pre class="r"><code>extinct</code></pre>
<pre><code># A tibble: 834 x 23
`Species ID` Kingdom Phylum Class Order
&lt;int&gt; &lt;chr&gt; &lt;chr&gt; &lt;chr&gt; &lt;chr&gt;
1 44072 PLANTAE TRACHEOPHYTA MAGNOLIOPSIDA ROSALES
2 195373 PLANTAE TRACHEOPHYTA MAGNOLIOPSIDA EUPHORBIALES
3 37854 PLANTAE TRACHEOPHYTA MAGNOLIOPSIDA EUPHORBIALES
4 199821 PLANTAE TRACHEOPHYTA MAGNOLIOPSIDA EUPHORBIALES
5 82 ANIMALIA ARTHROPODA INSECTA EPHEMEROPTERA
6 167 ANIMALIA MOLLUSCA GASTROPODA STYLOMMATOPHORA
7 170 ANIMALIA MOLLUSCA GASTROPODA STYLOMMATOPHORA
8 173 ANIMALIA MOLLUSCA GASTROPODA STYLOMMATOPHORA
9 174 ANIMALIA MOLLUSCA GASTROPODA STYLOMMATOPHORA
10 179 ANIMALIA MOLLUSCA GASTROPODA STYLOMMATOPHORA
# ... with 824 more rows, and 18 more variables: Family &lt;chr&gt;,
# Genus &lt;chr&gt;, Species &lt;chr&gt;, Authority &lt;chr&gt;, `Infraspecific
# rank` &lt;chr&gt;, `Infraspecific name` &lt;chr&gt;, `Infraspecific
# authority` &lt;chr&gt;, `Stock/subpopulation` &lt;chr&gt;, Synonyms &lt;chr&gt;, `Common
# names (Eng)` &lt;chr&gt;, `Common names (Fre)` &lt;chr&gt;, `Common names
# (Spa)` &lt;chr&gt;, `Red List status` &lt;chr&gt;, `Red List criteria` &lt;chr&gt;, `Red
# List criteria version` &lt;dbl&gt;, `Year assessed` &lt;int&gt;, `Population
# trend` &lt;chr&gt;, Petitioned &lt;chr&gt;</code></pre>
<p>Write a function to extract the rationale for the extinction for all extinct species in the data set (see the file above).</p>
<p>Test your function on a subset of the data before attempting the full data set. Use our <code>dplyr</code> pipe syntax to iterate over your function.</p>
<p>Now create a function that can extract the date from the rationale, and include this function in your data analysis pipeline.</p>
<div id="histogram-of-extinction-dates" class="section level2">
<h2>Histogram of Extinction Dates</h2>
<p>We can get a sense for the tempo of extinctions by plotting extinctions since 1500 in 25-year interval bins.</p>
</div>
</div>
<div id="exercises" class="section level1">
<h1>Exercises</h1>
</div>
<div id="question-1-extinctions-by-group" class="section level1">
<h1>Question 1: Extinctions by group</h1>
<p>A. Compute the number of extinctions from 1500 to 1900 and from 1900 to present for each of the following taxonomic groups:</p>
<ul>
<li>Vertebrates</li>
<li>Mammals</li>
<li>Birds</li>
<li>Fish</li>
<li>Amphibians</li>
<li>Reptiles</li>
<li>Insects</li>
<li>Plants</li>
</ul>
<p>Compare your estimates to Table 1 of <a href="http://doi.org/10.1126/sciadv.1400253">Ceballos et al (2015)</a>.</p>
<div id="question-2-weighing-by-number-of-species" class="section level2">
<h2>Question 2: Weighing by number of species</h2>
<p>The number of species going extinct per century in a given taxonomic group will be influenced by how many species are present in the group to begin with. (For an obvious example, the number of vertebrate extinctions will always be at least as high as the number of mammal extinctions, since mammals are vertebrates.) Overall, the number of species in each group does not change greatly over a period of a few hundred years, so we were able to make the relative comparisons between the roughly pre-industrial and post-industrial periods above.</p>
<p>As discussed by Tony Barnosky in the introductory video (or in the <a href="http://doi.org/10.1126/sciadv.1400253">Ceballos et al (2015)</a> paper), if we want to compare these extinction rates against the long-term paleontological record, it is necessary to weigh the rates by the total number of species. That is, we compute the number of extinctions per million species per year (MSY; equivalently, the number of extinctions per 10,000 species per 100 years).</p>
<ol style="list-style-type: upper-alpha">
<li>First, we will compute how many species are present in each of the taxonomic groups. To do so, we need a table that has not only extinct species, but all assessed species. We will once again query this information from the IUCN API.</li>
</ol>
<p>This is going to involve a lot of data – more than the API can serve in a single chunk. Instead, the API breaks the returns up into groups of 10,000 species per page (see API docs: <a href="http://apiv3.iucnredlist.org/api/v3/docs#species" class="uri">http://apiv3.iucnredlist.org/api/v3/docs#species</a>). Luckily, the API also tells us the total number of species:</p>
<p><a href="http://apiv3.iucnredlist.org/api/v3/speciescount?token=9bb4facb6d23f48efbf424bb05c0c1ef1cf6f468393bc745d42179ac4aca5fee" class="uri">http://apiv3.iucnredlist.org/api/v3/speciescount?token=9bb4facb6d23f48efbf424bb05c0c1ef1cf6f468393bc745d42179ac4aca5fee</a></p>
<p>The code below queries the first page. How many pages will we need to get all the data? Modify the example below to collect all of the data into a single data.frame. Rows from each page can be added to an existing data.frame with matching column labels using <code>bind_rows</code> from <code>dplyr</code>; note that base R’s <code>append</code> combines vectors and lists rather than data.frame rows.</p>
<ol start="2" style="list-style-type: upper-alpha">
<li>Based on the complete data, write queries that count the number of species in each group. Then use these numbers to compute MSY, the number of extinctions per 10,000 species per 100 years, for each of the groups listed in Question 1. How do your estimates compare to the overall historical average of about 2 MSY?</li>
</ol>
</div>
<div id="question-3-improving-our-algorithm" class="section level2">
<h2>Question 3: Improving our algorithm</h2>
<p>In parsing the data with regular expressions, we encountered certain data that resulted in missing values. Identify and investigate the strings for which we were not able to extract a date value.</p>
<ul>
<li>Why did the date extraction fail?<br />
</li>
<li>Can you deduce an approximate date by examining the text?</li>
<li>Can you modify the regular expression to reduce the number of missing values?<br />
</li>
<li>How do these missing values impact our overall estimate of the extinction rate? (In which direction, and by approximately what amount?)</li>
</ul>
</div>
<div id="question-4-looking-forward-bonus" class="section level2">
<h2>Question 4: Looking forward (bonus)</h2>
<p>Plot the MSY rates in intervals of 50 years for each of the groups as a line plot (compare to Figure 1a of <a href="http://doi.org/10.1126/sciadv.1400253">Ceballos et al (2015)</a>). Compute the slope of these curves to forecast the extinction rate in 2100.</p>
</div>
</div>
Binary file added content/assignment/images/albatross.png
Binary file added content/assignment/images/aquaculture_map.png
Binary file added content/assignment/images/cover.png
Binary file added content/assignment/images/crs.png
Binary file added content/assignment/images/examples.png
Binary file added content/assignment/images/lumpsucker.jpg
Binary file added content/assignment/images/lumpsucker.png
Binary file added content/assignment/images/raster_concept.png
Binary file added content/assignment/images/singletomulti.png
Binary file added content/assignment/images/sst.png