@@ -37,10 +37,10 @@ So far, you have seen the basics of manipulating data frames with our nordic dat
3737
3838::::::::::::::::::::::::::::::::::::::::: instructor
3939
40- Pay attention to and explain the errors and warnings generated from the
40+ Pay attention to and explain the errors and warnings generated from the
4141examples in this episode.
4242
43- :::::::::::::::::::::::::::::::::::::::::
43+ :::::::::::::::::::::::::::::::::::::::::
4444
4545``` {r, echo=TRUE}
4646gapminder <- read.csv("data/gapminder_data.csv")
@@ -75,7 +75,7 @@ gapminder <- read.csv("https://datacarpentry.org/r-intro-geospatial/data/gapmind
7575
7676- You can read directly from Excel spreadsheets without
7777 converting them to plain text first by using the [readxl](https://cran.r-project.org/package=readxl) package.
78-
78+
7979
8080::::::::::::::::::::::::::::::::::::::::::::::::::
8181
@@ -86,10 +86,12 @@ always do is check out what the data looks like with `str`:
8686str(gapminder)
8787```
8888
89- We can also examine individual columns of the data frame with our `class` function:
89+ We can also examine individual columns of the data frame with the `class` or
90+ `typeof` functions:
9091
9192``` {r}
9293class(gapminder$year)
94+ typeof(gapminder$year)
9395class(gapminder$country)
9496str(gapminder$country)
9597```
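
The difference between `class` and `typeof` is easiest to see on a factor column. The following is a minimal sketch on a small, hypothetical data frame (`toy` and its values are made up for illustration and are not part of the gapminder data):

```r
# Hypothetical toy data frame, not the lesson's gapminder data
toy <- data.frame(
  country = factor(c("Norway", "Sweden")),
  year    = c(2002L, 2007L),
  lifeExp = c(78.9, 80.0)
)

# class() reports how R treats the object at a high level
class(toy$country)   # "factor"
class(toy$lifeExp)   # "numeric"

# typeof() reports the underlying storage type
typeof(toy$country)  # "integer": factors are stored as integer codes
typeof(toy$lifeExp)  # "double"
typeof(toy$year)     # "integer"
```
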
@@ -281,6 +283,59 @@ tail(gapminder_norway)
281283
282284To understand why R is giving us a warning when we try to add this row, let's learn a little more about factors.
283285
286+
287+ ## Removing columns and rows in data frames
288+
289+ To remove columns from a data frame, we can use the `subset` function.
290+ This function allows us to remove columns using their names:
291+
292+ ``` {r}
293+ life_expectancy <- subset(gapminder, select = -c(continent, pop, gdpPercap))
294+ head(life_expectancy)
295+ ```
296+
297+ We can also use a logical vector to achieve the same result. Make sure the
298+ vector's length matches the number of columns in the data frame (to avoid vector
299+ recycling):
300+
301+ ``` {r}
302+ life_expectancy <- gapminder[c(TRUE, TRUE, FALSE, FALSE, TRUE, FALSE)]
303+ head(life_expectancy)
304+ ```
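
To illustrate why the length of the logical vector matters, here is a small sketch on a hypothetical four-column data frame (`toy`, `short_mask`, and `full_mask` are names invented for this example). A vector that is too short is silently recycled, so more columns may be kept than intended:

```r
# Hypothetical toy data frame with four columns
toy <- data.frame(a = 1:2, b = 3:4, c = 5:6, d = 7:8)

# A mask shorter than ncol(toy) is recycled to TRUE, FALSE, TRUE, FALSE
short_mask <- c(TRUE, FALSE)
recycled <- toy[short_mask]
names(recycled)  # "a" "c"

# A mask with one entry per column selects exactly what it says
full_mask <- c(TRUE, TRUE, FALSE, FALSE)
stopifnot(length(full_mask) == ncol(toy))  # guard against recycling
kept <- toy[full_mask]
names(kept)      # "a" "b"
```
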
305+
306+ Alternatively, we can use column positions:
307+
308+ ``` {r}
309+ life_expectancy <- gapminder[-c(3, 4, 6)]
310+ head(life_expectancy)
311+ ```
312+
313+ Note that the easiest way to remove rows from a data frame is often to select
314+ the rows we want to keep instead.
315+ Still, we can also remove rows by their positions:
316+
317+ ``` {r}
318+ # Filter data for Afghanistan during the 21st century:
319+ afghanistan_21c <- gapminder[gapminder$country == "Afghanistan" &
320+                              gapminder$year > 2000, ]
321+
322+ # Now remove data for 2002, that is, the first row:
323+ afghanistan_21c[-1, ]
324+ ```
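
The two approaches to dropping rows can be compared side by side. This is a sketch on a hypothetical data frame (`toy` and its values are invented for illustration); it also shows one reason why keeping rows with a logical condition tends to be safer than negative positions computed with `which`:

```r
# Hypothetical toy data frame, values made up for illustration
toy <- data.frame(
  year  = c(2002, 2007, 2012),
  value = c(1.5, 2.5, 3.5)
)

# Remove the first row by position
by_position <- toy[-1, ]

# Keep every row whose year is not 2002 (same rows, no positions needed)
by_condition <- toy[toy$year != 2002, ]

# Pitfall: if no row matches, which() returns integer(0),
# and a negative empty index selects zero rows instead of all rows
nothing_left <- toy[-which(toy$year == 1900), ]
nrow(nothing_left)  # 0

# The logical version keeps all rows in that situation
all_kept <- toy[toy$year != 1900, ]
nrow(all_kept)      # 3
```
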
325+
326+
327+ An interesting case is removing rows containing NAs:
328+
329+ ``` {r}
330+ # Turn some values into NAs:
331+ afghanistan_20c <- gapminder[gapminder$country == "Afghanistan", ]
332+ afghanistan_20c[afghanistan_20c$year < 2007, "year"] <- NA
333+
334+ # Remove NAs
335+ na.omit(afghanistan_20c)
336+ ```
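
`na.omit` drops a row if any of its columns is `NA`. As a hedged sketch on a hypothetical data frame (`toy` and its values are made up), the example below contrasts it with `complete.cases`, which gives the same rows, and with `is.na` on a single column, which only checks that column:

```r
# Hypothetical toy data frame with NAs in two different columns
toy <- data.frame(
  year = c(NA, 2002, 2007),
  pop  = c(10, NA, 30)
)

# Drop every row that has an NA in any column
cleaned <- na.omit(toy)
nrow(cleaned)     # 1

# Equivalent row selection using a logical vector
same_rows <- toy[complete.cases(toy), ]
nrow(same_rows)   # 1

# Drop only the rows where `year` is missing; NAs in `pop` are kept
year_known <- toy[!is.na(toy$year), ]
nrow(year_known)  # 2
```
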
337+
338+
284339## Factors
285340
286341Here is another thing to look out for: in a `factor`, each different value