diff --git a/_freeze/problem-sets/ps-10/execute-results/html.json b/_freeze/problem-sets/ps-10/execute-results/html.json new file mode 100644 index 00000000..69e24a64 --- /dev/null +++ b/_freeze/problem-sets/ps-10/execute-results/html.json @@ -0,0 +1,15 @@ +{ + "hash": "a6352b4b43855cb24b2f76516def7003", + "result": { + "engine": "knitr", + "markdown": "---\ntitle: \"Problem Set Stats Bootcamp - class 10\"\nsubtitle: \"Stats intro and history\"\nauthor: \"Neelanjan Mukherjee\"\ndate: last-modified\n---\n\n::: {.cell}\n::: {.cell-output .cell-output-stderr}\n\n```\n── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──\n✔ dplyr 1.1.4 ✔ readr 2.1.5\n✔ forcats 1.0.0 ✔ stringr 1.5.1\n✔ ggplot2 3.5.1 ✔ tibble 3.2.1\n✔ lubridate 1.9.3 ✔ tidyr 1.3.1\n✔ purrr 1.0.2 \n── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──\n✖ dplyr::filter() masks stats::filter()\n✖ dplyr::lag() masks stats::lag()\nℹ Use the conflicted package () to force all conflicts to become errors\n\nAttaching package: 'cowplot'\n\n\nThe following object is masked from 'package:lubridate':\n\n stamp\n```\n\n\n:::\n:::\n\n\n\n\n## Explore coin flip distribution characteristics\n\nWhen we flip a fair coin multiple times (`numFlips`) in a row, we expect to get heads (or tails) 50% of the time on average. This is not always the case for a single round of flipping, but if we do multiple rounds (`numRounds`), the average across rounds should approach 50%.\n\n## Problem \\# 1\n\nIn class, we simulated coin flip experiments using two different coins that were either fair (0.5 prob of head) or unfair (0.9 prob of head). We varied the number of flips in a single round (`numFlips`) and the number of rounds of flipping (`numRounds`). For this assignment, use the same two coins and use all possible combinations of `numFlips` and `numRounds` from the table below. 
Make sure to `set.seed(9)` if you want to get the same results as in the answer key.\n\n| `numFlips` | `numRounds` |\n|------------|-------------|\n| 5 | 10 |\n| 500 | 100 |\n\n: parameters to explore\n\n1. Create a function called `flippy` that allows you to set `numFlips`, `numRounds`, and `prob` of H.\n\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nflippy <- function(nFlip, nRound, myProb) {\n}\n```\n:::\n\n\n\n\n2. Plot your result. There should be 4 plots total. Create a 2x2 grid by using `faceting`.\n\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\n# we want 5 and 500 flips\n\n# we want to do 10 rounds\n\n# we want to do 100 rounds\n\n# combine data and tidy\n\n# plot and facet\n```\n:::\n\n\n\n\nDescribe (in a couple of sentences) how increasing `numFlips` and `numRounds` alters:\n\n3\\. The estimation of the true mean.\n\n4\\. The spread around the true mean.\n\n## Problem \\# 2\n\nWe have seen that scientists played a major role in the eugenics movement, which greatly influenced society through policies. Explore the [eugenics archive](https://www.eugenicsarchive.ca/) and pick one event from the `timeline` or the `connections` section. Describe what happened in a couple of sentences and your thoughts on the matter in another couple of sentences. Alternatively, you can identify events in Colorado history to describe and comment on. Obviously, there is no wrong answer here. 
I may pick some of your responses and anonymously share them with the rest of the class.\n", + "supporting": [], + "filters": [ + "rmarkdown/pagebreak.lua" + ], + "includes": {}, + "engineDependencies": {}, + "preserve": {}, + "postProcess": true + } +} \ No newline at end of file diff --git a/_freeze/slides/slides-03/execute-results/html.json b/_freeze/slides/slides-03/execute-results/html.json index 58dd5ad5..d4cb2bbf 100644 --- a/_freeze/slides/slides-03/execute-results/html.json +++ b/_freeze/slides/slides-03/execute-results/html.json @@ -2,16 +2,12 @@ "hash": "112c7ed043d3d2686c6aba6bf3aa6abc", "result": { "engine": "knitr", - "markdown": "---\ntitle: \"R Bootcamp - Day 3\"\nsubtitle: \"dplyr\"\nauthor: \"Jay Hesselberth\"\n---\n\n\n\n\n\n\n## Class 3 outline {.smaller}\n\n* Introduce _dplyr_ & today's datasets (Exercise 1)\n* Review basic functions of _dplyr_\n * core dplyr verbs: \n - `arrange` (Exercise 2)\n - `filter` (Exercise 3)\n - `select` (Exercise 4)\n - `mutate` and the pipe (Exercise 5)\n - `summarise` (Exercise 6)\n * modify scope of verbs using: `group_by` (Exercise 7)\n * and many others! `rename`, `count`, `add_row`, `add_column`, `distinct`,\n `sample_n`, `sample_frac`, `slice`, `pull` (Exercise 8)\n\n## dplyr overview\n\ndplyr: \n\n* provides a set of tools for efficiently manipulating data sets in R. \n* is extremely fast even with large data sets. \n* follows the tidyverse grammar and philosophy; human-readable and intuitive\n* encourages linking of verbs together using pipes `|>` (or the older `%>%`)\n\n## Today's datasets {.smaller}\n\n* We will use a data set that comes with the `dplyr` package to explore its functions. \n\n* `dplyr::starwars` contains data for characters from Star Wars.\n\n. . 
.\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nstarwars\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n# A tibble: 87 × 14\n name height mass hair_color skin_color eye_color birth_year sex gender\n \n 1 Luke Sk… 172 77 blond fair blue 19 male mascu…\n 2 C-3PO 167 75 gold yellow 112 none mascu…\n 3 R2-D2 96 32 white, bl… red 33 none mascu…\n 4 Darth V… 202 136 none white yellow 41.9 male mascu…\n 5 Leia Or… 150 49 brown light brown 19 fema… femin…\n 6 Owen La… 178 120 brown, gr… light blue 52 male mascu…\n 7 Beru Wh… 165 75 brown light blue 47 fema… femin…\n 8 R5-D4 97 32 white, red red NA none mascu…\n 9 Biggs D… 183 84 black light brown 24 male mascu…\n10 Obi-Wan… 182 77 auburn, w… fair blue-gray 57 male mascu…\n# ℹ 77 more rows\n# ℹ 5 more variables: homeworld , species , films ,\n# vehicles , starships \n```\n\n\n:::\n:::\n\n\n\n. . .\n\nExplore `starwars` in the console with `head()`, `View()`, and `summary()`.\n\n## dplyr package {.smaller}\n\n`dplyr` is a grammar of data manipulation, providing a consistent set of\nverbs that help you solve the most common data manipulation challenges:\n\n - `arrange()` changes the ordering of the rows.\n - `filter()` picks cases based on their values.\n - `select()` picks variables based on their names.\n - `mutate()` adds new variables that are functions of existing variables\n - `summarise()` reduces multiple values down to a single summary.\n \n. . .\n\n- These all combine naturally with `group_by()` which allows you to perform\nany operation \"by group\". \n\n- Pipes `|>` allows different functions to be used together to create a\nworkflow. `x |> f(y)` turns into `f(x, y)`\n\n## arrange - Syntax\n\n- `arrange()` orders rows by values of one or more columns (low to high).\n- The `desc()` helper orders high to low. \n\n. . 
.\n\n```r\narrange(data = ..., )\n```\n\n## arrange - Exercise 2 {.smaller}\n\n\n\n::: {.cell output-location='fragment'}\n\n```{.r .cell-code}\n# default is to arrange in ascending order\narrange(starwars, height)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n# A tibble: 87 × 14\n name height mass hair_color skin_color eye_color birth_year sex gender\n \n 1 Yoda 66 17 white green brown 896 male mascu…\n 2 Ratts T… 79 15 none grey, blue unknown NA male mascu…\n 3 Wicket … 88 20 brown brown brown 8 male mascu…\n 4 Dud Bolt 94 45 none blue, grey yellow NA male mascu…\n 5 R2-D2 96 32 white, bl… red 33 none mascu…\n 6 R4-P17 96 NA none silver, r… red, blue NA none femin…\n 7 R5-D4 97 32 white, red red NA none mascu…\n 8 Sebulba 112 40 none grey, red orange NA male mascu…\n 9 Gasgano 122 NA none white, bl… black NA male mascu…\n10 Watto 137 NA black blue, grey yellow NA male mascu…\n# ℹ 77 more rows\n# ℹ 5 more variables: homeworld , species , films ,\n# vehicles , starships \n```\n\n\n:::\n:::\n\n\n\n## arrange - Exercise 2 {.smaller}\n\n\n\n::: {.cell output-location='fragment'}\n\n```{.r .cell-code}\n# arrange in descending order\narrange(starwars, desc(height))\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n# A tibble: 87 × 14\n name height mass hair_color skin_color eye_color birth_year sex gender\n \n 1 Yarael … 264 NA none white yellow NA male mascu…\n 2 Tarfful 234 136 brown brown blue NA male mascu…\n 3 Lama Su 229 88 none grey black NA male mascu…\n 4 Chewbac… 228 112 brown unknown blue 200 male mascu…\n 5 Roos Ta… 224 82 none grey orange NA male mascu…\n 6 Grievous 216 159 none brown, wh… green, y… NA male mascu…\n 7 Taun We 213 NA none grey black NA fema… femin…\n 8 Rugor N… 206 NA none green orange NA male mascu…\n 9 Tion Me… 206 80 none grey black NA male mascu…\n10 Darth V… 202 136 none white yellow 41.9 male mascu…\n# ℹ 77 more rows\n# ℹ 5 more variables: homeworld , species , films ,\n# vehicles , starships \n```\n\n\n:::\n:::\n\n\n\n## 
arrange - Exercise 2 {.smaller}\n\n\n\n::: {.cell output-location='fragment'}\n\n```{.r .cell-code}\n# arrange by multiple columns\narrange(starwars, height, mass)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n# A tibble: 87 × 14\n name height mass hair_color skin_color eye_color birth_year sex gender\n \n 1 Yoda 66 17 white green brown 896 male mascu…\n 2 Ratts T… 79 15 none grey, blue unknown NA male mascu…\n 3 Wicket … 88 20 brown brown brown 8 male mascu…\n 4 Dud Bolt 94 45 none blue, grey yellow NA male mascu…\n 5 R2-D2 96 32 white, bl… red 33 none mascu…\n 6 R4-P17 96 NA none silver, r… red, blue NA none femin…\n 7 R5-D4 97 32 white, red red NA none mascu…\n 8 Sebulba 112 40 none grey, red orange NA male mascu…\n 9 Gasgano 122 NA none white, bl… black NA male mascu…\n10 Watto 137 NA black blue, grey yellow NA male mascu…\n# ℹ 77 more rows\n# ℹ 5 more variables: homeworld , species , films ,\n# vehicles , starships \n```\n\n\n:::\n:::\n\n\n\n## filter - Syntax\n\n- `filter()` chooses rows/cases where conditions are true.\n\n```r\nfilter(data = ..., )\n```\n\n## filter - Exercise 3 {.smaller}\n\n\n\n::: {.cell output-location='fragment'}\n\n```{.r .cell-code}\nfilter(starwars, skin_color == \"light\")\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n# A tibble: 11 × 14\n name height mass hair_color skin_color eye_color birth_year sex gender\n \n 1 Leia Or… 150 49 brown light brown 19 fema… femin…\n 2 Owen La… 178 120 brown, gr… light blue 52 male mascu…\n 3 Beru Wh… 165 75 brown light blue 47 fema… femin…\n 4 Biggs D… 183 84 black light brown 24 male mascu…\n 5 Lobot 175 79 none light blue 37 male mascu…\n 6 Padmé A… 185 45 brown light brown 46 fema… femin…\n 7 Cordé 157 NA brown light brown NA \n 8 Dormé 165 NA brown light brown NA fema… femin…\n 9 Raymus … 188 79 brown light brown NA male mascu…\n10 Rey NA NA brown light hazel NA fema… femin…\n11 Poe Dam… NA NA brown light brown NA male mascu…\n# ℹ 5 more variables: homeworld , species , films 
,\n# vehicles , starships \n```\n\n\n:::\n:::\n\n\n\n## filter - Exercise 3 {.smaller}\n\n\n\n::: {.cell output-location='fragment'}\n\n```{.r .cell-code}\nfilter(starwars, height < 150)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n# A tibble: 10 × 14\n name height mass hair_color skin_color eye_color birth_year sex gender\n \n 1 R2-D2 96 32 white, bl… red 33 none mascu…\n 2 R5-D4 97 32 white, red red NA none mascu…\n 3 Yoda 66 17 white green brown 896 male mascu…\n 4 Wicket … 88 20 brown brown brown 8 male mascu…\n 5 Watto 137 NA black blue, grey yellow NA male mascu…\n 6 Sebulba 112 40 none grey, red orange NA male mascu…\n 7 Ratts T… 79 15 none grey, blue unknown NA male mascu…\n 8 Dud Bolt 94 45 none blue, grey yellow NA male mascu…\n 9 Gasgano 122 NA none white, bl… black NA male mascu…\n10 R4-P17 96 NA none silver, r… red, blue NA none femin…\n# ℹ 5 more variables: homeworld , species , films ,\n# vehicles , starships \n```\n\n\n:::\n:::\n\n\n\n## filter - Exercise 3 {.smaller}\n\n\n\n::: {.cell output-location='fragment'}\n\n```{.r .cell-code}\nfilter(\n starwars,\n mass > mean(mass, na.rm = TRUE)\n)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n# A tibble: 10 × 14\n name height mass hair_color skin_color eye_color birth_year sex gender\n \n 1 Darth V… 202 136 none white yellow 41.9 male mascu…\n 2 Owen La… 178 120 brown, gr… light blue 52 male mascu…\n 3 Chewbac… 228 112 brown unknown blue 200 male mascu…\n 4 Jabba D… 175 1358 green-tan… orange 600 herm… mascu…\n 5 Jek Ton… 180 110 brown fair blue NA \n 6 IG-88 200 140 none metal red 15 none mascu…\n 7 Bossk 190 113 none green red 53 male mascu…\n 8 Dexter … 198 102 none brown yellow NA male mascu…\n 9 Grievous 216 159 none brown, wh… green, y… NA male mascu…\n10 Tarfful 234 136 brown brown blue NA male mascu…\n# ℹ 5 more variables: homeworld , species , films ,\n# vehicles , starships \n```\n\n\n:::\n:::\n\n\n\n## filter - Exercise 3 {.smaller}\n\nFilter out cases where `hair_color` is 
`NA`\n\n\n\n::: {.cell output-location='fragment'}\n\n```{.r .cell-code}\nfilter(starwars, is.na(hair_color))\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n# A tibble: 5 × 14\n name height mass hair_color skin_color eye_color birth_year sex gender\n \n1 C-3PO 167 75 gold yellow 112 none mascu…\n2 R2-D2 96 32 white, bl… red 33 none mascu…\n3 R5-D4 97 32 white, red red NA none mascu…\n4 Greedo 173 74 green black 44 male mascu…\n5 Jabba De… 175 1358 green-tan… orange 600 herm… mascu…\n# ℹ 5 more variables: homeworld , species , films ,\n# vehicles , starships \n```\n\n\n:::\n:::\n\n\n\n## filter - Exercise 3 {.smaller}\n\n* The most frequently used comparison operators are:\n\n- `>`, `<`, `>=`, `<=`, `==` (equal), `!=` (not equal)\n- `is.na()`, `!is.na()`, and `%in%` (contained in a vector of cases). \n\n. . .\n\n\n\n::: {.cell output-location='fragment'}\n\n```{.r .cell-code}\nfilter(\n starwars,\n skin_color %in% c(\"light\", \"fair\", \"pale\")\n)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n# A tibble: 33 × 14\n name height mass hair_color skin_color eye_color birth_year sex gender\n \n 1 Luke Sk… 172 77 blond fair blue 19 male mascu…\n 2 Leia Or… 150 49 brown light brown 19 fema… femin…\n 3 Owen La… 178 120 brown, gr… light blue 52 male mascu…\n 4 Beru Wh… 165 75 brown light blue 47 fema… femin…\n 5 Biggs D… 183 84 black light brown 24 male mascu…\n 6 Obi-Wan… 182 77 auburn, w… fair blue-gray 57 male mascu…\n 7 Anakin … 188 84 blond fair blue 41.9 male mascu…\n 8 Wilhuff… 180 NA auburn, g… fair blue 64 male mascu…\n 9 Han Solo 180 80 brown fair brown 29 male mascu…\n10 Wedge A… 170 77 brown fair hazel 21 male mascu…\n# ℹ 23 more rows\n# ℹ 5 more variables: homeworld , species , films ,\n# vehicles , starships \n```\n\n\n:::\n:::\n\n\n\n--- \n\n\n\n::: {.cell output-location='fragment'}\n\n```{.r .cell-code}\n# can also store as a named vector and use %in% with the vector\ncolor <- c(\"light\", \"fair\", \"pale\")\nfilter(starwars, skin_color 
%in% color)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n# A tibble: 33 × 14\n name height mass hair_color skin_color eye_color birth_year sex gender\n \n 1 Luke Sk… 172 77 blond fair blue 19 male mascu…\n 2 Leia Or… 150 49 brown light brown 19 fema… femin…\n 3 Owen La… 178 120 brown, gr… light blue 52 male mascu…\n 4 Beru Wh… 165 75 brown light blue 47 fema… femin…\n 5 Biggs D… 183 84 black light brown 24 male mascu…\n 6 Obi-Wan… 182 77 auburn, w… fair blue-gray 57 male mascu…\n 7 Anakin … 188 84 blond fair blue 41.9 male mascu…\n 8 Wilhuff… 180 NA auburn, g… fair blue 64 male mascu…\n 9 Han Solo 180 80 brown fair brown 29 male mascu…\n10 Wedge A… 170 77 brown fair hazel 21 male mascu…\n# ℹ 23 more rows\n# ℹ 5 more variables: homeworld , species , films ,\n# vehicles , starships \n```\n\n\n:::\n:::\n\n\n\n---\n\nConditions can be combined using `&` (and), `|` (or). \n\n\n\n::: {.cell output-location='fragment'}\n\n```{.r .cell-code}\nfilter(\n starwars,\n skin_color == \"light\" | eye_color == \"brown\"\n)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n# A tibble: 25 × 14\n name height mass hair_color skin_color eye_color birth_year sex gender\n \n 1 Leia Or… 150 49 brown light brown 19 fema… femin…\n 2 Owen La… 178 120 brown, gr… light blue 52 male mascu…\n 3 Beru Wh… 165 75 brown light blue 47 fema… femin…\n 4 Biggs D… 183 84 black light brown 24 male mascu…\n 5 Han Solo 180 80 brown fair brown 29 male mascu…\n 6 Yoda 66 17 white green brown 896 male mascu…\n 7 Boba Fe… 183 78.2 black fair brown 31.5 male mascu…\n 8 Lando C… 177 79 black dark brown 31 male mascu…\n 9 Lobot 175 79 none light blue 37 male mascu…\n10 Arvel C… NA NA brown fair brown NA male mascu…\n# ℹ 15 more rows\n# ℹ 5 more variables: homeworld , species , films ,\n# vehicles , starships \n```\n\n\n:::\n:::\n\n\n\n---\n\n\n\n::: {.cell output-location='fragment'}\n\n```{.r .cell-code}\nfilter(\n starwars,\n skin_color == \"light\" & eye_color == \"brown\"\n)\n```\n\n::: 
{.cell-output .cell-output-stdout}\n\n```\n# A tibble: 7 × 14\n name height mass hair_color skin_color eye_color birth_year sex gender\n \n1 Leia Org… 150 49 brown light brown 19 fema… femin…\n2 Biggs Da… 183 84 black light brown 24 male mascu…\n3 Padmé Am… 185 45 brown light brown 46 fema… femin…\n4 Cordé 157 NA brown light brown NA \n5 Dormé 165 NA brown light brown NA fema… femin…\n6 Raymus A… 188 79 brown light brown NA male mascu…\n7 Poe Dame… NA NA brown light brown NA male mascu…\n# ℹ 5 more variables: homeworld , species , films ,\n# vehicles , starships \n```\n\n\n:::\n:::\n\n\n\n## select - Syntax\n\n- `select` extracts one or more columns from a table \n\n```r\nselect(data = ..., ) \n```\n\n## select - Exercise 4 {.smaller}\n\n\n\n::: {.cell output-location='fragment'}\n\n```{.r .cell-code}\n# select *only* the variable `hair_color`\nselect(starwars, hair_color)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n# A tibble: 87 × 1\n hair_color \n \n 1 blond \n 2 \n 3 \n 4 none \n 5 brown \n 6 brown, grey \n 7 brown \n 8 \n 9 black \n10 auburn, white\n# ℹ 77 more rows\n```\n\n\n:::\n:::\n\n\n\n## select - Exercise 4 {.smaller}\n\n\n\n::: {.cell output-location='fragment'}\n\n```{.r .cell-code}\n# drop the variable `hair_color`\nselect(starwars, -hair_color)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n# A tibble: 87 × 13\n name height mass skin_color eye_color birth_year sex gender homeworld\n \n 1 Luke Sky… 172 77 fair blue 19 male mascu… Tatooine \n 2 C-3PO 167 75 gold yellow 112 none mascu… Tatooine \n 3 R2-D2 96 32 white, bl… red 33 none mascu… Naboo \n 4 Darth Va… 202 136 white yellow 41.9 male mascu… Tatooine \n 5 Leia Org… 150 49 light brown 19 fema… femin… Alderaan \n 6 Owen Lars 178 120 light blue 52 male mascu… Tatooine \n 7 Beru Whi… 165 75 light blue 47 fema… femin… Tatooine \n 8 R5-D4 97 32 white, red red NA none mascu… Tatooine \n 9 Biggs Da… 183 84 light brown 24 male mascu… Tatooine \n10 Obi-Wan … 182 77 fair blue-gray 57 
male mascu… Stewjon \n# ℹ 77 more rows\n# ℹ 4 more variables: species , films , vehicles ,\n# starships \n```\n\n\n:::\n:::\n\n\n\n## select - Exercise 4 {.smaller}\n\n\n\n::: {.cell output-location='fragment'}\n\n```{.r .cell-code}\nselect(starwars, hair_color, skin_color, eye_color)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n# A tibble: 87 × 3\n hair_color skin_color eye_color\n \n 1 blond fair blue \n 2 gold yellow \n 3 white, blue red \n 4 none white yellow \n 5 brown light brown \n 6 brown, grey light blue \n 7 brown light blue \n 8 white, red red \n 9 black light brown \n10 auburn, white fair blue-gray\n# ℹ 77 more rows\n```\n\n\n:::\n:::\n\n\n\n## select - Exercise 4 {.smaller}\n\n\n\n::: {.cell output-location='fragment'}\n\n```{.r .cell-code}\n# select variables `hair_color` through `eye_color`\nselect(starwars, hair_color:eye_color)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n# A tibble: 87 × 3\n hair_color skin_color eye_color\n \n 1 blond fair blue \n 2 gold yellow \n 3 white, blue red \n 4 none white yellow \n 5 brown light brown \n 6 brown, grey light blue \n 7 brown light blue \n 8 white, red red \n 9 black light brown \n10 auburn, white fair blue-gray\n# ℹ 77 more rows\n```\n\n\n:::\n:::\n\n\n\n## select - Exercise 4 {.smaller}\n\n\n\n::: {.cell output-location='fragment'}\n\n```{.r .cell-code}\n# drop variables `hair_color` through `eye_color`\nselect(starwars, !(hair_color:eye_color))\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n# A tibble: 87 × 11\n name height mass birth_year sex gender homeworld species films vehicles\n \n 1 Luke S… 172 77 19 male mascu… Tatooine Human \n 2 C-3PO 167 75 112 none mascu… Tatooine Droid \n 3 R2-D2 96 32 33 none mascu… Naboo Droid \n 4 Darth … 202 136 41.9 male mascu… Tatooine Human \n 5 Leia O… 150 49 19 fema… femin… Alderaan Human \n 6 Owen L… 178 120 52 male mascu… Tatooine Human \n 7 Beru W… 165 75 47 fema… femin… Tatooine Human \n 8 R5-D4 97 32 NA none mascu… Tatooine Droid \n 
9 Biggs … 183 84 24 male mascu… Tatooine Human \n10 Obi-Wa… 182 77 57 male mascu… Stewjon Human \n# ℹ 77 more rows\n# ℹ 1 more variable: starships \n```\n\n\n:::\n:::\n\n\n\n## select - Exercise 4 {.smaller}\n\n\n\n::: {.cell output-location='fragment'}\n\n```{.r .cell-code}\n# `starts_with`, `ends_with`, `contains`\nselect(starwars, ends_with(\"color\"))\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n# A tibble: 87 × 3\n hair_color skin_color eye_color\n \n 1 blond fair blue \n 2 gold yellow \n 3 white, blue red \n 4 none white yellow \n 5 brown light brown \n 6 brown, grey light blue \n 7 brown light blue \n 8 white, red red \n 9 black light brown \n10 auburn, white fair blue-gray\n# ℹ 77 more rows\n```\n\n\n:::\n:::\n\n\n\n## mutate - Syntax\n\n- `mutate()` to compute new columns\n\n![](../img/dplyr/mutate.png)\n\n---\n\n```r\nmutate(data = ..., = funs())\nmutate(data = ..., , funs(x))\n```\n\n. . .\n\nor with the the pipe `|>`\n\nUseful when multiple functions act sequentially on a dataframe.\n\n```r\ndata |>\n mutate(, funs(x)) \n```\n\n## mutate (& pipe |>)- Exercise 5 {.smaller}\n\n\n\n::: {.cell output-location='fragment'}\n\n```{.r .cell-code}\n# create a new column to display height in meters\nmutate(starwars, height_m = height / 100)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n# A tibble: 87 × 15\n name height mass hair_color skin_color eye_color birth_year sex gender\n \n 1 Luke Sk… 172 77 blond fair blue 19 male mascu…\n 2 C-3PO 167 75 gold yellow 112 none mascu…\n 3 R2-D2 96 32 white, bl… red 33 none mascu…\n 4 Darth V… 202 136 none white yellow 41.9 male mascu…\n 5 Leia Or… 150 49 brown light brown 19 fema… femin…\n 6 Owen La… 178 120 brown, gr… light blue 52 male mascu…\n 7 Beru Wh… 165 75 brown light blue 47 fema… femin…\n 8 R5-D4 97 32 white, red red NA none mascu…\n 9 Biggs D… 183 84 black light brown 24 male mascu…\n10 Obi-Wan… 182 77 auburn, w… fair blue-gray 57 male mascu…\n# ℹ 77 more rows\n# ℹ 6 more variables: homeworld , 
species , films ,\n# vehicles , starships , height_m \n```\n\n\n:::\n:::\n\n\n\n## mutate (& pipe |>)- Exercise 5 {.smaller}\n\n\n\n::: {.cell output-location='fragment'}\n\n```{.r .cell-code}\n# using the pipe to feed data into multiple functions sequentially\nstarwars |>\n mutate(height_m = height / 100) |>\n select(name, height_m, height, everything())\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n# A tibble: 87 × 15\n name height_m height mass hair_color skin_color eye_color birth_year sex \n \n 1 Luke … 1.72 172 77 blond fair blue 19 male \n 2 C-3PO 1.67 167 75 gold yellow 112 none \n 3 R2-D2 0.96 96 32 white, bl… red 33 none \n 4 Darth… 2.02 202 136 none white yellow 41.9 male \n 5 Leia … 1.5 150 49 brown light brown 19 fema…\n 6 Owen … 1.78 178 120 brown, gr… light blue 52 male \n 7 Beru … 1.65 165 75 brown light blue 47 fema…\n 8 R5-D4 0.97 97 32 white, red red NA none \n 9 Biggs… 1.83 183 84 black light brown 24 male \n10 Obi-W… 1.82 182 77 auburn, w… fair blue-gray 57 male \n# ℹ 77 more rows\n# ℹ 6 more variables: gender , homeworld , species ,\n# films , vehicles , starships \n```\n\n\n:::\n:::\n\n\n\n## mutate (& pipe |>)- Exercise 5 {.smaller}\n\nMutate allows you to refer to columns that you’ve just created\n\n\n\n::: {.cell output-location='fragment'}\n\n```{.r .cell-code}\nstarwars |>\n mutate(\n height_m = height / 100,\n BMI = mass / (height_m^2)\n ) |>\n select(name, BMI, everything())\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n# A tibble: 87 × 16\n name BMI height mass hair_color skin_color eye_color birth_year sex \n \n 1 Luke Sky… 26.0 172 77 blond fair blue 19 male \n 2 C-3PO 26.9 167 75 gold yellow 112 none \n 3 R2-D2 34.7 96 32 white, bl… red 33 none \n 4 Darth Va… 33.3 202 136 none white yellow 41.9 male \n 5 Leia Org… 21.8 150 49 brown light brown 19 fema…\n 6 Owen Lars 37.9 178 120 brown, gr… light blue 52 male \n 7 Beru Whi… 27.5 165 75 brown light blue 47 fema…\n 8 R5-D4 34.0 97 32 white, red red NA none \n 9 Biggs 
Da… 25.1 183 84 black light brown 24 male \n10 Obi-Wan … 23.2 182 77 auburn, w… fair blue-gray 57 male \n# ℹ 77 more rows\n# ℹ 7 more variables: gender , homeworld , species ,\n# films , vehicles , starships , height_m \n```\n\n\n:::\n:::\n\n\n\n---\n\nOutput needs to be saved into a new data frame since dplyr does not \"change\" the original dataframe.\n\n\n\n::: {.cell output-location='fragment'}\n\n```{.r .cell-code}\nstarwars_bmi <- starwars |>\n mutate(\n height_m = height / 100,\n BMI = mass / (height_m^2)\n ) |>\n select(name, BMI, everything())\n```\n:::\n\n\n\n---\n\n## Using `case_when()`clauses with `mutate()`. {.smaller}\n\nLet's create a new variable `tall_short` based on other values.\n\n\n\n::: {.cell output-location='fragment'}\n\n```{.r .cell-code code-line-numbers=\"3-6\"}\nstarwars |>\n mutate(\n tall_short = case_when(\n height > 160 ~ \"tall\",\n .default = \"short\"\n )\n ) |>\n select(name, tall_short, everything())\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n# A tibble: 87 × 15\n name tall_short height mass hair_color skin_color eye_color birth_year\n \n 1 Luke Skyw… tall 172 77 blond fair blue 19 \n 2 C-3PO tall 167 75 gold yellow 112 \n 3 R2-D2 short 96 32 white, bl… red 33 \n 4 Darth Vad… tall 202 136 none white yellow 41.9\n 5 Leia Orga… short 150 49 brown light brown 19 \n 6 Owen Lars tall 178 120 brown, gr… light blue 52 \n 7 Beru Whit… tall 165 75 brown light blue 47 \n 8 R5-D4 short 97 32 white, red red NA \n 9 Biggs Dar… tall 183 84 black light brown 24 \n10 Obi-Wan K… tall 182 77 auburn, w… fair blue-gray 57 \n# ℹ 77 more rows\n# ℹ 7 more variables: sex , gender , homeworld , species ,\n# films , vehicles , starships \n```\n\n\n:::\n:::\n\n\n\n## `group_by()` & `summarise()` - Exercise 6\n\n`group_by` creates a grouped copy of a table. \n\n* This changes the unit of analysis from the complete data set to individual groups. \n* dplyr verbs automatically detect grouped tables and calculate \"by group\". \n\n. . 
.\n\n```r\ngroup_by(data = ..., )\n```\n\n## group_by - Syntax\n\n* `group_by()` creates a grouped tibble. \n* This changes the unit of analysis from the complete dataset to individual groups. \n* Then, when you use the dplyr verbs on a grouped data frame they'll be automatically applied \"by group\". \n\n. . .\n\n```r\ngroup_by(data = ..., )\n```\n\n## group_by + summarize - Exercise 7 {.smaller}\n\n\n\n::: {.cell output-location='fragment'}\n\n```{.r .cell-code}\nstarwars |>\n group_by(species)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n# A tibble: 87 × 14\n# Groups: species [38]\n name height mass hair_color skin_color eye_color birth_year sex gender\n \n 1 Luke Sk… 172 77 blond fair blue 19 male mascu…\n 2 C-3PO 167 75 gold yellow 112 none mascu…\n 3 R2-D2 96 32 white, bl… red 33 none mascu…\n 4 Darth V… 202 136 none white yellow 41.9 male mascu…\n 5 Leia Or… 150 49 brown light brown 19 fema… femin…\n 6 Owen La… 178 120 brown, gr… light blue 52 male mascu…\n 7 Beru Wh… 165 75 brown light blue 47 fema… femin…\n 8 R5-D4 97 32 white, red red NA none mascu…\n 9 Biggs D… 183 84 black light brown 24 male mascu…\n10 Obi-Wan… 182 77 auburn, w… fair blue-gray 57 male mascu…\n# ℹ 77 more rows\n# ℹ 5 more variables: homeworld , species , films ,\n# vehicles , starships \n```\n\n\n:::\n:::\n\n\n\n## summarize - syntax\n\n* `summarize()` takes named expressions and calculates a summary based on group.\n\n. . 
.\n\n```r\nsummarize(data = ..., name = expression)\n```\n\n## Calculate a summary statistic *by species* {.smaller}\n\n\n\n::: {.cell output-location='fragment'}\n\n```{.r .cell-code}\nstarwars |>\n group_by(species) |>\n summarise(\n height = mean(height, na.rm = TRUE)\n )\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n# A tibble: 38 × 2\n species height\n \n 1 Aleena 79 \n 2 Besalisk 198 \n 3 Cerean 198 \n 4 Chagrian 196 \n 5 Clawdite 168 \n 6 Droid 131.\n 7 Dug 112 \n 8 Ewok 88 \n 9 Geonosian 183 \n10 Gungan 209.\n# ℹ 28 more rows\n```\n\n\n:::\n:::\n\n\n\n---\n\nCalucate multiple summary statistics.\n\n\n\n::: {.cell output-location='fragment'}\n\n```{.r .cell-code}\nstarwars |>\n group_by(species, gender) |>\n summarise(\n height = mean(height, na.rm = TRUE),\n mass = mean(mass, na.rm = TRUE)\n )\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n# A tibble: 42 × 4\n# Groups: species [38]\n species gender height mass\n \n 1 Aleena masculine 79 15 \n 2 Besalisk masculine 198 102 \n 3 Cerean masculine 198 82 \n 4 Chagrian masculine 196 NaN \n 5 Clawdite feminine 168 55 \n 6 Droid feminine 96 NaN \n 7 Droid masculine 140 69.8\n 8 Dug masculine 112 40 \n 9 Ewok masculine 88 20 \n10 Geonosian masculine 183 80 \n# ℹ 32 more rows\n```\n\n\n:::\n:::\n", + "markdown": "---\ntitle: \"R Bootcamp - Day 3\"\nsubtitle: \"dplyr\"\nauthor: \"Jay Hesselberth\"\n---\n\n\n\n\n\n\n\n## Class 3 outline {.smaller}\n\n* Introduce _dplyr_ & today's datasets (Exercise 1)\n* Review basic functions of _dplyr_\n * core dplyr verbs: \n - `arrange` (Exercise 2)\n - `filter` (Exercise 3)\n - `select` (Exercise 4)\n - `mutate` and the pipe (Exercise 5)\n - `summarise` (Exercise 6)\n * modify scope of verbs using: `group_by` (Exercise 7)\n * and many others! `rename`, `count`, `add_row`, `add_column`, `distinct`,\n `sample_n`, `sample_frac`, `slice`, `pull` (Exercise 8)\n\n## dplyr overview\n\ndplyr: \n\n* provides a set of tools for efficiently manipulating data sets in R. 
\n* is extremely fast even with large data sets. \n* follows the tidyverse grammar and philosophy; human-readable and intuitive\n* encourages linking of verbs together using pipes `|>` (or the older `%>%`)\n\n## Today's datasets {.smaller}\n\n* We will use a data set that comes with the `dplyr` package to explore its functions. \n\n* `dplyr::starwars` contains data for characters from Star Wars.\n\n. . .\n\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nstarwars\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n# A tibble: 87 × 14\n name height mass hair_color skin_color eye_color birth_year sex gender\n \n 1 Luke Sk… 172 77 blond fair blue 19 male mascu…\n 2 C-3PO 167 75 gold yellow 112 none mascu…\n 3 R2-D2 96 32 white, bl… red 33 none mascu…\n 4 Darth V… 202 136 none white yellow 41.9 male mascu…\n 5 Leia Or… 150 49 brown light brown 19 fema… femin…\n 6 Owen La… 178 120 brown, gr… light blue 52 male mascu…\n 7 Beru Wh… 165 75 brown light blue 47 fema… femin…\n 8 R5-D4 97 32 white, red red NA none mascu…\n 9 Biggs D… 183 84 black light brown 24 male mascu…\n10 Obi-Wan… 182 77 auburn, w… fair blue-gray 57 male mascu…\n# ℹ 77 more rows\n# ℹ 5 more variables: homeworld , species , films ,\n# vehicles , starships \n```\n\n\n:::\n:::\n\n\n\n\n. . .\n\nExplore `starwars` in the console with `head()`, `View()`, and `summary()`.\n\n## dplyr package {.smaller}\n\n`dplyr` is a grammar of data manipulation, providing a consistent set of\nverbs that help you solve the most common data manipulation challenges:\n\n - `arrange()` changes the ordering of the rows.\n - `filter()` picks cases based on their values.\n - `select()` picks variables based on their names.\n - `mutate()` adds new variables that are functions of existing variables\n - `summarise()` reduces multiple values down to a single summary.\n \n. . .\n\n- These all combine naturally with `group_by()` which allows you to perform\nany operation \"by group\". 
\n\n- Pipes `|>` allows different functions to be used together to create a\nworkflow. `x |> f(y)` turns into `f(x, y)`\n\n## arrange - Syntax\n\n- `arrange()` orders rows by values of one or more columns (low to high).\n- The `desc()` helper orders high to low. \n\n. . .\n\n```r\narrange(data = ..., )\n```\n\n## arrange - Exercise 2 {.smaller}\n\n\n\n\n::: {.cell output-location='fragment'}\n\n```{.r .cell-code}\n# default is to arrange in ascending order\narrange(starwars, height)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n# A tibble: 87 × 14\n name height mass hair_color skin_color eye_color birth_year sex gender\n \n 1 Yoda 66 17 white green brown 896 male mascu…\n 2 Ratts T… 79 15 none grey, blue unknown NA male mascu…\n 3 Wicket … 88 20 brown brown brown 8 male mascu…\n 4 Dud Bolt 94 45 none blue, grey yellow NA male mascu…\n 5 R2-D2 96 32 white, bl… red 33 none mascu…\n 6 R4-P17 96 NA none silver, r… red, blue NA none femin…\n 7 R5-D4 97 32 white, red red NA none mascu…\n 8 Sebulba 112 40 none grey, red orange NA male mascu…\n 9 Gasgano 122 NA none white, bl… black NA male mascu…\n10 Watto 137 NA black blue, grey yellow NA male mascu…\n# ℹ 77 more rows\n# ℹ 5 more variables: homeworld , species , films ,\n# vehicles , starships \n```\n\n\n:::\n:::\n\n\n\n\n## arrange - Exercise 2 {.smaller}\n\n\n\n\n::: {.cell output-location='fragment'}\n\n```{.r .cell-code}\n# arrange in descending order\narrange(starwars, desc(height))\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n# A tibble: 87 × 14\n name height mass hair_color skin_color eye_color birth_year sex gender\n \n 1 Yarael … 264 NA none white yellow NA male mascu…\n 2 Tarfful 234 136 brown brown blue NA male mascu…\n 3 Lama Su 229 88 none grey black NA male mascu…\n 4 Chewbac… 228 112 brown unknown blue 200 male mascu…\n 5 Roos Ta… 224 82 none grey orange NA male mascu…\n 6 Grievous 216 159 none brown, wh… green, y… NA male mascu…\n 7 Taun We 213 NA none grey black NA fema… femin…\n 8 
Rugor N… 206 NA none green orange NA male mascu…\n 9 Tion Me… 206 80 none grey black NA male mascu…\n10 Darth V… 202 136 none white yellow 41.9 male mascu…\n# ℹ 77 more rows\n# ℹ 5 more variables: homeworld , species , films ,\n# vehicles , starships \n```\n\n\n:::\n:::\n\n\n\n\n## arrange - Exercise 2 {.smaller}\n\n\n\n\n::: {.cell output-location='fragment'}\n\n```{.r .cell-code}\n# arrange by multiple columns\narrange(starwars, height, mass)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n# A tibble: 87 × 14\n name height mass hair_color skin_color eye_color birth_year sex gender\n \n 1 Yoda 66 17 white green brown 896 male mascu…\n 2 Ratts T… 79 15 none grey, blue unknown NA male mascu…\n 3 Wicket … 88 20 brown brown brown 8 male mascu…\n 4 Dud Bolt 94 45 none blue, grey yellow NA male mascu…\n 5 R2-D2 96 32 white, bl… red 33 none mascu…\n 6 R4-P17 96 NA none silver, r… red, blue NA none femin…\n 7 R5-D4 97 32 white, red red NA none mascu…\n 8 Sebulba 112 40 none grey, red orange NA male mascu…\n 9 Gasgano 122 NA none white, bl… black NA male mascu…\n10 Watto 137 NA black blue, grey yellow NA male mascu…\n# ℹ 77 more rows\n# ℹ 5 more variables: homeworld , species , films ,\n# vehicles , starships \n```\n\n\n:::\n:::\n\n\n\n\n## filter - Syntax\n\n- `filter()` chooses rows/cases where conditions are true.\n\n```r\nfilter(data = ..., )\n```\n\n## filter - Exercise 3 {.smaller}\n\n\n\n\n::: {.cell output-location='fragment'}\n\n```{.r .cell-code}\nfilter(starwars, skin_color == \"light\")\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n# A tibble: 11 × 14\n name height mass hair_color skin_color eye_color birth_year sex gender\n \n 1 Leia Or… 150 49 brown light brown 19 fema… femin…\n 2 Owen La… 178 120 brown, gr… light blue 52 male mascu…\n 3 Beru Wh… 165 75 brown light blue 47 fema… femin…\n 4 Biggs D… 183 84 black light brown 24 male mascu…\n 5 Lobot 175 79 none light blue 37 male mascu…\n 6 Padmé A… 185 45 brown light brown 46 fema… femin…\n 7 
Cordé 157 NA brown light brown NA \n 8 Dormé 165 NA brown light brown NA fema… femin…\n 9 Raymus … 188 79 brown light brown NA male mascu…\n10 Rey NA NA brown light hazel NA fema… femin…\n11 Poe Dam… NA NA brown light brown NA male mascu…\n# ℹ 5 more variables: homeworld , species , films ,\n# vehicles , starships \n```\n\n\n:::\n:::\n\n\n\n\n## filter - Exercise 3 {.smaller}\n\n\n\n\n::: {.cell output-location='fragment'}\n\n```{.r .cell-code}\nfilter(starwars, height < 150)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n# A tibble: 10 × 14\n name height mass hair_color skin_color eye_color birth_year sex gender\n \n 1 R2-D2 96 32 white, bl… red 33 none mascu…\n 2 R5-D4 97 32 white, red red NA none mascu…\n 3 Yoda 66 17 white green brown 896 male mascu…\n 4 Wicket … 88 20 brown brown brown 8 male mascu…\n 5 Watto 137 NA black blue, grey yellow NA male mascu…\n 6 Sebulba 112 40 none grey, red orange NA male mascu…\n 7 Ratts T… 79 15 none grey, blue unknown NA male mascu…\n 8 Dud Bolt 94 45 none blue, grey yellow NA male mascu…\n 9 Gasgano 122 NA none white, bl… black NA male mascu…\n10 R4-P17 96 NA none silver, r… red, blue NA none femin…\n# ℹ 5 more variables: homeworld , species , films ,\n# vehicles , starships \n```\n\n\n:::\n:::\n\n\n\n\n## filter - Exercise 3 {.smaller}\n\n\n\n\n::: {.cell output-location='fragment'}\n\n```{.r .cell-code}\nfilter(\n starwars,\n mass > mean(mass, na.rm = TRUE)\n)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n# A tibble: 10 × 14\n name height mass hair_color skin_color eye_color birth_year sex gender\n \n 1 Darth V… 202 136 none white yellow 41.9 male mascu…\n 2 Owen La… 178 120 brown, gr… light blue 52 male mascu…\n 3 Chewbac… 228 112 brown unknown blue 200 male mascu…\n 4 Jabba D… 175 1358 green-tan… orange 600 herm… mascu…\n 5 Jek Ton… 180 110 brown fair blue NA \n 6 IG-88 200 140 none metal red 15 none mascu…\n 7 Bossk 190 113 none green red 53 male mascu…\n 8 Dexter … 198 102 none brown yellow NA male 
mascu…\n 9 Grievous 216 159 none brown, wh… green, y… NA male mascu…\n10 Tarfful 234 136 brown brown blue NA male mascu…\n# ℹ 5 more variables: homeworld , species , films ,\n# vehicles , starships \n```\n\n\n:::\n:::\n\n\n\n\n## filter - Exercise 3 {.smaller}\n\nFilter out cases where `hair_color` is `NA`\n\n\n\n\n::: {.cell output-location='fragment'}\n\n```{.r .cell-code}\nfilter(starwars, is.na(hair_color))\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n# A tibble: 5 × 14\n name height mass hair_color skin_color eye_color birth_year sex gender\n \n1 C-3PO 167 75 gold yellow 112 none mascu…\n2 R2-D2 96 32 white, bl… red 33 none mascu…\n3 R5-D4 97 32 white, red red NA none mascu…\n4 Greedo 173 74 green black 44 male mascu…\n5 Jabba De… 175 1358 green-tan… orange 600 herm… mascu…\n# ℹ 5 more variables: homeworld , species , films ,\n# vehicles , starships \n```\n\n\n:::\n:::\n\n\n\n\n## filter - Exercise 3 {.smaller}\n\n* The most frequently used comparison operators are:\n\n- `>`, `<`, `>=`, `<=`, `==` (equal), `!=` (not equal)\n- `is.na()`, `!is.na()`, and `%in%` (contained in a vector of cases). \n\n. . 
.\n\n\n\n\n::: {.cell output-location='fragment'}\n\n```{.r .cell-code}\nfilter(\n starwars,\n skin_color %in% c(\"light\", \"fair\", \"pale\")\n)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n# A tibble: 33 × 14\n name height mass hair_color skin_color eye_color birth_year sex gender\n \n 1 Luke Sk… 172 77 blond fair blue 19 male mascu…\n 2 Leia Or… 150 49 brown light brown 19 fema… femin…\n 3 Owen La… 178 120 brown, gr… light blue 52 male mascu…\n 4 Beru Wh… 165 75 brown light blue 47 fema… femin…\n 5 Biggs D… 183 84 black light brown 24 male mascu…\n 6 Obi-Wan… 182 77 auburn, w… fair blue-gray 57 male mascu…\n 7 Anakin … 188 84 blond fair blue 41.9 male mascu…\n 8 Wilhuff… 180 NA auburn, g… fair blue 64 male mascu…\n 9 Han Solo 180 80 brown fair brown 29 male mascu…\n10 Wedge A… 170 77 brown fair hazel 21 male mascu…\n# ℹ 23 more rows\n# ℹ 5 more variables: homeworld , species , films ,\n# vehicles , starships \n```\n\n\n:::\n:::\n\n\n\n\n--- \n\n\n\n\n::: {.cell output-location='fragment'}\n\n```{.r .cell-code}\n# can also store as a named vector and use %in% with the vector\ncolor <- c(\"light\", \"fair\", \"pale\")\nfilter(starwars, skin_color %in% color)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n# A tibble: 33 × 14\n name height mass hair_color skin_color eye_color birth_year sex gender\n \n 1 Luke Sk… 172 77 blond fair blue 19 male mascu…\n 2 Leia Or… 150 49 brown light brown 19 fema… femin…\n 3 Owen La… 178 120 brown, gr… light blue 52 male mascu…\n 4 Beru Wh… 165 75 brown light blue 47 fema… femin…\n 5 Biggs D… 183 84 black light brown 24 male mascu…\n 6 Obi-Wan… 182 77 auburn, w… fair blue-gray 57 male mascu…\n 7 Anakin … 188 84 blond fair blue 41.9 male mascu…\n 8 Wilhuff… 180 NA auburn, g… fair blue 64 male mascu…\n 9 Han Solo 180 80 brown fair brown 29 male mascu…\n10 Wedge A… 170 77 brown fair hazel 21 male mascu…\n# ℹ 23 more rows\n# ℹ 5 more variables: homeworld , species , films ,\n# vehicles , starships 
\n```\n\n\n:::\n:::\n\n\n\n\n---\n\nConditions can be combined using `&` (and), `|` (or). \n\n\n\n\n::: {.cell output-location='fragment'}\n\n```{.r .cell-code}\nfilter(\n starwars,\n skin_color == \"light\" | eye_color == \"brown\"\n)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n# A tibble: 25 × 14\n name height mass hair_color skin_color eye_color birth_year sex gender\n \n 1 Leia Or… 150 49 brown light brown 19 fema… femin…\n 2 Owen La… 178 120 brown, gr… light blue 52 male mascu…\n 3 Beru Wh… 165 75 brown light blue 47 fema… femin…\n 4 Biggs D… 183 84 black light brown 24 male mascu…\n 5 Han Solo 180 80 brown fair brown 29 male mascu…\n 6 Yoda 66 17 white green brown 896 male mascu…\n 7 Boba Fe… 183 78.2 black fair brown 31.5 male mascu…\n 8 Lando C… 177 79 black dark brown 31 male mascu…\n 9 Lobot 175 79 none light blue 37 male mascu…\n10 Arvel C… NA NA brown fair brown NA male mascu…\n# ℹ 15 more rows\n# ℹ 5 more variables: homeworld , species , films ,\n# vehicles , starships \n```\n\n\n:::\n:::\n\n\n\n\n---\n\n\n\n\n::: {.cell output-location='fragment'}\n\n```{.r .cell-code}\nfilter(\n starwars,\n skin_color == \"light\" & eye_color == \"brown\"\n)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n# A tibble: 7 × 14\n name height mass hair_color skin_color eye_color birth_year sex gender\n \n1 Leia Org… 150 49 brown light brown 19 fema… femin…\n2 Biggs Da… 183 84 black light brown 24 male mascu…\n3 Padmé Am… 185 45 brown light brown 46 fema… femin…\n4 Cordé 157 NA brown light brown NA \n5 Dormé 165 NA brown light brown NA fema… femin…\n6 Raymus A… 188 79 brown light brown NA male mascu…\n7 Poe Dame… NA NA brown light brown NA male mascu…\n# ℹ 5 more variables: homeworld , species , films ,\n# vehicles , starships \n```\n\n\n:::\n:::\n\n\n\n\n## select - Syntax\n\n- `select` extracts one or more columns from a table \n\n```r\nselect(data = ..., ) \n```\n\n## select - Exercise 4 {.smaller}\n\n\n\n\n::: {.cell 
output-location='fragment'}\n\n```{.r .cell-code}\n# select *only* the variable `hair_color`\nselect(starwars, hair_color)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n# A tibble: 87 × 1\n hair_color \n \n 1 blond \n 2 \n 3 \n 4 none \n 5 brown \n 6 brown, grey \n 7 brown \n 8 \n 9 black \n10 auburn, white\n# ℹ 77 more rows\n```\n\n\n:::\n:::\n\n\n\n\n## select - Exercise 4 {.smaller}\n\n\n\n\n::: {.cell output-location='fragment'}\n\n```{.r .cell-code}\n# drop the variable `hair_color`\nselect(starwars, -hair_color)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n# A tibble: 87 × 13\n name height mass skin_color eye_color birth_year sex gender homeworld\n \n 1 Luke Sky… 172 77 fair blue 19 male mascu… Tatooine \n 2 C-3PO 167 75 gold yellow 112 none mascu… Tatooine \n 3 R2-D2 96 32 white, bl… red 33 none mascu… Naboo \n 4 Darth Va… 202 136 white yellow 41.9 male mascu… Tatooine \n 5 Leia Org… 150 49 light brown 19 fema… femin… Alderaan \n 6 Owen Lars 178 120 light blue 52 male mascu… Tatooine \n 7 Beru Whi… 165 75 light blue 47 fema… femin… Tatooine \n 8 R5-D4 97 32 white, red red NA none mascu… Tatooine \n 9 Biggs Da… 183 84 light brown 24 male mascu… Tatooine \n10 Obi-Wan … 182 77 fair blue-gray 57 male mascu… Stewjon \n# ℹ 77 more rows\n# ℹ 4 more variables: species , films , vehicles ,\n# starships \n```\n\n\n:::\n:::\n\n\n\n\n## select - Exercise 4 {.smaller}\n\n\n\n\n::: {.cell output-location='fragment'}\n\n```{.r .cell-code}\nselect(starwars, hair_color, skin_color, eye_color)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n# A tibble: 87 × 3\n hair_color skin_color eye_color\n \n 1 blond fair blue \n 2 gold yellow \n 3 white, blue red \n 4 none white yellow \n 5 brown light brown \n 6 brown, grey light blue \n 7 brown light blue \n 8 white, red red \n 9 black light brown \n10 auburn, white fair blue-gray\n# ℹ 77 more rows\n```\n\n\n:::\n:::\n\n\n\n\n## select - Exercise 4 {.smaller}\n\n\n\n\n::: {.cell 
output-location='fragment'}\n\n```{.r .cell-code}\n# select variables `hair_color` through `eye_color`\nselect(starwars, hair_color:eye_color)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n# A tibble: 87 × 3\n hair_color skin_color eye_color\n \n 1 blond fair blue \n 2 gold yellow \n 3 white, blue red \n 4 none white yellow \n 5 brown light brown \n 6 brown, grey light blue \n 7 brown light blue \n 8 white, red red \n 9 black light brown \n10 auburn, white fair blue-gray\n# ℹ 77 more rows\n```\n\n\n:::\n:::\n\n\n\n\n## select - Exercise 4 {.smaller}\n\n\n\n\n::: {.cell output-location='fragment'}\n\n```{.r .cell-code}\n# drop variables `hair_color` through `eye_color`\nselect(starwars, !(hair_color:eye_color))\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n# A tibble: 87 × 11\n name height mass birth_year sex gender homeworld species films vehicles\n \n 1 Luke S… 172 77 19 male mascu… Tatooine Human \n 2 C-3PO 167 75 112 none mascu… Tatooine Droid \n 3 R2-D2 96 32 33 none mascu… Naboo Droid \n 4 Darth … 202 136 41.9 male mascu… Tatooine Human \n 5 Leia O… 150 49 19 fema… femin… Alderaan Human \n 6 Owen L… 178 120 52 male mascu… Tatooine Human \n 7 Beru W… 165 75 47 fema… femin… Tatooine Human \n 8 R5-D4 97 32 NA none mascu… Tatooine Droid \n 9 Biggs … 183 84 24 male mascu… Tatooine Human \n10 Obi-Wa… 182 77 57 male mascu… Stewjon Human \n# ℹ 77 more rows\n# ℹ 1 more variable: starships \n```\n\n\n:::\n:::\n\n\n\n\n## select - Exercise 4 {.smaller}\n\n\n\n\n::: {.cell output-location='fragment'}\n\n```{.r .cell-code}\n# `starts_with`, `ends_with`, `contains`\nselect(starwars, ends_with(\"color\"))\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n# A tibble: 87 × 3\n hair_color skin_color eye_color\n \n 1 blond fair blue \n 2 gold yellow \n 3 white, blue red \n 4 none white yellow \n 5 brown light brown \n 6 brown, grey light blue \n 7 brown light blue \n 8 white, red red \n 9 black light brown \n10 auburn, white fair blue-gray\n# ℹ 77 more 
rows\n```\n\n\n:::\n:::\n\n\n\n\n## mutate - Syntax\n\n- `mutate()` to compute new columns\n\n![](../img/dplyr/mutate.png)\n\n---\n\n```r\nmutate(data = ..., = funs())\nmutate(data = ..., , funs(x))\n```\n\n. . .\n\nor with the the pipe `|>`\n\nUseful when multiple functions act sequentially on a dataframe.\n\n```r\ndata |>\n mutate(, funs(x)) \n```\n\n## mutate (& pipe |>)- Exercise 5 {.smaller}\n\n\n\n\n::: {.cell output-location='fragment'}\n\n```{.r .cell-code}\n# create a new column to display height in meters\nmutate(starwars, height_m = height / 100)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n# A tibble: 87 × 15\n name height mass hair_color skin_color eye_color birth_year sex gender\n \n 1 Luke Sk… 172 77 blond fair blue 19 male mascu…\n 2 C-3PO 167 75 gold yellow 112 none mascu…\n 3 R2-D2 96 32 white, bl… red 33 none mascu…\n 4 Darth V… 202 136 none white yellow 41.9 male mascu…\n 5 Leia Or… 150 49 brown light brown 19 fema… femin…\n 6 Owen La… 178 120 brown, gr… light blue 52 male mascu…\n 7 Beru Wh… 165 75 brown light blue 47 fema… femin…\n 8 R5-D4 97 32 white, red red NA none mascu…\n 9 Biggs D… 183 84 black light brown 24 male mascu…\n10 Obi-Wan… 182 77 auburn, w… fair blue-gray 57 male mascu…\n# ℹ 77 more rows\n# ℹ 6 more variables: homeworld , species , films ,\n# vehicles , starships , height_m \n```\n\n\n:::\n:::\n\n\n\n\n## mutate (& pipe |>)- Exercise 5 {.smaller}\n\n\n\n\n::: {.cell output-location='fragment'}\n\n```{.r .cell-code}\n# using the pipe to feed data into multiple functions sequentially\nstarwars |>\n mutate(height_m = height / 100) |>\n select(name, height_m, height, everything())\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n# A tibble: 87 × 15\n name height_m height mass hair_color skin_color eye_color birth_year sex \n \n 1 Luke … 1.72 172 77 blond fair blue 19 male \n 2 C-3PO 1.67 167 75 gold yellow 112 none \n 3 R2-D2 0.96 96 32 white, bl… red 33 none \n 4 Darth… 2.02 202 136 none white yellow 41.9 male \n 
5 Leia … 1.5 150 49 brown light brown 19 fema…\n 6 Owen … 1.78 178 120 brown, gr… light blue 52 male \n 7 Beru … 1.65 165 75 brown light blue 47 fema…\n 8 R5-D4 0.97 97 32 white, red red NA none \n 9 Biggs… 1.83 183 84 black light brown 24 male \n10 Obi-W… 1.82 182 77 auburn, w… fair blue-gray 57 male \n# ℹ 77 more rows\n# ℹ 6 more variables: gender , homeworld , species ,\n# films , vehicles , starships \n```\n\n\n:::\n:::\n\n\n\n\n## mutate (& pipe |>)- Exercise 5 {.smaller}\n\nMutate allows you to refer to columns that you’ve just created\n\n\n\n\n::: {.cell output-location='fragment'}\n\n```{.r .cell-code}\nstarwars |>\n mutate(\n height_m = height / 100,\n BMI = mass / (height_m^2)\n ) |>\n select(name, BMI, everything())\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n# A tibble: 87 × 16\n name BMI height mass hair_color skin_color eye_color birth_year sex \n \n 1 Luke Sky… 26.0 172 77 blond fair blue 19 male \n 2 C-3PO 26.9 167 75 gold yellow 112 none \n 3 R2-D2 34.7 96 32 white, bl… red 33 none \n 4 Darth Va… 33.3 202 136 none white yellow 41.9 male \n 5 Leia Org… 21.8 150 49 brown light brown 19 fema…\n 6 Owen Lars 37.9 178 120 brown, gr… light blue 52 male \n 7 Beru Whi… 27.5 165 75 brown light blue 47 fema…\n 8 R5-D4 34.0 97 32 white, red red NA none \n 9 Biggs Da… 25.1 183 84 black light brown 24 male \n10 Obi-Wan … 23.2 182 77 auburn, w… fair blue-gray 57 male \n# ℹ 77 more rows\n# ℹ 7 more variables: gender , homeworld , species ,\n# films , vehicles , starships , height_m \n```\n\n\n:::\n:::\n\n\n\n\n---\n\nOutput needs to be saved into a new data frame since dplyr does not \"change\" the original dataframe.\n\n\n\n\n::: {.cell output-location='fragment'}\n\n```{.r .cell-code}\nstarwars_bmi <- starwars |>\n mutate(\n height_m = height / 100,\n BMI = mass / (height_m^2)\n ) |>\n select(name, BMI, everything())\n```\n:::\n\n\n\n\n---\n\n## Using `case_when()`clauses with `mutate()`. 
{.smaller}\n\nLet's create a new variable `tall_short` based on other values.\n\n\n\n\n::: {.cell output-location='fragment'}\n\n```{.r .cell-code code-line-numbers=\"3-6\"}\nstarwars |>\n mutate(\n tall_short = case_when(\n height > 160 ~ \"tall\",\n .default = \"short\"\n )\n ) |>\n select(name, tall_short, everything())\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n# A tibble: 87 × 15\n name tall_short height mass hair_color skin_color eye_color birth_year\n \n 1 Luke Skyw… tall 172 77 blond fair blue 19 \n 2 C-3PO tall 167 75 gold yellow 112 \n 3 R2-D2 short 96 32 white, bl… red 33 \n 4 Darth Vad… tall 202 136 none white yellow 41.9\n 5 Leia Orga… short 150 49 brown light brown 19 \n 6 Owen Lars tall 178 120 brown, gr… light blue 52 \n 7 Beru Whit… tall 165 75 brown light blue 47 \n 8 R5-D4 short 97 32 white, red red NA \n 9 Biggs Dar… tall 183 84 black light brown 24 \n10 Obi-Wan K… tall 182 77 auburn, w… fair blue-gray 57 \n# ℹ 77 more rows\n# ℹ 7 more variables: sex , gender , homeworld , species ,\n# films , vehicles , starships \n```\n\n\n:::\n:::\n\n\n\n\n## `group_by()` & `summarise()` - Exercise 6\n\n`group_by` creates a grouped copy of a table. \n\n* This changes the unit of analysis from the complete data set to individual groups. \n* dplyr verbs automatically detect grouped tables and calculate \"by group\". \n\n. . .\n\n```r\ngroup_by(data = ..., )\n```\n\n## group_by - Syntax\n\n* `group_by()` creates a grouped tibble. \n* This changes the unit of analysis from the complete dataset to individual groups. \n* Then, when you use the dplyr verbs on a grouped data frame they'll be automatically applied \"by group\". \n\n. . 
.\n\n```r\ngroup_by(data = ..., )\n```\n\n## group_by + summarize - Exercise 7 {.smaller}\n\n\n\n\n::: {.cell output-location='fragment'}\n\n```{.r .cell-code}\nstarwars |>\n group_by(species)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n# A tibble: 87 × 14\n# Groups: species [38]\n name height mass hair_color skin_color eye_color birth_year sex gender\n \n 1 Luke Sk… 172 77 blond fair blue 19 male mascu…\n 2 C-3PO 167 75 gold yellow 112 none mascu…\n 3 R2-D2 96 32 white, bl… red 33 none mascu…\n 4 Darth V… 202 136 none white yellow 41.9 male mascu…\n 5 Leia Or… 150 49 brown light brown 19 fema… femin…\n 6 Owen La… 178 120 brown, gr… light blue 52 male mascu…\n 7 Beru Wh… 165 75 brown light blue 47 fema… femin…\n 8 R5-D4 97 32 white, red red NA none mascu…\n 9 Biggs D… 183 84 black light brown 24 male mascu…\n10 Obi-Wan… 182 77 auburn, w… fair blue-gray 57 male mascu…\n# ℹ 77 more rows\n# ℹ 5 more variables: homeworld , species , films ,\n# vehicles , starships \n```\n\n\n:::\n:::\n\n\n\n\n## summarize - syntax\n\n* `summarize()` takes named expressions and calculates a summary based on group.\n\n. . 
.\n\n```r\nsummarize(data = ..., name = expression)\n```\n\n## Calculate a summary statistic *by species* {.smaller}\n\n\n\n\n::: {.cell output-location='fragment'}\n\n```{.r .cell-code}\nstarwars |>\n group_by(species) |>\n summarise(\n height = mean(height, na.rm = TRUE)\n )\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n# A tibble: 38 × 2\n species height\n \n 1 Aleena 79 \n 2 Besalisk 198 \n 3 Cerean 198 \n 4 Chagrian 196 \n 5 Clawdite 168 \n 6 Droid 131.\n 7 Dug 112 \n 8 Ewok 88 \n 9 Geonosian 183 \n10 Gungan 209.\n# ℹ 28 more rows\n```\n\n\n:::\n:::\n\n\n\n\n---\n\nCalucate multiple summary statistics.\n\n\n\n\n::: {.cell output-location='fragment'}\n\n```{.r .cell-code}\nstarwars |>\n group_by(species, gender) |>\n summarise(\n height = mean(height, na.rm = TRUE),\n mass = mean(mass, na.rm = TRUE)\n )\n```\n\n::: {.cell-output .cell-output-stderr}\n\n```\n`summarise()` has grouped output by 'species'. You can override using the\n`.groups` argument.\n```\n\n\n:::\n\n::: {.cell-output .cell-output-stdout}\n\n```\n# A tibble: 42 × 4\n# Groups: species [38]\n species gender height mass\n \n 1 Aleena masculine 79 15 \n 2 Besalisk masculine 198 102 \n 3 Cerean masculine 198 82 \n 4 Chagrian masculine 196 NaN \n 5 Clawdite feminine 168 55 \n 6 Droid feminine 96 NaN \n 7 Droid masculine 140 69.8\n 8 Dug masculine 112 40 \n 9 Ewok masculine 88 20 \n10 Geonosian masculine 183 80 \n# ℹ 32 more rows\n```\n\n\n:::\n:::\n", "supporting": [], "filters": [ "rmarkdown/pagebreak.lua" ], - "includes": { - "include-after-body": [ - "\n\n\n" - ] - }, + "includes": {}, "engineDependencies": {}, "preserve": {}, "postProcess": true diff --git a/exercises/_ex-11.qmd b/exercises/ex-11.qmd similarity index 100% rename from exercises/_ex-11.qmd rename to exercises/ex-11.qmd diff --git a/exercises/_ex-12.qmd b/exercises/ex-12.qmd similarity index 100% rename from exercises/_ex-12.qmd rename to exercises/ex-12.qmd diff --git a/exercises/_ex-13.qmd b/exercises/ex-13.qmd similarity 
index 100% rename from exercises/_ex-13.qmd rename to exercises/ex-13.qmd diff --git a/problem-sets/_ps-11.qmd b/problem-sets/ps-11.qmd similarity index 100% rename from problem-sets/_ps-11.qmd rename to problem-sets/ps-11.qmd diff --git a/problem-sets/_ps-12.qmd b/problem-sets/ps-12.qmd similarity index 100% rename from problem-sets/_ps-12.qmd rename to problem-sets/ps-12.qmd diff --git a/problem-sets/_ps-13.qmd b/problem-sets/ps-13.qmd similarity index 100% rename from problem-sets/_ps-13.qmd rename to problem-sets/ps-13.qmd diff --git a/slides/_slides-11.qmd b/slides/slides-11.qmd similarity index 99% rename from slides/_slides-11.qmd rename to slides/slides-11.qmd index cf0567d6..ea12a90f 100644 --- a/slides/_slides-11.qmd +++ b/slides/slides-11.qmd @@ -382,7 +382,7 @@ mean #| echo: true #| output-location: column-fragment -mean(p$body_mass_g) +mn=mean(p$body_mass_g) ``` . . . @@ -393,7 +393,7 @@ median #| echo: true #| output-location: column-fragment -median(p$body_mass_g) +md=median(p$body_mass_g) ``` ------------------------------------------------------------------------ @@ -406,8 +406,8 @@ ggplot( aes(x = body_mass_g) ) + geom_density() + - # geom_vline() + - # geom_vline() + + geom_vline(xintercept = mn,color="red") + + geom_vline(xintercept = md,color="blue") + theme_cowplot() ``` diff --git a/slides/_slides-12.qmd b/slides/slides-12.qmd similarity index 100% rename from slides/_slides-12.qmd rename to slides/slides-12.qmd diff --git a/slides/_slides-13.qmd b/slides/slides-13.qmd similarity index 100% rename from slides/_slides-13.qmd rename to slides/slides-13.qmd