diff --git a/.gitignore b/.gitignore
index 6be4156b..2991a260 100644
--- a/.gitignore
+++ b/.gitignore
@@ -5,6 +5,7 @@
# .Rprofile
.DS_Store
*.html
+*.swp
/.quarto/
_site/
diff --git a/content/bootcamp/r/class-01.qmd b/content/bootcamp/r/class-01.qmd
deleted file mode 100644
index ef983040..00000000
--- a/content/bootcamp/r/class-01.qmd
+++ /dev/null
@@ -1,328 +0,0 @@
----
-title: "class-01"
-author: "Sujatha Jagannathan"
-date: "8/24/2020"
----
-
-```{r include=FALSE}
-library(tidyverse)
-library(knitr)
-```
-
-### Contact Info
-Suja Jagannathan [sujatha.jagannathan@cuanschutz.edu](mailto:sujatha.jagannathan@cuanschutz.edu)
-
-### Office Hours
-Use https://calendly.com/molb7950 to schedule a time with a TA.
-
-
-
-### Learning Objectives for the R Bootcamp
-
-* Follow best coding practices (*class 1*)
-* Know the fundamentals of R programming (*class 1*)
-* Become familiar with "tidyverse" suite of packages
- * tidyr: "Tidy" a messy dataset (*class 2*)
- * dplyr: Transform data to derive new information (*class 3*)
- * ggplot2: Visualize and communicate results (*class 4*)
-* Practice reproducible analysis using Rmarkdown (Rigor & Reproducibility) (*classes 1-5*)
-
-### Today's class outline - *class 1*
-
-* Coding best practices
-* Review R basics
- * R vs Rstudio (Exercises #1-2)
- * Functions & Arguments (Exercises #3-4)
- * Data types (Exercise #5)
- * Data structures (Exercises #6-7)
- * R Packages (Exercise #8)
-* Review Rmarkdown (Exercise #9)
-* Rstudio cheatsheets (Exercise #10)
-
-### Coding best practices ###
-
-> "Good coding style is like correct punctuation: you can manage without it, butitsuremakesthingseasiertoread."
-> --- Hadley Wickham
-
-### File Names
-
-* File names should be meaningful and end in `.R`, `.Rmd`, etc.
-* Avoid using special characters in file names - stick with numbers, letters, `-`, and `_`.
-* *Never* include spaces in file names!
-
-```{show-code}
- ###### Good
- fit_models.R
- utility_functions.Rmd
-
- ###### Bad
- fit models.R
- tmp.r
- stuff.r
-```
-
-* If files should be run in a particular order, prefix them with numbers.
-* If it seems likely you'll have more than 10 files, left pad with zero.
-* It looks nice (constant width) and sorts nicely.
-
-```{show-code}
- 00_download.R
- 01_explore.R
- ...
- 09_model.R
- 10_visualize.R
-```
-
-* Avoid capitalizing when not necessary.
-* If you want to include dates in your file name, use the ISO 8601 standard: `YYYY-MM-DD`
-* Use delimiters intentionally! (helps you to recover metadata easily from file names)
-* For example, "_" to delimit fields; "-" to delimit words
-
-```{show-code}
-2019-02-15_class1_data-wrangling.Rmd
-```
-
-* Avoid hard coding file names and instead use relative paths.
-* `~` represents the current working directory.
-* Use `getwd()` to figure out what your working directory is.
-
-```{show-code}
-###### Good
-"~/class1/code/test.R"
-
-###### Bad
-"/Users/sjaganna/Desktop/CU-onedrive/08-teaching/molb7910/class1/data.csv"
-```
-
-### Organisation
-
-* Try to give a file a concise name that evokes its contents
-* One way to organize your files is by grouping them into `data`, `code`, `plots`, etc.
-* For example, in **this class** we often use the following structure:
-
-```{show-code}
- exercises
- - exercises-01.Rmd
- - data
- - img
- - setup
- ...
-```
-
-### Internal structure of code
-
-Use commented lines of `-` and `=` to break up your code chunk into easily readable
-segments. Or better yet, make each "action" it's own chunk and give it a name.
-
-```{show-code}
-# Load data ---------------------------
-
-# Plot data ---------------------------
-```
-
-### R Basics - Overview ###
-
-* R, Rstudio (Exercise #1)
-* R as a calculator (Exercise #2)
-* Functions and arguments (Exercises #3-4)
-* Data types: numeric, character, logical (& more) (Exercise #5)
-* Data structures: vector, list, matrix, data frame, tibbles (Exercises #6-7)
-* Package system, Rstudio, and Rmarkdown (Exercises #8-9)
-
-### R vs Rstudio - Exercise 1
-
-What is R? What is Rstudio?
-
-* R is a programming language used for statistical computing
-* RStudio is an integrated development environment (IDE) for R. It includes a console, terminal, syntax-highlighting editor that supports direct code execution, tools for plotting, history, workspace management, and much more.
-* You can use R without RStudio, but not the other way around.
-
-Let's do the following to explore Rstudio:
-
-* Look at Rstudio panels one at a time
-* Environment, History, Console, Terminal, Files, Plots, Packages, Help, etc.
-
-### R as a calculator - Exercise 2
-
-* R can function like an advanced calculator
-
-- try simple math
-```{r}
-2 + 3 * 5 # Note the order of operations.
-3 - 7 # value of 3-7
-3 / 2 # Division
-5^2 # 5 raised to the second power
-# This is a comment line
-```
-
-- assign a numeric value to an object
-```{r}
-num <- 5^2 # we just created an "object" num
-```
-
-- print the object to check
-```{r}
-num
-```
-
-- do a computation on the object
-```{r}
-num + 100
-```
-Note: Objects can be over-written. So be careful if you reuse names.
-
-### Functions and arguments - Exercise 3
-
-* Functions are fundamental building blocks of R
-* Most functions take one or more arguments and transform an input object in a specific way.
-* Tab completion is your friend!
-
-```{r}
-log
-?log
-log(4)
-log(4, base = 2)
-```
-
-### Writing a simple function - Exercise 4
-
-```{r}
-addtwo <- function(x) {
- num <- x + 2
- return(num)
-}
-
-addtwo(4)
-```
-
-```{r}
-f <- function(x, y) {
- z <- 3 * x + 4 * y
- return(z)
-}
-
-f(2, 3)
-```
-
-### Data types ###
-
-* There are many data types in R.
-* For this class, the most commonly used ones are **numeric**, **character**, and **logical**.
-* All these data types can be used to create vectors natively.
-
-### Data types - Exercise 5
-```{r}
-typeof(4) # numeric data time
-typeof("suja") # character data type
-typeof(TRUE) # logical data type
-typeof(as.character(TRUE)) # coercing one data type to another
-```
-
-### Data structures ###
-
-* R has multiple data structures.
-* Most of the time you will deal with tabular data sets, you will manipulate them, take sub-sections of them.
-* It is essential to know what are the common data structures in R and how they can be used.
-* R deals with named data structures, this means you can give names to data structures and manipulate or operate on them using those names.
-
-```{r, echo = FALSE, out.width= '100%'}
-knitr::include_graphics("img/data-structures.png")
-```
-Source: Devopedia
-
-### Tibbles
-* A __tibble__, or `tbl_df`, is a modern reimagining of the data.frame, keeping what time has proven to be effective, and throwing out what is not.
-* Tibbles are data.frames that are lazy and surly: they do less (i.e. they don't change variable names or types, and don't do partial matching) and complain more (e.g. when a variable does not exist).
-* This forces you to confront problems earlier, typically leading to cleaner, more expressive code. Tibbles also have an enhanced `print()` method which makes them easier to use with large datasets containing complex objects.
-* `tibble()` does much less than `data.frame()`:
- - it never changes the type of the inputs
- - it never changes the names of variables
- - it never creates `row.names()`
-
-Source: [tibbles chapter](http://r4ds.had.co.nz/tibbles.html) in *R for data science*.
-
-### Vectors - Exercise 6
-
-- Vectors are one of the core R data structures.
-- It is basically a list of elements of the same type (numeric,character or logical).
-- Later you will see that every column of a table will be represented as a vector.
-- R handles vectors easily and intuitively.
-- The operations on vectors will propagate to all the elements of the vectors.
-
-Create the following vectors
-```{r}
-x <- c(1, 3, 2, 10, 5) # create a vector named x with 5 components
-# `c` is for combine
-# you could use '=' but I don't recommend it.
-y <- 1:5 # create a vector of consecutive integers y
-y + 2 # scalar addition
-2 * y # scalar multiplication
-y^2 # raise each component to the second power
-2^y # raise 2 to the first through fifth power
-y # y itself has not been unchanged
-y <- y * 2 # here, y is changed
-```
-
-### Data frames - Exercise 7
-
-- A data frame is more general than a matrix, in that different columns can have different modes (numeric, character, factor, etc.).
-- A data frame can be constructed by data.frame() function.
-- For example, we illustrate how to construct a data frame from genomic intervals or coordinates.
-
-Create a dataframe `mydata`
-```{r}
-chr <- c("chr1", "chr1", "chr2", "chr2")
-strand <- c("-", "-", "+", "+")
-start <- c(200, 4000, 100, 400)
-end <- c(250, 410, 200, 450)
-
-mydata.df <- data.frame(chr, strand, start, end) # creating dataframe
-mydata.df
-
-mydata.tbl <- tibble(chr, strand, start, end) # creating a tibble
-mydata.tbl
-```
-
-### R packages - Exercise 8
-
-* An R package is a collection of code, data, documentation, and tests that is easily sharable
-* A package often has a collection of custom functions that enable you to carry out a workflow. eg. DESeq for RNA-seq analysis
-* The most popular places to get R packages from are CRAN, Bioconductor, and Github.
-* Once a package is installed, one still has to "load" them into the environment using a `library()` call.
-
-Let's do the following to explore R packages
-* Look at the "Environment" panel in Rstudio
-* Explore Global Environment
-* Explore the contents of a package
-
-### Rmarkdown Exercise - Exercise 9
-
-* Rmarkdown is a fully reproducible authoring framework to create, collaborate, and communicate your work.
-* Rmarkdown supports a number of output formats including pdfs, word documents, slide shows, html, etc.
-* An Rmarkdown document is a plain text file with the extension `.Rmd` and contains the following basic components:
- - An (optional) YAML header surrounded by ---s.
- - Chunks of R code surrounded by ```.
- - Text mixed with simple text formatting like # heading and _italics_.
-
-Let's do the following to explore Rmarkdown documents
-* Create a new .Rmd document
-* `knit` the document to see the output
-
-### Homework instructions
-
-* Today's homework is:
- 1) To go over everything we covered today and make sure you understand it. (Use office hours if you have questions) - Expected time spent: 30 min - 1 hour
- 2) Go over Rstudio and Rmarkdown cheatsheets (Finding cheatsheets: Exercise 10) - - Expected time spent: 30 min on each cheatsheet
-
-### Acknowledgements
-
-The material for this class was heavily borrowed from:
-* Introduction to R by Altuna Akalin: http://compgenomr.github.io/book/introduction-to-r.html
-* R for data science by Hadley Wickham: https://r4ds.had.co.nz/index.html
-
-### Further Reading & Resources
-
-* R for data science https://r4ds.had.co.nz/index.html
-* Advanced R by Hadley Wickam https://adv-r.hadley.nz/
-* Installing R: https://cran.r-project.org/
-* Installing RStudio: https://rstudio.com/products/rstudio/download/
diff --git a/content/bootcamp/r/class-02.qmd b/content/bootcamp/r/class-02.qmd
index 29d3d6a4..55acfc95 100644
--- a/content/bootcamp/r/class-02.qmd
+++ b/content/bootcamp/r/class-02.qmd
@@ -7,6 +7,7 @@ date: "8/25/2020"
```{r include=FALSE}
library(tidyverse)
library(knitr)
+library(here)
```
### Contact Info
@@ -49,7 +50,7 @@ Use https://calendly.com/molb7950 to schedule a time with a TA.
* 25 packages, total (as of today) - we will focus mainly on tidyr, dplyr, and ggplot2
```{r, echo = FALSE, out.width= '60%'}
-knitr::include_graphics("img/tidy1.png")
+knitr::include_graphics(here("img/tidy1.png"))
```
Source: R for Data Science by Hadley Wickham
@@ -57,15 +58,15 @@ Source: R for Data Science by Hadley Wickham
### Data import - readr - Exercise 1
```{r, echo = FALSE, out.width= '70%'}
-knitr::include_graphics("img/readr.png")
+knitr::include_graphics(here("img/readr.png"))
```
-```{r, echo = FALSE, out.width= '80%'}
-knitr::include_graphics("img/readr-args.png")
+```{r, echo = FALSE, out.width= '80%', fig.cap='Source: RStudio cheatsheets'}
+knitr::include_graphics(here("img/readr-args.png"))
```
-Source: Rstudio cheatsheets
- Let's try importing a small dataset - Exercise # 1
+
```{r}
getwd() # good to know which folder you are on since the path to file is relative
# same as `pwd` in bash
@@ -84,15 +85,16 @@ __Note__: All of these functions can also be used in an interactive manner via `
>
> --- Hadley Wickham
-```{r, echo = FALSE, out.width= '100%'}
-knitr::include_graphics("img/tidydata.png")
+```{r, echo = FALSE, out.width= '100%', fig.cap='Source: RStudio cheatsheets'}
+knitr::include_graphics(here("img/tidydata.png"))
```
-Source: Rstudio cheatsheets
### Datasets for today's class - Exercise 2
-* In this class, we will use the datasets that come with the tidyr package to explore all the functions provided by tidyr.
+* In this class, we will use the datasets that come with the tidyr package to explore all the functions provided by tidyr.
+
* Explore the contents of _tidyr_ package (Exercise #2)
+
* `table1`, `table2`, `table3`, `table4a`, `table4b`, and `table5` all display the number of TB cases documented by the World Health Organization in Afghanistan, Brazil, and China between 1999 and 2000.
### Getting familiar with the data - Exercise 3
@@ -106,7 +108,8 @@ R provides many functions to examine features of a data object
- `str()` - what is the structure of the object?
- `attributes()` - does it have any metadata?
-* Let's explore table1
+* Let's explore `table1`
+
```{r}
# View(table1) # to look at the table in Viewer
table1 # to print the table to console
@@ -183,10 +186,10 @@ There are other verbs as well - as always, look at the `tidyr` cheatsheet!
pivot_wider() "widens" data, increasing the number of columns and decreasing the number of rows.
```{r, echo = FALSE, out.width= '50%'}
-knitr::include_graphics("img/pivot_wider.png")
+knitr::include_graphics(here("img/pivot_wider.png"))
```
-```{show-code}
+```{r, eval=FALSE}
pivot_wider(
data,
names_from = name,
@@ -222,10 +225,10 @@ table2_tidy
pivot_longer() "lengthens" data, increasing the number of rows and decreasing the number of columns.
```{r, echo = FALSE, out.width= '50%'}
-knitr::include_graphics("img/pivot_longer.png")
+knitr::include_graphics(here("img/pivot_longer.png"))
```
-``` r
+```{r eval=FALSE}
pivot_longer(
data,
cols,
@@ -250,10 +253,10 @@ table4_tidy
Given either a regular expression or a vector of character positions, separate() turns a single character column into multiple columns.
```{r, echo = FALSE, out.width= '60%'}
-knitr::include_graphics("img/separate.png")
+knitr::include_graphics(here("img/separate.png"))
```
-```{show-code}
+```{r eval=FALSE}
separate(
data,
col,
@@ -285,7 +288,7 @@ table3_tidy_1
Given either a regular expression or a vector of character positions, separate() turns a single character column into multiple rows.
```{r, echo = FALSE, out.width= '50%'}
-knitr::include_graphics("img/separate_rows.png")
+knitr::include_graphics(here("img/separate_rows.png"))
```
```{show-code}
@@ -311,7 +314,7 @@ This is not a great example because in creating two rows, the case and populatio
unite() combines multiple columns into a single column.
```{r, echo = FALSE, out.width= '50%'}
-knitr::include_graphics("img/unite.png")
+knitr::include_graphics(here("img/unite.png"))
```
```{show-code}
@@ -334,14 +337,14 @@ table6_tidy
### Handling missing values
```{r, echo = FALSE, out.width= '100%'}
-knitr::include_graphics("img/missing-values.png")
+knitr::include_graphics(here("img/missing-values.png"))
```
Source: Rstudio cheatsheets
## Regular expressions
```{r, echo = FALSE, out.width= '80%'}
-knitr::include_graphics("img/regex.png")
+knitr::include_graphics(here("img/regex.png"))
```
Source: Rstudio cheatsheets
diff --git a/content/bootcamp/r/class-03.qmd b/content/bootcamp/r/class-03.qmd
index bcd531d6..300cf5c9 100644
--- a/content/bootcamp/r/class-03.qmd
+++ b/content/bootcamp/r/class-03.qmd
@@ -7,6 +7,7 @@ date: "8/26/2020"
```{r include=FALSE}
library(tidyverse)
library(knitr)
+library(here)
```
@@ -166,7 +167,7 @@ select_if(starwars, is.numeric) # select_if to return all columns with numeric v
Mutate has a LOT of variants.
```{r, echo = FALSE, out.width= '60%'}
-knitr::include_graphics("img/mutate.png")
+knitr::include_graphics(here("img/mutate.png"))
```
Source: Rstudio cheatsheets
@@ -244,7 +245,7 @@ More information here: https://dplyr.tidyverse.org/articles/rowwise.html
```{r, echo = FALSE, out.width= '60%'}
-knitr::include_graphics("img/summarise.png")
+knitr::include_graphics(here("img/summarise.png"))
```
@@ -310,7 +311,7 @@ starwars %>%
+ `union()`
```{r, echo = FALSE, out.width= '100%'}
-knitr::include_graphics("img/combining-tables.png")
+knitr::include_graphics(here("img/combining-tables.png"))
```
Source: Rstudio cheatsheets
diff --git a/content/bootcamp/r/class-04.qmd b/content/bootcamp/r/class-04.qmd
index 04b60547..db8fe0cb 100644
--- a/content/bootcamp/r/class-04.qmd
+++ b/content/bootcamp/r/class-04.qmd
@@ -12,6 +12,7 @@ library(cowplot) # to make panels of plots
library(viridis) # nice colors!
library(ggridges) # ridge plots
library(hexbin) # hexplots
+library(here)
```
### Contact Info
@@ -81,14 +82,14 @@ _coordinate-system_: specify x and y variables
_geometry_: specify type of plots - histogram, boxplot, line, density, dotplot, bar, etc.
```{r, echo = FALSE, out.width= '60%'}
-knitr::include_graphics("img/ggplot-syntax.png")
+knitr::include_graphics(here("img/ggplot-syntax.png"))
```
__aesthetics__ can map variables in the data to visual
properties of the geom (aesthetics) like size, color, and x
and y locations to make the plot more information rich.
```{r, echo = FALSE, out.width= '60%'}
-knitr::include_graphics("img/ggplot-aesthetics.png")
+knitr::include_graphics(here("img/ggplot-aesthetics.png"))
```
### Making a plot step-by-step (Exercise 2)
@@ -187,7 +188,7 @@ You can. But the advantage of ggplot is that it is equally "simple" to make basi
## *Create more complex plots*
```{r, echo = FALSE, out.width= '50%'}
-knitr::include_graphics("img/ggplot-layers.png")
+knitr::include_graphics(here("img/ggplot-layers.png"))
```
### Geom function
@@ -199,7 +200,7 @@ knitr::include_graphics("img/ggplot-layers.png")
### Geom functions for one variable - Exercise 4
```{r, echo = FALSE, out.width= '60%'}
-knitr::include_graphics("img/ggplot-1variable.png")
+knitr::include_graphics(here("img/ggplot-1variable.png"))
```
```{r fig.height=3, fig.width=5}
@@ -258,7 +259,7 @@ With two variables, depending on the nature of the data, you can have different
### discrete x, continuous y - Exercise 5
```{r, echo = FALSE, out.width= '60%'}
-knitr::include_graphics("img/ggplot-geom-dx-cy.png")
+knitr::include_graphics(here("img/ggplot-geom-dx-cy.png"))
```
```{r fig.height=3, fig.width=5}
@@ -303,7 +304,7 @@ ggplot(
### continuous x, continuous y - Exercise 6
```{r, echo = FALSE, out.width= '60%'}
-knitr::include_graphics("img/ggplot-geom-cx-cy.png")
+knitr::include_graphics(here("img/ggplot-geom-cx-cy.png"))
```
```{r fig.height=3, fig.width=4}
@@ -344,7 +345,7 @@ ggplot(
### continuous bivariate - Exercise 7
```{r, echo = FALSE, out.width= '50%'}
-knitr::include_graphics("img/ggplot-geom-cont-bivar.png")
+knitr::include_graphics(here("img/ggplot-geom-cont-bivar.png"))
```
```{r fig.height=3, fig.width=3}
@@ -365,7 +366,7 @@ ggplot(
### Geom functions for three variables - Exercise 8
```{r, echo = FALSE, out.width= '100%'}
-knitr::include_graphics("img/ggplot-geom-3variables.png")
+knitr::include_graphics(here("img/ggplot-geom-3variables.png"))
```
One example with geom_tile()
@@ -383,7 +384,7 @@ ggplot(
R has 25 built in shapes that are identified by numbers. There are some seeming duplicates: for example, 0, 15, and 22 are all squares. The difference comes from the interaction of the colour and fill aesthetics. The hollow shapes (0–14) have a border determined by colour; the solid shapes (15–18) are filled with colour; the filled shapes (21–24) have a border of colour and are filled with fill.
```{r, echo = FALSE, out.width= '80%'}
-knitr::include_graphics("img/ggplot-shapes.png")
+knitr::include_graphics(here("img/ggplot-shapes.png"))
```
```{r fig.height=3, fig.width=4}
@@ -488,7 +489,7 @@ Facets divide a plot into subplots based on the values of one or more discrete v
```{r, echo = FALSE, out.width= '60%'}
-knitr::include_graphics("img/ggplot-facet.png")
+knitr::include_graphics(here("img/ggplot-facet.png"))
```
```{r fig.height=3, fig.width=7}
@@ -523,7 +524,7 @@ ggplot(
Themes can significantly affect the appearance of your plot. Thanksfully, there are a lot to choose from.
```{r, echo = FALSE, out.width= '60%'}
-knitr::include_graphics("img/ggplot-themes.png")
+knitr::include_graphics(here("img/ggplot-themes.png"))
```
```{r fig.height=3, fig.width=4}
@@ -593,7 +594,7 @@ ggplot(
### Labels & Legends - Exercise 15
```{r, echo = FALSE, out.width= '50%'}
-knitr::include_graphics("img/ggplot-labels-legends.png")
+knitr::include_graphics(here("img/ggplot-labels-legends.png"))
```
```{r fig.height=3, fig.width=5}
@@ -658,7 +659,7 @@ More information on using plot_grid (from package `cowplot`) is [here](https://w
### Saving plots (Exercise 18)
```{r}
-ggsave("img/plot_final.png", width = 5, height = 5)
+ggsave(here("img/plot_final.png"), width = 5, height = 5)
# Saves last plot as 5’ x 5’ file named "plot_final.png" in working directory. Matches file type to file extension
```