Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gridmet #74

Merged
merged 5 commits into from
Aug 1, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 2 additions & 3 deletions DESCRIPTION
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
Package: appc
Title: Air Pollution Predictor Commons
Version: 0.2.0
Version: 0.3.0
Authors@R: c(
person("Cole", "Brokamp", , "[email protected]", role = c("aut", "cre"),
comment = c(ORCID = "0000-0002-0289-3151")),
Expand All @@ -16,7 +16,6 @@ Imports:
cli,
dplyr,
fs,
furrr,
glue,
grf,
httr,
Expand All @@ -29,7 +28,7 @@ Imports:
tibble,
tidyr,
withr
RoxygenNote: 7.2.3
RoxygenNote: 7.3.1
Suggests:
curl,
dotenv,
Expand Down
2 changes: 2 additions & 0 deletions NAMESPACE
Original file line number Diff line number Diff line change
Expand Up @@ -6,10 +6,12 @@ export(create_daily_merra_data)
export(get_closest_year)
export(get_daily_aqs)
export(get_elevation_summary)
export(get_gridmet_data)
export(get_hms_smoke_data)
export(get_merra_data)
export(get_narr_data)
export(install_elevation_data)
export(install_gridmet_data)
export(install_hms_smoke_data)
export(install_merra_data)
export(install_narr_data)
Expand Down
34 changes: 24 additions & 10 deletions R/assemble.R
Original file line number Diff line number Diff line change
Expand Up @@ -16,7 +16,7 @@
assemble_predictors <- function(x, dates, pollutant = c("pm25")) {
check_s2_dates(x, dates)
d <- tibble::tibble(s2 = x, dates = dates)
cli::cli_progress_step("checking that s2 locations are within the contiguous united states")
cli::cli_progress_step("checking that s2 are within the contiguous US")
contig_us_flag <- s2::s2_intersects(s2::as_s2_geography(s2::s2_cell_to_lnglat(d$s2)), s2::as_s2_geography(contiguous_us))
if (!all(contig_us_flag)) stop("not all s2 locations are within the contiguous united states", call. = FALSE)

Expand All @@ -40,16 +40,29 @@ assemble_predictors <- function(x, dates, pollutant = c("pm25")) {

cli::cli_progress_step("adding HMS smoke data")
d$plume_smoke <- get_hms_smoke_data(x = d$s2, dates = d$dates)

cli::cli_progress_step("adding NARR")
my_narr <- purrr::partial(get_narr_data, x = d$s2, dates = d$dates)
d$air.2m <- my_narr("air.2m")
## d$air.2m <- my_narr("air.2m")
d$hpbl <- my_narr("hpbl")
d$acpcp <- my_narr("acpcp")
d$rhum.2m <- my_narr("rhum.2m")
d$vis <- my_narr("vis")
d$pres.sfc <- my_narr("pres.sfc")
d$uwnd.10m <- my_narr("uwnd.10m")
d$vwnd.10m <- my_narr("vwnd.10m")
## d$acpcp <- my_narr("acpcp")
## d$rhum.2m <- my_narr("rhum.2m")
## d$vis <- my_narr("vis")
## d$pres.sfc <- my_narr("pres.sfc")
## d$uwnd.10m <- my_narr("uwnd.10m")
## d$vwnd.10m <- my_narr("vwnd.10m")

cli::cli_progress_step("adding gridMET")
my_gridmet <- purrr::partial(get_gridmet_data, x = d$s2, dates = d$dates)
d$temperature_max <- my_gridmet("tmmx")
d$temperature_min <- my_gridmet("tmmn")
d$precipitation <- my_gridmet("pr")
d$solar_radiation <- my_gridmet("srad")
d$wind_speed <- my_gridmet("vs")
d$wind_direction <- my_gridmet("th")
d$specific_humidity <- my_gridmet("sph")
## d$relative_humidity_max <- my_gridmet("rmax")
## d$relative_humidity_min <- my_gridmet("rmin")

cli::cli_progress_step("adding MERRA")
d$merra <- get_merra_data(d$s2, d$dates)
Expand Down Expand Up @@ -81,8 +94,9 @@ assemble_predictors <- function(x, dates, pollutant = c("pm25")) {
d <-
d |>
tidyr::unnest(cols = c(
dates, air.2m, hpbl, acpcp,
rhum.2m, vis, pres.sfc, uwnd.10m, vwnd.10m,
dates, hpbl,
temperature_max, temperature_min, precipitation, solar_radiation, wind_speed, wind_direction, specific_humidity,
## air.2m, acpcp, rhum.2m, vis, pres.sfc, uwnd.10m, vwnd.10m,
merra_pm25,
merra_dust, merra_oc, merra_bc, merra_ss, merra_so4,
## urban_imperviousness,
Expand Down
90 changes: 90 additions & 0 deletions R/gridmet.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,90 @@
#' Get gridMET surface meteorological data
#'
#' Daily, high spatial resolution (~4-km) data comes from the [Climatology Lab](https://www.climatologylab.org/gridmet.html)
#' and is available for the contiguous US from 1979-yesterday.
#' @param x a vector of s2 cell identifers (`s2_cell` object)
#' @param dates a list of date vectors for the NARR data, must be the same length as `x`
#' @param gridmet_var a character string that is the name of a gridMET variable
#' @param gridmet_year a character string that is the year for the gridMET data; see details
#' @details GRIDMET data comes as 1/24th degree gridded data, which is about 4 sq km resolution. s2 geohashes
#' are intersected with this grid for matching with daily weather values.
#'
#' gridMET variables are named:
#' ```
#' gridmet_variable <- c(
#' temperature_max = "tmmx",
#' temperature_min = "tmmn",
#' precipitation = "pr",
#' solar_radiation = "srad",
#' wind_speed = "vs",
#' wind_direction = "th",
#' relative_humidity_max = "rmax",
#' relative_humidity_min = "rmin"
#' specific_humidity = "sph"
#' )
#' ```
#' @return for `get_gridmet_data()`, a list of numeric vectors of gridMET values (the same length as `x` and `dates`)
#' @references <https://www.climatologylab.org/gridmet.html>
#' @references <https://www.northwestknowledge.net/metdata/data/>
#' @export
#' @examples
#' d <- list(
#' "8841b39a7c46e25f" = as.Date(c("2023-05-18", "2023-11-06")),
#' "8841a45555555555" = as.Date(c("2023-06-22", "2023-08-15"))
#' )
#' get_gridmet_data(x = s2::as_s2_cell(names(d)), dates = d, gridmet_var = "tmmx")
#' get_gridmet_data(x = s2::as_s2_cell(names(d)), dates = d, gridmet_var = "pr")
get_gridmet_data <- function(x, dates, gridmet_var = c("tmmx", "tmmn", "pr", "srad", "vs", "th", "rmax", "rmin", "sph")) {
check_s2_dates(x, dates)
gridmet_var <- rlang::arg_match(gridmet_var)
gridmet_years <-
dates |>
unlist() |>
as.Date(origin = "1970-01-01") |>
sort() |>
unique() |>
format("%Y") |>
unique()
gridmet_raster <-
gridmet_years |>
purrr::map_chr(\(.) install_gridmet_data(gridmet_var = gridmet_var, gridmet_year = .)) |>
purrr::map(terra::rast) |>
purrr::map2(gridmet_years, \(.x, .y) {
stats::setNames(
.x,
seq(
as.Date(glue::glue("{.y}-01-01")),
as.Date(glue::glue("{.y}-12-31")),
by = 1
)
)
}) |>
purrr::reduce(c)
x_vect <-
s2::s2_cell_to_lnglat(x) |>
as.data.frame() |>
terra::vect(geom = c("x", "y"), crs = "+proj=longlat +datum=WGS84") |>
terra::project(gridmet_raster)
gridmet_cells <- terra::cells(gridmet_raster[[1]], x_vect)[, "cell"]
xx <- as.data.frame(t(terra::extract(gridmet_raster, gridmet_cells)))
purrr::map2(1:ncol(xx), dates, \(.x, .y) xx[as.character(.y), .x]) |>
stats::setNames(as.character(x))
}

#' Installs gridMET raster data into user's data directory for the `appc` package
#' @return for `install_gridmet_data()`, a character string path to gridMET raster data
#' @export
#' @rdname get_gridmet_data
install_gridmet_data <- function(gridmet_var = c("tmmx", "tmmn", "pr", "srad", "vs", "th", "rmax", "rmin", "sph"),
gridmet_year = as.character(1979:format(Sys.Date(), "%Y"))) {
gridmet_var <- rlang::arg_match(gridmet_var)
gridmet_year <- rlang::arg_match(gridmet_year)
dest_file <- fs::path(tools::R_user_dir("appc", "data"), glue::glue("gridmet_{gridmet_var}_{gridmet_year}.nc"))
if (file.exists(dest_file)) {
return(dest_file)
}
message(glue::glue("downloading {gridmet_year} {gridmet_var}:"))
glue::glue("https://www.northwestknowledge.net/metdata/data/{gridmet_var}_{gridmet_year}.nc") |>
utils::download.file(destfile = dest_file, mode = "wb")
return(dest_file)
}
4 changes: 3 additions & 1 deletion R/helper.R
Original file line number Diff line number Diff line change
Expand Up @@ -20,7 +20,9 @@ utils::globalVariables(c(
"vis", "pres.sfc", "uwnd.10m", "vwnd.10m",
"merra_pm25",
"plume_smoke", ".rowid",
"predictions", "variance.estimates"
"predictions", "variance.estimates",
"precipitation", "solar_radiation", "specific_humidity",
"temperature_max", "temperature_min", "wind_direction", "wind_speed"
))

#' Get the closest years to a vector of dates
Expand Down
11 changes: 6 additions & 5 deletions R/hms_smoke.R
Original file line number Diff line number Diff line change
Expand Up @@ -3,8 +3,7 @@
#' The HMS operates daily in near real-time by outlining the smoke polygon of each distinct smoke plume
#' and classifying it as "light", "medium", and "heavy". Since multiple plumes of varying or the same classification
#' can cover one another, the total smoke plume exposure is estimated as the weighted sum of all plumes, where
#' "light" = 1, "medium" = 2, and "heavy" = 3. Parallel processing via the `furrr` package can be used by
#' setting the `future::plan()` evaluation strategy beforehand.
#' "light" = 1, "medium" = 2, and "heavy" = 3.
#' @param x a vector of s2 cell identifers (`s2_cell` object); currently required to be within the contiguous united states
#' @param dates a list of date vectors for the predictions, must be the same length as `x`
#' @return for `get_hms_smoke_data()`, a list of numeric vectors of smoke plume scores (the same length as `x` and `dates`)
Expand All @@ -25,30 +24,32 @@ get_hms_smoke_data <- function(x, dates) {
date_smoke_geoms <- purrr::map(dates, \(.) d_smoke[as.character(.)])
withr::with_options(list(sf_use_s2 = FALSE, future.rng.onMisuse = "ignore"), {
out <-
furrr::future_map(seq_along(x), \(i) {
purrr::map(seq_along(x), \(i) {
purrr::map(date_smoke_geoms[[i]], \(.) sf::st_join(sf::st_as_sf(s2::s2_cell_to_lnglat(x[[i]])), .)) |>
suppressMessages() |>
purrr::map("Density") |>
purrr::map(\(.) as.numeric(factor(., levels = c("Light", "Medium", "Heavy")))) |>
purrr::map_dbl(sum, na.rm = TRUE) |>
as.numeric() |>
suppressWarnings()
}, .progress = TRUE)
}, .progress = "intersecting smoke data")
})
return(out)
}

#' installs HMS smoke data into user's data directory for the `appc` package
#' @return for `install_hms_smoke_data()`, a character string path to the installed RDS file
#' @rdname get_hms_smoke_data
#' @details this installs smoke data created using code from version 0.2.0 of the package;
#' version 0.3.0 of the package did not change smoke data code
#' @export
install_hms_smoke_data <- function() {
dest_file <- fs::path(tools::R_user_dir("appc", "data"), "hms_smoke.rds")
if (file.exists(dest_file)) {
return(as.character(dest_file))
}
if (!install_source_preference()) {
install_released_data(released_data_name = "hms_smoke.rds")
install_released_data(released_data_name = "hms_smoke.rds", package_version = "0.2.0")
return(as.character(dest_file))
}
smoke_days <- seq(as.Date("2017-01-01"), as.Date("2023-12-31"), by = 1)
Expand Down
8 changes: 5 additions & 3 deletions R/merra.R
Original file line number Diff line number Diff line change
Expand Up @@ -7,14 +7,14 @@
#' permissions for GES DISC](https://disc.gsfc.nasa.gov/information/documents?title=Data%20Access) is required.
#' The `EARTHDATA_USERNAME` and `EARTHDATA_PASSWORD` must be set. If
#' a `.env` file is present, environment variables will be loaded
#' using the {dotenv} package.
#' using the dotenv package.
#' - Installed data are filtered to a
#' [bounding box](http://bboxfinder.com/#24.766785,-126.474609,49.894634,-66.445313)
#' around the contiguous US, averaged to daily values, and
#' converted to micrograms per cubic meter ($ug/m^3$).
#' - Total surface PM2.5 mass is calculated according to
#' the formula in <https://gmao.gsfc.nasa.gov/reanalysis/MERRA-2/FAQ/#Q4>
#' Set a proxy to be used by all {httr} calls in the merra functions with `httr::set_config(httr::use_proxy( ... ))`; e.g.
#' Set a proxy to be used by all httr calls in the merra functions with `httr::set_config(httr::use_proxy( ... ))`; e.g.
#' `httr::set_config(httr::use_proxy("http://bmiproxyp.chmcres.cchmc.org", 80, Sys.getenv("CCHMC_USERNAME"), Sys.getenv("CCHMC_PASSWORD")))`
#' @param x a vector of s2 cell identifers (`s2_cell` object)
#' @param dates a list of date vectors for the MERRA data, must be the same length as `x`
Expand Down Expand Up @@ -71,6 +71,8 @@ get_merra_data <- function(x, dates) {
#' `install_merra_data()` installs MERRA PM2.5 data into
#' user's data directory for the `appc` package
#' @return for `install_merra_data()`, a character string path to the merra data
#' @details this installs merra data created using code from version 0.2.0 of the package;
#' version 0.3.0 of the package did not change merra data code
#' @export
#' @rdname get_merra_data
install_merra_data <- function(merra_year = as.character(2016:2023)) {
Expand All @@ -83,7 +85,7 @@ install_merra_data <- function(merra_year = as.character(2016:2023)) {
return(as.character(dest_file))
}
if (!install_source_preference()) {
install_released_data(released_data_name = glue::glue("merra_{merra_year}.rds"))
install_released_data(released_data_name = glue::glue("merra_{merra_year}.rds"), package_version = "0.2.0")
return(as.character(dest_file))
}
date_seq <- seq(as.Date(paste(c(merra_year, "01", "01"), collapse = "-")),
Expand Down
4 changes: 2 additions & 2 deletions README.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -70,12 +70,12 @@ Spatiotemporal geomarkers are used for predicting air pollution concentrations,

| geomarker | appc function |
|-----------|---------------|
| 🌦 weather & atmospheric conditions | `get_narr_data()` |
| 🌦 weather & atmospheric conditions | `get_gridmet_data`, `get_narr_data()` |
| 🛰 satellite-based aerosol diagnostics | `get_merra_data()` |
| 🔥 wildfire smoke | `get_hms_smoke_data()` |
| 🗻 elevation | `get_elevation_summary()` |

Currently, `get_urban_imperviousness()`, `get_traffic()`, and `get_nei_point_summary()` are stashed in the `/inst` folder and not integrated into this package.
Currently, `get_urban_imperviousness()`, `get_traffic()`, and `get_nei_point_summary()` are stashed in the `/inst` folder and are not integrated into this package.

## Developing

Expand Down
43 changes: 23 additions & 20 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -48,42 +48,45 @@ appc::predict_pm25(
dates = list(as.Date(c("2023-05-18", "2023-11-06")), as.Date(c("2023-06-22", "2023-08-15")))
)
#> ℹ (down)loading random forest model
#> ✔ (down)loading random forest model [8.2s]
#> ✔ (down)loading random forest model [9.2s]
#>
#> ℹ checking that s2 locations are within the contiguous united states
#> ✔ checking that s2 locations are within the contiguous united states [55ms]
#> ℹ checking that s2 are within the contiguous US
#> ✔ checking that s2 are within the contiguous US [60ms]
#>
#> ℹ adding coordinates
#> ✔ adding coordinates [1.3s]
#> ✔ adding coordinates [1.6s]
#>
#> ℹ adding elevation
#> ✔ adding elevation [1.3s]
#> ✔ adding elevation [1.4s]
#>
#> ℹ adding HMS smoke data
#> ✔ adding HMS smoke data [967ms]
#> ✔ adding HMS smoke data [889ms]
#>
#> ℹ adding NARR
#> ✔ adding NARR [3.1s]
#> ✔ adding NARR [483ms]
#>
#> ℹ adding gridMET
#> ✔ adding gridMET [443ms]
#>
#> ℹ adding MERRA
#> ✔ adding MERRA [569ms]
#> ✔ adding MERRA [558ms]
#>
#> ℹ adding time components
#> ✔ adding time components [24ms]
#> ✔ adding time components [23ms]
#>
#> [[1]]
#> # A tibble: 2 × 2
#> pm25 pm25_se
#> <dbl> <dbl>
#> 1 7.95 0.917
#> 2 9.32 0.814
#> 1 8.16 1.11
#> 2 9.28 0.806
#>
#> [[2]]
#> # A tibble: 2 × 2
#> pm25 pm25_se
#> <dbl> <dbl>
#> 1 5.82 0.685
#> 2 7.68 0.765
#> 1 5.13 0.381
#> 2 5.95 0.466
```

Installed geomarker data sources and the grf model are hosted as release
Expand Down Expand Up @@ -119,15 +122,15 @@ Spatiotemporal geomarkers are used for predicting air pollution
concentrations, but also serve as exposures or confounding exposures
themselves. View information and options about each geomarker:

| geomarker | appc function |
|---------------------------------------|---------------------------|
| 🌦 weather & atmospheric conditions | `get_narr_data()` |
| 🛰 satellite-based aerosol diagnostics | `get_merra_data()` |
| 🔥 wildfire smoke | `get_hms_smoke_data()` |
| 🗻 elevation | `get_elevation_summary()` |
| geomarker | appc function |
|----|----|
| 🌦 weather & atmospheric conditions | `get_gridmet_data`, `get_narr_data()` |
| 🛰 satellite-based aerosol diagnostics | `get_merra_data()` |
| 🔥 wildfire smoke | `get_hms_smoke_data()` |
| 🗻 elevation | `get_elevation_summary()` |

Currently, `get_urban_imperviousness()`, `get_traffic()`, and
`get_nei_point_summary()` are stashed in the `/inst` folder and not
`get_nei_point_summary()` are stashed in the `/inst` folder and are not
integrated into this package.

## Developing
Expand Down
1 change: 1 addition & 0 deletions _pkgdown.yml
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,7 @@ reference:
- title: Geomarker assessment
contents:
- get_elevation_summary
- get_gridmet_data
- get_narr_data
- get_merra_data
- get_hms_smoke_data
Expand Down
4 changes: 0 additions & 4 deletions inst/make_training_data.R
Original file line number Diff line number Diff line change
@@ -1,10 +1,6 @@
library(dplyr, warn.conflicts = FALSE)
library(purrr)

future::plan("multicore")
cli::cli_alert_info("using `future::plan()`:")
future::plan()

# load development version if developing (instead of currently installed version)
if (file.exists("./inst")) {
devtools::load_all()
Expand Down
Loading
Loading