Merge pull request #155 from nmfs-fish-tools/joss-review

Joss review
nmfs-fish-tools · Sep 6, 2023 · 2a25b93 · 2a25b93
2 parents 47445d9 + 82da678
commit 2a25b93
Show file tree

Hide file tree

Showing 21 changed files with 739 additions and 55 deletions.
diff --git a/.devcontainer/devcontainer.json b/.devcontainer/devcontainer.json
@@ -0,0 +1,22 @@
+// For format details, see https://aka.ms/devcontainer.json. For config options, see the
+// README at: https://github.com/rocker-org/devcontainer-templates/tree/main/src/r-ver
+{
+	"name": "R (rocker/r-ver base)",
+	// Or use a Dockerfile or Docker Compose file. More info: https://containers.dev/guide/dockerfile
+	"image": "ghcr.io/rocker-org/devcontainer/r-ver:4.3",
+
+	// Features to add to the dev container. More info: https://containers.dev/features.
+	// "features": {},
+
+	// Use 'forwardPorts' to make a list of ports inside the container available locally.
+	// "forwardPorts": [],
+
+	// Use 'postCreateCommand' to run commands after the container is created.
+	// "postCreateCommand": "R -q -e 'renv::install()'",
+
+	// Configure tool-specific properties.
+	// "customizations": {},
+
+	// Uncomment to connect as root instead. More info: https://aka.ms/dev-containers-non-root.
+	// "remoteUser": "root"
+}
diff --git a/DESCRIPTION b/DESCRIPTION
@@ -1,7 +1,7 @@
 Type: Package
 Package: SSMSE
 Title: Management Strategy Evaluation (MSE) using Stock Synthesis (SS)
-Version: 0.2.6
+Version: 0.2.7
 Authors@R: c(
     person("Kathryn", "Doering", , "[email protected]", role = c("aut", "cre")),
     person("Nathan", "Vaughan", , "[email protected]", role = "aut")
@@ -28,11 +28,11 @@ Imports:
     ss3sim,
     stats,
     tidyr,
-    utils
-Suggests: 
+    utils,
     doParallel,
     foreach,
-    parallel,
+    parallel
+Suggests:
     testthat
 Remotes:
     r4ss/r4ss@main,

diff --git a/JOSS/paper.bib b/JOSS/paper.bib
@@ -295,7 +295,7 @@ @article{a4amsecite
 
 	year = {2017},
 
-	author = {Gamito, Jardim JE and Scott, F and Mosqueira, Sanchez I and Citores, L and Devine, J and Fischer, S and Ibaibarriaga, L and Mannini, A and Millar, C and Miller, D and Minto, C and De Oliveira, J and Osio, GC and Urtizberea, A and Vasilakopoulos, P and Kell, LT},
+	author = {Jardim, JE and Scott, F and Mosqueira, I and Citores, L and Devine, J and Fischer, S and Ibaibarriaga, L and Mannini, A and Millar, C and Miller, D and Minto, C and De Oliveira, J and Osio, GC and Urtizberea, A and Vasilakopoulos, P and Kell, LT},
 
 	isbn = {978-92-79-71290-6},
 

diff --git a/JOSS/paper.md b/JOSS/paper.md
@@ -66,13 +66,11 @@ Management Strategy Evaluation (MSE) is a decision-support tool for fisheries
 management. MSE uses closed-loop simulation to evaluate the long-term
 performance of management strategies with respect to societal goals like
 sustainability and profits [@smith1994; @punt2014 \; \autoref{fig:MSE-diagram}].
-Management strategies are pre-defined decision rules that dynamically adjust
+Management strategies are pre-defined decision rules that can dynamically adjust
 management advice given an estimate of population status. In addition to
-specifying management actions, management strategies may include the processes
-of stock assessment (i.e., using models to determine the size and status of a
-population) [@sainsburyetal2000].
+specifying management actions, management strategies may specify how a stock assessment model is configured to determine the size and status of a population [@sainsburyetal2000].
 
-Within MSE simulations, operating models (OMs) represent the “true” dynamics and
+Within MSE simulations, operating models (OMs) represent the hypothesized dynamics and
 relevant complexity of the system. Multiple OMs are typically generated for a
 single MSE to reflect different uncertainties and assess management performance
 under uncertainty. Developing suitable OMs requires an analyst to, at a
@@ -111,12 +109,15 @@ adaptable to new analyses and populations.
 
 Existing generalized MSE tools [e.g., openMSE, @openMSEcite; FLR’s mse R
 package, @a4amsecite] have been built around custom OMs developed for use in
-each package. These applications provide the benefits of a generalized MSE
-codebase, but offer limited capacity to use existing stock assessment products
-created using SS3 for OM development. These tools do support importing
+each package. These tools provide the benefits of a generalized MSE
+codebase and a wide range of built-in procedures and estimation model options that
+are able to answer a variety of questions. However, existing 
+generalized MSE tools offer limited capacity to use existing stock assessment 
+products directly as OMs. These tools do support importing
 specifications from stock assessment model files such as SS3, but converting SS3
 models to a different model format often results in some loss of model
-structure. Additionally, it can be time consuming for the analyst to learn a
+structure. For complex populations, loosing model structure may not 
+represent the population well. Additionally, it can be time consuming for the analyst to learn a
 different model format.
 
 The primary goal of the SSMSE project was to develop a tool that can use
@@ -179,21 +180,21 @@ SSMSE:
    functional model structures in the estimation model and observational noise
    in data resulting in poor estimation of model parameter values (even if the 
    assessment is correctly specified outside of those estimated parameters). Users can
-   adjust errors in assessments by specifying different fixed values and
-   structures in different scenarios and by changing the sampling scheme through
-   the `sample_struct_list` input to `run_SSMSE()` to adjust observation
-   uncertainty.
+   adjust errors in assessments by 1) specifying different fixed values and
+   structures in different scenarios by directly changing the model files; and 2) 
+   by changing the sampling scheme through the `sample_struct_list` input to
+    `run_SSMSE()` to adjust observation uncertainty.
 5. Implementation uncertainty happens because it is difficult to perfectly
    implement a theoretical management strategy. For example, fishing may
    continue to occur after the theoretical catch limit is caught because there
    is a time lag in reporting and the catch limit is exceeded before fishing can
    be stopped. Implementation uncertainty (also known as implementation error)
-   can be specified in the `future_om_list` input to `run_SSMSE()`.
+   can be added by specifiying it in the `future_om_list` input to `run_SSMSE()`.
 
 The source code for SSMSE is available at
 [https://github.com/nmfs-fish-tools/SSMSE](https://github.com/nmfs-fish-tools/SSMSE).
 A [user
-manual](https://nmfs-fish-tools.github.io/SSMSE/manual/index.html) provides more details on how to use the SSMSE tool. SSMSE can be installed from
+manual](https://nmfs-fish-tools.github.io/SSMSE/manual) provides more details on how to use the SSMSE tool. SSMSE can be installed from
 the R console using the `remotes` package:
 
 ```{r}
@@ -222,7 +223,7 @@ distinct management strategies. We used a cod-like species as the population and
 one fishing fleet and one survey in both the operating and estimation models.
 
 Because the pattern of natural mortality is uncertain, we built three OMs, each
-reflecting a different hypothesis of the “true” natural mortality dynamics of
+reflecting a different hypothesis of the natural mortality dynamics of
 the stock: 1) constant instantaneous natural mortality at 0.2 $yr^{-1}$ (per year);
 2) natural mortality at 0.2 $yr^{-1}$ with a spike in natural mortality of
 0.3 $yr^{-1}$ every 5 years; and 3) natural mortality at 0.2
@@ -251,7 +252,7 @@ harvest controls specified by the user in the estimation model forecast file.
 Two management strategies with alternative target harvest rates corresponding to
 a Spawning Potential Ratio (SPR) of 30% or 45% ($SPR_{30}$ and
 $SPR_{45}$, respectively) were used. The estimation model assumed constant
-natural mortality of 0.2 $yr^{-1}$ (i.e., matching true base natural
+natural mortality of 0.2 $yr^{-1}$ (i.e., matching the hypothesized base natural
 mortality but not accounting for episodic spikes in natural mortality included in some OMs). 
 
 The forecasting module of the SS3 estimation model estimated the management
@@ -282,7 +283,9 @@ by extracting point estimates of catch from the first 10 years of the
 projection, averaging for each iteration across years, and plotting.
 
 The R code used to set up this simulation is available at
-[https://nmfs-fish-tools.github.io/SSMSE/manual/M-case-study-ex.html](https://nmfs-fish-tools.github.io/SSMSE/manual/M-case-study-ex.html).
+[https://nmfs-fish-tools.github.io/SSMSE/manual/M-case-study-ex.html](https://nmfs-fish-tools.github.io/SSMSE/manual/M-case-study-ex.html). 
+The complete simulation may take hours or days to run, so we recommend reducing the 
+number of iterations if running for illustrative purposes.
 
 Iterations were excluded if any runs of the estimation model failed to converge,
 had a high maximum gradient (>2), or had parameters on bounds. This resulted in
@@ -371,7 +374,7 @@ Sample $n$ years of data | $n = 5$ | No | No
 # Figures
 
 ![The main components of MSE simulations. The operating model (OM) represents
-the “truth”. From the OM, data can be sampled (in sample data step) and passed
+the hypothesized dynamics. From the OM, data can be sampled (in sample data step) and passed
 to the management strategy. The management strategy is run and usually
 influences the OM (e.g., the management strategy may remove a certain
 amount of catch from the OM) as the OM is stepped forward in time. The
@@ -388,10 +391,11 @@ multiple iterations and/or scenarios could be called through
 ![Diagram illustrating a basic workflow for using SSMSE. This diagram shows the
 functions (ovals) in addition to input and output objects (rounded rectangles)
 and the steps for which users will write their own code (rectangle enclosed by
-dashed line).\label{fig:SSMSE-workflow}](images/SSMSE-workflow.png)
+dashed line). Note that the helper functions `create_sample_struct()` and `create_future_om_list()` may be used to assemble components of the "user inputs." \label{fig:SSMSE-workflow}](images/SSMSE-workflow.png)
 
-![Natural mortality patterns in the OMs through the simulation years (years
-101-150).\label{fig:case-study-M}](images/case-study-M.png)
+![Natural mortality patterns in the case study OMs through the simulation years (years
+101-150). The EMs assumed constant natural mortality equivalent to the pattern
+labeled "none." \label{fig:case-study-M}](images/case-study-M.png)
 
 ![Performance metrics from the case study. Each plot shows a different performance metric. Each violin
 represents the distribution of the metric from a different

diff --git a/NAMESPACE b/NAMESPACE
@@ -1,11 +1,18 @@
 # Generated by roxygen2: do not edit by hand
 
 export(SSMSE_summary_all)
+export(check_convergence)
 export(create_future_om_list)
 export(create_sample_struct)
 export(create_scen_list)
 export(develop_OMs)
+export(get_SSB_avg)
+export(get_avg_catch)
 export(get_bin)
+export(get_catch_cv)
+export(get_catch_sd)
+export(get_rel_SSB_avg)
+export(get_total_catch)
 export(parse_MS)
 export(plot_comp_sampling)
 export(plot_index_sampling)
@@ -14,7 +21,6 @@ export(run_SSMSE)
 export(run_ss_model)
 export(set_MSE_seeds)
 import(ggplot2)
-import(parallel)
 import(r4ss)
 importFrom(assertive.base,assert_is_identical_to_true)
 importFrom(magrittr,"%>%")

diff --git a/R/example_perf_metrics.R b/R/example_perf_metrics.R
@@ -0,0 +1,187 @@
+#' Example Performance Metric: Calculate total catch over a range of years
+#'
+#' Example performance metric that calculates total catch over a range of years 
+#'  in a Stock Synthesis data file. This function aggregates catch across
+#'  fleets, so may not be appropriate for models with multiple fleets.
+#' @param datfile Path to the Stock Synthesis data file containing catch
+#' @param yrs A vector containing a range of years. Years are as defined in the
+#'  Stock Synthesis data file.
+#' @returns The total catch, a number.
+#' @export
+#' @examples \dontrun{
+#'   total_catch <- get_total_catch(datfile = "ss3model/dat.ss", yrs = 25:100)
+#'   total_catch
+#' }
+get_total_catch <- function(datfile, yrs) {
+   dat <- r4ss::SS_readdat(datfile, verbose = FALSE)
+   catch_sum <- sum(dat$catch[dat$catch$year %in% yrs, "catch"])
+}
+
+#' Example Performance Metric: Calculate average catch over a range of years
+#'
+#' Example performance metric that calculates average catch over a range of
+#'  years in a Stock Synthesis data file. This function aggregates across
+#'  fleets, so may not be appropriate for models with multiple fleets.
+#' @inheritParams get_total_catch
+#' @returns The average catch, a number.
+#' @export
+#' @examples
+#' \dontrun{
+#' avg_catch <- function(datfile = "ss3model/dat.ss", yrs = 30:75)
+#' avg_catch
+#' }
+get_avg_catch <- function(datfile, yrs) {
+  dat <- r4ss::SS_readdat(datfile, verbose = FALSE)
+  catch_avg <- mean(dat$catch[dat$catch$year %in% yrs, "catch"])
+}
+
+#' Example Performance Metric: Calculate Standard Deviation of Catch
+#'
+#' Example performance metric that calculates the standard deviation of catch 
+#'  over a range of years in a Stock Synthesis data file. This function 
+#'  aggregates across fleets, so may not be appropriate for models with multiple
+#'  fleets.
+#' @inheritParams get_total_catch
+#' @returns The catch standard deviation, a number.
+#' @export
+#' @examples 
+#' \dontrun{
+#'   catch_sd <- get_catch_sd(datfile = "mod/dat.ss", yrs = c(20:50, 75:100))
+#'   catch_sd
+#' }
+get_catch_sd <- function(datfile, yrs) {
+  dat <- r4ss::SS_readdat(datfile, verbose = FALSE)
+  catch_var <- sd(dat$catch[dat$catch$year %in% yrs, "catch"])
+}
+
+#' Example Performance Metric: Calculate the coefficient of variation of catch
+#'
+#' Example performance metric that calculates the coefficient of variation (CV)
+#'  of catch over a range of years in a Stock Synthesis data file. This function 
+#'  aggregates across fleets, so may not be appropriate for models with multiple
+#'  fleets.
+#' @inheritParams get_total_catch
+#' @returns The catch coefficient of variation, a number.
+#' @export
+#' @examples
+#' \dontrun{
+#'   catch_cv <- get_catch_cv(datfile = "mod/dat.ss", yrs = c(20:50, 75:100))
+#'   catch_cv
+#' }
+get_catch_cv <- function(datfile, yrs) {
+  dat <- r4ss::SS_readdat(datfile, verbose = FALSE)
+  catch <- dat$catch[dat$catch$year %in% yrs, "catch"]
+  catch_var <- sd(catch)/mean(catch)
+}
+
+#' Example Performance Metric: calculate the average SSB over a range of years
+#'  for each iteration
+#' 
+#' Example performance metric that calculates the average Spawning Stock Biomass
+#'  SSB (units as in the simulations) over a range of years for each iteration of
+#'  each scenario in the SSMSE simulation run.
+#' @param summary Summary returned from running `SSMSE_summary_all()`
+#' @param min_year The first year to include in the average
+#' @param max_year The last year to include in the average
+#' @export
+#' @importFrom magrittr %>%
+#' @examples
+#' \dontrun{
+#'    avg_ssb <- get_SSB_avg(run_SSMSE_summary, min_yr = 10, max_yr = 105)
+#'    avg_ssb
+#' }
+#' @returns A tibble containing the average SSB by iteration and scenario.
+get_SSB_avg <- function(summary, min_yr, max_yr) {
+  OM_vals <- unique(summary$ts$model_run)
+  OM_vals <- grep("_OM$", OM_vals, value = TRUE)
+  SSB_yr <- summary$ts %>% 
+              dplyr::filter(year >= min_yr & year <= max_yr) %>% 
+              dplyr::filter(model_run %in% OM_vals) %>% 
+              dplyr::select(iteration, scenario, year, SpawnBio) %>% 
+              dplyr::group_by(iteration, scenario) %>% 
+              dplyr::summarize(avg_SSB = mean(SpawnBio), .groups = "keep") %>% 
+              dplyr::ungroup()
+  SSB_yr
+}
+
+#' Example Performance Metric: Calculate the avg relative SSB (SSB/SSB unfished)
+#' over a range of years for each iteration
+#' 
+#' Example performance metric that calculates the average Spawning Stock Biomass
+#'  SSB (units as in the simulations) relative to the  unfished SSB over a range
+#'  of years for each iteration of each scenario in the SSMSE simulation run.
+#' @inheritParams get_SSB_avg
+#' @export
+#' @examples
+#' \dontrun{
+#'   rel_avg_ssb <- get_rel_SSB_avg(run_SSMSE_summary, min_yr = 10, max_yr = 105)
+#'   rel_avg_ssb
+#' }
+#' @importFrom magrittr %>%
+#' @returns A tibble containing the relative avg SSB per year by iteration and
+#'  scenario.
+get_rel_SSB_avg <- function(summary, min_yr, max_yr) {
+  OM_vals <- unique(summary$ts$model_run)
+  OM_vals <- grep("_OM$", OM_vals, value = TRUE)
+  B_unfished <- summary$scalar %>% 
+                 dplyr::filter(model_run %in% OM_vals) %>% 
+                 dplyr::select(iteration, scenario,SSB_Unfished)
+  SSB_yr <- summary$ts %>% 
+              dplyr::filter(year >= min_yr & year <= max_yr) %>%
+              dplyr::select(iteration, scenario, year, SpawnBio)
+  SSB_yr <- dplyr::left_join(SSB_yr, B_unfished) %>%
+              dplyr::mutate(Rel_SSB = SpawnBio/SSB_Unfished) %>%
+              dplyr::group_by(iteration, scenario) %>%
+              dplyr::summarize(avg_SSB = mean(Rel_SSB), .groups = "keep") %>%
+              dplyr::ungroup()
+  SSB_yr
+}
+
+
+#' Flag potential convergence issues in SS3 model runs
+#'
+#' Does basic checks for convergance of estimation model runs from `run_SSMSE()`
+#'  simulations. This function 1) warns if there are parameters on bounds; 2) 
+#'  warns if the SSB in the EM is 2x as large or half as small as the OM. Note
+#'  these warnings may not mean that the models have not converged, but can flag
+#'  potential issues that can be investigated further
+#' @param summary Summary returned from running `SSMSE_summary_all()`
+#' @param min_yr The first year of SSB checked
+#' @param max_yr The last year of SSB checked
+#' @export
+#' @importFrom magrittr %>%
+#' @examples
+#' \dontrun{
+#'   check_convergance(SSMSE_summary, min_yr = 101, max_yr = 120)
+#' }
+#' @returns A tibble containing the SSB values in the EM relative to the OM by
+#'  model run of each iteration of each scenario.
+check_convergence <- function(summary, min_yr, max_yr) {
+  if(any(!is.na(summary$scalar$params_on_bound))) {
+    warning("Params on bounds")
+  } else {
+    message("No params on bounds")
+  }
+  summary$ts$model_type <- ifelse(grepl("_EM_", summary$ts$model_run), "EM", "OM")
+  calc_SSB <- summary$ts %>%
+    dplyr::filter(year >= min_yr & year <= max_yr) %>% 
+    dplyr::select(iteration, scenario, year, model_run, model_type, SpawnBio)
+  OM_vals <- calc_SSB %>% 
+              dplyr::filter(model_type == "OM") %>% 
+              dplyr::rename(SpawnBio_OM = SpawnBio ) %>% 
+              dplyr::select(iteration, scenario, year, SpawnBio_OM)
+  EM_vals <- calc_SSB %>% 
+               dplyr::filter(model_type == "EM") %>% 
+               dplyr::rename(SpawnBio_EM = SpawnBio) %>% 
+               dplyr::select(iteration, scenario, year, model_run, SpawnBio_EM)
+  bind_vals <- dplyr::full_join(EM_vals, OM_vals, by = c("iteration", "scenario", "year")) %>% 
+               dplyr::mutate(SSB_ratio = SpawnBio_EM/SpawnBio_OM)
+  filter_SSB <- bind_vals %>% 
+    dplyr::filter(SSB_ratio > 2 | SSB_ratio < 0.5)
+  if(nrow(filter_SSB) > 0) {
+    warning("Some large/small SSBs relative to OM")
+  } else {
+    message("All SSBs in EM are no greater than double and no less than half SSB vals in the OM")
+  }
+  return_val <- bind_vals
+}