Discussion: Classification vs. class probability metrics #237
Comments
Upon some additional thought, I wonder if there should just be a wrapper function that simply turns any classification metric into one that can handle class probabilities with thresholding:

``` r
library(tidymodels)
library(dplyr)

# Wrap a classification metric so it also accepts a numeric column of
# class probabilities, thresholding them into the factor levels of `truth`
threshold_metric <- function(metric, threshold = 0.5) {
  new_metric <- function(data, truth, estimate, estimator = NULL, na_rm = TRUE,
                         event_level = yardstick_event_level(), ...) {
    data <-
      data %>%
      mutate(
        # Probabilities at or above the threshold map to the second level
        new_estimate = if_else(
          {{ estimate }} >= threshold,
          levels({{ truth }})[[2]],
          levels({{ truth }})[[1]]
        ),
        new_estimate = factor(new_estimate, levels = levels({{ truth }}))
      )
    # Pass the remaining arguments by name so they can't be silently
    # matched to the wrong position
    metric(
      data, {{ truth }}, new_estimate,
      estimator = estimator, na_rm = na_rm, event_level = event_level, ...
    )
  }
  class(new_metric) <- c("prob_metric", "metric", "function")
  new_metric
}

data <- tibble(y = factor(sample(0:1, 100, TRUE)), y_hat = runif(100))
threshold_metric(sens)(data, y, y_hat, event_level = 'second')
#> # A tibble: 1 × 3
#>   .metric .estimator .estimate
#>   <chr>   <chr>          <dbl>
#> 1 sens    binary         0.519

threshold_metric(spec)(data, y, y_hat, event_level = 'second')
#> # A tibble: 1 × 3
#>   .metric .estimator .estimate
#>   <chr>   <chr>          <dbl>
#> 1 spec    binary         0.5

threshold_metric(recall)(data, y, y_hat, event_level = 'second')
#> # A tibble: 1 × 3
#>   .metric .estimator .estimate
#>   <chr>   <chr>          <dbl>
#> 1 recall  binary         0.519

threshold_metric(precision)(data, y, y_hat, event_level = 'second')
#> # A tibble: 1 × 3
#>   .metric   .estimator .estimate
#>   <chr>     <chr>          <dbl>
#> 1 precision binary         0.549
```

The function returned from `threshold_metric()` …
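One possible refinement (my own sketch, assuming a version of yardstick that exports the metric constructors, as recent releases do): register the wrapper through `new_prob_metric()` instead of setting the class vector by hand, so the optimization direction is recorded the way tools like `metric_set()` expect.

``` r
# Sketch: let yardstick's own constructor set the class and record the
# direction attribute (assumes new_prob_metric() is exported)
threshold_sens <- yardstick::new_prob_metric(
  threshold_metric(sens),
  direction = "maximize"  # sensitivity should be maximized
)
threshold_sens(data, y, y_hat, event_level = 'second')
```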
We'd like to be able to optimize that threshold algorithmically, so we plan on adding post-processing operations to workflows. We would also want to keep the threshold specification decoupled from the metric calculation (so it can be used on predicted values). I really want to get this feature into workflows, but it is a little lower on the priority list (for now).
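(For reference, threshold scanning is already possible outside of workflows; a minimal sketch, assuming the probably package is installed and reusing the `data` tibble from the reprex above:)

``` r
library(probably)

# Evaluate a grid of candidate thresholds; returns one row per
# threshold/metric combination (sensitivity, specificity, etc.)
threshold_perf(data, y, y_hat, thresholds = seq(0.1, 0.9, by = 0.1))
```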
Fair enough. I look forward to this being implemented in workflows.
This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.
Hi,

I love the `yardstick` package, but I always find myself trying to use both classification and class probability metrics in a single metric set for `workflows` or `workflowsets`, and I always run into the issue that they don't accept the same type of input for `estimate` (a quick sketch of this friction is below). I always feel like it should "just work". This is probably because I regularly use `torchmetrics`, which lets you use any of the classification metrics with the same inputs. The thresholding to form discrete levels is handled by the metric function itself, which makes these functions easy to work with: they all take the same inputs, and the difference is just what computation happens behind the scenes. I like this approach because classification metrics are essentially a special case of class probability metrics in which some thresholding has been applied to the `estimate`. A `threshold` of 0.5 is a sensible default, I think.
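A sketch of that friction, reusing the `data` tibble from the reprex above (the hard-class column `y_hat_class` is hypothetical, added here only for illustration):

``` r
library(yardstick)

# A hand-made hard-class column, since yardstick won't threshold for us
data$y_hat_class <- factor(ifelse(data$y_hat >= 0.5, "1", "0"),
                           levels = levels(data$y))

# A mixed metric set needs two different estimate inputs: the factor
# column goes to `estimate`, the probability column is passed through `...`
ms <- metric_set(roc_auc, sens)
ms(data, truth = y, y_hat, estimate = y_hat_class)
```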
Is there any room in the `yardstick` world to modify the classification metrics to be more general, so that they can optionally accept the same `estimate` type as the class probability metrics, and to add a `threshold` argument? To avoid issues with backwards compatibility, perhaps the `estimate` argument for classification metrics could accept either a factor (the current behaviour) or a numeric probability, which would be thresholded and converted to a factor as a processing step (a hypothetical sketch of this API follows below).
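To make that concrete, a purely hypothetical sketch of the proposed signature, reusing `y_hat_class` from the sketch above; the `threshold` argument does not exist in yardstick today:

``` r
# Current behaviour: estimate must already be a factor
sens(data, truth = y, estimate = y_hat_class)

# Proposed: a numeric probability estimate, thresholded internally
# (threshold is a hypothetical argument)
sens(data, truth = y, estimate = y_hat, threshold = 0.5)
```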
For a simple use case, when you have a binary outcome and fit a model that outputs probabilities (e.g. logistic regression), I think it would be nice to be able to get `roc_auc`, `pr_auc`, `sensitivity`, `specificity`, etc. in a single metric set with less friction. See `torchmetrics` for reference (using specificity as an example): https://torchmetrics.readthedocs.io/en/latest/references/functional.html#specificity-func