weighted normalized gini #442

SimonCoulombe · 2023-08-08T21:29:55Z

It would be awesome to add weighted normalized Gini to the set of available metrics for regression. This is useful in insurance when we want to evaluate the performance of a loss cost model. We order the predictions by the predicted "annualized loss cost", but weight them by the exposure (the time the policy actually lasted) to get the actual dollar amount.

It is discussed (with code) in the following Kaggle thread on fire peril loss cost: https://www.kaggle.com/c/liberty-mutual-fire-peril/discussion/9880

Here is some code I use outside tidymodels. It is inspired by the function posted by pimin the kaggle thread. I think he had inverted the sign in the weighted gini, which meant a perfect prediction would get a gini of -0.999 instead of 0.999.

The formula is derived from this 2015 blog post: http://blog.nguyenvq.com/blog/2015/09/25/calculate-the-weighted-gini-coefficient-or-auc-in-r/
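
As a reading aid, here is the quantity the code in this post computes, written out. The symbols $a_i$ (actual loss cost), $p_i$ (predicted loss cost), $w_i$ (exposure), $W_i$ and $L_i$ are notation introduced here for illustration, not taken from the linked posts, and observations are assumed to be sorted by decreasing prediction.

```latex
W_i = \frac{\sum_{j \le i} w_j}{\sum_{j=1}^{n} w_j}, \qquad
L_i = \frac{\sum_{j \le i} a_j w_j}{\sum_{j=1}^{n} a_j w_j}, \qquad
G(a, p, w) = \sum_{i=1}^{n-1} \left( L_i W_{i+1} - L_{i+1} W_i \right)

G_{\text{norm}}(a, p, w) = \frac{G(a, p, w)}{G(a, a, w)}
```

Here $W_i$ is the cumulative share of exposure and $L_i$ is the cumulative share of actual dollar losses, so $G_{\text{norm}}$ compares the ranking given by the predictions with the ranking given by the actual values themselves.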


#' Weighted Gini coefficient
#'
#' @param actual Actual loss cost.
#' @param predicted Predicted loss cost, used only to rank the observations.
#' @param weights Earned exposure.
#'
#' @return A single numeric value.
#' @export
weighted_gini <- function(actual, predicted, weights) {
  df <- data.frame(actual, weights, predicted)
  n <- nrow(df)
  # Rank observations from highest to lowest predicted loss cost.
  df <- df[order(df$predicted, decreasing = TRUE), ]
  # Cumulative share of exposure.
  df$cum_weight <- cumsum(df$weights / sum(df$weights))
  # Cumulative actual losses in dollars (loss cost times exposure).
  df$cum_pos_found <- cumsum(df$actual * df$weights)
  # Lorenz curve: cumulative share of actual losses.
  df$lorenz <- df$cum_pos_found / df$cum_pos_found[n]
  sum(df$lorenz[-n] * df$cum_weight[-1]) - sum(df$lorenz[-1] * df$cum_weight[-n])
}

#' Normalized weighted Gini coefficient
#'
#' @param actual Actual loss cost.
#' @param predicted Predicted loss cost.
#' @param weights Earned exposure.
#'
#' @return A single numeric value; a prediction that ranks observations
#'   exactly like `actual` gives 1.
#' @export
normalized_weighted_gini <- function(actual, predicted, weights) {
  # Divide by the Gini obtained when ranking by the actual values.
  weighted_gini(actual, predicted, weights) / weighted_gini(actual, actual, weights)
}
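
For anyone who wants to try this quickly, here is a minimal sketch of how the two functions might be called. The toy portfolio and the variable names `exposure`, `good` and `reversed` are made up for illustration, and the functions are restated in compact vector form only so the block runs on its own.

```r
# Compact restatement of the two functions so this block is self-contained.
weighted_gini <- function(actual, predicted, weights) {
  ord <- order(predicted, decreasing = TRUE)  # rank by predicted loss cost
  a <- actual[ord]
  w <- weights[ord]
  n <- length(a)
  cum_weight <- cumsum(w) / sum(w)            # cumulative share of exposure
  lorenz <- cumsum(a * w) / sum(a * w)        # cumulative share of dollar losses
  sum(lorenz[-n] * cum_weight[-1]) - sum(lorenz[-1] * cum_weight[-n])
}

normalized_weighted_gini <- function(actual, predicted, weights) {
  weighted_gini(actual, predicted, weights) / weighted_gini(actual, actual, weights)
}

# Made-up toy portfolio (illustrative values only).
actual   <- c(0, 100, 0, 500, 50)    # observed annualized loss cost
exposure <- c(1, 0.5, 1, 0.25, 1)    # earned exposure, in policy-years
good     <- c(10, 120, 5, 400, 60)   # ranks the policies in the same order as `actual`
reversed <- -good                    # ranks the policies in the opposite order

print(normalized_weighted_gini(actual, good, exposure))      # same ranking as `actual`
print(normalized_weighted_gini(actual, reversed, exposure))  # opposite ranking
```

Only the ranking implied by `predicted` matters, so `good` scores the same as `actual` itself even though its values differ, while the reversed ranking gives a negative score.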


simonpcouch · 2023-11-27T20:10:39Z

Related to #147.
