Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

#150 Return SHAP values with ShapRFECV fit_compute call #171

Conversation

eduardo-lourenco
Copy link

This PR targets issue #150 and comprises the following changes:

  • Return SHAP values with ShapRFECV fit_compute call by including them in the report.
  • Add new method to get shap values of a given reduced features set.
  • Update tutorial notebook.

I will update the related tests if this solution seems appropriate (current tests are still passing, we just need to add new tests).

Return SHAP values with ShapRFECV fit_compute call. Add new method to get shap values of a given reduced features set. Update tutorial notebook.
@eduardo-lourenco eduardo-lourenco changed the title Return SHAP values with ShapRFECV fit_compute call #150 Return SHAP values with ShapRFECV fit_compute call Nov 11, 2021
Copy link
Collaborator

@Matgrb Matgrb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some suggestions

@@ -673,6 +685,37 @@ def get_reduced_features_set(self, num_features):
else:
return self.report_df[self.report_df.num_features == num_features]["features_set"].values[0]

def get_reduced_features_set_shap_values(self, num_features, abs_shap_values=True):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe good to rename it to get_feature_importance?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some small unit test using very very simple handcrafted dataset, to get these shap values would be nice.

@@ -673,6 +685,37 @@ def get_reduced_features_set(self, num_features):
else:
return self.report_df[self.report_df.num_features == num_features]["features_set"].values[0]

def get_reduced_features_set_shap_values(self, num_features, abs_shap_values=True):
"""
Gets the shap values of a features set after the feature elimination process, for a given number of features.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Get feature importance (mean abs shap values)

"execution_count": 6,
"source": [
"#First 5 rows of first 5 columns\n",
"report[['num_features', 'features_set', 'val_metric_mean']]"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here you can also add the mean abs shap values.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As i am thinking about this now, let's keep this part as it is, but somewhere later in the notebook, add a cell showing feature importances for the final feature set :)

@@ -292,6 +292,8 @@ def _report_current_results(
train_metric_std,
val_metric_mean,
val_metric_std,
features_mean_abs_shap_value,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a dictionary or a list?

If a list, please make sure that this corresponds well to the current feature list. the feature set might be sorted based on the shap values. Something to double check :)

@Matgrb
Copy link
Collaborator

Matgrb commented Dec 8, 2021

Also please have a look what has changed in, just to make sure your PR is aligned #175

@Matgrb
Copy link
Collaborator

Matgrb commented Jan 23, 2022

@eduardo-lourenco Are you going to still work on this one PR?

@ReinierKoops
Copy link

hi @eduardo-lourenco thanks for your contributions, could you update your PR to be made compatible with the latest changes on main?

@ReinierKoops
Copy link

This PR is closed due to inactivity. Please reach out if you want to continue with this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants