-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[python-package] make scikit-learn estimator tags compatible with scikit-learn>=1.6.0dev
#6651
base: master
Are you sure you want to change the base?
[python-package] make scikit-learn estimator tags compatible with scikit-learn>=1.6.0dev
#6651
Conversation
Update: The change introduced in scikit-learn/scikit-learn#29677 makes it hard to subclass a sklearn estimator in a codebase while being compatible with sklearn < 1.6.0 and sklearn >= 1.6.0. Essentially the former looks up The issue is discussed here: and it looks like a relaxation of the impossibility of having both |
@vnherdeiro note that it's possible already to support both with this method (scikit-learn/scikit-learn#29677 (comment)), however, the version check and |
Correct I am waiting for that PR to go in to bring back _more_tags
Using @available_if would require another sklearn import and make the code
less readable I reckon
…On Thu, 12 Sept 2024, 11:34 am Adrin Jalali, ***@***.***> wrote:
@vnherdeiro <https://github.com/vnherdeiro> note that it's possible
already to support both with this method (scikit-learn/scikit-learn#29677
(comment)
<scikit-learn/scikit-learn#29677 (comment)>),
however, the version check and @available_if are going to be unnecessary
once we merge scikit-learn/scikit-learn#29801
<scikit-learn/scikit-learn#29801>
—
Reply to this email directly, view it on GitHub
<#6651 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AE4CNVUURU6AMLDYUXKPFTTZWFU2TAVCNFSM6AAAAABOAVNTLSVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDGNBVHA4DCOJWGI>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
Thanks for starting on this @vnherdeiro . I've documented it in an issue: #6653 (and added that to the PR description). Note there that I intentionally put the exact errors messages in plain text instead of just referring to Note also that the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for starting on this! Please see scikit-learn/scikit-learn#29801 (comment):
The story becomes "If you want to support multiple scikit-learn versions, define both."
I think we should leave _more_tags()
untouched and add __sklearn_tags__()
. And have self.__sklearn_tags__()
call self._more_tags()
to get its data, so we don't define things like _xfail_checks
twice.
Do you have time to do that in the next few days? We need to fix this to unblock CI here, so if you don't have time to fix it this week please let me know and I will work on this.
scikit-learn>=1.16
scikit-learn>=1.16
scikit-learn>=1.16
@jameslamb Have just pushe a sklearn_tags trying a conversion from _more_tags. I added a out of current argument scope warning to catch a change from the arguments in _more_tags (they don't seem to change much though). |
scikit-learn>=1.16
scikit-learn>=1.6.0dev
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a maintainer here, but coming from sklearn side. Leaving thoughts hoping it'd help.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this.
I've reviewed the dataclasses at https://github.com/scikit-learn/scikit-learn/blob/e2ee93156bd3692722a39130c011eea313628690/sklearn/utils/_tags.py and agree with the choices you've made about how to map the dictionary-formatted values from _more_tags()
to the dataclass attributes scikit-learn
now prefers.
Please see the other comments about simplifying this.
Co-authored-by: James Lamb <[email protected]>
@jameslamb have adressed your comments! thanks for the review! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd probably include a test to make sure X_types
is exactly as is here, so that when somebody changes it in the future in _more_tags
, the corresponding tags in __sklearn_tags__
is also changed (and the test itself)
I started looking into this and realized that I'll push commits here adding this test and fixing that. |
This is proving to be very challenging to get right, because python -c "import lightgbm; print(lightgbm.LGBMRegressor.__mro__)"
# (<class 'lightgbm.sklearn.LGBMRegressor'>,
# <class 'sklearn.base.RegressorMixin'>,
# <class 'lightgbm.sklearn.LGBMModel'>,
# <class 'sklearn.base.BaseEstimator'>,
# <class 'sklearn.utils._estimator_html_repr._HTMLDocumentationLinkMixin'>,
# <class 'sklearn.utils._metadata_requests._MetadataRequester'>,
# <class 'object'> (we do that intentionally, following the advice from "BaseEstimator and mixins" at https://scikit-learn.org/stable/developers/develop.html#rolling-your-own-estimator) I'm finding it difficult to preserve the LightGBM-specific changes that we want (that @vnherdeiro has implemented here) without them being overwritten by the Will come back to this tomorrow, when I can, and will try to put together a clear reproducible example. The amount of indirection here means that'll take a bit more time than I have today. |
Fixes #6653
Tring to fix latest CI job. Sklearn 1.6.0 dev deprecates
BaseEstimator._more_tags_()
for__sklearn_tags__
see https://scikit-learn.org/dev/whats_new/v1.6.html and scikit-learn/scikit-learn#29677