Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DOCS] Example usage in docstring #596

Open
15 of 33 tasks
FBruzzesi opened this issue Nov 9, 2023 · 18 comments · Fixed by #646
Open
15 of 33 tasks

[DOCS] Example usage in docstring #596

FBruzzesi opened this issue Nov 9, 2023 · 18 comments · Fixed by #646
Labels
documentation We might describe something better good first issue Good for newcomers

Comments

@FBruzzesi
Copy link
Collaborator

FBruzzesi commented Nov 9, 2023

As discussed in #586, although most of the library features are documented in the user guide, the best way to showcase how to access and use each class/function would be a minimal example usage in the docstrings. And this is currently covered only for a subset of the library features.

Here the list of all the remaining classes that would benefit from it:

As an instance of such minimal example you can refer to QuantileRegression docstring section, which renders as in its API section.

If possible try to add one unique example covering the relevant features and methods in the top level docstring of the class.

@FBruzzesi FBruzzesi added good first issue Good for newcomers documentation We might describe something better labels Nov 9, 2023
@likeajumprope
Copy link

likeajumprope commented Mar 23, 2024

I am going to work on some of those today (Johanna). The first one has definitively documentation

@anopsy
Copy link
Contributor

anopsy commented Mar 23, 2024

I'm gonna do preprocessing.outlier_remover.OutlierRemover

@anopsy
Copy link
Contributor

anopsy commented Mar 23, 2024

I'm going to work on DictMapper

@anopsy
Copy link
Contributor

anopsy commented Mar 25, 2024

Hi hi! I'm going to add a usage example to sklego.meta.outlier_classifier.OutlierClassifier.

I have a question though, it requires both X, y in fit, otherwise it raises a ValueError. I understand that y is required here (classifier/metrics etc) but I kind of thought that when the fit signature says fit(X, y=None) y=None implies that y is optional. But in this case it's required. Just wondering

@koaning
Copy link
Owner

koaning commented Mar 25, 2024

I have a question though, it requires both X, y in fit, otherwise it raises a ValueError. I kind of thought that when the fit signature says fit(X, y=None) y=None implies that y is optional. But in this case it's required. Just wondering

This might deserve a ticket of its own. My first reaction is that most outlier models don't require a y ... can't remember why the ValueError would be there.

@anopsy
Copy link
Contributor

anopsy commented Mar 25, 2024

According to the docs, the intention was to morph this outlier model into a classifier, thus making the use of metrics possible.

@anopsy
Copy link
Contributor

anopsy commented Mar 26, 2024

I'll work this week on usage examples for the following three classes
preprocessing.pandastransformers.PandasTypeSelector
preprocessing.projections.InformationFilter
preprocessing.repeatingbasis.RepeatingBasisFunction

@anopsy
Copy link
Contributor

anopsy commented Apr 8, 2024

I'll work on these:
preprocessing.formulaictransformer.FormulaicTransformer
preprocessing.identitytransformer.IdentityTransformer
preprocessing.intervalencoder.IntervalEncoder

I also noticed that some examples start with "Examples" and some with "Example" and it has an influence on how the example is shown in documenattion I'll change "Examples" -> Example" so it gets the nice frame

@FBruzzesi
Copy link
Collaborator Author

Does anyone know if there is a way in a PR description to tag an issue and reference a particular task only? Or make some sort of partial fix reference? Otherwise it will automagically close the issue on merge.

I know it's possible to click on a task to create a separated issue, but maybe that's a bit too much

@koaning
Copy link
Owner

koaning commented Apr 8, 2024

Good question. I guess not? We could see if there's an easy way to split this into many smaller tickets ... but that's also the best that I can come up with.

@FBruzzesi
Copy link
Collaborator Author

FBruzzesi commented Apr 8, 2024

Hovering over a task will let you me create a separate issue (see screenshot), but I am not sure who has the access to that, I believe only the issue creator?!

image

@koaning
Copy link
Owner

koaning commented Apr 8, 2024

Maybe other admins as well? I can also click it.

@anopsy
Copy link
Contributor

anopsy commented Apr 26, 2024

Grabbing these:
decomposition.pca_reconstruction.PCAOutlierDetection
decomposition.umap_reconstruction.UMAPOutlierDetection

Also, what would be a more elegant example- applying them on ndrrays or a dataframe?

@likeajumprope
Copy link

Hi :) I am back :) I am really interested in the Bayesian ones - can I do aive_bayes.GaussianMixtureNB,
naive_bayes.BayesianGaussianMixtureNB & neighbors.BayesianKernelDensityClassifier please?

@likeajumprope
Copy link

Also, maybe remove the deadzoneregressor one

@anopsy
Copy link
Contributor

anopsy commented Apr 27, 2024

I'll take care of those:

model_selection.TimeGapSplit
model_selection.GroupTimeSeriesSplit
model_selection.KlusterFoldValidation

@david26694
Copy link
Contributor

Hello! since we're in the process of writing more and more docstrings, does it make sense to mktestdocs to check that our docstrings don't fail?

@koaning
Copy link
Owner

koaning commented Jul 14, 2024

It is not the worst idea, but I do fear the sheer amount of compute that we might waste in doing that. The CI job will become a whole lot more expensive.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation We might describe something better good first issue Good for newcomers
Projects
None yet
5 participants