Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove outdated "single sourcing package version" advice #1276

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

wimglenn
Copy link
Contributor

@wimglenn wimglenn commented Jul 20, 2023

Now that Python 3.7 is EOL and stdlib importlib.metadata is available in all non-EOL Python versions, it is no longer advisable to keep a __version__ attribute in the source code. Use importlib.metadata.version() to get it from package metadata if necessary.

References:

  • PEP 396 rejection.
  • ".__version__ attributes are IMNSHO generally pointless" - gpshead
  • "Unfortunately, I think __version__ is a fairly ingrained habit for a lot of people. But it’s a holdover from the days before importlib.metadata." - pf_moore
  • "Do you really need/want a __version__ attribute at all?" - me

@sinoroc
Copy link
Contributor

sinoroc commented Jul 22, 2023

I mostly agree. But completely deleting the page seems too radical. I guess it would make sense to still provide some guidance on this topic. At the very least we should tell to use importlib.metadata.version(). Also I think I recall reading somewhere that calling importlib.metadata.version() can be resource intensive (I do not remember the details), so we might want to advise against having importlib.metadata.version() directly at the root of a module so that it does not run blindly on import.

@ChrisBarker-NOAA
Copy link
Contributor

Please no.

The rejection of PEP 396 means that the SC doesn't want to say you should use a __version__ attribute, but it does not mean that they are saying you should not use one.. [*]

Having a version attribute is widely done, a good idea (IMHO) and supported by the up-to-date packaging tools (setuptools, etc).

I hope we all agree that if you are going to use a version __version__ attribute then it should be automatically synced with the package version as part of the build -- what this page is about, and it is still useful.

I also note that at least originally, the conclusion of the thread in #182 was that PyPA would not endorse one single way do things -- adding a __version__ attribute is a perfectly reasonable thing to do, it's a still good idea to document how to do it right.

@wimglenn
Copy link
Contributor Author

wimglenn commented Jul 23, 2023

@ChrisBarker-NOAA Could you explain why you think having a version attribute is a good idea? You did say "I really like __version__" but you haven't offered any reasons. Without arguments for it, then claiming it's a perfectly reasonable thing or a good idea are just opinions.

Here are some arguments against burning a version string into the source:

  • "Version" is a required field in the package metadata. It's unnecessary and redundant to store it in two places.
  • All Python versions where retrieving version info from metadata was non-trivial are now EOL.
  • When a version string is kept only in the package metadata, then the problem of keeping the source and the metadata in sync simply goes away. That's simple. Keeping them in sync is complicated.
  • The format has never been standardized, some projects use __version__, some use VERSION, some use a tuple instead of a string, sometimes it's in a submodule, and it might not be there at all. As user, it's total guesswork about if/where/how to find a version attribute.
  • Version can be retrieved from the metadata without requiring import of the code. That's useful when the import has side-effects (e.g. numpy), and it's useful when the package can't be imported for some reason (e.g. missing dependencies)
  • Existing tooling looks to the metadata for the truth. You can change it in the source, but stuff like pip list, pip freeze etc would ignore that and display the metadata version field.

So, there are a bunch of disadvantages. What are the advantages? Do they outweigh the disadvantages?

@sinoroc
Copy link
Contributor

sinoroc commented Jul 23, 2023

There is also the fact that an import package is not the same thing as a distribution package. It is a subtle detail and all things considered it might not matter much for this task, but there is still a difference.

When I do import something; print(something.__version__) should I really get the same value as when I do import importlib.metadata; print(importlib.metadata.version('Something'))?

@ChrisBarker-NOAA
Copy link
Contributor

Could you explain why you think having a version attribute is a good idea? You did say #182 (comment) but you haven't offered any reasons.

I thought about writing all that, but this really isn't the place for that discussion.

I don't have any particular pull here, but I think the goal of these docs is to document the state of practice -- and having a __version__ is currently a common practice, so it should be documented here.

There is also the fact that an import package is not the same thing as a distribution package. It is a subtle detail and all things considered it might not matter much for this task, but there is still a difference.

Yes, indeed -- which if one reason I prefer version attribute :-)

"Version" is a required field in the package metadata. It's unnecessary and redundant to store it in two places.

This entire page is about removing that redundancy, isn't it?

When I do import something; print(something.version) should I really get the same value as when I do import importlib.metadata; print(importlib.metadata.version('Something'))?

Only if the importable package name is the same as the distribution name.

he format has never been standardized, some projects use version, some use VERSION, some use a tuple instead of a string, sometimes it's in a submodule, and it might not be there at all. As user, it's total guesswork about if/where/how to find a version attribute.

yeah, which is why I thought PEP 396 was a good idea -- but even since that PEP was written, there's been a LOT more standardization in the community -- but another topic anyway -- at least now there IS a standard for what a version string should look like, THAT doesn't need to change.

But again, not the place for this debate.

Copy link
Member

@webknjaz webknjaz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm strongly against blindly removing the guide. Its point is not to provide an importable, such as __version__ in an installed package but quite vice versa — show ways for the version value to make its way into the installed package metadata. And an importable variable is just one of the sources, many people use SCM plugins that can pull that from Git tags and so on.

So depending on where the source of truth is, different projects would have different assumptions around handling/generation/presence of __version__ (some use different variable names but the purpose is the same).

@wimglenn
Copy link
Contributor Author

wimglenn commented Sep 6, 2023

@webknjaz This is not blindly removing the guide. It is removing the guide with a reasoned set of arguments, as described in the comments here. Regardless of where the source of truth is, the premise that a __version__ attribute is available in the source code at all is not an assumption which packaging.python.org should make, since there is no specification about how to do that.

What changes are you requesting?

@webknjaz
Copy link
Member

webknjaz commented Sep 6, 2023

We can rename it to _VERSION if you like that variant more. Not everything in PyPUG is spec-based. Some of it is just trying to make tribal knowledge recorded.

But as I said before, single-sourcing of a version value is a legit use case and is what people often aspire to. Regardless of what the source is or how it's called.

@webknjaz
Copy link
Member

webknjaz commented Sep 6, 2023

  • Version can be retrieved from the metadata without requiring import of the code. That's useful when the import has side-effects (e.g. numpy), and it's useful when the package can't be imported for some reason (e.g. missing dependencies)

This dismisses the case when a library/app itself makes use of the version and it does need a variable to store it somewhere. And this is one of the aspects of single-sourcing — the version is a part of the metadata but it's often also necessary in runtime. And yes, that variable can be inferred from the installed package metadata, if it's a library. But many people have other preferences. Also, apps are often not packaged as dists and so they need to rely on different sources of versioning or the mechanism of generating them.

I think it's acceptable to augment the guide with clarifications and other example but dismissing it as a whole and pretending that people don't do this is rather harmful.

@webknjaz
Copy link
Member

webknjaz commented Sep 6, 2023

The rejection of PEP 396 means that the SC doesn't want to say you should use a __version__ attribute, but it does not mean that they are saying you should not use one..

Exactly. Moreover, some PEPs, even being rejected, end up being widely used across the ecosystem, as __version__ clearly demonstrates. Sometimes even other PEPs rely on the ideas and concepts laid out in the rejected ones.

One other case of a "rejected" concept is "attribute docstrings" — https://peps.python.org/pep-0258/#attribute-docstrings. The accepted PEP 257 shows some legit use of of "rejected attribute docstrings": https://peps.python.org/pep-0257/#what-is-a-docstring.

Back to __version__, it is explicitly called out in PEP 8, that most pythonistas agree to follow: https://peps.python.org/pep-0008/#module-level-dunder-names.

By the way, there's a PEP 723 draft in the works: https://peps.python.org/pep-0723/. This essentially allows moving pyproject.toml back into (standalone) Python modules. So there you go — the version is in a .py file once again.
I suppose, when it's accepted, this guide could also be extended to explain how to work with this source too.

@webknjaz
Copy link
Member

webknjaz commented Sep 6, 2023

When I do import something; print(something.version) should I really get the same value as when I do import importlib.metadata; print(importlib.metadata.version('Something'))?

Only if the importable package name is the same as the distribution name.

This is not always possible since the dist name can contain dashes that aren't allowed in Python identifiers. Also, a standalone script might be functional even without being packaged or pip installed, so it'll likely have a fallback for such case.

@sinoroc
Copy link
Contributor

sinoroc commented Sep 6, 2023

If we want to keep this page of the guide, then there is a whole bunch of clarifications to be made.

  • The __version__ attribute is not a standard, but a relatively widely accepted convention.
  • Zero guarantee that the rest of the ecosystem will honor the value of __version__.
  • Getting the build back-end (or plugin like setuptools-scm) to extract the version string from the code (for example from a __version__ attribute) is not something that is standardized (i.e. it can not be configured in the [project] table of pyproject.toml), but something that is build back-end specific and might vary from one build back-end to another (we should be careful about this kind of things, we do not want users to mistake this for a standard and them come complain to PyPA when this breaks).
  • Be mindful that the distribution package name and the import package name are not the same thing (in case where one wants to use importlib.metadata.version()), even though by convention we try to keep them to be equal, those are still two different concepts.
  • Be mindful that maybe the distribution package version string (the one in the metadata) is not the same as the import package version string (the one in __version__), I can not find it now but I recall quite clearly reading that some projects do this (2 different version strings).
  • Packaging metadata is accessible (for example via importlib.metadata) if and only if the distribution package is installed correctly (via pip or something else).
  • Importing a package has a cost, and one should try to avoid adding to this cost by having expensive logic that computes the value of __version__ attribute at import time.

@webknjaz
Copy link
Member

webknjaz commented Sep 6, 2023

  • When a version string is kept only in the package metadata, then the problem of keeping the source and the metadata in sync simply goes away. That's simple. Keeping them in sync is complicated.

Yet, the version can be sourced dynamically on build, which makes external resources like Git the actual source of truth and the metadata is an intermediate snapshot of that data from SCM (prone to becoming outdated in cases like development w/ editable installs).
I agree that in the generic case, a project should consult metadata as the priority source of truth but that doesn't mean that we should be forbidding certain variable names.

Another case is when a project uses said variable as its ultimate source of truth. With this in mind, that variable would be external to the build system generating the version, just like Git/SCM might be.

With a version in setup.cfg or pyproject.toml, the version is still external to the build system producing the metadata.

I think there's some confusion as to how versions are being used and what single-sourcing is about. In terms of packaging, that version sourcing is about getting the version number from somewhere and putting it into the metadata. In terms of runtime, it's about getting the version from somewhere (like metadata) and making use of it in a context that is not related to packaging.

So if we present this as a unidirectional flow, it'd be something like this:

---
title: Version sourcing flow
---

graph LR;

    single-source[("
        Non-standardized ultimate version source:
            - build system config file
            - CLI flag/option/argument
            - environment variable
            - *.py module import
            -  static parsing of a *.py module as a text file
            - Git/setuptools-scm
            - API call
            - database lookup
    ")];

    meta[(True-to-spec package metadata)];

    runtime{{"
        occasional access to the metadata version,
        maybe through __version__
    "}};

    single-source --> meta --> runtime;

    runtime -. runtime may sometimes be the source but it is mostly usable for pure-python projects .-> single-source;

Loading

The guide is describing the above, or at least, a part of it. And I think it should remain, possibly be improved with additions, but not deleted.

@webknjaz
Copy link
Member

webknjaz commented Sep 6, 2023

@sinoroc yep, I agree with all the points.

@webknjaz
Copy link
Member

webknjaz commented Sep 6, 2023

We should probably ask opinions of @RonnyPfannschmidt and @jaraco, though, as they have a lot of experience with this specific topic.

@webknjaz
Copy link
Member

webknjaz commented Sep 6, 2023

When I do import something; print(something.version) should I really get the same value as when I do import importlib.metadata; print(importlib.metadata.version('Something'))?

Only if the importable package name is the same as the distribution name.

This is not always possible since the dist name can contain dashes that aren't allowed in Python identifiers. Also, a standalone script might be functional even without being packaged or pip installed, so it'll likely have a fallback for such case.

Oh, and I forgot to mention that dists may ship multiple importable packages/modules, not just one. So on distribution name can be mapped to multiple importables. Such dists are setuptools and pytest, for example.

@webknjaz
Copy link
Member

webknjaz commented Sep 6, 2023

@flying-sheep @abravalheri I saw you commenting on the neighboring PR so FYI ^

@RonnyPfannschmidt
Copy link

RonnyPfannschmidt commented Sep 6, 2023

i like the motion of going away from __version__ in files - its a headache and setuptools_scm still cant provide a save provider of the version file

that being said, there probably ought to be a section on providing a __version__ attribute by importlib.metadata.

@ChrisBarker-NOAA
Copy link
Contributor

ChrisBarker-NOAA commented Sep 6, 2023

I'm going to rephrase @webknjaz's comment:

I'm strongly against removing the guide with eyes wide open.

There is NOT a consensus yet about the "one true way" to handle version specification, but there IS consensus that at the very least, however you do the version specification should be DRY -- i.e. "One source of truth" -- and THAT is what this page is about. So it still serves a purpose, and I don't think it should be simply removed unless / until a proper consensus is reached.

And as @webknjaz said this guide does not say "you shold put a __version__ attribute in your package, but rather, IF you have a version string somewhere in your code, this how you should use it properly to set the version in your built package.

Anyway, I think there is also a consensus that the text as it stands is pretty darn out of date, so keeping the page as is is a disservice to the community.

NOTE: :Last Reviewed: 2015-09-08 -- that's pretty ancient!

Could we please merge a PR along the lines of: #1273 [*] sooner than later, while the discussion continues about the ultimate one true way?

Where should that discussion be held? I'm not sure this PR is the right place.

[*] "along the lines of" could be a slightly edited version (I'll go now and address a few comments) - or a bigger re-write if someone wants to tackle it. Maybe put the "extract version from SCM" at the top :-) -- note that when I wrote that PR, I purposely made as few changes as possible, I naively thought that would make it less controversial. So maybe a larger change would be better -- perhaps completely remove the "no longer recommended" section?

@wimglenn
Copy link
Contributor Author

wimglenn commented Sep 7, 2023

IMO there is nothing valuable worth saving from the existing page, but I wouldn't be against a total re-write that is build backend agnostic.

I'm not sure what's the urgency for merging #1273 - it still has several problems, and it's still setuptools/distutils specific.

@webknjaz
Copy link
Member

webknjaz commented Sep 7, 2023

@ChrisBarker-NOAA thanks for summarizing the above! I think your PR has potential!

@wimglenn the most important bit of this guide is "how to bring the version string from somewhere, into the package metadata". Removing it just because of a side effect of how the version is additionally accessed in runtime is not an option. You've been making arguments about the runtime (right side of my illustration) but it does hurt the usefulness of the left-side part — the build process.

@abravalheri
Copy link
Contributor

abravalheri commented Sep 11, 2023

My view is that:

  1. The only safe way of obtaining the version at runtime that is backed by standards is importlib.metada or similar (i.e. actually reading the *.dist-info/METADATA file, which is the only standardised way of storing this info).

  2. This does not mean that users don't need guidance about how to populate this value from other files (e.g. consider you want to distribute Python bindings for a native library, and that you want the version of your bindings to mimic the version of the original native library). This is a valid use case. (Equally valid is the use case of keeping the single source of truth for version in the VCS).

  3. Providing guidance for single-sourcing version infirmation (2) is backend specific.

Personally, I would not completely remove all the guidance for single-sourcing, even if it is backend specific, because that is very useful for the users.

If the website mainters prefer to not have backend-specific instructions, we can transfer the setuptools-specific docs to the setuptools website, but it would be good if we can leave at least a link so users can be pointed out in the right direction.

@willingc
Copy link
Contributor

Hi @wimglenn. Thanks for the PR. I think the consensus is leaning toward updating the doc instead of removing it. I'm linking to my review #1273 (comment) on #1273.

Now that Python 3.7 is EOL and importlib.metadata is available in all non-EOL Python versions, it is no longer advisable to keep a __version__ attribute in the source code. Use importlib.metadata.version() to retrieve a version string from package metadata if necessary.
@wimglenn
Copy link
Contributor Author

CI failures seem to be due to https://pyscaffold.org/ returning 403 and unrelated to my change.

I've considered all the comments here but still believe removing this page is the best way forward. It's up to individual build backends to document how they support keeping the metadata version in sync with source attributes or git tags, if they want to support that at all. It's a matter which is out of scope for the packaging.python.org guide, which should aim to be tool-agnostic.

@ChrisBarker-NOAA
Copy link
Contributor

See my comment on #1273

But in short -- a simple placeholder with a bit of info would be better that simply removing, and turning external references to a 404.

@wimglenn
Copy link
Contributor Author

Meh? That's what the wayback machine is for..

These are living documents. If we left placeholders every time some content was removed, eventually the majority of the content would be such placeholders.

@ChrisBarker-NOAA
Copy link
Contributor

It's not JUST a placeholder -- it's also useful info.

But whatever the doc editors decide...

@webknjaz
Copy link
Member

Personally, I was hoping to merge #1273 as it's got better potential in my opinion.

@tacaswell
Copy link
Contributor

Also weighing in from the peanut gallery, I also think removing this content is a mistake.

@flying-sheep
Copy link
Contributor

flying-sheep commented Jul 25, 2024

Please be aware of the context: people did try to improve it starting a year ago, then that effort got stalled over bikeshedding. The end result is that the outdated information is still there, misleading people.

I’d say unless someone is willing to do some quick rounds on #1273 to get it into a shape everyone likes, removing this document is preferable to doing nothing.

@ChrisBarker-NOAA
Copy link
Contributor

Fair enough -- I certainly agree that having the outdated info up is the worst case scenario.

I'll try to take a look at making #1276 far more simple -- I would love some help.

@ChrisBarker-NOAA
Copy link
Contributor

Actually -- it was too much of a mess to updated an old PR -- so I made a new one:

#1578

@tacaswell: I tried inviting you to access to that repo -- but gitHub can't find you -- maybe you've got some provacy setting? Anyway -- I'll take any help you might provide.

Anyone else that wants to help, please chime in.

Final note: there was a lot of bike shedding, yes, but what really stalled it out was a core difference in philosophy:

Some folks want __version__ to go away, and they wanted these docs to endorse that.

Others very much want __version__ to remain a reasonable option, and want these docs to reflect that.

I do not think that these docs are the place to set policy like that -- rather, they should reflect the state of the practice, which I've tried to do with that PR.

@bwoodsend
Copy link
Contributor

bwoodsend commented Jul 25, 2024

I'd also prefer Chris's change to go through (and I'm coming from the __version__ is wrong side of the fence) but that's no reason not to merge this one in the meantime. It'll take a significant amount of effort, grit and anti-bikeshedding to get Chris's change through so it has a high risk of not making it.

@webknjaz webknjaz mentioned this pull request Aug 13, 2024
@wimglenn
Copy link
Contributor Author

wimglenn commented Aug 14, 2024

It looks like other efforts are no longer attempting to modify that page, but instead add a new page / rewrite.

So, this can be merged in the meantime. These PRs are not blocking each other any longer.

@webknjaz Can you remove your change request?

@ChrisBarker-NOAA
Copy link
Contributor

Now that #1580 has been merged, I think we can merge this -- or at least delete the out-of-date page in "guides" -- which is most of this PR.

@tabedzki
Copy link

This PR intends to remove the "single sourcing package version" advice from the guides section. Since there is now a version of the page under discussion, should we include a short reference/link to the discussion under guides/writing-pyproject-toml#static-vs-dynamic-metadata since that section already includes an example about setting the version dynamically? We could say something like
"See the discussion here about setting versions"

Also, since there was concern about having broken links, is it possible to just create a URL redirect from https://packaging.python.org/en/latest/guides/single-sourcing-package-version/ to https://packaging.python.org/en/latest/discussions/single-source-version/? That way you have a single source of truth for single versioning and you won't 404 anyone?

@flying-sheep
Copy link
Contributor

URL redirect makes more sense than a single-line page, but it has to be set up in readthedocs AFAIK and can’t be done in this PR.

@ChrisBarker-NOAA
Copy link
Contributor

should we include a short reference/link to the discussion under guides/writing-pyproject-toml#static-vs-dynamic-metadata since that section already includes an example about setting the version dynamically? We could say something like
"See the discussion here about setting versions"

sound great -- can someone add it to this PR?

Add link to discussion "Single-sourcing the Project Version" in writing-pyproject.toml
@tabedzki
Copy link

tabedzki commented Aug 30, 2024

done. Thank you Wim for merging.
@webknjaz Are there other changes that you would like to see in this PR before merging?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.