Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nested square brackets inside sections are incorrectly parsed #121

Closed
Delgan opened this issue Oct 4, 2023 · 7 comments · Fixed by #122
Closed

Nested square brackets inside sections are incorrectly parsed #121

Delgan opened this issue Oct 4, 2023 · 7 comments · Fixed by #122

Comments

@Delgan
Copy link
Contributor

Delgan commented Oct 4, 2023

Description of your problem

Hi!

I noticed difference between standard configparser and configupdater parsing.

Please provide a minimal, self-contained, and reproducible example.

Given this INI file:

# example.ini
[[]]

Using configparser:

import configparser

config = configparser.ConfigParser()
config.read('example.ini')
sections = config.sections()
print(sections)  # => ['[]']

Using configupdater:

from configupdater import ConfigUpdater

updater = ConfigUpdater()
updater.read("example.ini")
sections = updater.sections()
print(sections)  # => ['[']

Please provide the full traceback.
N/A

Please provide any additional information below.
It seems it's interpreted as being a comment.
I'll try to open a PR if I can figure it ouy.

Versions and main components

  • ConfigUpdater Version: master
  • Python Version: 3.11.5
  • Operating system: Linux
  • How did you install ConfigUpdater: pip
@FlorianWilhelm
Copy link
Member

Thanks @Delgan for posting this issue and providing a PR at the same time. It's highly appreciated.

@abravalheri: What is your take on this? Can we merge and make a new release? If yes, it would be release 3.2, right?

@Delgan
Copy link
Contributor Author

Delgan commented Oct 5, 2023

New release would be great!
I took the opportunity to create #123, perhaps it can also be integrated into the next release

@abravalheri
Copy link
Contributor

Apparently this behaviour of configparser only happens on Python 3.10+.

For Python < 3.10, the reproducer prints ['['], which is the same as configupdater.

To be honest the last 2 examples in the PR test are very counter intuitive to me, but since cpython itself is doing it, I guess we have to follow...

Something to be aware though is that on Python < 3.10, configupdater will differ from the configparser.

@abravalheri
Copy link
Contributor

abravalheri commented Oct 5, 2023

This also would result in a very unintuitive behaviour:

[section-A]  # comment telling that this section is related to [section-B]

Maybe we should open an issue in Cpython?

Never mind, if you instantiate ConfigParser with inline_comment_prefixes=('#', ';') the unintuitive behaviour goes away...

@Delgan
Copy link
Contributor Author

Delgan commented Oct 7, 2023

Apparently this behaviour of configparser only happens on Python 3.10+.

Oh, good catch, I didn't even realize that.

Something to be aware though is that on Python < 3.10, configupdater will differ from the configparser.

We could build something around ConfigParser.SECTCRE regex which is public, but that requires more changes to the current implementation.

Maybe we should open an issue in Cpython?

Never mind, if you instantiate ConfigParser with inline_comment_prefixes=('#', ';') the unintuitive behaviour goes away...

It would like the behavior of CPython to be changed as well.

However, we made the assumption that [section-A] # Some comment was valid to begin with, while CPython makes no such promise. By default, comments are only allowed on empty lines.

Although it would still make sense to stop the section name when encountering a closing bracket matching the opening one, this behavior can't be implemented with a Python regex (it requires recursion), but as the regex is public (see above), it would be a breaking change for a feature not even meant to be supported...

@abravalheri
Copy link
Contributor

abravalheri commented Oct 8, 2023

Being pragmatic, I think we can just be "forward"- compatible.

I am happy that if the users specify inline_comment_prefixes, the library works as expected. We could simply add a recommendation for instantiating the object that way in the docs.

The PR looks good to me, @FlorianWilhelm . My impression is that it is 100% compatible with the Python 3.10 code base. I haven't checked with tox -e py310-diff though.

@FlorianWilhelm
Copy link
Member

FlorianWilhelm commented Oct 12, 2023

Thanks @abravalheri, feel free to merge and close this issue :-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants