Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

i/prompting: implement path pattern matching #13866

Conversation

olivercalder
Copy link
Member

@olivercalder olivercalder commented Apr 20, 2024

Path pattern matching is implemented via the doublestar package, which emulates bash's globstar matching. Patterns may include '*' wildcard characters (which match any number of non-separator characters), '**' doublestars (which match zero or more subdirectories), '?' wildcard characters (which match exactly one non-separator character), and nested groups delimited by '{' and '}'. Notably, path patterns are not allowed to have character classes delimited by '[' and ']', nor inverted classes of the form "[^abc]".

There is a limit on the number of groups allowed in path patterns, but up to that limit, groups may be arbitrarily nested or sequential.

This PR is split from #13730. This work is tracked internally by SNAPDENG-18541.

interfaces/prompting/patterns.go Outdated Show resolved Hide resolved
break
}
switch r {
case '{':
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'm confused even more now. The spec does not mention {} for path patterns, so why validate this? Also, there's inconsistency between ValidatePathPattern and MatchPathPatter, where the latter is oblivious to {}, which means that a definition of a valid pattern is different for each of those calls.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MatchPathPattern directly calls doublestar.Match to do matching (discussed some here: https://docs.google.com/document/d/1tBnefdukP69EUJOlH8bgD2hrvZCYoE8-1ZlqRRYlOqc/edit?pli=1#heading=h.a0e0vfqho1ip). This should be called on expanded patterns, but this isn't technically necessary, since doublestar.Match happily matches patterns with '{' '}' groups in them (see: https://pkg.go.dev/github.com/bmatcuk/doublestar#Match). Path pattern validation should occur on patterns before they are expanded, so they may contain '{' '}'-defined groups.

}
case '\\':
// Skip next rune
_, _, err = reader.ReadRune()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't the escape sequence valid only for some of the characters that follow? eg what would a pattern like \c mean?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to doublestar.Match (https://pkg.go.dev/github.com/bmatcuk/doublestar#Match), \c matches c for any character c. I'm not adjusting that behavior at all, at the moment.

Comment on lines +98 to +135
matched, err := doublestar.Match(pattern, path)
if err != nil {
return false, err
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wouldn't the pattern be validated prior to being used? Perhaps ValidatePathPatter should call doublestar.Match with say "/" as path, discard the result and only check the error?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doublestar.ValidatePattern is called as part of ValidatePathPattern, so the pattern should be valid. This error should thus never occur when the PathPatternMatch is called in practice, but it's a check in case PathPatternMatch is called in isolation without being validated. So I'm inclined to leave the error return in place rather than discarding the error, for completeness.

@olivercalder olivercalder force-pushed the prompting-path-pattern-matching-validation branch from 1cab837 to 6e34c06 Compare May 2, 2024 04:15
interfaces/prompting/patterns.go Outdated Show resolved Hide resolved
interfaces/prompting/patterns.go Outdated Show resolved Hide resolved
@olivercalder olivercalder force-pushed the prompting-path-pattern-matching-validation branch from d414d4d to eff5c6d Compare May 25, 2024 04:52
@olivercalder
Copy link
Member Author

Note: the path pattern validation in this PR may be no longer relevant due to changes in #13867, which was intended to be based on this PR but has now grown to replace manual path pattern validation with path pattern parsing, which validates and expands path patterns.

@olivercalder
Copy link
Member Author

@bboozzoo or @zyga please check whether I've added the doublestar dependency properly in packaging/fedora/snapd.spec and packaging/debian-sid/control. In particular, whether the /v4 should be included or not. Thanks!

Path pattern matching is implemented via the doublestar package, which
emulates bash's globstar matching. Patterns may include '*' wildcard
characters (which match any number of non-separator characters), '**'
doublestars (which match zero or more subdirectories), '?' wildcard
characters (which match exactly one non-separator character), and nested
groups delimited by '{' and '}'. Notably, path patterns are *not* allowed
to have character classes delimited by '[' and ']', nor inverted
classes of the form "[^abc]".

There is a limit on the number of groups allowed in path patterns, but
up to that limit, groups may be arbitrarily nested or sequential.

Signed-off-by: Oliver Calder <[email protected]>

i/prompting: fix typo and add notes to remove test boilerplate

Signed-off-by: Oliver Calder <[email protected]>

i/prompting: use separate test suite for patterns

Signed-off-by: Oliver Calder <[email protected]>

i/prompting: improve unit test coverage

Signed-off-by: Oliver Calder <[email protected]>
Rather than counting the number of groups and using it as a heuristic
for the number of patterns into which a given path pattern will expand,
instead compute the true number of expanded patterns and compare it
against a set limit.

Signed-off-by: Oliver Calder <[email protected]>
@olivercalder olivercalder force-pushed the prompting-path-pattern-matching-validation branch from deff68c to e841e60 Compare June 13, 2024 03:41
@MiguelPires
Copy link
Contributor

@olivercalder what's the status of this PR? Can it be reviewed now?

@olivercalder
Copy link
Member Author

olivercalder commented Jun 13, 2024

@MiguelPires this PR is the basis for two PRs with possible implementations of path pattern expansion: #13867 and #14059. Both strip out the validation function from this PR, so it's possible/likely that this PR won't be merged on its own. Alternatively, I could strip out the validation function and no longer rebase the other PRs on this one when it changes. I think that might be the best approach.

So it's ready for review, just ignore the validation part :)

@bboozzoo
Copy link
Contributor

I could strip out the validation function and no longer rebase the other PRs on this one when it changes. I think that might be the best approach.

Please do, it'll make it easier to review the actual thing that you want to land.

@olivercalder olivercalder changed the title i/prompting: implement path pattern matching and validation i/prompting: implement path pattern matching Jun 13, 2024
Copy link
Contributor

@bboozzoo bboozzoo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

// `/**/`: `/foo/**`, and `/foo/**/` both match `/foo` and `/foo/`. We want to
// override this behavior to make `/foo/**/` not match `/foo`. We also want to
// override doublestar to make `/foo` match `/foo/`.
func PathPatternMatch(pattern string, path string) (bool, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a blocker, but just a random idea as there's no control over what is passed as a pattern, but if the pattern expansion resulted in an opaque type eg struct with unexported fields, then you could guarantee that what is passed into this function as the pattern parameter is not a random string.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with this point, but it can be addressed together with the parsing/validation code

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may relate to precedence as well, if we want to do something a bit more clever than comparing patterns rune-by-rune (e.g. strip common prefixes).

@olivercalder olivercalder added the Squash-merge Please squash this PR when merging. label Jun 14, 2024
Copy link
Collaborator

@pedronis pedronis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks

// `/**/`: `/foo/**`, and `/foo/**/` both match `/foo` and `/foo/`. We want to
// override this behavior to make `/foo/**/` not match `/foo`. We also want to
// override doublestar to make `/foo` match `/foo/`.
func PathPatternMatch(pattern string, path string) (bool, error) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with this point, but it can be addressed together with the parsing/validation code

@pedronis pedronis merged commit 6905775 into canonical:master Jun 17, 2024
46 of 50 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Needs Samuele review Needs a review from Samuele before it can land Squash-merge Please squash this PR when merging.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants