-
Notifications
You must be signed in to change notification settings - Fork 574
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
i/prompting: render path patterns variants using recursive descent parser #14059
i/prompting: render path patterns variants using recursive descent parser #14059
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks much nicer this way
0c8702c
to
2939985
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please join some of the files in patterns, one struct per file is a bit too much
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few questions for @zyga regarding the original code :)
needs a rebase now |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did a quick scan and this looks good to me. Since the PR is huge I will do a full pass later not to cannibalise other reviews.
Path patterns may include arbitrary nested groups, delimited by '{' and '}', up to a limit on the total number of groups. In order to compare the precedence of path patterns which match a given path, these path patterns must be expanded until no groups remain, and thus the particular group-free patterns which was resolved from the original patterns when matching the path can be compared. Signed-off-by: Oliver Calder <[email protected]>
Rather than separately validate and expand path patterns, storing the result as a list of expanded patterns, parse a pattern into a PathPattern type, which can dynamically render expanded path patterns as needed with minimal overhead. When path patterns are received from prompting clients, path patterns can be unmarshalled and automatically validated, and any future use of the pattern in-memory can use the pre-parsed PathPattern to iterate through expanded path patterns without needing to explicitly expand and store all path patterns. Additionally, the new PathPattern type should be used in Constraints in place of the old path pattern string. Signed-off-by: Oliver Calder <[email protected]>
Rather than keep separate stacks for the sequences and paths which the parser is currently inside, instead keep a single stack, to which the existing sequence and a new group is added whenever a '{' rune is encountered. Then there is no need to no need for a variable to hold the current group, peeking the stack yields the most recent group, to which the current sequence can be added whenever a ',' or '}' is encountered. When a '}' is encountered, the most recent group is popped off the stack, the current sequence is added to it (completing the group), and then the previous sequence is popped off the stack and the completed group is added to it. From there, that previous sequence is now considered the current sequence until another '{' is encountered. Signed-off-by: Oliver Calder <[email protected]>
Signed-off-by: Oliver Calder <[email protected]>
Signed-off-by: Oliver Calder <[email protected]>
Signed-off-by: Oliver Calder <[email protected]>
…erns Co-authored-by: Oliver Calder <[email protected]> Signed-off-by: Oliver Calder <[email protected]>
Signed-off-by: Oliver Calder <[email protected]>
Signed-off-by: Oliver Calder <[email protected]>
Signed-off-by: Oliver Calder <[email protected]>
Signed-off-by: Oliver Calder <[email protected]>
225d8f9
to
048b233
Compare
…te naming Signed-off-by: Oliver Calder <[email protected]>
Add a reference to the `renderNode` used to generate a given `variantState` to that state itself. This allows methods on `variantState` to be called without needing to pass as a parameter the same `renderNode` which was used to generate the `variantState`. Also, move the `Render` function to be a method on `variantState` instead of `renderNode`. This makes sense semantically, since we render particular variants, rather than nodes themselves, and makes sense ergonomically since we now have a reference to the `renderNode` within each `variantState`, so there is no need to pass parameters around for nodes and variants which are required to be associated anyway. Signed-off-by: Oliver Calder <[email protected]>
Signed-off-by: Oliver Calder <[email protected]>
Signed-off-by: Oliver Calder <[email protected]>
…s are exhausted Signed-off-by: Oliver Calder <[email protected]>
…files Signed-off-by: Oliver Calder <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thank you for the changes, some more comments
Signed-off-by: Oliver Calder <[email protected]>
Add comment as such to the `literal` type definition, and have `literal.NextVariant` return length 0 to make it consistent with other `variantState` types. Signed-off-by: Oliver Calder <[email protected]>
Signed-off-by: Oliver Calder <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks
Signed-off-by: Oliver Calder <[email protected]>
if pattern[0] != '/' { | ||
return fmt.Errorf("%s: pattern must start with '/'", prefix) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this could be caught by the parser at level 0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't want to interrogate inside text tokens once they've been tokenized, and there is precedent for scanner throwing errors about invalid characters, so I think I'd rather catch this by the scanner. But I agree, not in the calling function in patterns.go.
if pattern == "" { | ||
return fmt.Errorf("%s: pattern has length 0", prefix) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
parser could complain about no tokens
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be a scanner check as well, as part of checking that the first character is '/'
if strings.HasSuffix(pattern, "/") && !strings.HasSuffix(path, "/") { | ||
return false, nil | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you're discarding match result, then could this be done earlier?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The first match result happens first to check that everything is valid, and throw an error if not. Then (assuming there was not an error), the match result is used to see if the pattern as-is matches the path. If there's no match, we then check if the pattern would match if it ended in a '/'
. So the match result isn't really discarded, just some reorganization to handle the error first.
But the pattern should be known to be valid, since we rendered it, so perhaps this check could happen before the first match. And this would save the computation time of doublestar.Match
if we know it doesn't match (by our standards).
totalLength := 0 | ||
|
||
for _, element := range v.elements { | ||
totalLength += element.Length() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like this may be getting called mulitple times, any way we could store totalLength in seqVariant to avoid redoing all the the work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think caching length of the current variant for a seqVariant
helps, as literal
and altVariant
are O(1)
(O(depth)
total in variants). More importantly we nearly always if not always need to know the length immediately after calling InitialVariant
, so we might as well return it as well, meaning we almost never need to call Length()
directly -- only when we've changed a variant partway through a seq and need to compute the total length again -- but this is cached now, so much faster.
Signed-off-by: Oliver Calder <[email protected]>
Signed-off-by: Oliver Calder <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
// parse validates the given pattern and parses it into a PathPattern from | ||
// which expanded path patterns can be iterated, overwriting the receiver. | ||
func (p *PathPattern) parse(pattern string) error { | ||
prefix := fmt.Sprintf("cannot parse path pattern %q", pattern) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
prefix := fmt.Sprintf("cannot parse path pattern %q", pattern) |
func ParsePathPattern(pattern string) (*PathPattern, error) { | ||
pathPattern := &PathPattern{} | ||
if err := pathPattern.parse(pattern); err != nil { | ||
return nil, err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return nil, err | |
return nil, fmt.Errorf("cannot parse path pattern %q: %w", pattern, err) |
end even then I would question whether you need this prefix at all, since the caller knows the pattern already
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want the parse
method to handle its error messages rather than ParsePathPattern
, since parse
is called by UnmarshalJSON
and I think the error should be properly prefixed there as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, only one minor comment inline.
…dering Signed-off-by: Oliver Calder <[email protected]>
Parse path patterns using a recursive descent parser such that all possible choices of options within groups can be rendered into distinct variants of the pattern, where each variant has no groups remaining. This parsing occurs when patterns are unmarshalled from JSON, so the parsing only needs to occur once.
This is done using a dedicated
patterns
package, which internally includes a tokenizer, parser, and renderer.This is an alternative to #13867, and also relates to #13866 and #13868, as well as SNAPDENG-20094.