Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

updated GCN circular regex matcher to expand matches #44

Merged
merged 8 commits into from
Jul 19, 2024

Conversation

Courey
Copy link
Contributor

@Courey Courey commented Jul 16, 2024

Description

This update expands the regex pattern to include:

  • things that start with a parenthesis: (GCN 25)
  • include an optional period after circ: GCN circ 25 AND GCN circ. 25
  • include an optional # before the ID: GCN nasa-gcn/gcn.nasa.gov#25

Related Issue(s)

Resolves #51

Testing

I looked into the cases pointed out by Tyler in the Issue. I used an existing circular that I added to the seed file. In this circular I added many different patterns in many different positions to show how it behaves in different cases. These are the results:
Screenshot 2024-07-16 at 4 44 49 PM

@Courey
Copy link
Contributor Author

Courey commented Jul 16, 2024

NOTE: I did NOT include this pattern:
GCN note [#016] I don't think it occurs very often and the conversation in the Issue mentioned updating the circular to bring the format in line with others. Let me know if you also want me to add this pattern or if it's okay for us to update the circular to make it GCN nasa-gcn/gcn.nasa.gov#16 instead.

Copy link
Member

@lpsinger lpsinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Please add test cases.
  • Why didn't it recognize text preceded by parentheses before?

@tylerbarna
Copy link
Member

resolves #51

@tylerbarna

This comment was marked as outdated.

@Courey
Copy link
Contributor Author

Courey commented Jul 17, 2024

  • Please add test cases.

    • Why didn't it recognize text preceded by parentheses before?

Because you have a lookback statement (?<=^|\\s) that specifically says that it must be preceded by the start of the string OR a space. So it won't match on a parenthesis because it is not a space and the start of the string is ( and not the specified pattern. Here is a side by side comparison. The one on the left is the updated version, the one on the right is the old version.
Screenshot 2024-07-17 at 1 36 00 PM

@lpsinger
Copy link
Member

  • Please add test cases.

    • Why didn't it recognize text preceded by parentheses before?

Because you have a lookback statement (?<=^|\\s) that specifically says that it must be preceded by the start of the string OR a space. So it won't match on a parenthesis because it is not a space and the start of the string is ( and not the specified pattern. Here is a side by side comparison. The one on the left is the updated version, the one on the right is the old version. Screenshot 2024-07-17 at 1 36 00 PM

OK, so why don't we add open parentheses to that lookback statement?

test.ts Outdated Show resolved Hide resolved
@Courey
Copy link
Contributor Author

Courey commented Jul 18, 2024

resolves nasa-gcn/remark-rehype-astro#51

This won't work because that issue is in the gcn repo not this repo so it will not auto link. I'll transfer the issue and it will get a new number.

@Courey Courey requested a review from lpsinger July 18, 2024 14:35
@Courey Courey requested a review from lpsinger July 18, 2024 14:51
test.ts Outdated Show resolved Hide resolved
@Courey Courey requested a review from lpsinger July 18, 2024 15:30
Copy link
Member

@lpsinger lpsinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks. I'll approve and merge once the CI finishes. It's running slowly now because there are a lot of queued pipelines from dependabot updates.

@lpsinger lpsinger force-pushed the courey/update_circular_regex branch from bfa0505 to b041a25 Compare July 18, 2024 16:48
@Courey
Copy link
Contributor Author

Courey commented Jul 18, 2024

The linter automatically makes changes and makes my intentionally large whitespace gap normally spaced, which breaks two of the tests in each file.

It also looks like carriage returns are messing thigns up in the json version on CI.

@lpsinger
Copy link
Member

The linter automatically makes changes and makes my intentionally large whitespace gap normally spaced, which breaks two of the tests in each file.

It also looks like carriage returns are messing thigns up in the json version on CI.

Yes, but #45 should fix the newline comparison issues on Windows. I just rebased it into your branch.

@Courey
Copy link
Contributor Author

Courey commented Jul 18, 2024

The linter automatically makes changes and makes my intentionally large whitespace gap normally spaced, which breaks two of the tests in each file.
It also looks like carriage returns are messing thigns up in the json version on CI.

Yes, but #45 should fix the newline comparison issues on Windows. I just rebased it into your branch.

It does not seem to have worked.

I also took the giant diff that it gave on the html file that it said was not matching and diffed them and could find no difference. I suspect that means it's an invisible whitespace character difference as well.

@lpsinger lpsinger force-pushed the courey/update_circular_regex branch from 975fa44 to 4a7376f Compare July 19, 2024 00:30
@lpsinger lpsinger merged commit d39d0e6 into nasa-gcn:main Jul 19, 2024
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

Automatic linking to other circulars fails in some instances
3 participants