-
Notifications
You must be signed in to change notification settings - Fork 4.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[#32498][prism] Add split / progress back off + catch-up. #32526
Conversation
Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control. If you'd like to restart, comment |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #32526 +/- ##
=========================================
Coverage 58.88% 58.89%
Complexity 3071 3071
=========================================
Files 1129 1129
Lines 173992 174018 +26
Branches 3329 3329
=========================================
+ Hits 102463 102490 +27
- Misses 68182 68185 +3
+ Partials 3347 3343 -4
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
@@ -220,12 +248,28 @@ progress: | |||
Data: residuals, | |||
}) | |||
} | |||
|
|||
// Any split means we're processing slower than desired, but splitting should increase |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Non blocking question. How does split translate to processing slower than desired?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great question.
That can be answered by looking at the issue being fixed, and seeing the behavior.
In this case, the file was being opened repeatedly and endlessly*.
By splitting too quickly, we end up doing more work, serializing and deserializing the elements in the set of elements to be processed by the SDK. So we weren't letting the SDK actually get any work done
This meant that because we were slow to open the file, we opened the file, again and again in different bundles.
*Eventually there would be nothing left to split, and lines would be emitted, but it would have been very wasteful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Prism splits too aggressively in low latency scenarios, which for textio can cause splitting to single byte inputs most of which don't output anything.
While it would be nice to make this configurable, that is more plumbing than I'd like to do right now. Instead we add some per stage back off and catchup.
The stages are all independant, so different stages will get different progress intervals over time. If bundles are always making progress, splitting behavior is unchanged (biased to not split).
This allows for larger latencies to first input, while not hampering separation harness tests, or otherwise overdoing it.
There still remains the same bad case, of time between elements being larger than 2* maximumProgressInterval, but to resolve that likely requires a bit more global state awareness, and not splitting when maximum parallelism has been reached. But that would be a different change.
Fixes #32498.
Tested against the Wordcount example with tightened minimal timings, and sleeps. Once this is in the following command should be able to validate that it's fixed the issue for the higher latency cases.
Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:
addresses #123
), if applicable. This will automatically add a link to the pull request in the issue. If you would like the issue to automatically close on merging the pull request, commentfixes #<ISSUE NUMBER>
instead.CHANGES.md
with noteworthy changes.See the Contributor Guide for more tips on how to make review process smoother.
To check the build health, please visit https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md
GitHub Actions Tests Status (on master branch)
See CI.md for more information about GitHub Actions CI or the workflows README to see a list of phrases to trigger workflows.