Separate the control and compute loops, shorten the critical path, and enable more complicated policies #1287
…rations to memory pools
…ng performance for certain scenarios
66b98d5 to f4be82c
What is the status of this PR? Do we still plan to merge it?
I think so, @hnyls2002 did a fix on the case of
Moved from #1182. @xiezhq-hermann @hnyls2002
(Sorry, I accidentally closed the original PR.)
Motivation
The existing scheduler design couples control logic and model computation. While this simplifies the implementation, the scheduling overhead can sometimes be non-negligible, especially when interacting with the radix tree.
This PR decouples the control and compute loops into separate threads, overlapping computation with tasks that do not need to be on the critical path. This reduces the scheduling overhead and allows more complicated policies to be implemented on the control plane in the future.
Based on the latest CI benchmark result, this PR yields a throughput gain of about 10% by reducing the overhead on the critical path.
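As a rough illustration of the idea (not the code in this PR), the sketch below runs a control loop and a compute loop in separate threads connected by a bounded queue, so batch preparation can overlap with computation. The function name `run_decoupled`, the `batch_size` parameter, and the string-length "computation" are all hypothetical stand-ins chosen for the example.

```python
import queue
import threading


def run_decoupled(requests, batch_size=2):
    """Hypothetical sketch: a control thread forms batches, a compute thread runs them."""
    # Small bounded buffer so scheduling can run slightly ahead of compute.
    batches = queue.Queue(maxsize=2)
    results = []
    stop = object()  # sentinel telling the compute loop to exit

    def control_loop():
        # Stand-in for scheduling work (e.g. prefix matching, memory bookkeeping).
        for i in range(0, len(requests), batch_size):
            batches.put(requests[i:i + batch_size])
        batches.put(stop)

    def compute_loop():
        # Stand-in for the model forward pass; it only waits on ready batches.
        while True:
            batch = batches.get()
            if batch is stop:
                break
            results.append([len(r) for r in batch])

    threads = [
        threading.Thread(target=control_loop),
        threading.Thread(target=compute_loop),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Calling `run_decoupled(["a", "bb", "ccc"])` processes two batches in order. With a single producer and a single consumer, the queue preserves batch order, which keeps the sketch simple; a real scheduler would also need to pass results back to the control plane.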
Modifications
Checklist