forked from kubernetes-sigs/scheduler-plugins
[release-4.14] resync 20240529 #202
Merged
Bump NRT API package to v0.1.2; there is no API change, but we now have a better replacement for the internal `getID` helper, which we can therefore remove. Signed-off-by: Francesco Romani <[email protected]> (cherry picked from commit 8d9a4cd)
"Host-level" resources are resources which are not expected to have NUMA affinity. This means that the absence of these resources from the per-NUMA resource counters should not, by itself, prevent scheduling on a given node. Signed-off-by: Francesco Romani <[email protected]> (cherry picked from commit f7057da)
We call "NUMA-affine" resources the compute resources, like CPU and memory/hugepages, which we know expose NUMA affinity. This is another attempt to factor this logic into a central place. Signed-off-by: Francesco Romani <[email protected]> (cherry picked from commit c48f462)
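To make the two categories above concrete, here is a minimal Go sketch of how such a classification could be centralized. The helper names `isNUMAAffine` and `isHostLevel` are hypothetical and are not taken from the plugin; resource names are plain strings rather than the Kubernetes API types, and treating only `ephemeral-storage` as host-level is an assumption made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// isNUMAAffine reports whether a resource is known to always expose NUMA
// affinity. Hypothetical helper: it covers CPU, memory and hugepages, the
// resources named in the commit message.
func isNUMAAffine(name string) bool {
	return name == "cpu" || name == "memory" || strings.HasPrefix(name, "hugepages-")
}

// isHostLevel reports whether a resource is not expected to have NUMA
// affinity. Assumption: only ephemeral-storage is listed here.
func isHostLevel(name string) bool {
	return name == "ephemeral-storage"
}

func main() {
	names := []string{"cpu", "memory", "hugepages-2Mi", "ephemeral-storage", "example.com/device"}
	for _, name := range names {
		// Devices fall in neither category: they may or may not be NUMA-affine.
		fmt.Printf("%-20s numaAffine=%-5v hostLevel=%v\n", name, isNUMAAffine(name), isHostLevel(name))
	}
}
```

Keeping both predicates in one place means callers do not have to repeat string comparisons, which appears to be the motivation stated in the commit message.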
Rewrite the accounting of NUMA-local resources when using scope=container. The previous code was too lenient and worked mostly by side effects when dealing with non-NUMA-affine resources. A non-NUMA-affine resource (aka a host-level resource) is a resource which is not guaranteed to always have a NUMA affinity. CPU and memory (incl. hugepages) always do, but devices may or may not; both options are legal for device plugins. Similarly, ephemeral storage is a prominent example of a resource which should never have a NUMA affinity. The accounting in this case was wrong because the resource was previously considered NUMA-affine. Note: it is legal to configure topology updaters (e.g. NFD) to not advertise CPU and memory in NRT objects. Thus it is best to treat their absence as a warning, not as a blocking error. However, if the per-NUMA counters of NUMA-affine resources go negative, this is definitely an error condition which we need to detect and be very loud about. Signed-off-by: Francesco Romani <[email protected]> (cherry picked from commit e9b8aa4)
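The accounting rules described above can be sketched as follows. This is a simplified illustration, not the plugin's code: `resourceList`, `isNUMAAffine` and `subtractFromNUMA` are hypothetical names, quantities are plain `int64` values, and treating any reported counter that would go negative as an error (not only the NUMA-affine ones) is a choice made for this sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// resourceList maps a resource name to a quantity (hypothetical, simplified type).
type resourceList map[string]int64

// isNUMAAffine covers CPU, memory and hugepages (hypothetical helper).
func isNUMAAffine(name string) bool {
	return name == "cpu" || name == "memory" || strings.HasPrefix(name, "hugepages-")
}

// subtractFromNUMA subtracts container requests from a copy of one NUMA
// zone's counters. Resources the zone does not report are skipped; a warning
// is recorded only when the missing resource is NUMA-affine. A counter that
// would go negative is reported as an error.
func subtractFromNUMA(zone, requests resourceList) (resourceList, []string, error) {
	out := make(resourceList, len(zone))
	for name, qty := range zone {
		out[name] = qty
	}
	var warnings []string
	for name, qty := range requests {
		avail, reported := out[name]
		if !reported {
			if isNUMAAffine(name) {
				warnings = append(warnings, fmt.Sprintf("NUMA-affine resource %q not reported in zone", name))
			}
			continue // e.g. ephemeral-storage: never blocks by itself
		}
		if avail-qty < 0 {
			return nil, warnings, fmt.Errorf("resource %q would go negative: available %d, requested %d", name, avail, qty)
		}
		out[name] = avail - qty
	}
	return out, warnings, nil
}

func main() {
	zone := resourceList{"cpu": 4, "memory": 1024}
	left, warnings, err := subtractFromNUMA(zone, resourceList{"cpu": 2, "ephemeral-storage": 500})
	fmt.Println(left, warnings, err)
}
```

Working on a copy keeps the caller's counters untouched when an error is returned, so a failed container does not leave the zone in a half-updated state.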
openshift-ci bot added the do-not-merge/work-in-progress label (Indicates that a PR should not merge because it is a work in progress.) May 29, 2024
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: ffromani. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
openshift-ci bot added the approved label (Indicates a PR has been approved by an approver from all required OWNERS files.) May 29, 2024
The ephemeral storage resource is not a deciding factor for noderesourcetopology filtering, but it was accounted for incorrectly, causing bad scheduling decisions. First, we add some integration test coverage to catch these issues. Signed-off-by: Francesco Romani <[email protected]> (cherry picked from commit e3388b9)
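As an illustration of the behavior the new tests are meant to protect, here is a small Go sketch of a filter that ignores ephemeral storage when checking NUMA zones. It is not the plugin's filter: `zoneFits` and `nodeFits` are hypothetical names, and the "fits if any single zone can hold all reported resources" rule is a simplifying assumption.

```go
package main

import "fmt"

// resourceList maps a resource name to a quantity (hypothetical, simplified type).
type resourceList map[string]int64

// zoneFits checks the requests against one NUMA zone. Ephemeral storage is
// skipped unconditionally, and so is any resource the zone does not report.
func zoneFits(zone, requests resourceList) bool {
	for name, qty := range requests {
		if name == "ephemeral-storage" {
			continue // never a deciding factor for NUMA filtering
		}
		avail, reported := zone[name]
		if !reported {
			continue
		}
		if avail < qty {
			return false
		}
	}
	return true
}

// nodeFits returns true when at least one zone can satisfy the requests.
func nodeFits(zones []resourceList, requests resourceList) bool {
	for _, zone := range zones {
		if zoneFits(zone, requests) {
			return true
		}
	}
	return false
}

func main() {
	zones := []resourceList{
		{"cpu": 2, "memory": 512},
		{"cpu": 8, "memory": 4096},
	}
	pod := resourceList{"cpu": 4, "memory": 1024, "ephemeral-storage": 1 << 30}
	fmt.Println("fits:", nodeFits(zones, pod))
}
```

A regression test in this style would request a large amount of ephemeral storage and assert that the scheduling outcome matches the same request without it.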
Signed-off-by: Francesco Romani <[email protected]>
add compatibility fixes to deal with older codebase, non-backported patches and older k8s libs. Signed-off-by: Francesco Romani <[email protected]>
record targeted cherry picks on top of latest rebase Signed-off-by: Francesco Romani <[email protected]>
ffromani force-pushed the resync-20240529-4.14 branch from 0917b3a to 8acae4b (May 29, 2024 14:57)
ffromani changed the title from "WIP: [release-4.14] resync 20240529" to "[release-4.14] resync 20240529" (May 30, 2024)
openshift-ci bot removed the do-not-merge/work-in-progress label (Indicates that a PR should not merge because it is a work in progress.) May 30, 2024
/hold we need to verify more recent backports before |
openshift-ci bot added the do-not-merge/hold label (Indicates that a PR should not merge because someone has issued a /hold command.) May 30, 2024
/hold cancel
openshift-ci bot removed the do-not-merge/hold label (Indicates that a PR should not merge because someone has issued a /hold command.) Jun 4, 2024
resync to consume fixes to ephemeral storage
add API fixes because the cherry-picked commits were made against a much more modern codebase (and k8s libs)