-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
KEP-4816: DRA Prioritized Alternatives in Device Requests #4871
base: master
Are you sure you want to change the base?
KEP-4816: DRA Prioritized Alternatives in Device Requests #4871
Conversation
johnbelamaric
commented
Sep 24, 2024
•
edited
Loading
edited
- One-line PR description: adding new KEP for Prioritized Alternatives in Device Requests
- Issue link: DRA: Prioritized Alternatives in Device Requests #4816
- Other comments:
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: johnbelamaric The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/hold for all approvals |
@johnbelamaric can you redo the one-line PR description? It's a bit unclear. |
The proposal adds a new type, called `RankedDeviceRequest`, which allows the | ||
user to list `DeviceRequest`s, exactly one of which must be satisfied. The | ||
`DeviceClaim` then gets a new field listing all of these such requests that must | ||
be satisfied. There is no change to the existing `DeviceRequest` type. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This approach is the opposite of how we envisioned to add this feature. Adding a new field means that old consumers acting on a DeviceClaim will not see that new field. Instead of detecting that they are out-dated and cannot handle the claim, they will treat it like an empty claim.
Instead, the plan was to extend DeviceRequest
with a OneOf
slice and leave the rest of the fields unset, thus treating it like a "one-of": either the other fields are set or the new OneOf
, but never both. This depends on DeviceClassName
being required and current consumers checking that it is set, which they should.
We could reuse DeviceRequest
and define that the OneOf
slice must be empty when the DeviceRequest
is used inside a OneOf
(no nesting). The DeviceRequest.Name
is meaningful the same way as you describe it in the KEP for RankedDeviceRequest.Name
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. I had a similar train of thought but wasn't sure how we could handle it because of the current request fields being required. Are you saying we can relax that in validation? Then I think we can nest the current DeviceRequest fields all under a DeviceRequestDetail which we inline. Or just duplicate the fields.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you saying we can relax that in validation?
I think we can. It's the "organic" evolution of the API that we discussed previously.
Then I think we can nest the current DeviceRequest fields all under a DeviceRequestDetail which we inline. Or just duplicate the fields.
Of just use DeviceRequest
with the constraint that the nested instances must use the fields, not the OneOf
slice:
type DeviceRequest {
...
// +optional
// +oneOf=deviceRequestType
DeviceClassName string
...
//
// OneOf may only be set in the entries of DeviceClaim.Requests. It must not be set
// in DeviceRequest instances that themselves are part of a OneOf.
//
// +optional
// +oneOf=deviceRequestType
OneOf []DeviceRequest
}
Reusing DeviceRequest
might make some of the implementation simpler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Manual validation code will have to ensure that either DeviceClassName
or OneOf
are set.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made this change, I like it. It makes the change even smaller. The one thing I did differently is use the field name FirstOf
instead of OneOf
, which I think is more descriptive and also won't be confused with our other use of the term "one of".
@johnbelamaric: The following test failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
CRI or CNI may require updating that component before the kubelet. | ||
--> | ||
|
||
## Production Readiness Review Questionnaire |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@johnbelamaric Would you ping me when this is ready for review?
`DeviceClassName` was required, were requested to include this logic, and the | ||
in-tree components have been built in this way. | ||
|
||
```go |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's the diff, if it's useful during review:
diff --git a/pkg/apis/resource/types.go b/pkg/apis/resource/types.go
index 2c03a304fdf..93b96c3e4c8 100644
--- a/pkg/apis/resource/types.go
+++ b/pkg/apis/resource/types.go
@@ -397,7 +397,11 @@ type DeviceRequest struct {
// additional configuration and selectors to be inherited by this
// request.
//
- // A class is required. Which classes are available depends on the cluster.
+ // Either a class or FirstOf requests are required in DeviceClaim.Requests. When
+ // this request is part of the FirstOf list, a class is required. Nested FirstOf
+ // requests are not allowed
+ //
+ // Which classes are available depends on the cluster.
//
// Administrators may use this to restrict which devices may get
// requested by only installing classes with selectors for permitted
@@ -405,9 +409,19 @@ type DeviceRequest struct {
// then administrators can create an empty DeviceClass for users
// to reference.
//
- // +required
+ // +optional
+ // +oneOf=deviceRequestType
DeviceClassName string
+ // FirstOf contains subrequests, exactly one of which must be satisfied
+ // in order to satisfy this request. This field may only be set in the
+ // entries of DeviceClaim.Requests. It must not be set in DeviceRequest
+ // instances that themselves are part of a FirstOf.
+ //
+ // +optional
+ // +oneOf=deviceRequestType
+ FirstOf []DeviceRequest
+
// Selectors define criteria which must be satisfied by a specific
// device in order for that device to be considered for this
// request. All selectors must be satisfied for a device to be
@@ -457,7 +471,8 @@ type DeviceRequest struct {
}
const (
- DeviceSelectorsMaxSize = 32
+ DeviceSelectorsMaxSize = 32
+ FirstOfDeviceRequestMaxSize = 8
)
// | ||
// +optional | ||
// +oneOf=deviceRequestType | ||
FirstOf []DeviceRequest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't this call for a dedicated type that doesn't have this field?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're enforcing that in validation. We could do it as a separate type, if that's the consensus.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggested reusing the type because ... I am lazy. I expect that there will be code which needs to work with both DeviceRequest
and such a new type (for example, validation). Go generics doesn't help here because it doesn't support accessing common fields.
But I agree that a separate type is better because then the OpenAPI spec will not include an inner FirstOf
which can never be set.
used to circumvent quota. This reduces the usefulness of the feature, as it | ||
means it will not serve as a quota management feature. However, the primary goal | ||
of the feature is about flexibility across clusters and obtainability of | ||
underlying devices, not quota management. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hopefully kueue can help with that.
cc @kannon92
Consider including folks who also work outside the SIG or subproject. | ||
--> | ||
|
||
## Design Details |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs some implementation details for scheduler.
cc @dom4ha