NodeFeature resource should not be owned by Pod #1817
The reason is garbage collection: keeping the cluster clean of stale NodeFeature objects after uninstallation of NFD (and, e.g., preventing another instance of NFD from picking them up).
There were issues in v0.16.0, but those should now be mitigated in the latest v0.16.3.
I can see your point. This is now mitigated so that the NodeFeature objects are owned by both the Pod and the DaemonSet. NodeFeatures are only GC'd if you uninstall the DaemonSet. What we could do is add something like
A namespaced object (NodeFeature) cannot be "owned" (as by ownerReference) by a cluster-scoped object (Node).
Sorry I forgot to reply.
Are you sure? I think it's the other way around. :)
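For reference, the Kubernetes documentation on owners and dependents states the rule this way: namespaced dependents may specify cluster-scoped or same-namespace owners, while cluster-scoped dependents may only specify cluster-scoped owners. Below is a minimal Python sketch of that rule. The `Obj` type, the `owner_allowed` helper, and the object and namespace names are all hypothetical, made up for illustration; this is not NFD or Kubernetes code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Obj:
    """Toy stand-in for a Kubernetes object (hypothetical, illustration only)."""
    kind: str
    name: str
    namespace: Optional[str] = None  # None means cluster-scoped


def owner_allowed(dependent: Obj, owner: Obj) -> bool:
    """Check the documented scope rule for ownerReferences."""
    if owner.namespace is None:
        # A cluster-scoped owner is valid for any dependent.
        return True
    # A namespaced owner must live in the dependent's namespace, which also
    # rules out cluster-scoped dependents (their namespace is None).
    return dependent.namespace == owner.namespace


# Example names below are assumptions, not taken from a real cluster.
node = Obj("Node", "worker-1")
node_feature = Obj("NodeFeature", "worker-1", "node-feature-discovery")
pod = Obj("Pod", "nfd-worker-abc", "node-feature-discovery")

print(owner_allowed(node_feature, node))  # namespaced dependent, cluster-scoped owner
print(owner_allowed(node, pod))           # cluster-scoped dependent, namespaced owner
```

Under this rule, a Node owning a NodeFeature is the permitted direction; a namespaced object owning a Node is not.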
IMO this should be the default. Many serious companies rely on persistence of node labels for scheduling-time decisions, as there's no way to select a particular group of nodes (pools) some other way in a heterogeneous cluster.
If I'm uninstalling NFD, I delete the CRD, which deletes the CRs of that type. I don't follow why this is a path the project needs to handle.
Hello, I'm trying to understand why the NodeFeature resource is owned by DaemonSet Pods.
We've been using 0.4.0 and noticed that 0.16.x brings a ton of architectural changes. These changes seem to lead to removals under some scenarios like #1802, with removal through the Kubernetes garbage collector being another.
In our use case, removal of node labels at any time is unacceptable since we heavily rely on `nodeSelector`s for scheduling (or preventing scheduling), program controllers to avoid touching nodes with certain labels, etc. So things like the point-in-time absence of a DaemonSet pod causing removal of the NodeFeature are a problem for us.
Is there a reason why NodeFeature is owned by the nfd-worker `Pod`, say, instead of the `Node`?
cc: @lxlxok
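To make the garbage-collection concern concrete, here is a toy Python model of how owner references behave: the Kubernetes garbage collector removes a dependent only once none of its listed owners exists anymore. The `should_collect` function and the UID strings are hypothetical, for illustration only; this is a simplified sketch, not the actual collector logic.

```python
from typing import Iterable, Set


def should_collect(owner_uids: Iterable[str], live_uids: Set[str]) -> bool:
    """Return True only if the object has owners and none of them still exists."""
    owners = list(owner_uids)
    if not owners:
        # Objects without ownerReferences are never collected this way.
        return False
    return not any(uid in live_uids for uid in owners)


# NodeFeature owned only by the worker Pod (hypothetical UIDs).
pod_only = ["pod-uid"]
# NodeFeature owned by both the Pod and the DaemonSet.
pod_and_daemonset = ["pod-uid", "daemonset-uid"]

# The Pod is momentarily gone, but the DaemonSet still exists.
live = {"daemonset-uid"}

print(should_collect(pod_only, live))           # True
print(should_collect(pod_and_daemonset, live))  # False
print(should_collect(pod_and_daemonset, set())) # True
```

With a single Pod owner, a brief absence of the Pod is enough to make the NodeFeature eligible for collection; with the DaemonSet as an additional owner, it only becomes eligible once the DaemonSet itself is gone.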