
[k8sprocessor] Handle resource deletion on DeletedFinalStateUnknown #1277

Closed
c-kruse opened this issue Oct 9, 2023 · 1 comment · Fixed by #1278
Labels: bug (Something isn't working)

c-kruse (Contributor) commented Oct 9, 2023

The k8sprocessor uses a cache to store cluster resources and to update its model on change. In a degraded state, this cache can become disconnected from the apiserver and miss deletion watch events. The cache eventually reconciles this and notifies the processor of the deletion, passing the resource's last known state wrapped in a cache.DeletedFinalStateUnknown tombstone. The processor currently mishandles this notification.
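For reference, the standard client-go pattern is to unwrap the tombstone in the delete handler before asserting the concrete type. A minimal sketch (the handler name and Pod type here are illustrative, not the processor's actual code):

```go
import (
	api_v1 "k8s.io/api/core/v1"
	"k8s.io/client-go/tools/cache"
)

// handleDelete sketches a delete handler that tolerates tombstones.
func handleDelete(obj interface{}) {
	pod, ok := obj.(*api_v1.Pod)
	if !ok {
		// The watch missed the delete event; the informer delivers a
		// tombstone carrying the object's last known state instead.
		tombstone, ok := obj.(cache.DeletedFinalStateUnknown)
		if !ok {
			return // neither a Pod nor a tombstone; nothing to do
		}
		pod, ok = tombstone.Obj.(*api_v1.Pod)
		if !ok {
			return // tombstone held an unexpected type
		}
	}
	// Proceed with the deletion using the pod's last known state.
	_ = pod
}
```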

Uncovered in #1267.

It is unclear to me whether this particular issue is a primary contributing factor to the memory consumption problems observed in #1267.

c-kruse added the bug label on Oct 9, 2023

c-kruse (Contributor, Author) commented Oct 10, 2023

FWIW, I can reliably reproduce this error for Pods, but I can't seem to hit the analogous case for the (owning) resources. If that case is reachable, I'm fairly sure it would cause a panic.

EDIT: confirmed. I just needed to be a bit more patient.

panic: interface conversion: interface {} is cache.DeletedFinalStateUnknown, not *v1.Endpoints

goroutine 75 [running]:
github.com/open-telemetry/opentelemetry-collector-contrib/processor/k8sprocessor/kube.(*OwnerCache).genericEndpointOp(0x721d664?, {0x669e460?, 0xc003c3a3e0?}, 0xc002da7100?)
        github.com/open-telemetry/opentelemetry-collector-contrib/processor/[email protected]/kube/owner.go:417 +0x267
github.com/open-telemetry/opentelemetry-collector-contrib/processor/k8sprocessor/kube.(*OwnerCache).deleteEndpoint(0xc0029b6c60, {0x669e460, 0xc003c3a3e0})
        github.com/open-telemetry/opentelemetry-collector-contrib/processor/[email protected]/kube/owner.go:435 +0x1ca
github.com/open-telemetry/opentelemetry-collector-contrib/processor/k8sprocessor/kube.(*OwnerCache).addOwnerInformer.func3({0x669e460?, 0xc003c3a3e0?})
        github.com/open-telemetry/opentelemetry-collector-contrib/processor/[email protected]/kube/owner.go:320 +0x25
github.com/open-telemetry/opentelemetry-collector-contrib/processor/k8sprocessor/kube.(*OwnerCache).addOwnerInformer.(*OwnerCache).deferredDelete.func4.1()
        github.com/open-telemetry/opentelemetry-collector-contrib/processor/[email protected]/kube/owner.go:298 +0x22
github.com/open-telemetry/opentelemetry-collector-contrib/processor/k8sprocessor/kube.(*OwnerCache).deleteLoop(0xc0029b6c60, 0x0?, 0x0?)
        github.com/open-telemetry/opentelemetry-collector-contrib/processor/[email protected]/kube/owner.go:514 +0x110
created by github.com/open-telemetry/opentelemetry-collector-contrib/processor/k8sprocessor/kube.newOwnerProvider in goroutine 1
        github.com/open-telemetry/opentelemetry-collector-contrib/processor/[email protected]/kube/owner.go:121 +0x1d3
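The trace points at a direct type assertion in genericEndpointOp (owner.go:417). Schematically, the failure mode is just the following (the signature is a guess from the trace, not the exact source):

```go
func (c *OwnerCache) genericEndpointOp(obj interface{}, op func(*api_v1.Endpoints)) {
	// Unchecked assertion: panics with "interface conversion" when obj is a
	// cache.DeletedFinalStateUnknown tombstone instead of *v1.Endpoints.
	endpoints := obj.(*api_v1.Endpoints)
	op(endpoints)
}
```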
