ETCD becomes unavailable on update #449

Open
alexander-demicev opened this issue Sep 30, 2024 · 2 comments
Labels
priority/critical-urgent: Highest priority. Must be actively worked on as someone's top priority right now.
triage/accepted: Indicates an issue or PR is ready to be actively worked on.

Comments

@alexander-demicev
Member

See https://gitlab.com/sylva-projects/sylva-core/-/issues/1687 for all details describing the issue

@alexander-demicev added the priority/critical-urgent and triage/accepted labels Sep 30, 2024
@tmmorin

tmmorin commented Sep 30, 2024

thanks for opening this issue

one clarification: the symptom is less clear-cut than "etcd isn't available"... what we clearly see is some controllers failing to get leases from the API server, while other controllers apparently still can, and we aren't sure that reading from the API is an issue

there are "bad certificate" errors in the etcd logs, but they are possibly harmless (they relate to a node which is no longer an etcd cluster member)

so at this point we aren't really sure what the best issue title is ... ;-)
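To narrow down which controllers can still renew their leases and which cannot, one option is to compare each Lease's `renewTime` plus `leaseDurationSeconds` against a reference time. The following is only a minimal sketch: it assumes the Lease objects are available as a JSON list (for example from `kubectl get leases -A -o json`), and the function names `parse_micro_time` and `stale_leases` are hypothetical helpers, not part of any existing tool. Whether the CI dumps contain Lease objects in this form has not been checked.

```python
from datetime import datetime, timedelta


def parse_micro_time(value):
    """Parse a Kubernetes timestamp such as 2024-09-30T12:34:56.123456Z."""
    # Older Python versions do not accept a trailing "Z" in fromisoformat,
    # so rewrite it as an explicit UTC offset.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def stale_leases(lease_list, now, grace_seconds=0):
    """Return (namespace, name, holderIdentity) for leases that expired before `now`.

    `lease_list` is the parsed JSON of a Lease list (a dict with an "items" key).
    `now` must be a timezone-aware datetime; for an offline dump, pass the
    time at which the dump was taken rather than the current time.
    """
    stale = []
    for item in lease_list.get("items", []):
        spec = item.get("spec", {})
        renew = spec.get("renewTime")
        duration = spec.get("leaseDurationSeconds")
        if renew is None or duration is None:
            # Lease was never acquired or has no duration; nothing to compare.
            continue
        expires = parse_micro_time(renew) + timedelta(seconds=duration + grace_seconds)
        if expires < now:
            meta = item.get("metadata", {})
            stale.append((meta.get("namespace"), meta.get("name"), spec.get("holderIdentity")))
    return stale
```

A lease reported as stale only means its holder had not renewed it by the reference time; it does not by itself say whether the cause is etcd, the API server, or the controller.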

@tmmorin

tmmorin commented Sep 30, 2024

here are two dumps of CI jobs that exhibited the issue:

The relevant info is in the management-cluster-dump directory: it has the usual per-namespace kubectl cluster-info dumps with pod logs, as well as resource dumps for many CRDs. It also has a "clusterctl-describe.txt", which is convenient for seeing what the CAPI state was.
