Make MWorkers reconnect to EventPublisher if it was killed #633

Conversation

vzhestkov
Contributor

What does this PR do?

It lets MWorkers reconnect to the EventPublisher through salt.transport.ipc.IPCMessageClient after the EventPublisher has been killed, by the OOM killer or for any other reason, so that salt-master can recover on its own.

The timeout and retry handling is a bonus; the main fix is in how StreamClosedError exceptions are handled.
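The general idea of reconnecting on a closed stream, with a bounded number of tries and a per-operation timeout, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from this PR: `FakeIPCClient`, `send_with_reconnect`, and the `tries`, `timeout` and `delay` parameters are hypothetical names, and the local `StreamClosedError` class is a stand-in for the Tornado exception so that the sketch runs with the standard library only.

```python
import asyncio


class StreamClosedError(Exception):
    """Local stand-in for the Tornado exception (assumption, stdlib-only sketch)."""


class FakeIPCClient:
    """Hypothetical client used for illustration; NOT salt's IPCMessageClient."""

    def __init__(self, failures_before_success=0):
        self.failures_left = failures_before_success
        self.connected = False
        self.connect_calls = 0
        self.sent = []

    async def connect(self):
        self.connect_calls += 1
        self.connected = True

    async def send(self, msg):
        if not self.connected:
            raise StreamClosedError("not connected")
        if self.failures_left > 0:
            # Simulate the publisher process having been killed.
            self.failures_left -= 1
            self.connected = False
            raise StreamClosedError("publisher went away")
        self.sent.append(msg)


async def send_with_reconnect(client, msg, tries=3, timeout=1.0, delay=0.0):
    """Send msg, reconnecting when the stream is closed; give up after `tries`."""
    last_exc = None
    for _ in range(tries):
        try:
            if not client.connected:
                await asyncio.wait_for(client.connect(), timeout)
            await asyncio.wait_for(client.send(msg), timeout)
            return True
        except (StreamClosedError, asyncio.TimeoutError) as exc:
            # Treat the stream as dead and force a fresh connect next time.
            last_exc = exc
            client.connected = False
            await asyncio.sleep(delay)
    raise last_exc


if __name__ == "__main__":
    client = FakeIPCClient(failures_before_success=2)
    asyncio.run(send_with_reconnect(client, "event"))
    print(client.sent, client.connect_calls)
```

The key design point in this sketch is that a closed stream is not treated as fatal: the helper drops the stale connection state and connects again, and only re-raises once the configured number of tries is exhausted.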

The reason why there is no related upstream change:

salt.transport.ipc is marked as deprecated for 3009, and in master either tcp or zmq can be used instead. They implement this part differently and may not be affected, but this should be checked more closely when switching versions.

What issues does this PR fix or reference?

Tracks: https://github.com/SUSE/spacewalk/issues/23526

Previous Behavior

When the EventPublisher subprocess is killed, salt-master keeps running but cannot handle incoming events properly, because the stream from the MWorker to the EventPublisher is closed and never restored. The salt-master is running but not functional.

New Behavior

The salt-master recovers on its own after the EventPublisher is killed and continues handling events.

Merge requirements satisfied?

[NOTICE] Bug fixes or features added to Salt require tests.

Commits signed with GPG?

Yes/No

Please review Salt's Contributing Guide for best practices.

See GitHub's page on GPG signing for more information about signing commits with GPG.

@vzhestkov vzhestkov force-pushed the openSUSE/improve/3006.0/reconnect-transport-ipc branch 2 times, most recently from 6c9f301 to 5329afe Compare March 1, 2024 10:01
@vzhestkov vzhestkov force-pushed the openSUSE/improve/3006.0/reconnect-transport-ipc branch from 2efe2dc to 900d579 Compare May 15, 2024 07:33
@vzhestkov vzhestkov merged commit 794b5d1 into openSUSE/release/3006.0 May 15, 2024
2 of 8 checks passed
@vzhestkov vzhestkov deleted the openSUSE/improve/3006.0/reconnect-transport-ipc branch May 15, 2024 07:42