Fix flaky test_reload_configuration_checks when all processes are not… #14218

Merged (1 commit) on Aug 28, 2024

tudupa
Contributor

@tudupa tudupa commented Aug 22, 2024

… up during swss stop job

Type of change

  • Bug fix
  • Testbed and Framework(new/improvement)
  • Test case(new/improvement)

Back port request

  • 202012
  • 202205
  • 202305
  • 202311
  • 202405

Approach

What is the motivation for this PR?

The test case test_reload_configuration_checks fails intermittently when swss is stopped after a config reload while some of the critical processes are still coming up. Those processes try to bring up swss as they start, which cancels the swss stop job waiting in the queue. The test then fails with the error "Job for swss.service cancelled".

How did you do it?

This PR changes the test case to wait until all the critical processes are up after a config reload, and only then issue the stop job for swss.
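
The wait-then-stop ordering described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from this PR: `wait_until`, `all_critical_processes_up`, `stop_swss_when_ready`, and the `get_status` / `stop_service` callables are hypothetical names invented for the example, and the default timeout and interval are arbitrary assumptions.

```python
import time
from typing import Callable, Sequence


def wait_until(timeout: float, interval: float, condition: Callable[[], bool]) -> bool:
    """Poll condition() until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def all_critical_processes_up(get_status: Callable[[str], bool],
                              services: Sequence[str]) -> bool:
    """Return True only if every listed service reports its processes as up."""
    return all(get_status(name) for name in services)


def stop_swss_when_ready(get_status: Callable[[str], bool],
                         services: Sequence[str],
                         stop_service: Callable[[str], None],
                         timeout: float = 300,   # assumed value, not from the PR
                         interval: float = 10) -> None:  # assumed value
    """Issue the swss stop only after all critical processes are up.

    Stopping earlier risks the stop job being cancelled by services that
    are still starting and pulling swss back up.
    """
    ready = wait_until(timeout, interval,
                       lambda: all_critical_processes_up(get_status, services))
    if not ready:
        raise TimeoutError("critical processes did not come up in time")
    stop_service("swss")
```

In a real test the two callables would be backed by whatever the test framework offers for querying process state and stopping a service on the device under test. Raising on timeout rather than stopping swss anyway keeps a slow bring-up visible as its own failure instead of reintroducing the cancelled-job error.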

How did you verify/test it?

Ran the test case 15-20 times to check whether the failure recurs.

Any platform specific information?

NA

Supported testbed topology if it's a new test case?

NA

@tudupa tudupa requested a review from prgeor as a code owner August 22, 2024 15:38
@prgeor
Contributor

prgeor commented Aug 25, 2024

@tudupa please fix the build errors

@tudupa
Contributor Author

tudupa commented Aug 26, 2024

/azpw run Azure.sonic-mgmt

@mssonicbld
Collaborator

/AzurePipelines run Azure.sonic-mgmt

Azure Pipelines successfully started running 1 pipeline(s).

@lizhijianrd
Contributor

PR test failed due to #14247.
The testcase will be skipped once #14248 is merged.

@lizhijianrd
Contributor

/azpw run Azure.sonic-mgmt

@lizhijianrd
Contributor

@tudupa Can you please help comment /azpw run Azure.sonic-mgmt to re-trigger the PR test?
The PR test issue has been resolved. But I don't have permission to re-trigger it so I need your help. Thanks!

@tudupa
Contributor Author

tudupa commented Aug 27, 2024

/azpw run Azure.sonic-mgmt

@mssonicbld
Collaborator

/AzurePipelines run Azure.sonic-mgmt

Azure Pipelines successfully started running 1 pipeline(s).

@tudupa tudupa force-pushed the platform_tests/test_reload_config branch from 4a9fc7e to 4aaa76c Compare August 28, 2024 00:28
@wangxin wangxin merged commit 95cbe46 into sonic-net:master Aug 28, 2024
16 checks passed
@lizhijianrd
Contributor

@bingwang-ms @yxieca Can you please help add approval tag for backport 202405/202311, thank you!

mssonicbld pushed a commit to mssonicbld/sonic-mgmt that referenced this pull request Aug 28, 2024
… up during swss stop job (sonic-net#14218)

@mssonicbld
Collaborator

Cherry-pick PR to 202405: #14298

mssonicbld pushed a commit to mssonicbld/sonic-mgmt that referenced this pull request Aug 28, 2024
… up during swss stop job (sonic-net#14218)

@mssonicbld
Collaborator

Cherry-pick PR to 202311: #14302

mssonicbld pushed a commit that referenced this pull request Aug 28, 2024
… up during swss stop job (#14218)

mssonicbld pushed a commit that referenced this pull request Aug 28, 2024
… up during swss stop job (#14218)

eddieruan-alibaba pushed a commit to eddieruan-alibaba/sonic-mgmt that referenced this pull request Sep 4, 2024
… up during swss stop job (sonic-net#14218)

hdwhdw pushed a commit to hdwhdw/sonic-mgmt that referenced this pull request Sep 20, 2024
… up during swss stop job (sonic-net#14218)
