[test_bgp_stress_link_flap] case hung sometimes due to memory exhaust #14163
Signed-off-by: xuliping <[email protected]>
Cherry-pick PR to 202405: #14378
…14369) Reverts sonic-net#14227 Fix PR (sonic-net#14163) merged
Description of PR
Summary:
Fixes # (issue)
28852952
fix for #14076
Type of change
Back port request
Approach
What is the motivation for this PR?
The case is flaky.
Sometimes the case runs for a long time without responding, especially on KVM devices.
Based on the current logs, this appears to be related to limited memory.
The case creates so many threads to flap the neighbors that it can exhaust memory on KVM devices, and likewise on physical devices with little memory.
The available logs show no obvious memory leak.
How did you do it?
The case is a link flap stress test that creates one thread per interface to flap.
1: Increase the delay time for KVM.
2: Test only one interface for KVM.
3: Use one thread to flap all the interfaces for fanout.
4: Correct the neighbor host.
6: Add a stop event and a timeout to the thread function to ensure the thread exits.
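The threading changes above can be illustrated with a minimal Python sketch. This is not the code from this PR: `flap_interfaces`, `run_flap`, the `toggle` callback, and the delay values are all hypothetical names and numbers chosen for illustration. It only shows the general pattern of one worker thread that walks all interfaces, sleeps on a `threading.Event` so it can be stopped early, and gives up after a timeout.

```python
import threading
import time


def flap_interfaces(toggle, interfaces, stop_event, delay, timeout):
    """Flap all interfaces from one thread until stopped or timed out.

    `toggle(intf, up)` is a hypothetical callback that sets the link state.
    Returns the number of completed rounds over the interface list.
    """
    deadline = time.monotonic() + timeout
    rounds = 0
    while not stop_event.is_set() and time.monotonic() < deadline:
        for intf in interfaces:
            if stop_event.is_set():
                break
            toggle(intf, False)      # bring the link down
            stop_event.wait(delay)   # sleeps, but returns early once stopped
            toggle(intf, True)       # always bring the link back up
            stop_event.wait(delay)
        else:
            rounds += 1              # only count rounds that were not interrupted
    return rounds


def run_flap(toggle, interfaces, is_kvm, duration,
             kvm_delay=0.05, default_delay=0.01):
    """Run the flap worker for `duration` seconds; return True if it exited.

    The delay values are assumptions for this sketch, not the PR's values.
    """
    delay = kvm_delay if is_kvm else default_delay
    targets = interfaces[:1] if is_kvm else interfaces  # one interface on KVM
    stop_event = threading.Event()
    worker = threading.Thread(
        target=flap_interfaces,
        args=(toggle, targets, stop_event, delay, duration * 2),
        daemon=True,
    )
    worker.start()
    time.sleep(duration)
    stop_event.set()                 # ask the worker to stop
    worker.join(timeout=5)           # bounded wait so the test cannot hang
    return not worker.is_alive()
```

Waiting on the event instead of calling `time.sleep` inside the worker means a stop request takes effect within one delay period, and the bounded `join` keeps the caller from blocking indefinitely if the worker misbehaves.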
How did you verify/test it?
Ran the case locally and verified it with elastictest:
https://elastictest.org/scheduler/testplan/66c69c4008761ba27f76ed5d
https://elastictest.org/scheduler/testplan/66c69c2708761ba27f76ed5b
https://elastictest.org/scheduler/testplan/66cdafd1bd14ce56b2e820f7
Any platform specific information?
Supported testbed topology if it's a new test case?
Documentation