Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat_: RPC providers detailed statuses #5923 #5924

Open
wants to merge 3 commits into
base: develop
Choose a base branch
from

Conversation

friofry
Copy link
Contributor

@friofry friofry commented Oct 7, 2024

PR Summary:
This PR introduces a new RPC provider health monitor, as discussed here. It's moved to a separate PR without changing existing logic (will be introduced as a separate PR)

please read before review
1st commit - healthmanager

  • aggregator: Combines the status of multiple RPC providers to a chain into one result (e.g., if provider1 is down and provider2 is up, the result is 'up').
  • provider_errors: Helps filter out non-critical errors (general or RPC errors).
  • provider_health_manager.go: Collects provider statuses and notifies subscribers when things change.
  • blockchain_health_manager.go: Collects chain statuses from different provider_health_managerss and sends updates.

2nd commit - integration

  • circuit_breaker - Returns a list of errors per provider (previously it was one big concatenated string).
  • rpc/chain/client.go - Passes RPC results from the circuit breaker to the provider_health_manager
  • rpc/client.go - Creates a blockchain_health_manager and sends a new event to the wallet feed on every update

Important changes:

  • Matches the existing logic on the desktop version.
  • As noted in code smells, contexts should not be stored in structs. Instead, store cancel callbacks. (Reference: Code Smell #2)

Usage:

  1. Create top-level BlockchainHealthManager with NewBlockchainHealthManager()
  2. Create a ProviderHealthManager for each chain via NewProvidersHealthManager(chainID)
  3. Register a ProvidersHealthManager with BlockchainHealthManager::RegisterProvidersHealthManager
  4. On each provider call ProvidersHealthManager::Update()
  5. Subscribe to blockchain updates channel BlockchainHealthManager::Subscribe() and get status from GetFullStatus()
  6. Stop BlockchainHealthManager::Stop()

Closes #5923

@status-im-auto
Copy link
Member

status-im-auto commented Oct 7, 2024

Jenkins Builds

Click to see older builds (72)
Commit #️⃣ Finished (UTC) Duration Platform Result
✖️ ac6650f #1 2024-10-07 15:24:55 ~1 min tests 📄log
✔️ ac6650f #1 2024-10-07 15:26:41 ~3 min tests-rpc 📄log
✔️ ac6650f #1 2024-10-07 15:27:34 ~4 min linux 📦zip
✔️ ac6650f #1 2024-10-07 15:28:40 ~5 min ios 📦zip
✔️ ac6650f #1 2024-10-07 15:28:49 ~5 min android 📦aar
✔️ 5164e3d #2 2024-10-08 08:37:03 ~2 min linux 📦zip
✔️ 5164e3d #2 2024-10-08 08:37:49 ~3 min ios 📦zip
✔️ 5164e3d #2 2024-10-08 08:38:25 ~3 min tests-rpc 📄log
✖️ 5164e3d #2 2024-10-08 08:38:59 ~4 min tests 📄log
✔️ 5164e3d #2 2024-10-08 08:39:36 ~5 min android 📦aar
✔️ 53d33e7 #3 2024-10-08 09:49:43 ~2 min android 📦aar
✔️ 53d33e7 #3 2024-10-08 09:49:59 ~2 min linux 📦zip
✖️ 53d33e7 #3 2024-10-08 09:50:35 ~3 min tests 📄log
✔️ 53d33e7 #3 2024-10-08 09:50:36 ~3 min ios 📦zip
✔️ 53d33e7 #3 2024-10-08 09:50:51 ~3 min tests-rpc 📄log
✔️ cd5b966 #4 2024-10-08 09:52:24 ~1 min android 📦aar
✔️ cd5b966 #4 2024-10-08 09:53:05 ~2 min linux 📦zip
✔️ cd5b966 #4 2024-10-08 09:54:01 ~3 min ios 📦zip
✖️ cd5b966 #4 2024-10-08 09:54:18 ~3 min tests 📄log
✔️ cd5b966 #4 2024-10-08 09:54:38 ~3 min tests-rpc 📄log
✔️ 4341e17 #5 2024-10-08 10:07:26 ~1 min android 📦aar
✖️ 4341e17 #5 2024-10-08 10:07:45 ~2 min tests 📄log
✔️ 4341e17 #5 2024-10-08 10:07:56 ~2 min linux 📦zip
✔️ 4341e17 #5 2024-10-08 10:08:28 ~2 min ios 📦zip
✔️ 4341e17 #5 2024-10-08 10:09:12 ~3 min tests-rpc 📄log
✔️ bc7e638 #6 2024-10-08 10:11:34 ~1 min android 📦aar
✖️ bc7e638 #6 2024-10-08 10:11:38 ~1 min tests 📄log
✔️ bc7e638 #6 2024-10-08 10:12:14 ~2 min linux 📦zip
✔️ bc7e638 #6 2024-10-08 10:13:21 ~3 min tests-rpc 📄log
✔️ bc7e638 #6 2024-10-08 10:13:22 ~3 min ios 📦zip
✖️ 694148a #7 2024-10-08 10:14:40 ~2 min tests 📄log
✔️ 694148a #7 2024-10-08 10:14:51 ~2 min android 📦aar
✔️ 694148a #7 2024-10-08 10:15:11 ~2 min linux 📦zip
✔️ 694148a #7 2024-10-08 10:17:07 ~3 min ios 📦zip
✔️ 694148a #7 2024-10-08 10:17:10 ~3 min tests-rpc 📄log
✖️ 60ab2ea #8 2024-10-08 10:31:47 ~2 min tests 📄log
✔️ 60ab2ea #8 2024-10-08 10:32:21 ~2 min android 📦aar
✔️ 60ab2ea #8 2024-10-08 10:32:27 ~2 min linux 📦zip
✔️ 60ab2ea #8 2024-10-08 10:33:13 ~3 min tests-rpc 📄log
✔️ 60ab2ea #8 2024-10-08 10:33:15 ~3 min ios 📦zip
✖️ 3680586 #9 2024-10-08 11:58:03 ~2 min tests 📄log
✔️ 3680586 #9 2024-10-08 11:58:16 ~2 min android 📦aar
✔️ 3680586 #9 2024-10-08 11:58:28 ~2 min linux 📦zip
✔️ 3680586 #9 2024-10-08 11:59:02 ~3 min ios 📦zip
✔️ 3680586 #9 2024-10-08 11:59:22 ~3 min tests-rpc 📄log
✔️ c817bef #10 2024-10-08 15:06:08 ~2 min android 📦aar
✔️ c817bef #10 2024-10-08 15:06:21 ~2 min linux 📦zip
✔️ c817bef #10 2024-10-08 15:07:14 ~3 min tests-rpc 📄log
✔️ c817bef #10 2024-10-08 15:07:33 ~3 min ios 📦zip
✖️ c817bef #10 2024-10-08 15:35:27 ~31 min tests 📄log
✔️ 5b29745 #11 2024-10-08 15:19:22 ~2 min android 📦aar
✔️ 5b29745 #11 2024-10-08 15:19:39 ~2 min linux 📦zip
✔️ 5b29745 #11 2024-10-08 15:20:23 ~3 min ios 📦zip
✔️ 5b29745 #11 2024-10-08 15:20:39 ~3 min tests-rpc 📄log
92c5dbc #12 2024-10-08 15:33:47 ~45 sec ios 📄log
92c5dbc #12 2024-10-08 15:33:53 ~53 sec android 📄log
92c5dbc #12 2024-10-08 15:34:01 ~59 sec linux 📄log
✖️ 92c5dbc #12 2024-10-08 15:35:18 ~2 min tests-rpc 📄log
✖️ 92c5dbc #11 2024-10-08 15:37:49 ~2 min tests 📄log
9f5569a #14 2024-10-08 15:51:46 ~46 sec ios 📄log
9f5569a #14 2024-10-08 15:51:55 ~58 sec android 📄log
9f5569a #13 2024-10-08 15:52:54 ~2 min linux 📄log
✖️ 9f5569a #13 2024-10-08 15:52:58 ~2 min tests-rpc 📄log
✖️ 9f5569a #14 2024-10-08 15:53:47 ~37 sec tests-rpc 📄log
✖️ 9f5569a #12 2024-10-08 15:53:47 ~2 min tests 📄log
9f5569a #14 2024-10-08 15:53:59 ~50 sec linux 📄log
✖️ 9f5569a #13 2024-10-08 15:55:45 ~1 min tests 📄log
✔️ b0651df #15 2024-10-08 19:09:15 ~2 min android 📦aar
✔️ b0651df #15 2024-10-08 19:09:26 ~2 min linux 📦zip
✔️ b0651df #15 2024-10-08 19:09:54 ~3 min ios 📦zip
✔️ b0651df #15 2024-10-08 19:10:10 ~3 min tests-rpc 📄log
✖️ b0651df #14 2024-10-08 19:38:30 ~31 min tests 📄log
Commit #️⃣ Finished (UTC) Duration Platform Result
✔️ cd640b3 #16 2024-10-09 09:42:50 ~3 min ios 📦zip
✔️ cd640b3 #16 2024-10-09 09:45:12 ~6 min linux 📦zip
✔️ cd640b3 #16 2024-10-09 09:45:29 ~6 min tests-rpc 📄log
✔️ cd640b3 #16 2024-10-09 09:46:18 ~7 min android 📦aar
✖️ cd640b3 #15 2024-10-09 10:14:11 ~35 min tests 📄log
✖️ cd640b3 #16 2024-10-09 16:45:21 ~32 min tests 📄log
✔️ eb10b20 #17 2024-10-09 17:09:35 ~2 min android 📦aar
✔️ eb10b20 #17 2024-10-09 17:10:00 ~2 min linux 📦zip
✔️ eb10b20 #17 2024-10-09 17:10:39 ~3 min ios 📦zip
✔️ eb10b20 #17 2024-10-09 17:10:40 ~3 min tests-rpc 📄log

Copy link

codecov bot commented Oct 7, 2024

Codecov Report

Attention: Patch coverage is 66.53920% with 175 lines in your changes missing coverage. Please review.

Project coverage is 10.54%. Comparing base (55bad8f) to head (cd640b3).
Report is 13 commits behind head on develop.

✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
healthmanager/provider_errors/provider_errors.go 37.96% 35 Missing and 32 partials ⚠️
healthmanager/blockchain_health_manager.go 73.68% 29 Missing and 1 partial ⚠️
rpc/client.go 68.85% 13 Missing and 6 partials ⚠️
...althmanager/provider_errors/rpc_provider_errors.go 54.28% 13 Missing and 3 partials ⚠️
healthmanager/providers_health_manager.go 79.03% 13 Missing ⚠️
healthmanager/aggregator/aggregator.go 83.56% 10 Missing and 2 partials ⚠️
healthmanager/rpcstatus/provider_status.go 50.00% 11 Missing ⚠️
circuitbreaker/circuit_breaker.go 69.56% 6 Missing and 1 partial ⚠️

❗ There is a different number of reports uploaded between BASE (55bad8f) and HEAD (cd640b3). Click for more details.

HEAD has 1 upload less than BASE
Flag BASE (55bad8f) HEAD (cd640b3)
unit 1 0
Additional details and impacted files
@@             Coverage Diff              @@
##           develop    #5924       +/-   ##
============================================
- Coverage    47.02%   10.54%   -36.48%     
============================================
  Files          832      832               
  Lines       137725   137070      -655     
============================================
- Hits         64763    14459    -50304     
- Misses       65429   120751    +55322     
+ Partials      7533     1860     -5673     
Flag Coverage Δ
functional 10.54% <66.53%> (?)
unit ?

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
node/get_status_node.go 37.01% <100.00%> (-12.84%) ⬇️
rpc/chain/client.go 39.04% <100.00%> (+14.85%) ⬆️
circuitbreaker/circuit_breaker.go 67.74% <69.56%> (-27.07%) ⬇️
healthmanager/rpcstatus/provider_status.go 50.00% <50.00%> (ø)
healthmanager/aggregator/aggregator.go 83.56% <83.56%> (ø)
healthmanager/providers_health_manager.go 79.03% <79.03%> (ø)
...althmanager/provider_errors/rpc_provider_errors.go 54.28% <54.28%> (ø)
rpc/client.go 51.72% <68.85%> (-14.95%) ⬇️
healthmanager/blockchain_health_manager.go 73.68% <73.68%> (ø)
healthmanager/provider_errors/provider_errors.go 37.96% <37.96%> (ø)

... and 643 files with indirect coverage changes

@friofry friofry changed the title RPC providers detailed statuses #5923 feat: RPC providers detailed statuses #5923 Oct 8, 2024
@friofry friofry changed the title feat: RPC providers detailed statuses #5923 feat_: RPC providers detailed statuses #5923 Oct 8, 2024
Copy link
Contributor

@saledjenic saledjenic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Much cleaner approach than we had.

healthmanager/aggregator/aggregator.go Outdated Show resolved Hide resolved
healthmanager/providers_health_manager.go Outdated Show resolved Hide resolved

// Subscribe allows providers to receive notifications about changes.
func (p *ProvidersHealthManager) Subscribe() chan struct{} {
p.mu.Lock()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If p.mu is used to guard p. aggregator could we use another mutex to guard p. subscribers only?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it shouldn't be a problem to protect the whole struct state with the same mutext, unless there is frequent access to Update and Subscribe methods at the same time.

Ideally we should not use 2 mutexes in the same struct to avoid unintentional deadlocks. Otherwise we have to ensure consistent locking order. Or I can use a sync.Map for subscribers

// Subscribe allows clients to receive notifications about changes.
func (b *BlockchainHealthManager) Subscribe() chan struct{} {
ch := make(chan struct{}, 1)
b.mu.Lock()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same here, does it make sense to have separate mutex for subscribers?

healthmanager/blockchain_health_manager.go Outdated Show resolved Hide resolved
@friofry friofry force-pushed the ab/issue-5923-rpc-status branch 14 times, most recently from b0651df to cd640b3 Compare October 9, 2024 09:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Rpc provider status aggregator with detailed info
3 participants