Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Allow tx index to catch up with the block index in TestChainSetup dtor #5454

Merged
merged 2 commits into from
Jun 28, 2023

Conversation

UdjinM6
Copy link

@UdjinM6 UdjinM6 commented Jun 22, 2023

Issue being fixed or feature implemented

TL;DR: Should hopefully fix crashes like https://gitlab.com/dashpay/dash/-/jobs/4522256293

In dashd we flush all callbacks first and then destroy g_txindex. In tests we had to move g_txindex to TestChainSetup and its dtor is executed first, so the order is broken. It also explains why this crash happens so rare. In most cases tx index is up to date and you need some kind of a hiccup for scheduler to lag behind a bit. Basically, between g_txindex.reset() and FlushBackgroundCallbacks BaseIndex::BlockConnected finally arrives. But it’s processed on a (now) null instance hence a crash. If it’s earlier - it’s processed normally, if it’s later - it’s flushed without execution, so there is a tiny window to catch this crash.

What was done?

Give tx index a bit of time to process everything

How Has This Been Tested?

run tests (but this crash is rare 🤷‍♂️ )

Breaking Changes

n/a

Checklist:

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have added or updated relevant unit/integration/functional/e2e tests
  • I have made corresponding changes to the documentation
  • I have assigned this pull request to a milestone (for repository code-owners and collaborators only)

…p dtor

otherwise we might be destroying it while scheduler still has some work for it e.g. via BlockConnected signal
@UdjinM6 UdjinM6 added this to the 20 milestone Jun 22, 2023
@UdjinM6 UdjinM6 requested a review from kwvg June 22, 2023 15:05
Copy link
Collaborator

@kwvg kwvg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utACK

// Allow tx index to catch up with the block index cause otherwise
// we might be destroying it while scheduler still has some work for it
// e.g. via BlockConnected signal
constexpr int64_t timeout_ms = 10 * 1000;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: why don't move timeout_ms inside while? so far as it constexpr variable definition, the overhead is zero, but it would be defined closer to usage

Copy link
Collaborator

@knst knst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utACK

Copy link
Collaborator

@ogabrielides ogabrielides left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utACK

Copy link
Member

@PastaPastaPasta PastaPastaPasta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utACK for squash merge

@UdjinM6 UdjinM6 merged commit a65b1fb into dashpay:develop Jun 28, 2023
8 checks passed
@UdjinM6 UdjinM6 deleted the fix_test_crash branch June 28, 2023 16:00
UdjinM6 added a commit to UdjinM6/dash that referenced this pull request Jul 25, 2023
…p dtor (dashpay#5454)

## Issue being fixed or feature implemented
TL;DR: Should hopefully fix crashes like
https://gitlab.com/dashpay/dash/-/jobs/4522256293

In dashd we flush all callbacks first and then destroy `g_txindex`. In
tests we had to move `g_txindex` to `TestChainSetup` and its dtor is
executed first, so the order is broken. It also explains why this crash
happens so rare. In most cases tx index is up to date and you need some
kind of a hiccup for scheduler to lag behind a bit. Basically, between
`g_txindex.reset()` and `FlushBackgroundCallbacks`
`BaseIndex::BlockConnected` finally arrives. But it’s processed on a
(now) null instance hence a crash. If it’s earlier - it’s processed
normally, if it’s later - it’s flushed without execution, so there is a
tiny window to catch this crash.

## What was done?
Give tx index a bit of time to process everything

## How Has This Been Tested?
run tests (but this crash is rare 🤷‍♂️ )

## Breaking Changes
n/a

## Checklist:
- [x] I have performed a self-review of my own code
- [x] I have commented my code, particularly in hard-to-understand areas
- [ ] I have added or updated relevant unit/integration/functional/e2e
tests
- [ ] I have made corresponding changes to the documentation
- [x] I have assigned this pull request to a milestone _(for repository
code-owners and collaborators only)_
@UdjinM6 UdjinM6 modified the milestones: 20, 19.3 Jul 25, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants