Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add stream functional test and fix bugs #532

Open
wants to merge 1 commit into
base: streaming
Choose a base branch
from

Conversation

ZZhongge
Copy link

  1. add four test for stream function
  2. fix throttling, lock, runtime change, and enable stream

1. add four test for stream function
2. fix throttling, lock, runtime change, and enable stream
int32 max_gap_in_stream = ctx_->get_params()->max_log_gap_in_stream_;
ulong acceptable_precommit_idx = resp.get_next_idx() +
max_gap_in_stream;
ulong last_streamed_log_idx = p->get_last_streamed_log_idx();
p_tr("max gap: %d, acceptable_precommit_idx: %ld, last_streamed_log_idx: %ld, "
"last_sent: %ld, next_idx: %ld", max_gap_in_stream,
acceptable_precommit_idx, last_streamed_log_idx, p->get_last_sent_idx(),
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to check whether it should enable stream first. Because calling commit will change p->get_last_sent_idx()

CHK_Z( append_log_in_stream_without_delivery(s1, s2_addr, 1, 1) );

// Activate stream mode
// It need to re-install the snapshot because of config change, 18 reqs
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After S2 re-enter into the cluster, it will trigger a duplicate snapshot install, is that expected?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because in S2, at first it received snapshot at 10, and had log from 11 to 14.

2024-09-18T17:07:47.820_650-07:00 [159298] [INFO] snapshot idx 10 term 1 is successfully applied, log start 11 last idx 14  [handle_snapshot_sync.cxx:593, handle_snapshot_sync_req()]

after that, log 15 was created for configuration in S1

2024-09-18T17:07:47.820_938-07:00 [159303] [INFO] new configuration: log idx 15, prev log idx 14peer 1, DC ID 1, S1, voting member, 50
peer 2, DC ID 1, S2, voting member, 50 
my id: 1, leader: 1, term: 1    [handle_commit.cxx:929, reconfigure()]

since snapshot distance is 5, and number of reserved log is 0, it immediately did log compaction, up to 15

2024-09-18T17:07:47.820_962-07:00 [159303] [INFO] snapshot idx 15 log_term 1 created, compact the log store if needed   [handle_commit.cxx:662, on_snapshot_completed()]
2024-09-18T17:07:47.820_964-07:00 [159303] [INFO] log_store_ compact upto 15    [handle_commit.cxx:673, on_snapshot_completed()]

which means, there is no way for S2 to catch-up, hence snapshot was sent again.

2024-09-18T17:07:47.821_256-07:00 [159298] [INFO] trying to sync snapshot with last index 15 to peer 2, its last log idx 14 [handle_snapshot_sync.cxx:144, create_sync_snapshot_req()]

So if you want to avoid sending snapshot twice, set reserved_log_items_ greater than 0 at the beginning of the test. Maybe 5 is appropriate.

@@ -284,10 +284,17 @@ bool raft_server::request_append_entries(ptr<peer> p) {
}
} else {
ulong last_streamed_log_idx = p->get_last_streamed_log_idx();
int32 max_gap_in_stream = ctx_->get_params()->max_log_gap_in_stream_;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of ctx_->get_params(), use params which is gathered above, to avoid calling get_params() twice.

@@ -284,10 +284,17 @@ bool raft_server::request_append_entries(ptr<peer> p) {
}
} else {
ulong last_streamed_log_idx = p->get_last_streamed_log_idx();
int32 max_gap_in_stream = ctx_->get_params()->max_log_gap_in_stream_;
if (last_streamed_log_idx > 0 && max_gap_in_stream == 0) {
p_in("disable stream mode at runtime");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Put more useful info: peer id, last_streamed_log_idx.

Comment on lines +307 to +308
if (max_gap_in_stream + p->get_next_log_idx()
<= (last_streamed_log_idx + 1)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if (max_gap_in_stream + p->get_next_log_idx() <= 
        (last_streamed_log_idx + 1)) {

Comment on lines +1077 to +1078
p_tr("max gap: %d, acceptable_precommit_idx: %ld, last_streamed_log_idx: %ld, "
"last_sent: %ld, next_idx: %ld", max_gap_in_stream,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For ulong (== uint64_t), we should use PRIu64 to be multi platform compatible. Also need peer ID in this log.

if (max_gap_in_stream > 0 &&
last_streamed_log_idx == 0 &&
resp.get_next_idx() > 0 &&
p->get_last_sent_idx() < resp.get_next_idx() &&
precommit_index_ < acceptable_precommit_idx) {
p_in("start stream mode at idx: %ld", resp.get_next_idx() - 1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto: PRIu64

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And also put peer ID here too.

@@ -0,0 +1,573 @@
/************************************************************************
Copyright 2017-2019 eBay Inc.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be 2017-present

@@ -0,0 +1,573 @@
/************************************************************************
Copyright 2017-2019 eBay Inc.
Author/Developer(s): Jung-Sang Ahn
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can either remove this line, or put your name.

Comment on lines +33 to +36
bool enable_ssl,
bool use_global_asio = false,
bool use_bg_snapshot_io = true,
const raft_server::init_options& opt = raft_server::init_options()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indent.


// Snapshot transmission in stream mode
ts.doTest( "snapshot transmission in stream mode",
snapshot_transmission_in_stream_mode );
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indent.

CHK_Z( append_log_in_stream_without_delivery(s1, s2_addr, 1, 1) );

// Activate stream mode
// It need to re-install the snapshot because of config change, 18 reqs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because in S2, at first it received snapshot at 10, and had log from 11 to 14.

2024-09-18T17:07:47.820_650-07:00 [159298] [INFO] snapshot idx 10 term 1 is successfully applied, log start 11 last idx 14  [handle_snapshot_sync.cxx:593, handle_snapshot_sync_req()]

after that, log 15 was created for configuration in S1

2024-09-18T17:07:47.820_938-07:00 [159303] [INFO] new configuration: log idx 15, prev log idx 14peer 1, DC ID 1, S1, voting member, 50
peer 2, DC ID 1, S2, voting member, 50 
my id: 1, leader: 1, term: 1    [handle_commit.cxx:929, reconfigure()]

since snapshot distance is 5, and number of reserved log is 0, it immediately did log compaction, up to 15

2024-09-18T17:07:47.820_962-07:00 [159303] [INFO] snapshot idx 15 log_term 1 created, compact the log store if needed   [handle_commit.cxx:662, on_snapshot_completed()]
2024-09-18T17:07:47.820_964-07:00 [159303] [INFO] log_store_ compact upto 15    [handle_commit.cxx:673, on_snapshot_completed()]

which means, there is no way for S2 to catch-up, hence snapshot was sent again.

2024-09-18T17:07:47.821_256-07:00 [159298] [INFO] trying to sync snapshot with last index 15 to peer 2, its last log idx 14 [handle_snapshot_sync.cxx:144, create_sync_snapshot_req()]

So if you want to avoid sending snapshot twice, set reserved_log_items_ greater than 0 at the beginning of the test. Maybe 5 is appropriate.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants