Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CX_HARDENING] Evaluate and Remove Issue Links in HotShot #3701

Merged
merged 8 commits into from
Sep 25, 2024
Merged
Show file tree
Hide file tree
Changes from 7 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 0 additions & 2 deletions crates/hotshot/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -436,8 +436,6 @@ impl<TYPES: NodeType, I: NodeImplementation<TYPES>, V: Versions> SystemContext<T
}

/// Emit an external event
// A copypasta of `ConsensusApi::send_event`
// TODO: remove with https://github.com/EspressoSystems/HotShot/issues/2407
async fn send_external_event(&self, event: Event<TYPES>) {
debug!(?event, "send_external_event");
broadcast_event(event, &self.external_event_stream.0).await;
Expand Down
1 change: 0 additions & 1 deletion crates/hotshot/src/traits/networking/libp2p_network.rs
Original file line number Diff line number Diff line change
Expand Up @@ -552,7 +552,6 @@ impl<K: SignatureKey + 'static> Libp2pNetwork<K> {
bootstrap_addrs,
is_ready: Arc::new(AtomicBool::new(false)),
// This is optimal for 10-30 nodes. TODO: parameterize this for both tests and examples
// https://github.com/EspressoSystems/HotShot/issues/2088
dht_timeout: config.dht_timeout.unwrap_or(Duration::from_secs(120)),
is_bootstrapped: Arc::new(AtomicBool::new(false)),
metrics,
Expand Down
2 changes: 1 addition & 1 deletion crates/libp2p-networking/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -45,7 +45,7 @@ spawns off five integration tests.
- One that intersperses both broadcast and increments.
- Two that publishes entries to the DHT and checks that other nodes can access these entries.

This can fail on MacOS (and linux) due to ["too many open files."](https://github.com/EspressoSystems/hotshot-networking-demo/issues/18) The fix is:
This can fail on MacOS (and linux) due to "too many open files." The fix is:

```bash
ulimit -n 4096
Expand Down
4 changes: 0 additions & 4 deletions crates/libp2p-networking/src/network/node.rs
Original file line number Diff line number Diff line change
Expand Up @@ -222,10 +222,6 @@ impl<K: SignatureKey + 'static> NetworkNode<K> {

// - Build a gossipsub network behavior
let gossipsub: Gossipsub = Gossipsub::new(
// TODO do we even need this?
// <https://github.com/EspressoSystems/hotshot/issues/42>
// if messages are signed at the the consensus level AND the network
// level (noise), this feels redundant.
MessageAuthenticity::Signed(keypair.clone()),
gossipsub_config,
)
Expand Down
90 changes: 27 additions & 63 deletions crates/task-impls/src/consensus/handlers.rs
Original file line number Diff line number Diff line change
Expand Up @@ -218,14 +218,16 @@ pub async fn publish_proposal_from_commitment_and_metadata<TYPES: NodeType, V: V
.filter(|cert| cert.is_valid_for_view(&view))
.cloned();

// FIXME - This is not great, and will be fixed later.
// If it's > July, 2024 and this is still here, something has gone horribly wrong.
let cnm = commitment_and_metadata
.clone()
.context("Cannot propose because we don't have the VID payload commitment and metadata")?;
ensure!(
commitment_and_metadata.is_some(),
"Cannot propose because we don't have the VID payload commitment and metadata"
);

// This is a safe unwrap due to the prior ensure call.
let commitment_and_metadata = commitment_and_metadata.unwrap();

ensure!(
cnm.block_view == view,
commitment_and_metadata.block_view == view,
"Cannot propose because our VID payload commitment and metadata is for an older view."
);

Expand All @@ -236,7 +238,7 @@ pub async fn publish_proposal_from_commitment_and_metadata<TYPES: NodeType, V: V
OuterConsensus::new(Arc::clone(&consensus.inner_consensus)),
sender,
view,
cnm,
commitment_and_metadata,
parent_leaf.clone(),
state,
proposal_upgrade_certificate,
Expand All @@ -258,45 +260,6 @@ pub async fn publish_proposal_from_commitment_and_metadata<TYPES: NodeType, V: V
Ok(create_and_send_proposal_handle)
}

/// Publishes a proposal if there exists a value which we can propose from. Specifically, we must have either
/// `commitment_and_metadata`, or a `decided_upgrade_certificate`.
#[allow(clippy::too_many_arguments)]
#[instrument(skip_all)]
pub async fn publish_proposal_if_able<TYPES: NodeType, V: Versions>(
view: TYPES::Time,
sender: Sender<Arc<HotShotEvent<TYPES>>>,
receiver: Receiver<Arc<HotShotEvent<TYPES>>>,
quorum_membership: Arc<TYPES::Membership>,
public_key: TYPES::SignatureKey,
private_key: <TYPES::SignatureKey as SignatureKey>::PrivateKey,
consensus: OuterConsensus<TYPES>,
delay: u64,
formed_upgrade_certificate: Option<UpgradeCertificate<TYPES>>,
upgrade_lock: UpgradeLock<TYPES, V>,
commitment_and_metadata: Option<CommitmentAndMetadata<TYPES>>,
proposal_cert: Option<ViewChangeEvidence<TYPES>>,
instance_state: Arc<TYPES::InstanceState>,
id: u64,
) -> Result<JoinHandle<()>> {
publish_proposal_from_commitment_and_metadata(
view,
sender,
receiver,
quorum_membership,
public_key,
private_key,
consensus,
delay,
formed_upgrade_certificate,
upgrade_lock,
commitment_and_metadata,
proposal_cert,
instance_state,
id,
)
.await
}

/// Handle the received quorum proposal.
///
/// Returns the proposal that should be used to set the `cur_proposal` for other tasks.
Expand Down Expand Up @@ -490,23 +453,24 @@ pub(crate) async fn handle_quorum_proposal_recv<
"Attempting to publish proposal after voting for liveness; now in view: {}",
*new_view
);
let create_and_send_proposal_handle = publish_proposal_if_able(
qc.view_number + 1,
event_sender,
event_receiver,
Arc::clone(&task_state.quorum_membership),
task_state.public_key.clone(),
task_state.private_key.clone(),
OuterConsensus::new(Arc::clone(&task_state.consensus.inner_consensus)),
task_state.round_start_delay,
task_state.formed_upgrade_certificate.clone(),
task_state.upgrade_lock.clone(),
task_state.payload_commitment_and_metadata.clone(),
task_state.proposal_cert.clone(),
Arc::clone(&task_state.instance_state),
task_state.id,
)
.await?;
let create_and_send_proposal_handle =
publish_proposal_from_commitment_and_metadata(
qc.view_number + 1,
event_sender,
event_receiver,
Arc::clone(&task_state.quorum_membership),
task_state.public_key.clone(),
task_state.private_key.clone(),
OuterConsensus::new(Arc::clone(&task_state.consensus.inner_consensus)),
task_state.round_start_delay,
task_state.formed_upgrade_certificate.clone(),
task_state.upgrade_lock.clone(),
task_state.payload_commitment_and_metadata.clone(),
task_state.proposal_cert.clone(),
Arc::clone(&task_state.instance_state),
task_state.id,
)
.await?;

task_state
.spawned_tasks
Expand Down
5 changes: 3 additions & 2 deletions crates/task-impls/src/consensus/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,7 @@ use async_lock::RwLock;
use async_std::task::JoinHandle;
use async_trait::async_trait;
use futures::future::join_all;
use handlers::publish_proposal_from_commitment_and_metadata;
use hotshot_task::task::TaskState;
use hotshot_types::{
consensus::{CommitmentAndMetadata, OuterConsensus},
Expand All @@ -38,7 +39,7 @@ use tracing::{debug, error, info, instrument, warn};

use crate::{
consensus::handlers::{
handle_quorum_proposal_recv, handle_quorum_proposal_validated, publish_proposal_if_able,
handle_quorum_proposal_recv, handle_quorum_proposal_validated,
update_state_and_vote_if_able, VoteInfo,
},
events::HotShotEvent,
Expand Down Expand Up @@ -191,7 +192,7 @@ impl<TYPES: NodeType, I: NodeImplementation<TYPES>, V: Versions> ConsensusTaskSt
event_sender: Sender<Arc<HotShotEvent<TYPES>>>,
event_receiver: Receiver<Arc<HotShotEvent<TYPES>>>,
) -> Result<()> {
let create_and_send_proposal_handle = publish_proposal_if_able(
let create_and_send_proposal_handle = publish_proposal_from_commitment_and_metadata(
view,
event_sender,
event_receiver,
Expand Down
6 changes: 2 additions & 4 deletions crates/task-impls/src/helpers.rs
Original file line number Diff line number Diff line change
Expand Up @@ -39,7 +39,7 @@ use hotshot_types::{
};
#[cfg(async_executor_impl = "tokio")]
use tokio::task::JoinHandle;
use tracing::{debug, error, info, instrument, warn};
use tracing::{debug, info, instrument, warn};

use crate::{events::HotShotEvent, request::REQUEST_TIMEOUT};

Expand Down Expand Up @@ -699,9 +699,7 @@ pub(crate) async fn update_view<TYPES: NodeType>(
debug!("Updating view from {} to {}", *old_view, *new_view);

if *old_view / 100 != *new_view / 100 {
// TODO (https://github.com/EspressoSystems/HotShot/issues/2296):
// switch to info! when INFO logs become less cluttered
error!("Progress: entered view {:>6}", *new_view);
info!("Progress: entered view {:>6}", *new_view);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Willing to swap this one back if we think it's relevant. My logic is that we pretty much can search for it anyway in all of our log ingestion sources, so it feels silly to place it on the error log.

}

*cur_view = new_view;
Expand Down
7 changes: 2 additions & 5 deletions crates/task-impls/src/network.rs
Original file line number Diff line number Diff line change
Expand Up @@ -170,9 +170,6 @@ impl<TYPES: NodeType> NetworkMessageTaskState<TYPES> {
}
},
};
// TODO (Keyao benchmarking) Update these event variants (similar to the
// `TransactionsRecv` event) so we can send one event for a vector of messages.
// <https://github.com/EspressoSystems/HotShot/issues/1428>
broadcast_event(Arc::new(event), &self.internal_event_stream).await;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one appears to be a very old ticket pertaining to events that no longer exist, so I'm not sure what the context is, but my thought is that if it were really a huge issue, we'd sort it out. To this point, I have not encountered this.

}

Expand Down Expand Up @@ -298,7 +295,7 @@ impl<
sender: sender.clone(),
kind: MessageKind::<TYPES>::from_consensus_message(SequencingMessage::Da(
DaConsensusMessage::VidDisperseMsg(proposal),
)), // TODO not a DaConsensusMessage https://github.com/EspressoSystems/HotShot/issues/1696
)),
};
let serialized_message = match self.upgrade_lock.serialize(&message).await {
Ok(serialized) => serialized,
Expand Down Expand Up @@ -363,7 +360,7 @@ impl<
/// which will be used to create a message and transmit on the wire.
/// Returns `None` if the parsing result should not be sent on the wire.
/// Handles the `VidDisperseSend` event separately using a helper method.
#[allow(clippy::too_many_lines)] // TODO https://github.com/EspressoSystems/HotShot/issues/1704
#[allow(clippy::too_many_lines)]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This issue is just marking that this is a too_many_lines function, but we have several of these in the codebase, and the refactor is painful. Having a bespoke issue for each instance seems like overkill. These are easily "not correct", so they can be fixed whenever it's necessary. Therefore, having an issue seems to be making no material difference in team awareness of the issue. Furthermore, since it's one of our trademark enormous match statements it's actually quite easy to follow despite it's size.

async fn parse_event(
&mut self,
event: Arc<HotShotEvent<TYPES>>,
Expand Down
3 changes: 0 additions & 3 deletions crates/task-impls/src/view_sync.rs
Original file line number Diff line number Diff line change
Expand Up @@ -300,7 +300,6 @@ impl<TYPES: NodeType, I: NodeImplementation<TYPES>, V: Versions> ViewSyncTaskSta

// We do not have a relay task already running, so start one
if self.membership.leader(vote_view + relay) != self.public_key {
// TODO ED This will occur because everyone is pulling down votes for now. Will be fixed in `https://github.com/EspressoSystems/HotShot/issues/1471`
debug!("View sync vote sent to wrong leader");
return;
}
Expand Down Expand Up @@ -339,7 +338,6 @@ impl<TYPES: NodeType, I: NodeImplementation<TYPES>, V: Versions> ViewSyncTaskSta

// We do not have a relay task already running, so start one
if self.membership.leader(vote_view + relay) != self.public_key {
// TODO ED This will occur because everyone is pulling down votes for now. Will be fixed in `https://github.com/EspressoSystems/HotShot/issues/1471`
debug!("View sync vote sent to wrong leader");
return;
}
Expand Down Expand Up @@ -378,7 +376,6 @@ impl<TYPES: NodeType, I: NodeImplementation<TYPES>, V: Versions> ViewSyncTaskSta

// We do not have a relay task already running, so start one
if self.membership.leader(vote_view + relay) != self.public_key {
// TODO ED This will occur because everyone is pulling down votes for now. Will be fixed in `https://github.com/EspressoSystems/HotShot/issues/1471`
debug!("View sync vote sent to wrong leader");
return;
}
Expand Down
8 changes: 2 additions & 6 deletions crates/testing/src/helpers.rs
Original file line number Diff line number Diff line change
Expand Up @@ -38,7 +38,7 @@ use hotshot_types::{
node_implementation::{NodeType, Versions},
},
utils::{View, ViewInner},
vid::{vid_scheme, VidCommitment, VidSchemeType},
vid::{vid_scheme, VidCommitment, VidProposal, VidSchemeType},
vote::{Certificate, HasViewNumber, Vote},
};
use jf_vid::VidScheme;
Expand Down Expand Up @@ -291,16 +291,12 @@ pub fn build_payload_commitment<TYPES: NodeType>(
}

/// TODO: <https://github.com/EspressoSystems/HotShot/issues/2821>
#[allow(clippy::type_complexity)]
pub fn build_vid_proposal<TYPES: NodeType>(
quorum_membership: &<TYPES as NodeType>::Membership,
view_number: TYPES::Time,
transactions: Vec<TestTransaction>,
private_key: &<TYPES::SignatureKey as SignatureKey>::PrivateKey,
) -> (
Proposal<TYPES, VidDisperse<TYPES>>,
Vec<Proposal<TYPES, VidDisperseShare<TYPES>>>,
) {
) -> VidProposal<TYPES> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This name could be different. Just based on the function it seemed like an obvious name.

let mut vid = vid_scheme_from_view_number::<TYPES>(quorum_membership, view_number);
let encoded_transactions = TestTransaction::encode(&transactions);

Expand Down
4 changes: 0 additions & 4 deletions crates/testing/src/test_builder.rs
Original file line number Diff line number Diff line change
Expand Up @@ -335,8 +335,6 @@ impl<TYPES: NodeType, I: NodeImplementation<TYPES>, V: Versions> TestDescription
pub fn default_multiple_rounds() -> Self {
let num_nodes_with_stake = 10;
TestDescription::<TYPES, I, V> {
// TODO: remove once we have fixed the DHT timeout issue
// https://github.com/EspressoSystems/HotShot/issues/2088
num_bootstrap_nodes: num_nodes_with_stake,
num_nodes_with_stake,
start_nodes: num_nodes_with_stake,
Expand Down Expand Up @@ -371,8 +369,6 @@ impl<TYPES: NodeType, I: NodeImplementation<TYPES>, V: Versions> TestDescription
// The first 14 (i.e., 20 - f) nodes are in the DA committee and we may shutdown the
// remaining 6 (i.e., f) nodes. We could remove this restriction after fixing the
// following issue.
// TODO: Update message broadcasting to avoid hanging
// <https://github.com/EspressoSystems/HotShot/issues/1567>
da_staked_committee_size: 14,
completion_task_description: CompletionTaskDescription::TimeBasedCompletionTaskBuilder(
TimeBasedCompletionTaskDescription {
Expand Down
2 changes: 0 additions & 2 deletions crates/testing/tests/tests_1/test_with_failures_2.rs
Original file line number Diff line number Diff line change
Expand Up @@ -46,8 +46,6 @@ cross_tests!(
// The first 14 (i.e., 20 - f) nodes are in the DA committee and we may shutdown the
// remaining 6 (i.e., f) nodes. We could remove this restriction after fixing the
// following issue.
// TODO: Update message broadcasting to avoid hanging
// <https://github.com/EspressoSystems/HotShot/issues/1567>
let dead_nodes = vec![
ChangeNode {
idx: 10,
Expand Down
2 changes: 0 additions & 2 deletions crates/testing/tests/tests_2/test_with_failures_one.rs
Original file line number Diff line number Diff line change
Expand Up @@ -28,8 +28,6 @@ cross_tests!(
// The first 14 (i.e., 20 - f) nodes are in the DA committee and we may shutdown the
// remaining 6 (i.e., f) nodes. We could remove this restriction after fixing the
// following issue.
// TODO: Update message broadcasting to avoid hanging
// <https://github.com/EspressoSystems/HotShot/issues/1567>
let dead_nodes = vec![ChangeNode {
idx: 19,
updown: NodeAction::Down,
Expand Down
2 changes: 0 additions & 2 deletions crates/testing/tests/tests_3/test_with_failures_half_f.rs
Original file line number Diff line number Diff line change
Expand Up @@ -27,8 +27,6 @@ cross_tests!(
// The first 14 (i.e., 20 - f) nodes are in the DA committee and we may shutdown the
// remaining 6 (i.e., f) nodes. We could remove this restriction after fixing the
// following issue.
// TODO: Update message broadcasting to avoid hanging
// <https://github.com/EspressoSystems/HotShot/issues/1567>
let dead_nodes = vec![
ChangeNode {
idx: 17,
Expand Down
2 changes: 0 additions & 2 deletions crates/testing/tests/tests_4/test_with_failures_f.rs
Original file line number Diff line number Diff line change
Expand Up @@ -30,8 +30,6 @@ cross_tests!(
// The first 14 (i.e., 20 - f) nodes are in the DA committee and we may shutdown the
// remaining 6 (i.e., f) nodes. We could remove this restriction after fixing the
// following issue.
// TODO: Update message broadcasting to avoid hanging
// <https://github.com/EspressoSystems/HotShot/issues/1567>
let dead_nodes = vec![
ChangeNode {
idx: 14,
Expand Down
6 changes: 0 additions & 6 deletions crates/testing/tests/tests_5/timeout.rs
Original file line number Diff line number Diff line change
Expand Up @@ -7,8 +7,6 @@
#[cfg(test)]
#[cfg_attr(async_executor_impl = "tokio", tokio::test(flavor = "multi_thread"))]
#[cfg_attr(async_executor_impl = "async-std", async_std::test)]
// TODO Add memory network tests after this issue is finished:
// https://github.com/EspressoSystems/HotShot/issues/1790
async fn test_timeout() {
use std::time::Duration;

Expand Down Expand Up @@ -56,8 +54,6 @@ async fn test_timeout() {
},
);

// TODO ED Test with memory network once issue is resolved
// https://github.com/EspressoSystems/HotShot/issues/1790
metadata
.gen_launcher(0)
.launch()
Expand Down Expand Up @@ -120,8 +116,6 @@ async fn test_timeout_libp2p() {
},
);

// TODO ED Test with memory network once issue is resolved
// https://github.com/EspressoSystems/HotShot/issues/1790
metadata
.gen_launcher(0)
.launch()
Expand Down
Loading
Loading