Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add debugging info for StorageWeightReclaim #5594

Merged
merged 5 commits into from
Sep 7, 2024

Conversation

Moliholy
Copy link
Contributor

@Moliholy Moliholy commented Sep 4, 2024

When inspecting the logs we often encounter the following message:

Benchmarked storage weight smaller than consumed storage weight. benchmarked: {benchmarked_weight} consumed: {consumed_weight} unspent: {unspent}

However, it is very hard to guess which call is causing the issue.

With the changes proposed in this PR, information about the call is provided so that we can easily identify the source of the problem without further delay, and this way work more efficiently in solving the issue.

@Moliholy Moliholy changed the title Add debugging info for StorageWeightReclaim Add debugging info for StorageWeightReclaim Sep 4, 2024
_info: &sp_runtime::traits::DispatchInfoOf<Self::Call>,
_len: usize,
) -> Result<Self::Pre, sp_runtime::transaction_validity::TransactionValidityError> {
Ok(get_proof_size())
Ok((get_proof_size(), call.clone()))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cloning the entire call is a little bit too expensive.

Copy link
Contributor Author

@Moliholy Moliholy Sep 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about adding the GetCallMetadata to T::RuntimeCall and debugging the corresponding CallMetadata? We'd miss the call arguments, but at least it already points in the right direction. In that case we could also add the extrinsic number, as you suggested.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can just see the transaction with all its arguments in the block explorer. I don't see any need to log them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, I just pushed the changes as per your suggestion.

@@ -183,13 +183,13 @@ where
if consumed_weight > benchmarked_weight {
log::error!(
target: LOG_TARGET,
"Benchmarked storage weight smaller than consumed storage weight. benchmarked: {benchmarked_weight} consumed: {consumed_weight} unspent: {unspent}"
"Benchmarked storage weight smaller than consumed storage weight. call: {call:?} benchmarked: {benchmarked_weight} consumed: {consumed_weight} unspent: {unspent}"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of the call we could print the frame_system::Pallet::extrinsic_count(). Then you have the index of the transaction.

Copy link
Contributor Author

@Moliholy Moliholy Sep 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find that less straightforward. You might check the logs at a later point in time, when the state could, for instance, have been pruned. In that scenario you wouldn't be able to retrieve the block (after taking the time to inspect the block in PolkadotJS).

Also, it'd take more time to figure out which extrinsic is. My purpose is to increase as much as possible developer productivity, and the most go-to way I found is this one.

That being said, I understand passing the call and printing its debug information has a performance penalty. Still, I think it'd be better off with the proposed changes, as a small performance penalty could potentially save hours of debugging and improve overall productivity.

Still... if you think otherwise, I'll change it as per your suggestion. But I'd prefer to softly push for the proposed solution, as in my opinion it represents the best trade-off.

EDIT: besides, looking at the implementation of extrinsic_count():

/// Gets extrinsics count.
pub fn extrinsic_count() -> u32 {
    ExtrinsicCount::<T>::get().unwrap_or_default()
}

It does access an storage item. Wouldn't that introduce a higher performance penalty than the proposed solution?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might check the logs at a later point in time, when the state could, for instance, have been pruned. In that scenario you wouldn't be able to retrieve the block (after taking the time to inspect the block in PolkadotJS).

State pruning is different to blocks pruning. Block pruning needs to be enabled directly.

We don't have that much memory and wasting the memory just for some debugging which is only required in 0.01% of the cases is not a good idea.

It does access an storage item. Wouldn't that introduce a higher performance penalty than the proposed solution?

The storage item will only be accessed when the log level is enabled . For the release build on chain logging should be disabled any way. Thus, this should not lead to any performance impacts on production networks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some debugging which is only required in 0.01% of the cases is not a good idea.

I'd say an incorrectly benchmarked custom pallet in a parachain is much more common than that. Still, I implemented it as per your suggestion.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just take all the nodes into account that need to execute the block and that are not used for debugging. Debugging is a small fraction and you get the index of the transaction and that gives you all the information for free.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bkchr please double check, but to me it looks like ExtrinsicCount is not updated on the fly, but only at the end of the block here (also in line with the docs):

pub fn note_finished_extrinsics() {
let extrinsic_index: u32 =
storage::unhashed::take(well_known_keys::EXTRINSIC_INDEX).unwrap_or_default();
ExtrinsicCount::<T>::put(extrinsic_index);
ExecutionPhase::<T>::put(Phase::Finalization);
}

I think we should use extrinsic_index() instead which seems to be updated after every xt.

storage::unhashed::put(well_known_keys::EXTRINSIC_INDEX, &next_extrinsic_index);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🙈

@bkchr bkchr requested a review from skunert September 6, 2024 06:59
@bkchr bkchr added the T2-pallets This PR/Issue is related to a particular pallet. label Sep 6, 2024
@bkchr bkchr enabled auto-merge September 6, 2024 06:59
@bkchr bkchr added this pull request to the merge queue Sep 7, 2024
Merged via the queue into paritytech:master with commit 016421a Sep 7, 2024
230 of 234 checks passed
@bkchr bkchr added A3-backport Pull request is already reviewed well in another branch. A4-needs-backport Pull request must be backported to all maintained releases. and removed A3-backport Pull request is already reviewed well in another branch. labels Sep 19, 2024
github-actions bot pushed a commit that referenced this pull request Sep 19, 2024
When inspecting the logs we often encounter the following message:

`Benchmarked storage weight smaller than consumed storage weight.
benchmarked: {benchmarked_weight} consumed: {consumed_weight} unspent:
{unspent}`

However, it is very hard to guess which call is causing the issue.

With the changes proposed in this PR, information about the call is
provided so that we can easily identify the source of the problem
without further delay, and this way work more efficiently in solving the
issue.

---------

Co-authored-by: Bastian Köcher <[email protected]>
(cherry picked from commit 016421a)
@paritytech-cmd-bot-polkadot-sdk

Successfully created backport PR for stable2407:

@paritytech-cmd-bot-polkadot-sdk

Successfully created backport PR for stable2409:

github-actions bot pushed a commit that referenced this pull request Sep 19, 2024
When inspecting the logs we often encounter the following message:

`Benchmarked storage weight smaller than consumed storage weight.
benchmarked: {benchmarked_weight} consumed: {consumed_weight} unspent:
{unspent}`

However, it is very hard to guess which call is causing the issue.

With the changes proposed in this PR, information about the call is
provided so that we can easily identify the source of the problem
without further delay, and this way work more efficiently in solving the
issue.

---------

Co-authored-by: Bastian Köcher <[email protected]>
(cherry picked from commit 016421a)
bkchr pushed a commit that referenced this pull request Sep 19, 2024
Backport #5594 into `stable2407` from Moliholy.

See the
[documentation](https://github.com/paritytech/polkadot-sdk/blob/master/docs/BACKPORT.md)
on how to use this bot.

<!--
  # To be used by other automation, do not modify:
  original-pr-number: #${pull_number}
-->

Co-authored-by: José Molina Colmenero <[email protected]>
bkchr pushed a commit that referenced this pull request Sep 19, 2024
Backport #5594 into `stable2409` from Moliholy.

See the
[documentation](https://github.com/paritytech/polkadot-sdk/blob/master/docs/BACKPORT.md)
on how to use this bot.

<!--
  # To be used by other automation, do not modify:
  original-pr-number: #${pull_number}
-->

Co-authored-by: José Molina Colmenero <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A4-needs-backport Pull request must be backported to all maintained releases. T2-pallets This PR/Issue is related to a particular pallet.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants