Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rebase peppermint branch to v0.38.5 #2

Merged
merged 138 commits into from
Mar 1, 2024

Conversation

Raneet10
Copy link
Member

This PR ports in the relevant changes from peppermint fork taking into account the points mentioned in informalsystems/heimdall-migration#11 .

vaibhavchellani and others added 30 commits May 11, 2019 01:26
* Remove go func {}()

closes cometbft#357

- Remove go func(){}() that caused race condiditon

- To reproduce
	- add -race in make file to `install_abci`
	- Remove `CGO_ENABLED=0` & add -race to `install`

Signed-off-by: Marko Baricevic <[email protected]>

* remove -race

* fix data race

also, reorder callbacks similarly to socket client
…ometbft#3608)

* docs: go built-in guide

* fix package imports, add badger db, simplify Query

* newTendermint function

* working example

* finish the first guide

* add one more note

* add the second Golang guide - external ABCI app

* fix typos
…ft#3811)

* Remove db from tendemrint in favor of tendermint/tm-cmn

- remove db from `libs`
- update dependancy, there have been no breaking changes in the updated deps
	- https://github.com/grpc/grpc-go/releases
	- https://github.com/golang/protobuf/releases

Signed-off-by: Marko Baricevic <[email protected]>

* changelog add

* gofmt

* more gofmt
* ADR TOC in readme.md

* Added A TOC to the Readme.md of ADR Section

- Added table of contents to the Readme of the architecture section.
	- Easier to traverse and when you know what is there.
	- If the Adr's become viewable online it would help guide the user

Signed-off-by: Marko Baricevic <[email protected]>

* add tm-cmn to subprojects

* normalize word
…3818)

* rpc: make max_body_bytes and max_header_bytes configurable

* update changelog pending
* use byte buffer pool to decreass allocs

* wrap to put buffer in defer

* wapper defer

* add dependency

* remove Gopkg,*

* add change log
* implement broadcast_duplicate_vote endpoint

* fix test_cover

* address comments

* address comments

* Update abci/example/kvstore/persistent_kvstore.go

Co-Authored-By: mossid <[email protected]>

* Update rpc/client/main_test.go

Co-Authored-By: mossid <[email protected]>

* address comments in progress

* reformat the code

* make linter happy

* make tests pass

* replace BroadcastDuplicateVote with BroadcastEvidence

* fix test

* fix endpoint name

* improve doc

* fix TestBroadcastEvidenceDuplicateVote

* Update rpc/core/evidence.go

Co-Authored-By: Thane Thomson <[email protected]>

* add changelog entry

* fix TestBroadcastEvidenceDuplicateVote
* mempool: make max_msg_bytes configurable

* apply suggestions from code review

* update changelog pending

* apply suggestions from code review again
cometbft#3825)

* rpc: return err if page is incorrect (less than 0 or greater than total pages)

Fixes cometbft#3813

* fix rpc_test
* go routines in blockchain reactor

* Added reference to the go routine diagram

* Initial commit

* cleanup

* Undo testing_logger change, committed by mistake

* Fix the test loggers

* pulled some fsm code into pool.go

* added pool tests

* changes to the design

added block requests under peer

moved the request trigger in the reactor poolRoutine, triggered now by a ticker

in general moved everything required for making block requests smarter in the poolRoutine

added a simple map of heights to keep track of what will need to be requested next

added a few more tests

* send errors to FSM in a different channel than blocks

send errors (RemovePeer) from switch on a different channel than the
one receiving blocks
renamed channels
added more pool tests

* more pool tests

* lint errors

* more tests

* more tests

* switch fast sync to new implementation

* fixed data race in tests

* cleanup

* finished fsm tests

* address golangci comments :)

* address golangci comments :)

* Added timeout on next block needed to advance

* updating docs and cleanup

* fix issue in test from previous cleanup

* cleanup

* Added termination scenarios, tests and more cleanup

* small fixes to adr, comments and cleanup

* Fix bug in sendRequest()

If we tried to send a request to a peer not present in the switch, a
missing continue statement caused the request to be blackholed in a peer
that was removed and never retried.

While this bug was manifesting, the reactor kept asking for other
blocks that would be stored and never consumed. Added the number of
unconsumed blocks in the math for requesting blocks ahead of current
processing height so eventually there will be no more blocks requested
until the already received ones are consumed.

* remove bpPeer's didTimeout field

* Use distinct err codes for peer timeout and FSM timeouts

* Don't allow peers to update with lower height

* review comments from Ethan and Zarko

* some cleanup, renaming, comments

* Move block execution in separate goroutine

* Remove pool's numPending

* review comments

* fix lint, remove old blockchain reactor and duplicates in fsm tests

* small reorg around peer after review comments

* add the reactor spec

* verify block only once

* review comments

* change to int for max number of pending requests

* cleanup and godoc

* Add configuration flag fast sync version

* golangci fixes

* fix config template

* move both reactor versions under blockchain

* cleanup, golint, renaming stuff

* updated documentation, fixed more golint warnings

* integrate with behavior package

* sync with master

* gofmt

* add changelog_pending entry

* move to improvments

* suggestion to changelog entry
* Renamed wire.go to codec.go

- Wire was the previous name of amino
- Codec describes the file better than `wire` & `amino`

Signed-off-by: Marko Baricevic <[email protected]>

* ide error

* rename amino.go to codec.go
cleanup to add linter

    grpc change:
        https://godoc.org/google.golang.org/grpc#WithContextDialer
        https://godoc.org/google.golang.org/grpc#WithDialer
        grpc/grpc-go#2627
    prometheous change:
        due to UninstrumentedHandler, being deprecated in the future
    empty branch = empty if or else statement
        didn't delete them entirely but commented
        couldn't find a reason to have them
    could not replicate the issue cometbft#3406
        but if want to keep it commented then we should comment out the if statement as well
* p2p: fix false-positive error logging when stopping connections

This changeset fixes two types of false-positive errors occurring during
connection shutdown.

The first occurs when the process invokes FlushStop() or Stop() on a
connection. While the previous behavior did properly wait for the sendRoutine
to finish, it did not notify the recvRoutine that the connection was shutting
down. This would cause the recvRouting to receive and error when reading and
log this error. The changeset fixes this by notifying the recvRoutine that
the connection is shutting down.

The second occurs when the connection is terminated (gracefully) by the other side.
The recvRoutine would get an EOF error during the read, log it, and stop the connection
with an error. The changeset detects EOF and gracefully shuts down the connection.

* bring back the comment about flushing

* add changelog entry

* listen for quitRecvRoutine too

* we have to call stopForError

Otherwise peer won't be removed from the peer set and maybe readded
later.
…ds (cometbft#3834)

* Do not write 'Couldn't connect to any seeds' if there are no seeds

* changelog

* remove privValUpgrade

* Fix typo in changelog

* Update CHANGELOG_PENDING.md

Co-Authored-By: Marko <[email protected]>

I'm setting up all peers dynamically by calling dial_peers, so p2p.seeds in configs is empty, and I'm seeing error log a lot in logs.
…ometbft#3838)

* add abci grpc kotlin guide

* Update docs/guides/kotlin.md

Co-Authored-By: Anton Kaliaev <[email protected]>

* Update docs/guides/kotlin.md

Co-Authored-By: Anton Kaliaev <[email protected]>

* Update docs/guides/kotlin.md

Co-Authored-By: Anton Kaliaev <[email protected]>

* Update kotlin.md
* node: allow replacing existing p2p.Reactor(s)

using [`CustomReactors`
option](https://godoc.org/github.com/tendermint/tendermint/node#CustomReactors).
Warning: beware of accidental name clashes. Here is the list of existing
reactors: MEMPOOL, BLOCKCHAIN, CONSENSUS, EVIDENCE, PEX.

* check the absence of "CUSTOM" prefix

* merge 2 tests

* add doc.go to node package
    Add gocritic as a linter

    The linting is not complete, but should i complete in this PR or in a following.

    23 files have been touched so it may be better to do in a following PR


Commits:

* Add gocritic to linting

- Added gocritic to linting

Signed-off-by: Marko Baricevic <[email protected]>

* gocritic

* pr comments

* remove switch in cmdBatch
* tm-cmn to tm-db

* go.mod changes

* go.mod changes

* more go.mod

* fix tm-db

* ci fix, pending change
- Replace the previous intersect call, which was called at each query condition, with a map intersection.
- Replace fmt.Sprintf with string()

closes: cometbft#3076

Benchmarks

```
Old
goos: darwin
goarch: amd64
pkg: github.com/tendermint/tendermint/state/txindex/kv
BenchmarkTxSearch-4   	     200	 103641206 ns/op	 7998416 B/op	   71171 allocs/op
PASS
ok  	github.com/tendermint/tendermint/state/txindex/kv	26.019s

New
goos: darwin
goarch: amd64
pkg: github.com/tendermint/tendermint/state/txindex/kv
BenchmarkTxSearch-4   	    1000	  38615024 ns/op	13515226 B/op	  166460 allocs/op
PASS
ok  	github.com/tendermint/tendermint/state/txindex/kv	53.618s
```

~62% performance improvement

Commits:

* Refactor tx search

* Add pending changelog entry

* Add tx search benchmarking

* remove intermediate hashes list

also reset timer in BenchmarkTxSearch
and fix other benchmark

* fix import

* Add test cases

* Fix searching

* Replace fmt.Sprintf with string

* Update state/txindex/kv/kv.go

Co-Authored-By: Anton Kaliaev <[email protected]>

* Rename params

* Cleanup

* Check error in benchmarks
abci/types/messages.go Outdated Show resolved Hide resolved
go.mod Outdated Show resolved Hide resolved
go.mod Outdated Show resolved Hide resolved
types/block.go Outdated Show resolved Hide resolved
@Raneet10
Copy link
Member Author

Raneet10 commented Feb 9, 2024

Lints failing all of a sudden ? 👀

Copy link

@sergio-mena sergio-mena left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Raneet10 for adding the new protobuf.
This PR LGTM.

Copy link

sonarcloud bot commented Mar 1, 2024

Quality Gate Failed Quality Gate failed

Failed conditions
9.4% Duplication on New Code (required ≤ 3%)

See analysis details on SonarCloud

@Raneet10 Raneet10 merged commit 62a78ac into v0.38.5-upstream Mar 1, 2024
16 of 17 checks passed
@Raneet10 Raneet10 deleted the raneet10/peppermint-changes branch March 1, 2024 10:01
Comment on lines +546 to +551
if r.Switch.Peers().Size() == 0 {
r.Logger.Error("Peer Info", "numPeers", r.Switch.Peers().Size())
} else {
r.Logger.Info("Peer Info", "numPeers", r.Switch.Peers().Size())
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies for reopening the discussion, but I would like to understand the motivation behind logging here.

Here are some of my hypotheses. Please forgive my ignorance and fill me in with the correct information:

If you want to monitor the number of connected peers on the node, there are methods already in place for it: you can query the /net_info RPC endpoint, and I'm fairly sure there is also a Prometheus endpoint that will get you the information.

If you wanted to put this information in the log file specifically, why having no peers is an error message while having peers is an info message? Maybe emitting a debug message in every case is a more homogenous solution. Maybe there is an implicit error that zero peers bring that I'm unaware of.

Also, node operators tend to look at Prometheus and RPC for node monitoring before they start parsing the logs. I understand that developers usually start parsing the logs before turning to other troubleshooting methods.

Is this logging for developer troubleshooting purposes? If that's the case and the number of peers doesn't impact execution, I would still consider a debug message a better log level.

Finally, I'm trying to understand why the logging happens precisely here. (In dialPeer and at the beginning of it). dialPeer is called three times in the whole p2p library (not including tests):
ReceiveAddr - if we receive a seed node address, let's call it immediately
ensurePeers - if we can add more peers, let's add more peers (ensure, that there are peers we connected to)
crawlPeers - Find new peers on the network and add them

Are these log messages expected to be emitted only in these cases?

The reason I'm asking these questions is because I'd like to port this to CometBFT main, if it adds functionality for either the developer or the end user. But I need to understand the use case better.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @greg-szabo , no problems. I'd let @pratikspatil024 jump in because I might be wrong here but AFAIR, as you mentioned, we wanted to add some logging on the peer stats. I directly ported that in from maticnetwork/tendermint#32. If these numbers are trivial to get from an RPC endpoint then I guess we can remove the logs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, as @Raneet10 mentioned, the purpose of these logs was just to get the number of peers a node is connected to. We can remove these logs as we can get the information through RPC.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.