-
Notifications
You must be signed in to change notification settings - Fork 326
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: Introduce request retry mechanism to Waved to retry failed requests #2035
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @senalw for working on this 🙏
I'm neither a go expert nor an expert with the wave codebase so I'm not the right person to decide whether this PR should go in or not, just a couple of comments from my side:
- This continues to be the only release blocker for MLOps 0.62.0. We won't be able to release our software because of it and we're already late by a couple of weeks. So any help from the wave team @mturoci @lo5 will be extremely appreciated 🙏
- @senalw I don't see any linked issue in the PR. Did we not create an issue? I know there have been slack conversations but the issues are how the work is being tracked so please create one if it's not there already.
- @senalw Since there's a go package for everything nowadays why try to implement this yourself if you can use an already existing and tested and much more configurable package for it?
Thanks for the suggestions. I linked the issue for this PR. |
This PR is a needlessly complicated solution to what's obviously an availability / resource-scheduling problem on your side. If you really want the wave server to not drop the app after a TCP reset, the simple fix is to introduce a flag that will do exactly that. e.g. func (app *App) forward(clientID string, session *Session, data []byte) {
if err := app.send(clientID, session, data); err != nil {
echo(Log{"t": "app", "route": app.route, "host": app.addr, "error": err.Error()})
if (dropIfUnresponsive) { // get global setting from env var or arg, default true to preserve current behavior.
app.broker.dropApp(app.route)
}
}
} This way, the underlying |
Closing this PR as fix is provided with this PR. |
Issue: Currently Waved disconnects app when a request failure occurs between Waved and app. Therefore, this PR will introduce a request retry mechanism to retry failed requests for a certain amount of time period which can be configured by a user. If this retry count (
H2O_WAVE_MAX_REQUEST_RETRY_COUNT
) is not configured, it wont enable request retry mechanism.Testing Conditions,
H2O_WAVE_MAX_REQUEST_RETRY_COUNT=20
andH2O_WAVE_REQUEST_RETRY_INTERVAL=1s
Testing Records,
read: connection reset by peer
disconnection for 2 seconds and reset back to normal with deleting iptables rule after 2 seconds.sleep 5 && iptables -A INPUT -p tcp --dport 8000 -j REJECT --reject-with tcp-reset && sleep 2 && iptables -D INPUT -p tcp --dport 8000 -j REJECT --reject-with tcp-reset
Result:
App was able to successfully recover after 2 seconds,
Screen.Recording.2023-06-23.at.19.28.52.mov
read: connection reset by peer
disconnection until requests reachH2O_WAVE_MAX_REQUEST_RETRY_COUNT
and drops the appsleep 5 && iptables -A INPUT -p tcp --dport 8000 -j REJECT --reject-with tcp-reset
Result:
App drops after requests reach maximum retry count and unable to recover the connection.
Screen.Recording.2023-06-23.at.23.24.58.mov