[FEATURE] Add transport request retry capability for async workflow steps #158
Comments
I see at least 3 (good) ways to do this:
|
|
I think integrating this into |
Agreed with option 1 as a short-term fix and option 3 for the long term.
Is your feature request related to a problem?
Coming from PR #155, which adds a `GetMLTaskStep`. The underlying API is an async operation that is used after registering a local model. This step is used to ascertain the local model registration status and retrieve the model ID once the status is `COMPLETED`.
Local model registration takes upwards of 30 seconds, depending on the size of the model zip that needs to be downloaded into the cluster, so it is necessary for the `GetMLTaskStep` to repeatedly call this API until the model ID is returned.
This will be a common issue for any `WorkflowStep` that invokes an async API.
What alternatives have you considered?
Brute-force retrying of requests until a `COMPLETED` status is returned results in scores of requests sent across the transport layer. Perhaps sleeping the thread for some time and then retrying the request is the right path, but it does run the risk of having the `WorkflowStep` `node_timeout` run out.
Neural Plugin has this retry method for executing the predict API:
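The Neural Plugin snippet referenced above is not reproduced in this text. Independently of it, the sleep-and-retry idea could look roughly like the sketch below. Everything in it is hypothetical: the `PollWithBackoff` class, the `poll` helper, its parameters, and the fake task supplier are illustrative names, not part of this plugin or of ML Commons. It assumes the status call can be wrapped in a `Supplier` that returns an empty `Optional` until the task is `COMPLETED`, and it caps both the number of attempts and the total wait so the step stays inside its `node_timeout`.

```java
import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical helper, for illustration only; not an existing plugin class.
final class PollWithBackoff {

    /**
     * Calls {@code request} until it yields a value, the attempts run out,
     * or the time budget (e.g. the step's node_timeout) is spent.
     * The delay doubles after each empty response to limit transport traffic.
     */
    static <T> Optional<T> poll(Supplier<Optional<T>> request, int maxAttempts,
                                long initialDelayMs, long timeoutMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        long delayMs = initialDelayMs;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            Optional<T> result = request.get();
            if (result.isPresent()) {
                return result; // e.g. the model ID once the task is COMPLETED
            }
            long remainingMs = (deadline - System.nanoTime()) / 1_000_000L;
            if (attempt == maxAttempts || remainingMs <= 0) {
                break; // out of attempts or out of time budget
            }
            try {
                // Never sleep past the deadline.
                Thread.sleep(Math.min(delayMs, remainingMs));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                break;
            }
            delayMs *= 2;
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Stand-in for the async get-task call: completes on the third request.
        int[] calls = {0};
        Supplier<Optional<String>> fakeGetTask = () -> {
            calls[0]++;
            return calls[0] < 3 ? Optional.<String>empty() : Optional.of("model-123");
        };
        Optional<String> modelId = poll(fakeGetTask, 5, 10, 1000);
        System.out.println(modelId.orElse("timed out") + " after " + calls[0] + " calls");
        // prints "model-123 after 3 calls"
    }
}
```

The sketch blocks the calling thread for simplicity. In a real plugin it would likely be preferable to schedule each retry on a thread pool instead of sleeping, so that no transport thread is held while waiting.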