Combined register local model and get ml task #220

Merged on Nov 30, 2023 (4 commits)


joshpalis (Member)

Description

Combines the register local model and get ML task steps into a single workflow step.
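One way to picture the combined behavior is a single step that submits the registration and then polls the resulting ML task until it finishes, surfacing an error if the task fails. The following is a minimal Python sketch of that control flow under stated assumptions, not the plugin's Java implementation: `register_and_wait`, the injected `register` and `get_task` callables, the `state`, `model_id`, and `error` keys, and the retry parameters are all hypothetical names chosen for illustration.

```python
import time
from typing import Callable, Dict


def register_and_wait(
    register: Callable[[], str],
    get_task: Callable[[str], Dict[str, str]],
    max_polls: int = 10,
    delay_seconds: float = 0.0,
) -> str:
    """Submit a registration, then poll its task until it completes or fails."""
    task_id = register()  # hypothetical: returns the id of the asynchronous task
    for _ in range(max_polls):
        task = get_task(task_id)  # hypothetical: returns the task document as a dict
        state = task.get("state")
        if state == "COMPLETED":
            # The model id only becomes usable once the task has completed.
            return task["model_id"]
        if state == "FAILED":
            # Return the failure to the caller instead of polling forever.
            raise RuntimeError(f"registration task {task_id} failed: {task.get('error')}")
        time.sleep(delay_seconds)
    raise TimeoutError(f"registration task {task_id} did not finish after {max_polls} polls")


if __name__ == "__main__":
    # Fake task store standing in for a real cluster.
    responses = iter([{"state": "RUNNING"}, {"state": "COMPLETED", "model_id": "demo-model"}])
    print(register_and_wait(lambda: "demo-task", lambda _task_id: next(responses)))
```

Injecting the two callables keeps the loop independent of any client library, so the same sketch can be exercised with fakes, as in the demo at the bottom.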

Steps to test:

  1. Enable flow framework APIs
curl -i -XPUT "localhost:9200/_cluster/settings" -H "Content-Type:application/json" --data '{"transient":{"plugins.flow_framework.enabled":true}}'

HTTP/1.1 200 OK
X-OpenSearch-Version: OpenSearch/3.0.0-SNAPSHOT (opensearch)
content-type: application/json; charset=UTF-8
content-length: 99

{"acknowledged":true,"persistent":{},"transient":{"plugins":{"flow_framework":{"enabled":"true"}}}}
  2. Enable model registration via URL and model deployment on non-ML nodes
curl -i -XPUT "localhost:9200/_cluster/settings" -H "Content-Type:application/json" --data '{"persistent":{"plugins.ml_commons.allow_registering_model_via_url":true,"plugins.ml_commons.only_run_on_ml_node":false}}'
HTTP/1.1 200 OK
X-OpenSearch-Version: OpenSearch/3.0.0-SNAPSHOT (opensearch)
content-type: application/json; charset=UTF-8
content-length: 149

{"acknowledged":true,"persistent":{"plugins":{"ml_commons":{"only_run_on_ml_node":"false","allow_registering_model_via_url":"true"}}},"transient":{}}
  3. Create a workflow with the following three-step template
curl -i -XPOST "localhost:9200/_plugins/_flow_framework/workflow" -H "Content-Type:application/json" --data '{"name":"registermodelgroup-registerlocalmodel-deploymodel","description":"test case","use_case":"TEST_CASE","version":{"template":"1.0.0","compatibility":["2.12.0","3.0.0"]},"workflows":{"provision":{"nodes":[{"id":"workflow_step_1","type":"model_group","user_inputs":{"name":"my-model-group"}},{"id":"workflow_step_2","type":"register_local_model","previous_node_inputs":{"workflow_step_1":"model_group_id"},"user_inputs":{"node_timeout":"60s","name":"all-MiniLM-L6-v2","version":"1.0.0","description":"test model","model_format":"TORCH_SCRIPT","model_content_hash_value":"c15f0d2e62d872be5b5bc6c84d2e0f4921541e29fefbef51d59cc10a8ae30e0f","model_type":"bert","embedding_dimension":"384","framework_type":"sentence_transformers","all_config":"{\"_name_or_path\":\"nreimers/MiniLM-L6-H384-uncased\",\"architectures\":[\"BertModel\"],\"attention_probs_dropout_prob\":0.1,\"gradient_checkpointing\":false,\"hidden_act\":\"gelu\",\"hidden_dropout_prob\":0.1,\"hidden_size\":384,\"initializer_range\":0.02,\"intermediate_size\":1536,\"layer_norm_eps\":1e-12,\"max_position_embeddings\":512,\"model_type\":\"bert\",\"num_attention_heads\":12,\"num_hidden_layers\":6,\"pad_token_id\":0,\"position_embedding_type\":\"absolute\",\"transformers_version\":\"4.8.2\",\"type_vocab_size\":2,\"use_cache\":true,\"vocab_size\":30522}","url":"https://artifacts.opensearch.org/models/ml-models/huggingface/sentence-transformers/all-MiniLM-L6-v2/1.0.1/torch_script/sentence-transformers_all-MiniLM-L6-v2-1.0.1-torch_script.zip"}},{"id":"workflow_step_3","type":"deploy_model","previous_node_inputs":{"workflow_step_2":"model_id"}}],"edges":[{"source":"workflow_step_1","dest":"workflow_step_2"},{"source":"workflow_step_2","dest":"workflow_step_3"}]}}}'
HTTP/1.1 201 Created
X-OpenSearch-Version: OpenSearch/3.0.0-SNAPSHOT (opensearch)
content-type: application/json; charset=UTF-8
content-length: 38

{"workflow_id":"V8amHYwBtzN_VGd3WZRu"}
  4. Provision the workflow; the logs show that all steps completed successfully
curl -i -XPOST "localhost:9200/_plugins/_flow_framework/workflow/V8amHYwBtzN_VGd3WZRu/_provision"
HTTP/1.1 200 OK
X-OpenSearch-Version: OpenSearch/3.0.0-SNAPSHOT (opensearch)
content-type: application/json; charset=UTF-8
content-length: 38

{"workflow_id":"V8amHYwBtzN_VGd3WZRu"}
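The template in the create-workflow request is hard to read as a single line of JSON. The Python sketch below rebuilds the same three nodes and two edges as a dictionary and runs a local consistency check that every `previous_node_inputs` reference has a matching edge. The helper names `build_template` and `undeclared_dependencies` are hypothetical, the check is an illustration rather than something the plugin is known to enforce, and the `user_inputs` of the second node are deliberately trimmed (the full model fields are in the curl body above).

```python
import json
from typing import Dict, List


def build_template() -> Dict:
    """Return a trimmed version of the three-node template used in the test steps."""
    nodes = [
        {"id": "workflow_step_1", "type": "model_group",
         "user_inputs": {"name": "my-model-group"}},
        {"id": "workflow_step_2", "type": "register_local_model",
         "previous_node_inputs": {"workflow_step_1": "model_group_id"},
         # Trimmed for readability; the real request carries the full model config.
         "user_inputs": {"node_timeout": "60s", "name": "all-MiniLM-L6-v2", "version": "1.0.0"}},
        {"id": "workflow_step_3", "type": "deploy_model",
         "previous_node_inputs": {"workflow_step_2": "model_id"}},
    ]
    edges = [
        {"source": "workflow_step_1", "dest": "workflow_step_2"},
        {"source": "workflow_step_2", "dest": "workflow_step_3"},
    ]
    return {
        "name": "registermodelgroup-registerlocalmodel-deploymodel",
        "description": "test case",
        "use_case": "TEST_CASE",
        "version": {"template": "1.0.0", "compatibility": ["2.12.0", "3.0.0"]},
        "workflows": {"provision": {"nodes": nodes, "edges": edges}},
    }


def undeclared_dependencies(template: Dict) -> List[str]:
    """List node inputs that reference a previous node without a matching edge."""
    provision = template["workflows"]["provision"]
    edges = {(edge["source"], edge["dest"]) for edge in provision["edges"]}
    problems = []
    for node in provision["nodes"]:
        for source in node.get("previous_node_inputs", {}):
            if (source, node["id"]) not in edges:
                problems.append(f"{node['id']} reads from {source} without an edge")
    return problems


if __name__ == "__main__":
    template = build_template()
    print(json.dumps(template, indent=2))
    print(undeclared_dependencies(template))
```

Read this way, the data flow is easier to follow: the model group id from the first node feeds the registration node, and the model id produced by registration feeds the deploy node.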

OpenSearch logs:

[2023-11-30T00:35:13,874][INFO ][o.o.f.w.ProcessNode      ] [dev-dsk-jpalis-2c-27c8aa11.us-west-2.amazon.com] Starting workflow_step_1.
[2023-11-30T00:35:13,875][INFO ][o.o.f.t.ProvisionWorkflowTransportAction] [dev-dsk-jpalis-2c-27c8aa11.us-west-2.amazon.com] Queueing process [workflow_step_3]. Must wait for [workflow_step_2] to complete first.
[2023-11-30T00:35:13,903][INFO ][o.o.p.PluginsService     ] [dev-dsk-jpalis-2c-27c8aa11.us-west-2.amazon.com] PluginService:onIndexModule index:[.plugins-ml-model-group/gD8gFRTHTF-RfdRR-jzH1w]
[2023-11-30T00:35:13,911][INFO ][o.o.c.m.MetadataCreateIndexService] [dev-dsk-jpalis-2c-27c8aa11.us-west-2.amazon.com] [.plugins-ml-model-group] creating index, cause [api], templates [], shards [1]/[1]
[2023-11-30T00:35:13,913][INFO ][o.o.c.r.a.AllocationService] [dev-dsk-jpalis-2c-27c8aa11.us-west-2.amazon.com] updating number_of_replicas to [0] for indices [.plugins-ml-model-group]
[2023-11-30T00:35:13,920][INFO ][o.o.f.t.ProvisionWorkflowTransportAction] [dev-dsk-jpalis-2c-27c8aa11.us-west-2.amazon.com] updated workflow V8amHYwBtzN_VGd3WZRu state to PROVISIONING
[2023-11-30T00:35:13,936][INFO ][o.o.p.PluginsService     ] [dev-dsk-jpalis-2c-27c8aa11.us-west-2.amazon.com] PluginService:onIndexModule index:[.plugins-ml-model-group/gD8gFRTHTF-RfdRR-jzH1w]
[2023-11-30T00:35:13,979][INFO ][o.o.c.r.a.AllocationService] [dev-dsk-jpalis-2c-27c8aa11.us-west-2.amazon.com] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[.plugins-ml-model-group][0]]]).
[2023-11-30T00:35:14,001][INFO ][o.o.m.i.MLIndicesHandler ] [dev-dsk-jpalis-2c-27c8aa11.us-west-2.amazon.com] create index:.plugins-ml-model-group
[2023-11-30T00:35:14,018][INFO ][o.o.f.w.ModelGroupStep   ] [dev-dsk-jpalis-2c-27c8aa11.us-west-2.amazon.com] Model group registration successful
[2023-11-30T00:35:14,020][INFO ][o.o.f.w.ProcessNode      ] [dev-dsk-jpalis-2c-27c8aa11.us-west-2.amazon.com] Finished workflow_step_1.
[2023-11-30T00:35:14,020][INFO ][o.o.f.w.ProcessNode      ] [dev-dsk-jpalis-2c-27c8aa11.us-west-2.amazon.com] Starting workflow_step_2.
[2023-11-30T00:35:14,029][INFO ][stdout                   ] [dev-dsk-jpalis-2c-27c8aa11.us-west-2.amazon.com] registering the model
[2023-11-30T00:35:14,034][INFO ][o.o.p.PluginsService     ] [dev-dsk-jpalis-2c-27c8aa11.us-west-2.amazon.com] PluginService:onIndexModule index:[.plugins-ml-task/cbYGRPRMSXK6wtWorzj5FQ]
[2023-11-30T00:35:14,040][INFO ][o.o.c.m.MetadataCreateIndexService] [dev-dsk-jpalis-2c-27c8aa11.us-west-2.amazon.com] [.plugins-ml-task] creating index, cause [api], templates [], shards [1]/[1]
[2023-11-30T00:35:14,041][INFO ][o.o.c.r.a.AllocationService] [dev-dsk-jpalis-2c-27c8aa11.us-west-2.amazon.com] updating number_of_replicas to [0] for indices [.plugins-ml-task]
[2023-11-30T00:35:14,063][INFO ][o.o.p.PluginsService     ] [dev-dsk-jpalis-2c-27c8aa11.us-west-2.amazon.com] PluginService:onIndexModule index:[.plugins-ml-task/cbYGRPRMSXK6wtWorzj5FQ]
[2023-11-30T00:35:14,105][INFO ][o.o.c.r.a.AllocationService] [dev-dsk-jpalis-2c-27c8aa11.us-west-2.amazon.com] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[.plugins-ml-task][0]]]).
[2023-11-30T00:35:14,124][INFO ][o.o.m.i.MLIndicesHandler ] [dev-dsk-jpalis-2c-27c8aa11.us-west-2.amazon.com] create index:.plugins-ml-task
[2023-11-30T00:35:14,138][INFO ][o.o.f.w.RegisterLocalModelStep] [dev-dsk-jpalis-2c-27c8aa11.us-west-2.amazon.com] Local Model registration task creation successful
[2023-11-30T00:35:14,168][INFO ][o.o.p.PluginsService     ] [dev-dsk-jpalis-2c-27c8aa11.us-west-2.amazon.com] PluginService:onIndexModule index:[.plugins-ml-model/mDQlJQmeQVuJkADy09jRgA]
[2023-11-30T00:35:14,177][INFO ][o.o.c.m.MetadataCreateIndexService] [dev-dsk-jpalis-2c-27c8aa11.us-west-2.amazon.com] [.plugins-ml-model] creating index, cause [api], templates [], shards [1]/[1]
[2023-11-30T00:35:14,178][INFO ][o.o.c.r.a.AllocationService] [dev-dsk-jpalis-2c-27c8aa11.us-west-2.amazon.com] updating number_of_replicas to [0] for indices [.plugins-ml-model]
[2023-11-30T00:35:14,196][INFO ][o.o.p.PluginsService     ] [dev-dsk-jpalis-2c-27c8aa11.us-west-2.amazon.com] PluginService:onIndexModule index:[.plugins-ml-model/mDQlJQmeQVuJkADy09jRgA]
[2023-11-30T00:35:14,244][INFO ][o.o.c.r.a.AllocationService] [dev-dsk-jpalis-2c-27c8aa11.us-west-2.amazon.com] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[.plugins-ml-model][0]]]).
[2023-11-30T00:35:14,261][INFO ][o.o.m.i.MLIndicesHandler ] [dev-dsk-jpalis-2c-27c8aa11.us-west-2.amazon.com] create index:.plugins-ml-model
[2023-11-30T00:35:14,280][INFO ][o.o.m.m.MLModelManager   ] [dev-dsk-jpalis-2c-27c8aa11.us-west-2.amazon.com] create new model meta doc WsamHYwBtzN_VGd3tpTX for register model task WcamHYwBtzN_VGd3tpRN
Downloading: 100% |████████████████████████████████████████| all-MiniLM-L6-v2.zip
[2023-11-30T00:35:18,595][INFO ][o.o.m.m.MLModelManager   ] [dev-dsk-jpalis-2c-27c8aa11.us-west-2.amazon.com] Model registered successfully, model id: WsamHYwBtzN_VGd3tpTX, task id: WcamHYwBtzN_VGd3tpRN
[2023-11-30T00:35:19,147][INFO ][o.o.f.w.RegisterLocalModelStep] [dev-dsk-jpalis-2c-27c8aa11.us-west-2.amazon.com] Local model registeration successful
[2023-11-30T00:35:19,148][INFO ][o.o.f.w.ProcessNode      ] [dev-dsk-jpalis-2c-27c8aa11.us-west-2.amazon.com] Starting workflow_step_3.
[2023-11-30T00:35:19,147][INFO ][o.o.f.w.ProcessNode      ] [dev-dsk-jpalis-2c-27c8aa11.us-west-2.amazon.com] Finished workflow_step_2.
[2023-11-30T00:35:19,155][INFO ][o.o.m.a.d.TransportDeployModelAction] [dev-dsk-jpalis-2c-27c8aa11.us-west-2.amazon.com] Will deploy model on these nodes: DVa9RGdfTuKcBNIHyvVWxQ
[2023-11-30T00:35:19,169][INFO ][o.o.f.w.DeployModelStep  ] [dev-dsk-jpalis-2c-27c8aa11.us-west-2.amazon.com] Model deployment state CREATED
[2023-11-30T00:35:19,170][INFO ][o.o.f.t.ProvisionWorkflowTransportAction] [dev-dsk-jpalis-2c-27c8aa11.us-west-2.amazon.com] Provisioning completed successfully for workflow V8amHYwBtzN_VGd3WZRu
[2023-11-30T00:35:19,170][INFO ][o.o.f.w.ProcessNode      ] [dev-dsk-jpalis-2c-27c8aa11.us-west-2.amazon.com] Finished workflow_step_3.
[2023-11-30T00:35:19,183][INFO ][o.o.f.t.ProvisionWorkflowTransportAction] [dev-dsk-jpalis-2c-27c8aa11.us-west-2.amazon.com] updated workflow V8amHYwBtzN_VGd3WZRu state to COMPLETED
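
The `ProcessNode` entries in the logs above mark when each workflow step starts and finishes, so the time spent per step can be read off mechanically. The Python sketch below does that for lines in the format shown here; the regular expression and the `step_durations` helper are assumptions written for this log layout, not part of the plugin.

```python
import re
from datetime import datetime
from typing import Dict, Iterable

# Matches ProcessNode lines such as "... Starting workflow_step_2." (format assumed from the logs above).
EVENT = re.compile(
    r"^\[(?P<ts>[^\]]+)\].*ProcessNode.*\] (?P<kind>Starting|Finished) (?P<step>\w+)\."
)


def step_durations(lines: Iterable[str]) -> Dict[str, float]:
    """Return seconds elapsed between the Starting and Finished line of each step."""
    started: Dict[str, datetime] = {}
    durations: Dict[str, float] = {}
    for line in lines:
        match = EVENT.match(line.strip())
        if not match:
            continue  # skip lines from other loggers
        when = datetime.strptime(match["ts"], "%Y-%m-%dT%H:%M:%S,%f")
        if match["kind"] == "Starting":
            started[match["step"]] = when
        elif match["step"] in started:
            durations[match["step"]] = (when - started[match["step"]]).total_seconds()
    return durations


if __name__ == "__main__":
    host = "[dev-dsk-jpalis-2c-27c8aa11.us-west-2.amazon.com]"
    sample = [
        f"[2023-11-30T00:35:14,020][INFO ][o.o.f.w.ProcessNode      ] {host} Starting workflow_step_2.",
        f"[2023-11-30T00:35:19,147][INFO ][o.o.f.w.ProcessNode      ] {host} Finished workflow_step_2.",
    ]
    print(step_durations(sample))
```

For the run above, `workflow_step_2` spans 00:35:14,020 to 00:35:19,147, about 5.1 seconds, most of which is the model download and registration.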

Issues Resolved

Part of issue #88

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@amitgalitz (Member) left a comment

LGTM, thanks for the quick change

Signed-off-by: Joshua Palis <[email protected]>
@joshpalis joshpalis merged commit e2581a1 into opensearch-project:main Nov 30, 2023
19 checks passed
opensearch-trigger-bot bot pushed a commit that referenced this pull request Nov 30, 2023
* Combined register local model and get ml task

Signed-off-by: Joshua Palis <[email protected]>

* Handling task failure, returning error to user, added related test

Signed-off-by: Joshua Palis <[email protected]>

* Removing unnecessary MLConfig and spy

Signed-off-by: Joshua Palis <[email protected]>

---------

Signed-off-by: Joshua Palis <[email protected]>
(cherry picked from commit e2581a1)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
joshpalis pushed a commit that referenced this pull request Nov 30, 2023
Combined register local model and get ml task (#220)

* Combined register local model and get ml task

* Handling task failure, returning error to user, added related test

* Removing unnecessary MLConfig and spy

---------

(cherry picked from commit e2581a1)

Signed-off-by: Joshua Palis <[email protected]>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
joshpalis pushed a commit that referenced this pull request Nov 30, 2023
…get ml task (#224)

Combined register local model and get ml task (#220)

* Combined register local model and get ml task

* Handling task failure, returning error to user, added related test

* Removing unnecessary MLConfig and spy

---------

(cherry picked from commit e2581a1)

Signed-off-by: Joshua Palis <[email protected]>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
3 participants