r2r container failed to become healthy. #1117

Open
linuxreitt opened this issue Sep 11, 2024 · 6 comments

@linuxreitt

Describe the bug
After setting up the Hatchet engine using Docker Compose, the hatchet-engine service is marked as unhealthy, and I receive a "connection refused" error when attempting to connect to 172.17.0.1:7077. The alias host.docker.internal seems to be misconfigured within the docker-compose network, causing issues with the hatchet-engine healthcheck and service communication.
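
One way to narrow down a "connection refused" report like this is to test, from inside the affected container, which addresses actually accept TCP connections on the engine port. The following is a minimal sketch using only the Python standard library. The port 7077 and the address 172.17.0.1 come from the error message; the helper names `can_connect` and `report` are hypothetical, and the idea that the Compose service name `hatchet-engine` might be a reachable alternative is an assumption, not something confirmed in this thread.

```python
import socket
from typing import Dict, Iterable


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


def report(hosts: Iterable[str], port: int) -> Dict[str, bool]:
    """Map each candidate host to whether the given port is reachable on it."""
    return {host: can_connect(host, port) for host in hosts}
```

For example, `report(["host.docker.internal", "172.17.0.1", "hatchet-engine"], 7077)` run inside the r2r container would show whether the engine is reachable by its service name while the host alias is refused, which would point at the alias rather than at the engine itself.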

To Reproduce
Steps to reproduce the behavior:

  1. r2r serve --docker --config-name=local_llm
  2. R2R now runs on port 7272 by default!
    Pulling Docker images...
    [+] Pulling 22/22
    ✔ setup-token Skipped - Image is already being pulled by hatchet-setup-config 0.0s
    ✔ neo4j Pulled 1.9s
    ✔ r2r-dashboard Pulled 2.1s
    ✔ hatchet-dashboard Pulled 1.3s
    ✔ hatchet-rabbitmq Pulled 1.9s
    ✔ postgres Pulled 2.0s
    ✔ r2r Pulled 1.9s
    ✔ hatchet-setup-config Pulled 35.8s
    ✔ c6a83fedfae6 Already exists 0.0s
    ✔ a7adfaf8acb2 Already exists 0.0s
    ✔ 2e1aee94ff63 Pull complete 34.1s
    ✔ 0cc9b3b5c238 Pull complete 34.1s
    ✔ abcedb1bd74f Pull complete 34.4s
    ✔ hatchet-engine Pulled 1.5s
    ✔ traefik Pulled 2.1s
    ✔ hatchet-migration Pulled 37.6s
    ✔ 5a08e4bb3ddd Pull complete 33.6s
    ✔ 8a86e811c654 Pull complete 36.0s
    ✔ 2bc6a5dc2d85 Pull complete 36.0s
    ✔ c5978cfa4652 Pull complete 36.0s
    ✔ a6d3ef985032 Pull complete 36.0s
    ✔ hatchet-api Pulled 1.5s
    Starting Docker Compose setup...
    [+] Running 12/12
    ✔ Container r2r-hatchet-rabbitmq-1 Healthy 11.8s
    ✔ Container r2r-traefik-1 Started 0.6s
    ✔ Container r2r-hatchet-dashboard-1 Started 0.5s
    ✔ Container r2r-postgres-1 Healthy 13.8s
    ✔ Container r2r-r2r-dashboard-1 Started 0.6s
    ✔ Container r2r-neo4j-1 Healthy 21.8s
    ✔ Container r2r-hatchet-migration-1 Exited 12.1s
    ✔ Container r2r-hatchet-setup-config-1 Exited 12.8s
    ✔ Container r2r-hatchet-api-1 Started 13.1s
    ✔ Container r2r-hatchet-engine-1 Started 13.1s
    ✔ Container r2r-setup-token-1 Exited 13.6s
    ✔ Container r2r-r2r-1 Started 21.7s
    Waiting for all services to become healthy...
    Timeout waiting for r2r to be healthy.
    r2r container failed to become healthy.
    Navigating to R2R application at http://localhost:7273.

R2R$ Opening in existing browser session.

Expected behavior
Healthy containers and the ability to access the web UI.

Desktop (please complete the following information):

  • OS: Ubuntu 24.04
@NolanTrem
Collaborator

Can you share the logs from your R2R server container (it should be named r2r-r2r-1 or similar)?

This can happen if it fails to respond to a health check, so I expect that it's binding somewhere.
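
For anyone gathering this information, here is a rough sketch of how the health status and recent logs could be collected with the standard Docker CLI (`docker inspect --format` and `docker logs --tail`). The container name `r2r-r2r-1` is taken from the startup output earlier in this issue; the helper names and the injectable `run` parameter are hypothetical conveniences, not part of R2R.

```python
import subprocess
from typing import Callable, Sequence


def _run(cmd: Sequence[str]) -> str:
    """Run a command and return stdout and stderr combined as text."""
    result = subprocess.run(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True, check=False
    )
    return result.stdout.strip()


def health_status(container: str, run: Callable[[Sequence[str]], str] = _run) -> str:
    """Ask Docker for the container's health status (e.g. healthy, unhealthy)."""
    return run(["docker", "inspect", "--format", "{{.State.Health.Status}}", container])


def recent_logs(container: str, lines: int = 200,
                run: Callable[[Sequence[str]], str] = _run) -> str:
    """Fetch the last `lines` log lines from the container."""
    return run(["docker", "logs", "--tail", str(lines), container])
```

Calling `health_status("r2r-r2r-1")` followed by `recent_logs("r2r-r2r-1")` should give enough output to paste into the issue. The `run` parameter exists only so the command construction can be checked without Docker installed.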

@br00t4c

br00t4c commented Sep 11, 2024

This was happening for me last night with v3.1.19. After I upgraded to v3.1.21 with pip install --upgrade r2r, the Docker image started successfully.
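
To confirm which version is actually installed before and after upgrading, a small check along these lines may help. It is only a sketch: it assumes the package is installed under the distribution name `r2r` and that versions are plain `major.minor.patch` strings such as the ones quoted above. The function names are hypothetical.

```python
import re
from importlib.metadata import PackageNotFoundError, version
from typing import Optional, Tuple


def parse_version(text: str) -> Tuple[int, int, int]:
    """Parse 'v3.1.21' or '3.1.21' into (3, 1, 21); anything after the patch number is ignored."""
    match = re.match(r"v?(\d+)\.(\d+)\.(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch)


def is_at_least(installed: str, minimum: str) -> bool:
    """Return True if `installed` is the same as or newer than `minimum`."""
    return parse_version(installed) >= parse_version(minimum)


def installed_version(package: str = "r2r") -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None
```

With this, `is_at_least(installed_version() or "0.0.0", "3.1.21")` indicates whether the environment running `r2r serve` has picked up the upgrade, which is worth checking when several Python environments are present.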

@emrgnt-cmplxty
Contributor

@linuxreitt - We are pushing a fix right now for the other GitHub issue that you brought to our attention.

Can you follow what @br00t4c has advised and try updating your R2R version? If the error persists we will investigate the hatchet configuration.

@linuxreitt
Author

I have upgraded r2r to the very latest version, but I still have issues.

2024-09-11 20:07:45,465 - INFO - core.main.app_entry - Environment CONFIG_PATH:
2024-09-11 20:07:45,466 - INFO - core.base.providers.embedding - Initializing EmbeddingProvider with config extra_fields={} provider='ollama' base_model='mxbai-embed-large' base_dimension=1024 rerank_model=None rerank_dimension=None rerank_transformer_type=None batch_size=128 prefixes=None add_title_as_prefix=True concurrent_request_limit=2 max_retries=2 initial_backoff=1.0 max_backoff=60.0.
2024-09-11 20:07:45,467 - INFO - core.providers.embeddings.ollama - Using Ollama API base URL: http://host.docker.internal:11434
2024-09-11 20:07:45,495 - INFO - core.base.providers.llm - Initializing CompletionProvider with config: extra_fields={} provider='litellm' generation_config=GenerationConfig(model='ollama/llama3.1', temperature=0.1, top_p=1.0, max_tokens_to_sample=1024, stream=False, functions=None, tools=None, add_generation_kwargs={}, api_base=None) concurrent_request_limit=1 max_retries=2 initial_backoff=1.0 max_backoff=60.0
2024-09-11 20:07:45,504 - INFO - core.base.providers.database - Initializing DatabaseProvider with config extra_fields={} provider='postgres' user=None password=None host=None port=None db_name=None vecs_collection=None.
2024-09-11 20:07:45,504 - INFO - core.providers.database.vector - Using TCP connection
2024-09-11 20:07:45,517 - INFO - core.providers.database.vector - Successfully initialized PGVectorDB with collection: local_llm
2024-09-11 20:07:45,518 - INFO - core.base.providers.prompt - Initializing PromptProvider with config extra_fields={} provider='r2r' default_system_name='default_system' default_task_name='default_rag' file_path=None.
2024-09-11 20:07:45,519 - INFO - core.providers.prompts.r2r_prompts - Created table prompts
2024-09-11 20:07:45,519 - INFO - core.providers.prompts.r2r_prompts - Loading prompts from /app/core/providers/prompts/defaults
2024-09-11 20:07:45,538 - INFO - core.providers.auth.r2r_auth - Default admin user already exists.
2024-09-11 20:07:45,538 - WARNING - core.providers.parsing.unstructured_parsing - Excluded parsers are not supported by the unstructured parsing provider.
2024-09-11 20:07:46,066 - INFO - core.main.assembly.factory - Initializing PostgresFileProvider
2024-09-11 20:07:46,067 - INFO - core.providers.file.postgres - Created table file_storage
2024-09-11 20:07:46,071 - INFO - core.pipes.retrieval.query_transform_pipe - Initalizing an QueryTransformPipe pipe.
2024-09-11 20:07:46,071 - INFO - core.pipes.retrieval.query_transform_pipe - Initalizing an QueryTransformPipe pipe.
[ERROR] 🪓 -- 2024-09-11 20:07:46,106 - failed to register workflow: ingest-file
[ERROR] 🪓 -- 2024-09-11 20:07:46,106 - Could not put workflow: <_InactiveRpcError of RPC that terminated with:
status = StatusCode.UNAVAILABLE
details = "failed to connect to all addresses; last error: UNKNOWN: ipv4:172.17.0.1:7077: Failed to connect to remote host: connect: Connection refused (111)"
debug_error_string = "UNKNOWN:Error received from peer {created_time:"2024-09-11T20:07:46.106751256+00:00", grpc_status:14, grpc_message:"failed to connect to all addresses; last error: UNKNOWN: ipv4:172.17.0.1:7077: Failed to connect to remote host: connect: Connection refused (111)"}"

/usr/local/lib/python3.10/site-packages/pydantic/_internal/_config.py:341: UserWarning: Valid config keys have changed in V2:
* 'underscore_attrs_are_private' has been removed
  warnings.warn(message, UserWarning)
    2024-09-11 20:07:55,709 - INFO - core.main.app_entry - Environment CONFIG_NAME: local_llm
    2024-09-11 20:07:55,710 - INFO - core.main.app_entry - Environment CONFIG_PATH:
    2024-09-11 20:07:55,711 - INFO - core.base.providers.embedding - Initializing EmbeddingProvider with config extra_fields={} provider='ollama' base_model='mxbai-embed-large' base_dimension=1024 rerank_model=None rerank_dimension=None rerank_transformer_type=None batch_size=128 prefixes=None add_title_as_prefix=True concurrent_request_limit=2 max_retries=2 initial_backoff=1.0 max_backoff=60.0.
    2024-09-11 20:07:55,711 - INFO - core.providers.embeddings.ollama - Using Ollama API base URL: http://host.docker.internal:11434
    2024-09-11 20:07:55,741 - INFO - core.base.providers.llm - Initializing CompletionProvider with config: extra_fields={} provider='litellm' generation_config=GenerationConfig(model='ollama/llama3.1', temperature=0.1, top_p=1.0, max_tokens_to_sample=1024, stream=False, functions=None, tools=None, add_generation_kwargs={}, api_base=None) concurrent_request_limit=1 max_retries=2 initial_backoff=1.0 max_backoff=60.0
    2024-09-11 20:07:55,752 - INFO - core.base.providers.database - Initializing DatabaseProvider with config extra_fields={} provider='postgres' user=None password=None host=None port=None db_name=None vecs_collection=None.
    2024-09-11 20:07:55,752 - INFO - core.providers.database.vector - Using TCP connection
    2024-09-11 20:07:55,767 - INFO - core.providers.database.vector - Successfully initialized PGVectorDB with collection: local_llm
    2024-09-11 20:07:55,769 - INFO - core.base.providers.prompt - Initializing PromptProvider with config extra_fields={} provider='r2r' default_system_name='default_system' default_task_name='default_rag' file_path=None.
    2024-09-11 20:07:55,769 - INFO - core.providers.prompts.r2r_prompts - Created table prompts
    2024-09-11 20:07:55,770 - INFO - core.providers.prompts.r2r_prompts - Loading prompts from /app/core/providers/prompts/defaults
    2024-09-11 20:07:55,790 - INFO - core.providers.auth.r2r_auth - Default admin user already exists.
    2024-09-11 20:07:55,790 - WARNING - core.providers.parsing.unstructured_parsing - Excluded parsers are not supported by the unstructured parsing provider.
    2024-09-11 20:07:56,341 - INFO - core.main.assembly.factory - Initializing PostgresFileProvider
    2024-09-11 20:07:56,341 - INFO - core.providers.file.postgres - Created table file_storage
    2024-09-11 20:07:56,345 - INFO - core.pipes.retrieval.query_transform_pipe - Initalizing an QueryTransformPipe pipe.
    2024-09-11 20:07:56,345 - INFO - core.pipes.retrieval.query_transform_pipe - Initalizing an QueryTransformPipe pipe.
    [ERROR] 🪓 -- 2024-09-11 20:07:56,378 - failed to register workflow: ingest-file
    [ERROR] 🪓 -- 2024-09-11 20:07:56,378 - Could not put workflow: <_InactiveRpcError of RPC that terminated with:
    status = StatusCode.UNAVAILABLE
    details = "failed to connect to all addresses; last error: UNKNOWN: ipv4:172.17.0.1:7077: Failed to connect to remote host: connect: Connection refused (111)"
    debug_error_string = "UNKNOWN:Error received from peer {grpc_message:"failed to connect to all addresses; last error: UNKNOWN: ipv4:172.17.0.1:7077: Failed to connect to remote host: connect: Connection refused (111)", grpc_status:14, created_time:"2024-09-11T20:07:56.378394374+00:00"}"

/usr/local/lib/python3.10/site-packages/pydantic/_internal/_config.py:341: UserWarning: Valid config keys have changed in V2:
* 'underscore_attrs_are_private' has been removed
  warnings.warn(message, UserWarning)
    2024-09-11 20:08:12,523 - INFO - core.main.app_entry - Environment CONFIG_NAME: local_llm
    2024-09-11 20:08:12,523 - INFO - core.main.app_entry - Environment CONFIG_PATH:
    2024-09-11 20:08:12,524 - INFO - core.base.providers.embedding - Initializing EmbeddingProvider with config extra_fields={} provider='ollama' base_model='mxbai-embed-large' base_dimension=1024 rerank_model=None rerank_dimension=None rerank_transformer_type=None batch_size=128 prefixes=None add_title_as_prefix=True concurrent_request_limit=2 max_retries=2 initial_backoff=1.0 max_backoff=60.0.
    2024-09-11 20:08:12,525 - INFO - core.providers.embeddings.ollama - Using Ollama API base URL: http://host.docker.internal:11434
    2024-09-11 20:08:12,552 - INFO - core.base.providers.llm - Initializing CompletionProvider with config: extra_fields={} provider='litellm' generation_config=GenerationConfig(model='ollama/llama3.1', temperature=0.1, top_p=1.0, max_tokens_to_sample=1024, stream=False, functions=None, tools=None, add_generation_kwargs={}, api_base=None) concurrent_request_limit=1 max_retries=2 initial_backoff=1.0 max_backoff=60.0
    2024-09-11 20:08:12,564 - INFO - core.base.providers.database - Initializing DatabaseProvider with config extra_fields={} provider='postgres' user=None password=None host=None port=None db_name=None vecs_collection=None.
    2024-09-11 20:08:12,564 - INFO - core.providers.database.vector - Using TCP connection
    2024-09-11 20:08:12,579 - INFO - core.providers.database.vector - Successfully initialized PGVectorDB with collection: local_llm
    2024-09-11 20:08:12,584 - INFO - core.base.providers.prompt - Initializing PromptProvider with config extra_fields={} provider='r2r' default_system_name='default_system' default_task_name='default_rag' file_path=None.
    2024-09-11 20:08:12,585 - INFO - core.providers.prompts.r2r_prompts - Created table prompts
    2024-09-11 20:08:12,587 - INFO - core.providers.prompts.r2r_prompts - Loading prompts from /app/core/providers/prompts/defaults
    2024-09-11 20:08:12,609 - INFO - core.providers.auth.r2r_auth - Default admin user already exists.
    2024-09-11 20:08:12,609 - WARNING - core.providers.parsing.unstructured_parsing - Excluded parsers are not supported by the unstructured parsing provider.
    2024-09-11 20:08:13,193 - INFO - core.main.assembly.factory - Initializing PostgresFileProvider
    2024-09-11 20:08:13,194 - INFO - core.providers.file.postgres - Created table file_storage
    2024-09-11 20:08:13,199 - INFO - core.pipes.retrieval.query_transform_pipe - Initalizing an QueryTransformPipe pipe.
    2024-09-11 20:08:13,199 - INFO - core.pipes.retrieval.query_transform_pipe - Initalizing an QueryTransformPipe pipe.
    [ERROR] 🪓 -- 2024-09-11 20:08:13,232 - failed to register workflow: ingest-file
    [ERROR] 🪓 -- 2024-09-11 20:08:13,232 - Could not put workflow: <_InactiveRpcError of RPC that terminated with:
    status = StatusCode.UNAVAILABLE
    details = "failed to connect to all addresses; last error: UNKNOWN: ipv4:172.17.0.1:7077: Failed to connect to remote host: connect: Connection refused (111)"
    debug_error_string = "UNKNOWN:Error received from peer {created_time:"2024-09-11T20:08:13.232549504+00:00", grpc_status:14, grpc_message:"failed to connect to all addresses; last error: UNKNOWN: ipv4:172.17.0.1:7077: Failed to connect to remote host: connect: Connection refused (111)"}"

/usr/local/lib/python3.10/site-packages/pydantic/_internal/_config.py:341: UserWarning: Valid config keys have changed in V2:
* 'underscore_attrs_are_private' has been removed
  warnings.warn(message, UserWarning)
    2024-09-11 20:08:42,160 - INFO - core.main.app_entry - Environment CONFIG_NAME: local_llm
    2024-09-11 20:08:42,160 - INFO - core.main.app_entry - Environment CONFIG_PATH:
    2024-09-11 20:08:42,161 - INFO - core.base.providers.embedding - Initializing EmbeddingProvider with config extra_fields={} provider='ollama' base_model='mxbai-embed-large' base_dimension=1024 rerank_model=None rerank_dimension=None rerank_transformer_type=None batch_size=128 prefixes=None add_title_as_prefix=True concurrent_request_limit=2 max_retries=2 initial_backoff=1.0 max_backoff=60.0.
    2024-09-11 20:08:42,161 - INFO - core.providers.embeddings.ollama - Using Ollama API base URL: http://host.docker.internal:11434
    2024-09-11 20:08:42,188 - INFO - core.base.providers.llm - Initializing CompletionProvider with config: extra_fields={} provider='litellm' generation_config=GenerationConfig(model='ollama/llama3.1', temperature=0.1, top_p=1.0, max_tokens_to_sample=1024, stream=False, functions=None, tools=None, add_generation_kwargs={}, api_base=None) concurrent_request_limit=1 max_retries=2 initial_backoff=1.0 max_backoff=60.0
    2024-09-11 20:08:42,199 - INFO - core.base.providers.database - Initializing DatabaseProvider with config extra_fields={} provider='postgres' user=None password=None host=None port=None db_name=None vecs_collection=None.
    2024-09-11 20:08:42,199 - INFO - core.providers.database.vector - Using TCP connection
    2024-09-11 20:08:42,213 - INFO - core.providers.database.vector - Successfully initialized PGVectorDB with collection: local_llm
    2024-09-11 20:08:42,216 - INFO - core.base.providers.prompt - Initializing PromptProvider with config extra_fields={} provider='r2r' default_system_name='default_system' default_task_name='default_rag' file_path=None.
    2024-09-11 20:08:42,216 - INFO - core.providers.prompts.r2r_prompts - Created table prompts
    2024-09-11 20:08:42,217 - INFO - core.providers.prompts.r2r_prompts - Loading prompts from /app/core/providers/prompts/defaults
    2024-09-11 20:08:42,243 - INFO - core.providers.auth.r2r_auth - Default admin user already exists.
    2024-09-11 20:08:42,243 - WARNING - core.providers.parsing.unstructured_parsing - Excluded parsers are not supported by the unstructured parsing provider.
    2024-09-11 20:08:42,894 - INFO - core.main.assembly.factory - Initializing PostgresFileProvider
    2024-09-11 20:08:42,895 - INFO - core.providers.file.postgres - Created table file_storage
    2024-09-11 20:08:42,898 - INFO - core.pipes.retrieval.query_transform_pipe - Initalizing an QueryTransformPipe pipe.
    2024-09-11 20:08:42,898 - INFO - core.pipes.retrieval.query_transform_pipe - Initalizing an QueryTransformPipe pipe.
    [ERROR] 🪓 -- 2024-09-11 20:08:42,935 - failed to register workflow: ingest-file
    [ERROR] 🪓 -- 2024-09-11 20:08:42,935 - Could not put workflow: <_InactiveRpcError of RPC that terminated with:
    status = StatusCode.UNAVAILABLE
    details = "failed to connect to all addresses; last error: UNKNOWN: ipv4:172.17.0.1:7077: Failed to connect to remote host: connect: Connection refused (111)"
    debug_error_string = "UNKNOWN:Error received from peer {grpc_message:"failed to connect to all addresses; last error: UNKNOWN: ipv4:172.17.0.1:7077: Failed to connect to remote host: connect: Connection refused (111)", grpc_status:14, created_time:"2024-09-11T20:08:42.935305949+00:00"}"
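
Since the same failure repeats on every restart, it can help to summarize which endpoints are being refused rather than reading the raw log. The sketch below scans log text for the gRPC "Connection refused" pattern shown above and counts one failure per `details =` line, so the matching `debug_error_string` line is not counted twice. The regular expression is written against the exact wording in these logs and is an assumption about the format; other gRPC versions may phrase the error differently. The function name is hypothetical.

```python
import re
from collections import Counter
from typing import Tuple

# Matches e.g. "ipv4:172.17.0.1:7077: Failed to connect to remote host: connect: Connection refused"
REFUSED = re.compile(
    r"ipv4:(\d{1,3}(?:\.\d{1,3}){3}):(\d+): "
    r"Failed to connect to remote host: connect: Connection refused"
)


def refused_endpoints(log_text: str) -> "Counter[Tuple[str, int]]":
    """Count refused (host, port) pairs, one per 'details =' line in the log."""
    counts: "Counter[Tuple[str, int]]" = Counter()
    for line in log_text.splitlines():
        if not line.lstrip().startswith("details ="):
            continue
        for host, port in REFUSED.findall(line):
            counts[(host, int(port))] += 1
    return counts
```

Applied to the log above, this would report only 172.17.0.1:7077, once per restart cycle, which suggests every workflow registration attempt is failing against the same address.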

@linuxreitt
Author

Does anyone have any ideas for this? Do any users who have successfully built with Docker have any advice on changes they may have made? The project itself seems fantastic, but I feel the Docker implementation completely lets it down. I'd love to spend more time trying to figure out what the issue is, but my feeling is that if the devs can't, then I have no chance, so I'll have to find a different solution.

@vanetreg

vanetreg commented Oct 3, 2024

Same for me, trying version 3.2.0 (Windows 10), installed with Docker:
"✘ Container r2r-postgres-1 Error
dependency failed to start: container r2r-postgres-1 is unhealthy"

I started serving multiple times using:
r2r serve --docker
and it fails even after restarting the IDE.
