Commit

feat(interactive): Support customizing service configuration via `gsctl` (#4205)

The customized engine configuration can be specified during the
deployment of the interactive instance.
- [x] Add CI test to verify it works
zhanglei1949 authored Sep 5, 2024
1 parent 72faf82 commit 0bd27de
Showing 3 changed files with 60 additions and 60 deletions.
5 changes: 4 additions & 1 deletion .github/workflows/flex-interactive.yml
Expand Up @@ -54,10 +54,13 @@ jobs:
# install gsctl
python3 -m pip install ${GITHUB_WORKSPACE}/python/dist/*.whl
# launch service: 8080 for coordinator http port; 7687 for cypher port;
docker run -p 8080:8080 -p 7688:7687 graphscope/interactive:latest --enable-coordinator &
gsctl instance deploy --type interactive --image-registry graphscope --image-tag latest --interactive-config ${GITHUB_WORKSPACE}/flex/tests/hqps/interactive_config_test.yaml
sleep 20
# test
python3 -m pip install --no-cache-dir pytest pytest-xdist
python3 -m pytest -d --tx popen//python=python3 \
-s -v \
$(dirname $(python3 -c "import graphscope.gsctl as gsctl; print(gsctl.__file__)"))/tests/test_interactive.py
# destroy instance
gsctl instance destroy --type interactive -y
84 changes: 27 additions & 57 deletions docs/flex/interactive/configuration.md
Expand Up @@ -15,12 +15,13 @@ Below is a list of all configurable items:
| admin-port | 7777 | The port of the interactive admin service | v0.3 |
| storedproc-port | 10000 | The port of the interactive stored procedure service | v0.3 |
| cypher-port | 7687 | The port of the cypher service | v0.3 |
| interactive-config | None | Path to a custom configuration file for the Interactive engine service | v0.4 |
<!-- | gremlin-port | None | The port of the gremlin service | v0.3 | -->


<!-- *Note: The default value for `gremlin-port` is `None`, meaning the Gremlin service will not be initiated by default. -->

### Default Ports
### Ports

By default, Interactive will launch the following services on these ports:

Expand All @@ -43,74 +44,46 @@ The Gremlin service is disabled by default. To enable it, add the `--gremlin-por
gsctl instance deploy --type interactive --coordinator-port 8081 --admin-port 7778 --cypher-port 7688 --storedproc-port 10001 --gremlin-port 8183
``` -->

### Service Configuration

<!-- This content is commented out rather than deleted, since we will support these configurations later.
> TODO: Currently `gsctl` doesn't support the following command!
By default, `Interactive` will initialize the service with its default settings.
However, GraphScope Interactive is designed to be flexible and adaptable to your specific needs. This means you can tailor the service's behavior using custom configurations.

Starting your GraphScope Interactive service can be straightforward, as demonstrated in our [getting_started](./getting_started.md) guide. By default, executing the command:

```bash
gsctl use GRAPH <name>
```
will initialize the service with its default settings. However, GraphScope is designed to be flexible and adaptable to your specific needs. This means you can tailor the service's behavior using custom configurations.
## Customizing Your Service Configuration
To customize the service's settings, you can provide a YAML configuration file. This file allows you to specify various parameters, from directory paths to log levels, ensuring the service aligns with your requirements. To use a custom configuration, simply pass the YAML file to the command as follows:
#### Customizing Your Service Configuration
To customize the service's settings, you can provide a YAML configuration file `interactive_config.yaml`. This file allows you to specify various parameters, from directory paths to log levels, ensuring the service aligns with your requirements. To use a custom configuration, simply pass the YAML file to the command as follows:

```bash
gsctl use GRAPH <name> -c ./interactive_config.yaml
gsctl instance deploy --type interactive --interactive-config ./interactive_config.yaml
```

Note: Please be aware that you're not required to configure every option. Simply adjust the settings that are relevant to your needs. Any options left unconfigured will automatically adopt their default values, as detailed in the sections that follow.
If you already have an Interactive service running and wish to apply a new set of configurations, a simple restart with the custom configuration is required. This ensures that the service updates its settings and operates according to your newly specified preferences.
To restart the service with your custom configuration, use the following command:
```bash
gsctl service restart -c ./conf/interactive_config.yaml
```{note}
Please be aware that you're not required to configure every option. Simply adjust the settings that are relevant to your needs. Any options left unconfigured will automatically adopt their default values, as detailed in the following sections.
```
Remember, any changes made in the configuration file will only take effect after the service has been restarted with the updated file.



## Sample Configuration
##### Sample Configuration
Here's a glimpse of what a typical YAML configuration file might look like:

```yaml
log_level: INFO # default INFO
verbose_level: 2 # verbose all logs above level 2(including)
log_level: INFO # default INFO; one of INFO, WARNING, ERROR, FATAL
verbose_level: 0 # default 0; an integer in [0, 10], where 10 emits all verbose logs
compute_engine:
thread_num_per_worker: 1 # the number of shared workers, default 1
thread_num_per_worker: 1 # the number of threads for each worker, default 1
compiler:
planner:
is_on: true
opt: RBO
rules:
- FilterMatchRule
- FilterIntoJoinRule
- NotExistToAntiJoinRule
is_on: true
opt: RBO
rules:
- FilterMatchRule
- FilterIntoJoinRule
- NotExistToAntiJoinRule
query_timeout: 20000 # query timeout in milliseconds, default 20000
endpoint:
default_listen_address: localhost
bolt_connector: # cypher query endpoint
disabled: false # disable cypher endpoint or not.
port: 7687
gremlin_connector: # gremlin query endpoint
disabled: false # disable gremlin endpoint or not.
port: 8182
http_service:
default_listen_address: localhost
admin_port: 7777
query_port: 10000
```
## Available Configurations
For configurations associated with the root directory, we do not accept relative paths to ensure consistency.
### Service configurations
##### Available Configurations
In the following table, we use the `.` notation to represent the hierarchy within the `YAML` structure.

Expand All @@ -119,20 +92,17 @@ In this following table, we use the `.` notation to represent the hierarchy with
| -------- | -------- | -------- |----------- |
| log_level | INFO | The level of database log, INFO/WARNING/ERROR/FATAL | 0.0.1 |
| verbose_level | 0 | The verbose level of database log, must be an integer | 0.0.3 |
| default_graph | modern | The name of default graph on which to start the graph service. | 0.0.1 |
| compute_engine.thread_num_per_worker | 1 | The number of threads used to process queries. Increasing this number can improve query throughput | 0.0.1 |
| compiler.planner.is_on | true | Determines if query optimization is enabled for compiling Cypher queries | 0.0.1 |
| compiler.planner.opt | RBO | Specifies the optimizer to be used for query optimization. Currently, only the Rule-Based Optimizer (RBO) is supported | 0.0.1 |
| compiler.planner.rules.FilterMatchRule | N/A | An optimization rule that pushes filter (`Where`) conditions into the `Match` clause | 0.0.1 |
| compiler.planner.rules.FilterIntoJoinRule | N/A | A native Calcite optimization rule that pushes filter conditions to the Join participants before performing the join | 0.0.1 |
| compiler.planner.rules.NotMatchToAntiJoinRule | N/A | An optimization rule that transforms a "not exist" pattern into an anti-join operation | 0.0.1 |
| compiler.endpoint.default_listen_address | localhost | The address for compiler endpoint to bind | 0.0.3 |
| compiler.endpoint.bolt_connector.disabled | false | Whether to disable the cypher endpoint| 0.0.3 |
| compiler.endpoint.bolt_connector.port | 7687 | The port for compiler's cypher endpoint.| 0.0.3 |
| compiler.endpoint.gremlin_connector.disabled | true | Whether to disable the gremlin endpoint| 0.0.3 |
| compiler.endpoint.gremlin_connector.port | 8182 | The port for compiler's gremlin endpoint.| 0.0.3 |
| http_service.default_listen_address | localhost | The address for http service to bind | 0.0.2 |
| http_service.admin_port | 7777 | The port for admin service to listen on | 0.0.2 |
| http_service.query_port | 10000 | The port for query service to listen on. For stored procedure queries, users can submit queries directly to query_port without involving the compiler | 0.0.2 | -->
| compiler.query_timeout | 3000000 | The maximum time the compiler waits for the engine's reply, in `ms` | 0.0.3 |

#### TODOs

We currently only allow service configuration during instance deployment. In the near future, we will support:

- Graph-level configurations
- Modifying service configurations
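
The documentation change above says that only the settings you care about need to appear in the YAML file. As a rough illustration, the Python sketch below writes a partial `interactive_config.yaml` from dotted keys and assembles the matching deploy command. The helper names `nest` and `to_yaml` are hypothetical and not part of `gsctl`. The hand-rolled renderer handles only nested mappings of plain scalars, which is an assumption made to avoid a YAML dependency. The flag name follows the `--interactive-config` option used in the CI workflow and defined in `dev.py`.

```python
import os
import tempfile


def nest(dotted: dict) -> dict:
    """Expand {"a.b": 1} into {"a": {"b": 1}} (hypothetical helper)."""
    root: dict = {}
    for key, value in dotted.items():
        node = root
        *parents, leaf = key.split(".")
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return root


def to_yaml(tree: dict, indent: int = 0) -> str:
    """Render nested dicts of plain scalars as block-style YAML.

    Assumption: no lists, booleans, or strings that need quoting.
    """
    lines = []
    pad = "  " * indent
    for key, value in tree.items():
        if isinstance(value, dict):
            lines.append(f"{pad}{key}:")
            lines.append(to_yaml(value, indent + 1))
        else:
            lines.append(f"{pad}{key}: {value}")
    return "\n".join(lines)


# Override only a few settings; anything omitted keeps its default value.
overrides = {
    "log_level": "WARNING",
    "compute_engine.thread_num_per_worker": 4,
    "compiler.query_timeout": 20000,
}
config_text = to_yaml(nest(overrides)) + "\n"

# Write the file to a scratch directory (illustrative location only).
config_path = os.path.join(tempfile.mkdtemp(), "interactive_config.yaml")
with open(config_path, "w") as f:
    f.write(config_text)

# The command is only assembled here, not executed.
deploy_cmd = [
    "gsctl", "instance", "deploy",
    "--type", "interactive",
    "--interactive-config", config_path,
]
print(config_text)
print(" ".join(deploy_cmd))
```

The command list could be passed to `subprocess.run` on a machine where `gsctl` and Docker are installed. It is left unexecuted here so the sketch stays free of side effects.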
31 changes: 29 additions & 2 deletions python/graphscope/gsctl/commands/dev.py
Expand Up @@ -42,6 +42,7 @@
# Interactive docker container config
INTERACTIVE_DOCKER_CONTAINER_NAME = "gs-interactive-instance"
INTERACTIVE_DOCKER_CONTAINER_LABEL = "flex=interactive"
INTERACTIVE_DOCKER_DEFAULT_CONFIG_PATH = "/opt/flex/share/interactive_config.yaml"

scripts_dir = os.path.join(os.path.dirname(os.path.realpath(__file__)), "..", "scripts")
install_deps_script = os.path.join(scripts_dir, "install_deps.sh")
Expand Down Expand Up @@ -182,6 +183,12 @@ def interactive(app, graphscope_repo):
show_default=True,
required=False,
)
@click.option(
"--interactive-config",
help="Interactive config file path [docker only]",
required=False,
default=None,
)
@click.option(
"--gremlin-port",
help="Mapping port of gremlin query, -1 means disable mapping [docker only]",
Expand Down Expand Up @@ -220,6 +227,7 @@ def deploy(
storedproc_port,
cypher_port,
gremlin_port,
interactive_config,
): # noqa: F811
"""Deploy a GraphScope Flex instance"""
cmd = []
Expand All @@ -244,6 +252,17 @@ def deploy(
if gremlin_port != -1:
cmd.extend(["-p", f"{gremlin_port}:8182"])
image = f"{image_registry}/{type}:{image_tag}"
if interactive_config is not None:
if not os.path.isfile(interactive_config):
click.secho(
f"Interactive config file {interactive_config} does not exist.",
fg="red",
)
return
interactive_config = os.path.abspath(interactive_config)
cmd.extend(
["-v", f"{interactive_config}:{INTERACTIVE_DOCKER_DEFAULT_CONFIG_PATH}"]
)
cmd.extend([image, "--enable-coordinator"])
returncode = run_shell_cmd(cmd, os.getcwd())
if returncode == 0:
Expand Down Expand Up @@ -293,9 +312,17 @@ def deploy(
show_default=True,
required=False,
)
def destroy(type, container_name):
@click.option(
"-y",
"--yes",
is_flag=True,
default=False,
help="Do not ask for confirmation",
required=False,
)
def destroy(type, container_name, yes):
"""Destroy Flex Interactive instance"""
if click.confirm(f"Do you want to destroy {container_name} instance?"):
if yes or click.confirm(f"Do you want to destroy {container_name} instance?"):
cmd = []
if type == "interactive":
cmd = [
Expand Down
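
The `deploy` change above validates the config path, makes it absolute, and bind-mounts it over the default config location inside the container. A minimal standalone sketch of that step is shown below. The function name `config_mount_args` is hypothetical. Unlike the committed code, which prints a red message with `click.secho` and returns, this sketch raises `FileNotFoundError`. That is a simplification for illustration, not the behavior of `gsctl`.

```python
import os
import tempfile

# Same in-container path as INTERACTIVE_DOCKER_DEFAULT_CONFIG_PATH in the diff.
CONTAINER_CONFIG_PATH = "/opt/flex/share/interactive_config.yaml"


def config_mount_args(config_path):
    """Return the `docker run` volume arguments for a custom config file.

    Hypothetical helper: returns [] when no config is given, and raises
    FileNotFoundError when the given path is not an existing file.
    """
    if config_path is None:
        return []
    if not os.path.isfile(config_path):
        raise FileNotFoundError(
            f"Interactive config file {config_path} does not exist."
        )
    # Docker bind mounts need an absolute host path.
    host_path = os.path.abspath(config_path)
    return ["-v", f"{host_path}:{CONTAINER_CONFIG_PATH}"]


# Usage: create a throwaway config file and show the resulting arguments.
with tempfile.NamedTemporaryFile(suffix=".yaml") as tmp:
    print(config_mount_args(tmp.name))
```

Mounting the file over the default path means the container entrypoint needs no extra flag, because it reads the configuration from the same location as before.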
