Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support secure connections to ClickHouse DB #5171

Merged
merged 1 commit into from
Jul 22, 2023

Conversation

yuntanghsu
Copy link
Contributor

@yuntanghsu yuntanghsu commented Jun 23, 2023

In this PR, we are adding following changes:

  1. Support TLS when connecting the FlowAggregator to ClickHouse.
  2. Support HTTP/HTTPS. Users can directly modify the clickHouse.databaseURL in order to switch protocol.
  3. Support custom CA certificate. User need to create a clickhouse-ca Secret in flow-aggregator namespace. The custom CA certificate will be used when clickhouse.tls.caCert is true.
  4. Update network-flow-visibility.md
  5. Add e2e test for http/https/tls protocol when connecting the FlowAggregator to ClickHouse.

resolve #4902

Signed-off-by: Yun-Tang Hsu [email protected]

@yuntanghsu yuntanghsu force-pushed the clickhouse_tls branch 7 times, most recently from 1ba2ceb to 0f1e1e0 Compare June 23, 2023 19:50
build/charts/flow-aggregator/conf/flow-aggregator.conf Outdated Show resolved Hide resolved
build/charts/flow-aggregator/templates/deployment.yaml Outdated Show resolved Hide resolved
docs/network-flow-visibility.md Outdated Show resolved Hide resolved
docs/network-flow-visibility.md Outdated Show resolved Hide resolved
pkg/flowaggregator/clickhouseclient/clickhouseclient.go Outdated Show resolved Hide resolved
@yanjunz97
Copy link
Contributor

I notice you did not update the chart doc. You may also need to run
make under build/charts.

Copy link
Contributor

@heanlan heanlan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

build/charts/flow-aggregator/templates/deployment.yaml Outdated Show resolved Hide resolved
pkg/flowaggregator/clickhouseclient/clickhouseclient.go Outdated Show resolved Hide resolved
pkg/flowaggregator/clickhouseclient/clickhouseclient.go Outdated Show resolved Hide resolved
build/charts/flow-aggregator/conf/flow-aggregator.conf Outdated Show resolved Hide resolved
build/charts/flow-aggregator/conf/flow-aggregator.conf Outdated Show resolved Hide resolved
@heanlan
Copy link
Contributor

heanlan commented Jun 28, 2023

@yuntanghsu I was trying to reply one of the comment reply, but I cannot find it anymore. Did you accidentally delete that comment? I remember I was suggesting we can add "tls" here. To reply your comment, IMO tls is a secure protocol building on top of tcp. So if user specify tcp, means tcp + tls disabled. If they specify tls, means tcp protocol + tls enabled.

image

I believe it is fine to put tcp in parallel with tls. And we have something similar in FA:

# Provide the transport protocol for the flow aggregator collecting process, which is tls, tcp or udp.

@yuntanghsu
Copy link
Contributor Author

yuntanghsu commented Jun 28, 2023

@yuntanghsu I was trying to reply one of the comment reply, but I cannot find it anymore. Did you accidentally delete that comment? I remember I was suggesting we should can add "tls" here. To reply your comment, IMO tls is a secure protocol

I deleted it. Cuz I think you are right. This will be more straightforward to just let user to specify tls/tcp/http/https instead of giving them to options and letting them to make the combinations themselves.

@yuntanghsu yuntanghsu force-pushed the clickhouse_tls branch 6 times, most recently from 59557f9 to e16b177 Compare June 28, 2023 18:01
build/charts/flow-aggregator/conf/flow-aggregator.conf Outdated Show resolved Hide resolved
build/charts/flow-aggregator/conf/flow-aggregator.conf Outdated Show resolved Hide resolved
docs/network-flow-visibility.md Outdated Show resolved Hide resolved
pkg/flowaggregator/exporter/clickhouse.go Outdated Show resolved Hide resolved
Copy link
Contributor

@antoninbas antoninbas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any thoughts on enabling TLS for e2e tests? cc @heanlan @yanjunz97

build/charts/flow-aggregator/README.md Outdated Show resolved Hide resolved
build/charts/flow-aggregator/README.md Outdated Show resolved Hide resolved
docs/network-flow-visibility.md Outdated Show resolved Hide resolved
docs/network-flow-visibility.md Outdated Show resolved Hide resolved
pkg/flowaggregator/clickhouseclient/clickhouseclient.go Outdated Show resolved Hide resolved
pkg/flowaggregator/clickhouseclient/clickhouseclient.go Outdated Show resolved Hide resolved
@@ -79,6 +79,10 @@ spec:
readOnly: true
- mountPath: /var/log/antrea/flow-aggregator
name: host-var-log-antrea-flow-aggregator
{{- if and .Values.clickHouse.tls.customCACert .Values.clickHouse.tls.customCACert }}
- name: clickhouse-ca
mountPath: /etc/ssl/certs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is using this directory really necessary in our case? We are explicitly providing the CA certificate when we create the ClickHouse client connection, so I think using this standard unix directory can only create confusion. I'd recommend just mounting the certificate to /etc/flow-aggregator.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already use /etc/flow-aggregator for flow-aggregator conifg. I think we are not able to mount two volumes to the same path?
I changed the path to /etc/flow-aggregator/ssl/certs

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already use /etc/flow-aggregator for flow-aggregator conifg. I think we are not able to mount two volumes to the same path?
I changed the path to /etc/flow-aggregator/ssl/certs

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could use subPath and have both files in /etc/flow-aggregator.
https://stackoverflow.com/a/68179095/4538702

However, I am fine with using a subdirectory if it works. Maybe rename /etc/flow-aggregator/ssl/certs to /etc/flow-aggregator/certs

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've renamed it to /etc/flow-aggregator/certs. Thanks.

@yanjunz97
Copy link
Contributor

Hi @antoninbas , I have a offline discussion with @heanlan and @yuntanghsu . And here’re our thoughts:

There are basically two ways to do the e2e test.

One way is to use the TLS connection supported by ClickHouse server. As ClickHouse operator does not support cert-manager, we will need to manage our own certs and key. We remember you’ve mentioned the default root CAs, but we did not think of a way to use the default root CAs to sign our ClickHouse server cert. Please correct me if I’m wrong. Thus I believe there are 2 ways to deal with the certs:

  • We can pre-generate the certs and save it under buld/yamls/flow-aggregator for test purpose. In this case, we need to rotate the certs in the future.
  • We can generate the certs during setup step in the e2e test, inject them into the flow-visibility-e2e.yaml and create the Secret for CA cert.

Another way is to use the ingress and cert-manager. We haven’t verified it yet, but I think we need to follow the steps below for this strategy:

  • Enable the LoadBalancer in Antrea. Not sure if Service external IP management by Antrea works in kind as I failed to do this before. But Anlan has verified that MetalLB will work
  • Install the ingress controller
  • Install the cert-manager and create the CertificateIssuer
  • Create a ingress resource for ClickHouse service

I think after these steps, we will be able to test the secure connection between FA client and CH server. But this might introduce quite a lot of resources to our test environment and we are not sure if it is acceptable.

Do you have any suggestion on which way we should select?

@yuntanghsu
Copy link
Contributor Author

Hi,
I update the e2e test by changing the endpoint from the ingress to ClickHouse server. Currently, we test tcp/http when we disable secure option in ClickHouse server and test tls/https when we enable secure option in ClickHouse server.

@yuntanghsu yuntanghsu force-pushed the clickhouse_tls branch 3 times, most recently from c24a395 to 17a7e77 Compare July 18, 2023 17:25
Copy link
Contributor

@antoninbas antoninbas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume that all recent changes are for e2e tests, and you haven't changed anything in the FA code?

<server>
<certificateFile>/opt/certs/tls.crt</certificateFile>
<privateKeyFile>/opt/certs/tls.key</privateKeyFile>
<dhParamsFile>/etc/clickhouse-server/config.d/dhparam.pem</dhParamsFile>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that providing dhParamsFile is optional, I think we should omit it for testing to keep things simpler

I actually don't even see dhParamsFile mentioned in https://clickhouse.com/docs/en/guides/sre/configuring-ssl. Modern ciphers should not require this. In any case, we should not really be worried about security hardening when it comes to e2e testing.

Copy link
Contributor Author

@yuntanghsu yuntanghsu Jul 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume that all recent changes are for e2e tests, and you haven't changed anything in the FA code?

Yes, that's right.

Removed dhParamsFile. Thanks.

test/e2e/framework.go Outdated Show resolved Hide resolved
test/e2e/flowaggregator_test.go Outdated Show resolved Hide resolved
test/e2e/framework.go Outdated Show resolved Hide resolved
Copy link
Contributor

@antoninbas antoninbas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a couple more small comments

@heanlan do you want to take another look?

test/e2e/charts/flow-visibility/templates/namespace.yaml Outdated Show resolved Hide resolved
test/e2e/flowaggregator_test.go Outdated Show resolved Hide resolved
test/e2e/framework.go Outdated Show resolved Hide resolved
yanjunz97
yanjunz97 previously approved these changes Jul 20, 2023
Copy link
Contributor

@yanjunz97 yanjunz97 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM, only some nits

docs/network-flow-visibility.md Outdated Show resolved Hide resolved
docs/network-flow-visibility.md Outdated Show resolved Hide resolved
@@ -877,7 +892,21 @@ func (data *TestData) deployFlowAggregator(ipfixCollector string) error {
if err != nil || rc != 0 {
return fmt.Errorf("error when deploying the Flow Aggregator; %s not available on the control-plane Node", flowAggYaml)
}
if err = data.mutateFlowAggregatorConfigMap(ipfixCollector); err != nil {
if o.secureConnection {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add some comments explaining we need to copy the secret to flow-aggregator namespace to make it accessible by FA?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a comment there. Please take a look to see if it makes sense? Thanks.

docs/network-flow-visibility.md Outdated Show resolved Hide resolved
docs/network-flow-visibility.md Outdated Show resolved Hide resolved
docs/network-flow-visibility.md Outdated Show resolved Hide resolved
docs/network-flow-visibility.md Outdated Show resolved Hide resolved
Comment on lines 321 to 324
#### Configuration for ClickHouse pre Antrea v1.13

We don't support secure connection to ClickHouse database prior to Antrea v1.13.
We only support TCP when connecting to ClickHouse server from Flow Aggregator.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am still not sure I understand the value of this section, or why it is its own section

I feel like it should just be a quick note at the end of the previous section:

Prior to Antrea v1.13, secure connections to ClickHouse are not supported, and TCP is the only supported protocol when connecting to the ClickHouse server from the Flow Aggregator.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason I added this section is because I saw we usually add extra section when we have configuration changes in other document, like node-port-local.
I've updated it. Thanks.

test/e2e/charts/flow-visibility/values.yaml Outdated Show resolved Hide resolved
test/e2e/framework.go Outdated Show resolved Hide resolved
test/e2e/framework.go Outdated Show resolved Hide resolved
test/e2e/framework.go Outdated Show resolved Hide resolved
@yuntanghsu yuntanghsu force-pushed the clickhouse_tls branch 3 times, most recently from 0b167ef to 649a84c Compare July 20, 2023 20:12
test/e2e/framework.go Outdated Show resolved Hide resolved
test/e2e/framework.go Show resolved Hide resolved
@yuntanghsu
Copy link
Contributor Author

yuntanghsu commented Jul 20, 2023

@antoninbas
I found the e2e test keeps failing because of the error: no space left. It happens in other PRs as well.
I also found the available memory is 2G less than what we have yesterday. Not sure if it is github's issue?
Do you think I can add a workaround to free more space before running the flow-visibility e2e test?

@antoninbas
Copy link
Contributor

@yuntanghsu I think your workaround is ok. Github is probably installing additional / new software since yesterday, hence the reduction in available disk space. Later on, you could try to find ways to reduce the space used by the e2e visibility test (e.g., delete images from the host after they have been loaded into the Kind Nodes, or use 2 Kind Nodes instead of 3).

@elton-furtado elton-furtado added this to the Antrea v1.13 release milestone Jul 21, 2023
Copy link
Contributor

@antoninbas antoninbas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one more small comment. Please address and rebase the PR, and we can merge this.

test/e2e/framework.go Outdated Show resolved Hide resolved
Adding following changes:

1. Support TLS when connecting the Flow Aggregator to ClickHouse.
2. Support HTTP/HTTPS. Users can directly modify the clickHouse.databaseURL in order to switch protocol.
3. Support custom CA certificate. User need to create a clickhouse-ca Secret in flow-aggregator namespace. The custom CA certificate will be used when clickhouse.tls.caCert is true.
4. Update network-flow-visibility.md
5. Add e2e test for http/https/tls protocol when connecting the Flow Aggregator to ClickHouse

Signed-off-by: Yun-Tang Hsu <[email protected]>
@antoninbas
Copy link
Contributor

/test-all

@antoninbas
Copy link
Contributor

/test-conformance

@antoninbas antoninbas merged commit 7919ef4 into antrea-io:main Jul 22, 2023
39 of 43 checks passed
@antoninbas antoninbas added action/release-note Indicates a PR that should be included in release notes. area/flow-visibility Issues or PRs related to flow visibility support in Antrea labels Jul 22, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
action/release-note Indicates a PR that should be included in release notes. area/flow-visibility Issues or PRs related to flow visibility support in Antrea
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FlowAggregator] Support secure connections to ClickHouse DB
5 participants