[otel-infrastructure-collector] Add CRD generation + MySQL preset #255

Closed
wants to merge 6 commits into from
7 changes: 7 additions & 0 deletions otel-infrastructure-collector/CHANGELOG.md
@@ -2,6 +2,13 @@

## OpenTelemetry-Infrastructure-Collector

### v0.1.3 / 2023-07-17

* [FEATURE] Add support for deploying `otel-infrastructure-collector` with OpenTelemetry Operator
* [FEATURE] Add MySQL preset for metrics and extra logs
* [CHORE] Update OpenTelemetry Collector to v0.77.0
* [CHORE] Use Coralogix fork for OpenTelemetry Collector Helm chart dependency

### v0.1.2 / 2023-05-08

* [FEATURE] Allow users to configure Coralogix domain instead of endpoints
6 changes: 3 additions & 3 deletions otel-infrastructure-collector/k8s-helm/Chart.yaml
@@ -1,15 +1,15 @@
 apiVersion: v2
 name: otel-infrastructure-collector
 description: OpenTelemetry Infrastructure collector
-version: 0.1.2
+version: 0.1.3
 keywords:
   - OpenTelemetry Collector
   - OpenTelemetry Infrastructure Collector
   - Coralogix
 dependencies:
   - name: opentelemetry-collector
-    version: "0.55.0"
-    repository: https://open-telemetry.github.io/opentelemetry-helm-charts
+    version: "0.63.0"
+    repository: https://cgx.jfrog.io/artifactory/coralogix-charts-virtual
 sources:
   - https://github.com/open-telemetry/opentelemetry-helm-charts/tree/main/charts/opentelemetry-collector
 maintainers:
121 changes: 121 additions & 0 deletions otel-infrastructure-collector/k8s-helm/README.md
@@ -4,6 +4,13 @@ This Infrastructure collector provides:

- [Coralogix Exporter](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/exporter/coralogixexporter) - The Coralogix exporter is preconfigured to enrich data using Kubernetes attributes, which allows quick correlation of telemetry signals using consistent ApplicationName and SubsystemName fields.
- [Cluster Metrics Receiver](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/k8sclusterreceiver) - The Kubernetes Cluster receiver collects cluster-level metrics from the Kubernetes API server. It is an alternative to the Kube State Metrics project.
- [Integration presets](#integration-presets) - This chart can integrate with various applications running on your cluster and monitor them out of the box.

### OpenTelemetry Operator (for CRD users)

If you wish to use this Helm chart as an `OpenTelemetryCollector` CRD, you will need to have the OpenTelemetry Operator installed in your cluster. Please refer to the [OpenTelemetry Operator documentation](https://github.com/open-telemetry/opentelemetry-operator/blob/main/README.md) for full details.

We recommend installing the operator using the community Helm chart from the [OpenTelemetry Helm Charts](https://github.com/open-telemetry/opentelemetry-helm-charts/tree/main/charts/opentelemetry-operator) repository.

### Required

@@ -52,6 +59,31 @@ helm upgrade --install otel-infrastructure-collector coralogix-charts-virtual/ot
-f values.yaml
```

### Generating OpenTelemetryCollector CRD for OpenTelemetry Operator users

If you wish to deploy the `otel-agent` using the OpenTelemetry Operator, you can generate an `OpenTelemetryCollector` CRD. You might want to do this if you'd like to take advantage of some advanced features provided by the operator, such as automatic collector upgrade or CRD-defined auto-instrumentation.

For full details on how to install and use the operator, please refer to the [OpenTelemetry Operator documentation](https://github.com/open-telemetry/opentelemetry-operator/blob/main/README.md).

First make sure to add our Helm charts repository to the local repos list with the following command:

```bash
helm repo add coralogix-charts-virtual https://cgx.jfrog.io/artifactory/coralogix-charts-virtual
```

In order to get the updated Helm charts from the added repository, please run:

```bash
helm repo update
```

Install the chart with the CRD values file, `values-crd.yaml`:

```bash
helm upgrade --install otel-coralogix-agent coralogix-charts-virtual/opentelemetry-coralogix \
-f values-crd.yaml
```
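
For orientation, the settings in the `values-crd.yaml` added by this change that differ from a plain deployment are sketched below in abbreviated form. The reading of `collectorCRD.generate` as the switch that makes the chart emit an `OpenTelemetryCollector` resource is an assumption based on the key name; check the chart templates before relying on it.

```yaml
opentelemetry-collector:
  mode: deployment
  collectorCRD:
    # Taken from values-crd.yaml; presumably makes the chart render an
    # OpenTelemetryCollector resource for the operator to manage.
    generate: true
  configMap:
    # Taken from values-crd.yaml; no standalone ConfigMap is created.
    create: false
```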

# Infrastructure Monitoring

## Kubernetes Events
@@ -86,6 +118,95 @@ This configuration is filtering out any event that has the field `reason` with o

## Alerts

# Integration presets

The `otel-infrastructure-collector` chart also provides support to integrate with different applications. The following integration presets are available.

## MySQL

The MySQL preset is able to collect metrics and extra logs (slow query log, general query log) from your MySQL instances. **Extra logs collection is available only when running the `otel-infrastructure-collector` as CRD with the OpenTelemetry Operator.**

### Prerequisites

This preset supports MySQL version 8.0.

Collecting most metrics requires that the database user be able to execute `SHOW GLOBAL STATUS`.

### Configuration for metrics collection

The metrics collection has to be enabled by setting `metrics.enabled` to `true`.

Each MySQL instance is configured in the `metrics.instances` section. You can configure multiple instances if you have more than one instance you'd like to monitor.

Required instance settings:
- `username`: The username of the database user that will be used to collect metrics.
- `password`: The password of the database user that will be used to collect metrics. We strongly recommend providing this via a Kubernetes secret as an environment variable, e.g. `MYSQL_PASSWORD`, which should be provided in the `extraEnvs` section of the chart. This parameter should be passed in the format `${env:MYSQL_PASSWORD}` in order for the collector to be able to read it.

Optional instance settings:
- `port`: The port of the MySQL instance. Defaults to `3306`. Unless you use a non-standard port, there is no need to set this parameter.
- `labelSelectors`: A list of label selectors to select the pods that run the MySQL instances. If you wish to monitor multiple instances, the selectors will determine which pods belong to a given instance.
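
As a sketch of the password recommendation above, the environment variable could be supplied through the chart's `extraEnvs` list (the key used in the values files of this chart) and then referenced from the instance settings. The secret name `mysql-credentials` and its key `password` are hypothetical; substitute whatever exists in your cluster.

```yaml
opentelemetry-collector:
  extraEnvs:
    # Existing entry from the chart's values; kept because Helm replaces
    # lists as a whole rather than merging them.
    - name: CORALOGIX_PRIVATE_KEY
      valueFrom:
        secretKeyRef:
          name: coralogix-keys
          key: PRIVATE_KEY
    # Hypothetical secret name and key; replace with your own.
    - name: MYSQL_PASSWORD
      valueFrom:
        secretKeyRef:
          name: mysql-credentials
          key: password
  presets:
    mysql:
      metrics:
        enabled: true
        instances:
          - username: "otel-coralogix-collector"
            # Read by the collector from the environment variable above.
            password: ${env:MYSQL_PASSWORD}
```

Keeping the password in a secret means it never appears in the values file or in the rendered collector configuration.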

### Configuration for extra logs collection

The extra logs collection has to be enabled by setting `extraLogs.enabled` to `true`. Note that the extra logs have to be enabled on your MySQL instance (please refer to the [relevant documentation](https://dev.mysql.com/doc/refman/8.0/en/server-logs.html)). Please also note that extra logs collection is only available when running `otel-infrastructure-collector` with the OpenTelemetry Operator.

**PLEASE NOTE:** In order for the collection to take effect, you need to annotate your MySQL instance(s) pod templates with the following:

```bash
kubectl patch sts <YOUR_MYSQL_INSTANCE_NAME> -p '{"spec": {"template":{"metadata":{"annotations":{"sidecar.opentelemetry.io/inject":"otel-infrastructure-collector-mysql-logs-sidecar"}}}} }'
```
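
If you manage the MySQL StatefulSet from a manifest rather than patching the live object, the same annotation can be set declaratively. The fragment below is a minimal sketch showing only the relevant fields; the name `mysql` is a placeholder and the rest of the StatefulSet spec is omitted.

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql  # placeholder; use the name of your MySQL StatefulSet
spec:
  template:
    metadata:
      annotations:
        # Same annotation as in the kubectl patch command above.
        sidecar.opentelemetry.io/inject: "otel-infrastructure-collector-mysql-logs-sidecar"
```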

Required settings:
- `volumeMountName`: specifies the name of the volume mount. It should correspond to the volume name of the MySQL data volume.
- `mountPath`: specifies the path at which to mount the volume. This should correspond to the mount path of your MySQL data volume. Provide this parameter without a trailing slash.

Optional settings:
- `logFilesPath`: specifies which directory to watch for log files. This will typically be the MySQL data directory, such as `/var/lib/mysql`. If not specified, the value of `mountPath` will be used.
- `logFilesExtension`: specifies which file extensions to watch for. Defaults to `.log`.
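
To illustrate how the required and optional settings fit together, here is a sketch of an `extraLogs` section with every setting spelled out. It follows the same fragment convention as the preset examples in this README; the `logs` subdirectory is purely illustrative, for a setup where the log files do not sit directly under the mount path.

```yaml
mysql:
  extraLogs:
    enabled: true
    # Required: name of the MySQL data volume and where it is mounted
    # (no trailing slash).
    volumeMountName: "data"
    mountPath: "/var/lib/mysql"
    # Optional: falls back to mountPath when omitted.
    # The "logs" subdirectory is an illustrative assumption.
    logFilesPath: "/var/lib/mysql/logs"
    # Optional: ".log" is already the default and is shown only for clarity.
    logFilesExtension: ".log"
```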

### Common issues

- Metrics collection is failing with the error `"Error 1227 (42000): Access denied; you need (at least one of) the PROCESS privilege(s) for this operation"`
  - This error indicates that the database user you provided does not have the required privileges to collect metrics. Grant the `PROCESS` privilege to the user, e.g. by running the query `GRANT PROCESS ON *.* TO 'user'@'%'`.

### Example preset configuration for a single instance

```yaml
mysql:
  metrics:
    enabled: true
    instances:
      - username: "otel-coralogix-collector"
        password: ${env:MYSQL_PASSWORD}
  extraLogs:
    enabled: true
    volumeMountName: "data"
    mountPath: "/var/log/mysql"
```

### Example preset configuration for multiple instances

```yaml
mysql:
  metrics:
    enabled: true
    instances:
      - username: "otel-coralogix-collector"
        password: ${env:MYSQL_PASSWORD_INSTANCE_A}
        labelSelectors:
          app.kubernetes.io/name: "mysql-a"
      - username: "otel-coralogix-collector"
        password: ${env:MYSQL_PASSWORD_INSTANCE_B}
        labelSelectors:
          app.kubernetes.io/name: "mysql-b"
  extraLogs:
    enabled: true
    volumeMountName: "data"
    mountPath: "/var/log/mysql"
```

# Dependencies

This chart uses the [opentelemetry-collector](https://github.com/open-telemetry/opentelemetry-helm-charts/tree/main/charts/opentelemetry-collector) Helm chart.
182 changes: 182 additions & 0 deletions otel-infrastructure-collector/k8s-helm/values-crd.yaml
@@ -0,0 +1,182 @@
global:
  domain: ""
  traces:
    endpoint: ""
  metrics:
    endpoint: ""
  logs:
    endpoint: ""
  defaultApplicationName: "default"
  defaultSubsystemName: "nodes"

opentelemetry-collector:
  mode: deployment
  collectorCRD:
    generate: true
  fullnameOverride: otel-infrastructure-collector
  configMap:
    create: false
  clusterRole:
    name: "otel-infrastructure-collector"
    create: true
    rules:
      - apiGroups: ["", "events.k8s.io"]
        resources: ["events"]
        verbs: ["watch", "list"]
    clusterRoleBinding:
      name: "otel-infrastructure-collector"
  replicaCount: 1
  presets:
    clusterMetrics:
      enabled: true
    kubernetesAttributes:
      enabled: true
    mysql:
      metrics:
        enabled: false
        instances:
          - username: ""
            password: ""
            port: 3306
      extraLogs:
        enabled: false
        volumeMountName: ""
        mountPath: ""

  ports:
    otlp:
      enabled: true
    otlp-http:
      enabled: false
    jaeger-compact:
      enabled: false
    jaeger-thrift:
      enabled: false
    jaeger-grpc:
      enabled: false
    zipkin:
      enabled: false

  extraEnvs:
    - name: CORALOGIX_PRIVATE_KEY
      valueFrom:
        secretKeyRef:
          name: coralogix-keys
          key: PRIVATE_KEY
  config:
    extensions:
      zpages:
        endpoint: localhost:55679
    receivers:
      k8sobjects:
        objects:
          - name: events
            mode: pull
            interval: 15s
            group: events.k8s.io
      prometheus:
        config:
          scrape_configs:
            - job_name: opentelemetry-infrastructure-collector
              scrape_interval: 30s
              static_configs:
                - targets:
                    - ${MY_POD_IP}:8888
    exporters:
      coralogix:
        timeout: "1m"
        private_key: "${CORALOGIX_PRIVATE_KEY}"
        domain: "{{.Values.global.domain}}"
        traces:
          endpoint: "{{ .Values.global.traces.endpoint }}"
        metrics:
          endpoint: "{{ .Values.global.metrics.endpoint }}"
        logs:
          endpoint: "{{ .Values.global.logs.endpoint }}"
        application_name_attributes:
          - "k8s.namespace.name"
          - "service.namespace"
        subsystem_name_attributes:
          - "k8s.deployment.name"
          - "k8s.statefulset.name"
          - "k8s.daemonset.name"
          - "k8s.cronjob.name"
          - "k8s.job.name"
          - "k8s.container.name"
          - "k8s.node.name"
          - "service.name"
        application_name: "{{.Values.global.defaultApplicationName }}"
        subsystem_name: "{{.Values.global.defaultSubsystemName }}"
    processors:
      memory_limiter: null # Will get the k8s resource limits
      transform/kube-events:
        log_statements:
          - context: log
            statements:
              - keep_keys(body, ["type", "action", "eventTime", "reason", "regarding", "reportingController", "note", "series", "metadata", "deprecatedFirstTimestamp", "deprecatedLastTimestamp"])
      resource/kube-events:
        attributes:
          - key: service.name
            value: "kube-events"
            action: upsert
    service:
      extensions:
        - zpages
        - health_check
        - memory_ballast
      telemetry:
        logs:
          encoding: json
        metrics:
          address: ${MY_POD_IP}:8888
      pipelines:
        logs/kube-events:
          exporters:
            - coralogix
          processors:
            - memory_limiter
            - batch
            - transform/kube-events
            - resource/kube-events
          receivers:
            - k8sobjects
        metrics:
          exporters:
            - coralogix
          processors:
            - memory_limiter
            - batch
          receivers:
            - prometheus
            - otlp
        logs:
          exporters:
            - coralogix
          processors:
            - memory_limiter
            - batch
          receivers:
            - otlp
  tolerations:
    - operator: Exists

  resources:
    requests:
      cpu: 100m
      memory: 128Mi
    limits:
      cpu: 1
      memory: 2G
# In order to enable serviceMonitor, following part must be enabled in order to expose the required port:
# ports:
# metrics:
# enabled: true

# serviceMonitor:
# enabled: true

# prometheusRule:
# enabled: true
# defaultRules:
# enabled: true
14 changes: 14 additions & 0 deletions otel-infrastructure-collector/k8s-helm/values.yaml
@@ -25,6 +25,20 @@ opentelemetry-collector:
  presets:
    clusterMetrics:
      enabled: true
    kubernetesAttributes:
      enabled: true
    mysql:
      metrics:
        enabled: false
        instances:
          - username: ""
            password: ""
            port: 3306
      extraLogs:
        enabled: false
        volumeMountName: ""
        mountPath: ""

  ports:
    otlp:
      enabled: true