[BREAKING][MAJOR] Update ECS Fargate integration to submit logs using OTEL vs fluentbit (#448)

* [BREAKING][MAJOR] Update ECS Fargate integration
to submit logs using OTEL vs fluentbit

* [DOC] Updated CHANGELOG with proper formatting
MichaelBriggs-Coralogix committed Sep 11, 2024
1 parent 35ac6ac commit 4c40458
Showing 5 changed files with 134 additions and 22 deletions.
4 changes: 3 additions & 1 deletion logs/fluent-bit/ecs-fargate/README.md
@@ -1,5 +1,7 @@
# fluentbit ECS Fargate container

# Note: This integration is for logs only. You can now collect logs from ECS Fargate tasks through OTEL using our otel-ecs-fargate integration. Use this integration only if you intend to ingest logs and nothing else.

fluentbit is a lightweight data shipper that we are using as a logs shipper for AWS ECS Fargate workloads.

Here we explain how to deploy the fluentbit log_router into an existing AWS ECS Fargate task definition. We use an AWS customized fluentbit image called aws-for-fluent-bit, init version, as it has several features that allow for more convenient management of the configuration. We also have an example cloudformation template for review [here](https://github.com/coralogix/cloudformation-coralogix-aws/tree/master/aws-integrations/ecs-fargate)
@@ -76,7 +78,7 @@ In order to allow container access to the S3 object, you'll need to provide the
}
```

Note: Don't confuse Task Execution Role for Task Role, this permission needs to be added to the Task Role. (Contrary to the ADOT (OTEL) Metrics and Traces integration)
Note: Don't confuse Task Execution Role for Task Role, this permission needs to be added to the Task Role.

After you've added the above container to your existing Task Definition, you need to adjust the logConfiguration for the containers you wish to forward to Coralogix.

14 changes: 14 additions & 0 deletions otel-ecs-fargate/CHANGELOG.md
@@ -0,0 +1,14 @@
# Changelog

## otel-ecs-fargate

<!-- To add a new entry write: -->

<!-- ### version / full date -->

<!-- * [Update/Bug fix] message that describes the changes that you apply -->

### 0.0.1 / 2024-09-11

### 🛑 Breaking changes 🛑
* [UPDATE] Update ecs-fargate integration to OTEL only (remove fluentbit logrouter)
50 changes: 29 additions & 21 deletions otel-ecs-fargate/README.md
@@ -2,15 +2,17 @@

### Note: Previous versions of this integration used an ADOT (AWS Distribution for OpenTelemetry) collector image. If you are upgrading an existing deployment, make sure you upgrade both the configuration and the task definition.

### Note: Previous versions of this integration required logs to be processed using fluentbit logrouter. This is no longer necessary and logs can be collected by OTEL along with the metrics and traces.

The OpenTelemetry collector offers a vendor-agnostic implementation of how to receive, process and export telemetry data.

In this document, we'll explain how to add the OTEL collector as a sidecar agent to your ECS Task Definitions. We use the standard Opentelemetry Collector Contrib distribution but leverage the envprovider to generate the configuration from an AWS SSM Parameter Store. There is an example cloudformation template for review [here](https://github.com/coralogix/cloudformation-coralogix-aws/tree/master/aws-integrations/ecs-fargate)

The envprovider is used for loading of the OpenTelemetry configuration via Systems Manager Parameter Stores. This makes adjusting your configuration more convenient and more dynamic than baking a static configuration into your container image.

Our config.yaml file includes a standard configuration that'll ensure proper ingestion by our backend. Make sure to create this parameter store in the same region as your ECS cluster. We've included a sample cloudformation template to deploy this parameter store to simplify this process.
Our config.yaml file includes a standard configuration that'll ensure proper ingestion of logs, metrics and traces by our backend. Make sure to create this parameter store in the same region as your ECS cluster. We've included a sample cloudformation template to deploy this parameter store to simplify this process.

Once the Parameter Store has been created, you'll need to add the container to your existing Task Definition.
Once the Parameter Store has been created, you'll need to add the OTEL container to your existing Task Definition(s).

Example container declaration within a Task Definition:

@@ -62,38 +64,33 @@ Example container declaration within a Task Definition:
"valueFrom": "/CX_OTEL/config.yaml"
}
],
"user": "0",
"logConfiguration": {
"logDriver": "awsfirelens",
"options": {
"Format": "json_lines",
"Header": "Authorization Bearer <Coralogix PrivateKey>",
"Host": "ingress.<Coralogix Domain>",
"Name": "http",
"Port": "443",
"Retry_Limit": "10",
"TLS": "On",
"URI": "/logs/v1/singles",
"compress": "gzip"
"Name": "OpenTelemetry"
}
},
"systemControls": [],
"firelensConfiguration": {
"type": "fluentbit"
}
}
]
```

In the example above, you'll need to set two instances each of `<Coralogix PrivateKey>` and `<Coralogix Domain>`. The logConfiguration section included in the example will forward OTEL logs to the Coralogix platform, as documented in our fluentbit log processing configuration instructions [here](../logs/fluent-bit/ecs-fargate/README.md).

**NOTE:** If you wish to store your Coralogix Privatekey in Secret Manager, you can remove the `"Header"` from `"options"` and create one under `"secretOptions"` and reference the Secret's ARN. Create the secret as plaintext with the same format as above. You will also need to add the secretsmanager:GetSecretValue permission to your ecs Task Execution Role.
In the example above, you'll need to set `<Coralogix PrivateKey>` and `<Coralogix Domain>`. The logConfiguration section included in the example will forward OTEL logs to the Coralogix platform. Make sure you set all your existing containers' logConfiguration to the same.

```
"secretOptions": [
{
"name": "Header",
"valueFrom": "arn:aws:secretsmanager:us-east-1:<redacted>:secret:<redacted>"
"logConfiguration": {
"logDriver": "awsfirelens",
"options": {
"Name": "OpenTelemetry"
}
]
},
```
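
The instruction to give every existing container the same logConfiguration can be scripted when a Task Definition has many containers. The following is a minimal sketch, not part of this commit: it assumes the Task Definition's container list is already loaded as Python dictionaries, and it assumes the OTEL collector is the container that declares `firelensConfiguration`, as in the example above. The function name `route_logs_to_otel` is hypothetical.

```python
import copy

# logConfiguration taken from the README example: container output is routed
# through FireLens to the OTEL collector sidecar.
FIRELENS_LOG_CONFIGURATION = {
    "logDriver": "awsfirelens",
    "options": {"Name": "OpenTelemetry"},
}


def route_logs_to_otel(container_definitions):
    """Return a copy of the container definitions in which every application
    container uses the awsfirelens logConfiguration.

    Assumption: the container that declares "firelensConfiguration" is the
    OTEL collector itself, so it is returned unchanged.
    """
    updated = copy.deepcopy(container_definitions)
    for container in updated:
        if "firelensConfiguration" in container:
            continue  # the collector already carries its own configuration
        container["logConfiguration"] = copy.deepcopy(FIRELENS_LOG_CONFIGURATION)
    return updated


if __name__ == "__main__":
    containers = [
        {"name": "app", "logConfiguration": {"logDriver": "awslogs"}},
        {"name": "otel", "firelensConfiguration": {"type": "fluentbit"}},
    ]
    for container in route_logs_to_otel(containers):
        print(container["name"], container.get("logConfiguration"))
```

The input list is deep-copied, so the original definitions stay intact and the result can be compared against them before registering a new Task Definition revision.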

If you don't want to have them submitted to the Coralogix platform, you can replace the logConfiguration with whichever logDriver configuration you would prefer. To submit to Cloudwatch, you can configure as so:
If you don't want to have logs submitted to the Coralogix platform, you can replace the logConfiguration with whichever logDriver configuration you would prefer. To submit to Cloudwatch, you can configure as so:

```
"logConfiguration": {
@@ -107,6 +104,17 @@ If you don't want to have them submitted to the Coralogix platform, you can repl
}
```

**NOTE:** If you wish to store your Coralogix Private Key in Secrets Manager, you can remove the `"Header"` from `"options"`, create one under `"secretOptions"` instead, and reference the Secret's ARN. Create the secret as plaintext with the same format as above. You will also need to add the `secretsmanager:GetSecretValue` permission to your ECS Task Execution Role.

```
"secretOptions": [
{
"name": "Header",
"valueFrom": "arn:aws:secretsmanager:us-east-1:<redacted>:secret:<redacted>"
}
]
```

In order to allow container access to the Systems Manager Parameter Store, you'll need to provide the ssm:GetParameters action permissions to the task execution role:

```
@@ -126,6 +134,6 @@ In order to allow container access to the Systems Manager Parameter Store, you'l
}
```

Note: Don't confuse Task Role for Task Execution Role, this permission needs to be added to the Task Execution Role. (Contrary to the fluentbit Logs integration)
Note: This permission needs to be added to the Task Execution Role.

After adding the above container to your existing Task Definition, your applications can export their traces and metrics to http://localhost:4318/v1/traces and http://localhost:4318/v1/metrics. The collector also collects container metrics from all containers in the Task Definition.
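
One way to point an application container at the sidecar is through the standard OpenTelemetry SDK environment variables. The sketch below is an illustration rather than part of this commit: it assumes the application uses an OpenTelemetry SDK that honours `OTEL_EXPORTER_OTLP_ENDPOINT` and `OTEL_EXPORTER_OTLP_PROTOCOL`, and the helper names `otlp_environment` and `signal_url` are hypothetical.

```python
def otlp_environment(base_url="http://localhost:4318"):
    """Build ECS container `environment` entries that direct an
    OpenTelemetry SDK to the collector sidecar over OTLP/HTTP.

    Assumption: the SDK appends /v1/traces and /v1/metrics to the base
    endpoint itself, as the OTLP exporter specification describes.
    """
    return [
        {"name": "OTEL_EXPORTER_OTLP_ENDPOINT", "value": base_url},
        {"name": "OTEL_EXPORTER_OTLP_PROTOCOL", "value": "http/protobuf"},
    ]


def signal_url(base_url, signal):
    """Return the per-signal URL for exporters that need the full path,
    e.g. signal_url("http://localhost:4318", "traces")."""
    return f"{base_url.rstrip('/')}/v1/{signal}"


if __name__ == "__main__":
    print(otlp_environment())
    print(signal_url("http://localhost:4318", "traces"))
    print(signal_url("http://localhost:4318", "metrics"))
```

The returned list can be merged into the `environment` array of an application container in the Task Definition.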
44 changes: 44 additions & 0 deletions otel-ecs-fargate/config.yaml
@@ -16,11 +16,42 @@ exporters:
subsystem_name_attributes:
- service.name
- aws.ecs.docker.name
- container_name
timeout: 30s
traces:
headers:
X-Coralogix-Distribution: ecs-fargate-integration/0.0.1
processors:
transform/firelens:
log_statements:
- context: log
statements:
# parse json logs
- merge_maps(cache, ParseJSON(body), "insert") where IsMatch(body, "^\\{")
# set message
- set(body, cache["message"]) where cache["message"] != nil

# set trace/span id
- set(trace_id.string, cache["trace_id"]) where cache["trace_id"] != nil
- set(span_id.string, cache["span_id"]) where cache["span_id"] != nil

# set severity when available
- set(severity_number, SEVERITY_NUMBER_INFO) where IsMatch(cache["level"], "(?i)info")
- set(severity_number, SEVERITY_NUMBER_WARN) where IsMatch(cache["level"], "(?i)warn")
- set(severity_number, SEVERITY_NUMBER_ERROR) where IsMatch(cache["level"], "(?i)err")
- set(severity_number, SEVERITY_NUMBER_DEBUG) where IsMatch(cache["level"], "(?i)debug")
- set(severity_number, SEVERITY_NUMBER_TRACE) where IsMatch(cache["level"], "(?i)trace")
- set(severity_number, cache["severity_number"]) where cache["severity_number"] != nil

# move log_record attributes to resource
- set(resource.attributes["container_name"], attributes["container_name"])
- set(resource.attributes["container_id"], attributes["container_id"])
- delete_key(attributes, "container_id")
- delete_key(attributes, "container_name")

- delete_matching_keys(cache, "^(message|trace_id|span_id|severity_number)$")

- merge_maps(attributes,cache, "insert")
batch:
send_batch_max_size: 2048
send_batch_size: 1024
@@ -38,6 +69,9 @@ processors:
override: true
timeout: 2s
receivers:
fluentforward/socket:
# ECS will send logs to this socket
endpoint: unix:///var/run/fluent.sock
awsecscontainermetrics:
collection_interval: 10s
otlp:
@@ -56,6 +90,16 @@ receivers:
- 127.0.0.1:8888
service:
pipelines:
logs:
exporters:
- coralogix
processors:
- transform/firelens
- resource/metadata
- resourcedetection
- batch
receivers:
- fluentforward/socket
metrics:
exporters:
- coralogix
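
Read in order, the `transform/firelens` statements added to this file amount to roughly the following per-record logic. This is a plain-Python approximation for illustration only, not the collector's implementation: the record is modelled as a dictionary, the function name `transform_firelens` is hypothetical, and a body that fails to parse as JSON is simply left alone, whereas the collector's behaviour in that case depends on its error mode.

```python
import json
import re

# Applied in the same order as the OTTL statements; a later match overrides an
# earlier one. The numbers are the OpenTelemetry SeverityNumber values.
SEVERITY_BY_PATTERN = [
    ("info", 9),
    ("warn", 13),
    ("err", 17),
    ("debug", 5),
    ("trace", 1),
]
PROMOTED_KEYS = ("message", "trace_id", "span_id", "severity_number")


def transform_firelens(body, attributes):
    """Approximate the transform/firelens processor for a single log record."""
    record = {"body": body, "attributes": dict(attributes), "resource": {}}
    cache = {}

    # parse json logs (only bodies that start with "{")
    if isinstance(body, str) and body.startswith("{"):
        try:
            parsed = json.loads(body)
        except ValueError:
            parsed = None  # simplification: skip unparsable bodies
        if isinstance(parsed, dict):
            cache.update(parsed)

    # set message, trace id and span id when present
    if cache.get("message") is not None:
        record["body"] = cache["message"]
    for key in ("trace_id", "span_id"):
        if cache.get(key) is not None:
            record[key] = cache[key]

    # set severity from "level", then let an explicit severity_number win
    level = cache.get("level")
    if isinstance(level, str):
        for pattern, number in SEVERITY_BY_PATTERN:
            if re.search(pattern, level, re.IGNORECASE):
                record["severity_number"] = number
    if cache.get("severity_number") is not None:
        record["severity_number"] = cache["severity_number"]

    # move container_name / container_id from log attributes to the resource
    for key in ("container_name", "container_id"):
        if key in record["attributes"]:
            record["resource"][key] = record["attributes"].pop(key)

    # drop promoted keys, then merge the rest without overwriting ("insert")
    for key in PROMOTED_KEYS:
        cache.pop(key, None)
    for key, value in cache.items():
        record["attributes"].setdefault(key, value)
    return record


if __name__ == "__main__":
    line = '{"message": "boom", "level": "ERROR", "user": "a"}'
    print(transform_firelens(line, {"container_name": "app", "source": "stdout"}))
```

Note that `level` is not among the deleted keys, so it remains available as a log attribute after the merge.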
44 changes: 44 additions & 0 deletions otel-ecs-fargate/parameter_store.cf
@@ -27,11 +27,42 @@ Resources:
subsystem_name_attributes:
- service.name
- aws.ecs.docker.name
- container_name
timeout: 30s
traces:
headers:
X-Coralogix-Distribution: ecs-fargate-integration/0.0.1
processors:
transform/firelens:
log_statements:
- context: log
statements:
# parse json logs
- merge_maps(cache, ParseJSON(body), "insert") where IsMatch(body, "^\\{")
# set message
- set(body, cache["message"]) where cache["message"] != nil

# set trace/span id
- set(trace_id.string, cache["trace_id"]) where cache["trace_id"] != nil
- set(span_id.string, cache["span_id"]) where cache["span_id"] != nil

# set severity when available
- set(severity_number, SEVERITY_NUMBER_INFO) where IsMatch(cache["level"], "(?i)info")
- set(severity_number, SEVERITY_NUMBER_WARN) where IsMatch(cache["level"], "(?i)warn")
- set(severity_number, SEVERITY_NUMBER_ERROR) where IsMatch(cache["level"], "(?i)err")
- set(severity_number, SEVERITY_NUMBER_DEBUG) where IsMatch(cache["level"], "(?i)debug")
- set(severity_number, SEVERITY_NUMBER_TRACE) where IsMatch(cache["level"], "(?i)trace")
- set(severity_number, cache["severity_number"]) where cache["severity_number"] != nil

# move log_record attributes to resource
- set(resource.attributes["container_name"], attributes["container_name"])
- set(resource.attributes["container_id"], attributes["container_id"])
- delete_key(attributes, "container_id")
- delete_key(attributes, "container_name")

- delete_matching_keys(cache, "^(message|trace_id|span_id|severity_number)$")

- merge_maps(attributes,cache, "insert")
batch:
send_batch_max_size: 2048
send_batch_size: 1024
@@ -49,6 +80,9 @@ Resources:
override: true
timeout: 2s
receivers:
fluentforward/socket:
# ECS will send logs to this socket
endpoint: unix:///var/run/fluent.sock
awsecscontainermetrics:
collection_interval: 10s
otlp:
@@ -67,6 +101,16 @@ Resources:
- 127.0.0.1:8888
service:
pipelines:
logs:
exporters:
- coralogix
processors:
- transform/firelens
- resource/metadata
- resourcedetection
- batch
receivers:
- fluentforward/socket
metrics:
exporters:
- coralogix