-
Notifications
You must be signed in to change notification settings - Fork 574
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature/tags for elasticsearchwriter #10074
base: master
Are you sure you want to change the base?
Feature/tags for elasticsearchwriter #10074
Conversation
Thank you for your pull request. Before we can look at it, you'll need to sign a Contributor License Agreement (CLA). Please follow instructions at https://icinga.com/company/contributor-agreement to sign the CLA. After that, please reply here with a comment and we'll verify. Contributors that have not signed yet: @SebastianOpeni
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
The CLA should be signed now. |
@cla-bot check |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, thanks for your contribution! I've just been looking over the code changes and left some inline comments below, but haven't tested it with an Elastic/OpenSearch instance yet. Other than that, can you please rebase this against the current master to fix the GitHub actions. Thanks!
lib/perfdata/elasticsearchwriter.ti
Outdated
[config, required] Dictionary::Ptr host_tags_template { | ||
default {{{ | ||
return new Dictionary({ | ||
}); | ||
}}} | ||
}; | ||
[config, required] Dictionary::Ptr service_tags_template { | ||
default {{{ | ||
return new Dictionary({ | ||
}); | ||
}}} | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are nested dictionaries/arrays supported or only simple key-value pairs? If not, it would be advisable to add such a validator for these two new attributes, otherwise, Icinga 2 will simply fail without any indication of that misconfigured type, e.g. when using such a config: host_tags_template = { os_name = {"foo" = "host.vars.os$"} }
[2024-09-13 11:39:38 +0200] information/ConfigItem: Committing config item(s).
[2024-09-13 11:39:38 +0200] critical/config: Error: Operator + cannot be applied to values of type 'String' and 'Dictionary'
[2024-09-13 11:39:38 +0200] critical/config: 1 error
[2024-09-13 11:39:38 +0200] critical/cli: Config validation failed. Re-run with 'icinga2 daemon -C' after fixing the config.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my testing netsted dictionarys worked just fine when I remember correctly, but I'm not sure about arrays so I will test both cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it is indeed supported, then you probably need to adjust validation errors, i.e. notice that in my example config snippet the opening $
is missing and when trying to produce an error message, you use the +
operator to concatenate string
with the nested dictionary ("Closing $ not found in macro format string '" + pair.second
and pair.second
is the nested dictionary in this case).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My tests had the following results:
- 1st level nested dictionaries are supported
service_tags_template= { a = { b = "c" } }
=>tags.a.b = "c"
- 2nd-level nested dictionaries are valid, but interpreted as (multi line) strings contained in the first nested dictionary.
service_tags_template= { a = { b = { c = "d" } } }
=>tags.a.b = "{ c= "d" }"
- arrays are also valid
service_tags_template= { a = ["b", "c"] }
=>tags.a = ["b", "c"]
All of them are possible to contain macros therefore I would need a Macro Validator similar to MacroProcessor::ValidateCustomVars.
Would you prefer me:
- rewriting a similar method in the elastic-search-writer
- or creating a more general method MacroProcessor::ValidateMacroObject (or similar name) that gets called by my validator methods and also by MacroProcessor::ValidateCustomVars?
I would prefer the second option, but it would mean I change the MacroProcessor Class so I wanted to check back with you first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would rather question the support for endlessly nested dicts in general before you do such a big change. As far as I'm concerned, I would just accept simple key => value pairs, but I don't know all the use cases you're going to need that for. So my question to you: is it worth making that lot of major code changes and probably investing even more time for testing just to support nested dicts/arrays?
I've looked at some other writers, and none of them support nested dictionaries up to unknown levels. So, if you absolutely need nested tags, couldn't you use such "tags.a.b.c.d" = "$foo$"
as simple key-value pairs? This would really simplify the whole thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I decided that key-value pairs and one-level arrays of strings should be enough.
(More complex tagging can still be implemented via the usage of makros if needed.)
Hi thanks for your review :) |
44a55ec
to
2f3c2fa
Compare
lib/perfdata/elasticsearchwriter.cpp
Outdated
} | ||
} | ||
} | ||
fields->Set("tags", tags); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not quite sure if tags
is the right field for this. Isn't it common that tags
is just a simple list instead of key-value pairs? I would set a different field or make it configureable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The field tags
as a dictionary was used because the influxdb-writer is also using it that way.
The functionality would stay the same with a different field name so if you have a preference I could change it accordingly.
(labels
would come to mind as a alternative.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I gave it some thought and I could also simply allow tags to be of type array. (in addition to dict)
That would change some of the logic we had reviewed until now but if that would be preferred by you, I could also implement it that way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't it common that
tags
is just a simple list instead of key-value pairs?
I'm afraid, in perfdata DB context, they indeed are key-value pairs:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm afraid, in perfdata DB context, they indeed are key-value pairs:
Please show me where we set a field called tags
in one of them. Supporting tags in both just means adding key-value pairs, with no reference to a tags
field, right?. But here we set the field where I wonder if users of Elasticsearch or similar would expect a list rather than key-value pairs. tag_set
sounds more reasonable to me. Anyway, before we just merge this, I'd like to get an informed opinion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For Icingas influxdbcommonwriter this is done here:
https://github.com/Icinga/icinga2/blob/master/lib/perfdata/influxdbcommonwriter.cpp#L253
But tags
never reaches Influx DB.
The graphite writer does only allow to configure templates for the host and service name.
Otherwise there would not be a tags
field either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Found it as well just after Commenting. 🙈
That was my bad.
I'm very sorry for the confusion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To prevent confusion about the Type of tags
, would you prefer to:
- Rename the field to
tag_set
or - Add the configured key-values directly into the elasticSearch-Entries as done in the influxdb case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To prevent confusion about the Type of
tags
, would you prefer to:
- Rename the field to
tag_set
or
I think a tag_set property would be rather confusing in ES/Kibana.
- Add the configured key-values directly into the elasticSearch-Entries as done in the influxdb case?
This, in contrast, has the advantages:
- No
tags.
boilerplate in our ES data - Consistency with InfluxDB
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After my changes the configured tags are now added to the root element.
That also allows for tags = ["foo", "bar"]
if desired.
Notice:
Since the templated_tags are added after the default fields and the performance Values this change will also allow to overwrite them.
Thanks for your answers in the review. |
Apart from #10074 (comment), code wise it should be fine now, I haven't tested it with a actual ElasticSearch instance though, but till I get the time, some of my colleagues might want a look at this as well. |
lib/perfdata/elasticsearchwriter.cpp
Outdated
} | ||
} | ||
} | ||
fields->Set("tags", tags); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't it common that
tags
is just a simple list instead of key-value pairs?
I'm afraid, in perfdata DB context, they indeed are key-value pairs:
doc/14-features.md: correct Elasticsearch versions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tests
TL;DR: It works! 👍
➜ ~ curl -fsS http://127.0.0.1:9200/icinga2-2024.09.25/_search/ |python3 -mjson.tool
{
"took": 5,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 6,
"relation": "eq"
},
"max_score": 1.0,
"hits": [
{
"_index": "icinga2-2024.09.25",
"_type": "_doc",
"_id": "pAd1KZIB9Wwi17GmLApi",
"_score": 1.0,
"_source": {
"@timestamp": "2024-09-25T15:53:07.855+0200",
"check_command": "hostalive",
"check_result.check_source": "ws-aklimov7777777.local",
"check_result.command": [
"/Users/aklimov/NET/WS/icinga2/prefix/usr/lib/nagios/plugins/check_ping",
"-H",
"127.0.0.1",
"-c",
"5000,100%",
"-w",
"3000,80%"
],
"check_result.execution_end": "2024-09-25T15:53:07.855+0200",
"check_result.execution_start": "2024-09-25T15:53:07.854+0200",
"check_result.execution_time": 0.0009140968322753906,
"check_result.exit_status": 128,
"check_result.latency": 0.0381932258605957,
"check_result.output": "execvpe(/Users/aklimov/NET/WS/icinga2/prefix/usr/lib/nagios/plugins/check_ping) failed: No such file or directory<Terminated with exit code 128 (0x80).>",
"check_result.schedule_end": "2024-09-25T15:53:07.855+0200",
"check_result.schedule_start": "2024-09-25T15:53:07.816+0200",
"check_result.state": 3,
"check_result.vars_after": {
"attempt": 2,
"reachable": true,
"state": 3,
"state_type": 0
},
"check_result.vars_before": null,
"current_check_attempt": 2,
"host": "ws-aklimov7777777.local",
"last_hard_state": 1,
"last_state": 1,
"max_check_attempts": 3,
"reachable": true,
"state": 1,
"state_type": 0,
"tags": {
"a_list_of_items": [
"foo",
"bar"
],
"custom_label": "A Custom Label",
"os_name": "Linux"
},
"timestamp": "2024-09-25T15:53:07.855+0200",
"type": "icinga2.event.checkresult"
}
},
{
"_index": "icinga2-2024.09.25",
"_type": "_doc",
"_id": "pQd1KZIB9Wwi17GmLApj",
"_score": 1.0,
"_source": {
"@timestamp": "2024-09-25T15:53:07.855+0200",
"check_command": "hostalive",
"check_result.check_source": "ws-aklimov7777777.local",
"check_result.command": [
"/Users/aklimov/NET/WS/icinga2/prefix/usr/lib/nagios/plugins/check_ping",
"-H",
"127.0.0.1",
"-c",
"5000,100%",
"-w",
"3000,80%"
],
"check_result.execution_end": "2024-09-25T15:53:07.855+0200",
"check_result.execution_start": "2024-09-25T15:53:07.854+0200",
"check_result.execution_time": 0.0009140968322753906,
"check_result.exit_status": 128,
"check_result.latency": 0.0381932258605957,
"check_result.output": "execvpe(/Users/aklimov/NET/WS/icinga2/prefix/usr/lib/nagios/plugins/check_ping) failed: No such file or directory<Terminated with exit code 128 (0x80).>",
"check_result.schedule_end": "2024-09-25T15:53:07.855+0200",
"check_result.schedule_start": "2024-09-25T15:53:07.816+0200",
"check_result.state": 3,
"check_result.vars_after": {
"attempt": 2,
"reachable": true,
"state": 3,
"state_type": 0
},
"check_result.vars_before": null,
"current_check_attempt": 2,
"host": "ws-aklimov7777777.local",
"last_hard_state": 1,
"last_state": 1,
"max_check_attempts": 3,
"state": 1,
"tags": {
"a_list_of_items": [
"foo",
"bar"
],
"custom_label": "A Custom Label",
"os_name": "Linux"
},
"timestamp": "2024-09-25T15:53:07.855+0200",
"type": "icinga2.event.statechange"
}
},
{
"_index": "icinga2-2024.09.25",
"_type": "_doc",
"_id": "pgd1KZIB9Wwi17GmLApj",
"_score": 1.0,
"_source": {
"@timestamp": "2024-09-25T15:53:10.689+0200",
"check_command": "disk",
"check_result.check_source": "ws-aklimov7777777.local",
"check_result.command": [
"/Users/aklimov/NET/WS/icinga2/prefix/usr/lib/nagios/plugins/check_disk",
"-c",
"10%",
"-w",
"20%",
"-X",
"none",
"-X",
"tmpfs",
"-X",
"sysfs",
"-X",
"proc",
"-X",
"configfs",
"-X",
"devtmpfs",
"-X",
"devfs",
"-X",
"mtmfs",
"-X",
"tracefs",
"-X",
"cgroup",
"-X",
"fuse.*",
"-X",
"fuse.gvfsd-fuse",
"-X",
"fuse.gvfs-fuse-daemon",
"-X",
"fuse.portal",
"-X",
"fuse.sshfs",
"-X",
"fdescfs",
"-X",
"overlay",
"-X",
"nsfs",
"-X",
"squashfs",
"-p",
"/"
],
"check_result.execution_end": "2024-09-25T15:53:10.689+0200",
"check_result.execution_start": "2024-09-25T15:53:10.684+0200",
"check_result.execution_time": 0.004788875579833984,
"check_result.exit_status": 128,
"check_result.latency": 0.015203237533569336,
"check_result.output": "execvpe(/Users/aklimov/NET/WS/icinga2/prefix/usr/lib/nagios/plugins/check_disk) failed: No such file or directory<Terminated with exit code 128 (0x80).>",
"check_result.schedule_end": "2024-09-25T15:53:10.690+0200",
"check_result.schedule_start": "2024-09-25T15:53:10.670+0200",
"check_result.state": 3,
"check_result.vars_after": {
"attempt": 2,
"reachable": true,
"state": 3,
"state_type": 0
},
"check_result.vars_before": null,
"current_check_attempt": 2,
"host": "ws-aklimov7777777.local",
"last_hard_state": 3,
"last_state": 3,
"max_check_attempts": 5,
"reachable": true,
"service": "disk /",
"state": 3,
"state_type": 0,
"tags": {
"a_list_of_items": [
"foo",
"bar"
],
"custom_label": "A Custom Label",
"os_name": "Linux",
"sv_name": "disk /"
},
"timestamp": "2024-09-25T15:53:10.689+0200",
"type": "icinga2.event.checkresult"
}
},
{
"_index": "icinga2-2024.09.25",
"_type": "_doc",
"_id": "pwd1KZIB9Wwi17GmLApj",
"_score": 1.0,
"_source": {
"@timestamp": "2024-09-25T15:53:10.689+0200",
"check_command": "disk",
"check_result.check_source": "ws-aklimov7777777.local",
"check_result.command": [
"/Users/aklimov/NET/WS/icinga2/prefix/usr/lib/nagios/plugins/check_disk",
"-c",
"10%",
"-w",
"20%",
"-X",
"none",
"-X",
"tmpfs",
"-X",
"sysfs",
"-X",
"proc",
"-X",
"configfs",
"-X",
"devtmpfs",
"-X",
"devfs",
"-X",
"mtmfs",
"-X",
"tracefs",
"-X",
"cgroup",
"-X",
"fuse.*",
"-X",
"fuse.gvfsd-fuse",
"-X",
"fuse.gvfs-fuse-daemon",
"-X",
"fuse.portal",
"-X",
"fuse.sshfs",
"-X",
"fdescfs",
"-X",
"overlay",
"-X",
"nsfs",
"-X",
"squashfs",
"-p",
"/"
],
"check_result.execution_end": "2024-09-25T15:53:10.689+0200",
"check_result.execution_start": "2024-09-25T15:53:10.684+0200",
"check_result.execution_time": 0.004788875579833984,
"check_result.exit_status": 128,
"check_result.latency": 0.015203237533569336,
"check_result.output": "execvpe(/Users/aklimov/NET/WS/icinga2/prefix/usr/lib/nagios/plugins/check_disk) failed: No such file or directory<Terminated with exit code 128 (0x80).>",
"check_result.schedule_end": "2024-09-25T15:53:10.690+0200",
"check_result.schedule_start": "2024-09-25T15:53:10.670+0200",
"check_result.state": 3,
"check_result.vars_after": {
"attempt": 2,
"reachable": true,
"state": 3,
"state_type": 0
},
"check_result.vars_before": null,
"current_check_attempt": 2,
"host": "ws-aklimov7777777.local",
"last_hard_state": 3,
"last_state": 3,
"max_check_attempts": 5,
"service": "disk /",
"state": 3,
"tags": {
"a_list_of_items": [
"foo",
"bar"
],
"custom_label": "A Custom Label",
"os_name": "Linux",
"sv_name": "disk /"
},
"timestamp": "2024-09-25T15:53:10.689+0200",
"type": "icinga2.event.statechange"
}
},
{
"_index": "icinga2-2024.09.25",
"_type": "_doc",
"_id": "qAd1KZIB9Wwi17GmLApj",
"_score": 1.0,
"_source": {
"@timestamp": "2024-09-25T15:53:15.753+0200",
"check_command": "ping4",
"check_result.check_source": "ws-aklimov7777777.local",
"check_result.command": [
"/Users/aklimov/NET/WS/icinga2/prefix/usr/lib/nagios/plugins/check_ping",
"-4",
"-H",
"127.0.0.1",
"-c",
"200,15%",
"-w",
"100,5%"
],
"check_result.execution_end": "2024-09-25T15:53:15.753+0200",
"check_result.execution_start": "2024-09-25T15:53:15.750+0200",
"check_result.execution_time": 0.002724885940551758,
"check_result.exit_status": 128,
"check_result.latency": 0.012976884841918945,
"check_result.output": "execvpe(/Users/aklimov/NET/WS/icinga2/prefix/usr/lib/nagios/plugins/check_ping) failed: No such file or directory<Terminated with exit code 128 (0x80).>",
"check_result.schedule_end": "2024-09-25T15:53:15.755+0200",
"check_result.schedule_start": "2024-09-25T15:53:15.739+0200",
"check_result.state": 3,
"check_result.vars_after": {
"attempt": 2,
"reachable": true,
"state": 3,
"state_type": 0
},
"check_result.vars_before": null,
"current_check_attempt": 2,
"host": "ws-aklimov7777777.local",
"last_hard_state": 3,
"last_state": 3,
"max_check_attempts": 5,
"reachable": true,
"service": "ping4",
"state": 3,
"state_type": 0,
"tags": {
"a_list_of_items": [
"foo",
"bar"
],
"custom_label": "A Custom Label",
"os_name": "Linux",
"sv_name": "ping4"
},
"timestamp": "2024-09-25T15:53:15.753+0200",
"type": "icinga2.event.checkresult"
}
},
{
"_index": "icinga2-2024.09.25",
"_type": "_doc",
"_id": "qQd1KZIB9Wwi17GmLApj",
"_score": 1.0,
"_source": {
"@timestamp": "2024-09-25T15:53:15.753+0200",
"check_command": "ping4",
"check_result.check_source": "ws-aklimov7777777.local",
"check_result.command": [
"/Users/aklimov/NET/WS/icinga2/prefix/usr/lib/nagios/plugins/check_ping",
"-4",
"-H",
"127.0.0.1",
"-c",
"200,15%",
"-w",
"100,5%"
],
"check_result.execution_end": "2024-09-25T15:53:15.753+0200",
"check_result.execution_start": "2024-09-25T15:53:15.750+0200",
"check_result.execution_time": 0.002724885940551758,
"check_result.exit_status": 128,
"check_result.latency": 0.012976884841918945,
"check_result.output": "execvpe(/Users/aklimov/NET/WS/icinga2/prefix/usr/lib/nagios/plugins/check_ping) failed: No such file or directory<Terminated with exit code 128 (0x80).>",
"check_result.schedule_end": "2024-09-25T15:53:15.755+0200",
"check_result.schedule_start": "2024-09-25T15:53:15.739+0200",
"check_result.state": 3,
"check_result.vars_after": {
"attempt": 2,
"reachable": true,
"state": 3,
"state_type": 0
},
"check_result.vars_before": null,
"current_check_attempt": 2,
"host": "ws-aklimov7777777.local",
"last_hard_state": 3,
"last_state": 3,
"max_check_attempts": 5,
"service": "ping4",
"state": 3,
"tags": {
"a_list_of_items": [
"foo",
"bar"
],
"custom_label": "A Custom Label",
"os_name": "Linux",
"sv_name": "ping4"
},
"timestamp": "2024-09-25T15:53:15.753+0200",
"type": "icinga2.event.statechange"
}
}
]
}
}
➜ ~
Offers to add additional tags to entries written by the ElasticSearchWriter.
As discussed with Eric at the icinga summit we are submitting this feature. Please revise the code and let us know if there is something we need to improve. otherwise we would be happy for timely merge.
fixes #6837