host.id has lower cardinality
host.hostname has cardinality 100 while host.id has cardinality 50.
This happens because the dataset contains one host.id for each pair
of hostnames: a single host.id maps to two hostnames, such as
'dustin.windows' and 'dustin.linux'. This is probably an artifact
of the data generation script.
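
The relationship described above can be illustrated with a small sketch. The data here is made up: the `host-N` id format and the `nameN` base names are hypothetical, and this is not the actual data generation script. It only shows how attaching two hostnames to each host.id gives host.id half the cardinality of host.hostname.

```python
# Hypothetical toy data: NOT the real data generation script.
NUM_HOSTS = 50
OS_SUFFIXES = ("windows", "linux")


def generate_hosts(num_hosts=NUM_HOSTS):
    """Return (host_id, hostname) pairs, two hostnames per host id."""
    pairs = []
    for i in range(num_hosts):
        host_id = f"host-{i}"  # hypothetical id format
        name = f"name{i}"      # hypothetical base name (think 'dustin')
        for suffix in OS_SUFFIXES:
            pairs.append((host_id, f"{name}.{suffix}"))
    return pairs


def cardinality(values):
    """Number of distinct values in an iterable."""
    return len(set(values))


pairs = generate_hosts()
host_id_cardinality = cardinality(host_id for host_id, _ in pairs)
hostname_cardinality = cardinality(hostname for _, hostname in pairs)
print(host_id_cardinality, hostname_cardinality)  # 50 100
```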

Lower cardinality fields might:
* reduce sorting overhead due to fewer comparisons
* improve compression due to more data clustering together

This change should at least allow us to see whether there is any
benefit in choosing a lower cardinality field.
salvatore-campagna committed Oct 2, 2024
1 parent 11555ab commit 36d09e9
Showing 1 changed file with 1 addition and 1 deletion.
@@ -9,7 +9,7 @@
"synthetic_source_keep": "{{ synthetic_source_keep }}"
},
{% endif %}
- "sort.field": [ "host.hostname", "@timestamp" ],
+ "sort.field": [ "host.id", "@timestamp" ],
"sort.order": [ "asc", "desc" ],
"sort.missing": ["_first", "_last"]
}
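
As a rough illustration of the ordering these sort settings ask for, the sketch below sorts a handful of made-up documents by host.id ascending and then @timestamp descending. It assumes that `_first` and `_last` place documents lacking the field at the start and the end respectively; the documents, the integer timestamps, and the `sort_key` helper are hypothetical and are not part of this change.

```python
# Hypothetical documents; field values are made up for illustration.
docs = [
    {"host.id": "b", "@timestamp": 10},
    {"host.id": "a", "@timestamp": 10},
    {"host.id": "a", "@timestamp": 30},
    {"host.id": None, "@timestamp": 20},   # missing host.id
    {"host.id": "b", "@timestamp": None},  # missing @timestamp
]


def sort_key(doc):
    """Key mirroring sort.field / sort.order / sort.missing (assumed semantics)."""
    host_id = doc["host.id"]
    ts = doc["@timestamp"]
    # host.id ascending; missing values sort first ("_first").
    host_part = (0, "") if host_id is None else (1, host_id)
    # @timestamp descending (negated); missing values sort last ("_last").
    ts_part = (1, 0) if ts is None else (0, -ts)
    return (host_part, ts_part)


ordered = sorted(docs, key=sort_key)
for doc in ordered:
    print(doc["host.id"], doc["@timestamp"])
```

Documents that share a host.id end up adjacent, newest first, which is the clustering the commit message hopes will help compression.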