Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Relocate additional metrics v2 content #1327

Draft
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

feorlen
Copy link
Collaborator

@feorlen feorlen commented Sep 19, 2024

More rearranging to feature metrics v3 information:

  • Remove metrics v2 and Grafana dashboard pages from the TOC, access by link only.
  • Replace examples that use v2 metrics with corresponding v3 metrics.
  • Consolidate remaining references to v2 behavior/metrics on the metrics v2 page.
  • Update mc admin prometheus generate.
  • Update mc admin prometheus metrics.

Update: v2 isn't actually deprecated.

  • "un-deprecate" (more-or-less) but it's still demoted.

Note: refs to the new mc admin prometheus parameters don't work yet and cause build warnings. Likely require a fixup PR after merge, because intersphinx.

Staged:

Metrics and alerts
http://192.241.195.202:9000/staging/remove-recommended-v2-metrics/linux/operations/monitoring/metrics-and-alerts.html

Prometheus
http://192.241.195.202:9000/staging/remove-recommended-v2-metrics/linux/operations/monitoring/collect-minio-metrics-using-prometheus.html

Deprecated v2
http://192.241.195.202:9000/staging/remove-recommended-v2-metrics/linux/operations/monitoring/metrics-v2-deprecated.html#minio-metrics-v2

Deprecated (?) Grafana dashboards
http://192.241.195.202:9000/staging/remove-recommended-v2-metrics/linux/operations/monitoring/grafana.html

mc admin prometheus generate
http://192.241.195.202:9000/staging/remove-recommended-v2-metrics/linux/reference/minio-mc-admin/mc-admin-prometheus-generate.html

mc admin prometheus metrics
http://192.241.195.202:9000/staging/remove-recommended-v2-metrics/linux/reference/minio-mc-admin/mc-admin-prometheus-metrics.html

@feorlen feorlen marked this pull request as draft September 19, 2024 21:23

- Set the ``targets`` array with a hostname that resolves to the MinIO deployment.
For example, the following command scrapes all version 3 audit metrics for the MinIO cluster:
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of this "for v3" stuff is clumsy, but I'm concerned users will forget which they are using and follow incorrect instructions. Ideas?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tabs or separate pages is my only suggestion.
No matter what, people will use the wrong command for the wrong version, unfortunately.

static_configs:
- targets: [minio.example.net]

To scrape multiple types of metrics, run :mc-cmd:`mc admin prometheus generate --api-version v3 <mc admin prometheus generate --api-version>` for each type and add the ``job_name`` section to the ``scrape_configs`` in your Prometheus configuration.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this is how it works? I found examples in the prom docs that have multiple job_name sections.

Also, do we call parts of yaml files "sections?"

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think they are nodes officially, but that doesn't seem very widely known or user friendly.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, nodes is a conflict with other terms for our product.
So, section works well enough, I think?

Comment on lines +79 to +80
global:
scrape_interval: 60s
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added this global: so if when somebody copies this into their own config there's less a chance they cause a problem with a too-small interval.

Comment on lines +97 to +98
If needed, edit the generated configuration for your environment.
Common changes include:
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a lot of stuff here for an unordered list. But tables suck. Ideas?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we aren't trying to be exhaustive with covering all the possible changes they might make, tabs might work here.


minio_node_drive_latency_us{job-"minio-job"}[5m]
Use a unique value for each job to ensure isolation of the deployment metrics from any others collected by that Prometheus service.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is accurate?

Comment on lines +145 to +155
.. code-block:: shell
:class: copyable

* - ``minio_node_drive_used_bytes``
- Total storage used on a drive.
minio_system_drive_used_bytes{job-"minio-job"}[5m]
minio_system_drive_used_inodes{job-"minio-job"}[5m]

* - ``minio_node_drive_errors_timeout``
- Total number of drive timeout errors since server start.
minio_cluster_usage_buckets_total_bytes{job-"minio-job"}[5m]
minio_cluster_usage_buckets_objects_count{job-"minio-job"}[5m]

* - ``minio_node_drive_errors_availability``
- Total number of drive I/O errors, permission denied and timeouts since server start.
minio_api_requests_total{job-"minio-job"}[5m]
minio_api_requests_errors_total{job-"minio-job"}[5m]
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replaced the original v2 metrics with something that looked likely from v3. Are there better examples?

@@ -349,7 +170,7 @@ You can modify or otherwise use these examples as guidance in building your own
- name: minio-alerts
rules:
- alert: NodesOffline
expr: avg_over_time(minio_cluster_nodes_offline_total{job="minio-job"}[5m]) > 0
expr: avg_over_time(minio_cluster_health_nodes_offline_count{job="minio-job"}[5m]) > 0
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here with what appeared to be a rational v2-v3 swap

@@ -358,7 +179,7 @@ You can modify or otherwise use these examples as guidance in building your own
description: "Node(s) in cluster {{ $labels.instance }} offline for more than 5 minutes"

- alert: DisksOffline
expr: avg_over_time(minio_cluster_drive_offline_total{job="minio-job"}[5m]) > 0
expr: avg_over_time(minio_system_drive_offline_count{job="minio-job"}[5m]) > 0
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here with what appeared to be a rational v2-v3 swap

To enable historical data visualization in MinIO Console, set the following environment variables on each node in the MinIO deployment:

- Set :envvar:`MINIO_PROMETHEUS_URL` to the URL of the Prometheus service
- Set :envvar:`MINIO_PROMETHEUS_JOB_ID` to the unique job ID assigned to the collected metrics

MinIO Grafana Dashboard
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove reference to Grafana entirely in the v3 text. Moved to the v2 page.

@@ -67,6 +70,310 @@ The following sections describe the deprecated endpoints and metrics.
For deployments with a load balancer managing connections between MinIO nodes, specify the address of the load balancer.


Configure Prometheus to Collect and Alert using MinIO Metrics
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not bothering to make the headings sentence case on this deprecated page

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No actually I'll go back and do that. Because Reasons

Comment on lines 54 to 55
[--api_version v3] \
[TYPE --bucket <bucket name> --api_version v3]
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this is a reasonable way to represent optional parameters that themselves have required parameters?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔
I think I'd list them as separate, optional flags. Then in the description of them mention they are required if using TYPE. And in the TYPE description, add that it requires those two flags.

@@ -90,18 +129,97 @@ Global Flags
:end-before: end-minio-mc-globals


Example
-------
Examples
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tried to add examples that would clarify the complexity of invoking two incompatible APIs with the same command. Suggestions welcome.

@@ -145,7 +145,6 @@ server cannot support may result in undesired behavior.
Setting or modifying the default server-side encryption settings does *not*
automatically encrypt or decrypt the existing bucket contents. If the bucket
contents *must* have consistent encryption, use the
:mc:`mc mv` mc with the :mc-cmd:`~mc mv --encrypt` or
:mc-cmd:`~mc mv --encrypt-key` arguments to manually modify the
:mc:`mc mv` mc with the :mc-cmd:`~mc mv --enc-c` argument to manually modify the
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

drive-by build warning fix

Copy link
Collaborator

@djwfyi djwfyi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestions for your consideration.
We should also verify whether v2 really is deprecated. If it isn't, we need to back off that language throughout.


- Set the ``targets`` array with a hostname that resolves to the MinIO deployment.
For example, the following command scrapes all version 3 audit metrics for the MinIO cluster:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tabs or separate pages is my only suggestion.
No matter what, people will use the wrong command for the wrong version, unfortunately.


minio_node_drive_total{job-"minio-job"}[5m]
- Set the ``scheme`` to http for MinIO deployments not using TLS.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since it is the value you're setting, seems like http should also be monotyped.

Generate a v3 cluster metrics config
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Use :mc-cmd:`mc admin prometheus generate --api-version v3` to generate a scrape configuration that collects v3 cluster metrics for a MinIO deployment:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Use :mc-cmd:`mc admin prometheus generate --api-version v3` to generate a scrape configuration that collects v3 cluster metrics for a MinIO deployment:
Use :mc-cmd:`mc admin prometheus generate --api-version v3` to generate a scrape configuration that collects v3 cluster type metrics for a MinIO deployment:

Generate a v3 bucket replication metrics config
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The following example generates a scrape configuration for v3 replication metrics of bucket ``mybucket``:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The following example generates a scrape configuration for v3 replication metrics of bucket ``mybucket``:
The following example generates a scrape configuration for v3 replication type metrics of bucket ``mybucket``:

I was confused by cluster metrics earlier. I don't think the type is as needed here, but if you agree to put it with the cluster metrics, then have to be consistent.

Use :mc-cmd:`mc admin prometheus generate` to generate a scrape configuration that collects bucket metrics for a MinIO deployment:

Generate a default metrics v2 config
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So perhaps consider having headings like:

Examples
----------

v3 Examples
~~~~~~~~~~

v3 example 1
++++++++++


v2 examples
~~~~~~~~~~

etc.

Comment on lines +97 to +98
If needed, edit the generated configuration for your environment.
Common changes include:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we aren't trying to be exhaustive with covering all the possible changes they might make, tabs might work here.

@feorlen feorlen changed the title Relocate additional metrics v3 content Relocate additional metrics v2 content Sep 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants