Relocate additional metrics v2 content #1327

feorlen · 2024-09-19T21:23:05Z

More rearranging to feature metrics v3 information:

Remove metrics v2 and Grafana dashboard pages from the TOC, access by link only.
Replace examples that use v2 metrics with corresponding v3 metrics.
Consolidate remaining references to v2 behavior/metrics on the metrics v2 page.
Update mc admin prometheus generate.
Update mc admin prometheus metrics.

Update: v2 isn't actually deprecated.

This PR "un-deprecates" it (more or less), but it is still demoted.

Note: refs to the new mc admin prometheus parameters don't work yet and cause build warnings. They will likely require a fixup PR after merge, because of intersphinx.

Staged:

Metrics and alerts
http://192.241.195.202:9000/staging/remove-recommended-v2-metrics/linux/operations/monitoring/metrics-and-alerts.html

Prometheus
http://192.241.195.202:9000/staging/remove-recommended-v2-metrics/linux/operations/monitoring/collect-minio-metrics-using-prometheus.html

Deprecated v2
http://192.241.195.202:9000/staging/remove-recommended-v2-metrics/linux/operations/monitoring/metrics-v2-deprecated.html#minio-metrics-v2

Deprecated (?) Grafana dashboards
http://192.241.195.202:9000/staging/remove-recommended-v2-metrics/linux/operations/monitoring/grafana.html

mc admin prometheus generate
http://192.241.195.202:9000/staging/remove-recommended-v2-metrics/linux/reference/minio-mc-admin/mc-admin-prometheus-generate.html

mc admin prometheus metrics
http://192.241.195.202:9000/staging/remove-recommended-v2-metrics/linux/reference/minio-mc-admin/mc-admin-prometheus-metrics.html

source/operations/checklists/software.rst

feorlen · 2024-09-19T21:43:25Z

source/operations/monitoring/collect-minio-metrics-using-prometheus.rst


- Set the ``targets`` array with a hostname that resolves to the MinIO deployment.
+For example, the following command scrapes all version 3 audit metrics for the MinIO cluster:


All of this "for v3" stuff is clumsy, but I'm concerned users will forget which they are using and follow incorrect instructions. Ideas?

Tabs or separate pages is my only suggestion.
No matter what, people will use the wrong command for the wrong version, unfortunately.

feorlen · 2024-09-19T21:45:02Z

source/operations/monitoring/collect-minio-metrics-using-prometheus.rst

+     static_configs:
+     - targets: [minio.example.net]
+
+To scrape multiple types of metrics, run :mc-cmd:`mc admin prometheus generate --api-version v3 <mc admin prometheus generate --api-version>` for each type and add the ``job_name`` section to the ``scrape_configs`` in your Prometheus configuration.


I guess this is how it works? I found examples in the prom docs that have multiple job_name sections.

Also, do we call parts of YAML files "sections"?

I think they are nodes officially, but that doesn't seem very widely known or user friendly.

Also, nodes is a conflict with other terms for our product.
So, section works well enough, I think?
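
For illustration, a Prometheus configuration with more than one ``job_name`` section under ``scrape_configs`` might look like the sketch below. The job names and ``metrics_path`` values are assumptions chosen for the example and are not taken from this PR; the output of ``mc admin prometheus generate`` remains the source for real values, including any authentication settings, which are omitted here.

```yaml
# Sketch only: job names and metrics_path values are illustrative assumptions.
global:
  scrape_interval: 60s

scrape_configs:
  # One job_name section per metric type.
  - job_name: minio-job-api            # hypothetical job name
    metrics_path: /minio/metrics/v3/api/requests   # assumed v3 path
    scheme: https
    static_configs:
      - targets: [minio.example.net]

  - job_name: minio-job-drive          # hypothetical job name
    metrics_path: /minio/metrics/v3/system/drive   # assumed v3 path
    scheme: https
    static_configs:
      - targets: [minio.example.net]
```

Each entry in the ``scrape_configs`` list is independent, so adding a metric type means appending another list item rather than editing an existing one.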

feorlen · 2024-09-19T21:46:28Z

source/operations/monitoring/collect-minio-metrics-using-prometheus.rst

+   global:
+     scrape_interval: 60s


Added this ``global:`` section so that when somebody copies this into their own config, there's less chance they cause a problem with a too-small interval.
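
As a rough sketch of how that default behaves: Prometheus applies the ``global`` ``scrape_interval`` to every job that does not set its own, and a job can still override it. The 30s override and the job name below are assumptions for illustration only.

```yaml
global:
  scrape_interval: 60s          # default for every job without its own interval

scrape_configs:
  - job_name: minio-job         # hypothetical job name
    scrape_interval: 30s        # optional per-job override (assumed value)
    static_configs:
      - targets: [minio.example.net]
```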

feorlen · 2024-09-19T21:47:16Z

source/operations/monitoring/collect-minio-metrics-using-prometheus.rst

+If needed, edit the generated configuration for your environment.
+Common changes include:


There's a lot of stuff here for an unordered list. But tables suck. Ideas?

Since we aren't trying to be exhaustive with covering all the possible changes they might make, tabs might work here.

feorlen · 2024-09-19T21:47:52Z

source/operations/monitoring/collect-minio-metrics-using-prometheus.rst


-         minio_node_drive_latency_us{job-"minio-job"}[5m]
+  Use a unique value for each job to ensure isolation of the deployment metrics from any others collected by that Prometheus service.


I think this is accurate?

feorlen · 2024-09-19T21:49:33Z

source/operations/monitoring/collect-minio-metrics-using-prometheus.rst

+.. code-block:: shell
+   :class: copyable

-         * - ``minio_node_drive_used_bytes``
-           - Total storage used on a drive.
+   minio_system_drive_used_bytes{job="minio-job"}[5m]
+   minio_system_drive_used_inodes{job="minio-job"}[5m]

-         * - ``minio_node_drive_errors_timeout``
-           - Total number of drive timeout errors since server start.
+   minio_cluster_usage_buckets_total_bytes{job="minio-job"}[5m]
+   minio_cluster_usage_buckets_objects_count{job="minio-job"}[5m]

-         * - ``minio_node_drive_errors_availability``
-           - Total number of drive I/O errors, permission denied and timeouts since server start.
+   minio_api_requests_total{job="minio-job"}[5m]
+   minio_api_requests_errors_total{job="minio-job"}[5m]


Replaced the original v2 metrics with something that looked likely from v3. Are there better examples?

feorlen · 2024-09-19T21:50:57Z

source/operations/monitoring/collect-minio-metrics-using-prometheus.rst

@@ -349,7 +170,7 @@ You can modify or otherwise use these examples as guidance in building your own
   - name: minio-alerts
     rules:
     - alert: NodesOffline
-       expr: avg_over_time(minio_cluster_nodes_offline_total{job="minio-job"}[5m]) > 0
+       expr: avg_over_time(minio_cluster_health_nodes_offline_count{job="minio-job"}[5m]) > 0


Same here, with what appeared to be a reasonable v2-to-v3 swap.

feorlen · 2024-09-19T21:51:09Z

source/operations/monitoring/collect-minio-metrics-using-prometheus.rst

@@ -358,7 +179,7 @@ You can modify or otherwise use these examples as guidance in building your own
         description: "Node(s) in cluster {{ $labels.instance }} offline for more than 5 minutes"

     - alert: DisksOffline
-       expr: avg_over_time(minio_cluster_drive_offline_total{job="minio-job"}[5m]) > 0
+       expr: avg_over_time(minio_system_drive_offline_count{job="minio-job"}[5m]) > 0


Same here, with what appeared to be a reasonable v2-to-v3 swap.
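
For context, the two swapped expressions could sit in a rules file roughly as sketched below. The ``expr`` lines and the ``NodesOffline`` description come from the hunks above; the ``for`` durations, ``severity`` label, ``summary`` text, and the ``DisksOffline`` description are assumptions added to make the fragment complete.

```yaml
groups:
  - name: minio-alerts
    rules:
      - alert: NodesOffline
        expr: avg_over_time(minio_cluster_health_nodes_offline_count{job="minio-job"}[5m]) > 0
        for: 10m                      # assumed duration
        labels:
          severity: warn              # assumed label
        annotations:
          summary: "Node down in MinIO deployment"    # assumed text
          description: "Node(s) in cluster {{ $labels.instance }} offline for more than 5 minutes"

      - alert: DisksOffline
        expr: avg_over_time(minio_system_drive_offline_count{job="minio-job"}[5m]) > 0
        for: 10m                      # assumed duration
        labels:
          severity: warn              # assumed label
        annotations:
          summary: "Disks down in MinIO deployment"   # assumed text
          description: "Disk(s) in cluster {{ $labels.instance }} offline for more than 5 minutes"  # assumed text
```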

feorlen · 2024-09-19T21:55:04Z

source/operations/monitoring/metrics-and-alerts.rst

   To enable historical data visualization in MinIO Console, set the following environment variables on each node in the MinIO deployment:

 - Set :envvar:`MINIO_PROMETHEUS_URL` to the URL of the Prometheus service
 - Set :envvar:`MINIO_PROMETHEUS_JOB_ID` to the unique job ID assigned to the collected metrics

-MinIO Grafana Dashboard


Removed the Grafana reference entirely from the v3 text and moved it to the v2 page.

feorlen · 2024-09-19T21:56:42Z

source/operations/monitoring/metrics-v2-deprecated.rst

@@ -67,6 +70,310 @@ The following sections describe the deprecated endpoints and metrics.
      For deployments with a load balancer managing connections between MinIO nodes, specify the address of the load balancer.


+Configure Prometheus to Collect and Alert using MinIO Metrics


Not bothering to make the headings sentence case on this deprecated page.

No, actually, I'll go back and do that. Because Reasons.

feorlen · 2024-09-19T22:00:05Z

source/reference/minio-mc-admin/mc-admin-prometheus-generate.rst

+					   [--api-version v3]                              \
+					   [TYPE --bucket <bucket name> --api-version v3]


Maybe this is a reasonable way to represent optional parameters that themselves have required parameters?

🤔
I think I'd list them as separate, optional flags. Then in the description of them mention they are required if using TYPE. And in the TYPE description, add that it requires those two flags.

feorlen · 2024-09-19T22:03:10Z

source/reference/minio-mc-admin/mc-admin-prometheus-generate.rst

@@ -90,18 +129,97 @@ Global Flags
   :end-before: end-minio-mc-globals


-Example
-------
+Examples


Tried to add examples that would clarify the complexity of invoking two incompatible APIs with the same command. Suggestions welcome.

feorlen · 2024-09-19T22:07:06Z

source/reference/minio-mc/mc-encrypt-set.rst

@@ -145,7 +145,6 @@ server cannot support may result in undesired behavior.
 Setting or modifying the default server-side encryption settings does *not*
 automatically encrypt or decrypt the existing bucket contents. If the bucket
 contents *must* have consistent encryption, use the
-:mc:`mc mv` mc with the :mc-cmd:`~mc mv --encrypt` or
-:mc-cmd:`~mc mv --encrypt-key` arguments to manually modify the
+:mc:`mc mv` mc with the :mc-cmd:`~mc mv --enc-c` argument to manually modify the


drive-by build warning fix

djwfyi

Suggestions for your consideration.
We should also verify whether v2 really is deprecated. If it isn't, we need to back off that language throughout.

djwfyi · 2024-09-20T15:49:44Z

source/operations/monitoring/collect-minio-metrics-using-prometheus.rst


- Set the ``targets`` array with a hostname that resolves to the MinIO deployment.
+For example, the following command scrapes all version 3 audit metrics for the MinIO cluster:


Tabs or separate pages is my only suggestion.
No matter what, people will use the wrong command for the wrong version, unfortunately.

djwfyi · 2024-09-20T16:25:15Z

source/operations/monitoring/collect-minio-metrics-using-prometheus.rst


-         minio_node_drive_total{job-"minio-job"}[5m]
+- Set the ``scheme`` to http for MinIO deployments not using TLS.


Since it is the value you're setting, it seems like ``http`` should also be monotyped.

djwfyi · 2024-09-20T16:56:02Z

source/reference/minio-mc-admin/mc-admin-prometheus-generate.rst

+Generate a v3 cluster metrics config
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Use :mc-cmd:`mc admin prometheus generate --api-version v3` to generate a scrape configuration that collects v3 cluster metrics for a MinIO deployment:


Suggested change

Use :mc-cmd:`mc admin prometheus generate --api-version v3` to generate a scrape configuration that collects v3 cluster metrics for a MinIO deployment:

Use :mc-cmd:`mc admin prometheus generate --api-version v3` to generate a scrape configuration that collects v3 cluster type metrics for a MinIO deployment:

djwfyi · 2024-09-20T17:01:26Z

source/reference/minio-mc-admin/mc-admin-prometheus-generate.rst

+Generate a v3 bucket replication metrics config
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The following example generates a scrape configuration for v3 replication metrics of bucket ``mybucket``:


Suggested change

The following example generates a scrape configuration for v3 replication metrics of bucket ``mybucket``:

The following example generates a scrape configuration for v3 replication type metrics of bucket ``mybucket``:

I was confused by "cluster metrics" earlier. I don't think "type" is as necessary here, but if you agree to add it to the cluster metrics example, then this one has to be consistent.

source/reference/minio-mc-admin/mc-admin-prometheus-generate.rst

djwfyi · 2024-09-20T17:08:24Z

source/reference/minio-mc-admin/mc-admin-prometheus-generate.rst

-Use :mc-cmd:`mc admin prometheus generate` to generate a scrape configuration that collects bucket metrics for a MinIO deployment:
+
+Generate a default metrics v2 config
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


So perhaps consider having headings like:

   Examples
   --------

   v3 Examples
   ~~~~~~~~~~~

   v3 example 1
   ++++++++++++

   v2 examples
   ~~~~~~~~~~~

etc.

djwfyi · 2024-09-20T17:12:19Z

source/operations/monitoring/collect-minio-metrics-using-prometheus.rst

+If needed, edit the generated configuration for your environment.
+Common changes include:


Since we aren't trying to be exhaustive with covering all the possible changes they might make, tabs might work here.

feorlen added 3 commits September 13, 2024 17:02

draft: more metrics v2/v3 rework

3fd1af0

fix random build warning

1ff838a

rearrange more metrics v2 content

2e38434

feorlen marked this pull request as draft September 19, 2024 21:23

partially address feedback, plus other edits

e6c442e

feorlen changed the title ~~Relocate additional metrics v3 content~~ Relocate additional metrics v2 content Sep 20, 2024

partial edits

a5bfd27
