Propose release 1.5.4 (#534)
* Fix xrefs for director Operator (#481)

Fix the xrefs for the director Operator. The xrefs were referring to the
filename instead of the id+assembly value.

* Initial pass for external ES (#483)

* Initial pass for external ES

* Updates for external ES

* Notice about deprecated behaviour
* Mention how the migration works (automatically)
* Adjust comments about observabilityStrategy: none
** Events SGs will now deploy if events are enabled
** Adjusted relevant outputs

* Apply suggestions from code review

Co-authored-by: Leif Madsen <[email protected]>

* Minor adjustments from review

---------

Co-authored-by: Leif Madsen <[email protected]>

* Trivial leftover suggestions (#485)

* Trivial leftover suggestions

* Link ES section to KB article (#486)

* Link ES section to KB article

* Update doc-Service-Telemetry-Framework/modules/con_primary-parameters-of-the-servicetelemetry-object.adoc

---------

Co-authored-by: Leif Madsen <[email protected]>

* Initial changes to installation for STF 1.5.3 (#484)

* Initial changes to installation for STF 1.5.3

Make the initial changes to the installation documentation for STF
1.5.3, which uses observabilityStrategy: use_redhat by default along
with preferring to install Observability Operator. Uses the community
operators catalogsource for now until OBO is officially available from
redhat-operators CatalogSource.

Updates the Makefile as well to include Red Hat OpenStack Platform 17.1.

Signed-off-by: Leif Madsen <[email protected]>

* Update install guide for pre-installed Operators

Update the installation guide layout for pre-installed Operators that
cannot be managed with OLM (due to them being cluster-scoped Operators
vs namespace-scoped Operators).

Resolves: STF-1485
Signed-off-by: Leif Madsen <[email protected]>

* Update doc-Service-Telemetry-Framework/modules/proc_deploying-stf-to-the-openshift-environment.adoc

Co-authored-by: mickogeary <[email protected]>

* Update doc-Service-Telemetry-Framework/modules/proc_deploying-stf-to-the-openshift-environment.adoc

Co-authored-by: mickogeary <[email protected]>

* Update doc-Service-Telemetry-Framework/modules/proc_deploying-stf-to-the-openshift-environment.adoc

Co-authored-by: mickogeary <[email protected]>

* Update doc-Service-Telemetry-Framework/modules/proc_deploying-stf-to-the-openshift-environment.adoc

Co-authored-by: mickogeary <[email protected]>

* Adjust wording for cert-manager installation module

* Update doc-Service-Telemetry-Framework/modules/proc_deploying-stf-to-the-openshift-environment.adoc

Co-authored-by: mickogeary <[email protected]>

* Reword section that repeats itself

---------

Signed-off-by: Leif Madsen <[email protected]>
Co-authored-by: mickogeary <[email protected]>

* use_redhat and migration link (#462)

* use_redhat and migration link

* Update doc-Service-Telemetry-Framework/modules/con_observability-strategy.adoc

Co-authored-by: Leif Madsen <[email protected]>

* Update doc-Service-Telemetry-Framework/modules/con_observability-strategy.adoc

Co-authored-by: Leif Madsen <[email protected]>

* Update doc-Service-Telemetry-Framework/modules/con_observability-strategy.adoc

Co-authored-by: Leif Madsen <[email protected]>

* Apply suggestions from code review

Co-authored-by: Leif Madsen <[email protected]>

* Minor typo fix

* Visual tweak

* Update doc-Service-Telemetry-Framework/modules/con_observability-strategy.adoc

Co-authored-by: Leif Madsen <[email protected]>

* Update doc-Service-Telemetry-Framework/modules/con_observability-strategy.adoc

Co-authored-by: Leif Madsen <[email protected]>

---------

Co-authored-by: Leif Madsen <[email protected]>

* Override qdr::router_id defaults in stf-connectors (#487)

Update the documentation to provide an override to the FQDN in the
qdr::router_id configuration to avoid hostnames longer than 61 chars.

Closes rhbz#2208020

* Don't enable event collection by default on OSP (#488)

* Don't enable event collection by default on OSP

Closes STF-1498

* Remove events configurations and use defaults

The events pipeline and Ceilometer QDR event publishing are disabled by
default and do not need to be called out specifically.

* No longer import the events dashboard (#490)

With a refocus on telemetry by default and without event usage, remove
the event dashboards as an event data store is optional and no longer
included by default.

Related STF-1498

* Installation of cluster monitoring is no longer necessary (#491)

Installation of cluster monitoring in CRC (and elsewhere) is no longer necessary for installation of STF.

Resolved by #465

* Adjust the default polling interval for collectd (#489)

Adjust the collectd polling interval to be 30 seconds instead of 5
seconds.

Related STF-1512

Co-authored-by: Victoria Martinez de la Cruz <[email protected]>

* Remove logs configuration from sample CR (#493)

Related STF-1504

* mg_master_RHOSPDOC-1380_chunk-installation-procedure (#492)

* mg_master_RHOSPDOC-1380_chunk-installation-procedure

* mg_master_RHOSPDOC-1380_chunk-installation-procedure 2nd commit with further modularisation and chunkage

* Commit 3: renaming proc_deploying-stf-to-the-openshift-environment.adoc to con_deploying-stf-to-the-openshift-environment.adoc

* Reduce the number of Ceilometer pollsters (#497)

Reduce the number of Ceilometer pollsters to only those used by the
sample STF dashboards.

Closes: rhbz#2239390

* Deprecate the use of high availability mode in STF (#494)

* Deprecate the use of high availability mode in STF

Resolves STF-1507

* Update doc-Service-Telemetry-Framework/modules/con_high-availability.adoc

Co-authored-by: mickogeary <[email protected]>

---------

Co-authored-by: mickogeary <[email protected]>

* Fix up the table syntax in Observability Strategy (#495)

The existing table was in markdown format which isn't compatible with asciidoc syntax.

* Do not manage the event pipeline by default (#498)

We do not want events to be sent to QDR by default, as the STF 1.5.3
default configuration will deploy telemetry only

Related STF-1498

* Minor clean up and user experience updates (#496)

Some minor clean up items and convert some commands to be a bit more
user friendly and generic

Resolves STF-1533

* Creating an alert does not use curl (#500)

The Creating a standard alert route in Alertmanager section no longer
uses curl to verify the configuration was loaded, since it uses the
prometheus pod and the wget command instead. Removes an extra procedure
step that is no longer applicable.

* Eliminate duplicate line (#501)

* Adding details for QDR password auth (#502)

* Adding details for QDR password auth

* Move note about disabling auth to main section

* Update doc-Service-Telemetry-Framework/modules/proc_retrieving-the-qdr-password.adoc

Co-authored-by: Leif Madsen <[email protected]>

* Update doc-Service-Telemetry-Framework/modules/proc_retrieving-the-qdr-password.adoc

Co-authored-by: mickogeary <[email protected]>

---------

Co-authored-by: Leif Madsen <[email protected]>
Co-authored-by: mickogeary <[email protected]>

* Support OCP versions 4.12 through 4.14 (#503)

* Support OCP versions 4.12 through 4.14

Update the stf-attributes to cover OCP 4.12 through 4.14 as our default,
as OCP 4.10 is EOL. Update the Makefile for building to only cover RHOSP
17.1 and 16.2.

* Need html-latest for upstream publish script

* Summary: Replace incorrect stf-connectors.yaml filename with enable-stf.yaml (#504)

Tracker: https://bugzilla.redhat.com/show_bug.cgi?id=2239871

Branch: master-BZ-2239871

* Clean up the STF install (#505)

* Clean up the STF install for OCP 4.12 and later

Clean up the STF installation documentation along with a command that
will wait for the STO CSV to be ready and then automatically show the
dependencies.

Also hide contents that are no longer applicable when the supported base
version of OCP is greater than 4.10, since OCP 4.10 is now EOL.

* Adjust the ifeval to be < 4.12

* Provide the preferred STF object for deployment (#507)

Provide the preferred ServiceTelemetry object for deployments rather
than asking the administrator to build a configuration. The provided
object will result in a metrics-focused deployment without extra
configuration options, which will be a foundation for disconnected
installations in the future.

* Fix various RHOSP links and versions (#508)

Fix various links to RHOSP documentation as the paths are different between RHOSP 16.2 and 17.1. Guides were updated but there is no auto-redirect, so we'll need to verify every link that uses defaultURL parameter. This covers the initial ones while working through documentation.

Update some older version links and add a new parameter for 17.1 paths specifically.

* Update and adjust dashboard procedures (#509)

Update and adjust the dashboard installation procedures based on
testing.

* Add deprecation note for Grafana authentication (#510)

Deprecate the basic auth login parameters for Grafana login. Preference
is to use the Log in for OpenShift button going forward.

Fix syntax issues in asciidoc.

* Update deprecated Grafana login warning (#511)

Update the Grafana login deprecation warning with wording from the
documentation team.

* Add updated architecture diagrams (#499)

* Add updated architecture diagrams

* Use updated architecture diagrams

* Update architecture overview to focus on metrics

Update the architecture overview to make it clear STF is focused on
delivery of metrics from RHOSP. Provide information about use_redhat
observability strategy, and note that prior versions of STF would manage
Elasticsearch instances. Note that new installations use the
observability strategy of use_redhat, and that the guide will focus on
that deployment model. Provide a placeholder for a new xref that would
guide the user towards the deprecated architecture using events, where
our updated metrics-and-events architecture diagram would live.

* Link to observability strategy

* Remove community components from core overview

* Use ObservabilityOperator parameter to refer to OBO/COO

* Update doc-Service-Telemetry-Framework/modules/con_stf-architecture.adoc

Co-authored-by: mickogeary <[email protected]>

* Update doc-Service-Telemetry-Framework/modules/con_stf-architecture.adoc

Co-authored-by: mickogeary <[email protected]>

* Update doc-Service-Telemetry-Framework/modules/con_stf-architecture.adoc

Co-authored-by: mickogeary <[email protected]>

* Update doc-Service-Telemetry-Framework/modules/con_stf-architecture.adoc

Co-authored-by: mickogeary <[email protected]>

* Update doc-Service-Telemetry-Framework/modules/con_stf-architecture.adoc

* Update doc-Service-Telemetry-Framework/modules/con_stf-architecture.adoc

Co-authored-by: mickogeary <[email protected]>

* Update doc-Service-Telemetry-Framework/modules/con_stf-architecture.adoc

Co-authored-by: mickogeary <[email protected]>

---------

Co-authored-by: mickogeary <[email protected]>

* Update install guide for dependent operators (#513)

* Update install guide for dependent operators

Update the installation guide for dependent operators. Adds installation
instructions for Cluster Observability Operator and cert-manager for Red
Hat OpenShift using the latest channels available for those Operators.
The result is that deployment of observabilityStrategy: use_redhat is
now possible as the default installation method.

Related: STF-1636

* Update doc-Service-Telemetry-Framework/modules/proc_deploying-certificate-manager-for-openshift-operator.adoc

Co-authored-by: Chris Sibbitt <[email protected]>

* Add prerequisites to STF deployment

* Adjust wording based on editorial feedback

---------

Co-authored-by: Chris Sibbitt <[email protected]>

* Clean up the prerequisites lists (#514)

The prerequisite lists were slightly wrong and have been adjusted for
correctness. Minor update of output in the same area to match latest
version of STF.

* Add removal instructions for COO (#516)

* Add removal instructions for COO

Add removal instructions for Cluster Observability Operator, pointing at
the existing product documentation.

Closes: STF-1643

* Update based on editor feedback

* Refer to cert-manager removal documentation (#515)

* Refer to cert-manager removal documentation

Update the STF removal guide to refer to the cert-manager uninstallation
procedure which is maintained by that team.

Closes: STF-1642

* Adjust cert-manager removal after editor review

* Pre-STF 1.5.3 Documentation Walkthrough and Cleanup (#517)

* Documentation walk-through and clean up

* Update architecture documentation, creating a new section describing
  the architecture changes in STF 1.5.3
* Update style for knowledge base article references based on editorial
  feedback

* Add links to COO and cert-manager

* Update cert-manager install to use oc wait

* Multi-Cloud: Add warning about unique domains

* HA: Move warning to top

* Params: Add warning about HA deprecation

* Obs Strat: Add link to migration KBA

* Multi-Cloud: Remove reference to Ansible-based deployments

* Dashboard: Fix links to collectd plugins

Add wrappers to the collectd plugins in the Dashboard guide because paths changed between 16.2 and 17.1.

* Update wording for CloudDomain overview

Update the wording in the CloudDomain overview since router connections
are controlled with router_id parameters now.

* Modularize STF architecture changes (#518)

* Update diagrams for Cluster Observability Operator (#519)

* mg_master_517_minor-style-edits (#521)

* mg_master_517_minor-style-edits

* Update doc-Service-Telemetry-Framework/modules/con_primary-parameters-of-the-servicetelemetry-object.adoc

---------

Co-authored-by: Leif Madsen <[email protected]>

* Reference 17.1 in docinfo.xml (#522)

* PrometheusRules must reference monitoring.rhobs (#523)

The PrometheusRules and editing must all reference the new
monitoring.rhobs CRD vs the old monitoring.coreos.com CRD which was
provided by the community Prometheus Operator (and potentially
conflicted with user-workload monitoring, and openshift-monitoring). All
references to PrometheusRules now refer to the monitoring.rhobs CRD and
any CLI commands are expanded for the full CRD path.

* Basic Auth in Grafana no longer supported (#525)

* Adjust prometheus query to use token (#520)

* Adjust prometheus query to use token

* Add section for prometheus token handling

* Correction for RBAC changes

* Add link to OCP token secret docs

* Specifics about UI perms

* Update doc-Service-Telemetry-Framework/modules/proc_connecting-an-external-dashboard-system.adoc

Co-authored-by: Leif Madsen <[email protected]>

* Update doc-Service-Telemetry-Framework/modules/proc_connecting-an-external-dashboard-system.adoc

Co-authored-by: Leif Madsen <[email protected]>

* Update doc-Service-Telemetry-Framework/modules/proc_connecting-an-external-dashboard-system.adoc

Co-authored-by: Leif Madsen <[email protected]>

* Update doc-Service-Telemetry-Framework/modules/proc_connecting-an-external-dashboard-system.adoc

Co-authored-by: Leif Madsen <[email protected]>

* Update doc-Service-Telemetry-Framework/modules/proc_accessing-uis-for-stf-components.adoc

Co-authored-by: mickogeary <[email protected]>

* Update doc-Service-Telemetry-Framework/modules/proc_accessing-uis-for-stf-components.adoc

Co-authored-by: mickogeary <[email protected]>

---------

Co-authored-by: Leif Madsen <[email protected]>
Co-authored-by: mickogeary <[email protected]>

* Update installation to target Grafana Operator v5 (#526)

Update the dashboarding installation procedures to target Grafana
Operator v5 by default.

Resolves: JIRA#STF-1680

* Add enable dashboard procedure (#527)

* Add enable dashboard procedure

Update the import dashboards procedure to be enable dashboards procedure
now that STF has the ability to manage the dashboards which were
formerly imported via URL.

Also includes some minor procedure updates in related areas that were
referenced in dashboard documentation.

Resolves: JIRA#STF-1624

* Update doc-Service-Telemetry-Framework/modules/proc_importing-dashboards.adoc

Co-authored-by: mickogeary <[email protected]>

* Update doc-Service-Telemetry-Framework/modules/proc_importing-dashboards.adoc

Co-authored-by: mickogeary <[email protected]>

* Update doc-Service-Telemetry-Framework/modules/proc_importing-dashboards.adoc

Co-authored-by: mickogeary <[email protected]>

* Update doc-Service-Telemetry-Framework/modules/proc_importing-dashboards.adoc

Co-authored-by: mickogeary <[email protected]>

* Update doc-Service-Telemetry-Framework/modules/proc_importing-dashboards.adoc

Co-authored-by: mickogeary <[email protected]>

* Update doc-Service-Telemetry-Framework/modules/proc_importing-dashboards.adoc

Co-authored-by: mickogeary <[email protected]>

---------

Co-authored-by: mickogeary <[email protected]>

* Update OCP version support status (#529)

Update the version support status to specifically say that STF is
supported on OCP EUS releases. While the STF bundles are generated for a
range of releases, this is to support the ability of customers to
upgrade OCP clusters between EUS releases without needing to remove STF
first. Only minor testing is done against standard lifecycle releases of
OCP (odd-numbered minor releases).

* Update required resource permission reference (#528)

Update the required resource permission reference to use the Grafana
Operator v5 group.

* Drop unused module found in other issue (#533)

* mg-master_RHOSPDOC-1200_STF-disconnected (#531)

* mg-master_RHOSPDOC-1200_STF-disconnected

* added more info about mirror types and verification

* 3rd commit

* another commit from feedback. Added xref and removed openshiftshort as well as a few other changes

* another push to fix broken xref

* Update doc-Service-Telemetry-Framework/modules/proc_deploying-stf-on-openshift-disconnected-environments.adoc

* fix mentions of OCP

* edits based on SME feedback

* more edits based on SME feedback

* Minor syntax clean up

* Update doc-Service-Telemetry-Framework/assemblies/assembly_preparing-your-ocp-environment-for-stf.adoc

---------

Co-authored-by: Leif Madsen <[email protected]>

---------

Signed-off-by: Leif Madsen <[email protected]>
Co-authored-by: Leif Madsen <[email protected]>
Co-authored-by: Chris Sibbitt <[email protected]>
Co-authored-by: mickogeary <[email protected]>
Co-authored-by: Victoria Martinez de la Cruz <[email protected]>
Co-authored-by: Roger Heslop <[email protected]>
6 people committed Mar 5, 2024
1 parent a33d4ac commit dfbaa7a
Showing 20 changed files with 248 additions and 215 deletions.
2 changes: 0 additions & 2 deletions common/global/stf-attributes.adoc
Original file line number Diff line number Diff line change
@@ -39,7 +39,6 @@ endif::[]
ifeval::["{build}" == "upstream"]
:ObservabilityOperator: Observability{nbsp}Operator
:OpenShift: OpenShift
:OpenShiftShort: OKD
:OpenStack: OpenStack
:OpenStackShort: OSP
:OpenStackVersion: Wallaby
@@ -58,7 +57,6 @@ endif::[]
ifeval::["{build}" == "downstream"]
:ObservabilityOperator: Cluster{nbsp}Observability{nbsp}Operator
:OpenShift: Red{nbsp}Hat{nbsp}OpenShift{nbsp}Container{nbsp}Platform
:OpenShiftShort: OCP
:OpenStack: Red{nbsp}Hat{nbsp}OpenStack{nbsp}Platform
:OpenStackShort: RHOSP
:OpenStackVersion: 17.1
Original file line number Diff line number Diff line change
@@ -18,17 +18,11 @@ ifdef::include_when_16[]
* xref:container-health-and-api-status_assembly-advanced-features[Monitoring container health and API status]
endif::include_when_16[]


//Dashboards
include::../modules/con_dashboards.adoc[leveloffset=+1]
include::../modules/proc_setting-up-grafana-to-host-the-dashboard.adoc[leveloffset=+2]
ifdef::include_when_16[]
// TODO: either rewrite or drop this procedure. We now provide the preferred downstream RHEL Grafana workload image in the deployment procedure.
//include::../modules/proc_overriding-the-default-grafana-container-image.adoc[leveloffset=+2]
include::../modules/proc_importing-dashboards.adoc[leveloffset=+2]
endif::include_when_16[]
include::../modules/proc_retrieving-and-setting-grafana-login-credentials.adoc[leveloffset=+2]

include::../modules/proc_connecting-an-external-dashboard-system.adoc[leveloffset=+2]

//Editing the metrics retention time period
include::../modules/con_metrics-retention-time-period.adoc[leveloffset=+1]
@@ -69,13 +63,10 @@ include::../modules/con_resource-usage-of-openstack.adoc[leveloffset=+1]
include::../modules/proc_disabling-resource-usage-monitoring-of-openstack-services.adoc[leveloffset=+2]

//Monitoring container health

include::../modules/con_container-health-and-api-status.adoc[leveloffset=+1]
include::../modules/proc_disabling-container-health-and-api-status-monitoring.adoc[leveloffset=+2]
endif::include_when_16[]



//reset the context
ifdef::parent-context[:context: {parent-context}]
ifndef::parent-context[:!context:]
Original file line number Diff line number Diff line change
@@ -21,18 +21,18 @@ ifeval::["{SupportedOpenShiftVersion}" == "{NextSupportedOpenShiftVersion}"]
* {OpenShift} version {SupportedOpenShiftVersion} is running.
endif::[]
ifeval::["{SupportedOpenShiftVersion}" != "{NextSupportedOpenShiftVersion}"]
* An {OpenShift} version inclusive of {SupportedOpenShiftVersion} through {NextSupportedOpenShiftVersion} is running.
* An {OpenShift} Extended Update Support (EUS) release version {SupportedOpenShiftVersion} or {NextSupportedOpenShiftVersion} is running.
endif::[]
* You have prepared your {OpenShift} environment and ensured that there is persistent storage and enough resources to run the {ProjectShort} components on top of the {OpenShift} environment. For more information about {ProjectShort} performance, see the Red Hat Knowledge Base article https://access.redhat.com/articles/4907241[Service Telemetry Framework Performance and Scaling].
* Your environment is fully connected. {ProjectShort} does not work in a {OpenShift}-disconnected environments or network proxy environments.
* You have deployed {ProjectShort} in a fully connected or {OpenShift}-disconnected environment. {ProjectShort} is unavailable in network proxy environments.

ifeval::["{build}" == "downstream"]
[IMPORTANT]
ifeval::["{SupportedOpenShiftVersion}" == "{NextSupportedOpenShiftVersion}"]
{ProjectShort} is compatible with {OpenShift} version {SupportedOpenShiftVersion}
endif::[]
ifeval::["{SupportedOpenShiftVersion}" != "{NextSupportedOpenShiftVersion}"]
{ProjectShort} is compatible with {OpenShift} version {SupportedOpenShiftVersion} through {NextSupportedOpenShiftVersion}.
{ProjectShort} is compatible with {OpenShift} versions {SupportedOpenShiftVersion} and {NextSupportedOpenShiftVersion}.
endif::[]
endif::[]

@@ -42,6 +42,7 @@ endif::[]
* For more information about Operator catalogs, see https://docs.openshift.com/container-platform/{NextSupportedOpenShiftVersion}/operators/understanding/olm-rh-catalogs.html[_Red Hat-provided Operator catalogs_].
* For more information about the cert-manager Operator for Red Hat, see https://docs.openshift.com/container-platform/{NextSupportedOpenShiftVersion}/security/cert_manager_operator/index.html[_cert-manager Operator for Red Hat OpenShift overview_].
* For more information about {ObservabilityOperator}, see https://docs.openshift.com/container-platform/{NextSupportedOpenShiftVersion}/monitoring/cluster_observability_operator/cluster-observability-operator-overview.html[_Cluster Observability Operator Overview_].
* For more information about OpenShift life cycle policy and Extended Update Support (EUS), see https://access.redhat.com/support/policy/updates/openshift[_Red Hat OpenShift Container Platform Life Cycle Policy_].

include::../modules/con_deploying-stf-to-the-openshift-environment.adoc[leveloffset=+1]

Original file line number Diff line number Diff line change
@@ -31,7 +31,7 @@ ifeval::["{SupportedOpenShiftVersion}" == "{NextSupportedOpenShiftVersion}"]
{ProjectShort} is compatible with {OpenShift} version {SupportedOpenShiftVersion}
endif::[]
ifeval::["{SupportedOpenShiftVersion}" != "{NextSupportedOpenShiftVersion}"]
{ProjectShort} is compatible with {OpenShift} version {SupportedOpenShiftVersion} through {NextSupportedOpenShiftVersion}.
{ProjectShort} is compatible with {OpenShift} Extended Update Support (EUS) release versions {SupportedOpenShiftVersion} and {NextSupportedOpenShiftVersion}.
endif::[]
endif::[]

@@ -40,6 +40,7 @@ endif::[]
* https://access.redhat.com/documentation/en-us/openshift_container_platform/{NextSupportedOpenShiftVersion}/[{OpenShift} product documentation]
* https://access.redhat.com/articles/4907241[Service Telemetry Framework Performance and Scaling]
* https://docs.openshift.com/container-platform/{NextSupportedOpenShiftVersion}/welcome/index.html#cluster-installer-activities[OpenShift Container Platform {NextSupportedOpenShiftVersion} Documentation]
* https://access.redhat.com/support/policy/updates/openshift[Red Hat OpenShift Container Platform Life Cycle Policy]



Original file line number Diff line number Diff line change
@@ -10,7 +10,6 @@ To prepare your {OpenShift} environment for {Project} ({ProjectShort}), you must

* Ensure that you have persistent storage available in your {OpenShift} cluster for a production-grade deployment. For more information, see <<persistent-volumes_assembly-preparing-your-ocp-environment-for-stf>>.
* Ensure that enough resources are available to run the Operators and the application containers. For more information, see <<resource-allocation_assembly-preparing-your-ocp-environment-for-stf>>.
* Ensure that you have a fully connected network environment. For more information, see xref:con-network-considerations-for-service-telemetry-framework_assembly-preparing-your-ocp-environment-for-stf[].

include::../modules/con_observability-strategy.adoc[leveloffset=+1]
include::../modules/con_persistent-volumes.adoc[leveloffset=+1]
Original file line number Diff line number Diff line change
@@ -7,7 +7,6 @@ Use the third-party application, Grafana, to visualize system-level metrics that
For more information about configuring data collectors, see xref:configuring-red-hat-openstack-platform-overcloud-for-stf_assembly-completing-the-stf-configuration[].

ifdef::include_when_16[]
//TODO: can re-work this once we have OSP13 dashboard(s) to show. Can't use container health checks or monitoring in OSP13.
You can use dashboards to monitor a cloud:

Infrastructure dashboard::
Original file line number Diff line number Diff line change
@@ -26,7 +26,7 @@
= Customizing the deployment

[role="_abstract"]
The Service Telemetry Operator watches for a `ServiceTelemetry` manifest to load into {OpenShift} ({OpenShiftShort}). The Operator then creates other objects in memory, which results in the dependent Operators creating the workloads they are responsible for managing.
The Service Telemetry Operator watches for a `ServiceTelemetry` manifest to load into {OpenShift}. The Operator then creates other objects in memory, which results in the dependent Operators creating the workloads they are responsible for managing.

[WARNING]
====
Original file line number Diff line number Diff line change
@@ -3,4 +3,4 @@
[id="con-network-considerations-for-service-telemetry-framework_{context}"]
= Network considerations for Service Telemetry Framework

You can only deploy {Project} ({ProjectShort}) in a fully connected network environment. You cannot deploy {ProjectShort} in {OpenShift}-disconnected environments or network proxy environments.
You can deploy {Project} ({ProjectShort}) in fully connected network environments or in {OpenShift}-disconnected environments. You cannot deploy {ProjectShort} in network proxy environments.
Original file line number Diff line number Diff line change
@@ -87,10 +87,12 @@ ifeval::["{SupportedOpenShiftVersion}" == "{NextSupportedOpenShiftVersion}"]
* {OpenShift} {SupportedOpenShiftVersion}
endif::[]
ifeval::["{SupportedOpenShiftVersion}" != "{NextSupportedOpenShiftVersion}"]
* {OpenShift} {SupportedOpenShiftVersion} through {NextSupportedOpenShiftVersion}
* {OpenShift} Extended Update Support (EUS) releases {SupportedOpenShiftVersion} and {NextSupportedOpenShiftVersion}
endif::[]
* Infrastructure platform

For more information about the {OpenShift} EUS releases, see link:https://access.redhat.com/support/policy/updates/openshift[Red Hat OpenShift Container Platform Life Cycle Policy].

[[osp-stf-server-side-monitoring]]
.Server-side STF monitoring infrastructure
image::363_OpenStack_STF_updates_0923_deployment_prereq.png[Server-side STF monitoring infrastructure]
Original file line number Diff line number Diff line change
@@ -4,6 +4,6 @@
[role="_abstract"]
Red Hat supports the core Operators and workloads, including {MessageBus}, {ObservabilityOperator} (Prometheus, Alertmanager), Service Telemetry Operator, and Smart Gateway Operator. Red Hat does not support the community Operators or workload components, inclusive of Elasticsearch, Grafana, and their Operators.

You can only deploy {ProjectShort} in a fully connected network environment. You cannot deploy {ProjectShort} in {OpenShift}-disconnected environments or network proxy environments.
You can deploy {Project} ({ProjectShort}) in fully connected network environments or in {OpenShift}-disconnected environments. You cannot deploy {ProjectShort} in network proxy environments.

For more information about {ProjectShort} life cycle and support status, see the https://access.redhat.com/node/6225361[{Project} Supported Version Matrix].
Original file line number Diff line number Diff line change
@@ -4,7 +4,18 @@
[role="_abstract"]
In {OpenShift}, applications are exposed to the external network through a route. For more information about routes, see https://docs.openshift.com/container-platform/{NextSupportedOpenShiftVersion}/networking/configuring_ingress_cluster_traffic/overview-traffic.html[Configuring ingress cluster traffic].

In {Project} ({ProjectShort}), HTTPS routes are exposed for each service that has a web-based interface. These routes are protected by {OpenShift} RBAC and any user that has a `ClusterRoleBinding` that enables them to view {OpenShift} Namespaces can log in. For more information about RBAC, see https://docs.openshift.com/container-platform/{NextSupportedOpenShiftVersion}/authentication/using-rbac.html[Using RBAC to define and apply permissions].
In {Project} ({ProjectShort}), HTTPS routes are exposed for each service that has a web-based interface, and the routes are protected by {OpenShift} role-based access control (RBAC).

You need the following permissions to access the corresponding component UIs:

[source,json,options="nowrap"]
----
{"namespace":"service-telemetry", "resource":"grafana", "group":"grafana.integreatly.org", "verb":"get"}
{"namespace":"service-telemetry", "resource":"prometheus", "group":"monitoring.rhobs", "verb":"get"}
{"namespace":"service-telemetry", "resource":"alertmanager", "group":"monitoring.rhobs", "verb":"get"}
----

For more information about RBAC, see https://docs.openshift.com/container-platform/{NextSupportedOpenShiftVersion}/authentication/using-rbac.html[Using RBAC to define and apply permissions].
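
As an illustration only, the permissions listed above could be checked from the command line. The following sketch prints one `oc auth can-i` command per entry instead of running it, so it works without a cluster. Using `oc auth can-i` for this check, the `resource.group` argument form, and the variable names are assumptions and are not part of the {ProjectShort} procedure.

```shell
# Hypothetical helper: build (but do not run) one access check per permission.
namespace="service-telemetry"
checks=""

# Each input line is "<resource> <API group>", copied from the permission list.
while read -r resource group; do
  checks="${checks}oc auth can-i get ${resource}.${group} --namespace ${namespace}
"
done <<'EOF'
grafana grafana.integreatly.org
prometheus monitoring.rhobs
alertmanager monitoring.rhobs
EOF

# Review the generated commands, then run them while logged in as the user to check.
printf '%s' "$checks"
```

Each generated command is expected to answer `yes` or `no` when it is run against a cluster by a logged-in user.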

.Procedure

Original file line number Diff line number Diff line change
@@ -24,7 +24,7 @@ EOF
+
[source,bash]
----
$ for o in alertmanager/default prometheus/default elasticsearch/elasticsearch grafana/default; do oc delete $o; done
$ for o in alertmanagers.monitoring.rhobs/default prometheuses.monitoring.rhobs/default elasticsearch/elasticsearch grafana/default-grafana; do oc delete $o; done
----
+
. To verify that all workloads are operating correctly, view the pods and the status of each pod:
@@ -0,0 +1,77 @@

[id="connecting-an-external-dashboard-system_{context}"]
= Connecting an external dashboard system

You can configure third-party visualization tools to connect to the {ProjectShort} Prometheus for metrics retrieval. Access is controlled by an OAuth token, and a ServiceAccount that has only the required permissions is already created. You can generate a new OAuth token against this account for the external system to use.

To use the authentication token, you must configure the third-party tool to supply an HTTP Bearer Token Authorization header as described in RFC 6750. For information about how to configure this header, see the documentation of the third-party tool. For example, see link:https://grafana.com/docs/grafana/latest/datasources/prometheus/configure-prometheus-data-source/#custom-http-headers[Configure Prometheus - Custom HTTP Headers] in the _Grafana Documentation_.
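
As a minimal sketch of the header format that RFC 6750 describes, the following shell function builds the `Authorization` header from a token. The function name `bearer_header` and the sample token are hypothetical and used only for illustration. They are not part of {ProjectShort}.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the HTTP header that carries a bearer token.
# RFC 6750 defines the form "Authorization: Bearer <token>".
bearer_header() {
  local token="$1"
  if [ -z "$token" ]; then
    echo "error: empty token" >&2
    return 1
  fi
  printf 'Authorization: Bearer %s\n' "$token"
}

# Example with a placeholder token (not a real credential).
bearer_header "example-token"
```

A tool that accepts custom HTTP headers would typically take `Authorization` as the header name and `Bearer <token>` as the header value.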

.Procedure

. Log in to {OpenShift}.

. Change to the `service-telemetry` namespace:
+
[source,bash]
----
$ oc project service-telemetry
----

. Create a new token secret for the `stf-prometheus-reader` service account:
+
[source,bash]
----
$ oc create -f - <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: my-prometheus-reader-token
  namespace: service-telemetry
  annotations:
    kubernetes.io/service-account.name: stf-prometheus-reader
type: kubernetes.io/service-account-token
EOF
----

. Retrieve the token from the secret:
+
[source,bash]
----
$ TOKEN=$(oc get secret my-prometheus-reader-token -o template='{{.data.token}}' | base64 -d)
----

. Retrieve the Prometheus host name:
+
[source,bash]
----
$ PROM_HOST=$(oc get route default-prometheus-proxy -ogo-template='{{ .spec.host }}')
----

. Test the access token:
+
[source,bash]
----
$ curl -k -H "Authorization: Bearer ${TOKEN}" "https://${PROM_HOST}/api/v1/query?query=up"
{"status":"success",[...]
----

. Configure your third-party tool with the `PROM_HOST` and `TOKEN` values that you retrieved in the previous steps:
+
[source,bash]
----
$ echo $PROM_HOST
$ echo $TOKEN
----

. Optional: To revoke the token, delete the secret. The token remains valid as long as the secret exists:
+
[source,bash]
----
$ oc delete secret my-prometheus-reader-token
secret "my-prometheus-reader-token" deleted
----
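
The token retrieval and test steps above can be combined into a small script. The following is a minimal sketch under stated assumptions: the helper names `decode_token` and `query_succeeded` are hypothetical, the success check is a rough text match on the compact JSON response rather than a full JSON parse, and the `oc` and `curl` calls run only when the `oc` client is installed and you are logged in.

```shell
#!/usr/bin/env bash
# Hypothetical helper: Secret data is base64-encoded, so decode it before use.
decode_token() {
  base64 -d
}

# Hypothetical helper: succeed when a Prometheus API response on stdin
# contains "status":"success" (rough text match, not a JSON parser).
query_succeeded() {
  grep -q '"status":"success"'
}

# Run the live check only if the oc client is installed.
if command -v oc >/dev/null 2>&1; then
  TOKEN=$(oc get secret my-prometheus-reader-token -n service-telemetry \
    -o go-template='{{ .data.token }}' | decode_token)
  PROM_HOST=$(oc get route default-prometheus-proxy -n service-telemetry \
    -o go-template='{{ .spec.host }}')
  if curl -sk -H "Authorization: Bearer ${TOKEN}" \
      "https://${PROM_HOST}/api/v1/query?query=up" | query_succeeded; then
    echo "token accepted"
  else
    echo "token rejected or Prometheus unreachable" >&2
  fi
fi
```

The URL is quoted so that the shell does not try to interpret the `?` character in the query string.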

.Additional information

For more information about service account token secrets, see link:https://docs.openshift.com/container-platform/{NextSupportedOpenShiftVersion}/nodes/pods/nodes-pods-secrets.html#nodes-pods-secrets-creating-sa_nodes-pods-secrets[Creating a service account token secret] in the _OpenShift Container Platform Documentation_.
@@ -43,7 +43,7 @@ To change the rule, edit the value of the `expr` parameter.
+
[source,bash,options="nowrap"]
----
-$ curl -k --user "internal:$(oc get secret default-prometheus-htpasswd -ogo-template='{{ .data.password | base64decode }}')" https://$(oc get route default-prometheus-proxy -ogo-template='{{ .spec.host }}')/api/v1/rules
+$ curl -k -H "Authorization: Bearer $(oc create token stf-prometheus-reader)" https://$(oc get route default-prometheus-proxy -ogo-template='{{ .spec.host }}')/api/v1/rules
{"status":"success","data":{"groups":[{"name":"./openstack.rules","file":"/etc/prometheus/rules/prometheus-default-rulefiles-0/service-telemetry-prometheus-alarm-rules.yaml","rules":[{"state":"inactive","name":"Collectd metrics receive count is zero","query":"rate(sg_total_collectd_msg_received_count[1m]) == 0","duration":0,"labels":{},"annotations":{},"alerts":[],"health":"ok","evaluationTime":0.00034627,"lastEvaluation":"2021-12-07T17:23:22.160448028Z","type":"alerting"}],"interval":30,"evaluationTime":0.000353787,"lastEvaluation":"2021-12-07T17:23:22.160444017Z"}]}}
----
