[SURE-7333] Downstream AKS Cluster Management > Cluster details page not displaying accurate message during update #179

Open
Josh-Diamond opened this issue Mar 7, 2023 · 22 comments
Assignees
Labels
JIRA Must shout kind/bug Something isn't working
Milestone

Comments

@Josh-Diamond

Josh-Diamond commented Mar 7, 2023

SURE-7333

Setup

  • Rancher version: 2.7.0
  • Browser type & version: Chrome

Describe the bug

  • When updating a downstream AKS cluster, the Cluster Management > Cluster Details page does not display an accurate message.

To Reproduce

  1. Fresh install of 2.7.0
  2. Provision a downstream AKS cluster
  3. Update downstream AKS cluster
  4. Navigate to Cluster Management > Cluster Details page
  5. Reproduced

Result

  • The following message is seen:
    This resource is currently in a transitioning state, but there isn't a detailed message available.

Expected Result

  • Expected to show a more detailed message, as seen in the AKS operator logs:
    time="2023-03-07T02:10:50Z" level=info msg="Waiting for cluster [REDACTED] to update node pool [REDACTED]"

Screenshots
Screen Shot 2023-03-06 at 6 10 24 PM

@Josh-Diamond Josh-Diamond added the kind/bug Something isn't working label Mar 7, 2023
@Josh-Diamond Josh-Diamond added this to the v2.7.2 milestone Mar 7, 2023
@Josh-Diamond Josh-Diamond self-assigned this Mar 7, 2023
@kwwii

kwwii commented Mar 7, 2023

@Josh-Diamond What extra information is provided by displaying the non-formatted text suggested?

@Josh-Diamond
Author

Josh-Diamond commented Mar 7, 2023

@kwwii The unformatted text came from the AKS Operator logs. Instead of displaying "This resource is currently in a transitioning state, but there isn't a detailed message available", my request is to show the more specific message from the AKS Operator: "Waiting for cluster [c-REDACTED] to update node pool [REDACTED]".

w.r.t. What extra information is provided by displaying the non-formatted text suggested?

The specific node pool being updated. If we have access to more specific details, we should pass them along to the user without requiring them to navigate away from the Cluster Details page. From a QA perspective, this is a bug.

@nwmac nwmac removed this from the v2.7.2 milestone Mar 8, 2023
@kwwii

kwwii commented Mar 8, 2023

Thanks, I agree that we should give the user as much information as possible

@gaktive
Member

gaktive commented Apr 24, 2023

This should be transferred to backend to provide the information via the operator.

@gaktive gaktive transferred this issue from rancher/dashboard Apr 24, 2023
@kkaempf kkaempf added this to the 2023-Q2-v2.7x milestone Apr 25, 2023
@mjura mjura self-assigned this May 4, 2023
@mjura
Contributor

mjura commented May 4, 2023

I will look into this.

@mjura
Contributor

mjura commented May 11, 2023

This needs to be discussed with the UI team.

@kkaempf

kkaempf commented May 12, 2023

@Josh-Diamond is there a corresponding JIRA issue?

@gaktive should I create a dashboard issue, or will you?

@gaktive
Member

gaktive commented May 12, 2023

@kkaempf You have the instant context so I'll allow you to file :)

@richard-cox
Member

@kkaempf @gaktive I don't think this is one for the UI team (rancher/dashboard#8877 (comment)). Whoever owns the provisioning.cattle.io.cluster metadata.state.message should take this on

@mjura mjura removed their assignment May 18, 2023
@salasberryfin salasberryfin self-assigned this Jul 13, 2023
@kkaempf kkaempf modified the milestones: 2023-Q3-v2.7x, 2023-Q4-v2.8x Jul 18, 2023
@salasberryfin
Contributor

Hi @richard-cox. @mjura and I investigated how to implement a proper messaging strategy during update and we think using the aksclusterconfigs.aks.cattle.io object would be the most sensible solution. We are already using this resource to write error messages and it is equivalent to what we need to implement for non-error messages. Is there a reason for using provisioning.cattle.io.cluster instead? This would add complexity to the operator messaging, would mean using different strategies for different types of messages and adds new dependencies to the code.

Would it be possible to consider moving to aksclusterconfigs.aks.cattle.io?

@richard-cox
Member

The context of the cluster is the prov cluster resource itself. So when viewing a prov cluster or, importantly, a list of prov clusters, its status comes from itself rather than a secondary resource.

We've hit a lot of issues in the past with secondary resources, specifically how they affect scaling. They will also break the new pagination process coming in soon.

@salasberryfin
Contributor

Hi @richard-cox, error messages are currently being updated using this secondary resource aksclusterconfigs.aks.cattle.io (same for the other hosted providers: EKS and GKE). Using this for information messages would align with what's being used in the operator code and would make it easily extensible to other hosted providers. Are there other controllers that are using provisioning.cattle.io.cluster for updating messages?

@richard-cox
Member

I'm not sure about controllers, however all clusters that are shown in the UI look at the provisioning cluster for state and state message.

@gaktive
Member

gaktive commented Jul 26, 2023

Putting status/ui-blocked on this since this ties into rancher/dashboard#8877 and the label is in use on rancher/rancher for tracking UI issues.

@kkaempf kkaempf modified the milestones: v2.8.0, 2024-Q1-v2.8x Oct 17, 2023
@kkaempf kkaempf added the JIRA Must shout label Dec 6, 2023
@kkaempf

kkaempf commented Dec 7, 2023

Despite this comment, the UI team believes that this must be fixed in aks-operator.

@kkaempf kkaempf changed the title from "Downstream AKS Cluster Management > Cluster details page not displaying accurate message during update" to "[SURE-7333] Downstream AKS Cluster Management > Cluster details page not displaying accurate message during update" Dec 12, 2023
@salasberryfin
Contributor

After discussing the best way to tackle this with @mjura, we think this should wait until the new Rancher API is available. As we don't know yet when that will be, we think it is best to move this back to Q2, considering it is not a high-priority issue.

@kkaempf kkaempf modified the milestones: v2.8-Next1, v2.9.0 Mar 12, 2024
@kkaempf kkaempf modified the milestones: v2.9.0, v2.9-Next1 Jul 2, 2024
@mjura mjura assigned mjura and unassigned salasberryfin Aug 2, 2024
@jakefhyde

When you create a hosted cluster (e.g. AKS), although the cluster.provisioning.cattle.io does get created, the cluster.management.cattle.io is the tertiary object, and appears to be what the UI displays statuses from. Based on my testing, when creating one of these clusters, the Updated condition is not present on the provisioning cluster object, which leads me to believe the UI is pulling from the management cluster object; this is also reinforced by the fact that the management cluster object is the one manipulated during cluster creation/update. Looking in the API, I see the following for the management cluster object:

"state": {
  "error": false,
  "message": "",
  "name": "provisioning",
  "transitioning": true
},

and the following for the provisioning cluster object:

"state": {
  "error": false,
  "message": "Waiting for API to be available",
  "name": "waiting",
  "transitioning": true
},

With the following showing on the UI:
[image: screenshot not preserved]

I think in this case, it seems like the correct path for the hosted team would be to set the corresponding condition on the management cluster, be it Provisioned or Updated depending on whether the cluster has been Provisioned successfully yet, since the message of the condition is ultimately what drives the metadata.state.message from the v1 API. @richard-cox feel free to correct me if any of the above information is incorrect. I may be a bit fuzzy regarding how the UI pulls metadata.state.message when relationships are involved in the v1 API.
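To make the proposal above concrete, here is a minimal sketch of how a condition message could end up as the state message. It is an illustration only, not Rancher's actual summarizing logic: the `Condition` shape, the `transitioningMessage` helper, and the rule that a status of "Unknown" means "in progress" are all assumptions made for this example.

```typescript
// Hypothetical condition shape; field names follow common Kubernetes-style
// conventions and are NOT taken from the Rancher code base.
interface Condition {
  type: string; // e.g. "Provisioned" or "Updated"
  status: "True" | "False" | "Unknown";
  message?: string;
}

// Assumption: a condition whose status is "Unknown" is still in progress.
// Returns the message of the first in-progress condition among the given
// types, or an empty string when no such condition carries a message.
function transitioningMessage(
  conditions: Condition[],
  types: string[] = ["Provisioned", "Updated"]
): string {
  const active = conditions.find(
    (c) => types.indexOf(c.type) !== -1 && c.status === "Unknown" && !!c.message
  );
  return active?.message ?? "";
}

// Example: the operator sets "Updated" with the message from its log line.
const exampleMessage = transitioningMessage([
  { type: "Ready", status: "True" },
  {
    type: "Updated",
    status: "Unknown",
    message: "Waiting for cluster [REDACTED] to update node pool [REDACTED]",
  },
]);
console.log(exampleMessage);
```

Under these assumptions, an operator that sets the Updated (or Provisioned) condition with a message would give the banner something more specific to show than the generic "transitioning state" text.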

@richard-cox
Member

Good spot @jakefhyde. I've investigated some more; interestingly, the source of the state differs by cluster type:

  • RKE2 --> provisioning cluster state
  • Non-RKE2 --> management cluster state, or if empty the provisioning cluster state

For tracking

  • shell/models/provisioning.cattle.io.cluster.js _stateObj contains the logic to decide between prov or mgmt cluster state object
  • shell/plugins/dashboard-store/resource-class.js stateDescription is the text shown in cluster lists
  • shell/components/ResourceDetail/Masthead.vue banner is the text shown in the cluster detail page's banner
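The selection rule described above (RKE2 uses the provisioning cluster state; everything else uses the management cluster state, falling back to the provisioning state) can be sketched roughly as follows. This is not the code from `_stateObj`; the `ClusterPair` type, the `isRke2` flag, and the `pickStateObj` function are hypothetical names, and the sketch assumes "empty" means the management state object is missing altogether.

```typescript
// Shape modelled on the "state" objects quoted earlier in this thread.
interface ResourceState {
  error: boolean;
  message: string;
  name: string;
  transitioning: boolean;
}

// Hypothetical container for the two related resources.
interface ClusterPair {
  isRke2: boolean; // assumption: a flag that distinguishes RKE2 clusters
  provState?: ResourceState; // state of the provisioning cluster
  mgmtState?: ResourceState; // state of the management cluster
}

// RKE2 -> provisioning state.
// Non-RKE2 -> management state, or the provisioning state if the
// management state is absent (the assumed meaning of "empty").
function pickStateObj(c: ClusterPair): ResourceState | undefined {
  if (c.isRke2) {
    return c.provState;
  }
  return c.mgmtState ?? c.provState;
}

// The two state objects reported earlier for a hosted (non-RKE2) cluster.
const mgmtExample: ResourceState = {
  error: false, message: "", name: "provisioning", transitioning: true,
};
const provExample: ResourceState = {
  error: false, message: "Waiting for API to be available", name: "waiting", transitioning: true,
};

const picked = pickStateObj({ isRke2: false, provState: provExample, mgmtState: mgmtExample });
console.log(picked?.name, JSON.stringify(picked?.message));
```

Under these assumptions, a hosted cluster whose management state exists but has an empty message never falls through to the provisioning cluster's message, which would be consistent with the generic banner text reported in this issue.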

@gaktive
Member

gaktive commented Aug 22, 2024

cc @nwmac

@kkaempf since this isn't going to be in 2.9.1, should we change the milestone to the next minor version 2.10.0 and then assess? @mjura is chiming in that the CRD used to source messages needs to be changed.

@kkaempf kkaempf modified the milestones: v2.9.1, v2.9-Next1 Aug 23, 2024
@kkaempf

kkaempf commented Aug 23, 2024

Since @mjura is already on it, I moved it to 2.9-Next1

Michal, feel free to move to a different milestone as you see fit 😉

@nwmac
Member

nwmac commented Aug 23, 2024

@richard-cox The fact we use mgmt cluster is most likely historical - I thought we would prefer to move towards using the provisioning cluster where possible and ultimately remove the need for the UI to request the mgmt cluster. Would it not be better for the hosted operators to update the provisioning cluster and for the UI to be updated to look there?

@mjura mjura modified the milestones: v2.9-Next1, v2.10.0 Aug 23, 2024
@mjura
Contributor

mjura commented Aug 23, 2024

Since @mjura is already on it, I moved it to 2.9-Next1

We talked with Gary and I moved it to v2.10, because it will need help from the UI team and this issue has low priority.

Development

No branches or pull requests

9 participants