CAPG: Upstream CCM manifest doesn't work #666

Open
jayesh-srivastava opened this issue Apr 19, 2024 · 11 comments
Labels
  • kind/support: Categorizes issue or PR as a support question.
  • lifecycle/stale: Denotes an issue or PR has remained open with no activity and has become stale.
  • needs-triage: Indicates an issue or PR lacks a `triage/foo` label and requires one.

Comments

@jayesh-srivastava
Member

Tried deploying the CCM in a CAPG cluster using the provided CCM manifest from https://github.com/kubernetes/cloud-provider-gcp/blob/master/deploy/packages/default/manifest.yaml.
The CCM pod is stuck in CrashLoopBackOff with this error:

unable to load configmap based request-header-client-ca-file: Get "https://127.0.0.1:443/api/v1/namespaces/kube-system/configmaps/extension-apiserver-authentication": dial tcp 127.0.0.1:443: connect: connection refused
@k8s-ci-robot
Contributor

This issue is currently awaiting triage.

If the repository maintainers determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Apr 19, 2024
@mcbenjemaa
Member

Please use:

  command: ['/usr/local/bin/cloud-controller-manager']
  args:
  - --cloud-provider=gce
  - --leader-elect=true
  - --use-service-account-credentials

and remove the env.
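For context, a minimal sketch of how this fix slots into the upstream DaemonSet container spec. This assumes the stock manifest's env block was what directed API requests at 127.0.0.1:443 (consistent with the connection-refused error above, but an inference, not confirmed here):

```yaml
# Sketch only, not the full upstream manifest.
containers:
  - name: cloud-controller-manager
    image: k8scloudprovidergcp/cloud-controller-manager:latest
    command: ['/usr/local/bin/cloud-controller-manager']
    args:
      - --cloud-provider=gce
      - --leader-elect=true
      - --use-service-account-credentials
    # env block removed: it pointed the client at 127.0.0.1:443,
    # which only resolves when CCM runs on the API server host.
```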

@mcbenjemaa
Member

/kind support

@k8s-ci-robot k8s-ci-robot added the kind/support Categorizes issue or PR as a support question. label Apr 26, 2024
@jayesh-srivastava
Member Author

Hi @mcbenjemaa, thanks for the help. The CCM pod is up now with these args:

  - args:
    - --cloud-provider=gce
    - --leader-elect=true
    - --use-service-account-credentials
    - --allocate-node-cidrs=true
    - --cluster-cidr=192.168.0.0/16
    - --configure-cloud-routes=false

One more question: I see the cloud-controller-manager image being used is k8scloudprovidergcp/cloud-controller-manager:latest. How can I use Kubernetes-version-specific images for the CCM?

@BenTheElder
Member

You may have to build the image yourself while the release process is being revamped; there are instructions in the README.

The :latest tag is aimed at CI / testing of the project itself, I think.
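If you do build your own image per the README, pinning it in the manifest might look like the sketch below. The registry and tag here are placeholders, not real published images:

```yaml
# Hypothetical: replace :latest with a self-built, version-specific tag.
containers:
  - name: cloud-controller-manager
    image: registry.example.com/cloud-controller-manager:v1.29.0  # placeholder
    imagePullPolicy: IfNotPresent
```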

/retitle CAPG: Upstream CCM manifest doesn't work

I don't think the manifest is necessarily meant to work with CAPG; I would expect CAPG to handle deploying everything?

Otherwise this may be in scope for #686

@k8s-ci-robot k8s-ci-robot changed the title Upstream CCM manifest doesn't work CAPG: Upstream CCM manifest doesn't work May 7, 2024
@mcbenjemaa
Member

Self-deployed CCM: I got this error:

message="Error syncing load balancer: failed to ensure load balancer: instance not found"

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 12, 2024
@esierra-stratio

Something similar here: I'm trying to deploy the Cloud Controller Manager (CCM) and I'm encountering the following error:

I0823 08:10:42.838284       1 node_controller.go:391] Initializing node minplus0-md-2-vbvmr-856l7 with cloud provider
I0823 08:10:42.920926       1 gen.go:15649] GCEInstances.Get(context.Background.WithDeadline(2024-08-23 09:10:42.83965981 +0000 UTC m=+3629.567729051 [59m59.918720336s]), Key{"minplus0-md-2-vbvmr-856l7", zone: "europe-west4-b"}) = <nil>, googleapi: Error 404: The resource 'projects/clusterapi-369611/zones/europe-west4-b/instances/minplus0-md-2-vbvmr-856l7' was not found, notFound
E0823 08:10:42.921062       1 node_controller.go:213] error syncing 'minplus0-md-2-vbvmr-856l7': failed to get instance metadata for node minplus0-md-2-vbvmr-856l7: failed to get instance ID from cloud provider: instance not found, requeuing

I don't understand why the CCM is adding the zone label as:

I0823 08:10:41.974944       1 node_controller.go:493] Adding node label from cloud provider: beta.kubernetes.io/instance-type=n2-standard-2
I0823 08:10:41.974950       1 node_controller.go:494] Adding node label from cloud provider: node.kubernetes.io/instance-type=n2-standard-2
I0823 08:10:41.974954       1 node_controller.go:505] Adding node label from cloud provider: failure-domain.beta.kubernetes.io/zone=europe-west4-b
I0823 08:10:41.974958       1 node_controller.go:506] Adding node label from cloud provider: topology.kubernetes.io/zone=europe-west4-b
I0823 08:10:41.974963       1 node_controller.go:516] Adding node label from cloud provider: failure-domain.beta.kubernetes.io/region=europe-west4
I0823 08:10:41.974968       1 node_controller.go:517] Adding node label from cloud provider: topology.kubernetes.io/region=europe-west4

The correct zone should be gce://clusterapi-369611/europe-west4-c/minplus0-md-2-vbvmr-856l7.
This is how I'm deploying CCM:

        - name: cloud-controller-manager
          image: k8scloudprovidergcp/cloud-controller-manager:latest
          imagePullPolicy: IfNotPresent
          # ko puts it somewhere else... command: ['/usr/local/bin/cloud-controller-manager']
          command: ['/usr/local/bin/cloud-controller-manager']
          args:
            - --cloud-provider=gce  # Add your own cloud provider here!
            - --leader-elect=true
            - --use-service-account-credentials
            # these flags will vary for every cloud provider
            - --allocate-node-cidrs=true
            - --configure-cloud-routes=true
            - --cluster-cidr=192.168.0.0/16
            - --v=4
          livenessProbe:
            failureThreshold: 3
            httpGet:
              host: 127.0.0.1
              path: /healthz
              port: 10258
              scheme: HTTPS
            initialDelaySeconds: 15
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 15
          resources:
            requests:
              cpu: "200m"
          volumeMounts:
            - mountPath: /etc/kubernetes/cloud.config
              name: cloudconfig
              readOnly: true
      hostNetwork: true
      priorityClassName: system-cluster-critical
      volumes:
        - hostPath:
            path: /etc/kubernetes/cloud.config
            type: ""
          name: cloudconfig

@aojea
Member

aojea commented Aug 23, 2024

The correct zone should be gce://clusterapi-369611/europe-west4-c/minplus0-md-2-vbvmr-856l7.

what do you mean by correct zone there?

the instance url is https://www.googleapis.com/compute/v1/projects/{PROJECT}/zones/{ZONE}/instances/{VM_INSTANCE}

that is the providerId, isn't it?

@esierra-stratio

esierra-stratio commented Aug 26, 2024

The issue is that the GCEInstances.Get function constructs the provider ID with the wrong zone. It assumes the zone must match where the master CCM is deployed (in this case, europe-west4-b), instead of the correct one, which is europe-west4-c. That's why the CCM couldn't find the instance.

Is there any way to make the CCM check every single zone? Maybe a multizone option or something similar?

@esierra-stratio

esierra-stratio commented Aug 26, 2024

Solved!

          args:
            - --cloud-provider=gce  # Add your own cloud provider here!
            - --leader-elect=true
            - --use-service-account-credentials
            # these flags will vary for every cloud provider
            - --allocate-node-cidrs=true
            - --cluster-cidr=192.168.0.0/16
            - --v=4
            - --cloud-config=/etc/kubernetes/gce.conf
          livenessProbe:
            failureThreshold: 3
            httpGet:
              host: 127.0.0.1
              path: /healthz
              port: 10258
              scheme: HTTPS
            initialDelaySeconds: 15
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 15
          resources:
            requests:
              cpu: "200m"
          volumeMounts:
            - mountPath: /etc/kubernetes/gce.conf
              name: cloudconfig
              readOnly: true
      hostNetwork: true
      priorityClassName: system-cluster-critical
      volumes:
        - hostPath:
            path: /etc/kubernetes/gce.conf
            type: FileOrCreate
          name: cloudconfig

where gce.conf:

[Global]
multizone=true
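For reference, multizone=true tells the GCE cloud provider to search every zone in the region when looking up instances, rather than only the zone the CCM itself runs in, which is why the instance-not-found errors above go away. A slightly fuller gce.conf sketch (the project-id line is an assumption for illustration; it is normally inferred from instance metadata):

```ini
[Global]
# Search all zones in the region for instances, so nodes outside
# the control plane's zone are found.
multizone = true
# Hypothetical: set explicitly only if metadata-based detection
# is unavailable in your environment.
project-id = my-gcp-project
```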

7 participants