✨ Add Metal3 provider #82

Merged
jschoone merged 22 commits into main from feat/metal3 on Jun 21, 2024

Conversation

@chess-knight commented Apr 26, 2024 (Member)

What this PR does / why we need it:

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #21
Fixes #98

Special notes for your reviewer:
Tested on a virtualized environment:
See docs https://book.metal3.io/quick-start

  1. I created an Ubuntu instance in gx-scs - flavor SCS-16V-64 with a 200 GiB disk
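    • a minimal sketch of that instance creation with the OpenStack CLI (image, network, and key names are assumptions; adjust to your gx-scs project):
    openstack server create \
      --flavor SCS-16V-64 \
      --image "Ubuntu 22.04" \
      --boot-from-volume 200 \
      --network my-network \
      --key-name my-key \
      metal3-host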
  2. Create libvirt network - https://book.metal3.io/quick-start#virtualized-configuration
    $ virsh net-info baremetal
    Name:           baremetal
    UUID:           ae14ef12-4ff1-4c54-90c8-38ebdec3542b
    Active:         yes
    Persistent:     yes
    Autostart:      no
    Bridge:         metal3
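    • a minimal sketch of defining that network (the XML below is an assumption reconstructed from the net-info output above; the Metal3 book has the authoritative definition):
    cat > baremetal-net.xml <<EOF
    <network>
      <name>baremetal</name>
      <forward mode='nat'/>
      <bridge name='metal3' stp='off'/>
      <ip address='192.168.222.1' netmask='255.255.255.0'/>
    </network>
    EOF
    virsh net-define baremetal-net.xml
    virsh net-start baremetal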
  3. If the management cluster (CSO, CAPI, CAPM3, BMO) is outside the "bare-metal" instance (Ironic, libvirt), install the libvirt port-forwarding hook
    • use this hooks.json:
    {
        "bmh-vm-01": {
            "interface": "metal3",
            "private_ip": "192.168.222.150",
            "port_map": {
                "tcp": [
                    6443
                ]
            }
        }
    }
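    • the hook is typically installed as /etc/libvirt/hooks/qemu with hooks.json next to it (paths are assumptions, check the hook's README), e.g.:
    sudo install -m 0755 qemu /etc/libvirt/hooks/qemu
    sudo cp hooks.json /etc/libvirt/hooks/hooks.json
    sudo systemctl restart libvirtd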
  4. Create VMs - https://book.metal3.io/quick-start#virtualized-configuration
    • e.g. create 1 for the control plane and 3 for workers (a loop sketch for the workers follows the listing below):
    virt-install \
      --connect qemu:///system \
      --name bmh-vm-01 `# workers 02, 03, 04` \
      --description "Virtualized BareMetalHost" \
      --osinfo=ubuntu-lts-latest \
      --ram=12288 \
      --vcpus=2 `# e.g. 3 vcpus for workers` \
      --disk size=25 `# add second disk (--disk size=20) for workers if you want to install rook-ceph` \
      --graphics=none \
      --console pty \
      --serial pty \
      --pxe \
      --network network=baremetal,mac="00:60:2f:31:81:01" `# workers 02, 03, 04` \
      --noautoconsole
    $ virsh list
     Id   Name        State
    ---------------------------
     1    bmh-vm-01   running
     2    bmh-vm-02   running
     3    bmh-vm-03   running
     4    bmh-vm-04   running
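    • a sketch of the worker loop implied by the inline comments above (same flags, adjusted name, vcpus, and MAC):
    for i in 2 3 4; do
      virt-install \
        --connect qemu:///system \
        --name "bmh-vm-0${i}" \
        --description "Virtualized BareMetalHost" \
        --osinfo=ubuntu-lts-latest \
        --ram=12288 \
        --vcpus=3 \
        --disk size=25 \
        --graphics=none \
        --console pty \
        --serial pty \
        --pxe \
        --network network=baremetal,mac="00:60:2f:31:81:0${i}" \
        --noautoconsole
    done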
  5. Install sushy-tools for Redfish communication - https://book.metal3.io/quick-start#sushy-tools---aka-the-bmc
    $ docker logs sushy-tools
     * Serving Flask app 'sushy_tools.emulator.main'
     * Debug mode: off
    WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
     * Running on http://192.168.222.1:8000
    Press CTRL+C to quit
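    • a quick sanity check against the standard Redfish endpoint (one member per libvirt domain should be listed):
    curl -s http://192.168.222.1:8000/redfish/v1/Systems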
  6. Create KinD management cluster - https://book.metal3.io/quick-start#management-cluster
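    • e.g. (the cluster name is an assumption):
    kind create cluster --name metal3
    kubectl cluster-info --context kind-metal3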
  7. Install Dnsmasq - https://book.metal3.io/quick-start#dhcp-server
    • use this config:
    DHCP_HOSTS=00:60:2f:31:81:01,192.168.222.100;00:60:2f:31:81:02,192.168.222.101;00:60:2f:31:81:03,192.168.222.102;00:60:2f:31:81:04,192.168.222.103
    DHCP_IGNORE=tag:!known
    # IP of the host from VM perspective
    PROVISIONING_IP=192.168.222.1
    GATEWAY_IP=192.168.222.1
    DHCP_RANGE=192.168.222.100,192.168.222.149
    DNS_IP=provisioning
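    • to verify leases are handed out after a VM powers on (the container name is an assumption and depends on how dnsmasq is run):
    docker logs dnsmasq 2>&1 | grep DHCPACK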
    
  8. Skip the image server (we use OSISM images) - https://book.metal3.io/quick-start#image-server
  9. Deploy Ironic - https://book.metal3.io/quick-start#deploy-ironic
    • if Ironic should be accessible from outside, add the public IP to the Ironic certificate, e.g.:
    - patch: |-
        - op: replace
          path: /spec/ipAddresses/0
          value: 192.168.222.1
        - op: add
          path: /spec/ipAddresses/-
          value: 172.18.0.2
        - op: add
          path: /spec/ipAddresses/-
          value: 213.131.230.81
      target:
        kind: Certificate
        name: ironic-cert|ironic-inspector-cert
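    • to verify the served certificate after the patch (requires OpenSSL 1.1.1+ for -ext):
    openssl s_client -connect 213.131.230.81:6385 </dev/null 2>/dev/null \
      | openssl x509 -noout -ext subjectAltName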
  10. Deploy Bare Metal Operator - https://book.metal3.io/quick-start#deploy-bare-metal-operator
    • if Ironic is outside of the management cluster, modify bmo/ironic.env as follows:
    DEPLOY_KERNEL_URL=http://192.168.222.1:6180/images/ironic-python-agent.kernel
    DEPLOY_RAMDISK_URL=http://192.168.222.1:6180/images/ironic-python-agent.initramfs
    IRONIC_ENDPOINT=https://213.131.230.81:6385/v1/
    
    • if Ironic is outside of the management cluster, copy the ironic-cacert secret into the management cluster so BMO can use it (or use IRONIC_INSECURE=True)
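    • a sketch of copying the secret (namespace and kubectl contexts are assumptions; you may need to strip resourceVersion/uid from the output first):
    kubectl --context ironic get secret ironic-cacert -n baremetal-operator-system -o yaml \
      | kubectl --context kind-metal3 apply -f -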
  11. Create BareMetalHosts - https://book.metal3.io/quick-start#create-baremetalhosts
    • 1 for the control plane and 3 for workers:
    apiVersion: v1
    kind: Secret
    metadata:
      name: bml-01 # workers 02, 03, 04
    type: Opaque
    stringData:
      username: replaceme
      password: replaceme
    ---
    apiVersion: metal3.io/v1alpha1
    kind: BareMetalHost
    metadata:
      name: bml-vm-01 # workers 02, 03, 04
      labels:
        type: control-plane # 'type: worker' for workers
    spec:
      online: true
      bootMACAddress: 00:60:2f:31:81:01 # workers 02, 03, 04
      bootMode: legacy
      hardwareProfile: libvirt
      bmc:
        address: redfish-virtualmedia+http://192.168.222.1:8000/redfish/v1/Systems/bmh-vm-01 # workers 02, 03, 04
        credentialsName: bml-01 # workers 02, 03, 04
    $ kubectl get bmh --show-labels
    NAME        STATE       CONSUMER   ONLINE   ERROR   AGE   LABELS
    bml-vm-01   available              true             11m   type=control-plane
    bml-vm-02   available              true             11m   type=worker
    bml-vm-03   available              true             11m   type=worker
    bml-vm-04   available              true             11m   type=worker
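    • to block until inspection finishes instead of polling (needs kubectl 1.23+ for jsonpath waits):
    kubectl wait bmh --all --for=jsonpath='{.status.provisioning.state}'=available --timeout=30m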
  12. Deploy CAPI/CAPM3/CSO
    export CLUSTER_TOPOLOGY=true
    clusterctl init --infrastructure metal3
    # apply Metal3ClusterTemplate CRD until new CAPM3 release (current v1.7.0)
    kubectl apply -f https://raw.githubusercontent.com/metal3-io/cluster-api-provider-metal3/main/config/crd/bases/infrastructure.cluster.x-k8s.io_metal3clustertemplates.yaml
    kubectl label crd metal3clustertemplates.infrastructure.cluster.x-k8s.io cluster.x-k8s.io/v1beta1=v1beta1
    # install CSO in your favourite way
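    • e.g. from the release manifests (the asset URL is an assumption, check the cluster-stack-operator release page):
    kubectl apply -f https://github.com/SovereignCloudStack/cluster-stack-operator/releases/latest/download/cso-infrastructure-components.yaml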
  13. Create Cluster Stack
    apiVersion: clusterstack.x-k8s.io/v1alpha1
    kind: ClusterStack
    metadata:
      name: clusterstack
    spec:
      provider: metal3
      name: alpha
      kubernetesVersion: "1.28"
      channel: custom
      autoSubscribe: false
      noProvider: true
      versions:
      - v0-sha.b699b93
    $ kubectl get clusterstack
    NAME           PROVIDER   CLUSTERSTACK   K8S    CHANNEL   AUTOSUBSCRIBE   USABLE           LATEST                                       AGE   REASON   MESSAGE
    clusterstack   metal3     alpha          1.28   custom    false           v0-sha-b699b93   metal3-alpha-1-28-v0-sha-b699b93 | v1.28.9   12m
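    • the operator reconciles a ClusterStackRelease per listed version; the ClusterClass is usable once it reports ready:
    kubectl get clusterstackrelease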
  14. Create Cluster
    apiVersion: cluster.x-k8s.io/v1beta1
    kind: Cluster
    metadata:
      name: my-cluster
    spec:
      topology:
        class: metal3-alpha-1-28-v0-sha.b699b93
        version: v1.28.9
        controlPlane:
          replicas: 1
        workers:
          machineDeployments:
          - class: default-worker
            name: alpha
            replicas: 3
        variables:
    #   Required
        - name: controlPlaneEndpoint
          value:
            host: 192.168.222.150
    #        host: 213.131.230.81
    #        port: 6443
    #   If .controlPlaneEndpoint.host is public IP, specify also private IP for kube-vip
    #    - name: controlPlaneEndpoint_private_ip
    #      value: 192.168.222.150
    #   Optional
        - name: workerHostSelector
          value:
            matchLabels:
              type: worker
        - name: controlPlaneHostSelector
          value:
            matchLabels:
              type: control-plane
    ##   Experiment with other optional variables, e.g. try rook-ceph
    #    - name: user
    #      value:
    #        name: user
    #        sshKey: ssh-ed25519 ABCD... [email protected]
    #    - name: image
    #      value:
    #        checksum: https://swift.services.a.regiocloud.tech/swift/v1/AUTH_b182637428444b9aa302bb8d5a5a418c/openstack-k8s-capi-images/ubuntu-2204-kube-v1.28/ubuntu-2204-kube-v1.28.10.qcow2.CHECKSUM
    #        checksumType: sha256
    #        format: qcow2
    #        url: https://swift.services.a.regiocloud.tech/swift/v1/AUTH_b182637428444b9aa302bb8d5a5a418c/openstack-k8s-capi-images/ubuntu-2204-kube-v1.28/ubuntu-2204-kube-v1.28.10.qcow2
    #    - name: rook_ceph_cluster_values
    #      value: |
    #        enabled: true
    #    - name: workerDataTemplate
    #      value: my-cluster-workers-template
    #    - name: controlPlaneDataTemplate
    #      value: my-cluster-controlplane-template
    #---
    #apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    #kind: Metal3DataTemplate
    #metadata:
    #  name: my-cluster-controlplane-template
    #spec:
    #  clusterName: my-cluster
    #---
    #apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    #kind: Metal3DataTemplate
    #metadata:
    #  name: my-cluster-workers-template
    #spec:
    #  clusterName: my-cluster
    $ kubectl get cluster,metal3cluster
    NAME                                  CLUSTERCLASS                       PHASE         AGE   VERSION
    cluster.cluster.x-k8s.io/my-cluster   metal3-alpha-1-28-v0-sha.b699b93   Provisioned   62m   v1.28.9
    
    NAME                                                             AGE   READY   ERROR   CLUSTER      ENDPOINT
    metal3cluster.infrastructure.cluster.x-k8s.io/my-cluster-srg2j   62m   true            my-cluster   {"host":"192.168.222.150","port":6443}
    $ clusterctl get kubeconfig my-cluster > kubeconfig.yaml
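    • to watch the rollout while machines provision:
    clusterctl describe cluster my-cluster
    kubectl get machines -w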
  15. Test kube-vip service loadbalancing
    $ kubectl --kubeconfig kubeconfig.yaml create deploy --image nginx --port 80 nginx
    # --load-balancer-ip needs to be specified because kube-vip-cloud-provider is missing
    $ kubectl --kubeconfig kubeconfig.yaml expose deployment nginx --port 80 --type LoadBalancer --load-balancer-ip 192.168.222.151
    $ curl 192.168.222.151
    <!DOCTYPE html>
    <html>
    <head>
    <title>Welcome to nginx!</title>
    <style>
    html { color-scheme: light dark; }
    body { width: 35em; margin: 0 auto;
    font-family: Tahoma, Verdana, Arial, sans-serif; }
    </style>
    </head>
    <body>
    <h1>Welcome to nginx!</h1>
    <p>If you see this page, the nginx web server is successfully installed and
    working. Further configuration is required.</p>
    
    <p>For online documentation and support please refer to
    <a href="http://nginx.org/">nginx.org</a>.<br/>
    Commercial support is available at
    <a href="http://nginx.com/">nginx.com</a>.</p>
    
    <p><em>Thank you for using nginx.</em></p>
    </body>
    </html>
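    • the service should report the requested EXTERNAL-IP before curl succeeds:
    kubectl --kubeconfig kubeconfig.yaml get svc nginx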

Please confirm that, if this PR changes any image versions, that is the sole change this PR makes.

TODOs:

  • squash commits
  • include documentation
  • add unit tests

Signed-off-by: Roman Hros <[email protected]>
Also move image values to values.yaml and rename to k8s v1-29

Signed-off-by: Roman Hros <[email protected]>
Also switch back to v1-28 and fix node names

Signed-off-by: Roman Hros <[email protected]>
Also update metrics-server addon

Signed-off-by: Roman Hros <[email protected]>
@jschoone jschoone linked an issue May 3, 2024 that may be closed by this pull request
We have only one variable 'image' which can be used for both types of nodes

Signed-off-by: Roman Hros <[email protected]>
Default values are based on test manifests

Signed-off-by: Roman Hros <[email protected]>
Should prevent too many restarts

Signed-off-by: Roman Hros <[email protected]>
@chess-knight chess-knight marked this pull request as ready for review May 20, 2024 14:27
It can be used, e.g. when management cluster and workload clusters are on different networks

Signed-off-by: Roman Hros <[email protected]>
@chess-knight chess-knight marked this pull request as draft May 30, 2024 13:45
@chess-knight chess-knight linked an issue May 30, 2024 that may be closed by this pull request
@chess-knight chess-knight marked this pull request as ready for review May 31, 2024 12:23
@jschoone jschoone merged commit e6db313 into main Jun 21, 2024
2 checks passed
@jschoone jschoone deleted the feat/metal3 branch June 21, 2024 18:27
Successfully merging this pull request may close these issues.

Separate management cluster and ironic Cluster Stacks for Metal³