Make ceph device persistent #773

Open

bogdando wants to merge 1 commit into openstack-k8s-operators:main from bogdando:fix_persist_ceph

+51 −1

Contributor

bogdando commented Mar 13, 2024

Persist /dev/vg2/data-lv2 with a systemd service to align the Ceph deployment with what ci-framework does for Ceph on standalone TripleO.

Add a standalone_revert.sh script to ensure that the time is synchronized and the /dev/vg2/data-lv2 device is recreated after restoring the VM from the clean snapshot.

Add env vars to keep ssh commands functional after the revert is done (for the Makefile targets standalone_deploy and standalone_revert).

openshift-ci bot requested review from fao89 and karelyatin

March 13, 2024 16:39

bogdando requested review from jistr and fultonj and removed request for fao89 and karelyatin

March 13, 2024 16:40

bogdando force-pushed the fix_persist_ceph branch from 247a8b3 to 9dfda88 Compare

March 13, 2024 16:41

Contributor

fao89 commented Mar 13, 2024

/approve

Contributor

openshift-ci bot commented Mar 13, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bogdando, fao89

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [fao89]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-ci bot added the approved label

softwarefactory-project-zuul bot commented Mar 13, 2024

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://review.rdoproject.org/zuul/buildset/e0d81657cbbb4bb58bb401a70ffd2cb1

✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 30m 07s
✔️ install-yamls-crc-podified-edpm-baremetal SUCCESS in 1h 10m 59s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 05m 36s
❌ cifmw-data-plane-adoption-osp-17-to-extracted-crc FAILURE in 43m 58s

Contributor Author

bogdando commented Mar 14, 2024

recheck rdoproject.org/github-check no logs

softwarefactory-project-zuul bot commented Mar 14, 2024

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://review.rdoproject.org/zuul/buildset/cfab01e44e094bf894303b55d0bf4181

✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 40m 05s
✔️ install-yamls-crc-podified-edpm-baremetal SUCCESS in 1h 18m 56s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 05m 55s
❌ cifmw-data-plane-adoption-osp-17-to-extracted-crc FAILURE in 42m 16s

Contributor Author

bogdando commented Mar 14, 2024

recheck rdoproject.org/github-check no logs

softwarefactory-project-zuul bot commented Mar 14, 2024

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://review.rdoproject.org/zuul/buildset/962de64242fb4cb49414dadddd259a2d

✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 34m 15s
✔️ install-yamls-crc-podified-edpm-baremetal SUCCESS in 1h 13m 06s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 04m 50s
❌ cifmw-data-plane-adoption-osp-17-to-extracted-crc FAILURE in 43m 18s


Make ceph device persistent

1e5a03f

Persist /dev/vg2/data-lv2 with a systemd service to align the Ceph
deployment with what ci-framework does for Ceph on standalone TripleO.

Add a standalone_revert.sh script to ensure that the time is synchronized
and the /dev/vg2/data-lv2 device is recreated after restoring the VM from
the clean snapshot.

Add env vars to keep ssh commands functional after the revert is done
(for the Makefile targets standalone_deploy and standalone_revert).

Signed-off-by: Bohdan Dobrelia <[email protected]>

bogdando force-pushed the fix_persist_ceph branch from 9dfda88 to 1e5a03f Compare

March 15, 2024 13:45

softwarefactory-project-zuul bot commented Mar 15, 2024

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://review.rdoproject.org/zuul/buildset/366fb829535a4b09908840fe0f7ac8f3

✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 32m 15s
✔️ install-yamls-crc-podified-edpm-baremetal SUCCESS in 1h 09m 48s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 06m 45s
❌ cifmw-data-plane-adoption-osp-17-to-extracted-crc FAILURE in 46m 45s

Contributor

fao89 commented Mar 20, 2024

recheck

softwarefactory-project-zuul bot commented Mar 20, 2024

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://review.rdoproject.org/zuul/buildset/b48e21eadf2d437ba33b404f0e24709c

✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 52m 17s
✔️ install-yamls-crc-podified-edpm-baremetal SUCCESS in 1h 20m 31s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 13m 51s
❌ cifmw-data-plane-adoption-osp-17-to-extracted-crc FAILURE in 42m 40s

jistr reviewed

devsetup/standalone/ceph.sh

+              cat /tmp/ceph-osd-losetup.service | sudo tee /etc/systemd/system/ceph-osd-losetup.service
+              sudo chmod 0644 /etc/systemd/system/ceph-osd-losetup.service
+              sudo systemctl daemon-reload
+              sudo systemctl enable --now ceph-osd-losetup.service

Contributor

jistr Mar 22, 2024

Why is this needed? In my env I can snapshot-restore using the make targets and the loopback device is still present. Are you snapshotting some other way?

Contributor Author

bogdando Mar 22, 2024 (edited)

Testing shows that ensuring the unit is started after reboot is sufficient (and having it stopped results in a missing loop device). Reverting doesn't cause problems here, so I will rework this.
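
The unit file itself is not shown in this thread. As a rough illustration of the idea discussed above (a oneshot unit that re-attaches the loop device on boot), a sketch could look like the following. The function name `render_losetup_unit`, the loop device `/dev/loop3`, and the backing file `/var/lib/ceph-osd.img` are assumptions made for the example and are not taken from the PR; it is also assumed that /dev/vg2/data-lv2 sits on top of that loop device.

```shell
#!/usr/bin/env bash
# Sketch only: print a oneshot systemd unit that re-attaches a loop device.
# The default loop device and image path are hypothetical placeholders.
render_losetup_unit() {
    local loop_dev="${1:-/dev/loop3}"
    local img_file="${2:-/var/lib/ceph-osd.img}"
    cat <<EOF
[Unit]
Description=Ceph OSD losetup
After=local-fs.target

[Service]
Type=oneshot
RemainAfterExit=yes
# Attach the backing file only if the loop device is not already set up.
ExecStart=/bin/bash -c '/sbin/losetup ${loop_dev} || /sbin/losetup ${loop_dev} ${img_file}'
ExecStop=/sbin/losetup -d ${loop_dev}

[Install]
WantedBy=multi-user.target
EOF
}

# Print the rendered unit to stdout.
render_losetup_unit
```

Redirecting the output to /tmp/ceph-osd-losetup.service would feed the install steps in the hunk above. `RemainAfterExit=yes` keeps the unit reported as active after the one-off command exits, so stopping it later runs `ExecStop` and detaches the device, which matches the observation that a stopped unit means a missing loop device.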

devsetup/scripts/standalone_revert.sh

+              SSH_OPT="-o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -i $SSH_KEY_FILE"
+              virsh snapshot-revert --domain edpm-compute-${EDPM_COMPUTE_SUFFIX} --snapshotname clean
+              ssh $SSH_OPT root@$IP systemctl stop chronyd ';' chronyd -q  \'pool pool.ntp.org iburst\' ';' systemctl start chronyd

Contributor

jistr Mar 22, 2024

+1 on setting the clock, which doesn't seem to be done automatically after reverting from the snapshot.

But why the manual setting with pool.ntp.org? That will get blocked inside our network. For me just systemctl restart chronyd seems to bring the clock up to date (and it's using the configured NTP server).

Contributor Author

bogdando Mar 22, 2024

Good point. I think we can split this off into a different patch.
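
For reference, the simpler variant suggested in the review (restart chronyd so that it re-syncs against the NTP server already configured on the VM) might be sketched as below. The `resync_clock` helper, the `DRY_RUN` switch, and the example IP address and key path are hypothetical; only the ssh options and the `systemctl restart chronyd` command come from the thread.

```shell
#!/usr/bin/env bash
# Sketch only: re-sync the clock on a VM after a snapshot revert by
# restarting chronyd. With DRY_RUN=1 (the default here) the command is
# printed instead of executed.
resync_clock() {
    local ip="$1" key="$2"
    # An array keeps each ssh option a separate word without relying on
    # unquoted word splitting.
    local -a ssh_opts=(-o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -i "$key")
    local -a cmd=(ssh "${ssh_opts[@]}" "root@${ip}" systemctl restart chronyd)
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "${cmd[*]}"
    else
        "${cmd[@]}"
    fi
}

# Example with placeholder values; prints the ssh command it would run.
DRY_RUN=1 resync_clock 192.168.122.100 "$HOME/.ssh/id_rsa"
```

This avoids contacting pool.ntp.org directly, which the review notes would be blocked inside the internal network.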

openshift-merge-robot added the needs-rebase label

Contributor

openshift-merge-robot commented Apr 20, 2024

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Labels

approved needs-rebase