Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Systemd generator must parse ostree cmdline differently for aboot #2919

Closed

Conversation

ericcurtin
Copy link
Collaborator

We must resolve the /ostree/root.[ab] symlink to it's realpath and then extract the osname from it (example centos), in the non-aboot case this is what is expected from the output string of this function

@openshift-ci
Copy link

openshift-ci bot commented Jul 4, 2023

Hi @ericcurtin. Thanks for your PR.

I'm waiting for a ostreedev member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@ericcurtin
Copy link
Collaborator Author

This is probably ok as-is, just doing some extra testing...

@ericcurtin ericcurtin force-pushed the ostree-system-generator-aboot branch 4 times, most recently from 1ac7b51 to 5740d96 Compare July 4, 2023 18:49
@ericcurtin
Copy link
Collaborator Author

Current issue:

Jul 04 19:09:55 localhost kernel: audit: type=1400 audit(1688497790.740:4): avc:  denied  { read } for  pid=297 comm="ostree-system-g" name="root.a" dev="vda3" ino=18350 scontext=system_u:system_r:init_t:s0 tcontext=unconfined_u:object_r:user_home_t:s0 tclass=lnk_file permissive=0
Jul 04 19:09:55 localhost systemd[296]: /usr/lib/systemd/system-generators/ostree-system-generator failed with exit status 1.

must fix this...

@cgwalters
Copy link
Member

tcontext=unconfined_u:object_r:user_home_t:s0

It looks to me like you may be hacking on this by directly copying files from your build environment into the host? Use install /path/to/home/src/ostree/ostree /usr/bin/ostree to ensure that the target has its label set.

g_assert (regex);
g_once_init_leave (&regex_initialized, 1);
}

/* The to_parse string was added to handle the aboot case, in the aboot
* case it has to parse something like:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In latest git main, ostree-prepare-root.c links to glib at least, so if it helps us we can probably factor out a shared helper more easily between these two.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there's a way we can keep the aboot case more consistent here by setting things up in prepare root that seems nicer to me.

(Can we have prepare-root create a symlink in /run with the data it parsed?)

Copy link
Collaborator Author

@ericcurtin ericcurtin Jul 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way it currently is, prepare root just goes to a previously set up symlink at deploy time, I'd happily transfer that to /run if we saw benefit, but the avc is preventing us from reading that symlink at all (this is where we parse the data from the symlink) which would look something like this:

[root@localhost ~]# ls -tlr /ostree/
total 20
lrwxrwxrwx. 1 root root   87 Jun 29 12:34 root.b -> deploy/centos/deploy/6ce6fe7c07c4274f7d536f6d505b34fc1453978f576c4b35d241ab7a4dd06898.0
drwxr-xr-x. 3 root root 4096 Jun 29 12:36 boot.1.1
lrwxrwxrwx. 1 root root    8 Jun 29 12:36 boot.1 -> boot.1.1
lrwxrwxrwx. 1 root root   87 Jun 29 12:37 root.a -> deploy/centos/deploy/76fee8c28c91f92d598217254fabf3da17a309b4208bdc38a135ac54fc1e82de.0

was thinking of changing the symlink to ../ostree/deploy/centos/deploy/76fee8c28c91f92d598217254fabf3da17a309b4208bdc38a135ac54fc1e82de.0 to help satisfy the regex.

But I wasn't taking the generator into account at the time. /run likely wouldn't work as it wouldn't persist across boots?

I'm open to ideas for an alternate location if another location makes more sense...

Copy link
Collaborator Author

@ericcurtin ericcurtin Jul 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there's a way we can keep the aboot case more consistent here by setting things up in prepare root that seems nicer to me.

(Can we have prepare-root create a symlink in /run with the data it parsed?)

So initially root.a and root.b would be symlinks with the exact same values as the karg for ostree= in the grub case, so it was the exact code flow once you resolved the symlink (with boot.1 or boot.0, etc.). But then I saw that sometimes that lead to broken links in rollback cases.

So now we use the realpaths instead for the root.a and root.b, the end result is the same once you parse it a little differently here. So now it's almost the same except now you have to resolve the symlink and tweak the regex just a tiny bit. I've tested with your review comments addressed and it's all working perfectly now (the selinux issue was fixed somewhere else).

I would like to try other approaches, but I think other approaches would lead to more complex changes than seen here. LoC is a poor metric, but with these 14 lines of code everything works as expected from my testing and the grub case is not effected.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In latest git main, ostree-prepare-root.c links to glib at least, so if it helps us we can probably factor out a shared helper more easily between these two.

So the only role for this function seems to be to get the osname from the path for this line:

const char *stateroot_var_path = glnx_strjoina ("/sysroot/ostree/deploy/", stateroot, "/var");

does prepare-root need this functionality? I do understand the desire for a generic parser though, you could have a struct maybe that contains these fields.

centos 
76b0919b393aa2254dd4b72950514422bb213992c3885f2196ec9cb278c57c5a
0

and once you have those fields copied in a struct, just do everything generically based on those fields. It's a bigger rewrite than here though.

* /ostree/boot.1/centos/e779b04a7445bf71d241288e92b3a3f7da0d5c4af33f3cb9dc232b6484d998eb/0
*/
g_autofree char *symlink_val = NULL;
if (!strncmp (ostree_cmdline, "/ostree/root.", sizeof ("/ostree/root.") - 1))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is also just g_str_has_prefix (ostree_cmdline, "/ostree/root.")

@ericcurtin
Copy link
Collaborator Author

ericcurtin commented Jul 5, 2023

I'm using rpms, not copying fiiles, but this symlink gets created here:

= { "aboot-deploy", "-r", path_str, "-c", abootcfg, "-o", options, aboot, NULL };

during osbuild and during finalized-staged step when upgrading, etc. Maybe I have to account for that somehow, not an selinux guru yet, but the package would not be aware of this file, because it's to be created on demand.

I have a question though @cgwalters , does /sysroot/ostree or /ostree dir exist when the generator gets called? Is it a case of an avc for a file that doesn't even exist maybe?

It's hard to tell, because that's where I'm trying to get the relevant data in this case, rather than from /proc/cmdline .

On a side note, another thing noticeable about this generator is, it's probably a todo task sometime to get the logging working for this generator, by writing the logs to /dev/kmsg instead. I don't think the current logs get written anywhere accessible.

@ericcurtin
Copy link
Collaborator Author

ericcurtin commented Jul 5, 2023

Maybe I could just create two junk links the the aboot-deploy rpm for /ostree/root.a and /ostree/root.b? to satisfy selinux, just to be aware that files are expected to be there, they can be replaced with valid ones later. Literally something like:

$ ls -ltr
total 8
drwxr-xr-x 1 root root 12 Jul  4 15:25 deploy
lrwxrwxrwx 1 root root  7 Jul  5 12:31 root.a -> invalid
lrwxrwxrwx 1 root root  7 Jul  5 12:31 root.b -> invalid

@ericcurtin ericcurtin force-pushed the ostree-system-generator-aboot branch from 5740d96 to ada3d9d Compare July 6, 2023 14:06
@ericcurtin ericcurtin marked this pull request as ready for review July 6, 2023 14:17
@ericcurtin
Copy link
Collaborator Author

Tested successfully and marked as non-draft.

@ericcurtin
Copy link
Collaborator Author

ls -ltr
total 8
drwxr-xr-x 1 root root 12 Jul 4 15:25 deploy
lrwxrwxrwx 1 root root 7 Jul 5 12:31 root.a -> invalid
lrwxrwxrwx 1 root root 7 Jul 5 12:31 root.b -> invalid

Did not need to do this in the end.

@ericcurtin ericcurtin force-pushed the ostree-system-generator-aboot branch 5 times, most recently from 964aaf1 to a335f0b Compare July 8, 2023 14:53
@ericcurtin
Copy link
Collaborator Author

ericcurtin commented Jul 8, 2023

@cgwalters I think this is ready for review, I took some steps to make the non-aboot case less affected, now the regex is only tweaked for the aboot case.

We must resolve the /ostree/root.[ab] symlink to it's realpath and then
extract the osname from it (example centos), in the non-aboot case this
is what is expected from the output string of this function.
@ericcurtin ericcurtin force-pushed the ostree-system-generator-aboot branch from a335f0b to b8aad73 Compare July 10, 2023 14:15
@@ -93,18 +93,40 @@ path_kill_slashes (char *path)
static char *
stateroot_from_ostree_cmdline (const char *ostree_cmdline, GError **error)
Copy link
Collaborator Author

@ericcurtin ericcurtin Jul 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be worth doing this in a follow on PR, maybe these non-public functions should be like:

static char *
stateroot_from_ostree_target (const char *ostree_target, GError **error)

with:

s/ostree_cmdline/ostree_target/g

because in the aboot case for example, this parameter is not part of the cmdline, we get the ostree_target via other means.

@cgwalters
Copy link
Member

@ericcurtin So I think what would help a lot here is:

  • Can you make even a rough draft notes in doc/android-boot.md that describes how this works?
  • Also in that doc can you give instructions for getting a reproducer build with this setup? It's using the auto toolchain right?
  • To fast path the above is there any prebuilt qcow2s online with this setup?

@cgwalters
Copy link
Member

OK so looking at this slightly more, I am increasingly of the feeling that having ostree-impl-system-generator.c parse the ostree= karg was a mistake.

In #2928 I did change the prepare-root code (running in the initramfs, the thing that canonically parses ostree=) to add metadata into /run/ostree-booted.

And so it'd be easy to extend that to also include the stateroot. So this code here should just read /run/ostree-booted, in the same way that I changed ostree-remount.c to do so as well.

(Incidentally I think we could also change ostree-remount.c to not be a separate binary at all, and the systemd unit just does ExecStart=ostree admin remount as an implementation detail, much in the same way as we have ostree admin finalize-staged)

@ericcurtin
Copy link
Collaborator Author

ericcurtin commented Jul 14, 2023

@ericcurtin So I think what would help a lot here is:

  • Can you make even a rough draft notes in doc/android-boot.md that describes how this works?

Yes I'll push a PR for that.

  • Also in that doc can you give instructions for getting a reproducer build with this setup? It's using the auto toolchain right?

Yeah I'll add instructions, it's only buildable for CentOS Automotive Stream distribution, but Fedora Mobility SIG have expressed interest in this for OnePlus 6 (the same technique works, it's the same firmware/bootloader, just an earlier version) and I could see it being useful for many IoT devices that have Android-like firmware.

  • To fast path the above is there any prebuilt qcow2s online with this setup?

Not at the moment, but I can share a guide to build.

OK so looking at this slightly more, I am increasingly of the feeling that having ostree-impl-system-generator.c parse the ostree= karg was a mistake.

With composefs, we could make this redundant tbh, because it's gonna fail the composefs check if the filesystem you get pointed to by say a symlink is not the correct one.

In #2928 I did change the prepare-root code (running in the initramfs, the thing that canonically parses ostree=) to add metadata into /run/ostree-booted.

I'll take a look at this.

And so it'd be easy to extend that to also include the stateroot. So this code here should just read /run/ostree-booted, in the same way that I changed ostree-remount.c to do so as well.

(Incidentally I think we could also change ostree-remount.c to not be a separate binary at all, and the systemd unit just does ExecStart=ostree admin remount as an implementation detail, much in the same way as we have ostree admin finalize-staged)

Sounds reasonable.

@cgwalters
Copy link
Member

I'll take a look at this [passing state from initramfs to root].

Sure, if you want; though we need #2928 merged first, so the first step is for someone to review it.

@ericcurtin
Copy link
Collaborator Author

ericcurtin commented Jul 14, 2023

If we remove "ostree=" arg, prepare-root would still have to detect we are an android-like system to do things slightly differently, but that's easy because the "androidboot.slot_suffix=" karg will be there anyway to detect this.

@cgwalters
Copy link
Member

OK I looked at this slightly more, and I think we need #2938 - it won't work AFAIK to change just one instance of this.

I actually have some deeper questions/uncertainty about the integration of aboot here. ISTM that we may need to change the core concept of iterating over the bootloader entries as source of truth.

IOW we may need similar parsing of the aboot karg around list_deployments_process_one_boot_entry() and dependencies, right? This is the core bit that makes ostree admin status and similar work.

@cgwalters
Copy link
Member

So with #2938 we will go from three things parsing ostree= to two.

I believe we actually need a common model across the remaining two; I don't think it will work to just change things as in this PR. We should be changing ostree-sysroot.c.

So, closing - but without prejudice, feel free to reopen if you disagree.

@cgwalters cgwalters closed this Jul 17, 2023
@ericcurtin
Copy link
Collaborator Author

ericcurtin commented Jul 18, 2023

@cgwalters in the Android Bootloader case the /boot directory is less relevant, because the actual kernel, cmdline, initramfs and dtb are written to either boot_a or boot_b partition. But we have not eradicated the /boot directory in the Android Bootloader case as even though it's not actively used, having kernel, cmdline, initramfs, dtb, etc. lying around somewhere as separate files can be useful.

Hence, the "swapped directory pattern" is not so relevant either because we don't load those files at boot time. So in order for ostree to know what root filesystem it should boot into, we set up symlinks either root.a or root.b.

During "OSTree Finalize Staged Deployment" we write to boot_a or boot_b partition and set up a symlink /ostree/root.a or /ostree/root.b.

Initially I used the exact same parameter as would be used for ostree= karg in the symlink, so an initial setup looks like this:

drwxr-xr-x. 3 root root 4096 Jul 18 11:36 /ostree/boot.1.1
lrwxrwxrwx. 1 root root    8 Jul 18 11:36 /ostree/boot.1 -> boot.1.1
lrwxrwxrwx. 1 root root   97 Jul 18 11:36 /ostree/root.a -> /sysroot//ostree/boot.1/centos/d44712f9935f7c6f30064e81ab2366e996f0989a064e2ec22326609c4b15d927/0
drwxr-xr-x. 3 root root 4096 Jul 18 11:36 /ostree/deploy
drwxr-xr-x. 7 root root 4096 Jul 18 11:36 /ostree/repo

Then after an "rpm-ostree install" we would look like this:

lrwxrwxrwx. 1 root root    8 Jul 18 11:43 /ostree/boot.0 -> boot.0.1
lrwxrwxrwx. 1 root root   97 Jul 18 11:36 /ostree/root.a -> /sysroot//ostree/boot.1/centos/d44712f9935f7c6f30064e81ab2366e996f0989a064e2ec22326609c4b15d927/0
lrwxrwxrwx. 1 root root   97 Jul 18 11:43 /ostree/root.b -> /sysroot//ostree/boot.0/centos/d44712f9935f7c6f30064e81ab2366e996f0989a064e2ec22326609c4b15d927/0
drwxr-xr-x. 3 root root 4096 Jul 18 11:43 /ostree/boot.0.1
drwxr-xr-x. 3 root root 4096 Jul 18 11:43 /ostree/deploy
drwxr-xr-x. 7 root root 4096 Jul 18 11:43 /ostree/repo

An issue here is that the /ostree/root.a symlink is now actually broken, because /ostree/boot.1 no longer exists. I fear this would break rollbacks.

So in order to fix that, I resolve boot.0 or boot.1 to their actual realpath loctions, so now it looks more like:

lrwxrwxrwx. 1 root root   87 Jun 29 12:34 /ostree/root.b -> deploy/centos/deploy/6ce6fe7c07c4274f7d536f6d505b34fc1453978f576c4b35d241ab7a4dd06898.0
drwxr-xr-x. 3 root root 4096 Jun 29 12:36 /ostree/boot.1.1
lrwxrwxrwx. 1 root root    8 Jun 29 12:36 /ostree/boot.1 -> boot.1.1
lrwxrwxrwx. 1 root root   87 Jun 29 12:37 /ostree/root.a -> deploy/centos/deploy/76fee8c28c91f92d598217254fabf3da17a309b4208bdc38a135ac54fc1e82de.0
drwxr-xr-x. 3 root root 4096 Jun 29 12:37 /ostree/deploy
drwxr-xr-x. 7 root root 4096 Jun 29 12:37 /ostree/repo

and it fixes the problem as the symlinks are never broken.

I'm open to other ideas, but this is why I'd like to parse aboot slightly differently in the Android Bootloader case. The concept of a /boot directory is a little fake with this bootloader.

@cgwalters
Copy link
Member

This is just an increasingly held opinion, but I think I'd advocate trying to teach ostree in this case to stop reading and writing bootloader entries at all in this case - the core code needs to understand that it was booted via android boot, and use that as source of truth.

@cgwalters
Copy link
Member

A lot of the complexity here in ostree is trying to support an arbitrary number (N) of userspace roots, which can use an arbitrary number (M) of "boot entries".

In the common case where the kernel doesn't change, ostree avoids re-writing a copy of the kernel+initramfs.

But, maybe big picture this is not an important optimization for android boot in the production path...particularly because based on the intersection of this with composefs and the argument that we should "strictly bind" UKI-like-thing with userspace via a throwaway key, we will never have shared boot state between two userspace trees.

Now it's interesting that you're testing with rpm-ostree install because that will reliably create the situation where multiple userspace is sharing boot state.

Anyways if I'm understanding things correctly, the way ostree is copying the kernel/initramfs into /boot/ostree is just unnecessary for this use case, right?

So I think for android boot we need to switch to basically stopping using /boot at all.

@ericcurtin
Copy link
Collaborator Author

ericcurtin commented Jul 18, 2023

This is just an increasingly held opinion, but I think I'd advocate trying to teach ostree in this case to stop reading and writing bootloader entries at all in this case - the core code needs to understand that it was booted via android boot, and use that as source of truth.

Yup in the Android Bootloader case, bootloader entries mean nothing... But the files that are in /boot directory are needed to compose the Android Boot Image server side, and we store aboot-5.14.0-339.302.el9iv.aarch64.img in /boot client-side before it's written via dd to the boot partition.

A lot of the complexity here in ostree is trying to support an arbitrary number (N) of userspace roots, which can use an arbitrary number (M) of "boot entries".

In the common case where the kernel doesn't change, ostree avoids re-writing a copy of the kernel+initramfs.

But, maybe big picture this is not an important optimization for android boot in the production path...particularly because based on the intersection of this with composefs and the argument that we should "strictly bind" UKI-like-thing with userspace via a throwaway key, we will never have shared boot state between two userspace trees.

Yes true, we need to write to either a or b boot partition in every case in the production android boot case, because of what you just described. An upgrade is just an upgrade as Android Boot Image (which contains cmdline) and userspace are a strict pair, no shared boot state as you say.

Now it's interesting that you're testing with rpm-ostree install because that will reliably create the situation where multiple userspace is sharing boot state.

I test with all kinds of upgrades "rpm-ostree upgrade", "rpm-ostree install", etc. Because I care about the developer use case as well as the production use case.

Anyways if I'm understanding things correctly, the way ostree is copying the kernel/initramfs into /boot/ostree is just unnecessary for this use case, right?

In the production use case it's not necessary, in the developer use case, it's very useful! As devs can change one of kernel, cmdline, initramfs, dtb, client-side, generate a new "/boot/aboot-5.14.0-339.302.el9iv.aarch64.img" file from this, write to boot partition and reboot.

So I think for android boot we need to switch to basically stopping using /boot at all.

I'm reluctant to do this because people are using the workflow previously described in the developer, non-secure boot case. Otherwise they have to revert to separate a non-ostree based image for development which is not great.

If we stopped using /boot, we'd need to migrate at least 5 files to some other directory for development purposes and a user has to remember these files are in a different place to a standard non-android boot image.

@cgwalters
Copy link
Member

and we store aboot-5.14.0-339.302.el9iv.aarch64.img in /boot client-side

Canonically it's actually in /usr/lib/modules/, no?

@ericcurtin
Copy link
Collaborator Author

ericcurtin commented Jul 18, 2023

and we store aboot-5.14.0-339.302.el9iv.aarch64.img in /boot client-side

Canonically it's actually in /usr/lib/modules/, no?

Yes sorry you are right, it's a non-ostree system the files are in /boot 🤦

I'll try and remove /boot directory, those files are not there in the ostree case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants