-
Notifications
You must be signed in to change notification settings - Fork 574
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
o/hookstate/ctlcmd: queue service commands if run from default-configure hook #13960
o/hookstate/ctlcmd: queue service commands if run from default-configure hook #13960
Conversation
153e4fd
to
757129b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good.
Please make sure to link updated documentation on the snapcraft forum.
This fix just makes the implementation consistent with the documented behavior https://snapcraft.io/docs/supported-snap-hooks#heading--default-configure and the snapctl docs don't mention this behavior at all. Do you have a specific doc page in mind that would need updating? |
It's not clear why the user still wants to start services from within the default configure hook. My understanding is the the reason for this was specific to the order of configure-hook VS starting services as part of the do-install flow which should not be required anymore unless I am missing something. A secondary more subtle point, it does not seem like a good idea to start/interfere with service commands before do-install start service is done. Longer term we might want explicitly limit snapctl powers depending on how far the install sequence progressed to avoid creative permutations that cause problems. |
The reason why the user have service disabled still: They use the snap for core and for classic, and in classic there is no concept of seeding configuration with gadget. They rely on the user manually modifying the configuration and starting the service, so it is required to have it disabled by default. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some question/comments
overlord/hookstate/ctlcmd/helpers.go
Outdated
@@ -149,7 +149,7 @@ func runServiceCommand(context *hookstate.Context, inst *servicestate.Instructio | |||
return err | |||
} | |||
|
|||
if !context.IsEphemeral() && context.HookName() == "configure" { | |||
if !context.IsEphemeral() && (context.HookName() == "configure" || context.HookName() == "default-configure") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where do the tasks queued after default-configure exactly end up? it's not very transparent from tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are queued at the end after all install tasks
https://github.com/snapcore/snapd/blob/0bf3e39e00948e76ed45cbcc96facee8da181395/overlord/hookstate/ctlcmd/helpers.go#L114-L124
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added clarifying comments in the test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does this mean after the configure hook? does this solve the original issue people were having in ways they expect?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, after both the configure hook and the health-check hook.
For illustration this is the expected order of tasks for installing a snap which is used in the unit tests:
"prerequisites",
"download-snap",
"validate-snap",
"mount-snap",
"copy-snap-data",
"setup-profiles",
"link-snap",
"auto-connect",
"set-auto-aliases",
"setup-aliases",
"run-hook[install]",
"run-hook[default-configure]",
"start-snap-services",
"run-hook[configure]",
"run-hook[check-health]",
"exec-command", <---- task for snapctl command
"service-control", <---- task for service operation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, we probably want to ask the reporter on the bugs if this is too late for them or ok?
Good point, I will check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @ZeyadYasser for the fix and your explanations.
The use case is from a snap that has a one-shot service doing several firewall and network configuration to set up a Thread bridge (linked in the bug report). As mentioned in #13960 (comment), the services are disabled by default, so that the user (on classic but also a pre-built Core image) gets a chance to configure for e.g. the networking interface before starting the services. However, the same snap needs to start automatically on Core when the config is seeded from a gadget. (A workaround was suggested by @zyga to keep services enabled by default but not run them when the needed config isn't set; this could work well for a limited config set.)
In our case, with the workflow proposed in this PR, it could work as follows:
default-configure
hook callssnapctl start
if theautostart
config option is set to true.configure
hook does some validation (not done indefault-configure
to avoid redundant checks).- services start.
But I suppose this isn't valid in every use case, as some applications expect the configure
hook to execute while the services are running.
In summary, while the proposed order isn't an issue for our use case, I believe starting the services right after the hook succeeds makes this more inline with the snap best practices.
Side note: I wonder why run-hook[check-health]
is ordered between configure
and exec-command
/service-control
tasks.
Another solution I can think of is to add the ability to just enable/disable services via snapctl
without starting or stopping them. Then, the default-configure
hook could simply call for e.g. the snapctl enable --service <service>
(snapctl enable
does something else) command (without needing to queue) and the following task (start-snap-services
) should just take care of the startup. I've already discussed this with @ernestl, and I'm not sure if this is going to work. I have experimented by calling snapctl stop --disable <service>
from install
and default-configure
hooks for an enabled service, and it works as expected: the service gets disabled and never starts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the earliest we can start those services is after snap-start-services, @ZeyadYasser could you look what would be the complexity of queuing commands from default-configure there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could have a variant of queueCommand
or some special handling code that does the following:
- if the service commands are coming from
default-configure
hook - simply inject the commands' tasksets after
start-snap-services
usingsnapstate.InjectTasks
snapstate.InjectTasks
takes care of the complexity of adjusting the tasks' dependencies
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PR is now updated to inject service commands after start-snap-services
for the default-configure
hook.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is fine with me assuming it solves the original problems as is
91078ff
to
42355fc
Compare
I rebased the PR to include test changes from #14032. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there are issues related to where to inject if lanes are shared like if transaction=all-snaps is used
this will need re-reviews
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thank you, test question
|
||
func (s *servicectlSuite) TestQueuedCommandsOrderingConfigureHook(c *C) { | ||
const hook = "configure" | ||
const singleTransaction = false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't we need the variant when this is true? or is the check above not working for that case? if we want not to test we should at least leave a comment somewhere
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, It doesn't work for configure hook. It is tricky to get snap name for all task kinds in the case of configure hook so I didn't add the single transaction fix for configure hook since it is a non-issue due how commands are queued at the very end .
I added the test (but commented out) with a comment explaining why it is left untested.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks
…re hook Queue "snapctl restart ..." and "snapctl start ..." commands to be run after default-configure similar to configure hook. This is to avoid a problem where the service doesn't see a new value if it uses "snapctl get ...", because it's still not commited by default-configure hook. Fixes: https://bugs.launchpad.net/snapd/+bug/2047949 Signed-off-by: Zeyad Gouda <[email protected]>
And add comments explaining tasks relative order. Signed-off-by: Zeyad Gouda <[email protected]>
Signed-off-by: Zeyad Gouda <[email protected]>
…t-snap-services Signed-off-by: Zeyad Gouda <[email protected]>
Multiple snaps could be installed in a single transaction where all snap tasksets are in a single lane. The old simplistic approach of looking up the first "start-snap-services" fails when we have multiple tasks for multiple snaps in the same lane. A test is added to trigger this corner case, and the fix just checks the snap name associated with the "start-snap-services" tasks. Signed-off-by: Zeyad Gouda <[email protected]>
…le transaction Signed-off-by: Zeyad Gouda <[email protected]>
56a6bae
to
3bcbee9
Compare
Rebased to fix failing arch tests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks
This PR queues
snapctl
services' commands to be run after thedefault-configure
similar to what is done for configure hook #4070. This is to avoid a problem where the service doesn't see a new value if it usessnapctl get ...
, because it's still not committed bydefault-configure
hook similar to what was described in LP#2047949 .Fixes: https://bugs.launchpad.net/snapd/+bug/2047949