Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Could lilos_handoff::Push take &mut Option<T> to be cancel safe? #9

Open
mattfbacon opened this issue Jul 29, 2023 · 5 comments
Open

Comments

@mattfbacon
Copy link
Contributor

This way the pointer could be directly provided so the T isn't taken out unless the future succeeds.

@korrat
Copy link

korrat commented Sep 1, 2023

I believe while this fixes the issue with dropping the future returned by Push::push, API consumers cannot avoid being weakly cancel-safe themselves. To use such an API, the API consumer would need to have space for the T in its own future (or call stack). As soon as this future is cancelled, the T is dropped instead of being processed.

As I understand the problem, we want to perform two operations: generating some value and sending this value to a different task. With the current API is that we always need to generate the value before sending it. Ideally, we would only generate the value when we can be sure to succeed in sending it. lilos::spsc::Permit solves this neatly, as we can obtain the permit before generating the value. Once we obtain the permit, we are assured that there is space in the queue and sending the value will succeed without having to await. Therefore, we can generate and send the value in a single invocation of poll.

@mattfbacon
Copy link
Contributor Author

I have to think about that. I wonder it it could actually be done soundly.. if the pop end is waiting, we get a permit. Then the pop future is cancelled and forgotten (no Drop), but the permit would still seem valid. I think this one-step approach is the only valid option.

That said, maybe an API that takes make_value: impl FnOnce() -> T would work?

@cbiffle
Copy link
Owner

cbiffle commented Mar 2, 2024

Okay so I clearly need to fix my Github notification settings. Hello from like seven months later!

On @korrat's point - I tried rewriting the handoff API to take a closure, but it turns out to be incredibly subtle -- since the closure needs to be called from the taker's context it wasn't clear that it could be done. Or at least, I couldn't figure out how to do it. I'm definitely open to insights.

@mattfbacon - I think your push_from operation is clever. I need to do some testing in my codebases to see what the generated code looks like. I feel like it addresses 90% of my cancel correctness concerns with handoff -- while @korrat is correct that it doesn't save the caller from doing a potentially destructive or expensive operation to produce the T, it does at least let the caller "put it back" on cancellation of the pusher future.

The fact that some_handoff.push_from(&mut None) compiles is a bit unfortunate. Your decision to treat it as a no-op in #11 is one of several ways to address that, and has the advantage of generating very little code compared to a panic. I think your approach may be reasonable.

@cbiffle
Copy link
Owner

cbiffle commented Mar 2, 2024

After thinking about @korrat's analysis in #14, I'm less confident about push_from, because it isn't sufficient to make the transfer truly cancel-correct. The sequence of events in question:

  1. Task A parks at pop waiting for rendezvous.
  2. Task B executes push_from, which immediately completes because A was waiting. B has lost access to the item it's sending.
  3. Before resolving the pop future, Task A cancels it. This could be a result of e.g. popping from the handoff within a select_biased! that chooses an earlier branch and bails. The item is now lost with no indication.

There are various small changes I could imagine to address one piece or the other -- for instance, we could maintain a "data lost" flag so that there's at least some indication that this happened, though I don't see how I would use that in practice to fix an application.

I am currently leaning toward concluding that handoff, while super useful, is inherently cancel-unsafe, and can only be used in cases where data loss is tolerable (rare) or where both sides are careful not to cancel the push/pop futures (more common).

I've also tested the associated PR on some of my firmware, and it increases code size by a small but measurable amount. I'm investigating the causes. I suspect that the compiler is unable to entirely eliminate the operations on the Option, which is odd since it appears to be totally inlining push_from.

Given (1) the realization that a minimum-copy handoff may be inherently cancel-unsafe, and (2) the increase in lines-of-code and bytes-in-flash, I'm now leaning toward not merging this PR in its current form.

I'm definitely open to alternatives, or to being convinced otherwise.

@mattfbacon
Copy link
Contributor Author

Ah, good point, it's cancel-safe in the case where the push_from happens before the pop, but not vice versa, because of the PopWait variant. Maybe I'm missing something, but why does the popper need to provide the location in this variant? Why couldn't the popper wait until the PushWait condition occurs, then take it from that?

@cbiffle cbiffle changed the title Could Push take &mut Option<T> to be cancel safe? Could lilos_handoff::Push take &mut Option<T> to be cancel safe? Apr 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants