Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TestControl + Dispatcher == deadlock #4104

Open
Baccata opened this issue Jul 18, 2024 · 13 comments
Open

TestControl + Dispatcher == deadlock #4104

Baccata opened this issue Jul 18, 2024 · 13 comments

Comments

@Baccata
Copy link
Contributor

Baccata commented Jul 18, 2024

Using CE 3.5.4, invoking a Dispatcher#unsafeRunSync in a program run with TestControl appears to deadlock

//> using dep "org.typelevel::cats-effect-testkit:3.5.4"

import cats.effect.IOApp
import cats.effect.IO
import cats.effect.std.Dispatcher
import scala.concurrent.duration._
import cats.effect.testkit.TestControl

object Main extends IOApp.Simple {

  // exact same thing happens using parallel 
  val program = Dispatcher.sequential[IO].use { dispatcher =>
    IO(dispatcher.unsafeRunSync(IO(1)))
  }

  def run: IO[Unit] =
    TestControl
      .executeEmbed(program)
      .flatMap(IO.println)
      .timeout(2.seconds)  // timeout doesn't kick-in 

}
@djspiewak
Copy link
Member

I'm pretty sure this is expected behavior. TestControl is a single-threaded executor, so unsafeRunSync in all its forms will deadlock.

@Baccata
Copy link
Contributor Author

Baccata commented Jul 18, 2024

Damn, forgot about that.

By any chance, is there anything that could help mitigate the issue, to help other poor souls understand that it's a no-no without spending a few hours debugging ? Could the Dispatcher somehow detect that it's being used in an improper runtime and throw a descriptive error or something ?

@djspiewak
Copy link
Member

First thing that comes to mind is leveraging BlockContext, since unsafeRunSync uses that. It would basically mean that any time we detect a scala.concurrent.blocking, we would immediately error out and fail the test (something like an UnsupportedOperationException feels right).

@Baccata
Copy link
Contributor Author

Baccata commented Jul 22, 2024

Can you elaborate ?

As of now, printing BlockingContext.current in a program that runs on TestControl prints the id of the WorkerThread from the outer runtime. So assuming you're suggesting to momentarily set the BlockingContext to some ad-hoc instance in TestControl, I'm somewhat concerned about the lifecycle, as scala.concurrent.blocking invokes a ThreadLocal.

Then there's the question of whether you're suggesting that an ad-hoc BlockingContext would always throw (which would probably break legitimate uses in TestControl), or whether you're suggesting that the dispatcher implementation would have to inspect BlockingContext.current and decide to throw if it detects the ad-hoc one (which feels somewhat off, separation-of-concern wise).

@armanbilge
Copy link
Member

armanbilge commented Jul 22, 2024

I'm somewhat concerned about the lifecycle

TestControl can use BlockContext's existing lifecycle-managing API:

BlockContext.withBlockContext(myContext) {
  // then this block runs with myContext as the handler
  // for scala.concurrent.blocking
}

Then there's the question of whether you're suggesting that an ad-hoc BlockingContext would always throw (which would probably break legitimate uses in TestControl)

I am concerned about this too, but I am not entirely convinced it would break legitimate use-cases. TestControl is really only useful for testing Temporal-based code. Once you get into the territory that requires blocking we are talking about I/O and multi-threading which TestControl cannot and should not be used to test. We already issue a similar warning:

"Program under test failed produce a result (either a value or an error) and has no further " +
"actions to take, likely indicating an asynchronous deadlock. This may also indicate some " +
"interaction with an external thread, potentially via IO.async or IO#evalOn. If this is the " +
"case, then it is likely you cannot use TestControl to correctly evaluate this program, and " +
"you should either use the production IORuntime (ideally via some integration with your " +
"testing framework), or attempt to refactor the program into smaller, more testable components.")

@armanbilge
Copy link
Member

we would immediately error out and fail the test (something like an UnsupportedOperationException feels right).

Another option is we could try and side-channel this warning, but proceed.

@Baccata
Copy link
Contributor Author

Baccata commented Jul 22, 2024

To elaborate on the context : I came across a deadlock whilst testing code that was integrating with a Caffeine-cache. I wanted to test the TTLs, and caffeine has a mechanism to override the clock it uses to poll time, which is basically a () => Long. I figured I'd wire a dispatcher-based implementation to ensure the time that was received by the cache was consistent with the TestControl's.

So yeah, I don't disagree with what you say, but it does feel like my usecase was somewhat legitimate in the context of TestControl.

It's probably a XY-problem though. Maybe there could be a flavour of execute that could take a (() => FiniteDuration) => IO[A], allowing for the injection of a time-polling thunk into the impure code that needs it, but it feels a bit too bespoke ...

@armanbilge
Copy link
Member

I figured I'd wire a dispatcher-based implementation to ensure the time that was received by the cache was consistent with the TestControl's.

Oh yeah ... if I understand correctly, it would be far better to just use Scheduler directly. I regret that we don't currently expose it in the calculus like we do for Async#executionContext but you could grab that and pattern-match or something ... that would work for WSTP which now extends Scheduler. We'd have to change the TestControl EC to do that as well ...

@djspiewak
Copy link
Member

IO.realTime.syncStep.unsafeRunSync().toOption.get is a nice and concise™ solution that doesn't require ugly pattern matching stuff.

@armanbilge
Copy link
Member

armanbilge commented Jul 23, 2024

IO.realTime.syncStep.unsafeRunSync().toOption.get

Actually this doesn't work 😝 if you convert to SyncIO then you are not running on the IORuntime, which means you are not using the scheduler in that runtime.

@armanbilge
Copy link
Member

In fact, since this can be surprising, I wonder if we should not support translation of realTime and monotonic in the syncStep interpreter 🤔

case IO.RealTime => G.realTime.map(a => Right((a, limit)))
case IO.Monotonic => G.monotonic.map(a => Right((a, limit)))

@djspiewak
Copy link
Member

Ooooooh I forgot that those were cases in the ADT inside of SyncIO. Wow that's actually super annoying.

@armanbilge
Copy link
Member

First thing that comes to mind is leveraging BlockContext, since unsafeRunSync uses that. It would basically mean that any time we detect a scala.concurrent.blocking, we would immediately error out and fail the test (something like an UnsupportedOperationException feels right).

I took a closer look at this and I'm afraid this idea would be too much of a footgun.

Currently TestControl implements IO.blocking via scala.concurrent.blocking:

val runtime: IORuntime = IORuntime(
ctx,
ctx.deriveBlocking(),
new Scheduler {

/**
* Derives a new `ExecutionContext` which delegates to `this`, but wrapping all tasks in
* [[scala.concurrent.blocking]].
*/
def deriveBlocking(): ExecutionContext =
new ExecutionContext {
import scala.concurrent.blocking
def execute(runnable: Runnable): Unit = blocking(self.execute(runnable))
def reportFailure(cause: Throwable): Unit = self.reportFailure(cause)
}

So this would mean that if anyone attempts to use IO.println(...) in executeEmbed (e.g. for debugging) then it would crash. A workaround would be to do IO(println(...)) ...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants