Running Cats Effect on Virtual Threads of JDK21

Tim Evdokimov
Dec 31, 2023

Cats Effect is an amazing piece of high-end machinery, enabling a clear separation of effects and logic for complex concurrent asynchronous flows implemented in Scala. Its default executor, WorkStealingThreadPool, is modeled on the scheduler of the Rust library Tokio and features a pool of just a few worker threads (as many as there are CPU cores allocated), capable of processing non-blocking runnable thunks with very high efficiency.

While Cats Effect provides a separate second pool for explicit blocking operations, marked by IO.blocking sections, blocking contexts can actually occur in many other places, especially when new code written for Cats Effect needs to interact with a legacy asynchronous implementation, based on Scala Future or Akka actors, inside the same JVM.

Often, a forced conversion of an IO effect into a value needs to happen synchronously. This is done by summoning a Dispatcher and calling its unsafeRunSync() method.

Update: Another legit method I was originally not aware of is to use IO.syncStep(…) and SyncIO.

Internally, this forces the creation of a Future, and the current thread waits on a lock until that Future completes.
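A minimal sketch of such a bridge, assuming Cats Effect 3.4 or later (for Dispatcher.parallel). The names LegacyBridge and legacyLookup are hypothetical and stand in for whatever legacy entry point has to return a plain value:

```scala
import cats.effect.{IO, IOApp}
import cats.effect.std.Dispatcher

object LegacyBridge extends IOApp.Simple {

  // Hypothetical legacy-style method: it must hand back a plain Int,
  // so the IO is forced synchronously through the Dispatcher.
  def legacyLookup(dispatcher: Dispatcher[IO], key: String): Int =
    dispatcher.unsafeRunSync(IO.pure(key.length))

  val run: IO[Unit] =
    Dispatcher.parallel[IO].use { dispatcher =>
      // legacyLookup is called from a compute thread here, so the
      // calling thread blocks while it waits for the result.
      IO(legacyLookup(dispatcher, "cats"))
        .flatMap(n => IO.println(s"length = $n"))
    }
}
```

The effect itself is trivial here; the point is only that the caller of legacyLookup is parked until the result arrives, whatever the effect does.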

Fortunately, WorkStealingThreadPool has a smart built-in mechanism to deal with blocking contexts:

  • When an io-compute-NNN thread hits a blocking context, it relabels itself as io-compute-blocker-NNN and bravely stays with the blocking code.
  • To avoid depleting the io-compute threads, a clone of the current thread is created and put back into the main compute pool. After the blocking part is done, the io-compute-blocker-NNN thread doesn't go away immediately, but lingers for another minute, waiting for other blocking thunks to come along.
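As far as I understand it, the pool notices these blocking contexts through Scala's standard BlockContext hook, which both scala.concurrent.blocking and Await.result go through. The sketch below uses only the Scala standard library to show that hook in isolation; CountingBlockContext is a hypothetical stand-in and not the Cats Effect implementation:

```scala
import scala.concurrent.{BlockContext, CanAwait, blocking}

object BlockContextDemo {

  // Hypothetical BlockContext that only counts how often it is notified.
  // A real pool would use this callback to spawn a replacement worker.
  final class CountingBlockContext extends BlockContext {
    var blockingSections: Int = 0

    override def blockOn[T](thunk: => T)(implicit permission: CanAwait): T = {
      blockingSections += 1
      thunk // run the blocking code on the current thread
    }
  }

  // Runs one `blocking` section under the given context and returns its value.
  def runUnder(ctx: CountingBlockContext): Int =
    BlockContext.withBlockContext(ctx) {
      blocking {
        Thread.sleep(10) // stands in for a real blocking call
        42
      }
    }
}
```

Calling BlockContextDemo.runUnder with a fresh context returns 42 and leaves blockingSections at 1, which is the kind of signal a thread pool can react to.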

This all works rather smoothly when blocking contexts occur at a reasonably continuous rate. In such a scenario, there will always be some io-compute-blocker-NNN threads lingering around to process incoming blocking code.

But when blocking runnables start arriving in massive bursts and no idle io-compute-blocker-NNN threads are available, more and more threads get created. All these threads compete for the same few scarce CPU cores, causing a runaway positive feedback loop.

There is no effective remedy against it. Even if nothing really blocking happens inside unsafeRunSync(), the overhead of uncapped thread creation and of the Future/await machinery can bring an otherwise healthy process to a near halt.
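The growth pattern can be reproduced in miniature with a plain JDK cached thread pool. This is only an analogy under stated assumptions: the pool below is not the Cats Effect pool, and BurstDemo and threadsCreatedForBurst are hypothetical names. It shows what happens when a burst of blocking tasks arrives before any of them finishes and the pool has no cap:

```scala
import java.util.concurrent.{CountDownLatch, Executors, ThreadFactory, TimeUnit}
import java.util.concurrent.atomic.AtomicInteger

object BurstDemo {

  // Returns how many platform threads an uncapped cached pool creates
  // when `tasks` blocking tasks arrive before any of them completes.
  def threadsCreatedForBurst(tasks: Int): Int = {
    val created = new AtomicInteger(0)
    val factory: ThreadFactory = (r: Runnable) => {
      created.incrementAndGet() // count every thread the pool asks for
      val t = new Thread(r)
      t.setDaemon(true)
      t
    }
    val pool = Executors.newCachedThreadPool(factory)
    val release = new CountDownLatch(1)
    try {
      // Every task blocks on the latch, so no thread ever becomes idle
      // and each submission forces a brand-new thread.
      (1 to tasks).foreach(_ => pool.execute(() => release.await()))
      created.get()
    } finally {
      release.countDown() // let all tasks finish
      pool.shutdown()
      pool.awaitTermination(10, TimeUnit.SECONDS)
    }
  }
}
```

With no idle thread to reuse, the number of threads created equals the size of the burst, which mirrors the one-thread-per-blocked-caller growth described above.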

Fortunately, the new JDK 21 (the latest LTS after JDK 17) has just arrived, featuring Virtual Threads. And it was quite a trivial exercise to replace the default Cats Effect runtime with a JDK 21-based one. All it takes is overriding this method in IOApp:

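    import java.util.concurrent.Executors
    import scala.concurrent.ExecutionContext
    import cats.effect.unsafe.IORuntime

    // Inside the IOApp: run all compute work on JDK 21 virtual threads
    override protected def runtime: IORuntime = {
      // Every submitted task gets its own new virtual thread
      val compute = Executors.newVirtualThreadPerTaskExecutor()
      val executionContext = ExecutionContext.fromExecutor(compute)
      IORuntime
        .builder()
        .setCompute(executionContext, () => compute.shutdown())
        .build()
    }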
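To see what this executor does with blocking code, independent of Cats Effect, here is a small sketch that uses only the JDK 21 and Scala standard libraries. VirtualThreadDemo and runBlockingBurst are hypothetical names, and the 10 ms sleep stands in for any blocking call:

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object VirtualThreadDemo {

  // Runs `tasks` blocking tasks on virtual threads and reports, for each
  // task, whether it actually ran on a virtual thread.
  def runBlockingBurst(tasks: Int): Seq[Boolean] = {
    val executor = Executors.newVirtualThreadPerTaskExecutor()
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutor(executor)
    try {
      val futures = (1 to tasks).map { _ =>
        Future {
          Thread.sleep(10) // parks the virtual thread, not a platform thread
          Thread.currentThread().isVirtual()
        }
      }
      Await.result(Future.sequence(futures), 30.seconds)
    } finally executor.shutdown()
  }
}
```

A burst of a thousand such tasks completes without creating a thousand platform threads, because a sleeping virtual thread releases its carrier thread for other work.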

No substantial difference was measured on non-blocking effects, but for unsafeRunSync(), JDK 21 beats the default executor with flying colors.

Full test code available here:
https://github.com/jacum/cats-effect-jdk21

Final scores, each test running 10000 repetitions of unsafeRunSync() in a given number of parallel threads.

WorkStealingThreadPool: The more effects run in parallel, the more substantial the overhead of creating extra threads becomes. The number of io-compute-blocker-NNN threads in each test ends up at (number of parallel processes + 3): 203 for the test with 200 processes, 503 for the test with 500. That is far too many for the 2-4 CPU cores that most cloud workloads will ever have allocated.

JDK 21 Virtual threads: No extra threads are created, all runnables handled natively by JVM. With higher parallelism, elapsed times go up almost linearly with the number of operations, implying that native scheduling of blocking contexts on virtual threads is very efficient.

This approach has recently been tested in some of our production workloads — and it works very reliably.

The only caveat is that, by default, Cats Effect tracing attaches something to each thread, and with JDK 21 virtual threads this causes a memory leak. With -Dcats.effect.tracing.mode=NONE, fiber tracing is disabled and the leak goes away.
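One possible way to pass that flag, assuming the application is launched through sbt's run task (an assumption, since the build setup is not described here), is a build.sbt fragment like this:

```scala
// build.sbt (sketch): JVM options only take effect in a forked JVM
run / fork := true

// Disable Cats Effect fiber tracing, as described above
run / javaOptions += "-Dcats.effect.tracing.mode=NONE"
```

For a packaged application the same -D option would go on the java command line or into whatever mechanism the deployment uses for JVM flags.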

Conclusion

JDK 21 provides a reasonable drop-in replacement for the default Cats Effect runtime: it is just as efficient with non-blocking runnables and stunningly superior at handling workloads that may enter blocking contexts, such as Future/await calls into a legacy codebase.

Update: https://github.com/typelevel/cats-effect/discussions/3927 was initiated, to follow up with Cats Effect maintainers.
