Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Swapping IO for abstract IO type-classes #396

Merged
merged 1 commit into from
Oct 3, 2024

Conversation

recursion-ninja
Copy link
Collaborator

Refactor to use io-classes type-classes instead of IO

A reasonably large-scale refactoring (of API type signatures) which addresses the numerous TODO annotations stating "replace by io-classes constraints for IO simulation" in order to provide a more general, polymorphic API for users of the lsm-tree library. The indirect object to be replaced that the TODO annotation references are the m ~ IO constraints found on the same line(s) prefixing the TODO annotation, allowing the post-refactoring functions to abstract over IO operations rather than exposing an API which is specialized to IO. While most TODO annotations were in the Database.LSMTree.Internal module, this PR touches many additional modules. This wide footprint of changes is necessary because the Database.LSMTree.Internal module imports many other modules from the lsm-tree project, hence each of these "dependency modules" required a "IO-abstraction" API update in order to facilitate the "IO-abstraction" of the Database.LSMTree.Internal module.

The io-classes which are utilized in place of specialized IO functions are:

Important Note

Wherever the PR updated a module API from exposing a specialized IO version of a function to a generalized io-classes function, a SPECIALISE directive was also added which specializes the polymorphic m type parameter to IO. The rational here is to have the module's interface file still expose a version of the function which is specialized to IO in order to prevent any performance regressions. However, this may be excessive and result in unnecessarily longer compiler times and code sizes.

I would like some commentary from reviewers more experience balancing SPECIALISE directive usage as to whether the abundant and boilerplate additions of theSPECIALISE directives are a wise decision.

Copy link
Collaborator

@jorisdral jorisdral left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like some commentary from reviewers more experience balancing SPECIALISE directive usage as to whether the abundant and boilerplate additions of the SPECIALISE directives are a wise decision.

Generalising the code to work for any IO-like monad is done for testing purposes only, and since the library is intended to be highly performant, the generalisation should ideally have no overhead. So, the SPECIALISE pragmas are warranted -- the compile time is also not prohibitively long at the moment, even with the changes in this PR

On that note, we should verify that the generalisation does not impact our performance. Let's run the benchmarks to see if we can see a clear difference

Also, I have some proposed changes on the branch jdral/io-classes-for-abstract-types. You can cherry-pick what you want to include.

It's puzzling that tests are currently timing out -- there are no logic changes AFAICS. We should find out what is happening here

Note that with the changes on my branch, it seems most of our tests succeed that are currently failing, except for propLockstepIO_RealImpl_RealFS. However, I'm hesitant to say that my changes "fix" the problem if we don't know the origin of the bug. However, I've played around a bit with the statemachine tests, and it seems that disabling the Open action makes the propLockstepIO_RealImpl_RealFS pass... Not sure why

src/Database/LSMTree/Internal.hs Outdated Show resolved Hide resolved
src/Database/LSMTree/Internal/Run.hs Outdated Show resolved Hide resolved
src/Database/LSMTree/Internal/RunBuilder.hs Outdated Show resolved Hide resolved
src/Database/LSMTree/Internal/RunBuilder.hs Outdated Show resolved Hide resolved
src/Database/LSMTree/Internal/RunBuilder.hs Outdated Show resolved Hide resolved
src/Database/LSMTree/Internal/RunBuilder.hs Outdated Show resolved Hide resolved
src/Database/LSMTree/Internal/RunReader.hs Outdated Show resolved Hide resolved
src/Database/LSMTree/Internal/RunReader.hs Outdated Show resolved Hide resolved
@recursion-ninja recursion-ninja force-pushed the recursion-ninja/io-classes-for-abstract-types branch from 00ddaee to 52875ff Compare September 25, 2024 01:36
@dcoutts
Copy link
Collaborator

dcoutts commented Sep 26, 2024

I concur with all the PR review suggestions from Joris above.

@dcoutts
Copy link
Collaborator

dcoutts commented Sep 26, 2024

Also, this is great! :-) Lets try and get it in soon before it bitrots.

If necessary to avoid too many conflicts and bit-rot, we can merge it in pieces. For example could omit the .Internal module at first and convert a batch of modules, next batch etc and finally top level that pulls it all together. But if we can do it soon all in one go, that's better still.

@recursion-ninja recursion-ninja force-pushed the recursion-ninja/io-classes-for-abstract-types branch 9 times, most recently from 16d4d23 to 3d5cb29 Compare October 1, 2024 15:51
Copy link
Collaborator

@jorisdral jorisdral left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM -- just a few small things to change and then we can merge.

Also, let's apply this diff to bypass the NoThunks test error (for now)

diff --git a/src-extras/Database/LSMTree/Extras/NoThunks.hs b/src-extras/Database/LSMTree/Extras/NoThunks.hs
index 87b745e7..46d3dca5 100644
--- a/src-extras/Database/LSMTree/Extras/NoThunks.hs
+++ b/src-extras/Database/LSMTree/Extras/NoThunks.hs
@@ -1,3 +1,4 @@
+{-# LANGUAGE CPP                   #-}
 {-# LANGUAGE DataKinds             #-}
 {-# LANGUAGE QuantifiedConstraints #-}
 {-# LANGUAGE UndecidableInstances  #-}
@@ -492,11 +493,20 @@ instance NoThunksIOLike' IO RealWorld
 
 type NoThunksIOLike m = NoThunksIOLike' m (PrimState m)
 
+-- TODO: on ghc-9.4, a check on StrictTVar IO (RWState (TableContent IO h))
+-- fails, but we have not yet found out why so we simply disable NoThunks checks
+-- for StrictTVars on ghc-9.4
 instance NoThunks a => NoThunks (StrictTVar IO a) where
   showTypeOf (_ :: Proxy (StrictTVar IO a)) = "StrictTVar IO"
-  wNoThunks ctx var = do
-      x <- readTVarIO var
-      noThunks ctx x
+  wNoThunks _ctx _var = do
+#if defined(MIN_VERSION_GLASGOW_HASKELL)
+#if MIN_VERSION_GLASGOW_HASKELL(9,4,0,0) && !MIN_VERSION_GLASGOW_HASKELL(9,6,0,0)
+    pure Nothing
+#else
+    x <- readTVarIO _var
+    noThunks _ctx x
+#endif
+#endif
 
 instance NoThunks a => NoThunks (StrictMVar IO a) where
   showTypeOf (_ :: Proxy (StrictMVar IO a)) = "StrictMVar IO"

src/Database/LSMTree/Common.hs Outdated Show resolved Hide resolved
src/Database/LSMTree/Common.hs Show resolved Hide resolved
@recursion-ninja recursion-ninja force-pushed the recursion-ninja/io-classes-for-abstract-types branch from 3d5cb29 to 8820c21 Compare October 3, 2024 18:28
@jorisdral jorisdral force-pushed the recursion-ninja/io-classes-for-abstract-types branch from 8820c21 to e7cb644 Compare October 3, 2024 20:37
@jorisdral jorisdral added this pull request to the merge queue Oct 3, 2024
Merged via the queue into main with commit 33ded4d Oct 3, 2024
24 checks passed
@jorisdral jorisdral deleted the recursion-ninja/io-classes-for-abstract-types branch October 3, 2024 21:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants