Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add finalization support to folds #2427

Merged
merged 1 commit into from
Nov 18, 2023
Merged

Conversation

harendra-kumar
Copy link
Member

@harendra-kumar harendra-kumar commented Jul 12, 2023

This is a revival of #904 .

takeEndBy_ predicate (Fold fstep finitial fextract) =
Fold step finitial fextract
takeEndBy_ predicate (Fold fstep finitial fextract ffinal) =
Fold step finitial fextract ffinal
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if I understand this properly. Why don't we finalize this fold on Done?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, the one who initializes, finalizes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only if you initialize it more than once you finalize it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On Done we need to finalize this fold.

takeEndBy predicate (Fold fstep finitial fextract) =
Fold step finitial fextract
takeEndBy predicate (Fold fstep finitial fextract ffinal) =
Fold step finitial fextract ffinal
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On Done you need to finalize this fold.

Comment on lines 472 to 479
-- An alternative to using an "extract" funciton is to use "Partial s b" style
-- partial value so that we always emit the output value and there is no need
-- to extract. Then extract can be used for cleanup purposes. But in this case
-- in some cases we may need a "Continue" constructor where an output value is
-- not available, this was implicit earlier. Also, "b" should be lazy here so
-- that we do not always compute it even if we do not need it.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not understand this very well.

@@ -829,6 +863,9 @@ splitWith func (Fold stepL initialL extractL) (Fold stepR initialR extractR) =
Partial sR -> extractR sR
Done rR -> return rR

final (SeqFoldR f sR) = finalR sR >>= pure . SeqFoldR f
final (SeqFoldL sL) = finalL sL >>= pure . SeqFoldL
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the finalize type was s -> m b. The implementation would've been semantically different here.

In extract we do initialR. Shouldn't we also finalize it there?

extract (SeqFoldR_ sR) = extractR sR
extract (SeqFoldL_ _) = do
res <- initialR
case res of
Partial sR -> extractR sR
Done rR -> return rR

final (SeqFoldR_ sR) = finalR sR >>= pure . SeqFoldR_
final (SeqFoldL_ sL) = finalL sL >>= pure . SeqFoldL_
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In extract we do initialR. Shouldn't we also finalize it there?

@@ -802,8 +831,10 @@ data SeqFoldState sl f sr = SeqFoldL !sl | SeqFoldR !f !sr
{-# INLINE splitWith #-}
splitWith :: Monad m =>
(a -> b -> c) -> Fold m x a -> Fold m x b -> Fold m x c
splitWith func (Fold stepL initialL extractL) (Fold stepR initialR extractR) =
Fold step initial extract
splitWith func
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should not be a fold

@@ -846,8 +883,8 @@ data SeqFoldState_ sl sr = SeqFoldL_ !sl | SeqFoldR_ !sr
--
{-# INLINE split_ #-}
split_ :: Monad m => Fold m x a -> Fold m x b -> Fold m x b
split_ (Fold stepL initialL _) (Fold stepR initialR extractR) =
Fold step initial extract
split_ (Fold stepL initialL _ finalL) (Fold stepR initialR extractR finalR) =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should not be a fold

@@ -83,6 +83,8 @@ copyStream inh outh = S.fold (FH.write outh) (S.unfold FH.reader inh)
inspect $ hasNoTypeClasses 'copyStream
inspect $ 'copyStream `hasNoType` ''Step -- S.unfold
inspect $ 'copyStream `hasNoType` ''IUF.ConcatState -- FH.read/UF.many
-- XXX Why is this change done?
-- XXX Is this a fix?
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some inspection tests are failing after the changes, disabled those. We should look at the perf benchmarks instead.

@@ -285,6 +288,7 @@ _copyStreamUtf8'Fold :: Handle -> Handle -> IO ()
_copyStreamUtf8'Fold inh outh =
Stream.fold (Handle.write outh)
$ Unicode.encodeUtf8
-- XXX I'm reviewing the files in order. I don't understand this change yet. I might not have reviewed the prerequisite yet!
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

writeCharUtf8' is a parser now.

-- initialize should give a token and finalize should consume that token. Or the
-- `Done` should somehow consume that token.
-- Or can we make a simple assert by keeping a counter? We increment it on
-- initialize and decrement it on a finalize or `Done`.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no such feasible way. The rule is when you are Done or extract you need to finalize any pending state.

@@ -1621,6 +1630,7 @@ takeEndBy_ predicate (Fold fstep finitial fextract ffinal) =
step s a =
if not (predicate a)
then fstep s a
-- XXX Need to finalize that value here!
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -1653,6 +1663,7 @@ takeEndBy predicate (Fold fstep finitial fextract ffinal) =
then return res
else do
case res of
-- Need to finalize the value here.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -2815,7 +2815,7 @@ splitOnSuffixSeq withSep patArr (Fold fstep initial done final) (Stream step sta
Stop -> do
-- do not issue a blank segment when we end at pattern
if (idx == maxIndex) && RB.unsafeEqArray rb rh patArr
then return Stop
then return Stop -- XXX Finalize the fold here
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -2847,7 +2847,7 @@ splitOnSuffixSeq withSep patArr (Fold fstep initial done final) (Stream step sta
Skip s -> go SPEC fs s rh cksum
Stop ->
if RB.unsafeEqArray rb rh patArr
then return Stop
then return Stop -- XXX Finalize the fold here?
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -514,6 +516,7 @@ foldConcat
FL.Done b -> return $ Right (b, s)
FL.Partial fs1 -> go1 SPEC fs1 s
Stream.Skip s -> go1 SPEC fs s
-- XXX This is being finalized twice
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed this finalization.

@@ -58,6 +58,8 @@ writeLimited svar winfo = Fold step initial (const (return ())) final

where

-- XXX We never change the state of the fold to "False". What is the point
-- og this state?
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is old code, and deprecated, not changing it.

-- XXX Thinking out loud without understanding the nitty gritty.
-- Maybe change the result type to (Maybe b)?
-- parEval :: MonadAsync m => (Config -> Config) -> Fold m a b -> Fold m a (Maybe b)
-- If it's a parallel fold the result might not exist when er want it to.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not changing the type.

@harendra-kumar harendra-kumar force-pushed the fold-cleanup-action-done branch 2 times, most recently from 04d4336 to 7392655 Compare November 14, 2023 22:28
case resL of
Partial sL -> bimap (Tuple' sL) Right resR
Done bL -> Done $ Left bL
case resL of
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah. Was this a bug fix?

Done b -> initInnerFold (f b)

stepc (C stepInner s extractInner) a = do
stepc (C stepInner s extractInner fin) a = do
Copy link
Member

@adithyaov adithyaov Nov 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extractInner is not used anywhere

Done bR -> Done bR
case rR of
Partial sR1 -> return $ Partial (sL, sR1)
Done bR -> finalL sL >> return (Done bR)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is hard to catch if this is missed anywhere. We should have some tests to catch this.

refoldMany (Fold sstep sinitial sextract) (Refold cstep cinject cextract) =
refoldMany
(Fold sstep sinitial sextract _sfinal)
-- XXX We will need a "final" in refold as well
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can open an issue.

@@ -62,13 +63,33 @@ parEval modifier f =
--
-- A polled stream abstraction may be useful, it would consist of normal
-- events and tick events, latter are guaranteed to arrive.
--
-- XXX We can use the config to indicate if the fold is a scanning type or
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we create an issue based on these comments so we can track them?

This was needed especially for concurrent fold combinators. A fold
combinator that uses concurrent folds needs to wait for the concurrent
folds to finish before it can finish. The finalizing action in folds can
deallocate any resources allocated by the "initial" action and also wait
for folds that it has initialized.

This complicates fold combinators in general. We can potentially
introduce a type for non-failing parsers and support finalization only
in those. The current use cases can be covered by that. Parsers do not
support scanning, which is not required in the use cases where we need
finalization (there is no known use case).
@harendra-kumar harendra-kumar merged commit 32390c5 into master Nov 18, 2023
13 of 17 checks passed
@harendra-kumar harendra-kumar mentioned this pull request Nov 19, 2023
4 tasks
@harendra-kumar harendra-kumar deleted the fold-cleanup-action-done branch November 22, 2023 18:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants