Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New generic continuation type and new matcher interface #132

Merged
merged 73 commits into from
Jul 1, 2024

Conversation

Datseris
Copy link
Member

@Datseris Datseris commented Jun 27, 2024

This PR implements the ideas described in #130. There are still some TODOs left.

UPDATE: the matcher infrasturcture is hidden. After talking with Kalel we realized that this new matcher is actually a whole new continuation and is very simple to implement. This PR has the starting code for it but doesn't export anything new yet.

So this PR now focuses on generalizing the seed and match continuation, as well as renames and re-organization, see CHANGELOG!

This PR also removes the "info extraction" business from the seed and match continuation. This appears completely obsolete, as you can always extract any info you want from the attractors after the continuation, with the same cost as doing it during the continuation.

This PR also further improves the matching code to not unecessarily increment IDs in the mathcing process. That means: if we had previous attractors 1, 2, and current attractors 2 and 3, and attractor 3 matches to attractor 2, then the new attractor 2 won't get ID 4, it will get ID 3 as this is the next available integer.

  • implement retract_keys
  • Update documentation /docs files
  • add tests for the new interface
  • ensure previous code works and is deprecated
  • global rename fractions_curves to fractions_cont (for continuation)
  • global rename attractors_info to attractors_cont (for continuation)

@Datseris
Copy link
Member Author

Datseris commented Jun 30, 2024

@awage @KalelR alright, I think this is some really good work on the generic global continuation method. Have a look at the documentation:

"""
AttractorSeedContinueMatch(mapper, matcher = MatchBySSSetDistance(); seeding)
A global continuation method for [`global_continuation`](@ref).
`mapper` is any subtype of [`AttractorMapper`](@ref) which implements
[`extract_attractors`](@ref), i.e., it finds the actual attractors.
`matcher` is a configuration of how to match attractor IDs,
and at the moment can only be an instance of [`MatchBySSSetDistance`](@ref).
## Description
This is a general global continuation method is based on a 4-step process:
1. Seed initial conditions from previously found attractors
2. Propagate those forwards to "continue" previous attractors
3. Estimate basin fractions and potentially find new attractors
4. Match attractors
### Step 0 - Finding initial attractors
At the first parameter slice of the global continuation process, attractors and their fractions
are found using the given `mapper` and [`basins_fractions`](@ref).
See the `mapper` documentation and [`AttractorMapper`](@ref)
for details on how this works. Then, from the second parameter onwards the continuation occurs.
### Step 1 - Seeding initial conditions
Initial conditions can be seeded from previously found attractors.
This is controlled by the `seeding` keyword, which must be a function that given
a `StateSpaceSet` (an attractor), it returns an iterator of initial conditions.
By default the first point of an attractor is provided as the only seed.
Seeding can be turned off by providing the dummy function `seeding = A -> []`,
i.e., it always returns an empty iterator and hence no seeds.
### Step 2 - Continuing the seeds
The dynamical system referenced by the `mapper` is now set to the new parameter value.
The seeds are run through the `mapper` to converge to attractors at the new parameter value.
Seeding initial conditions close to previous attractors increases the probability
that if an attractor continues to exist in the new parameter, it is found.
Additionally, for some `mappers` this seeding process improves the accuracy as well as
performance of finding attractors, see e.g. discussion in [Datseris2023](@cite).
This seeding works for any `mapper`, regardless of if they can map individual initial conditions
with the `mapper(u0)` syntax! If this syntax isn't supported, steps 2 and 3 are done together.
### Step 3 - Estimate basins fractions
After the special seeded initial conditions are mapped to attractors,
attractor basin fractions are computed by sampling additional initial conditions
using the provided `ics` in [`global_continuation`](@ref).
I.e., exactly as in [`basins_fractions`](@ref).
Naturally, during this step new attractors may be found, besides those found
using the "seeding from previous attractors".
### Step 4 - Matching
Normally the ID an attractor gets assigned is somewhat a random integer.
Therefore, to ensure a logical output of the global continuation process,
attractors need to be "matched".
This means: attractor and fractions must have their _IDs are changed_,
so that attractors that are "similar" to those at a
previous parameter get assigned the same ID.
What is "similar enough" is controlled by the `matcher` input.
The default `matcher` [`MatchBySSSetDistance`](@ref) matches
sets which have small distance in state space.
The matching algorithm itself can be quite involved,
so read the documentation of the `matcher` for how matching works.
A note on matching: the [`MatchBySSSetDistance`](@ref) can also be used
after the continuation is completed, as it only requires as input
the state space sets (attractors), without caring at which parameter each attractor
exists at. If you don't like the final matching output,
you may use a different instance of [`MatchBySSSetDistance`](@ref)
and call [`match_sequentially!`](@ref) again on the output,
without having to recompute the whole global continuation!
### Step 5 - Finish
After matching the parameter is incremented.
Steps 1-4 repeat until all parameter values are exhausted.
### Further note
This global continuation method is a generalization of the "RAFM" continuation
described in [Datseris2023](@cite). This continuation method is still exported
as [`RecurrencesFindAndMatch`](@ref).
"""

@KalelR this method is so general, that now we truly don't need to create a new continuation type "basin enclosure". The only thing we have to do is to add a dispatch to the function:

"""
_match_attractors(
current_attractors, prev_attractors, matcher,
ds, p, pprev
)
Return the replacement map, mapping IDs of current attractors to the ones
they match to in previous attractors, according to `matcher`, and also given
then dynamical system, current and previous parameter values.
"""

Do you think you can create this function from your original code and PR it? Everything else will just work and you only need to define this function to have basin enclosure. You can have it with or without seeding, and with any mapper, whether it is featurizing or recurrences.

I improved the featurizing code to always store the attractors via evolving the trajectories, irrespectively if the sampler is random or pre-described initial condiitons. I also think that your preferred grouping method with "PairwiseComparison" will benefit from the seeding, as the first initial conditions will likely be the attractors, so they will form the center of the clusters for you.

src/matching/sssdistance.jl Outdated Show resolved Hide resolved
@awage
Copy link
Contributor

awage commented Jul 1, 2024

This is looking very good. The interface is great.

@Datseris
Copy link
Member Author

Datseris commented Jul 1, 2024

Thanks @awage . @KalelR I've finished the Basin enclosure matcher, which now is indeed a matcher. I departed significantly from your code in #133 , however, I think the overall logic stays similar. The advantage of my approach is that the source code is significantly smaller as it re-uses existing structures. I decided to use AttractorsViaProximity to find the "flowing" and "coflowing" attractors. I thought re-using the proximity mapper would give more robust results than deciding to arbitrarily evolve for some T and Ttr time.

Here is the full source code:

"""
    MatchByBasinEnclosure(; kw...) <: IDMatcher

A matcher that matches attractors by whether they are enclosed in
the basin of a new attractor or not.

## Keyword arguments

- `ε = nothing`: distance threshold given to [`AttractorsViaProximity`](@ref).
  If `nothing`, it is estimated as half the minimum distance of centroids
  (in contrast to the default more accurate estimation in [`AttractorsViaProximity`](@ref)).
- `Δt = 1, consecutive_lost_steps = 1000`: also given to [`AttractorsViaProximity`](@ref).
  Note that attractors that did not convergen anywhere within this number of steps
  do not get assigned ID -1 as in [`AttractorsViaProximity`](@ref). Rather, they
  get assigned the next available free ID.
- `distance = Centroid()`: metric to estimate distances between state space sets
  in case there are co-flowing attractors, see below.
- `seeding = A -> A[end]`: how to select a point from the attractor to see if
  it is enclosed in the basin of a new attractor.

## Description

An attractor `A₋` is a set in a state space that occupies a particular region
(or, a single point, if it is a fixed point).
This region is always within the basin of said attractor.
When the parameter of the dynamical system is incremented,
the attractors `A₊` in the new parameter have basins that may have changed in shape and size.

The previous attractor `A₋` is "matched" (i.e., has its ID changed)
to a new attractor `A₊` attractor if `A₋` is located inside the basin of attraction of `A₊`.
To see if `A₋` is in the basin of `A₊`, we first pick a point from `A₊` using the `seeding`
keyword argument. By default this is the last point on the attractor, but it could be anything
else, including the centroid of the attractor (`mean(A)`).
This point is given as an initial condition to an [`AttractorsViaProximity`](@ref) mapper
that maps initial conditions to the `₊` attractors with threshold `ε`
(by default estimated automatically, see [`AttractorsViaProximity`](@ref)).

There can be the situation where multiple `₋` attractors get matched to the same `₊`
attractor, which we call "coflowing attractors". In this scenario matching is prioritized
for the `₋` attractor that is closest to the `₊` in terms of state space set distance,
which is estimated with the `distance` keyword, which can be anything
[`setsofsets_distances`](@ref) accepts. The closes `₋` coflowing attractor
gets assigned the same ID as the `₊` one, while the rest get different unique IDs.

Basin enclosure is a concept similar to "basin instability" in [Ritchie2023](@cite).
"""
@kwdef struct BasinEnclosure{E, D} <: IDMatcher
    ε::E = nothing
    distance::D = Centroid()
    seeding::S = A -> A[end]
    Δt::Float64 = 1
    consecutive_lost_steps::Int = 1000
end

function _match_attractors(
        current_attractors, prev_attractors, matcher::MatchByBasinEnclosure,
        mapper, p, pprev
    )
    ds = referenced_dynamical_system(mapper)
    if matcher.ε === nothing
        e = ε_from_centroids(attractors)
    else
        e = matcher.ε
    end
    proximity = AttractorsViaProximity(ds, current_attractors, e; horizon_limit = Inf)
    rmap = Dict(k => proximity(matcher.seeding(A)) for (k, A) in prev_attractors)
    # we now process the replacement map `rmap` for co-flowing or diverged attractors.
    next_id = next_free_id(current_attractors, prev_attractors)
    # take care of diverged attractors
    for (old_ID, new_ID) in rmap
        if new_ID < 0 # diverged attractors get -1 ID.
            rmap[old_ID] = next_id
            next_id += 1
        end
    end
    # next, take care of co-flowing attractors
    if unique(values(rmap)) != length(rmap) # a value is repeated
        # Do coflowing and assign `next_id` to the least distant coflowing
    end
    return rmap
end

function ε_from_centroids(attractors)
    distances = setsofsets_distances(attractors, attractors, Centroid())
    alldists = sort!(vcat([collect(values(d)) for (k,d) in distances]...))
    filter!(!iszero, alldists)
    return minimum(alldists)/2
end

The only thing left is to implement the co-flowing logic.


EDIT: ah shit, I now realize that there is something wrong here. My logic is opposite. We must create a replacement map mapping CURRENT ids to OLD ids. In the code above I have it the wrong way around, I map OLD ids to CURRENT ids. So at the very end of the function we must "invert" the mapping to return the correct replacement map.

@Datseris
Copy link
Member Author

Datseris commented Jul 1, 2024

One thing you didn't explain in #133 is how you handle the "other" attractors in the coflowing case. The closest attractor gets the ID of the new one it flows to. What about the others though? What ID do they get? Do they get the next free available integer? Or do we check if there are some new attractors where no previous attractor has flowed to, and assign this ID by proximity? I think it is easier to assign the next available integer, but we likely will have to test this with your Kuramoto models.

For now, I will leave this as it is, and let you play around with this implementation by adjusting what happens inside the "co flowing attractors" if clause. It would be fantastic if we could have this done before JuliaCon 2024.

Now, I will focus on ensuring the tests pass and merge in this PR and tag a new release.

@Datseris Datseris merged commit 40dc918 into main Jul 1, 2024
2 checks passed
@Datseris Datseris deleted the matcher_interface branch July 1, 2024 11:23
@KalelR
Copy link
Contributor

KalelR commented Aug 6, 2024

For now, I will leave this as it is, and let you play around with this implementation by adjusting what happens inside the "co flowing attractors" if clause. It would be fantastic if we could have this done before JuliaCon 2024.

Sorry I took so long.
So let's say that at parameter p1 there two attractors, A1- and A2-. At parameter p+>p- there is only one attractor, Ak+. Trajectories starting at A1- and A2- (with p = p+) flow onto Ak+. They are thus coflowing attractors. But the distance d(A1-, Ak+) < d(A2-, Ak+). Therefore, we match Ak+ with A1-, thus labelling Ak+ as A1+. In the continuation, we would see A1- continue onto A1+ and A2- cease to exist. This is the behavior I originally intended, and the code is already doing this (with some modifications I'm introducing in a PR).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants