-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Attempt at implementation of VarNameVector
(Metadata
alternative)
#555
base: master
Are you sure you want to change the base?
Conversation
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Pull Request Test Coverage Report for Build 10924238467Warning: This coverage report may be inaccurate.This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.
Details
💛 - Coveralls |
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## master #555 +/- ##
==========================================
- Coverage 84.32% 79.25% -5.07%
==========================================
Files 26 27 +1
Lines 3183 3621 +438
==========================================
+ Hits 2684 2870 +186
- Misses 499 751 +252 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't quite finished, but it's close enough that I'll open it for review, since I'm off work tomorrow. Things that remain to be done:
- Sort out the only test failure, which is some strange Zygote error that appeared when I changed something seemingly unrelated. Might be a Zygote bug.
- Decide on a migration path from Metadata + old Gibbs sampler to VarNamedVector + new Gibbs sampler. Currently this PR makes VarNamedVector the default metadata type for VarInfo to see that tests for it pass, but that needs to be reverted, assuming we won't bother making VarNamedVector work with the old Gibbs sampler. Doing that would require putting in structures for
gids
, which seems like a waste of effort. - There's one substantial difference between Metadata and VarNamedVector from a user's point of view: The former doesn't resample values if you call
model(vi)
again unless you set the"del"
flags, while the latter does. There are a couple of places where I gloss over this, inassume
and insetval_and_resample!
. I think that's all okay, but I need to understand this better to really convince myself.
(-0.16489786710222099,) | ||
``` | ||
""" | ||
function generated_quantities(model::Model, chain::AbstractChains) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method claimed to work for AbstactChains
, but in fact only worked for MCMCChains
. Hence I moved it to the MCMCChains extension as I refactored this function to work with VarNamedVectors.
@@ -1886,14 +2109,6 @@ julia> DynamicPPL.setval!(var_info, (m = 100.0, )); # set `m` and and keep `x[1] | |||
julia> var_info[@varname(m)] # [✓] changed | |||
100.0 | |||
|
|||
julia> var_info[@varname(x[1])] # [✓] unchanged |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These tests were checking that calling a model with a VarInfo
doesn't by default overwrite existing values with newly sampled ones. This is, intentionally, no longer the case with VarNamedVector
.
@@ -225,21 +225,44 @@ invlink_transform(dist) = inverse(link_transform(dist)) | |||
# Helper functions for vectorize/reconstruct values # | |||
##################################################### | |||
|
|||
# Useful transformation going from the flattened representation. | |||
struct FromVec{Size} <: Bijectors.Bijector | |||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The splitting of FromVec
to UnwrapSingletonTransform
and ReshapeTransform
isn't really core to this PR, it's something that come up when I was changing how VarNamedVector
does transformations. It fixes a problem where from_vec_transform(x)(tovec(x))
was not a no-op for 0-dimensional Array
s. If you'd like to review it separately let me know and I can do some cherry-picking to separate it to a distinct PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's okay to keep it in the PR, but perhaps add it to this PR's summary so we can easily see it in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed; not a huge thing, so might as well just keep it here 👍
I would request a review from @torfjelde, but I can't because this is his PR :) |
@sunxd3, can you also provide your review, given that you have worked with VarInfo extensively? |
Absolutely, will take a look. |
docs/src/internals/varinfo.md
Outdated
|
||
We also want some additional methods that are *not* part of the `Dict` or `Vector` interface: | ||
|
||
- `push!(container, ::VarName, value[, transform])`: add a new element to the container, _but_ for this we also need the `VarName` to associate to the new `value`, so the semantics are different from `push!` for a `Vector`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dict
actually supports this push!
API, so we could move it to the list of L24-L30 above. For example,
julia> d = Dict()
Dict{Any, Any}()
julia> push!(d, :s=>nothing)
Dict{Any, Any} with 1 entry:
:s => nothing
|
||
- `push!(container, ::VarName, value[, transform])`: add a new element to the container, _but_ for this we also need the `VarName` to associate to the new `value`, so the semantics are different from `push!` for a `Vector`. | ||
|
||
- `update!(container, ::VarName, value[, transform])`: similar to `push!` but if the `VarName` is already present in the container, then we update the corresponding value instead of adding a new element. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I am missing something. What is the difference between update!
and setindex!
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
update!
can take the transform
argument, and can also deal with variables changing dimension. setindex!
is more low level and just directly writes to the storage vector.
The more I think about it, the less the distinction seems necessary to me. @torfjelde, did you have a reason to keep these separate, or would you mind essentially renaming update!
to setindex!
, and making it be synonymous push!
be synonymous with it too? This would also be in line with how push!
works for Dict
s, where it allows both adding and changing values.
@@ -225,21 +225,44 @@ invlink_transform(dist) = inverse(link_transform(dist)) | |||
# Helper functions for vectorize/reconstruct values # | |||
##################################################### | |||
|
|||
# Useful transformation going from the flattened representation. | |||
struct FromVec{Size} <: Bijectors.Bijector | |||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's okay to keep it in the PR, but perhaps add it to this PR's summary so we can easily see it in the future.
A Zygote issue for the failing test here: FluxML/Zygote.jl#1523 Will have to think about workarounds. |
# TODO(mhauru) Revisit the default. We probably don't want it to be VarNamedVector just | ||
# yet. | ||
metadata_type::Type=VarNamedVector, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I'd say no; for now, let's be explicit about it and test it properly 👍
metadata_type::Type=VarNamedVector, | ||
) | ||
varinfo = VarInfo() | ||
varinfo = VarInfo(metadata_type()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, is this worth it? As in, would it not be better to just pass an instantiated metadata
? Not a huuuuge fan of passing types like this.
Docs build now, and @willtebbutt helped me put in place a temporary workaround for the Zygote issue. |
@yebai asked me to make a slide or two on
|
metadata.flags[flag][getidx(metadata, vn)] = false | ||
return metadata | ||
end | ||
|
||
function unset_flag!(vnv::VarNamedVector, ::VarName, flag::String, ignorable::Bool=false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this ignorable
thingy?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or rather, why is it needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unset_flag!
errors if you try to set a flag which can not be set for VNV, on a "rather error than silently do the wrong thing" basis. assume
calls unset_flag!(vi, vn, "del")
, which would thus error. It seems to be fine to ignore the fact that the call actually does nothing for VNV, so I added an argument to be able to do so, and gets tests to pass for now. I haven't understood fully why assume
wants to unset said flag, but still seems to work fine even if it can't do so. I expect to remove ignorable
once I've figured that out.
We just had a long chat about this PR with @sunxd3, @willtebbutt, and @torfjelde. We agreed that given the lack of a clear design document, and the fact that much of this is still working around various aspects of the old Gibbs sampler, @willtebbutt and @sunxd3 will only review the new files: varnamedvector.jl, its tests, and its docs. They are relatively self-contained, although the motivation for some of their features will remain opaque without the surrounding context. That's fine. We'll do another round of reviewing the whole structure around VarInfo once the old Gibbs is out and we've cleaned up after it. Some other points that came up:
|
This is an attempt at a implementation similar to what was discussed in #528 (comment).
The idea is to introduce a simpler alternative to
Metadata
, which contains a fair bit of not-so-useful-anymore functionality and isn't particularly accessible.With the current state of this PR, it's possible to evaluate a model with a
VarNameVector
:But it's not possible to sample.
What are we trying to achieve here?
The idea behind
VarNameVector
is to enable both an efficient and convenient representation of avarinfo
, i.e. an efficient representation of a realization from a@model
with additional information necessary for both a sampler developer and end-user to extract what they need.In that vein, it seems like we need something that sort of acts like a
Vector
, and sort of acts like anOrderedDict
, but is neither (at least, I don't think it fits in either).Vector
because we need:Vector
representation).for
loop over variablesx[i]
in a model, we want the values forx
to be stored in a contiguous chunk of memory).OrderedDict
because we need:VarName
, e.g.trace[vn]
should result in the realization in it's "constrained" space (i.e. in the space that the distribution of which it came from commonly works in).Moreover, the above should be achieved while ensuring that we allow both mutable and immutable implementations, in addition to maintaining type-stability whenever possible and falling back to type unstable when not.
Current implementation:
VarInfo
withMetadata
The current implementation of
VarInfo
withMetadata
as it's underlying storage achieves some of these properties through a few different means:getindex(varinfo, varname)
, is achieved by effectively grouping thevarnames
contained in aVarInfo
by their symbol (which is part of the type), and putting each group in a separateMetadata
in aNamedTuple
. Basically, instead of representing pairs like(@varname(x) => 1.0, @varname(y) => 2.0)
in a flattened way, we represent them as(x = (@varname(x) => 1.0,), y = (@varname(y) => 2.0,))
, such that in the scenario where they different in types, e.g.y isa Int
, we can dispatch on thesym
inVarName{sym}
to determine which entry of theNamedTuple
to extract.x
is continuous whiley
is discrete. It also provides very straight-forward guidelines on how to speed up models, i.e. "group variables into a single higher-variate random variable, e.g.x[1] ~ Normal(); x[2] ~ Normal()
intox ~ fill(Normal(), 2)
".varinfo
to get type-information, and then use this for the subsequent runs to obtain type-stability (and thus much improved performance).VarInfo
is somewhat lacking, e.g. it does not support changing support even if the types stay the same,varinfo[vn]
, but is currently a) limited, and b) very confusing.VarInfo
implements a somewhat confusing subset of bothAbstractDict
andAbstractVector
interfaces, but neither of which are "complete" in any sense.VarInfo
andMetadata
, even thoughVarInfo
is effectively just a wrapper aroundMetadata
/NamedTuple
containingMetadata
, which adds further confusion.Metadata
comes with a lot of additional fields and information that we have slowly been moving away from using, as the resulting codebase ends up being overly complicated and difficult to work with. This also significantly reduces the utility ofVarInfo
for the end-user.Replacing
Metadata
withVarNameVector
VarNameVector
is an attempt at replacingMetadata
while preserving some of the good ideas fromVarInfo
withMetadata
(and in the process, establish a bare-minimum of functionality required to implement a underlying storage forvarinfo
):VarNameVector
(should) implement (almost) all operations forAbstractDict
+ all non-linear-algebra operations for aAbstractVector{<:Real}
, to the point where a user will find it easy to useVarNameVector
(and thus aVarInfo
wrapping aVarNameVector
).AbstractDict
interface is implemented as if keys are of typeVarName
and values are of type corresponding tovarinfo[varname]
.AbstractVector
interface is implemented wrt. underlying "raw" storage, e.g. if we're working with a unconstrained representation, then the vector will be in unconstrained space (unlikevarinfo[varname]
which will be in constrained space).VarNameVector
uses a contiguous chunk of memory to store the values in a flattened manner, both in the type stable and unstable scenarios, which overall should lead to better performance in both scenarios.VarNameVector
reduces the number of fields + replaces some now-less-useful fields such asdist
with fields requiring less information such astransformss
(holding the transformations used to map from "unconstrained" to "constrained" representation).Updates
2023-11-13
Okay, so I've made some new additions:
VarNameVector
.push!
andupdate!
aVarNameVector
where:push!
means what it usually means, but the varname / key must be unique.update!
works with either existing keys or new keys.push!
&update!
The idea behind the implementation is as follows:
varname
,push!
(andupdate!
, which is equivalent in this scenario), is straight-forward: we just callpush!
andsetindex!
on all the necessary underlying containers, e.g.ranges
.varname
, things become a bit more complicated for a few fields, namelyranges
andvalues
(everthing else is just call tosetindex!
):values
already allocated tovarname
.values
tovarname
and then shift all thevalues
occuring after this part; doing this on everyupdate!
will be expensive! Instead, we just move the new value to the end ofvalues
and mark the old location as "inactive". This leads to much more efficientupdate!
, and then we can just perform a "sweep" to re-contiguify the underlying storage every now and then.To make things a bit more concrete, consider the following example:
Then we update the entry for
@varname(x)
to a differently sized value:Notice that the order is still preserved, even though the underlying
ranges
is no longer ordered.But, if we inspect the underlying values, this contains now-inactive entries:
But in the scenario where we care about performance, we can easily fix this:
Type-stable sampling for dynamic suppport
The result of this is that we can even perform type-stable sampling for models with changing support:
The really nice thing here is that, unlike with
TypedVarInfo
, we don't need to mess around with boolean flags to indicate whether something should be resampled, etc. Instead we just callsimilar
on theVarNameVector
andpush!
onto this.We can also make this work nicely with
TypedVarInfo
:)2024-01-26T15:00:38
Okay, so now all tests should be passing and we should have, effectively, feature parity with
VarInfo
usingMetadata
.But this required quite a lot of code, and there are a few annoyances that are worth pointing out / discussing:
Transformations are (still) a pain
My original idea was that we really did not want to attach the entire
Distribution
from which the random variable came from to the metadata for a couple of reasons:VarInfo
fromModelA
inModelB
, even though they only differ by the RHS of a single~
statement).Distribution
to determine the transformation from the vectorized / flattened representation to the original one we want and linking.In fact, the above is only partially true:
getindex
inside amodel
evaluation actually usesgetindex(vi, vn, dist)
and uses the "tilde-local"dist
for the reshaping and linking, not thedist
present invi
.Hence it seemed to me that one immediate improvement of
Metadata
is to remove thedist
completely, and instead just store the transformation from the vectorized / flattened representation to whatever desired form we want (be that linked or invlinked).And this is what we do currently with
VarNameVector
.This then "resolves" (1) since now all we need is a map
f
that takes a vector and outputs something we can work with.However, (2) is still an issue: we still need the "tilde-local"
dist
to determine the necessary transformation for the particular realization we're interested in, while simultaenously wantinggetindex(vi, vn)
to also function as intended outside of amodel
, i.e. I should still be able to doSooo I don't think we can get around having to keep transformations in the metadata object that might not actually get used within a
model
evaluation if we want allow the indexing behavior above 😕2024-01-31: Update on transformations
See #575