Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

InsertOpts option to reconcile metadata where multiple are encountered #329

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

brandur
Copy link
Contributor

@brandur brandur commented May 2, 2024

Here, attempt to resolve #165 by better defining how metadata that may
be inserted via InsertOpts on Insert will interact with metadata
that may be inserted via a job implementing InsertOpts(). Currently, a
job's metadata is always ignored.

The behavior to reconcile two metadata objects has some nuance, and I
don't think any one solution will fit everyone's desires. Here, I
propose that we let users defined their own preferred behavior by
implementing a new InsertOptsMetadataReconcile option:

type InsertOpts struct {
    ...

    // MetadataReconcile defines how, in the presence of multiple InsertOpts
    // like one from the job itself and another one from a Client.Insert param,
    // how metadata will be reconciled between the two. For example, two
    // metadata can be merged together, or one can exclude the other.
    //
    // Defaults to MetadataReconcileExclude.
    MetadataReconcile MetadataReconcile

The possible reconciliation modes:

  • MetadataReconcileExclude: The more specific metadata (from Insert
    params) completely excludes the less specific metadata (from a job'
    opts), and the latter is discarded.

  • MetadataReconcileMerge: Carries out a shallow merge. In case of
    conflict, keys from the more specific metadata win out.

  • MetadataReconcileMergeDeep: Carries out a deep merge, recursing into
    every object to see if we can use it as a map[string]any. In case of
    conflict, keys from the more specific metadata win out.

The default setting is MetadataReconcileExclude. This is partly a
safety feature because it is possible to create oddly formed metadata
that might have trouble merging, and partly a performance feature.
Exclude doesn't require trying to parse any of the metadata so that we
can try to merge it.

A note on typing: I tried to marshal/unmarshal maps as map[any]any
instead of map[string]any so we could support all kinds of wild key
types simultaneously, but encoding/json is pretty strict about working
with only one type of key at a time, and failed when I tried. We may
have to keep it as an invariant that if you want to use one of the merge
modes, your metadata must be parseable as map[string]any. This
shouldn't be a problem because a struct with json tags on it will
always produce a compatible object. I've tried to document this.

Fixes #165.

@brandur brandur force-pushed the brandur-metadata-reconcile branch from 75654a6 to 6c62db6 Compare May 2, 2024 14:20
@brandur brandur requested a review from bgentry May 2, 2024 14:20
@brandur
Copy link
Contributor Author

brandur commented May 2, 2024

@bgentry Wanted to get your thoughts on this one — I think the different reconcile options are kind of nice for maximizing flexibility, but there is a chance it's overkill. Thoughts?

@brandur brandur force-pushed the brandur-metadata-reconcile branch from 6c62db6 to f4b75ed Compare May 3, 2024 14:09
@bgentry
Copy link
Contributor

bgentry commented May 3, 2024

Hmm, I think it's partly a bit overkill because I'm not sure the other modes would be needed, but primarily I think the default is wrong and will be a footgun in terms of functionality we have planned. Am I reading correctly that by default if you set metadata in the worker InsertOpts that it will completely overwrite any other specified metadata?

@brandur brandur force-pushed the brandur-metadata-reconcile branch from f4b75ed to 562309c Compare May 4, 2024 04:54
Here, attempt to resolve #165 by better defining how metadata that may
be inserted via `InsertOpts` on `Insert` will interact with metadata
that may be inserted via a job implementing `InsertOpts()`. Currently, a
job's metadata is always ignored.

The behavior to reconcile two metadata objects has some nuance, and I
don't think any one solution will fit everyone's desires. Here, I
propose that we let users defined their own preferred behavior by
implementing a new `InsertOptsMetadataReconcile` option:

    type InsertOpts struct {
        ...

        // MetadataReconcile defines how, in the presence of multiple InsertOpts
        // like one from the job itself and another one from a Client.Insert param,
        // how metadata will be reconciled between the two. For example, two
        // metadata can be merged together, or one can exclude the other.
        //
        // Defaults to MetadataReconcileExclude.
        MetadataReconcile MetadataReconcile

The possible reconciliation modes:

* `MetadataReconcileExclude`: The more specific metadata (from `Insert`
  params) completely excludes the less specific metadata (from a job'
  opts), and the latter is discarded.

* `MetadataReconcileMerge`: Carries out a shallow merge. In case of
  conflict, keys from the more specific metadata win out.

* `MetadataReconcileMergeDeep`: Carries out a deep merge, recursing into
  every object to see if we can use it as a `map[string]any`. In case of
  conflict, keys from the more specific metadata win out.

The default setting is `MetadataReconcileExclude`. This is partly a
safety feature because it is possible to create oddly formed metadata
that might have trouble merging, and partly a performance feature.
Exclude doesn't require trying to parse any of the metadata so that we
can try to merge it.

A note on typing: I tried to marshal/unmarshal maps as `map[any]any`
instead of `map[string]any` so we could support all kinds of wild key
types simultaneously, but `encoding/json` is pretty strict about working
with only one type of key at a time, and failed when I tried. We may
have to keep it as an invariant that if you want to use one of the merge
modes, your metadata must be parseable as `map[string]any`. This
shouldn't be a problem because a struct with `json` tags on it will
always produce a compatible object. I've tried to document this.

Fixes #165.
@brandur brandur force-pushed the brandur-metadata-reconcile branch from 562309c to 28bb8ab Compare May 4, 2024 04:57
@brandur
Copy link
Contributor Author

brandur commented May 4, 2024

So the reason I opted for the exclude option by default is that it certainly has some nice benefits:

  • Merging is inherently much slower because it involves parsing two separate metadata hashes, gluing them together, then serializing them again. Especially where insertion speed is a concern, definitely nice not to have that going on in the background with every operation.
  • Because we have to use map[string]any for the merge, it is possible for users to create incompatible metadata hashes that could produce an error (say using integers as keys for example).

I'm not sure I know the complete plans for future features well enough to see why merging is that important, so I may be missing something there.

I'd still argue strongly that the "exclude" mode needs to be in there as an option because a user should be able to clobber all secondary metadata hashes on an insert if they want to. I could see an argument for dropping the shallow merge option to lean things out, although it shouldn't be particularly harmful to keep in there either.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

InsertOpts() Metadata
2 participants