Improve handling of tags with the processBulk Consumable/PublishableRange API #407

Open
9 tasks
wirew0rm opened this issue Sep 11, 2024 · 2 comments
Labels
enhancement New feature or request

Comments

@wirew0rm
Member

wirew0rm commented Sep 11, 2024

When using the processBulk API, special care has to be taken when publishing samples and their corresponding tags. At the moment this relies on the automatic tag propagation, which has some shortcomings:

  • can only correctly forward tags on the first sample

  • difficult to disable/override

To fix these, the Consumable/PublishablePortRange API should be extended to allow for an intuitive way to control the propagation of tags. There was some initial offline discussion, which is persisted in this issue, involving @RalphSteinhagen, @drslebedev and @wirew0rm, but others are of course invited to chime in if they have a specific use case or improvement in mind. The proposed changes are roughly structured into four individual steps that can be tackled more or less independently.

  • I: add convenience methods for

    • consuming tags in ConsumableInputPortRange (Concept: ConsumablePortSpan)
    • at the moment, one has to directly interact with the PublishableInputRange<Tag>, which is a member of the PortRange
    • tag() -> returns a range of tuple/pair <size_t, property_map> (size_t w.r.t. the local index), complementing the existing tag field interface, which would be renamed to rawTags
    • implement ConsumableInputPortRange::consume(index=size::MAX), which consumes all tags up to the given sample index (default: all tags belonging to samples in the range)
  • II: Automatic Tag Propagation Policies (only relevant for non-'N_in=1:M_out=1' chunked data constraints)

    • optional block attribute (e.g. like Resampling or BlockingIO)
    • different policies:
      • forward-propagate (default): treat the tag as if it belonged to the first sample of the next processBulk(..) invocation
      • backward-propagate: treat the tag as if it were on the first sample of the current processBulk(..) invocation
      • [ignore]: tag propagation needs to be handled by the user in the processBulk(..) function (see point IV below)
  • III: convenience methods for publishing Tags in PublishablePortRange

    • implement PublishableOutputPortRange
    • PublishableOutputPortRange::publishTag(index = 0) N.B. index w.r.t. local indexing of the output-data-span
  • IV: forwarding policy

    • at the moment we have all-to-all and one-to-one strategies defined, but only all-to-all is implemented
    • [We discussed providing a customization point where users could plug in predefined or their own strategies -> abandoned]
    • Instead, this should all be done using the custom processBulk(..) processing function using the new ConsumableInputPortRange and PublishableOutputPortRange interfaces.
    • Implement a way to suppress the automatic tag forwarding
    • Modify the existing 'selector block' to demonstrate how to forward the correct tags to the correct output samples explicitly.
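
To make steps I, III and IV a bit more concrete, here is a minimal, self-contained sketch of how explicit tag handling inside a processBulk-like function could look. Everything in it is a hypothetical stand-in and not the actual library API: `ToyInputRange`, `ToyOutputRange`, `RawTag`, `PropertyMap`, `tags()`, `consumeTags()` and `decimate()` are illustration-only names, and plain `std::vector`s replace the real sample and tag buffers. It uses a simple decimator rather than the selector block and assumes that a tag on a dropped sample goes to the output sample at local index / stride.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for property_map (not the real type).
using PropertyMap = std::map<std::string, std::string>;

struct RawTag {
    std::size_t absIndex; // absolute stream index
    PropertyMap map;
};

// Toy input range: samples and tags live in separate buffers.
struct ToyInputRange {
    std::size_t         streamOffset = 0; // absolute index of samples[0]
    std::vector<int>    samples;
    std::vector<RawTag> rawTags;

    // Step I: tags attached to samples in this range, as (local index, map) pairs.
    std::vector<std::pair<std::size_t, PropertyMap>> tags() const {
        std::vector<std::pair<std::size_t, PropertyMap>> out;
        for (const auto& t : rawTags) {
            if (t.absIndex >= streamOffset && t.absIndex - streamOffset < samples.size()) {
                out.emplace_back(t.absIndex - streamOffset, t.map);
            }
        }
        return out;
    }

    // Step I: consume all tags up to and including the given local sample index.
    // The default consumes every tag that belongs to a sample in the range.
    std::size_t consumeTags(std::size_t localIndex = std::numeric_limits<std::size_t>::max()) {
        if (samples.empty()) {
            return 0;
        }
        const std::size_t lastAbs = streamOffset + std::min(localIndex, samples.size() - 1);
        const std::size_t before  = rawTags.size();
        rawTags.erase(std::remove_if(rawTags.begin(), rawTags.end(),
                                     [lastAbs](const RawTag& t) { return t.absIndex <= lastAbs; }),
                      rawTags.end());
        return before - rawTags.size();
    }
};

// Toy output range with a publishTag that uses local indexing (step III).
struct ToyOutputRange {
    std::size_t         streamOffset = 0; // absolute index of samples[0]
    std::vector<int>    samples;
    std::vector<RawTag> rawTags;

    void publishTag(PropertyMap map, std::size_t localIndex = 0) {
        rawTags.push_back(RawTag{streamOffset + localIndex, std::move(map)});
    }
};

// Step IV: keep every `stride`-th sample (stride > 0) and forward each input
// tag explicitly to the output sample at local index / stride (an assumption).
void decimate(ToyInputRange& in, ToyOutputRange& out, std::size_t stride) {
    const std::size_t base = out.samples.size(); // first output index written here
    for (std::size_t i = 0; i < in.samples.size(); i += stride) {
        out.samples.push_back(in.samples[i]);
    }
    for (const auto& [localIndex, map] : in.tags()) {
        out.publishTag(map, base + localIndex / stride);
    }
    in.consumeTags(); // every tag of this chunk has been handled
}
```

The point of the sketch is that the user code only ever sees indices local to the current input and output spans, while the translation to absolute stream positions stays inside the range types. Tags that lie beyond the current input range are neither visible nor consumed.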
@RalphSteinhagen RalphSteinhagen added the enhancement New feature or request label Sep 11, 2024
@daniestevez
Contributor

consuming tags in ConsumableInputPortRange (Concept: ConsumablePortSpan)

Not sure if I understood this correctly. Is there a use case / reason for decoupling tag consumption from sample consumption? In my mind things would be simpler if processBulk() has visibility of the tags that are attached to all the samples available in inSpan (and no more or fewer tags), and so input tags get consumed "automatically" as their corresponding sample gets consumed (by a call to inSpan.consume() or automatically if consume() was not called). This covers all the use cases I can think of.

@RalphSteinhagen
Member

@daniestevez one of the initial design choices was to keep processBulk(...) simple and to only export std::span<const T> and std::span<T> interfaces. This required the tag forwarding to be external to the processing function. This is also because samples and tags are propagated in separate buffers for performance reasons.

This has been softened by adding an opt-in API that also allows dynamic control of how many samples are consumed/produced via the ConsumableSpan and ProducableSpan.

The proposal above would extend these to Consumable/PublishablePortRange, which adds Tag handling and the synchronisation primitives between the sample/data buffer and the Tag buffer. This way, there'd be two default propagation policies (i.e. forward and backward) and an opt-out. The latter is for users who want full control over how Tags are propagated between different ports.
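
As a rough illustration of the two default policies and the opt-out, here is a small sketch that computes which input sample a tag would be treated as belonging to. It assumes a fixed chunk size per processBulk(..) invocation, which is a simplification. The names `TagPropagation` and `effectiveTagIndex` are hypothetical and not part of the existing API.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>

// Hypothetical policy names mirroring the proposal (forward is the default).
enum class TagPropagation { Forward, Backward, Ignore };

// Returns the input sample index the tag is treated as belonging to, or
// std::nullopt if propagation is left to the user's processBulk(..).
// Assumes chunkSize > 0 and that chunks start at multiples of chunkSize.
std::optional<std::size_t> effectiveTagIndex(TagPropagation policy, std::size_t tagIndex, std::size_t chunkSize) {
    if (policy == TagPropagation::Ignore) {
        return std::nullopt; // opt-out: nothing is forwarded automatically
    }
    const std::size_t chunkStart = (tagIndex / chunkSize) * chunkSize;
    if (tagIndex == chunkStart) {
        return chunkStart; // already on the first sample of its chunk
    }
    // Backward: first sample of the current invocation.
    // Forward: first sample of the next invocation.
    return policy == TagPropagation::Backward ? chunkStart : chunkStart + chunkSize;
}
```

For example, with a chunk size of 4, a tag on sample 5 would move to sample 4 under the backward policy and to sample 8 under the forward policy.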

Projects
Status: 🔖 Selected (3)

4 participants