Improve handling of tags with the processBulk Consumable/PublishableRange API #407

wirew0rm · 2024-09-11T15:52:42Z

When using the processBulk API, special care has to be taken when publishing samples and their corresponding tags. At the moment this relies on the automatic tag propagation, which has some shortcomings:

- it can only correctly forward tags on the first sample
- it is difficult to disable or override

To fix these, the Consumable/PublishablePortRange API should be extended to allow for an intuitive way to control the propagation of tags. There was some initial offline discussion, recorded in this issue, involving @RalphSteinhagen, @drslebedev and @wirew0rm, but others are of course invited to chime in if they have a specific use case or improvement in mind. The proposed changes are roughly structured into four individual steps that can be tackled more or less independently.

I: add convenience methods for
- consuming tags in ConsumableInputPortRange (Concept: ConsumablePortSpan)
- at the moment, one has to directly interact with the PublishableInputRange<Tag>, which is a member of the PortRange
- tag(): returns a range of tuple/pair <size_t, property_map> (size_t w.r.t. the local index); complementary to the existing tag field interface, which is to be renamed to rawTags
- implement ConsumableInputPortRange::consume(index=size::MAX), which will consume all tags up to the given sample index (= all tags belonging to samples in the range)
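
To make step I more concrete, here is a minimal sketch of what the two convenience methods could look like. It is not the library's implementation: `InputRangeSketch`, `Tag`, `tags()`, `consumeTags()` and the `property_map` alias are hypothetical stand-ins, and the sketch assumes that "up to the given sample index" is inclusive.

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration; not the library's actual types.
using property_map = std::map<std::string, std::string>;

struct Tag {
    std::size_t  index; // absolute position in the sample stream
    property_map map;
};

struct InputRangeSketch {
    std::size_t      streamOffset; // absolute index of the first sample in this range
    std::size_t      nSamples;     // number of samples visible to processBulk
    std::vector<Tag> rawTags;      // pending tags, sorted by absolute index

    // Tags on samples of this range as (local index, property_map) pairs.
    std::vector<std::pair<std::size_t, property_map>> tags() const {
        std::vector<std::pair<std::size_t, property_map>> result;
        for (const Tag& tag : rawTags) {
            if (tag.index >= streamOffset && tag.index - streamOffset < nSamples) {
                result.emplace_back(tag.index - streamOffset, tag.map);
            }
        }
        return result;
    }

    // Consumes all tags up to and including the given local sample index
    // (assumption: inclusive) and returns how many were removed.
    // The default argument covers every sample in the range.
    std::size_t consumeTags(std::size_t localIndex = std::numeric_limits<std::size_t>::max()) {
        if (nSamples == 0) {
            return 0;
        }
        const std::size_t lastAbsolute = streamOffset + std::min(localIndex, nSamples - 1);
        const auto firstKept = std::find_if(rawTags.begin(), rawTags.end(),
            [lastAbsolute](const Tag& tag) { return tag.index > lastAbsolute; });
        const auto nConsumed = static_cast<std::size_t>(firstKept - rawTags.begin());
        rawTags.erase(rawTags.begin(), firstKept);
        return nConsumed;
    }
};
```

With a range starting at absolute index 100 and holding 4 samples, a tag at absolute index 102 would be reported at local index 2, while a tag at 104 stays pending for the next invocation.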
II: Automatic Tag Propagation Policies (only relevant for non-'N_in=1:M_out=1' chunked data constraints)
- optional block attribute (like e.g. Resampling or BlockingIO)
- different policies:
  - forward-propagate (default): treat the tag as if it belonged to the first sample of the next processBulk(..) invocation
  - backward-propagate: treat the tag as if it were on the first sample of the current processBulk(..) invocation
  - [ignore]: tag propagation needs to be handled by the user in the processBulk(..) function (see point IV below)
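
One possible reading of the three policies, as a small self-contained sketch. The names `TagPropagation`, `PropagatorSketch` and `onInvocation` are invented for illustration, and the tag payload is reduced to a single string. The sketch only models where a tag ends up relative to the processBulk(..) invocations, not the actual buffer handling.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical policy names mirroring the list above.
enum class TagPropagation { Forward, Backward, Ignore };

struct OutTag {
    std::size_t localIndex; // position within the current output chunk
    std::string key;        // simplified tag payload
};

struct PropagatorSketch {
    TagPropagation           policy = TagPropagation::Forward;
    std::vector<std::string> deferred; // tags waiting for the next invocation

    // Called once per processBulk invocation with the tags found in the
    // current input chunk; returns the tags to attach to this output chunk.
    std::vector<OutTag> onInvocation(const std::vector<std::string>& chunkTags) {
        std::vector<OutTag> out;
        for (const std::string& key : deferred) { // left over from the previous call
            out.push_back(OutTag{0, key});
        }
        deferred.clear();
        switch (policy) {
        case TagPropagation::Forward: // first sample of the NEXT invocation
            deferred = chunkTags;
            break;
        case TagPropagation::Backward: // first sample of the CURRENT invocation
            for (const std::string& key : chunkTags) {
                out.push_back(OutTag{0, key});
            }
            break;
        case TagPropagation::Ignore: // user code forwards tags itself
            break;
        }
        return out;
    }
};
```

In this model a tag seen under the forward policy appears one invocation later, and under the backward policy it appears immediately; in both cases it lands on output index 0.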
III: convenience methods for publishing Tags in PublishablePortRange
- implement PublishableOutputPortRange
- PublishableOutputPortRange::publishTag(index = 0) N.B. index w.r.t. local indexing of the output-data-span
IV: forwarding policy
- at the moment we have all-to-all and one-to-one strategies defined, but only all-to-all is implemented
- [We discussed providing a customization point where users could plug in predefined or their own strategies -> abandoned]
- Instead, this should all be done using the custom processBulk(..) processing function using the new ConsumableInputPortRange and PublishableOutputPortRange interfaces.
- Implement a way to suppress the automatic tag forwarding
- Modify the existing 'selector block' to demonstrate how to forward the correct tags to the correct output samples explicitly.
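
The explicit forwarding described in steps III and IV could look roughly like the selector-style sketch below. All types and functions (`InputSketch`, `OutputSketch`, `selectBulk`) are hypothetical, and `publishTag` is given an extra tag-map argument that the proposal does not spell out. Automatic forwarding is assumed to be suppressed, so only the tags of the selected input reach the output, at the same local index as their sample.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration; not the library's actual types.
using property_map = std::map<std::string, std::string>;
using LocalTag     = std::pair<std::size_t, property_map>; // (local sample index, tag data)

struct InputSketch {
    std::vector<float>    samples;
    std::vector<LocalTag> tags; // indexed relative to the start of `samples`
};

struct OutputSketch {
    std::vector<float>    samples;
    std::vector<LocalTag> tags;

    // Index is local to the output span, as in the proposal; the map
    // parameter is an assumption of this sketch.
    void publishTag(property_map map, std::size_t localIndex = 0) {
        tags.emplace_back(localIndex, std::move(map));
    }
};

// Selector-style processing: copy up to maxSamples from the selected input
// and forward only that input's tags; tags on other inputs are not forwarded.
std::size_t selectBulk(const std::vector<InputSketch>& inputs, std::size_t selected,
                       OutputSketch& out, std::size_t maxSamples) {
    const InputSketch& in = inputs.at(selected);
    const std::size_t  n  = std::min(in.samples.size(), maxSamples);
    out.samples.assign(in.samples.begin(), in.samples.begin() + static_cast<std::ptrdiff_t>(n));
    for (const auto& [localIndex, map] : in.tags) {
        if (localIndex < n) { // only tags whose sample was actually copied
            out.publishTag(map, localIndex);
        }
    }
    return n;
}
```

Tags beyond the copied samples are left alone here; a real block would keep them pending until their samples are consumed.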

daniestevez · 2024-09-12T06:59:26Z

consuming tags in ConsumableInputPortRange (Concept: ConsumablePortSpan)

Not sure if I understood this correctly. Is there a use case / reason for decoupling tag consumption from sample consumption? In my mind things would be simpler if processBulk() has visibility of the tags that are attached to all the samples available in inSpan (and no more or fewer tags), and so input tags get consumed "automatically" as their corresponding sample gets consumed (by a call to inSpan.consume() or automatically if consume() was not called). This covers all the use cases I can think of.

RalphSteinhagen · 2024-09-12T10:57:30Z

@daniestevez one of the initial design choices was to keep processBulk(...) simple and to only export std::span<const T> and std::span<T> interfaces. This required the tag forwarding to be external to the processing function. It is also a consequence of samples and tags being propagated in separate buffers for performance reasons.

This has been softened by adding an opt-in API that also allows dynamic control of how many samples are consumed/produced via the ConsumableSpan and ProducableSpan.

The proposal above would extend these to Consumable/PublishablePortRange, which additionally covers the Tag handling and adds the synchronisation primitives between the sample/data buffer and the Tag buffer. This way, there would be two default propagation policies (i.e. forward and backward) and an opt-out, the latter for users who want full control over how Tags are propagated between different ports.

wirew0rm assigned wirew0rm and drslebedev Sep 11, 2024

RalphSteinhagen added the enhancement New feature or request label Sep 11, 2024

RalphSteinhagen added this to the CALL#5 - Security Hardening milestone Sep 11, 2024
