[Feature] Allow setting of batch_size as a top-level property of a model #10637

Closed
1 task done
Tracked by #10624
QMalcolm opened this issue Aug 29, 2024 · 1 comment · Fixed by #10594


QMalcolm commented Aug 29, 2024

Housekeeping

  • I am a maintainer of dbt-core

Short description

In #10624 we're building a new type of incremental model strategy, microbatch. To support that, we need a concept of batch_size. The batch_size is essentially how "big" each increment of the model is. It should support the values day, month, and year. In practice, if your batch_size is day, then when the event_start_time is generated in #10636, we can properly calculate what the "start" of a batch should be. Additionally, the lookback introduced in #10662 will essentially offset the event_start_time by an integer multiple of the batch_size.
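To make the relationship between batch_size, the start of a batch, and lookback concrete, here is a minimal Python sketch. It is not the dbt-core implementation: the function names (truncate_to_batch, offset_batches, event_start_time) and the exact semantics are assumptions for illustration only, based on the description above.

```python
from datetime import datetime, timedelta

# Values named in this issue; everything else in this sketch is hypothetical.
SUPPORTED_BATCH_SIZES = ("day", "month", "year")


def truncate_to_batch(ts: datetime, batch_size: str) -> datetime:
    """Return the start of the batch that contains ts."""
    if batch_size == "day":
        return ts.replace(hour=0, minute=0, second=0, microsecond=0)
    if batch_size == "month":
        return ts.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    if batch_size == "year":
        return ts.replace(month=1, day=1, hour=0, minute=0, second=0, microsecond=0)
    raise ValueError(f"unsupported batch_size: {batch_size!r}")


def offset_batches(batch_start: datetime, batch_size: str, n: int) -> datetime:
    """Move an already-truncated batch start back by n whole batches."""
    if batch_size == "day":
        return batch_start - timedelta(days=n)
    if batch_size == "month":
        # Count months since year 0 so the subtraction can cross year boundaries.
        months = batch_start.year * 12 + (batch_start.month - 1) - n
        return batch_start.replace(year=months // 12, month=months % 12 + 1)
    if batch_size == "year":
        return batch_start.replace(year=batch_start.year - n)
    raise ValueError(f"unsupported batch_size: {batch_size!r}")


def event_start_time(now: datetime, batch_size: str, lookback: int = 0) -> datetime:
    """Assumed semantics: truncate to the batch start, then go back `lookback` batches."""
    return offset_batches(truncate_to_batch(now, batch_size), batch_size, lookback)
```

Under these assumptions, event_start_time(datetime(2024, 9, 4, 15, 30), "day", lookback=2) truncates to midnight on 2024-09-04 and then steps back two daily batches.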

If defining it in the model SQL, it would look like:

-- my_model.sql
-- batch_size supported values: day, month, year
{{ config(
     materialized='incremental',
     incremental_strategy='microbatch',
     event_time='my_time_field',
     batch_size='day'
   )
}}
...

If defining it in the model YAML, it would look like:

models:
  - name: my_model
    config:
      event_time: my_time_field
      incremental_strategy: microbatch
      batch_size: day # supported values: day, month, year

Acceptance criteria

  • batch_size can be set as day, month, or year for an incremental microbatch model

Suggested Tests

  • the batch_size makes it into the Python representation of the model
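As a rough illustration of the acceptance criterion and the suggested test, the sketch below uses a stand-in config class. BatchSize, ModelConfig, and parse_config are hypothetical names invented for this example; they are not dbt-core's actual classes, and the real test would exercise dbt's own model parsing.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class BatchSize(str, Enum):
    """The three values this issue asks to support."""
    day = "day"
    month = "month"
    year = "year"


@dataclass
class ModelConfig:
    """Hypothetical stand-in for the Python representation of a model's config."""
    event_time: Optional[str] = None
    incremental_strategy: Optional[str] = None
    batch_size: Optional[BatchSize] = None

    def __post_init__(self) -> None:
        # Coerce raw strings such as "day"; unsupported values raise ValueError.
        if self.batch_size is not None:
            self.batch_size = BatchSize(self.batch_size)


def parse_config(raw: dict) -> ModelConfig:
    """Build the config object from a dict as it might be read from YAML."""
    return ModelConfig(
        event_time=raw.get("event_time"),
        incremental_strategy=raw.get("incremental_strategy"),
        batch_size=raw.get("batch_size"),
    )


config = parse_config(
    {
        "event_time": "my_time_field",
        "incremental_strategy": "microbatch",
        "batch_size": "day",
    }
)
assert config.batch_size is BatchSize.day
```

Rejecting unsupported values at construction time is one possible design; whether validation belongs there or elsewhere in parsing is left open here.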

Impact to Other Teams

Cloud artifacts, new property in the config of models

Will backports be required?

No

Context

No response

@QMalcolm QMalcolm added the enhancement New feature or request label Sep 4, 2024
@QMalcolm QMalcolm changed the title from "Allow setting of event_time_granularity as a top-level property of a model" to "[Feature] Allow setting of batch_size as a top-level property of a model" Sep 4, 2024
@QMalcolm QMalcolm added this to the v1.9 milestone Sep 4, 2024
@QMalcolm
Contributor Author

Resolved by #10594
