Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: lf_inherited_tags null value #487

Merged
merged 8 commits into from
Nov 2, 2023
Merged

fix: lf_inherited_tags null value #487

merged 8 commits into from
Nov 2, 2023

Conversation

svdimchenko
Copy link
Contributor

Description

I'm getting the following error when inherited tags is not set

pydantic.error_wrappers.ValidationError: 1 validation error for LfTagsConfig
inherited_tags
  none is not an allowed value (type=type_error.none.not_allowed)

Checklist

  • You followed contributing section
  • You kept your Pull Request small and focused on a single feature or bug fix.
  • You added unit testing when necessary
  • You added functional testing when necessary

@svdimchenko svdimchenko marked this pull request as ready for review November 2, 2023 11:51
@svdimchenko svdimchenko added enable-functional-tests Label to trigger functional testing bug Something isn't working labels Nov 2, 2023
@nicor88
Copy link
Contributor

nicor88 commented Nov 2, 2023

@svdimchenko seems that the PR introduces lf_inherited_tags as part of lf_tags_config, is that correct? if so this is a breacking config change that we should communicate in the next release.

@svdimchenko
Copy link
Contributor Author

@svdimchenko seems that the PR introduces lf_inherited_tags as part of lf_tags_config, is that correct? if so this is a breacking config change that we should communicate in the next release.

to be honest I thought it would be as it's represented here

class LfTagsConfig(BaseModel):
    enabled: bool = False
    tags: Optional[Dict[str, str]] = None
    tags_columns: Optional[Dict[str, Dict[str, List[str]]]] = None
    inherited_tags: List[str] = []
    inherited_tags: Optional[List[str]] = None

and yes, we should mention that in release notes

@nicor88
Copy link
Contributor

nicor88 commented Nov 2, 2023

In the last 1.6.4 release, this

lf_inherited_tags: domain, dataTier
lf_tags_config:
  enabled: true
  tags:
   	securityClassification: confidential

worked for me, also considering that we had a config.get('lf_inherited_tags)'... I like what you proposed better, because lf_inherited_tags is part of lf config.

@nicor88
Copy link
Contributor

nicor88 commented Nov 2, 2023

@svdimchenko could you update the readme too? Specifically here, we mention this:

models:
  my_project:
    example:
      +lf_inherited_tags: [inherited-tag-1, inherited-tag-2]

but should be

models:
  my_project:
    example:
      +lf_tags_config:
          enabled: true
          lf_inherited_tags: [inherited-tag-1, inherited-tag-2]

@svdimchenko
Copy link
Contributor Author

@svdimchenko could you update the readme too? Specifically here, we mention this:

models:
  my_project:
    example:
      +lf_inherited_tags: [inherited-tag-1, inherited-tag-2]

but should be

models:
  my_project:
    example:
      +lf_tags_config:
          enabled: true
          lf_inherited_tags: [inherited-tag-1, inherited-tag-2]

sure, I've updated our readme file

@svdimchenko
Copy link
Contributor Author

@dacreify could you please take a look here as well ?

@svdimchenko svdimchenko merged commit e398aec into dbt-labs:main Nov 2, 2023
10 checks passed
@dacreify
Copy link
Contributor

dacreify commented Nov 7, 2023

@nicor88 @svdimchenko Apologies just catching up here. I initially had lf_inherited_tags nested inside lf_tags_config but I ran into an issue: There was no way to set lf_inherited_tags at the project level and also apply specific tags to individual models. i.e. It didn't seem like lf_tags_config values from the project level were merged with ones at the model level. Is that not the case? I'm trying to avoid having to redundantly set lf_inherited_tags on every model I want to tag.

@nicor88
Copy link
Contributor

nicor88 commented Nov 7, 2023

@dacreify I don't know how the setup looks like for you, but can't something like this?

models:
  my_project:
     models_to_tag:
       +lf_tags_config:
          enabled: true
          lf_inherited_tags: [inherited-tag-1, inherited-tag-2]

works for you? e.g. you put group of models with the same lf_inherited_tags under one folder?

@svdimchenko
Copy link
Contributor Author

@dacreify I don't know how the setup looks like for you, but can't something like this?

models:
  my_project:
     models_to_tag:
       +lf_tags_config:
          enabled: true
          lf_inherited_tags: [inherited-tag-1, inherited-tag-2]

works for you? e.g. you put group of models with the same lf_inherited_tags under one folder?

I use same setup and everything works OK. Are there any edge cases missed here @dacreify ?

@dacreify
Copy link
Contributor

dacreify commented Nov 9, 2023

@nicor88 @svdimchenko

Here's the scenario we're trying to accomplish:

dbt_project.yml

models:
  my_project:
     models_to_tag:
       +lf_tags_config:
          enabled: true
          lf_inherited_tags: [inherited-tag-1, inherited-tag-2]

models/models_to_tag/models_to_tag.yml

models:
  - name: my_model
    config:
      lf_tags_config:
        enabled: true
        tags:
          my_tag: my_value

Here the model-level lf_tags_config clobbers the project-level one and my lf_inherited_tags is lost. This is why we hoisted it to be a peer with lf_tags_config.

Per the DBT documentation:

https://docs.getdbt.com/reference/resource-configs/plus-prefix

Note: This use of the + prefix, in dbt_project.yml, is distinct from the use of + to control config merge behavior (clobber vs. add) in other config settings (specific resource .yml and .sql files). Currently, the only config which supports + for controlling config merge behavior is grants.

Looks like there's syntax to support merging instead of clobbering but it only applies to grants currently. :/

@dacreify
Copy link
Contributor

@nicor88 @svdimchenko Any thoughts here?☝️ With potentially hundreds of models to tag repeating the lf_inherited_tags block for every single one could be onerous.

@nicor88
Copy link
Contributor

nicor88 commented Nov 14, 2023

@dacreify I understand the reason why you want to have lf_inherited_tags outside the lf_tags_config, but again, can't you group your models with the same lf_tags_config together in order to use dbt project configs?
Also, in my current setup, lf_tags_config is setup by model basis, because we want to overwride only specific tag value for specific columns, specifically to default tags in case of PII data.

@dacreify
Copy link
Contributor

@nicor88 Your second point is exactly why grouping the models and applying things at the project level doesn't help: We tag many models at the column level and will need to repeat our lf_inherited_tags on each model under the current implementation.

It's too bad DBT hasn't generalized the model-level usage of + on grants to merge other kinds of configs like this. I just ran my project under 1.7.0 to confirm the clobbering behavior and got Failed to remove LF tags as expected.

Thanks for the discussion here! If we have to duplicate lf_inherited_tags on every model we want to tag at the column level at least it will fail loudly if someone forgets to do so. I understand DRY isn't always the highest principal. 🙂

@svdimchenko
Copy link
Contributor Author

svdimchenko commented Nov 15, 2023

@nicor88 Your second point is exactly why grouping the models and applying things at the project level doesn't help: We tag many models at the column level and will need to repeat our lf_inherited_tags on each model under the current implementation.

It's too bad DBT hasn't generalized the model-level usage of + on grants to merge other kinds of configs like this. I just ran my project under 1.7.0 to confirm the clobbering behavior and got Failed to remove LF tags as expected.

Thanks for the discussion here! If we have to duplicate lf_inherited_tags on every model we want to tag at the column level at least it will fail loudly if someone forgets to do so. I understand DRY isn't always the highest principal. 🙂

hi there @dacreify ! Sorry for the delay from my side. Just the previous implementation broke the pipelines at least for 2 community members including my setup, my bad I did not test all edge cases on the review stage, that's why needed to fix that rapidly.
WDYT if you add default_lf_inherited_tags (like lf_tags_database as default lf tags to be assigned to database created by dbt) parameter to profile config and set it as default value for all models. And if the config for the model is not emply you can join lists of inherited tags. Will such approach work for you ?

@nicor88
Copy link
Contributor

nicor88 commented Nov 15, 2023

@svdimchenko Isn't that confusing from a user prospective to have default_lf_inherited_tags on profile level and lf_inherited_tags as part of lf_tags_config?
Also my 2 cents, we shouldn't setup default_lf_inherited_tags on the profile but rather pick it from dbt_project or model configs. I know that documentation will solve how to use it, but still 2 variables with a really similar name to achieve the same purpose. Can't lf_inherited_tags be removed from lf_tags_config as originally implemented?

e.g. add_lf_tags can still receive lf_inherited_tags, but instead of doing LfTagsConfig(**(lf_tags_config | dict(inherited_tags=lf_inherited_tags) we pass lf_inherited_tags to LfTagsManager, and deal with None accordingly in there...

@dacreify
Copy link
Contributor

@svdimchenko Apologies for missing a spot and causing that regression! Next time I'll do more manual testing on my end.

@nicor88 It sounds like what you're describing is going back to the 1.6.4 behavior and accounting for the None properly? I can make a draft PR to look at if so.

We are running fine on 1.6.4 for now.

Thanks!

@svdimchenko
Copy link
Contributor Author

@svdimchenko Apologies for missing a spot and causing that regression! Next time I'll do more manual testing on my end.

@nicor88 It sounds like what you're describing is going back to the 1.6.4 behavior and accounting for the None properly? I can make a draft PR to look at if so.

We are running fine on 1.6.4 for now.

Thanks!

Sure, we can add extra config like default_lf_inherited_tags for the model level and set it on top of dbt_project.yml file. In this case we should parse this config in all materializations as well.

On the other hand, we can have the default_lf_inherited_tags set in profile and pass it to all models as extra list. No need to parse it in materialisations, as lf_tags_config is already parsed there. So all we need is just to get default_lf_inherited_tags from credentials and pass the correct value to LfTagsManager class. Just seems more elegant to me.

@nicor88
Copy link
Contributor

nicor88 commented Nov 21, 2023

Thanks @svdimchenko - what you proposed sounds good for me - you are right, having in profiles and referring the default_lf_inherited_tags from there can be really helpful and clean, but we can take both approaches - as we do for s3_data_dir for example. We can try to see if credentials.default_lf_inherited_tags is not none or we fallback to dbt_project or model config.

@dacreify sure give it a shot. - issue #518

@svdimchenko svdimchenko deleted the fix/lf-inherited-tags-null-value branch January 5, 2024 11:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working enable-functional-tests Label to trigger functional testing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants