Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug] __tmp_not_partitioned CTAS inherited model config "external_location" and causing clashing with the model data table #683

Open
2 tasks done
u-ra-ra-ra opened this issue Jul 3, 2024 · 6 comments
Labels
bug Something isn't working

Comments

@u-ra-ra-ra
Copy link

u-ra-ra-ra commented Jul 3, 2024

Is this a new bug in dbt-athena?

  • I believe this is a new bug in dbt-athena
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

I have a model that writes out to more than 100 hive partitions.
I recently dropped the table and attempted to recreate via dbt build. I have an external_location in configuration as I was migrated onto dbt and want to avoid rebuild data / move s3 location.

The unexpected behavior is dbt build succeeded but my s3 path does not contain any data.

dbt / adaptor version:

"app": "dbt", "dbt_version": "1.7.17"
Registered adapter: athena=1.7.2

The CTAS generated:

On model.A.XX: -- /* {"app": "dbt", "dbt_version": "1.7.17", "profile_name": "A", "target_name": "prod", "node_id": "model.A.XX"} */
    
    create table "awsdatacatalog"."{schema}"."XX__tmp_not_partitioned"
    with (
      table_type='hive',
      is_external=true,external_location= {config.external_location}
      format='orc'
    )
    as
      SELECT

Expected Behavior

This XX__tmp_not_partitioned should not use the configured external path which is clashing with the actual model. Instead it should either use a unique tmp location or same as __dbt_tmp table which goes to s3_data_dir/schema/table__dbt_tmp

Steps To Reproduce

create a partitioned model with more than 100 hive partitions, with an external_location in config.

config:
    enabled: true
    external_location:some_p3_path

If this table exists in glue, drop it.

Run dbt run to create more than 100 partitions in one sql run.

Environment

- OS: ubuntu 22.04.4 jammy
- Python: 3.10
- dbt: 1.7.17
- dbt-athena-community: 1.7.2

Additional Context

I am happy to contribute to get this fixed if that helps speed things up.

https://github.com/dbt-athena/dbt-athena/blob/main/dbt/include/athena/macros/materializations/models/table/create_table_as.sql

@u-ra-ra-ra u-ra-ra-ra added the bug Something isn't working label Jul 3, 2024
@nicor88
Copy link
Contributor

nicor88 commented Jul 4, 2024

@u-ra-ra-ra feel free to propose a pr and we will be happy to review it.

@u-ra-ra-ra
Copy link
Author

I will hopefully have some time next week to give it shot.

@lucastrubiano
Copy link

lucastrubiano commented Aug 27, 2024

At my work, I experienced the same issue, and I found that one solution is to set temporary to true on the following line in the file:

dbt-athena/dbt/include/athena/macros/materializations/models/table/create_table_as.sql

Line 165: change temporary to true:
{%- do run_query(create_table_as(true, tmp_relation, compiled_code, language, true)) -%}

Screenshot 2024-08-26 at 9 05 10 PM

I hope this helps!

@nicor88
Copy link
Contributor

nicor88 commented Aug 27, 2024

@lucastrubiano thanks for spotting this issue. Do you mind to raise a PR with your fix? Thanks

@lucastrubiano
Copy link

@nicor88 Glad to push the changes, but I don’t have permissions to create a branch or PR. Any guidance on how to proceed?

@nicor88
Copy link
Contributor

nicor88 commented Sep 3, 2024

@lucastrubiano
I recommend to have a look at the contributing section.
Long story short, if you are not a project maintainer you cannot push directly to dbt-athena/dbt-athena.
You have to fork the repository, and then propose a PR from your fork, pretty much here you can find a detailed guide on how to do it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants