Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(taps): add argument for nan handling strategy to _flatten_record #2222

Open
wants to merge 14 commits into
base: main
Choose a base branch
from

Conversation

dgawlowsky
Copy link
Contributor

@dgawlowsky dgawlowsky commented Feb 5, 2024

@dgawlowsky
Copy link
Contributor Author

Fix #2213

Copy link

codspeed-hq bot commented Feb 5, 2024

CodSpeed Performance Report

Merging #2222 will not alter performance

⚠️ No base runs were found

Falling back to comparing dgawlowsky:handle-nans-in-flattening (bcc3c45) with main (a72a987)

Summary

✅ 6 untouched benchmarks

Copy link

codecov bot commented Feb 5, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 88.76%. Comparing base (bf996a4) to head (bcc3c45).

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2222   +/-   ##
=======================================
  Coverage   88.76%   88.76%           
=======================================
  Files          54       54           
  Lines        4769     4772    +3     
  Branches      928      928           
=======================================
+ Hits         4233     4236    +3     
  Misses        374      374           
  Partials      162      162           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@@ -20,6 +20,7 @@ class FlatteningOptions(t.NamedTuple):
max_level: int
flattening_enabled: bool = True
separator: str = DEFAULT_FLATTENING_SEPARATOR
nan_strategy: t.Literal["fail", "allow"] = "fail"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may want to support msgspec for serialization in the near future (#1784), and it defaults to making the values null. Do you think there's a way to make these options future-compatible with msgspec?

https://jcristharif.com/msgspec/supported-types.html#float

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you specify a little more details? I am not very familiar with msgspec lib. In simplejson which is currently used there is ignore_nan argument which could be used. I am thinking about adding one more nan_strategy "convert_null" which will set ignore_nan to True. WDYT?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am thinking about adding one more nan_strategy "convert_null" which will set ignore_nan to True. WDYT?

@dgawlowsky that makes sense to me!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@edgarrmondragon
Added this! sorry for the delay!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @dgawlowsky! I'll review this soon.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @edgarrmondragon! Can you share the status of the review? We are using meltano in one of our projects and will be deploying do production soon, hence it will be easier to pull library from official distribution. Thanks in advance!

@larsrinn
Copy link

What's the state of this? I'm trying to develop a tap where there None-values are expected. Do you see a way around this PR to get it done quickly?

@edgarrmondragon
Copy link
Collaborator

edgarrmondragon commented May 27, 2024

What's the state of this? I'm trying to develop a tap where there None-values are expected. Do you see a way around this PR to get it done quickly?

@larsrinn The biggest blocker in this PR is a missing public setting to toggle this behavior, similar to other built-in ones like stream_maps and add_record_metadata.

I'm curious what's your use case for nan-handling?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants