Is your feature request related to a problem?
The issue with lazy evaluation of data is that errors only occur when we collect the data. At this point, it's no longer possible to fix errors that were caused by previous steps.
For example, if later rows don't match the inferred schema, an error is thrown. Users must then go back and change an earlier step, e.g. their call to Table.from_csv_file, to set the inference length (#749) or override parts of the schema (#754).
Ideally, we should automatically recover from such errors.
Desired solution
In Table, don't store a lazy frame directly. Instead, store a factory function that produces a lazy frame. This allows
- passing arguments from later steps to produce the lazy frame,
- trying again (with different arguments).
When the lazy frame is collected, catch relevant errors, and rebuild the lazy frame
- first with a larger schema inference length,
- then, if that still fails, with some columns forced to string type.
We need to be cautious that this works properly with memoization, though.
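The idea above could be sketched roughly as follows. This is only an illustration using the standard library, not the actual Table implementation: LazyTable, make_lazy_column, SchemaMismatchError, and the infer_length / force_string arguments are all hypothetical names, and the "lazy frame" is modeled as a plain zero-argument callable. The result is cached only after a successful collect, which is one possible way to keep retries from interfering with memoization.

```python
from __future__ import annotations

from collections.abc import Callable
from typing import Any


class SchemaMismatchError(Exception):
    """Hypothetical stand-in for the backend error raised on a schema mismatch."""


def make_lazy_column(
    rows: list[str], infer_length: int, force_string: bool = False
) -> Callable[[], list[Any]]:
    """Toy factory: infer a type from the first rows, return a deferred 'collect'."""
    sample = rows[:infer_length]
    as_int = not force_string and all(cell.lstrip("-").isdigit() for cell in sample)

    def collect() -> list[Any]:
        if not as_int:
            return list(rows)
        try:
            return [int(cell) for cell in rows]
        except ValueError as error:
            # Later rows did not match the type inferred from the sample.
            raise SchemaMismatchError(str(error)) from error

    return collect


class LazyTable:
    """Stores a factory for the lazy computation instead of the computation itself."""

    def __init__(
        self,
        factory: Callable[..., Callable[[], list[Any]]],
        attempts: list[dict[str, Any]],
    ) -> None:
        self._factory = factory
        self._attempts = attempts  # keyword arguments to try, in order
        self._result: list[Any] | None = None
        self.used_arguments: dict[str, Any] | None = None

    def collect(self) -> list[Any]:
        if self._result is not None:
            return self._result  # memoized result of the first successful attempt
        last_error: Exception | None = None
        for kwargs in self._attempts:
            try:
                result = self._factory(**kwargs)()  # rebuild, then collect
            except SchemaMismatchError as error:
                last_error = error
                continue
            self._result = result
            self.used_arguments = kwargs
            return result
        if last_error is None:
            raise ValueError("no attempts configured")
        raise last_error


rows = ["1", "2", "x"]
table = LazyTable(
    factory=lambda **kwargs: make_lazy_column(rows, **kwargs),
    attempts=[
        {"infer_length": 2},  # initial, short inference length
        {"infer_length": len(rows)},  # larger inference length
        {"infer_length": len(rows), "force_string": True},  # last resort
    ],
)
values = table.collect()
```

In this toy run the first attempt fails on the third row, and the second attempt (larger inference length) succeeds, so the string fallback is never needed.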
Possible alternatives (optional)
No response
Screenshots (optional)
No response
Additional Context (optional)
No response