-
Notifications
You must be signed in to change notification settings - Fork 145
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
chore: add a utils method to getColumnReader with SQLConf #360
base: main
Are you sure you want to change the base?
Conversation
DataType type, ColumnDescriptor descriptor, int batchSize, SQLConf conf) { | ||
boolean useDecimal128 = (Boolean) CometConf.COMET_USE_DECIMAL_128().get(conf); | ||
boolean useLazyMaterialization = (Boolean) CometConf.COMET_USE_LAZY_MATERIALIZATION().get(conf); | ||
boolean useLegacyDateTimestamp = (Boolean) CometConf.COMET_USE_DECIMAL_128().get(conf); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just making sure... Is it correct to use COMET_USE_DECIMAL_128
for useLegacyDateTimestamp
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant to have COMET_EXCEPTION_ON_LEGACY_DATE_TIMESTAMP
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not familiar with this area of code, but replacing a hard-coded constant with a config setting LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
boolean useDecimal128, | ||
boolean useLazyMaterialization) { | ||
// TODO: support `useLegacyDateTimestamp` for Iceberg | ||
DataType type, ColumnDescriptor descriptor, int batchSize, SQLConf conf) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume we do not need the old signature? It is ok to change now as we have not made a release yet, but just asking because this is a public method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think anybody uses the old signature yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will make it harder to make the column reader Spark agnostic as you're adding one more point of dependency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will make it harder to make the column reader Spark agnostic as you're adding one more point of dependency.
+1 for this point. Maybe we can accept an options
parameters in this class, and pass all options from SQLConf instead?
public static ColumnReader getColumnReader(
DataType type, ColumnDescriptor descriptor, Map<String, String> options) {
...
}
In the iceberg-spark side, we can simply call conf.getAllConfs.asJava
to this method?
return getColumnReader( | ||
type, descriptor, batchSize, useDecimal128, useLazyMaterialization, true); | ||
type, descriptor, batchSize, useDecimal128, useLazyMaterialization, useLegacyDateTimestamp); | ||
} | ||
|
||
public static ColumnReader getColumnReader( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, maybe you can just replace the old getColumnReader
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just one comment, though not a show stopper.
boolean useDecimal128, | ||
boolean useLazyMaterialization) { | ||
// TODO: support `useLegacyDateTimestamp` for Iceberg | ||
DataType type, ColumnDescriptor descriptor, int batchSize, SQLConf conf) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will make it harder to make the column reader Spark agnostic as you're adding one more point of dependency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
We need a way for users to config |
Hmm, I think @parthchandra meant that by adding SQLConf, |
* Use 3.4.3
Which issue does this PR close?
Closes #.
Rationale for this change
Add a Utils method
getColumnReader
which can passed inSQLConf
, so Iceberg can pass inCometConf
forColumnReader
What changes are included in this PR?
How are these changes tested?