Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MySQL GIPK feature conflicts with tables without primary keys #420

Open
eujing opened this issue Dec 11, 2023 · 7 comments
Open

MySQL GIPK feature conflicts with tables without primary keys #420

eujing opened this issue Dec 11, 2023 · 7 comments
Labels
bug Something isn't working

Comments

@eujing
Copy link

eujing commented Dec 11, 2023

I experienced this error while trying to load TPC-C on a MySQL server that came with GIPK enabled by default.

[ERROR] 2023-12-07 00:55:28,801 [Thread-1] com.oltpbenchmark.api.Loader loadCustomerHistory - No value specified for parameter 9

After digging into it more, the schema shows the history table should only have 8 values.
I realized GIPK was creating an additional column my_row_id as the schema does not set an explicit primary key, which is probably going into the auto-generation of the insert statements during loading.

Attached are the example schema descriptions from the order table (which has explicit primary key) versus the history table:
image

Disabling GIPK and re-loading the data helped me to mitigate this issue, but maybe the history table could use a primary key or GIPK columns should be excluded from insert statements.

@apavlo
Copy link
Member

apavlo commented Dec 12, 2023

@eujing We have the ability to exclude column names when we generate the INSERT query to load tables:

https://github.com/cmu-db/benchbase/blob/main/src/main/java/com/oltpbenchmark/util/SQLUtil.java#L481-L485

I am thinking that adding my_row_id for MySQL will fix this problem. I would need to figure out where this gets populated.

@apavlo apavlo added the bug Something isn't working label Dec 12, 2023
@apavlo
Copy link
Member

apavlo commented Dec 12, 2023

After looking into this, we call TPCCLoader.getInsertStatement() to get the INSERT query to load the HISTORY table:

https://github.com/cmu-db/benchbase/blob/main/src/main/java/com/oltpbenchmark/benchmarks/tpcc/TPCCLoader.java#L143-L147

I don't want to add specific logic for MySQL here because then we would need to duplicate it across all the benchmarks. I am thinking that we need to create a separate location to list out the columns to exclude per DatabaseType. I am wondering if we should also move the other attributes in there to a separate config file:

https://github.com/cmu-db/benchbase/blob/main/src/main/java/com/oltpbenchmark/types/DatabaseType.java#L65-L81

@bpkroth Thoughts? I don't know the best way to do this in Java. This could potentially allow us to easily support new DBMSs without having to modify java code. But then it will mean that DatabaseType is no longer strongly typed?

@eujing
Copy link
Author

eujing commented Dec 12, 2023

Wanted to add that @lmwnshn gave me the initial idea for the extra row id column, and mentioned that a similar exclusion was done for cockroach DB 😄 :

if (databaseType.equals(DatabaseType.COCKROACHDB)) {
// cockroachdb has a hidden column called "ROWID" that should not be directly used via the catalog
excludedColumns.add("ROWID");
}

@bpkroth
Copy link
Collaborator

bpkroth commented Dec 14, 2023

Related #351

@bpkroth
Copy link
Collaborator

bpkroth commented Dec 14, 2023

After looking into this, we call TPCCLoader.getInsertStatement() to get the INSERT query to load the HISTORY table:

https://github.com/cmu-db/benchbase/blob/main/src/main/java/com/oltpbenchmark/benchmarks/tpcc/TPCCLoader.java#L143-L147

I don't want to add specific logic for MySQL here because then we would need to duplicate it across all the benchmarks. I am thinking that we need to create a separate location to list out the columns to exclude per DatabaseType. I am wondering if we should also move the other attributes in there to a separate config file:

https://github.com/cmu-db/benchbase/blob/main/src/main/java/com/oltpbenchmark/types/DatabaseType.java#L65-L81

@bpkroth Thoughts? I don't know the best way to do this in Java. This could potentially allow us to easily support new DBMSs without having to modify java code. But then it will mean that DatabaseType is no longer strongly typed?

This answer probably stems more from a practical consideration for my current time commitments, but I think maintaining a few exceptions like this in the code is not too onerous. Developing some other system to do arbitrary overrides in config files could be complex and messy very quickly, and as you point out, may cause issues with Java's typing.

It does require that either our CI catches the issue on bumps to what the :latest tag of the docker service containers resolves to (often does), or that users report the issue when we haven't encountered it.

I would say that whenever we do encounter an issue like this, we should add a test for it so we don't accidentally break it in the future.

@bpkroth
Copy link
Collaborator

bpkroth commented Dec 14, 2023

@eujing if you have a PR for this, even if it's in draft form, please submit it so we can start discussing further. If not, please let me know and I'll try and add it to my queue. Thanks!

@eujing
Copy link
Author

eujing commented Dec 18, 2023

@bpkroth ah no I dont have a PR for this, I have not set up the dev environment for benchbase yet. Have been disabling GIPK on my side to use it for now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants