Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Indexes in database #98

Open
martinalig opened this issue May 18, 2020 · 8 comments
Open

Indexes in database #98

martinalig opened this issue May 18, 2020 · 8 comments

Comments

@martinalig
Copy link
Contributor

Check if it might make sense to create additinal indexes in the database.

@ubamrein
Copy link
Contributor

Is this still relevant? Any progress?

@iCesofT
Copy link
Contributor

iCesofT commented Oct 30, 2020

Hi... you should create an index like:

CREATE INDEX IN_DPPPT_GAEN_EXPOSED
    ON DPPPT.T_GAEN_EXPOSED (ROLLING_START_NUMBER, RECEIVED_AT);

When the number of requests is high and you have thousands of records....

@ineiti
Copy link
Collaborator

ineiti commented Nov 2, 2020

I remember there has been an issue with creating indexes on the keys, as this will reveal some information. I'll ask Carmela and Wouter...

@iCesofT
Copy link
Contributor

iCesofT commented Nov 2, 2020

IMHO you are not revealing any information. With indexes, you are optimising the database access when you are "plenty" of TEKs.

If you don't create the index, you are doing "full scan" every time you execute this query:

@ineiti
Copy link
Collaborator

ineiti commented Nov 2, 2020

So the problem is this: currently all new keys are batched together in buckets by setting RECEIVED_AT to the beginning of a bucket-period. What I'm not sure is what happens if you index the keys in the database: is it theoretically possible to retrieve the arrival order of the keys?

@ineiti
Copy link
Collaborator

ineiti commented Nov 3, 2020

OK - I looked at it the wrong way: the server knows anyway who did what, so this doesn't matter.

However, discussing with Carmela and Wouter, we concluded that the sorting should not be done on exposed_id, but on key, else the clients can get a much better guess if a connection has been a key submission or not.

So making an index is a good idea - how to add the index afterwards?

@iCesofT
Copy link
Contributor

iCesofT commented Nov 3, 2020

Hi @ineiti .. If you are saying to create an index on key field, you already have it as long as you have the unique index.

The key thing is that you should create indexes for optimising your queries. If you have a query like this:

"select pk_exposed_id, key, rolling_start_number, rolling_period, transmission_risk_level"

You need to include those fields rolling_start_number and received_at in the index. Then you can sort by key so:

  1. Create the index I said before. (Indexes in database #98 (comment))
  2. Update your query to add the order by key field.

@ineiti
Copy link
Collaborator

ineiti commented Nov 3, 2020

Thanks for the explanation. The order by key is now replaced by a shuffle, which is already implemented for v2, but not yet for v1 - this is coming.

My original question is most probably known by Martin or Patrick: where to put those lines so the indexes are created correctly ;)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants