Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ENH] - Ensure completeness when fetching all pages using REST API #859

Open
krassowski opened this issue Aug 5, 2024 · 5 comments · May be fixed by #881
Open

[ENH] - Ensure completeness when fetching all pages using REST API #859

krassowski opened this issue Aug 5, 2024 · 5 comments · May be fixed by #881

Comments

@krassowski
Copy link

Feature description

Currently the pagination API uses page-based pagination without sorting by time of creation/modification. This leads to incomplete results when iterated over the pages in dynamic systems with multiple modifications happening at the same time when results are fetched.

For example, if during the process of iterating over the pages of environments (api/v1/environment) an additional environment is created, either:

  • the new environment will be omitted, or
  • a different random environment will be omitted

this is because the environments are sorted by namespace and name. For a more concrete example see the details below.

Say we have five environments A, B, D, E, F, and page size equals 2.
User performs two actions:

  • requested creating environment called "C", and then
  • opens the form with a list of all environments

If the database commit happens after the second page is fetched, this could lead to the form getting replies:

  • A, B - first page (ok)
  • D, E - second page (ok at the time)
  • E, F - third page - oops!

If many users are creating environments, the admin would be randomly missing some environments.

If the results were sorted by date of environment was created (/last modified if environments can be renamed), then this is not a problem, because the replies could be (assuming A, B, D, E, F were created in this order):

  • A, B
  • D, E
  • F, C

Originally posted in nebari-dev/nebari#2599 (comment)

Instead, one of pagination implementations which guarantees data completeness should be used across conda-store REST API:

  • cursor-based pagination
  • page pagination with a sort order guaranteeing that newly added records are on the last page (e.g. by date of creation/modification for environments)
  • time-based pagination

Value and/or benefit

No randomly missing items (e.g. environments) in paginated replies.

Anything else?

No response

@trallard
Copy link
Collaborator

@peytondmurray was this addressed by the environment fetch enhancement you made recently?

@peytondmurray
Copy link
Contributor

No, this has to do with the pagination machinery, not the underlying fetch. Let's keep this open.

@peytondmurray peytondmurray self-assigned this Sep 10, 2024
@peytondmurray
Copy link
Contributor

Coming back to this, I think we can most easily fix this by simply sorting paginated results by time. I'm sure it's possible to do with the other proposed methods, but this just seems simplest to me. @trallard and @krassowski do you have preferences about this?

This seems like an easy fix, so once I have your input I'll happily fix this.

@krassowski
Copy link
Author

I think we can most easily fix this by simply sorting paginated results by time

Yes, that's one of the proposed methods :)

@peytondmurray
Copy link
Contributor

Okay, with two votes for sorting by time, I'll do this. Stay tuned!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: New 🚦
Development

Successfully merging a pull request may close this issue.

3 participants