
Document Auto-vacuum Starvation #59

Open
mikehale opened this issue Nov 21, 2023 · 6 comments
Labels
documentation Improvements or additions to documentation

Comments

@mikehale

I think we implemented this after your time at Heroku. We eventually realized that sometimes both of the two Postgres autovacuum worker processes would end up working on tables that took hours to complete, which kept them from ever getting to the que_jobs table and led to poor performance. We ended up adding a clock-process task to manually vacuum our high-churn tables at regular intervals, and we haven't seen poor job selection performance since.

It might be worthwhile to add documentation similar to que's to help River users mitigate this potential issue.
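The clock-process task described above could be sketched roughly like this. This is a hedged illustration, not River's or Que's actual API: `run_periodic_vacuum` and `execute_sql` are hypothetical names, and only `que_jobs` comes from this thread. `execute_sql` stands in for whatever runs a single SQL statement on its own connection (VACUUM cannot run inside a transaction block).

```python
import threading


def run_periodic_vacuum(execute_sql, tables, interval_seconds=300, stop_event=None):
    """Periodically issue a manual VACUUM against high-churn tables.

    `execute_sql` is a hypothetical callable that executes one SQL statement
    outside of any transaction. Table names are assumed to be trusted,
    application-controlled identifiers (they are interpolated, not bound).
    """
    if stop_event is None:
        stop_event = threading.Event()
    while not stop_event.is_set():
        for table in tables:
            # Manual vacuums bypass autovacuum's cost-based throttling and
            # its worker pool, so a starved table still gets vacuumed.
            execute_sql(f"VACUUM (ANALYZE) {table}")
        # Sleep for the interval, waking early if asked to stop.
        stop_event.wait(interval_seconds)
```

In practice this would run in the leader/clock process with a real database connection behind `execute_sql`; the 300-second default mirrors the 5-minute cadence mentioned later in this thread.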

@mikehale changed the title from "Autovacuum Starvation" to "Document Auto-vacuum Starvation" on Nov 21, 2023
@bgentry
Contributor

bgentry commented Nov 21, 2023

Oh hey @mikehale 👋 Thanks for raising this concern. Out of curiosity, how frequently are you manually vacuuming, how long does it take to complete, and are there any related settings you would want to share?

We have an internal maintenance process architecture where we could fairly trivially add a feature to have the leader initiate a vacuum on a specific schedule, but we would need to be careful to provide the right configuration options for it, and we probably shouldn't enable it by default.

@mikehale
Author

When I left we were vacuuming once every 5 minutes, and IIRC it took less than a second to complete. We had tuned the autovacuum settings, but obviously those don't come into play with a manual vacuum. Our main concern was affecting the performance of the rest of the database, through either locking or increased I/O load. In practice, neither manifested as an issue, and I believe that's partly because of the consistently low overhead of regular, frequent vacuuming. The trade-off of potentially higher I/O every 5 minutes versus randomly not having autovacuum run for many hours turned out to be a good one for us.

@bgentry added the documentation label on Nov 22, 2023
@dyeje

dyeje commented May 10, 2024

We got bit by this. It was pretty frustrating to debug.

@bgentry
Contributor

bgentry commented May 10, 2024

@dyeje can you add any context around what you were seeing, what your workload is like, jobs per second or per day, etc.?

@brandur
Contributor

brandur commented May 11, 2024

Postgres major version too, please. I would've also expected that many of the B-tree optimizations in recent releases would've helped slow degenerate table bloat compared to those old days at Heroku.

@mikehale
Author

It was admittedly an extreme case, but I believe the autovacuum processes had been working on very large tables for over 24 hours. That, coupled with a heavily thrashed jobs table, caused a noticeable though not fatal degradation in job acquisition performance.
