-
Notifications
You must be signed in to change notification settings - Fork 289
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixes #36563 - Remove orphaned content unit cleanup from index_content #10638
Conversation
Issues: #36563 |
a6eb979
to
06f9cd6
Compare
06f9cd6
to
7c403eb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems to work ok . Checked sync and saw the new action
9dd0f85
to
70fc5ba
Compare
70fc5ba
to
442b1da
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tested the remove orphan calls. Behavior looked good. ACK
@@ -31,6 +31,10 @@ def run | |||
output[:new_content][content_type.label] = new_count - initial_counts[content_type.label] | |||
end | |||
end | |||
|
|||
def finalize | |||
ForemanTasks.async_task(OrphanCleanup::RemoveOrphanedContentUnits, {repo_id: input[:id]}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm questioning now if we even need to destroy orphaned units after syncing.
For the orphaned filter rule problem, the solution is this:
def clean_filter_rules(repo_associations_to_destroy) |
Otherwise, I don't think orphaned content units should cause any other problems.
Plus, if a user syncs 20 repositories, they're going to get 20 orphan cleanup tasks running, which could eat up resources on their system.
Perhaps we should just rely on the weekly orphan cleanup?
If we do that, I just suggest that we change the following error messages to tell the users to run orphan cleanup instead of run a complete sync:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, maybe we should change those error messages with your PR here regardless. Complete sync no longer deletes the orphaned units itself.
end | ||
|
||
def run | ||
content_types_to_index = [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
content_types_to_index
makes it sound a bit like the types will be indexed instead of checked for orphans.
repo = ::Katello::Repository.find(input[:id]) | ||
content_types_to_index = repo.repository_type.content_types_to_index | ||
else | ||
fail "Pass either a repository to determine content type or destroy_all to destroy all orphaned content units" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think about removing the destroy_all
param? That way only repo_id
would be needed and the logic would be a bit simplified. Orphan cleanup should only be deleting useless data, so I don't think a safety net is needed.
Updated the PR to run orphan content unit deletion only in orphan cleanup task which should run once a week in deployed environments. |
87f89d9
to
5cc856b
Compare
PR title needs an update now ;) |
Should be ready for re-review. |
@@ -7,27 +7,16 @@ class RemoveOrphans < Actions::Base | |||
end | |||
def plan(proxy) | |||
sequence do | |||
if proxy.pulp_primary? | |||
::Katello::RootRepository.orphaned.destroy_all |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason why this was pulled out of the run phase?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not really..This was the only operation in the run phase after refactoring the content unit deletion..and this doesn't seem like something that needs to be in the run phase..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there's a DB operation, should try not to put it in the run phase because run
doesn't run in a transaction :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't a heavy DB operation in the planning phase eat up a Puma worker? Also we shouldn't need transactions here at least since we're deleting orphans. But even if we did need one, couldn't we do a transaction
block?
(I don't think deleting orphaned root repos will be a heavy operation, but I'm a little confused because we stick heavy DB ops in tasks in other places)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little confused because we stick heavy DB ops in tasks in other places
Usually they're in plan
or finalize
though, right? Those are fine because they run in transactions by default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack to that..We tend to put only things that need API calls to other products in run phase generally. For this task, I'd be ok keeping this in run phase over the plan phase. Any failures on this are set to rescue=skip and also we do want to still plan other pulp orphan deletion actions even if the orphan deletion in katello fails. A plan phase failure wouldn't plan those pulp tasks.
We could put this in finalize if we want. Let me know if we feel strongly about that and I can update this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense to me. I trust your judgment to choose the phase correctly :)
5cc856b
to
5bf839e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Works for me!
Katello#10638) * Fixes #36563 - Move remove orphan content units to async task * Fixes #36563 - Run orphan content unit deletion during orphan cleanup only (cherry picked from commit 51f7a12)
Katello#10638) * Fixes #36563 - Move remove orphan content units to async task * Fixes #36563 - Run orphan content unit deletion during orphan cleanup only (cherry picked from commit 51f7a12)
What are the changes introduced in this pull request?
Move remove content unit orphan deletionout of index_content to avoid increasing indexing times which impacts both syncs and publishes. Instead orphan deletion only happens during OrphanCleanup now.
Considerations taken when implementing this change?
What are the testing steps for this pull request?
Sync a repo. You should see the task: "Remove orphaned content units" run after completion of sync or publish CV actions asynchronously.0. Check
Katello::Rpm.count
in foreman-rake console.Katello::Rpm.count
in console.Katello::Rpm.count
should stay the same.bundle exec rails katello:delete_orphaned_content
. Go to foreman task in UI and wait for it to complete.Katello::Rpm.count
again. The orphaned dummy rpm records should get cleaned up..If you want to use real data, sync several large repos, publish them into a few CVs and create several versions.
That should create enough data to notice slow IndexContent action before this change.