Impact of the new feature
WMAgent
Is your feature request related to a problem? Please describe.
The idea is to avoid keeping too large a backlog in the agent's queue (jobs to be created, jobs created, jobs pending), so that badly behaving workflows, especially those with very short jobs, don't crash the agents and/or blow up the components' duty cycle.
Describe the solution you'd like
Making up a reasonable enough number, I would say keeping between 50k and 70k jobs pending in the condor queue (and nothing else queued in the local workqueue and/or in created status) would be a good commitment.
This can be achieved by tweaking one, or both, of the WorkQueueManager parameters:
https://github.com/dmwm/WMCore/blob/a63cf47/etc/WMAgentConfig.py#L147-L148
or the AgentStatusWatcher pending attributes:
https://github.com/dmwm/WMCore/blob/a63cf47/etc/WMAgentConfig.py#L355-L356
Note that for the AgentStatusWatcher attributes, the actual thresholds are weighted according to the number of agents connected to the same team name (the more agents share the same team, the smaller those pending thresholds are).
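To make the weighting concrete, here is a rough sketch of how a pending threshold could shrink as more agents join a team. It assumes a plain even split of one configured value across the agents. The function name `per_agent_pending_threshold` and the value 60000 are hypothetical and only for illustration; the real AgentStatusWatcher logic may weight things differently.

```python
def per_agent_pending_threshold(configured_threshold: int, agents_in_team: int) -> int:
    """Return the pending threshold one agent would see.

    Assumption (not taken from WMCore): the configured value is split
    evenly among all agents connected to the same team name.
    """
    if agents_in_team < 1:
        raise ValueError("agents_in_team must be at least 1")
    return configured_threshold // agents_in_team


if __name__ == "__main__":
    # Hypothetical configured value, chosen only for illustration.
    configured = 60_000
    for n_agents in (1, 4, 5):
        effective = per_agent_pending_threshold(configured, n_agents)
        print(f"{n_agents} agent(s) in team: {effective} pending jobs per agent")
```

Under this assumption, the value configured for a team with 4 or 5 agents would have to be chosen with the division in mind, which is one reason to test a few configurations in a multi-agent setup.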
Describe alternatives you've considered
We will probably have to test out a few different configurations with a production scenario, hence with 4 or 5 agents connected to the same team name.
Additional context
None