O+M 2023-07-28 #4399

Closed
10 tasks done
btylerburton opened this issue Jul 21, 2023 · 1 comment
Assignees
Labels
O&M Operations and maintenance tasks for the Data.gov platform

Comments

@btylerburton
Contributor

btylerburton commented Jul 21, 2023

As part of day-to-day operation of Data.gov, there are many Operations and Maintenance (O&M) responsibilities. Rather than having the entire team watch notifications and risk some slipping through the cracks, we have created an O&M Triage role. One person on the team is assigned the Triage role, which rotates each sprint. This is not meant to be a 24/7 responsibility; it covers East Coast business hours only. If you will be unavailable, please note when in Slack and ask someone to take on the role for that time.

Check the O&M Rotation Schedule for future planning.

Miscs

Acceptance criteria

You are responsible for all O&M responsibilities this week. We've highlighted a few so they're not forgotten. You can copy each checklist into your daily report.

Daily Checklist

Check Production State/Actions

Note: Catalog Auto Tasks
You will need to update the chart values manually. Click the Action link in each issue, grab the values from the monitor task output, and check the runtime.

Weekly Checklist

@btylerburton added the O&M label Jul 21, 2023
@btylerburton self-assigned this Jul 21, 2023
@FuhuXia self-assigned this Jul 25, 2023
@btylerburton removed their assignment Jul 31, 2023
@FuhuXia closed this as completed Aug 1, 2023
@FuhuXia
Member

FuhuXia commented Aug 3, 2023

catalog-admin was down on 7/28 with a Solr error, caused by an egress proxy error.
Slack discussion for the incident.
Issue created to monitor the egress proxy and alert us when it is not functioning.
