feat: Collect output from all attempts of workflows; full JSON output #544

Open. Wants to merge 1 commit into base: main

timmc-edx (Member) commented:

This is a reworking of get_action_errors.py with the following goals:

  • Retrieve all attempts of a workflow, not just the last one. This is important for collection of transient errors since people will usually re-run their workflows when there is a failure unrelated to their code.
  • Tolerate interruptions of the script due to timeouts, rate-limiting, and networking issues. On re-run, workflow runs that have been fully fetched are skipped.
  • Record the full JSON for workflow attempts, checks, and annotations so that we can do more in-depth queries if needed.

The script no longer produces a CSV, but it's still straightforward to query how often we're seeing a particular error.
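
As a rough illustration of such a query, here is a minimal sketch that counts how often an error message appears in the recorded JSON. The on-disk layout is not described in this PR, so the sketch assumes only that the output is a directory tree of `*.json` files and that annotations keep the `annotation_level` and `message` fields from the GitHub API response. The names `iter_annotations` and `count_error` are hypothetical and not part of the script.

```python
import json
from pathlib import Path


def iter_annotations(node):
    """Yield every dict in a parsed JSON tree that looks like an annotation.

    Assumption: an annotation is any dict with both "annotation_level"
    and "message" keys, wherever it sits in the recorded JSON.
    """
    if isinstance(node, dict):
        if "annotation_level" in node and "message" in node:
            yield node
        for value in node.values():
            yield from iter_annotations(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_annotations(item)


def count_error(output_dir, needle):
    """Count recorded annotations whose message contains `needle`."""
    total = 0
    for path in sorted(Path(output_dir).rglob("*.json")):
        with path.open(encoding="utf-8") as f:
            data = json.load(f)
        total += sum(
            1 for a in iter_annotations(data)
            if needle in (a.get("message") or "")
        )
    return total
```

Because the walk is layout-agnostic, it should keep working if the directory structure changes, at the cost of reading every file on each query.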

Merge checklist:
Check off if complete or not applicable:

  • Version bumped
  • Changelog record added
  • Documentation updated (not only docstrings)
  • Fixup commits are squashed away
  • Unit tests added/updated
  • Manual testing instructions provided
  • Noted any: Concerns, dependencies, migration issues, deadlines, tickets

# Get the checks associated with this workflow run -- this
# includes output title, summary, and text.
for check_run in self._github_get(attempt['check_suite_url'] + '/check-runs').json()['check_runs']:

timmc-edx (Member, Author) commented on Jan 31, 2024:

Author's note: This... isn't actually working. Different attempts on the same workflow are getting the same check runs.
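
One possible direction, untested against this script: the GitHub REST API has a per-attempt jobs endpoint (`/actions/runs/{run_id}/attempts/{attempt_number}/jobs`), and each job object it returns carries a `check_run_url`. The sketch below collects those URLs for a single attempt. `check_run_urls_for_attempt`, `get_json`, and `repo_api_url` are hypothetical names; `get_json` stands in for something like `self._github_get(url).json()`. Pagination is ignored here, and whether this actually yields distinct check runs per attempt would still need to be confirmed.

```python
def check_run_urls_for_attempt(get_json, repo_api_url, run_id, attempt_number):
    """Return the check-run URLs of the jobs belonging to one attempt.

    get_json: callable that takes a URL and returns parsed JSON
        (hypothetical stand-in for the script's own HTTP helper).
    repo_api_url: e.g. "https://api.github.com/repos/OWNER/REPO".
    """
    url = f"{repo_api_url}/actions/runs/{run_id}/attempts/{attempt_number}/jobs"
    jobs = get_json(url)["jobs"]
    # Only the first page of jobs is read; a real version would paginate.
    return [job["check_run_url"] for job in jobs]
```

Each returned URL could then be fetched directly, with `/annotations` appended to get that check run's annotations.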

# Once all attempts have been logged, write out a marker file
# that indicates this workflow has been completely downloaded
# and can be skipped in the future.
self._write_json(self.download_marker_data, download_marker)

timmc-edx (Member, Author) commented on Feb 2, 2024:

Author's note: Consider writing the latest attempt count here, and checking it at the start of the method. A workflow could have had new attempts since the last execution of the script. Somewhat of an edge case, though.
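
A minimal sketch of that idea, assuming the marker is a small JSON file and that the current attempt count is available when the method starts (the workflow run object from the API has a `run_attempt` field that could serve). The function names and the `attempt_count` key are hypothetical, not what the script currently writes.

```python
import json
from pathlib import Path


def write_marker(marker_path, attempt_count):
    """Record how many attempts had been downloaded when the run was finished."""
    Path(marker_path).write_text(
        json.dumps({"attempt_count": attempt_count}), encoding="utf-8"
    )


def is_fully_downloaded(marker_path, current_attempt_count):
    """Return True only if the marker covers every attempt that exists now."""
    marker_path = Path(marker_path)
    if not marker_path.exists():
        return False
    try:
        marker = json.loads(marker_path.read_text(encoding="utf-8"))
    except json.JSONDecodeError:
        # A truncated marker (e.g. from an interrupted write) means "re-fetch".
        return False
    if not isinstance(marker, dict):
        return False
    return marker.get("attempt_count", 0) >= current_attempt_count
```

A run with a marker from two attempts would then be re-fetched once a third attempt appears, instead of being skipped.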
