Add spark.stage details attribute at the end of the stage #7608

Merged
1 commit merged into master from paul.laffon/spark-stage-details
Sep 12, 2024


paul-laffon-dd
Contributor

What Does This Do

Adds the spark.stage details attribute at the end of the stage, rather than at the beginning.

Motivation

The details attribute contains a large amount of data, including the full stack trace that initiated the stage. With long-running spans, the span is flushed multiple times while the stage is still running, so an attribute set at the start of the stage is sent again with each flush, which can significantly increase the ingestion volume.

By adding the details attribute at the end of the stage, we reduce the ingestion volume while still ensuring the information is available once the stage has completed.
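
The effect can be illustrated with a small, self-contained sketch. It does not use the tracer's real API: `FakeSpan`, `simulate`, the tag keys, and the assumption that every flush resends all tags currently set on the span are hypothetical stand-ins chosen only to show why setting a large attribute late reduces the total volume sent.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of a long-running span; not the tracer's real classes.
class StageDetailsSketch {

    static final class FakeSpan {
        private final Map<String, String> tags = new LinkedHashMap<>();
        private final List<Integer> flushedSizes = new ArrayList<>();

        void setTag(String key, String value) {
            tags.put(key, value);
        }

        // Assumption: each flush sends every tag currently set on the span.
        void flush() {
            int size = 0;
            for (Map.Entry<String, String> e : tags.entrySet()) {
                size += e.getKey().length() + e.getValue().length();
            }
            flushedSizes.add(size);
        }

        // Total number of characters sent across all flushes.
        int totalFlushedChars() {
            int total = 0;
            for (int s : flushedSizes) {
                total += s;
            }
            return total;
        }
    }

    // Simulates one stage: some intermediate flushes while the stage runs,
    // then one final flush when the stage completes.
    static int simulate(boolean detailsAtStart, int intermediateFlushes, String details) {
        FakeSpan span = new FakeSpan();
        span.setTag("stage_id", "1");
        if (detailsAtStart) {
            span.setTag("details", details); // old behavior: set at stage start
        }
        for (int i = 0; i < intermediateFlushes; i++) {
            span.flush();
        }
        if (!detailsAtStart) {
            span.setTag("details", details); // new behavior: set at stage end
        }
        span.flush(); // final flush at stage completion
        return span.totalFlushedChars();
    }

    public static void main(String[] args) {
        String details = "x".repeat(1000); // stands in for a large stack trace
        System.out.println("details at start: " + simulate(true, 3, details));
        System.out.println("details at end:   " + simulate(false, 3, details));
    }
}
```

In this model the large value is counted once per flush when it is set at the start, and only once, in the final flush, when it is set at the end. The completed span carries the same tags either way.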

Additional Notes

Contributor Checklist

Jira ticket: [PROJ-IDENT]

paul-laffon-dd added the "inst: apache spark" (Apache Spark instrumentation) label Sep 12, 2024
paul-laffon-dd marked this pull request as ready for review September 12, 2024 10:12
paul-laffon-dd requested a review from a team as a code owner September 12, 2024 10:12
paul-laffon-dd merged commit a472c9d into master Sep 12, 2024
83 checks passed
paul-laffon-dd deleted the paul.laffon/spark-stage-details branch September 12, 2024 15:16
github-actions bot added this to the 1.40.0 milestone Sep 12, 2024
Labels
inst: apache spark (Apache Spark instrumentation)
type: enhancement