Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Investigate use of seq_trace label for span context #2

Open
tsloughter opened this issue May 30, 2019 · 13 comments
Open

Investigate use of seq_trace label for span context #2

tsloughter opened this issue May 30, 2019 · 13 comments
Assignees

Comments

@tsloughter
Copy link
Collaborator

Add Vince to assignees when he accepts his invite to the repo.

From the meeting minutes:

  • Consider using seq_trace to support tracing independent of library (The main issue is that it could interfere with other uses of the tracing token)
  • In terms of logger metadata, we should plan to use OTP21 logger, which is planned to be integrated with Elixir in 1.10 or 1.11 (2020)
  • Ideally, the seq_trace token would be a map, so that multiple use-cases could share it using different keys (i.e. users would get_token, update the map, and set_token)
  • Perhaps the initial focus should be on explicit white-box tracing?
  • seq_trace only propagates context between processes via message passing - creation of spans is still as normal and not magical/automatic.
  • Action: Vince & Greg to look into whether seq_trace is going to be usable for us before the next meeting.
@zachdaniel
Copy link
Collaborator

Vince made this and linked to it in the minutes google doc, but I'm posting it here so everyone can see it:

https://github.com/binaryseed/seq_trace_intro

@binaryseed binaryseed self-assigned this May 30, 2019
@binaryseed
Copy link
Collaborator

Before we can determine if it's usable, we need to clarify what it'd be used for...

Are we suggesting adopting this in individual projects that do tracing like Open Census , or creating some kind of light-weight telemetry_tracing like project that would be adopted by various projects?

@tsloughter
Copy link
Collaborator Author

I do wish we had a shared "context" library. I threw this together for opencensus and grpcbox https://github.com/tsloughter/ctx

But that is bigger than span context, it is what we keep span context in when not using the pdict. Even though it would not hurt to have tags and deadlines propagate with messages, I wouldn't want to shove that in the seqtrace label.

Back on topic.. we probably don't need to create a project for just this.

@tsloughter
Copy link
Collaborator Author

I don't expect to see an issue but we probably want to do some benchmarks on use of process dict, seq_trace and regular map with span context inside. Including to make sure it doesn't hurt message passing performance.

@zachdaniel
Copy link
Collaborator

That’s a good point. If we went this route, we’d be forcing all messages sent from a traced process to include additional data, as opposed to having to “manually” propagate context.

@tsloughter
Copy link
Collaborator Author

I think that is fine when it is just a few integers. But it should be benched.

@tsloughter
Copy link
Collaborator Author

I just opened this on opencensus but it is partly related to this issue and measuring performance of different context storage methods census-instrumentation/opencensus-erlang#155

@zachdaniel
Copy link
Collaborator

I suspect you are right, but it would be a new side effect that we would probably want to make very clear. I think your point in our meeting was a good one that this sort of "hijacks" a utility that can only have a single user at any one time. I also agree with @binaryseed that that is probably not motivation enough to not use it (especially if it can be disabled via configuration). But if someone has some tight system level constraints, adding even small overhead to message passing could theoretically be problematic. Or if someone has a multi-network cluster, adding extra data to messages could physically cost them money.

@binaryseed
Copy link
Collaborator

I think doing some benchmarks is a great idea. Based on my experience, I suspect that leveraging the VM to pass context will be plenty fast, likely faster than manually passing full context maps around in application code.

The pattern I'd suspect will be ideal is that the seq_trace label contains minimal information, basically just a trace identifier (and anything else needed very frequently), and we use an ETS table keyed off that ID to store any other trace data. Wether we do a ETS per trace or a single global ETS or a partitioned set could be informed by a benchmark too.

So the cases we'd want to compare, roughly:

  • seq_trace
    • label: Trace ID
    • label: full Context map
  • Manual context passing
  • Process Dictionary

@tsloughter How does the process dictionary work when crossing process boundaries? I assume there is some manual instrumentation to propagate the context?

@tsloughter
Copy link
Collaborator Author

@binaryseed for the pdict yea, you'll need something that is just like X = get(span_ctx) and then it acts the same as with manual context passing as a variable, but with the additional step of on the other side having to do put(span_ctx, X).

@binaryseed
Copy link
Collaborator

I did some benchmarking documented here: binaryseed/seq_trace_intro#1

There were pretty minimal differences in overhead (on the order of 0-2 us on my very old macbook air)

@tsloughter
Copy link
Collaborator Author

@binaryseed nice. I'll work on adding to that benchmarking to cover cases that are just regular context processing (as in, without message passing).

@tsloughter
Copy link
Collaborator Author

I've taken a first go at implementing this for OpenTelemetry open-telemetry/opentelemetry-erlang@acd6811

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants