Skip to content

Support for OTel tracing#681

Merged
colega merged 21 commits intomainfrom
otel-everything
Apr 24, 2025
Merged

Support for OTel tracing#681
colega merged 21 commits intomainfrom
otel-everything

Conversation

@colega
Copy link
Copy Markdown
Contributor

@colega colega commented Apr 17, 2025

What this PR does

This adds support for OTel everywhere. It is similar to #385 which migrated to OTel, and takes pieces of code from there, but it does not remove support for OpenTracing.

spanlogger package changes

The new spanlogger.NewOTel will create a new OTel span, using the tracer provided. We're asking for a tracer instead of using one from spanlogger because that's how Tempo did it with per-package Tracer.

spanlogger.FromContext will find the existing span in the context, either OpenTracing or OTel one, and will return a SpanLogger that abstracts both.

In order to achieve that spanlogger.SpanLogger now contains either an opentelemetry.Span or an trace.Span (OTel). If there's an OTel one, it will do the operations with it, if not it will fall back to OpenTracing implementation. Some calls that were cheaper before, like LogFields now are more expensive since we have to translate them to OTel attributes.

httpgrpc/server package changes

When serving an HTTP request, we'll use OpenTracing to start the span if opentracing.IsGlobalTracerRegistered(). If there's an OTel span in the request context, we'll inject that one to the request.

instrument.CollectedRequest() changes

This function is widely used, so I migrated it from using opentracing.StartSpanFromContext to tracing.StartSpanFromContext: a new wrapper in the tracing package that will use the configured tracing library.

tracing package changes

The new tracing.StartSpanFromContext creates an adapter tracing.Span that acts like an opentracing.Span or trace.Span depending on the library used in the parent span from the context.Context provided to StartSpanFromContext.

This abstraction should disappear once we've migrated our dependencies to OTel. 🤞 (Famous last words)

The ExtractSampledTraceID, ExtractTraceID and ExtractTraceSpanID now support both opentracing and OTel libraries. This should solve #662 as well.

Finally, I brought from #385 the NewOTelFromJaegerEnv function that will configure OTel tracing using Jaeger env vars, with Jaeger exporter.

server package changes

Server now has the otelgrpc.NewServerHandler() which will inject the OTel Spans to the context.

middleware package changes

When propagating an HTTP or HTTPGRPC requests, we'll use OpenTracing to start the span if opentracing.IsGlobalTracerRegistered(). If there's an OTel span in the request context, we'll inject that one to the request.

Which issue(s) this PR fixes

Ref: grafana/mimir#2708

Checklist

  • Tests updated
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@colega
Copy link
Copy Markdown
Contributor Author

colega commented Apr 17, 2025

Still draft because I need to fix tests, add tests, and double check the server configuration of middlewares.

colega added a commit to grafana/mimir that referenced this pull request Apr 17, 2025
Ref: grafana/dskit#681
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
@mdisibio
Copy link
Copy Markdown
Contributor

The ExtractSampledTraceID, ExtractTraceID and ExtractTraceSpanID now support both opentracing and OTel libraries. This should solve #662 as well.

Yes, I wasn't sure about taking on the reference otel/trace, but it seems ok. The ones I am familiar with, Mimir/Loki/Tempo/Pyroscope all already have it. Are there any other downstream repos we should check, or do you think the impact of taking on this reference is minimal?

colega added 9 commits April 21, 2025 12:04
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
colega added a commit to grafana/mimir that referenced this pull request Apr 21, 2025
This updates dskit to grafana/dskit#681 which
supports OTel tracing, but doesn't make any change, so it should still
use OpenTracing

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
colega added 2 commits April 21, 2025 17:21
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
@colega
Copy link
Copy Markdown
Contributor Author

colega commented Apr 22, 2025

I assume it's inevitable with my task to add the otel/trace dependency. The ultimate goal, once Loki & Pyroscope are migrated too, is to remove the opentracing-go one.

Comment thread httpgrpc/server/server.go Outdated
middleware.ClientUserHeaderInterceptor,
),
grpc.WithChainUnaryInterceptor(unaryInterceptors...),
grpc.WithStatsHandler(otelgrpc.NewClientHandler()),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not protected with if opentracing.IsGlobalTracerRegistered() , should it be?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't going to... but ok, fair enough, let's add a check: c59312f

}

func (t Tracer) wrapWithOpenTracing(next http.Handler) http.Handler {
// Do OpenTracing when it's registered.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: comment doesn't make much sense here as the if is elsewhere

Comment thread middleware/http_tracing.go Outdated
Comment thread middleware/http_tracing.go
Comment thread spanlogger/opentracing_spanlogger_test.go
Comment thread spanlogger/opentracing_spanlogger_test.go
Comment thread spanlogger/otel_spanlogger_test.go
Comment thread tracing/otel.go Outdated
Comment thread tracing/otel_test.go Outdated
Comment thread tracing/otel.go
Comment thread tracing/span.go Outdated
Comment thread tracing/span_test.go
Comment thread tracing/tracing.go
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
colega added 8 commits April 23, 2025 11:54
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Comment thread tracing/span.go Outdated
This changes behaviour of tracing.StartSpanFromContext to always start a
span, as we did previously, and we decide based on whether opentracing
is registered or not.

Also added tests, which I had to move to other packages as tracing
installation is a global state.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Copy link
Copy Markdown
Contributor

@krajorama krajorama left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@colega
Copy link
Copy Markdown
Contributor Author

colega commented Apr 23, 2025

I'm testing this PR on a local dev environment, to make sure that we didn't break any existing OpenTracing features before merging.

@colega
Copy link
Copy Markdown
Contributor Author

colega commented Apr 24, 2025

I checked the traces and they still exist.

@colega colega merged commit cee08dc into main Apr 24, 2025
5 checks passed
@colega colega deleted the otel-everything branch April 24, 2025 12:55
labeler.Add(attribute.String("http.content_type", ct))
}

labeler.Add(attribute.String("headers", fmt.Sprintf("%v", r.Header)))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we worry about sensitive headers here, e.g. authorization tokens?

See here for instance.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch:
#707

colega added a commit that referenced this pull request Jun 4, 2025
When I added the OTel tracing in #681 I went too far and added headers as tracing attributes.

As @bboreham pointed out, this could log potentially sensitive headers like auth tokens.

This is a quick fix to remove that from traces, we can add an allowlist or a blocklist in the future as a feature.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
colega added a commit that referenced this pull request Jun 4, 2025
When I added the OTel tracing in #681 I went too far and added headers as tracing attributes.

As @bboreham pointed out, this could log potentially sensitive headers like auth tokens.

This is a quick fix to remove that from traces, we can add an allowlist or a blocklist in the future as a feature.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Comment thread tracing/tracing.go
return
}

return sctx.TraceID().String(), sctx.IsSampled()
Copy link
Copy Markdown
Contributor

@slim-bean slim-bean Jun 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The previous behavior of ExtractSampledTraceID would return the traceID regardless if it was sampled or not, the new behavior only returns the traceID if it was sampled.

Unfortunately this is a breaking change for Loki where we would use this method to append the log line with additional information if the trace was sampled or not, but we relied on it returning the traceID to use in the log line regardless of being sampled.

charleskorn added a commit that referenced this pull request Aug 8, 2025
)

**What this PR does**:

This PR fixes an issue where the tracing sampler would not be configured
correctly if tracing is being configured from Jaeger environment
variables and the `JAEGER_SAMPLER_TYPE` environment variable was set to
`probabilistic`.

**Which issue(s) this PR fixes**:

Related to #681 and
#700.

**Checklist**
- [ ] Tests updated
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants