Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #7495 +/- ##
==========================================
+ Coverage 81.60% 81.68% +0.07%
==========================================
Files 357 359 +2
Lines 27294 27532 +238
==========================================
+ Hits 22274 22490 +216
- Misses 3816 3822 +6
- Partials 1204 1220 +16
|
4983e8b to
2eb99db
Compare
2eb99db to
4d008b3
Compare
easwars
left a comment
There was a problem hiding this comment.
LGTM, modulo minor comments.
| cacheSizeMetric.Record(dc.metricsRecorder, 0, grpcTarget, "", dc.uuid) | ||
| cacheEntriesMetric.Record(dc.metricsRecorder, 0, grpcTarget, "", dc.uuid) |
There was a problem hiding this comment.
Is this really required? The grpc.lb.rls.server_target is empty here. So, will this measurement even be useful?
There was a problem hiding this comment.
I think so. This gives a measurement at the beginning that the cache is current of 0 (vs unset and not showing up). This is a gauge, so just the most recent state of the system, but I think it makes sense to at construction time give the state that it's empty).
There was a problem hiding this comment.
Discussed this offline with Doug and Yash, at first we thought leave it but then they mused about the same point you brought up, which is this gauge as written will live around the lifetime of the binary with an empty target. So this is actually a valid correctness issue. Thanks for catching this.
There was a problem hiding this comment.
I brought this up in the Observability thread alongside another one of my concerns, but this case is the main one that is seen as a correctness issue. If the rls server target changes over the lifetime of the balancer, it's up to exporter to have logic for the same uuid gauge with a different rls server target. Yash mentioned dashboards can group on uuid, and then see the rls server target change over time, so WAI.
|
|
||
| func (r *NoopMetricsRecorder) RecordInt64Count(_ *estats.Int64CountHandle, _ int64, _ ...string) {} | ||
|
|
||
| func (r *NoopMetricsRecorder) RecordFloat64Count(_ *estats.Float64CountHandle, _ float64, _ ...string) { |
There was a problem hiding this comment.
gofmt is crazy that it allows { and } to be on the same line for some of the methods and not for other methods. I've noticed this when making changes too.
You could get rid of the _ parameter names for all the methods though, since none of the parameters are used.
There was a problem hiding this comment.
Ah I see. Will switch to one line, and remove _ parameter name.
After some discussion, decided to just emit gauge metrics inline using synchronous gauges with cache mutex held. OpenTelemetry just copies this data over to another store for exporters to use, so this shouldn't introduce a performance hit/lock contention, especially since the accesses cache creates lock contention on the operations even before metrics.
As suggested in #7484 (comment).
RELEASE NOTES: