Add ability to drop instance label from span metrics#5706

Merged
ie-pham merged 9 commits into grafana:main from ie-pham:instance on Oct 6, 2025

@ie-pham ie-pham commented Oct 3, 2025

What this PR does: The `instance` label on span metrics series tends to have very high cardinality. This PR adds a new config option, `enable_instance_label`. It defaults to `true` and can be set to `false` to omit the `instance` label when generating span metrics series.
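
As a rough illustration of the option described above, the following Go sketch shows a config struct in which the new flag defaults to `true`. The type name `SpanMetricsConfig` and the helper `defaultSpanMetricsConfig` are hypothetical and not taken from the Tempo source; only the YAML keys `enable_target_info` and `enable_instance_label` come from this PR.

```go
package main

import "fmt"

// SpanMetricsConfig is a hypothetical, trimmed-down stand-in for the span
// metrics processor config; only the two flags discussed here are shown.
type SpanMetricsConfig struct {
	EnableTargetInfo    bool `yaml:"enable_target_info"`
	EnableInstanceLabel bool `yaml:"enable_instance_label"`
}

// defaultSpanMetricsConfig (hypothetical helper) applies the default
// described in this PR: the instance label is kept unless explicitly disabled.
func defaultSpanMetricsConfig() SpanMetricsConfig {
	return SpanMetricsConfig{EnableInstanceLabel: true}
}

func main() {
	cfg := defaultSpanMetricsConfig()
	fmt.Println("default enable_instance_label:", cfg.EnableInstanceLabel)

	// Opting out to reduce cardinality.
	cfg.EnableInstanceLabel = false
	fmt.Println("after opt-out:", cfg.EnableInstanceLabel)
}
```

Defaulting to `true` keeps existing series unchanged for users who do not touch the setting.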

Which issue(s) this PR fixes:
Fixes #4873

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

- `status_message` (optionally enabled) - The message that details the reason for the `status_code` label
- `job` - The name of the job, a combination of namespace and service; only added if `metrics_generator.processor.span_metrics.enable_target_info: true`
- `instance` - The instance ID; only added if `metrics_generator.processor.span_metrics.enable_target_info: true`
- `instance` - The instance ID; only added if `metrics_generator.processor.span_metrics.enable_target_info: true` and can be dropped by setting `metrics_generator.processor.span_metrics.drop_instance_label: true`
Contributor

@rlankfo isn't `instance` semi-required for target_info to work correctly?

Contributor

I.e., when we go to set the value on the Gauge, you're likely to override (or fail to override) it incorrectly in `SetForTargetInfo`.

Contributor Author

can you explain why it would not work appropriately?

Contributor Author

The original requirement also states that if `instance` is empty, it should be skipped entirely, which sounds to me like it's not a hard requirement.

Contributor

@mattdurham mattdurham Oct 6, 2025

If there is no job or instance, the metric will never get recorded, which feels like different behavior from reducing cardinality. The change in behavior: if there is no job label and drop instance is true, the metric will not be recorded, whereas previously it would have been. Granted, it should be pretty rare for a service.name (job) not to be declared.
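
The behavior change described in this comment can be sketched as follows. This is a simplified model, not the Tempo implementation: `config` and `identifyingLabels` are hypothetical names, the flag uses the `enable_instance_label` naming from the PR description, and the rule that a series is recorded only when at least one of `job` or `instance` is present is assumed from the comment above.

```go
package main

import "fmt"

// config is a hypothetical stand-in for the processor config.
type config struct {
	EnableInstanceLabel bool
}

// identifyingLabels returns the identifying labels that would be attached
// and whether the series would be recorded at all. Assumption: recording
// requires at least one of job or instance to be present.
func identifyingLabels(cfg config, jobName, instanceID string) (labels []string, recorded bool) {
	// add job label only if job is not blank
	if jobName != "" {
		labels = append(labels, "job")
	}
	// add instance label only if instance is not blank and the label is enabled
	if instanceID != "" && cfg.EnableInstanceLabel {
		labels = append(labels, "instance")
	}
	return labels, len(labels) > 0
}

func main() {
	// No job, instance present: recorded with the label enabled...
	_, before := identifyingLabels(config{EnableInstanceLabel: true}, "", "pod-1")
	// ...but not recorded once the instance label is disabled.
	_, after := identifyingLabels(config{EnableInstanceLabel: false}, "", "pod-1")
	fmt.Println("recorded with instance label enabled:", before)
	fmt.Println("recorded with instance label disabled:", after)
}
```

Under these assumptions, disabling the label only changes whether a series is recorded in the rare case where the job name is blank; when a job is present, the series is still recorded, just without `instance`.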

@@ -206,7 +206,7 @@ func (p *Processor) aggregateMetricsForSpan(svcName string, jobName string, inst
labelValues = append(labelValues, jobName)
}
// add instance label only if job is not blank
Contributor

Not related to your code, but this comment looks like it's wrong.

Contributor Author

Yeah, this is a change to the original requirement. I will change the comments.

}
// add instance label to target info only if job is not blank
-if instanceID != "" {
+if instanceID != "" && !p.Cfg.DropInstanceLabel {
Contributor

same comment issue here.

- `status_message` (optionally enabled) - The message that details the reason for the `status_code` label
- `job` - The name of the job, a combination of namespace and service; only added if `metrics_generator.processor.span_metrics.enable_target_info: true`
- `instance` - The instance ID; only added if `metrics_generator.processor.span_metrics.enable_target_info: true`
- `instance` - The instance ID; only added if `metrics_generator.processor.span_metrics.enable_target_info: true` and can be dropped by setting `metrics_generator.processor.span_metrics.enable_instance_label: false`
Contributor

@mattdurham mattdurham Oct 6, 2025

Thoughts on changing this to say

The instance ID; only added if `metrics_generator.processor.span_metrics.enable_target_info: true` 
and `metrics_generator.processor.span_metrics.enable_instance_label: true` 

Contributor

@mattdurham mattdurham left a comment

lgtm

@ie-pham ie-pham merged commit 7c925e6 into grafana:main Oct 6, 2025
23 checks passed
Development

Successfully merging this pull request may close these issues.

High Cardinality Issue with service.instance.id in Span Metrics Processor

2 participants