
Metrics generator crash in 2.10.0 #6396

@AndreZiviani

Description


Describe the bug
After upgrading from 2.9 to 2.10, the metrics-generator started crashing:

panic: Invalid name validation scheme requested: unset
goroutine 797 [running]:
github.com/prometheus/common/model.ValidationScheme.IsValidLabelName(0xc013c06dcf?, {0xc019c6b5c0?, 0x1d?})
        /home/runner/work/tempo/tempo/vendor/github.com/prometheus/common/model/metric.go:203 +0xb4
github.com/prometheus/prometheus/model/relabel.relabel(0xc000039960, 0xc000505900)
        /home/runner/work/tempo/tempo/vendor/github.com/prometheus/prometheus/model/relabel/relabel.go:340 +0x4c5
github.com/prometheus/prometheus/model/relabel.ProcessBuilder(...)
        /home/runner/work/tempo/tempo/vendor/github.com/prometheus/prometheus/model/relabel/relabel.go:290
github.com/prometheus/prometheus/storage/remote.(*QueueManager).StoreSeries(0xc0020be000, {0xc01c800000, 0x874e, 0x41d634?}, 0x0)
        /home/runner/work/tempo/tempo/vendor/github.com/prometheus/prometheus/storage/remote/queue_manager.go:999 +0x30d
github.com/prometheus/prometheus/tsdb/wlog.(*Watcher).readSegment(0xc0020a2780, 0xc00401e000, 0x0, 0x1)
        /home/runner/work/tempo/tempo/vendor/github.com/prometheus/prometheus/tsdb/wlog/watcher.go:520 +0x548
github.com/prometheus/prometheus/tsdb/wlog.(*Watcher).readAndHandleError(0xc0020a2780, 0xc00401e000, 0x0, 0x1, 0x7fffffffffffffff)
        /home/runner/work/tempo/tempo/vendor/github.com/prometheus/prometheus/tsdb/wlog/watcher.go:351 +0x4a
github.com/prometheus/prometheus/tsdb/wlog.(*Watcher).watch(0xc0020a2780, 0x0, 0x1)
        /home/runner/work/tempo/tempo/vendor/github.com/prometheus/prometheus/tsdb/wlog/watcher.go:443 +0x645
github.com/prometheus/prometheus/tsdb/wlog.(*Watcher).Run(0xc0020a2780)
        /home/runner/work/tempo/tempo/vendor/github.com/prometheus/prometheus/tsdb/wlog/watcher.go:319 +0x55a
github.com/prometheus/prometheus/tsdb/wlog.(*Watcher).loop(0xc0020a2780)
        /home/runner/work/tempo/tempo/vendor/github.com/prometheus/prometheus/tsdb/wlog/watcher.go:268 +0x118
created by github.com/prometheus/prometheus/tsdb/wlog.(*Watcher).Start in goroutine 447
        /home/runner/work/tempo/tempo/vendor/github.com/prometheus/prometheus/tsdb/wlog/watcher.go:242 +0xed
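The panic originates in prometheus/common's `ValidationScheme.IsValidLabelName`, which appears to panic when the scheme is still its zero value ("unset") rather than falling back to a default. The following is a minimal, self-contained sketch of that failure mode; the type and method names mirror the trace, but this is an illustration of the zero-value-enum pattern, not the vendored prometheus/common code:

```go
package main

import "fmt"

// ValidationScheme mimics an enum whose zero value means "never configured".
type ValidationScheme int

const (
	UnsetValidation ValidationScheme = iota // zero value: not explicitly set
	LegacyValidation
	UTF8Validation
)

func (s ValidationScheme) String() string {
	switch s {
	case LegacyValidation:
		return "legacy"
	case UTF8Validation:
		return "utf8"
	default:
		return "unset"
	}
}

// IsValidLabelName panics for an unset scheme, mirroring the
// "Invalid name validation scheme requested: unset" panic in the trace.
// The actual validation logic is elided.
func (s ValidationScheme) IsValidLabelName(name string) bool {
	switch s {
	case LegacyValidation, UTF8Validation:
		return name != ""
	default:
		panic(fmt.Sprintf("Invalid name validation scheme requested: %s", s))
	}
}

func main() {
	var scheme ValidationScheme // zero value, i.e. "unset"
	defer func() {
		if r := recover(); r != nil {
			fmt.Println("panic:", r)
		}
	}()
	scheme.IsValidLabelName("client_deployment_environment")
}
```

If this matches the real code path, the crash would mean the relabel path in Tempo 2.10 never initializes the validation scheme before the remote-write queue manager processes series.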

To Reproduce
Steps to reproduce the behavior:

  1. Install Tempo with the tempo-distributed helm chart; I think this is the relevant section:
        metricsGenerator:
          enabled: true
          replicas: 4
          resources:
            requests:
              memory: 3Gi
              cpu: 1000m
          config:
            processor:
              service_graphs:
                max_items: 100000
                dimensions:
                  - deployment.environment
              span_metrics:
                intrinsic_dimensions:
                  status_code: true
                dimensions:
                  - deployment.environment
                  - http.status_code
                  - http.route
                  - rpc.system
                  - rpc.service
                  - rpc.method
                  - rpc.grpc.status_code
              local_blocks:
                filter_server_spans: false
            storage:
              remote_write:
                - url: "http://mimir-gateway.mimir.svc.cluster.local/api/v1/push"
                  write_relabel_configs:
                    - source_labels: [ deployment_environment ]
                      regex: (.*)
                      target_label: client_deployment_environment
        overrides:
          defaults:
            global:
              max_bytes_per_trace: 25000000
            ingestion:
              burst_size_bytes: 50000000
              rate_limit_bytes: 30000000
              max_traces_per_user: 20000
            metrics_generator:
              processors:
                - service-graphs
                - span-metrics
                - local-blocks
              processor:
                span_metrics:
                  enable_target_info: true
                  target_info_excluded_dimensions:
                    - k8s_node_name
                    - k8s_pod_ip
                    - k8s_pod_name
                    - k8s_pod_start_time
                    - k8s_pod_uid
                    - process_command
                    - process_command_args
                    - process_executable_name
                    - process_executable_path
                    - process_owner
                    - process_pid

Expected behavior
The metrics-generator should not crash after the upgrade.

Environment:

  • Infrastructure: Kubernetes (tempo-distributed)
  • Deployment tool: helm

Additional Context
No configuration changes were made besides the version upgrade.
