Merged
22 changes: 11 additions & 11 deletions docs/sources/tempo/configuration/_index.md
@@ -1955,16 +1955,16 @@ overrides:
# Honored by both global and local strategies. With global, this value
# is divided across healthy distributors.
# Results in errors like
-# RATE_LIMITED: ingestion rate limit (15000000 bytes) exceeded while
+# RATE_LIMITED: ingestion rate limit (30000000 bytes) exceeded while
# adding 10 bytes
-[rate_limit_bytes: <int> | default = 15000000 (15MB) ]
+[rate_limit_bytes: <int> | default = 30000000 (30MB) ]

# Burst size (bytes) used in ingestion.
# Results in errors like
-# RATE_LIMITED: ingestion rate limit (20000000 bytes) exceeded while
+# RATE_LIMITED: ingestion rate limit (30000000 bytes) exceeded while
# adding 10 bytes
# Ignores rate strategy and is always local.
-[burst_size_bytes: <int> | default = 20000000 (20MB) ]
+[burst_size_bytes: <int> | default = 30000000 (30MB) ]
Comment on lines 1957 to +1967
Copilot AI Apr 22, 2026

MEDIUM: The `RATE_LIMITED` error examples here don't match the current distributor error text. In `modules/distributor/distributor.go` the message includes local/global bytes/s and burst (and user), e.g. `ingestion rate limit (local: ... bytes/s, global: ... bytes/s, burst: ... bytes) ...`. Since this section is being updated anyway, could we align the example to the actual message? The same issue also appears in the troubleshooting page that shows the log line.

# Maximum number of active traces per user, per live-store instance.
# Not affected by rate_strategy.
@@ -2302,8 +2302,8 @@ The `rate_strategy` setting controls how the distributor's rate limit scales acr

| Strategy | When to use | How it works |
|---|---|---|
-| **`local`** (default) | You want each distributor to independently handle a fixed rate, and you accept that the effective cluster rate grows as you add distributors. | Each distributor enforces the full configured `rate_limit_bytes` value. With 5 distributors at `15 MB/s`, the cluster allows up to `75 MB/s`. |
-| **`global`** | You need a predictable cluster-wide ingestion budget that stays constant regardless of how many distributors you run. | The configured `rate_limit_bytes` is divided across healthy distributors. With 5 distributors at `15 MB/s`, each allows `3 MB/s`. |
+| **`local`** (default) | You want each distributor to independently handle a fixed rate, and you accept that the effective cluster rate grows as you add distributors. | Each distributor enforces the full configured `rate_limit_bytes` value. With 5 distributors at `30 MB/s`, the cluster allows up to `150 MB/s`. |
+| **`global`** | You need a predictable cluster-wide ingestion budget that stays constant regardless of how many distributors you run. | The configured `rate_limit_bytes` is divided across healthy distributors. With 5 distributors at `30 MB/s`, each allows `6 MB/s`. |
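
The arithmetic behind the two strategies can be sketched in a few lines of Go. This is only an illustration of the numbers in the table, not Tempo's implementation: the helper names `perDistributorRate` and `clusterRate` are hypothetical, and the sketch assumes every distributor is healthy and that the limit divides evenly.

```go
package main

import "fmt"

// perDistributorRate returns the rate (bytes/s) a single distributor would
// enforce for the configured rate_limit_bytes. Hypothetical helper used only
// to illustrate the table; it is not part of Tempo.
func perDistributorRate(strategy string, rateLimitBytes, healthyDistributors int) int {
	if strategy == "global" && healthyDistributors > 0 {
		// global: the configured limit is shared across healthy distributors.
		return rateLimitBytes / healthyDistributors
	}
	// local: every distributor enforces the full configured limit.
	return rateLimitBytes
}

// clusterRate returns the effective cluster-wide rate (bytes/s).
func clusterRate(strategy string, rateLimitBytes, healthyDistributors int) int {
	return perDistributorRate(strategy, rateLimitBytes, healthyDistributors) * healthyDistributors
}

func main() {
	const limit = 30_000_000 // new default rate_limit_bytes (30 MB/s)
	for _, strategy := range []string{"local", "global"} {
		fmt.Printf("%s: per-distributor=%d cluster=%d\n",
			strategy,
			perDistributorRate(strategy, limit, 5),
			clusterRate(strategy, limit, 5))
	}
}
```

With 5 distributors this prints `local: per-distributor=30000000 cluster=150000000` and `global: per-distributor=6000000 cluster=30000000`, matching the `150 MB/s` and `6 MB/s` figures in the updated rows.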

```yaml
overrides:
@@ -2324,25 +2324,25 @@ The following table shows where each ingestion limit is enforced and whether it

##### Examples

-Each distributor instance independently allows `15 MB/s`:
+Each distributor instance independently allows `30 MB/s`:

```yaml
overrides:
defaults:
ingestion:
rate_strategy: local
-rate_limit_bytes: 15000000
+rate_limit_bytes: 30000000
```

-All distributors share a total cluster rate of `15 MB/s`.
-With 5 distributors, each instance allows `3 MB/s`:
+All distributors share a total cluster rate of `30 MB/s`.
+With 5 distributors, each instance allows `6 MB/s`:

```yaml
overrides:
defaults:
ingestion:
rate_strategy: global
-rate_limit_bytes: 15000000
+rate_limit_bytes: 30000000
```

For guidance on sizing these limits for your workload, refer to [Manage trace ingestion](https://grafana.com/docs/tempo/<TEMPO_VERSION>/operations/manage-trace-ingestion/).
4 changes: 2 additions & 2 deletions docs/sources/tempo/configuration/manifest.md
@@ -686,8 +686,8 @@ overrides:
defaults:
ingestion:
rate_strategy: local
-rate_limit_bytes: 15000000
-burst_size_bytes: 20000000
+rate_limit_bytes: 30000000
+burst_size_bytes: 30000000
max_traces_per_user: 10000
retry_info_enabled: true
read:
@@ -35,7 +35,7 @@ level=info ts=2024-08-19T16:06:25.881169385Z caller=distributor.go:767 msg=disca
The distributor checks ingestion rate limits before writing to Kafka. If spans are refused due to rate limits, you'll see logs like this at the distributor:

```
-msg="pusher failed to consume trace data" err="rpc error: code = ResourceExhausted desc = RATE_LIMITED: ingestion rate limit (15000000 bytes) exceeded while adding 10 bytes"
+msg="pusher failed to consume trace data" err="rpc error: code = ResourceExhausted desc = RATE_LIMITED: ingestion rate limit (30000000 bytes) exceeded while adding 10 bytes"
```

You'll also see the following metric incremented. The `reason` label on this metric will contain information about the refused reason.
4 changes: 2 additions & 2 deletions modules/overrides/config.go
@@ -377,8 +377,8 @@ func (c *Config) RegisterFlagsAndApplyDefaults(f *flag.FlagSet) {
// Distributor LegacyOverrides
c.Defaults.Ingestion.RetryInfoEnabled = true // enabled in overrides by default but it's disabled with RetryAfterOnResourceExhausted = 0
f.StringVar(&c.Defaults.Ingestion.RateStrategy, "distributor.rate-limit-strategy", "local", "Whether the various ingestion rate limits should be applied individually to each distributor instance (local), or evenly shared across the cluster (global).")
-f.IntVar(&c.Defaults.Ingestion.RateLimitBytes, "distributor.ingestion-rate-limit-bytes", 15e6, "Per-user ingestion rate limit in bytes per second.")
-f.IntVar(&c.Defaults.Ingestion.BurstSizeBytes, "distributor.ingestion-burst-size-bytes", 20e6, "Per-user ingestion burst size in bytes. Should be set to the expected size (in bytes) of a single push request.")
+f.IntVar(&c.Defaults.Ingestion.RateLimitBytes, "distributor.ingestion-rate-limit-bytes", 30e6, "Per-user ingestion rate limit in bytes per second.")
+f.IntVar(&c.Defaults.Ingestion.BurstSizeBytes, "distributor.ingestion-burst-size-bytes", 30e6, "Per-user ingestion burst size in bytes. Should be set to the expected size (in bytes) of a single push request.")
Comment on lines +380 to +381
Copilot AI Apr 22, 2026

MEDIUM: This changes the default ingestion rate/burst limits, which is a user-facing behavior change. Could you add a corresponding entry under `## main / unreleased` in `CHANGELOG.md` (following the existing category/order format) so operators see the new defaults in release notes?

// Ingester limits
f.IntVar(&c.Defaults.Ingestion.MaxLocalTracesPerUser, "ingester.max-traces-per-user", 10e3, "Maximum number of active traces per user, per ingester. 0 to disable.")
8 changes: 8 additions & 0 deletions modules/overrides/config_test.go
@@ -94,6 +94,14 @@ max_search_duration: 5m
assert.Equal(t, limitsYAML, limitsJSON)
}

+func TestConfig_DefaultIngestionLimits(t *testing.T) {
+	cfg := Config{}
+	cfg.RegisterFlagsAndApplyDefaults(flag.NewFlagSet("test", flag.ContinueOnError))
+
+	assert.Equal(t, 30_000_000, cfg.Defaults.Ingestion.RateLimitBytes)
+	assert.Equal(t, 30_000_000, cfg.Defaults.Ingestion.BurstSizeBytes)
+}

func TestConfig_legacy(t *testing.T) {
legacyRawYaml := `
ingestion_rate_strategy: local