Support per-runner-flavor SQS batch size and window configuration in multi_runner_config #5106

@wadherv

Summary

lambda_event_source_mapping_batch_size and lambda_event_source_mapping_maximum_batching_window_in_seconds are currently module-level variables in the multi-runner module, applied identically to every runner flavor. In deployments with multiple runner flavors serving significantly different load profiles, this forces a single batch configuration that is either too aggressive for low-volume flavors or too conservative for high-volume ones.

scale_up_reserved_concurrent_executions is already configurable per flavor inside multi_runner_config.runner_config — batch size and window should follow the same pattern.

Current Behaviour

modules/multi-runner/runners.tf lines 68–69 pass the same module-level values to every runner module regardless of the runner flavor:

  module "runners" {
    for_each = local.runner_config
    ...
    lambda_event_source_mapping_batch_size                         = var.lambda_event_source_mapping_batch_size
    lambda_event_source_mapping_maximum_batching_window_in_seconds = var.lambda_event_source_mapping_maximum_batching_window_in_seconds
    ...
  }

modules/multi-runner/variables.tf lines 756–766 define these as single shared values:

  variable "lambda_event_source_mapping_batch_size" {
    description = "Maximum number of records to pass to the lambda function in a single batch..."
    type        = number
    default     = 10
  }

  variable "lambda_event_source_mapping_maximum_batching_window_in_seconds" {
    description = "Maximum amount of time to gather records before invoking the lambda function..."
    type        = number
    default     = 0
  }

Problem

In a real-world deployment, the runner flavors have vastly different load and capacity characteristics:

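| Flavor          | Max Runners | Traffic Level | Ideal Batch | Ideal Window |
|-----------------|-------------|---------------|-------------|--------------|
| runner-large    | 1000        | Very high     | 50          | 10s          |
| runner-security | 400         | Medium        | 20          | 10s          |
| runner-small    | 1000        | Medium        | 20          | 10s          |
| runner-compute  | 100         | Low           | 10          | 5s           |
| runner-metal    | 10          | Very low      | 5           | 5s           |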

With a single shared batch_size=50 and batch_window=10s:

  • High-volume flavors (runner-large): appropriate — accumulates enough messages per window to fully utilise the batch
  • Low-volume flavors (runner-metal, max=10): the Lambda waits the full 10s every time even when only 1–2 messages are in the queue, adding unnecessary latency. A batch_size=50 is larger than the flavor's entire runners_maximum_count=10
  • SSM write pressure: batch_size directly drives the number of SSM PutParameter calls per Lambda invocation. Over-sized batches on low-volume flavors waste SSM throughput headroom

Furthermore, the GitHub API secondary rate limit (80 content-generating requests/minute, 500/hour) is shared across all flavors via the same GitHub App installation. Batch size and window directly control the rate of registration token API calls:

token_calls_per_minute = concurrency × (60s / batch_window)

batch_window=10s → 6 cycles/minute per flavor
batch_window=30s → 2 cycles/minute per flavor ← lower-volume flavors can afford this

Being able to set a longer batch_window on low-volume flavors reduces their contribution to the shared GitHub API rate limit budget, freeing more headroom for high-volume flavors.
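To make the arithmetic concrete, here is a small standalone Terraform sketch that evaluates the formula above for two flavors. It is not part of the module: the local names are hypothetical, the concurrency values (10 and 1) and the 30s window for runner-metal are illustrative assumptions, and it takes the formula at face value as an upper bound.

```hcl
# Hypothetical, standalone sketch: not part of the multi-runner module.
# All values below are illustrative assumptions.
locals {
  flavors = {
    "runner-large" = { concurrency = 10, batch_window = 10 }
    "runner-metal" = { concurrency = 1, batch_window = 30 }
  }

  # token_calls_per_minute = concurrency * (60s / batch_window)
  # batch_window must be greater than 0 here; a window of 0 has no fixed cycle length.
  token_calls_per_minute = {
    for name, f in local.flavors :
    name => f.concurrency * (60 / f.batch_window)
  }
}

output "token_calls_per_minute" {
  value = local.token_calls_per_minute
}
```

Under these assumptions runner-large contributes up to 60 calls per minute and runner-metal up to 2, so lengthening the window on the low-volume flavor leaves nearly all of the shared 80/minute budget to the high-volume one.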

Requested Change

Add lambda_event_source_mapping_batch_size and lambda_event_source_mapping_maximum_batching_window_in_seconds as optional fields inside multi_runner_config.runner_config, falling back to the existing module-level variables when not set. This matches the existing pattern used by scale_up_reserved_concurrent_executions (already per-flavor at variables.tf line 114 and runners.tf line 87).

Proposed Implementation

  1. modules/multi-runner/variables.tf: add optional fields to the multi_runner_config object type

  # Inside the runner_config object (alongside scale_up_reserved_concurrent_executions):
  scale_up_reserved_concurrent_executions                        = optional(number, 1)
  lambda_event_source_mapping_batch_size                         = optional(number, null)
  lambda_event_source_mapping_maximum_batching_window_in_seconds = optional(number, null)

  And in the description block:

  lambda_event_source_mapping_batch_size: "(Optional) Maximum number of records per Lambda invocation for this runner flavor. Overrides the module-level lambda_event_source_mapping_batch_size when set."
  lambda_event_source_mapping_maximum_batching_window_in_seconds: "(Optional) Maximum seconds to gather records before invoking Lambda for this runner flavor. Overrides the module-level lambda_event_source_mapping_maximum_batching_window_in_seconds when set."

  2. modules/multi-runner/runners.tf: use the per-flavor value with a module-level fallback

  lambda_event_source_mapping_batch_size = coalesce(
    each.value.runner_config.lambda_event_source_mapping_batch_size,
    var.lambda_event_source_mapping_batch_size
  )
  lambda_event_source_mapping_maximum_batching_window_in_seconds = coalesce(
    each.value.runner_config.lambda_event_source_mapping_maximum_batching_window_in_seconds,
    var.lambda_event_source_mapping_maximum_batching_window_in_seconds
  )
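The fallback behaviour can be tried in isolation with a minimal sketch like the one below. It assumes Terraform 1.3 or later (for optional() in object types). The variable names default_batch_size and flavors are shortened, hypothetical stand-ins for illustration, not the module's actual variables.

```hcl
# Hypothetical, standalone sketch: names are illustrative, not the module's own.
# Requires Terraform >= 1.3 for optional() object attributes.
variable "default_batch_size" {
  type    = number
  default = 10
}

variable "flavors" {
  type = map(object({
    batch_size = optional(number) # null when the flavor does not override
  }))
  default = {
    "runner-large" = { batch_size = 50 }
    "runner-metal" = {}
  }
}

locals {
  # coalesce() returns the first argument that is not null (or an empty string),
  # so an unset per-flavor value falls through to the shared default.
  effective_batch_size = {
    for name, f in var.flavors :
    name => coalesce(f.batch_size, var.default_batch_size)
  }
}

output "effective_batch_size" {
  value = local.effective_batch_size
}
```

With these defaults, runner-large resolves to 50 and runner-metal to 10. Because coalesce() only skips null and empty strings, an explicit per-flavor value of 0 (for example, a zero batching window) would be kept rather than replaced by the module-level default.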

Usage Example (after fix)

  module "multi_runner" {
    source = "..."

    # Module-level defaults (apply to all flavors that don't override)
    lambda_event_source_mapping_batch_size                         = 10
    lambda_event_source_mapping_maximum_batching_window_in_seconds = 5

    multi_runner_config = {
      runner-large = {
        runner_config = {
          runners_maximum_count                                          = 1000
          scale_up_reserved_concurrent_executions                        = 10
          lambda_event_source_mapping_batch_size                         = 50 # override: high volume
          lambda_event_source_mapping_maximum_batching_window_in_seconds = 10 # override
          ...
        }
      }
      runner-metal = {
        runner_config = {
          runners_maximum_count                   = 10
          scale_up_reserved_concurrent_executions = 1
          # no override: uses module-level defaults (batch=10, window=5s)
          ...
        }
      }
    }
  }

Backwards Compatibility

Fully backwards compatible. The new fields are optional with null default. coalesce() falls through to the existing module-level variable when the per-flavor value is null. Existing configurations require no changes.

Related

  • scale_up_reserved_concurrent_executions already uses this per-flavor pattern (variables.tf line 114, runners.tf line 87) — this request extends that pattern to batch configuration.
