
workflows: chunk external-download matrix across 4 parallel invocations #282

Merged
igorpecovnik merged 4 commits into main from
matrix-chunk-via-reusable-workflow
Apr 15, 2026

@igorpecovnik
Member

Summary

infrastructure-download-external.yml's download job hit GitHub Actions' 256-entry strategy.matrix cap (259 observed). Rather than trimming legitimate arch × release combos or duplicating a 500-line job block, lean on the fact that the workflow is already a reusable workflow — call it N times from the parent with a CHUNK_INDEX / CHUNK_COUNT pair, and have the child filter its matrix to its slice.

Child (infrastructure-download-external.yml)

  • New inputs CHUNK_INDEX (default 0) and CHUNK_COUNT (default 1).
  • start job slices include[] using a modular index (entries where index % CHUNK_COUNT == CHUNK_INDEX), so slow packages don't cluster into one chunk.
  • assets-for-download artifact name gains -${CHUNK_INDEX} suffix so parallel uploads don't race.
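
As a rough illustration of the modular slicing described above, the sketch below filters a plain list of entries by position. It is not the workflow's actual code: the review walkthrough says the child slices the matrix include entries with jq, whereas this uses a hypothetical `slice_chunk` helper over shell arguments, and the `pkgN` names are made up.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the entries that belong to one chunk.
# Usage: slice_chunk CHUNK_INDEX CHUNK_COUNT entry...
slice_chunk() {
  local chunk_index=$1 chunk_count=$2
  shift 2
  local i=0 entry
  for entry in "$@"; do
    # Keep the entry when its position modulo CHUNK_COUNT matches this chunk.
    if [ $(( i % chunk_count )) -eq "$chunk_index" ]; then
      printf '%s\n' "$entry"
    fi
    i=$(( i + 1 ))
  done
}

# Chunk 1 of 4 over nine made-up entries keeps positions 1 and 5.
slice_chunk 1 4 pkg0 pkg1 pkg2 pkg3 pkg4 pkg5 pkg6 pkg7 pkg8
# prints:
# pkg1
# pkg5
```

With CHUNK_INDEX=0 and CHUNK_COUNT=1 (the defaults), every position satisfies the condition, so the whole list passes through unchanged.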

Parent (infrastructure-repository-update.yml)

  • Single external: call becomes a strategy.matrix over chunk_index: [0, 1, 2, 3], passing CHUNK_COUNT: 4.

Effect

| | before | after |
|---|---|---|
| strategy.matrix cap | hit at 259 entries | no, each of 4 chunks ≤ 256 |
| headroom (total matrix entries) | 256 | 1024 |
| max-parallel runners | 180 | 720 (4 × 180) |
| code duplication | none | none (single child, called N times) |

Scale past 1024 by bumping chunk_index: [0..N-1] and CHUNK_COUNT: N in lockstep. No block duplication, no file extraction, no matrix trimming.
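
The headroom arithmetic can be sketched as below, assuming the 256-entry cap quoted in this PR. `min_chunks` and `largest_chunk` are hypothetical helper names, not part of either workflow.

```shell
#!/usr/bin/env bash
# Per-invocation strategy.matrix cap, as stated in this PR.
MATRIX_CAP=256

# Smallest CHUNK_COUNT that keeps every chunk at or under the cap (ceiling division).
min_chunks() {
  echo $(( ($1 + MATRIX_CAP - 1) / MATRIX_CAP ))
}

# Size of the largest chunk when TOTAL entries are sliced modulo CHUNK_COUNT.
largest_chunk() {
  echo $(( ($1 + $2 - 1) / $2 ))
}

min_chunks 259        # prints 2
largest_chunk 259 4   # prints 65
```

By this arithmetic two chunks would already fit the observed 259 entries; choosing 4 leaves room to grow to 1024 before the chunk list needs another bump.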

Legacy / un-chunked callers that omit the new inputs get CHUNK_COUNT=1 and receive the entire matrix as before.

Test plan

  • Workflow loads without syntax errors (YAML validates locally)
  • Kick off infrastructure-repository-update.yml manually; confirm 4 external child runs appear, each with a subset of the matrix
  • Total packages processed across the 4 chunks equals the pre-change total (no drops, no duplicates)
  • Each chunk's clean step tears down its own assets-for-download-<N> artifact without interfering with siblings
  • The Copying: job (which needs: external) still runs once, after all 4 chunks complete — GitHub Actions waits for all matrix legs of the reusable-workflow call by default
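
One possible way to check the "no drops, no duplicates" item is to compare sorted entry lists. The sketch below assumes each chunk's matrix entries have been dumped to a text file, one name per line; that dump format, the file names, and the `verify_chunks` helper are all hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical check: the chunk lists together must equal the original list.
# Usage: verify_chunks ORIGINAL_LIST CHUNK_LIST...
verify_chunks() {
  local original=$1
  shift
  local expected actual
  # Plain sort (no -u) keeps duplicates, so a repeated entry shows up as a mismatch.
  expected=$(sort "$original")
  actual=$(cat "$@" | sort)
  if [ "$expected" = "$actual" ]; then
    echo "ok: chunks cover the matrix exactly once"
  else
    echo "mismatch: entries dropped or duplicated" >&2
    return 1
  fi
}

# Demo with made-up entries split across 4 chunk files.
tmp=$(mktemp -d)
printf '%s\n' a b c d e > "$tmp/all.txt"
printf '%s\n' a e > "$tmp/chunk-0.txt"
printf '%s\n' b > "$tmp/chunk-1.txt"
printf '%s\n' c > "$tmp/chunk-2.txt"
printf '%s\n' d > "$tmp/chunk-3.txt"
verify_chunks "$tmp/all.txt" "$tmp"/chunk-*.txt
rm -rf "$tmp"
```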

…ations

The `download` job in infrastructure-download-external.yml was hitting
GitHub Actions' 256-entry `strategy.matrix` cap (259 configurations
observed). Rather than trimming the matrix (which would drop legitimate
arch × release combos users need) or duplicating a 500-line job block,
lean on the fact that infrastructure-download-external.yml is *already*
a reusable workflow — call it N times from the parent
(infrastructure-repository-update.yml) with a `CHUNK_INDEX` /
`CHUNK_COUNT` pair, and have the child filter its own matrix to its
assigned slice.

Child (infrastructure-download-external.yml):
- Add CHUNK_INDEX (0..CHUNK_COUNT-1) and CHUNK_COUNT (default 1) inputs.
- In the `start` job, after building MATRIX_JSON, slice the include[]
  list so each invocation keeps only entries where
  `index % CHUNK_COUNT == CHUNK_INDEX`. Modular slicing (not contiguous
  ranges) avoids clustering slow package types into one chunk.
- Suffix the `assets-for-download` artifact name with CHUNK_INDEX so
  parallel uploads don't race against each other.

Parent (infrastructure-repository-update.yml):
- Turn the single `external:` reusable-workflow call into a
  `strategy.matrix` over `chunk_index: [0, 1, 2, 3]`, passing
  CHUNK_COUNT: 4 to the child.

Effect: 4 parallel invocations, each with its own 256-matrix cap and
its own `max-parallel=180`. Headroom 1024 total matrix entries,
effective concurrency 720. Scale past that by bumping `chunk_index`
list and CHUNK_COUNT in lockstep — no block duplication, no file
extraction, no matrix trimming.

Legacy/un-chunked callers that omit the chunk inputs get CHUNK_COUNT=1
and receive the entire matrix as before.
@coderabbitai
Contributor

coderabbitai Bot commented Apr 14, 2026

Warning

Rate limit exceeded

@igorpecovnik has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 55 minutes and 33 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 55 minutes and 33 seconds.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 5e86979b-e77f-4ee6-84ac-90f0d1eb0ac5

📥 Commits

Reviewing files that changed from the base of the PR and between c0c5e2e and 1dcd79b.

📒 Files selected for processing (1)
  • .github/workflows/infrastructure-download-external.yml

Walkthrough

The called workflow (.github/workflows/infrastructure-download-external.yml) now accepts CHUNK_INDEX and CHUNK_COUNT inputs, validates them, suffixes artifact names with the chunk index, and slices the matrix include entries into chunk groups using jq (falling back to a placeholder include when a slice is empty). The caller (.github/workflows/infrastructure-repository-update.yml) adds a strategy matrix chunk_index: [0,1,2,3], sets fail-fast: false, updates the job name to show the chunk, and passes CHUNK_INDEX/CHUNK_COUNT to the reusable workflow.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The PR title 'workflows: chunk external-download matrix across 4 parallel invocations' directly and clearly describes the main change: implementing chunked parallelization of the external-download workflow matrix across 4 concurrent invocations to work around GitHub Actions' 256-entry matrix cap.
Description check ✅ Passed The PR description is comprehensive and directly related to the changeset, explaining the problem (matrix cap exceeded), the solution (chunking via reusable workflow), specific changes to both files, expected effects, and a test plan.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.


@github-actions github-actions Bot added the labels size/medium (PR with more than 50 and less than 250 lines), 05 (Milestone: Second quarter release), GitHub Actions (GitHub Actions code), and Needs review (Seeking for review) on Apr 14, 2026
Revert to @main before merging. Added only so the reusable-workflow
reference resolves to the updated child on this branch during test
runs of PR #282.
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/infrastructure-download-external.yml:
- Around line 398-404: The placeholder entry with name="none" causes the
workflow to unconditionally source os/external/${{ matrix.name }}.conf later and
fail; change the job or the step that sources that file to skip when matrix.name
== "none" (e.g., add a job-level or step-level guard using the matrix value so
the job/step only runs if matrix.name != "none"), and keep the
MATRIX_JSON_COMPACTED placeholder logic intact so empty slices still produce a
no-op matrix entry.
- Around line 379-385: Validate that the inputs CHUNK_INDEX and CHUNK_COUNT are
non-negative integers before any numeric comparisons: check both against a regex
like ^[0-9]+$ and if either fails, emit an error (referencing
CHUNK_INDEX/CHUNK_COUNT) and exit 1; after that, enforce CHUNK_COUNT>=1
(hard-fail if not) and then verify CHUNK_INDEX < CHUNK_COUNT as currently done.
Use explicit error messages for invalid format vs out-of-range to aid debugging.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 2c033d5f-34dd-4ac4-b5e1-c3c81ec60daa

📥 Commits

Reviewing files that changed from the base of the PR and between 4c14439 and fa1021e.

📒 Files selected for processing (2)
  • .github/workflows/infrastructure-download-external.yml
  • .github/workflows/infrastructure-repository-update.yml

Comment thread .github/workflows/infrastructure-download-external.yml
Comment thread .github/workflows/infrastructure-download-external.yml Outdated
workflow_call's `type: number` is enforced at the YAML boundary
only; direct API or templated callers can send non-numeric strings
that end up in the env. Bash's arithmetic context silently treats
non-numeric as 0, so "abc" passed `-lt 1`, hit the silent reset
to 1, and the chunk slice ran with a quietly-wrong CHUNK_COUNT.

Add explicit guards before any numeric comparison:

1. Regex `^[0-9]+$` on both CHUNK_INDEX and CHUNK_COUNT — fail
   with "is not a non-negative integer" naming the bad field.
2. Hard-fail CHUNK_COUNT < 1 (was: silent reset to 1) — masking
   caller bugs is worse than failing loudly.
3. Existing CHUNK_INDEX >= CHUNK_COUNT range check unchanged.

Each failure mode emits a distinct error message so a misconfigured
caller can tell a format error from a range error at a glance.
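
A minimal sketch of the three guards listed above, not the workflow's actual step. The helper names `is_uint` and `validate_chunk_inputs` and the exact message wording are assumptions; a `case` glob stands in for the `^[0-9]+$` regex and accepts the same strings.

```shell
#!/usr/bin/env bash
# True only for a non-empty string of decimal digits (same set as ^[0-9]+$).
is_uint() {
  case $1 in
    '' | *[!0-9]*) return 1 ;;
    *) return 0 ;;
  esac
}

# Hypothetical helper. Usage: validate_chunk_inputs CHUNK_INDEX CHUNK_COUNT
validate_chunk_inputs() {
  local chunk_index=$1 chunk_count=$2
  # 1. Format check on both fields, naming the bad one.
  if ! is_uint "$chunk_index"; then
    echo "ERROR: CHUNK_INDEX '$chunk_index' is not a non-negative integer" >&2
    return 1
  fi
  if ! is_uint "$chunk_count"; then
    echo "ERROR: CHUNK_COUNT '$chunk_count' is not a non-negative integer" >&2
    return 1
  fi
  # 2. Hard-fail on CHUNK_COUNT < 1 instead of silently resetting it to 1.
  if [ "$chunk_count" -lt 1 ]; then
    echo "ERROR: CHUNK_COUNT must be >= 1 (got $chunk_count)" >&2
    return 2
  fi
  # 3. Range check: CHUNK_INDEX must be below CHUNK_COUNT.
  if [ "$chunk_index" -ge "$chunk_count" ]; then
    echo "ERROR: CHUNK_INDEX $chunk_index is out of range for CHUNK_COUNT $chunk_count" >&2
    return 2
  fi
}

# Fall back to the documented defaults (0 and 1) when the inputs are unset.
validate_chunk_inputs "${CHUNK_INDEX:-0}" "${CHUNK_COUNT:-1}"
```

The distinct return codes (1 for format, 2 for range) are an extra assumption here; the commit only promises distinct messages.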
The {name:none,...} placeholder existed to keep strategy.matrix
non-empty when there's no work, with a comment promising the
downstream job would 'skip entries with name=none'. The skip was
never implemented — every step would try to source
os/external/${{ matrix.name }}.conf, which fails on
os/external/none.conf because no such file exists.

Job-level `if: matrix.name != 'none'` doesn't work either:
matrix.* isn't available at job-level if-evaluation (it's expanded
after).

Fix: thread a `has_work` boolean output from the start job and gate
the download job on it via a job-level `if:` (needs.* outputs ARE
available there). Set has_work='false' in both placeholder paths
(no matrix entries at all, OR this chunk's slice happens to be
empty). The placeholder still ships to keep strategy.matrix valid,
but the job is skipped before the matrix expands so no runner is
allocated and no source attempt is made.
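
The `has_work` output could be produced along these lines. This is a sketch under assumptions: `has_work_line` is a hypothetical helper, the slice length is passed in as a plain number, and the gating expression in the comment is a guess at how the job-level `if:` might read.

```shell
#!/usr/bin/env bash
# Hypothetical helper: turn a slice length into the step-output line.
# Usage: has_work_line SLICE_LENGTH
has_work_line() {
  if [ "$1" -gt 0 ]; then
    echo "has_work=true"
  else
    echo "has_work=false"
  fi
}

# In a workflow step this line is appended to the file named by $GITHUB_OUTPUT;
# outside Actions the sketch falls back to stdout.
has_work_line 0 >> "${GITHUB_OUTPUT:-/dev/stdout}"

# The download job could then be gated with something like (assumed expression):
#   if: needs.start.outputs.has_work == 'true'
```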
@igorpecovnik igorpecovnik merged commit 570a20c into main Apr 15, 2026
228 of 281 checks passed
igorpecovnik added a commit that referenced this pull request Apr 15, 2026
Revert to @main before merging. Added only so the reusable-workflow
reference resolves to the updated child on this branch during test
runs of PR #282.
@igorpecovnik igorpecovnik deleted the matrix-chunk-via-reusable-workflow branch April 15, 2026 17:40