ci: migrate coverage gating from Codecov to GitHub Actions #8111
yurishkuro merged 10 commits into jaegertracing:main
Signed-off-by: Yuri Shkuro <github@ysh.us>
Pull request overview
Refactors the CI Summary Report fan-in workflow to make metrics comparison more reliable (by distinguishing infra failures from “no diffs”) and to align GitHub Actions coverage gating with Codecov’s exclusion rules.
Changes:
- Always upload a diff artifact for each metrics snapshot on PRs (empty stub when no baseline), and add fan-in checks for missing diff artifacts.
- Add a coverage-profile filter that applies `.codecov.yml` `ignore:` patterns before computing merged coverage.
- Update CI Summary Report workflow outputs/links and ADR documentation to reflect the new behavior.
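The coverage-profile filter mentioned above can be sketched roughly as follows. This is not the PR's `filter_coverage.py`: the function name, the sample paths, and the choice to treat each ignore pattern as a regular expression are assumptions made for illustration, and the patterns are passed in directly rather than parsed from `.codecov.yml`.

```python
import re


def filter_profile(lines, ignore_patterns):
    """Drop Go coverage entries whose file path matches any ignore pattern.

    Assumption: patterns are regular expressions matched anywhere in the path.
    """
    compiled = [re.compile(p) for p in ignore_patterns]
    kept = []
    for line in lines:
        if line.startswith("mode:"):
            kept.append(line)  # the profile header is always preserved
            continue
        # Entry format: <import/path/file.go>:<start>,<end> <stmts> <count>
        path = line.split(":", 1)[0]
        if any(rx.search(path) for rx in compiled):
            continue  # excluded file, skip its coverage block
        kept.append(line)
    return kept


# Hypothetical profile and patterns, for illustration only.
profile = [
    "mode: atomic",
    "example.com/app/internal/store/store.go:10.2,12.3 2 1",
    "example.com/app/proto-gen/api/model.pb.go:5.1,9.2 4 0",
    "example.com/app/internal/store/mocks/Reader.go:7.1,8.2 1 0",
]
patterns = [r"\.pb\.go$", r"/mocks/"]
filtered = filter_profile(profile, patterns)
```

Filtering the merged profile before computing the total is what keeps generated code and mocks from dragging the percentage down.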
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| `scripts/e2e/metrics_summary.sh` | Adds infra detection for missing diff artifacts and skips empty diff stubs when generating summaries. |
| `scripts/e2e/filter_coverage.py` | New helper to filter Go coverage profiles using `.codecov.yml` exclusions. |
| `docs/adr/004-migrating-coverage-gating-to-github-actions.md` | Documents coverage filtering + rationale for aligning with Codecov. |
| `.github/workflows/ci-summary-report.yml` | Wires in coverage filtering and updates summary/check-run links; runs only when upstream CI succeeded. |
| `.github/actions/verify-metrics-snapshot/action.yaml` | Ensures diff artifacts are always uploaded on PRs via an empty stub file. |
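The infra detection for missing diff artifacts amounts to a 1-to-1 presence check between `metrics_snapshot_*` and `diff_metrics_snapshot_*` artifact names. A minimal Python sketch of that check follows; the real logic lives in the shell script, and the function name and sample artifact names here are hypothetical.

```python
def find_missing_diffs(artifact_names):
    """Return snapshot artifacts that have no matching diff_<name> artifact."""
    names = set(artifact_names)
    snapshots = sorted(n for n in names if n.startswith("metrics_snapshot_"))
    # A snapshot without its diff artifact means the verify action never ran.
    return [s for s in snapshots if f"diff_{s}" not in names]


# Hypothetical artifact listing for one CI run.
artifacts = [
    "metrics_snapshot_kafka",
    "diff_metrics_snapshot_kafka",
    "metrics_snapshot_badger",
    "coverage-unit",
]
missing = find_missing_diffs(artifacts)
```

Because an empty stub is always uploaded on PRs, an absent diff artifact can be read as an infrastructure failure rather than as "no changes".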
Comments suppressed due to low confidence (2)
scripts/e2e/metrics_summary.sh:63
`snapshot_name` is derived from the diff file basename (`diff_<inputs.snapshot>.txt`), but diff artifacts are keyed by `artifact_key` (often includes matrix dimensions). For matrix jobs this produces repeated/ambiguous headings like `summary_metrics_snapshot_cassandra` for multiple different runs and makes the combined summary hard to interpret. Consider deriving the snapshot identifier from the parent artifact directory name (e.g., `diff_<artifact_key>`), or ensure the diff filename includes `artifact_key` so each summary section is uniquely attributable.
# Extract the base name (e.g., diff_metrics_snapshot_cassandra.txt -> metrics_snapshot_cassandra)
base_name=$(basename "$diff_file" .txt)
snapshot_name=${base_name#diff_}
dir=$(dirname "$diff_file")
# Generate summary for this diff
summary_file="$dir/summary_$snapshot_name.md"
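The first suggestion in this comment, deriving the identifier from the parent artifact directory, could look something like the sketch below. It is written in Python for illustration (the script under review is shell), and it assumes each artifact is downloaded into a directory named after the artifact; the function name and example paths are hypothetical.

```python
from pathlib import PurePosixPath


def snapshot_id(diff_file):
    """Prefer the parent directory name (diff_<artifact_key>) over the file name."""
    path = PurePosixPath(diff_file)
    parent = path.parent.name
    if parent.startswith("diff_"):
        return parent[len("diff_"):]
    # Fall back to the current behaviour: strip "diff_" from the file stem.
    stem = path.stem
    return stem[len("diff_"):] if stem.startswith("diff_") else stem


# Hypothetical layout: two matrix runs that share the same diff file name.
a = snapshot_id(".artifacts/diff_metrics_snapshot_cassandra_4.x/diff_metrics_snapshot_cassandra.txt")
b = snapshot_id(".artifacts/diff_metrics_snapshot_cassandra_5.x/diff_metrics_snapshot_cassandra.txt")
```

With this approach the two runs produce distinct identifiers even though the file inside each artifact has the same name.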
.github/actions/verify-metrics-snapshot/action.yaml:81
- The diff artifact is named using `diff_${{ inputs.artifact_key }}`, but the file created/overwritten/uploaded is always `diff_${{ inputs.snapshot }}.txt`. For snapshots where `artifact_key` includes matrix dimensions, this loses that context and makes fan-in reporting ambiguous (all artifacts contain a `diff_metrics_snapshot_<name>.txt`). Consider naming the diff file using `artifact_key` as well (and updating the compare output path + upload path accordingly), so downstream summary can reliably attribute diffs to the correct matrix run.
    - name: Create diff file stub
      if: github.ref_name != 'main'
      shell: bash
      run: touch ./.metrics/diff_${{ inputs.snapshot }}.txt
    - name: Calculate diff between the snapshots
      id: compare-snapshots
      if: ${{ (github.ref_name != 'main') && (steps.download-release-snapshot.outputs.cache-matched-key != '') }}
      continue-on-error: true
      shell: bash
      run: |
        python3 -m pip install prometheus-client
        if python3 ./scripts/e2e/compare_metrics.py --file1 ./.metrics/${{ inputs.snapshot }}.txt --file2 ./.metrics/baseline_${{ inputs.snapshot }}.txt --output ./.metrics/diff_${{ inputs.snapshot }}.txt; then
          echo "No differences found in metrics"
        else
          echo "🛑 Differences found in metrics"
          echo "has_diff=true" >> $GITHUB_OUTPUT
        fi
    # Always upload the diff artifact on PRs (even when empty / no baseline yet).
    # Presence of this artifact in the fan-in proves this action ran for the snapshot.
    - name: Upload the diff artifact
      if: github.ref_name != 'main'
      uses: actions/upload-artifact@b7c566a772e6b6bfb58ed0dc250532a479d7789f # v6.0.0
      with:
        name: diff_${{ inputs.artifact_key }}
        path: ./.metrics/diff_${{ inputs.snapshot }}.txt
        retention-days: 7
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## main #8111 +/- ##
==========================================
- Coverage 95.69% 95.68% -0.02%
==========================================
Files 317 317
Lines 16734 16734
==========================================
- Hits 16014 16012 -2
- Misses 568 569 +1
- Partials 152 153 +1
Metrics Comparison Summary
Total changes across all snapshots: 32

Detailed changes per snapshot

summary_metrics_snapshot_elasticsearch
📊 Metrics Diff Summary
Total Changes: 0

summary_metrics_snapshot_opensearch
📊 Metrics Diff Summary
Total Changes: 0

summary_metrics_snapshot_clickhouse
📊 Metrics Diff Summary
Total Changes: 0

summary_metrics_snapshot_grpc
📊 Metrics Diff Summary
Total Changes: 0

summary_metrics_snapshot_kafka
📊 Metrics Diff Summary
Total Changes: 0

summary_metrics_snapshot_memory
📊 Metrics Diff Summary
Total Changes: 0

summary_metrics_snapshot_badger
📊 Metrics Diff Summary
Total Changes: 32

❌ Removed Metrics
-jaeger_storage_badger_compaction_current_num_lsm{name="another_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
-jaeger_storage_badger_compaction_current_num_lsm{name="some_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
-jaeger_storage_badger_get_num_memtable{name="another_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
-jaeger_storage_badger_get_num_memtable{name="some_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
-jaeger_storage_badger_get_num_user{name="another_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
-jaeger_storage_badger_get_num_user{name="some_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
-jaeger_storage_badger_get_with_result_num_user{name="another_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
-jaeger_storage_badger_get_with_result_num_user{name="some_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
-jaeger_storage_badger_iterator_num_user{name="another_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
-jaeger_storage_badger_iterator_num_user{name="some_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
-jaeger_storage_badger_put_num_user{name="another_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
-jaeger_storage_badger_put_num_user{name="some_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
-jaeger_storage_badger_read_bytes_lsm{name="another_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
-jaeger_storage_badger_read_bytes_lsm{name="some_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
-jaeger_storage_badger_read_bytes_vlog{name="another_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
-jaeger_storage_badger_read_bytes_vlog{name="some_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
-jaeger_storage_badger_read_num_vlog{name="another_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
-jaeger_storage_badger_read_num_vlog{name="some_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
-jaeger_storage_badger_size_bytes_lsm{name="another_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
-jaeger_storage_badger_size_bytes_lsm{name="some_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
-jaeger_storage_badger_size_bytes_vlog{name="another_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
-jaeger_storage_badger_size_bytes_vlog{name="some_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
-jaeger_storage_badger_write_bytes_l0{name="another_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
-jaeger_storage_badger_write_bytes_l0{name="some_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
-jaeger_storage_badger_write_bytes_user{name="another_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
-jaeger_storage_badger_write_bytes_user{name="some_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
-jaeger_storage_badger_write_bytes_vlog{name="another_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
-jaeger_storage_badger_write_bytes_vlog{name="some_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
-jaeger_storage_badger_write_num_vlog{name="another_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
-jaeger_storage_badger_write_num_vlog{name="some_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
-jaeger_storage_badger_write_pending_num_memtable{name="another_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}
-jaeger_storage_badger_write_pending_num_memtable{name="some_store",otel_scope_name="jaeger-v2",otel_scope_schema_url="",otel_scope_version="",role="tracestore"}

Code Coverage
coverage 46.4% is below required minimum 95.0%
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
Comments suppressed due to low confidence (2)
.github/actions/verify-metrics-snapshot/action.yaml:81
- The new diff stub + unconditional diff artifact upload can mask failures in `compare_metrics.py`. If a baseline exists but `compare_metrics.py` crashes/throws, the stub remains empty, uploads successfully, and the fan-in will skip it as an "empty diff" (treating it as no changes). Consider making `compare_metrics.py` return a distinct non-0/1 exit code on unexpected errors (or write an explicit error marker into the diff file) and have this composite action fail in that case so the fan-in can surface an infra error.
    - name: Create diff file stub
      if: github.ref_name != 'main'
      shell: bash
      run: touch ./.metrics/diff_${{ inputs.snapshot }}.txt
    - name: Calculate diff between the snapshots
      id: compare-snapshots
      if: ${{ (github.ref_name != 'main') && (steps.download-release-snapshot.outputs.cache-matched-key != '') }}
      continue-on-error: true
      shell: bash
      run: |
        python3 -m pip install prometheus-client
        if python3 ./scripts/e2e/compare_metrics.py --file1 ./.metrics/${{ inputs.snapshot }}.txt --file2 ./.metrics/baseline_${{ inputs.snapshot }}.txt --output ./.metrics/diff_${{ inputs.snapshot }}.txt; then
          echo "No differences found in metrics"
        else
          echo "🛑 Differences found in metrics"
          echo "has_diff=true" >> $GITHUB_OUTPUT
        fi
    # Always upload the diff artifact on PRs (even when empty / no baseline yet).
    # Presence of this artifact in the fan-in proves this action ran for the snapshot.
    - name: Upload the diff artifact
      if: github.ref_name != 'main'
      uses: actions/upload-artifact@b7c566a772e6b6bfb58ed0dc250532a479d7789f # v6.0.0
      with:
        name: diff_${{ inputs.artifact_key }}
        path: ./.metrics/diff_${{ inputs.snapshot }}.txt
        retention-days: 7
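The distinct-exit-code idea from this comment could be sketched as below. This is not the actual `compare_metrics.py`: the constant names, the value `2` for unexpected errors, and the `run`/`compare` helpers are hypothetical, and the comparison itself is reduced to a set difference over metric lines.

```python
import sys

EXIT_NO_DIFF = 0  # snapshots match
EXIT_DIFF = 1     # snapshots differ (a legitimate result, not a failure)
EXIT_ERROR = 2    # hypothetical code for crashes, unreadable files, etc.


def compare(current, baseline):
    """Return metric lines present in exactly one of the two snapshots."""
    return sorted(set(current) ^ set(baseline))


def run(load_current, load_baseline):
    """Map the comparison outcome to one of three exit codes."""
    try:
        diff = compare(load_current(), load_baseline())
    except Exception as exc:  # anything unexpected is an infra error
        print(f"compare failed: {exc}", file=sys.stderr)
        return EXIT_ERROR
    return EXIT_DIFF if diff else EXIT_NO_DIFF


def broken_loader():
    raise OSError("baseline file is unreadable")


same = run(lambda: ["metric_a"], lambda: ["metric_a"])
changed = run(lambda: ["metric_a"], lambda: ["metric_b"])
crashed = run(broken_loader, lambda: [])
```

With three outcomes, the calling step can treat `1` as "has diff" and fail on anything else non-zero, so a crash no longer looks like an empty diff.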
.github/workflows/ci-summary-report.yml:84
`actions.listWorkflowRunArtifacts` is called without pagination/`per_page`. GitHub's API defaults to 30 artifacts per page; this repo's CI run can easily exceed that (metrics snapshots + diffs + coverage artifacts), so the fan-in will silently miss artifacts and produce incorrect metrics/coverage gating. Update the github-script step to request `per_page: 100` and paginate until all artifacts are fetched (or follow `artifacts.data.total_count`).
    // List all artifacts from the target workflow run
    const artifacts = await github.rest.actions.listWorkflowRunArtifacts({
      owner,
      repo,
      run_id: workflowRunId,
    });
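The pagination loop this comment asks for is sketched below in Python for illustration only; the workflow itself uses github-script, where Octokit's paginate helper would probably be the more usual route. `fetch_page` is a hypothetical callable standing in for the API request, assumed to return the `total_count` and `artifacts` fields of the list-artifacts response.

```python
def list_all_artifacts(fetch_page, per_page=100):
    """Collect artifacts across pages until total_count is reached."""
    artifacts = []
    page = 1
    while True:
        data = fetch_page(page=page, per_page=per_page)
        batch = data["artifacts"]
        artifacts.extend(batch)
        # Stop on an empty page or once everything reported has been fetched.
        if not batch or len(artifacts) >= data["total_count"]:
            return artifacts
        page += 1


def fake_fetch(page, per_page):
    """Stand-in for the API: serves 250 artifacts in pages."""
    all_items = [{"name": f"artifact_{i}"} for i in range(250)]
    start = (page - 1) * per_page
    return {"total_count": len(all_items), "artifacts": all_items[start:start + per_page]}


collected = list_all_artifacts(fake_fetch)
```

The empty-page check guards against looping forever if `total_count` and the returned pages ever disagree.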
Pull request overview
Copilot reviewed 6 out of 6 changed files in this pull request and generated 6 comments.
Pull request overview
Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.
Problem
Codecov's PR status check suffers from two reliability issues:
Coverage gating now runs entirely within GitHub Actions, producing a `Coverage Gate` check run at the same time as the existing `Metrics Comparison` check run.

Codecov uploads are retained for long-term historical trending and per-flag breakdown views; only the gating responsibility is moved.
Design
Full design rationale: `docs/adr/004-migrating-coverage-gating-to-github-actions.md`

Fan-in pattern: The new `ci-summary-report.yml` workflow (renamed from `ci-compare-metrics.yml`) triggers on `"CI Orchestrator"` completion via `workflow_run`. This fires only after all three stages (lint, unit tests, E2E) finish, ensuring all `coverage-*` artifacts are available. It also grants write permissions needed to post PR comments from fork PRs.

Single job, two flows: The same `summary-report` job handles both PR analysis and main-branch baseline saves (the latter via `actions/cache`).

Coverage policy: Two independent gates are applied to the filtered merged profile, including a comparison against the `main` baseline.

The merged profile is filtered using the `ignore:` patterns read at runtime from `.codecov.yml` (generated protobuf files, mocks, integration test infrastructure). Without this filtering, `go tool cover -func` on merged profiles yields ~42% because it counts all instrumented packages including generated code; after filtering the number is ~95.6%, consistent with Codecov's project-level figure.

Infrastructure validation: `verify-metrics-snapshot` now always uploads a diff artifact on PRs (an empty stub when there are no metric changes, actual diffs when there are). The fan-in performs a 1-to-1 check: for every `metrics_snapshot_*` artifact there must be a corresponding `diff_metrics_snapshot_*` artifact. A missing diff means the action never ran, which is an infra failure and is reported as a separate error distinct from metric regressions.

Changes
New files
- `scripts/e2e/filter_coverage.py`: reads the `ignore:` list from `.codecov.yml` and filters a Go coverage profile, keeping both tools in sync from a single source of truth.
- `docs/adr/004-migrating-coverage-gating-to-github-actions.md`: ADR documenting the design.
- `.github/workflows/ci-summary-report.yml`: fan-in workflow replacing `ci-compare-metrics.yml`; handles metrics comparison, coverage gating, PR comment, and check run creation.

Modified: `upload-codecov` action
- Renames the `flags` input to `flag` (singular; all callers pass exactly one value).
- Uploads a `coverage-<flag>` artifact before the Codecov retry step, so the artifact is preserved even if Codecov rate-limits.
- Sets `fail_ci_if_error: false` on the Codecov upload (artifact is already saved; Codecov failure is non-fatal).

Modified: `verify-metrics-snapshot` action
- The diff artifact is now uploaded on every PR run (condition changed from `has_diff == 'true'` to `github.ref_name != 'main'`), enabling 1-to-1 presence checking in the fan-in.

Modified: `scripts/e2e/metrics_summary.sh`
- Emits an `INFRA_ERRORS` output if any `metrics_snapshot_*` dir lacks a corresponding `diff_*` dir.

Modified: 11 E2E caller workflows + `ci-unit-tests.yml`
- `flags:` → `flag:` in every `upload-codecov` call site.

Modified: `internal/tools`
- `github.com/wadey/gocovmerge` blank import in `tools.go` and `install-coverage-tools` Make target in `Tools.mk`.

Test plan
- `Metrics Comparison` shows ✅ no changes, `Coverage Gate` shows ✅ coverage ~95%
- `main`: verify the coverage baseline is saved to `actions/cache`
- `INFRA_ERRORS` fires and the Metrics Comparison check shows failure

AI Usage in this PR (choose one)
See AI Usage Policy.