
fix(api): reaggregate resource inventory and attack surface after muting findings #10843

Merged
AdriiiPRodri merged 8 commits into master from fix/api-resource-inventory-reaggregate-on-mute
Apr 27, 2026

Conversation

@AdriiiPRodri
Contributor

Context

Follow-up to #10827, which made Overview and Finding Groups reflect newly-muted findings by extending reaggregate_all_finding_group_summaries_task. While validating that fix we saw that several other endpoints backed by their own pre-aggregated tables were still stale after muting:

  • /overviews/resource-groups (resource inventory widget)
  • /overviews/categories
  • /overviews/attack-surfaces
  • /overviews/findings-severity/timeseries (severity timeline)

Both the reaggregation dispatch and the underlying aggregators were affected.

Description

Two-part fix:

  1. Dispatch: the post-mute reaggregate-all-finding-group-summaries task now also re-runs, for each latest scan per (provider, day):

    • backfill_scan_resource_group_summaries_task -> ScanGroupSummary
    • backfill_scan_category_summaries_task -> ScanCategorySummary
    • aggregate_attack_surface_task -> AttackSurfaceOverview

    Services watchlist and severity timeseries were already covered, since they read from ScanSummary and DailySeveritySummary which the pipeline was recomputing.

  2. Idempotency: four aggregators used plain bulk_create (or an "already backfilled" short-circuit plus a silent ignore_conflicts=True). The first post-mute run either silently did nothing or tripped the unique_* constraints, aborting the Celery chain and leaving the dependent aggregators unexecuted (observable as the PostgreSQL error duplicate key value violates unique constraint "unique_scan_summary" during the first reaggregation attempt). They now delete the scan's existing rows before bulk_create, so the write is atomic and re-runnable and dropped combinations no longer linger in the summary table:

    • aggregate_findings (scan.py)
    • aggregate_attack_surface (scan.py)
    • backfill_scan_resource_group_summaries (backfill.py)
    • backfill_scan_category_summaries (backfill.py)
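
The idempotency change in part 2 can be illustrated outside Django. The following is a minimal sketch, not the code in scan.py or backfill.py: it uses the standard-library sqlite3 module as a stand-in for PostgreSQL, and the table, column, and function names (scan_summary, naive_write, rerunnable_write) are hypothetical.

```python
import sqlite3

# In-memory stand-in for a pre-aggregated table with a per-scan unique constraint.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE scan_summary (
        scan_id TEXT NOT NULL,
        service TEXT NOT NULL,
        fail    INTEGER NOT NULL,
        muted   INTEGER NOT NULL,
        UNIQUE (scan_id, service)
    )
    """
)

INSERT_SQL = "INSERT INTO scan_summary (scan_id, service, fail, muted) VALUES (?, ?, ?, ?)"


def naive_write(scan_id, rows):
    """Plain INSERT: a second run for the same scan violates the unique constraint."""
    with conn:  # commits on success, rolls back on exception
        conn.executemany(INSERT_SQL, [(scan_id, *row) for row in rows])


def rerunnable_write(scan_id, rows):
    """Delete the scan's existing rows, then insert, inside one transaction."""
    with conn:
        conn.execute("DELETE FROM scan_summary WHERE scan_id = ?", (scan_id,))
        conn.executemany(INSERT_SQL, [(scan_id, *row) for row in rows])


naive_write("scan-1", [("s3", 3, 0), ("iam", 2, 0)])
try:
    naive_write("scan-1", [("s3", 1, 2)])  # re-run after muting
except sqlite3.IntegrityError as exc:
    print("re-run failed:", exc)

rerunnable_write("scan-1", [("s3", 1, 2)])
# The stale "iam" row is gone and "s3" carries the post-mute counts.
print(conn.execute("SELECT service, fail, muted FROM scan_summary").fetchall())
# → [('s3', 1, 2)]
```

Because the delete and the insert share a transaction, a failure midway leaves the previous rows in place rather than an empty summary.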

Other pre-aggregated tables (ResourceScanSummary, ComplianceOverviewSummary, ProviderComplianceScore) do not depend on Finding.muted and are intentionally left out of the reaggregation fan-out.

Steps to review

  1. Run a scan that produces FAIL findings with resource groups (e.g. an AWS provider that yields S3/IAM FAILs so several resource_groups values are populated).
  2. Hit /overviews/resource-groups, /overviews/categories, /overviews/attack-surfaces and /overviews/findings-severity/timeseries. Note the totals.
  3. POST /mute-rules on a subset of those finding IDs.
  4. Wait for the Celery overview queue to drain (docker compose logs worker | grep -E "reaggregate|scan-(summary|daily-severity|finding-group-summaries|resource-group-summaries|category-summaries|attack-surface)").
  5. Hit the same overview endpoints again. Muted counts should go up, non-muted counts should go down by the same amount on every one of them.
  6. Confirm no duplicate key value violates unique constraint errors appear in the worker log during step 4.
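
The invariant in step 5 can be expressed as a small check. This is a sketch under stated assumptions: the Finding dataclass and the summarize and mute helpers are hypothetical and only model the counting, not the API's serializers or models.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Finding:
    id: str
    resource_group: str
    status: str  # "FAIL" or "PASS"
    muted: bool = False


def summarize(findings):
    """Return (non-muted FAIL counts, muted counts) keyed by resource group."""
    fail, muted = Counter(), Counter()
    for finding in findings:
        if finding.muted:
            muted[finding.resource_group] += 1
        elif finding.status == "FAIL":
            fail[finding.resource_group] += 1
    return fail, muted


def mute(findings, ids):
    """Mark the findings whose id is in `ids` as muted."""
    for finding in findings:
        if finding.id in ids:
            finding.muted = True


findings = [
    Finding("f1", "storage", "FAIL"),
    Finding("f2", "storage", "FAIL"),
    Finding("f3", "iam", "FAIL"),
    Finding("f4", "iam", "PASS"),
]

fail_before, muted_before = summarize(findings)
mute(findings, {"f1", "f3"})  # mute a subset of the FAIL findings
fail_after, muted_after = summarize(findings)

# For every group, the drop in non-muted FAILs equals the rise in muted findings.
for group in fail_before:
    dropped = fail_before[group] - fail_after[group]
    gained = muted_after[group] - muted_before[group]
    assert dropped == gained, group
```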

Automated:

cd api && poetry run pytest \
  src/backend/tasks/tests/test_backfill.py::TestBackfillScanCategorySummaries \
  src/backend/tasks/tests/test_backfill.py::TestBackfillScanGroupSummaries \
  src/backend/tasks/tests/test_tasks.py::TestReaggregateAllFindingGroupSummaries \
  src/backend/tasks/tests/test_scan.py::TestAggregateFindings \
  src/backend/tasks/tests/test_scan.py::TestAggregateAttackSurface \
  -x

Checklist

Community Checklist
  • This feature/issue is listed in here or roadmap.prowler.com
  • Is it assigned to me, if not, request it via the issue/feature in here or Prowler Community Slack

API

  • All issue/task requirements work as expected on the API
  • Endpoint response output (if applicable)
  • EXPLAIN ANALYZE output for new/modified queries or indexes (if applicable)
  • Performance test results (if applicable)
  • Any other relevant evidence of the implementation (if applicable)
  • Verify if API specs need to be regenerated.
  • Check if version updates are required.
  • Ensure new entries are added to api/CHANGELOG.md

License

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@AdriiiPRodri AdriiiPRodri requested a review from a team as a code owner April 22, 2026 10:32
@github-actions
Contributor

github-actions Bot commented Apr 22, 2026

✅ All necessary CHANGELOG.md files have been updated.

@github-actions
Contributor

github-actions Bot commented Apr 22, 2026

Conflict Markers Resolved

All conflict markers have been successfully resolved in this pull request.

AdriiiPRodri added a commit that referenced this pull request Apr 22, 2026
@github-actions
Contributor

github-actions Bot commented Apr 22, 2026

🔒 Container Security Scan

Image: prowler-api:7883817
Last scan: 2026-04-27 09:02:41 UTC

📊 Vulnerability Summary

Severity Count
🔴 Critical 5
Total 5

4 package(s) affected

⚠️ Action Required

Critical severity vulnerabilities detected. These should be addressed before merging:

  • Review the detailed scan results
  • Update affected packages to patched versions
  • Consider using a different base image if updates are unavailable


AdriiiPRodri added a commit that referenced this pull request Apr 22, 2026
@AdriiiPRodri AdriiiPRodri force-pushed the fix/api-resource-inventory-reaggregate-on-mute branch from c9c28ee to ed54d23 Compare April 22, 2026 10:37
…ing findings

Extend `reaggregate_all_finding_group_summaries_task` (already chained
after `mute_historical_findings_task`) so that, for each latest scan per
`(provider, day)`, the same per-scan aggregation pipeline that scan
completion runs is re-executed against `ScanGroupSummary`,
`ScanCategorySummary` and `AttackSurfaceOverview` too. Without this,
`/overviews/resource-groups` (resource inventory), `/overviews/categories`
and `/overviews/attack-surfaces` kept pre-mute totals until the next
scan.
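
The "latest scan per `(provider, day)`" selection described above can be sketched in plain Python. The `Scan` dataclass, its `completed_at` field, and `latest_scan_per_provider_day` are hypothetical names for illustration; the real task presumably does this with a database query.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Scan:
    id: str
    provider_id: str
    completed_at: datetime


def latest_scan_per_provider_day(scans):
    """Keep only the most recently completed scan for each (provider, day)."""
    latest = {}
    for scan in scans:
        key = (scan.provider_id, scan.completed_at.date())
        current = latest.get(key)
        if current is None or scan.completed_at > current.completed_at:
            latest[key] = scan
    return list(latest.values())


scans = [
    Scan("a", "aws-1", datetime(2026, 4, 21, 8, 0)),
    Scan("b", "aws-1", datetime(2026, 4, 21, 20, 0)),  # later scan on the same day
    Scan("c", "aws-1", datetime(2026, 4, 22, 8, 0)),
    Scan("d", "gcp-1", datetime(2026, 4, 21, 9, 0)),
]

# Each selected scan would then get its per-scan aggregators re-dispatched.
selected = sorted(scan.id for scan in latest_scan_per_provider_day(scans))
print(selected)  # → ['b', 'c', 'd']
```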

Make the three non-idempotent aggregators safe to re-run:

- `aggregate_attack_surface`, `backfill_scan_resource_group_summaries`
  and `backfill_scan_category_summaries` now delete the scan's existing
  rows before `bulk_create`. A plain INSERT or silent
  `ignore_conflicts=True`/`already backfilled` short-circuit would
  either violate the unique constraint (aborting the Celery chain and
  skipping downstream aggregators) or silently no-op, leaving the
  pre-aggregated tables stale after the mute.

Tests cover the new dispatch fan-out and the updated backfill behaviour.
@AdriiiPRodri AdriiiPRodri force-pushed the fix/api-resource-inventory-reaggregate-on-mute branch from ed54d23 to 14321fd Compare April 22, 2026 10:38
@codecov

codecov Bot commented Apr 22, 2026

Codecov Report

❌ Patch coverage is 98.23009% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 93.65%. Comparing base (2304bf0) to head (1884984).
⚠️ Report is 21 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master   #10843      +/-   ##
==========================================
+ Coverage   93.64%   93.65%   +0.01%     
==========================================
  Files         230      230              
  Lines       32987    33058      +71     
==========================================
+ Hits        30890    30961      +71     
  Misses       2097     2097              
Flag Coverage Δ
api 93.65% <98.23%> (+0.01%) ⬆️


Components Coverage Δ
prowler ∅ <ø> (∅)
api 93.65% <98.23%> (+0.01%) ⬆️

Comment thread api/src/backend/tasks/jobs/scan.py Outdated
Comment thread api/src/backend/tasks/tasks.py Outdated
…iew queue

Instead of DELETE+INSERT, `aggregate_findings`, `aggregate_attack_surface`,
`backfill_scan_resource_group_summaries` and
`backfill_scan_category_summaries` now write via
`bulk_create(update_conflicts=True, ...)`. The DELETE+INSERT pattern was
vulnerable to a race between two concurrent writers on the same scan
(e.g. scan completion overlapping with post-mute reaggregation): the
second task's INSERT would hit the `unique_*_per_scan` constraint after
the first committed and abort the Celery chain, leaving downstream
aggregators unexecuted. Upsert is atomic in PostgreSQL and safe for the
pre-aggregated tables here because the underlying aggregators seed every
combination key unconditionally, so muting never drops a key and no
row is left orphaned.
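
The upsert behaviour and the "seed every key" caveat can be shown with a minimal sketch. It uses sqlite3's `INSERT ... ON CONFLICT DO UPDATE` (SQLite 3.24 or newer) as a stand-in for what `bulk_create(update_conflicts=True, ...)` issues on PostgreSQL; the `category_summary` table and `upsert_summaries` helper are hypothetical.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    """
    CREATE TABLE category_summary (
        scan_id  TEXT NOT NULL,
        category TEXT NOT NULL,
        failed   INTEGER NOT NULL,
        muted    INTEGER NOT NULL,
        UNIQUE (scan_id, category)
    )
    """
)


def upsert_summaries(scan_id, counts):
    """Insert one row per category, updating counters when the row already exists."""
    with db:  # one transaction per call
        db.executemany(
            """
            INSERT INTO category_summary (scan_id, category, failed, muted)
            VALUES (?, ?, ?, ?)
            ON CONFLICT (scan_id, category)
            DO UPDATE SET failed = excluded.failed, muted = excluded.muted
            """,
            [(scan_id, cat, failed, muted) for cat, (failed, muted) in counts.items()],
        )


def rows_for(scan_id):
    return db.execute(
        "SELECT category, failed, muted FROM category_summary "
        "WHERE scan_id = ? ORDER BY category",
        (scan_id,),
    ).fetchall()


# Initial aggregation, then a re-run after muting both IAM findings.
upsert_summaries("scan-1", {"iam": (2, 0), "s3": (1, 0)})
upsert_summaries("scan-1", {"iam": (0, 2), "s3": (1, 0)})
print(rows_for("scan-1"))  # → [('iam', 0, 2), ('s3', 1, 0)]

# Upsert never deletes: a key omitted from a re-run keeps its old row.
# This is why the aggregators need to emit every key, even with zero counts.
upsert_summaries("scan-1", {"iam": (0, 2)})
print(rows_for("scan-1"))  # → [('iam', 0, 2), ('s3', 1, 0)]
```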

Also move `backfill_scan_resource_group_summaries_task` and
`backfill_scan_category_summaries_task` from the `backfill` queue to
`overview`, matching their sibling per-scan aggregators
(`perform_scan_summary_task`, `aggregate_daily_severity_task`,
`aggregate_finding_group_summaries_task`, `aggregate_attack_surface_task`).
These wrappers had no dispatchers outside the post-mute reaggregation
chain, so the queue rename is safe and removes the `.set(queue=...)`
dispatch-site hack.
…ign Celery task names

These two aggregators were misnamed as "backfill" but are, in practice,
the primary per-scan aggregators for `ScanGroupSummary` and
`ScanCategorySummary` -- same role as `aggregate_findings`,
`aggregate_daily_severity`, `aggregate_finding_group_summaries` and
`aggregate_attack_surface`. Rename to reflect that:

- Python: `backfill_scan_category_summaries` ->
  `aggregate_scan_category_summaries`; `backfill_scan_resource_group_summaries` ->
  `aggregate_scan_resource_group_summaries`; and the matching `*_task`
  wrappers.
- Celery task names: `backfill-scan-category-summaries` ->
  `scan-category-summaries`; `backfill-scan-resource-group-summaries` ->
  `scan-resource-group-summaries`. These slot in with the sibling
  `scan-summary`, `scan-daily-severity`, `scan-finding-group-summaries`,
  `scan-attack-surface-overviews` naming.

Safe because the old task wrappers had no dispatchers outside the
post-mute reaggregation chain introduced in this PR -- no queued tasks
with the old name exist during deploy. Updates all imports/callers in
`tasks.py`, `conftest.py`, `test_backfill.py` and `test_tasks.py`.
Comment thread api/src/backend/tasks/tests/test_scan.py
- Assert ScanSummary fail/muted move when a finding is muted mid-rerun
- Assert ScanCategorySummary counters drop to zero after mute
- Assert ScanGroupSummary counters drop to zero after mute
- Reference compliance fixture explicitly to satisfy vulture
@AdriiiPRodri AdriiiPRodri merged commit 65fd333 into master Apr 27, 2026
38 of 48 checks passed
@AdriiiPRodri AdriiiPRodri deleted the fix/api-resource-inventory-reaggregate-on-mute branch April 27, 2026 09:03

2 participants