Skip to content

perf(api): speed up finding-groups /resources endpoint #10816

Merged

AdriiiPRodri merged 2 commits into master from perf/api-finding-groups-resources-rfm-query-plan on Apr 21, 2026

Conversation

@AdriiiPRodri
Contributor

@AdriiiPRodri AdriiiPRodri commented Apr 21, 2026

Context

Users reported that the finding groups page felt extremely slow, especially when drilling into the /finding-groups/{check_id}/resources and /finding-groups/latest/{check_id}/resources endpoints. Profiling against a realistic dev dataset showed the request spending ~3.5 s in SQL for a response that ultimately returns ~10 rows.

Description

Profiling with django-silk + EXPLAIN ANALYZE traced the cost to two independent problems, both inside FindingGroupViewSet on the /resources action:

  1. Redundant existence probe on every request. _paginated_resource_response ran has_mappings = self._build_resource_mapping_queryset(...).exists() before handing control to _mapping_paginated_response, which then re-ran essentially the same semi-join as part of its paginator COUNT. For the 99% of requests that hit groups with resources (non-IaC), that extra probe was pure waste. The exists() over partitioned ResourceFindingMapping cost ~586 ms of real DB time per request on the test dataset.

  2. Planner picked a Merge Semi Join we didn't want. _build_resource_mapping_queryset filtered RFM with finding_id__in=Subquery(findings_qs). PostgreSQL kept choosing a Merge Semi Join: it read the finding_id index of each RFM partition ordered by finding_id and filtered tenant_id as a post-step. On the monthly partition currently being appended to, that meant reading ~470k index entries to keep ~38k matching the tenant, only to surface the first row of the semi-join result. Classic case of the planner treating the post-filter as free when it wasn't.
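
For illustration, the pre-change shape looked roughly like the sketch below (names follow the helpers and model mentioned above, but tenant handling and import paths are condensed; treat it as a sketch, not the literal code in views.py):

```python
from django.db.models import Subquery

# ResourceFindingMapping is the partitioned RFM model; in the real code it is
# imported from the API models module.

def build_rfm_queryset_before(findings_qs, tenant_id):
    # Problem 2: the Subquery filter steers PostgreSQL towards a Merge Semi
    # Join over the partitions' finding_id indexes, post-filtering tenant_id.
    return ResourceFindingMapping.objects.filter(
        tenant_id=tenant_id,
        finding_id__in=Subquery(findings_qs.values("id")),
    )

def has_mappings_before(findings_qs, tenant_id):
    # Problem 1: an existence probe that re-runs essentially the same
    # semi-join the paginator COUNT is about to execute anyway.
    return build_rfm_queryset_before(findings_qs, tenant_id).exists()
```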

This PR avoids both problems without schema changes:

  • Drops the has_mappings.exists() pre-check. Instead, _paginated_resource_response now calls _mapping_paginated_response straight away and uses the paginator's total count to decide when to fall back to _combined_paginated_response for pure-orphan/IaC groups. One less query per request on the common path; behavior for IaC groups is unchanged.
  • Materializes the finding IDs as a plain Python list (finding_id__in=[uuid, uuid, …]) rather than Subquery(...). With an inlined value list the planner switches to Nested Loop + Index Scan on the RFM finding_id index, keyed directly by the constant list. RFM aggregation queries now execute in under 1 ms of DB time.
  • Adds a request-scoped cache (_resolve_finding_ids) so the three helpers that build different RFM querysets from the same filtered findings set (_build_resource_mapping_queryset, _build_resource_ordering_queryset, _build_resource_aggregation) reuse the same materialized list instead of re-running the findings query 3–4× per request.
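
A minimal sketch of the post-change query construction (same caveats as above; the helper name here is illustrative):

```python
def build_rfm_queryset_after(finding_ids, tenant_id):
    # finding_ids is a plain Python list of UUIDs materialized once per
    # request by _resolve_finding_ids. The inlined value list lets the
    # planner key the partitions' finding_id indexes directly:
    #   Index Cond: (finding_id = ANY ('{...}'::uuid[]))
    return ResourceFindingMapping.objects.filter(
        tenant_id=tenant_id,
        finding_id__in=finding_ids,
    )
```

Keying the IaC/orphan fallback off the paginator's total count, instead of a dedicated exists() probe, is what removes one query from the common path.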

EXPLAIN ANALYZE (real captures)

Before: has_mappings.exists() pre-check (~586 ms real DB time)
Limit  (cost=282.16..293.91 rows=1 width=4) (actual time=584.270..584.283 rows=1 loops=1)
  ->  Merge Semi Join  (cost=282.16..1494541.31 rows=127167 width=4) (actual time=584.268..584.280 rows=1 loops=1)
        Merge Cond: (resource_finding_mappings.finding_id = v0.id)
        ->  Append  (cost=1.75..359959.95 rows=321941 width=16) (actual time=0.051..111.924 rows=38230 loops=1)
              ->  Index Scan using resource_finding_mappings_2026_mar_finding_id_idx on resource_finding_mappings_2026_mar
                    Filter: (tenant_id = 'a3b26b14-cbfe-40e8-b54c-dd6e2bfa8780'::uuid)
                    Rows Removed by Filter: 431220
              ->  Index Scan using resource_finding_mappings_2026_apr_finding_id_idx on resource_finding_mappings_2026_apr  (never executed)
                    Filter: (tenant_id = 'a3b26b14-cbfe-40e8-b54c-dd6e2bfa8780'::uuid)
              …

Reading ~470k index entries just to post-filter tenant_id and keep ~38k, only to return a single row via LIMIT 1.

Before: main RFM aggregation (Merge Semi Join, ~21 ms real DB time but unstable under load)
Finalize GroupAggregate  (actual time=18.082..20.780 rows=10 loops=1)
  ->  Gather Merge  (…)
        ->  Partial GroupAggregate
              ->  Sort
                    ->  Hash Join
                          ->  Nested Loop
                                ->  Nested Loop
                                      ->  Merge Semi Join  (actual time=1.252..1.294 rows=3 loops=3)
                                            Merge Cond: (rfm.finding_id = v0.id)
                                            ->  Sort  (rfm sorted by finding_id)
                                            ->  Sort  (findings subquery sorted by id)
After: same aggregation, Nested Loop + Index Scan driven by the materialized list (< 1 ms real DB time)
GroupAggregate  (cost=334.37..334.55 rows=2 width=369) (actual time=0.422..0.667 rows=10 loops=1)
  Group Key: resource_finding_mappings.resource_id
  ->  Sort  (actual time=0.347..0.351 rows=10 loops=1)
        Sort Key: resource_finding_mappings.resource_id
        ->  Nested Loop  (actual time=0.082..0.334 rows=10 loops=1)
              ->  Nested Loop
                    ->  Nested Loop
                          ->  Append  (actual time=0.038..0.132 rows=10 loops=1)
                                ->  Index Scan using resource_finding_mappings_2026_mar_finding_id_idx
                                      Index Cond: (finding_id = ANY ('{019ce202-b11a-…, 019ce202-b11c-…, …}'::uuid[]))
                                …

The semi-join is gone; the RFM access is a targeted Index Cond: finding_id = ANY (…) driven straight by the inlined UUID list, which is exactly what we want on partitioned RFM.

Notes on the per-request cache

The cache in _resolve_finding_ids lives on self (the FindingGroupViewSet instance). DRF creates a fresh ViewSet instance for every request (as_view() calls cls(**initkwargs) per dispatch), so the cache is effectively request-scoped. Reviewers have asked a few times whether this could cause the endpoint to serve stale data; the short answer is no:

  • Between requests: each request starts with self._finding_ids_cache = None and re-materializes the finding IDs against the current DB state. There is no cross-request caching.
  • Inside the same request: the cache forces all RFM helpers (_build_resource_mapping_queryset, _build_resource_ordering_queryset, _build_resource_aggregation) to operate on the same snapshot of finding IDs. This is actually more consistent than the previous behavior, where each helper re-ran the findings subquery and could each observe slightly different state under PostgreSQL's default read-committed isolation.
  • Concurrent writes during the request:
    • New findings created after materialization are not visible in this request (same as before — the old code would also race depending on when each subquery ran).
    • New ResourceFindingMapping rows for already-cached finding IDs are still visible, because each RFM query runs live; only the IN (...) list is frozen.
    • Deleted findings leave their UUIDs in the inlined list, but the RFM join simply returns no rows for them — same net result as the un-cached version.
  • Identity check: the cache is keyed by identity (is) on the filtered_queryset object, not by equality. If a caller were ever to pass a different queryset instance (a clone), the cache would miss and we would re-materialize, so there is no risk of serving data from a different filter set.
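
For reference, a minimal sketch of the cache shape (only _resolve_finding_ids and _finding_ids_cache are names from the actual change; the identity-key attribute below is named illustratively):

```python
class FindingIdCacheMixin:
    """Illustrative shape of the per-request cache, not the exact code."""

    _finding_ids_cache = None
    _finding_ids_cache_for = None  # illustrative name for the identity key

    def _resolve_finding_ids(self, filtered_queryset):
        # Identity check ("is"), not equality: a different queryset object,
        # even an equal clone, misses the cache and re-materializes.
        if self._finding_ids_cache_for is not filtered_queryset:
            self._finding_ids_cache = list(
                filtered_queryset.values_list("id", flat=True)
            )
            self._finding_ids_cache_for = filtered_queryset
        return self._finding_ids_cache
```

Since DRF builds a fresh ViewSet instance per dispatch, both attributes start at their class defaults on every request, which is what makes the cache request-scoped.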

Comparison

Measured against the same tenant in the dev environment. Numbers are silk-reported totals (AWS RDS in eu-west-1 accessed from a local Docker container, so each round-trip inflates query timings; DB time itself is sub-millisecond after the fix).

/finding-groups/latest/s3_bucket_public_access/resources (group with mappings, common path)

| Variant | Total time | SQL time | # SQL queries | Slowest RFM query (real DB time) |
| --- | --- | --- | --- | --- |
| Baseline (before) | 7 872 – 8 709 ms | 3 128 – 3 549 ms | 8 | 586 ms (Merge Semi Join) |
| After dropping the exists() pre-check | 7 073 ms | 2 650 ms | 7 | 586 ms (Merge Semi Join still there on the count) |
| After materialization + per-request cache | 6 626 – 6 717 ms | 2 490 – 2 614 ms | 8 | < 1 ms (Nested Loop + Index Scan) |

/finding-groups/latest/aws-autoscaling-enable-at-rest-encryption/resources (IaC-like group with no mappings, orphan fallback path)

| Variant | Total time | SQL time | # SQL queries |
| --- | --- | --- | --- |
| Baseline (before) | 6 681 ms | 2 554 ms | 8 |
| Without the per-request cache | 8 280 – 8 315 ms | 3 273 – 3 346 ms | 11 (3× duplicate materialization) |
| With per-request cache | 6 949 ms | 2 679 ms | 9 (+1 single materialization query) |

The orphan path is essentially on par with the baseline; the mapping path (which is the common case) sees the SQL plan collapse from 586 ms per RFM touch to sub-millisecond, and shaves ~1.3–2 s off the request against the remote RDS setup used for benchmarking. In production, where the API is co-located with the database, the remaining silk-reported latency (which is dominated by network round-trip time) will collapse as well.

Steps to review

  1. Code review FindingGroupViewSet in api/src/backend/api/v1/views.py:
    • _resolve_finding_ids helper and its request-scoped cache.
    • _build_resource_mapping_queryset now uses the Python list instead of Subquery.
    • _paginated_resource_response drops the has_mappings.exists() branch and piggybacks on the paginator count to detect orphan-only groups.
  2. Exercise the endpoint end-to-end against a dataset that has both a mapping-backed group and an IaC/orphan-only group, and confirm:
    • Rows returned match the previous behavior.
    • Pagination (page[number], page[size]) and sort (sort=status,severity,delta,...) still work.
    • Resource filters (filter[resource_uid], filter[resource_type], etc.) keep returning the same resources.
  3. Optional: capture EXPLAIN ANALYZE on ResourceFindingMapping queries to confirm the planner now uses Index Scan using …_finding_id_idx with Index Cond: (finding_id = ANY (...)) instead of the previous Merge Semi Join.
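
One way to capture that from a Django shell, sketched with placeholder values (tenant_id, check_id, and the exact findings filter are stand-ins for your own data; import paths depend on the project layout):

```python
# Stand-in for the filtered findings set of the group under review.
finding_ids = list(
    Finding.objects.filter(tenant_id=tenant_id, check_id=check_id)
    .values_list("id", flat=True)
)

plan = ResourceFindingMapping.objects.filter(
    tenant_id=tenant_id,
    finding_id__in=finding_ids,
).explain(analyze=True)

print(plan)
# Expected after this PR:
#   Index Scan using resource_finding_mappings_..._finding_id_idx
#   Index Cond: (finding_id = ANY ('{...}'::uuid[]))
```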

Checklist

Community Checklist
  • This feature/issue is listed in here or roadmap.prowler.com
  • Is it assigned to me, if not, request it via the issue/feature in here or Prowler Community Slack

API

  • All issue/task requirements work as expected on the API
  • Endpoint response output (same shape as before; performance-only change)
  • EXPLAIN ANALYZE output for new/modified queries or indexes (see Description)
  • Performance test results (see Comparison table)
  • Verify if API specs need to be regenerated.
  • Check if version updates are required.
  • Ensure new entries are added to api/CHANGELOG.md

License

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

Two query-plan fixes in the FindingGroup viewset's /resources action:

1. Drop the has_mappings.exists() pre-check in _paginated_resource_response.
   That probe ran on every non-filtered request and re-did the same
   semi-join the paginator already runs; use the paginator's total count
   to detect orphan-only (IaC) groups instead.

2. Materialize finding_ids into a Python list before filtering
   ResourceFindingMapping. finding_id__in=Subquery(findings_qs) made the
   planner pick a Merge Semi Join that scanned hundreds of thousands of
   RFM index entries to post-filter tenant_id. An inlined UUID list flips
   the plan to Nested Loop + Index Scan on finding_id (sub-millisecond
   on real DB time).

A request-scoped cache in _resolve_finding_ids deduplicates the
materialization across helpers that build different RFM querysets from
the same filtered findings set, so we do the findings round-trip once
per request.

On the mapping path this cuts ~1-2s off the request in the measured
dev environment while keeping behavior identical for IaC/orphan groups.
@AdriiiPRodri AdriiiPRodri requested a review from a team as a code owner April 21, 2026 10:30
AdriiiPRodri added a commit that referenced this pull request Apr 21, 2026
@github-actions
Contributor

github-actions Bot commented Apr 21, 2026

✅ All necessary CHANGELOG.md files have been updated.

@github-actions
Contributor

github-actions Bot commented Apr 21, 2026

Conflict Markers Resolved

All conflict markers have been successfully resolved in this pull request.

@AdriiiPRodri AdriiiPRodri force-pushed the perf/api-finding-groups-resources-rfm-query-plan branch from 76d15b3 to 0b8bb98 on April 21, 2026 10:31
@github-actions
Contributor

github-actions Bot commented Apr 21, 2026

🔒 Container Security Scan

Image: prowler-api:85d786e
Last scan: 2026-04-21 10:40:36 UTC

📊 Vulnerability Summary

| Severity | Count |
| --- | --- |
| 🔴 Critical | 5 |
| Total | 5 |

4 package(s) affected

⚠️ Action Required

Critical severity vulnerabilities detected. These should be addressed before merging:

  • Review the detailed scan results
  • Update affected packages to patched versions
  • Consider using a different base image if updates are unavailable


@codecov

codecov Bot commented Apr 21, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 93.47%. Comparing base (dcec79d) to head (0b8bb98).
⚠️ Report is 11 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master   #10816   +/-   ##
=======================================
  Coverage   93.47%   93.47%           
=======================================
  Files         228      228           
  Lines       32027    32064   +37     
=======================================
+ Hits        29936    29973   +37     
  Misses       2091     2091           
| Flag | Coverage Δ |
| --- | --- |
| api | 93.47% <100.00%> (+<0.01%) ⬆️ |

Flags with carried forward coverage won't be shown.

| Components | Coverage Δ |
| --- | --- |
| prowler | ∅ <ø> (∅) |
| api | 93.47% <100.00%> (+<0.01%) ⬆️ |

Contributor

@josema-xyz josema-xyz left a comment


🙌

@AdriiiPRodri AdriiiPRodri merged commit 548389d into master Apr 21, 2026
41 checks passed
@AdriiiPRodri AdriiiPRodri deleted the perf/api-finding-groups-resources-rfm-query-plan branch April 21, 2026 10:54
@AdriiiPRodri AdriiiPRodri added the backport-to-v5.24 Backport PR to the v5.24 branch label Apr 21, 2026
@prowler-bot prowler-bot added the was-backported The PR was successfully backported to the target branch label Apr 21, 2026
@prowler-bot
Collaborator

💚 All backports created successfully

Branch: v5.24

Questions?

Please refer to the Backport tool documentation and see the GitHub Action logs for details.


Labels

backport-to-v5.24 Backport PR to the v5.24 branch component/api was-backported The PR was successfully backported to the target branch
