
feat(planner): Support GROUP BY and ORDER BY ordinals in MV query rewriting (#27422) #27422

Merged
ceekay47 merged 1 commit into prestodb:master from ceekay47:export-D97920227
Apr 6, 2026

Conversation

Contributor

@ceekay47 ceekay47 commented Mar 24, 2026

Summary:
Queries using GROUP BY/ORDER BY ordinals (e.g. GROUP BY 1) silently
fell back to the base table because the MV optimizer runs before the
analyzer resolves ordinals to column references. Fix by resolving
ordinals to SELECT expressions during MV validation and passing them
through unchanged during rewriting.

Pulled By:
ceekay47

Differential Revision: D97920227

== RELEASE NOTES ==

General Changes
* Add support for ``GROUP BY`` and ``ORDER BY`` ordinal references in
  materialized view query rewriting. Previously, queries like
  ``SELECT a, SUM(b) FROM t GROUP BY 1`` would silently skip
  materialized view optimization.

@ceekay47 ceekay47 requested review from a team, feilong-liu and jaystarshot as code owners March 24, 2026 10:48
@prestodb-ci prestodb-ci added the from:Meta (PR from Meta) label Mar 24, 2026
Contributor

sourcery-ai bot commented Mar 24, 2026

Reviewer's Guide

Adds support in MaterializedViewQueryOptimizer for materialized view query rewriting when base queries use GROUP BY and ORDER BY ordinals, and introduces tests to validate the new behavior.

Sequence diagram for MV rewrite with GROUP BY and ORDER BY ordinals

sequenceDiagram
    actor User
    participant Planner
    participant MaterializedViewQueryOptimizer
    participant MaterializedViewInfo

    User->>Planner: Submit query with SELECT, GROUP BY 1, ORDER BY 2
    Planner->>MaterializedViewQueryOptimizer: Optimize with materialized views

    MaterializedViewQueryOptimizer->>MaterializedViewInfo: getGroupBy()
    MaterializedViewInfo-->>MaterializedViewQueryOptimizer: Optional<Set<Expression>> groupBy

    loop For each GroupingElement
        MaterializedViewQueryOptimizer->>MaterializedViewQueryOptimizer: removeGroupingElementPrefix(element, removablePrefix)
        alt groupBy present in MaterializedView
            loop For each expression in element.expressions
                alt expression is LongLiteral (GROUP BY ordinal)
                    MaterializedViewQueryOptimizer->>MaterializedViewQueryOptimizer: Resolve ordinal to SelectItem
                    MaterializedViewQueryOptimizer->>MaterializedViewQueryOptimizer: removeExpressionPrefix(selectItem.expression, removablePrefix)
                    MaterializedViewQueryOptimizer->>MaterializedViewQueryOptimizer: Validate resolved expression in groupBy and baseToViewColumnMap
                    MaterializedViewQueryOptimizer->>MaterializedViewQueryOptimizer: Add resolved expression to expressionsInGroupByBuilder
                else expression is non ordinal
                    MaterializedViewQueryOptimizer->>MaterializedViewQueryOptimizer: Validate expression in groupBy and baseToViewColumnMap
                    MaterializedViewQueryOptimizer->>MaterializedViewQueryOptimizer: Add expression to expressionsInGroupByBuilder
                end
            end
        else groupBy absent in MaterializedView
            MaterializedViewQueryOptimizer->>MaterializedViewQueryOptimizer: Add element.expressions to expressionsInGroupByBuilder
        end
    end

    MaterializedViewQueryOptimizer->>MaterializedViewQueryOptimizer: Rewrite ORDER BY
    loop For each SortItem
        MaterializedViewQueryOptimizer->>MaterializedViewQueryOptimizer: removeSortItemPrefix(sortItem, removablePrefix)
        alt sortKey is LongLiteral (ORDER BY ordinal)
            MaterializedViewQueryOptimizer->>MaterializedViewQueryOptimizer: Skip baseToViewColumnMap validation
            MaterializedViewQueryOptimizer->>MaterializedViewQueryOptimizer: visitSortItem returns original SortItem
        else sortKey is non ordinal
            MaterializedViewQueryOptimizer->>MaterializedViewQueryOptimizer: Validate sortKey in baseToViewColumnMap
            MaterializedViewQueryOptimizer->>MaterializedViewQueryOptimizer: visitSortItem rewrites sortKey
        end
    end

    MaterializedViewQueryOptimizer-->>Planner: Rewritten query using materialized view
    Planner-->>User: Execute against materialized view when valid
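
The GROUP BY branch of the sequence above can be sketched in isolation. This is a simplified stand-in, not the optimizer's code: expressions are modelled as plain Java objects (a `Long` plays the role of a `LongLiteral` ordinal, a `String` stands for any other expression), and `GroupByOrdinalSketch`, `resolve`, and `validate` are hypothetical names. It also assumes removable prefixes have already been stripped.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Illustrative model only; none of these names come from the Presto code base.
final class GroupByOrdinalSketch {
    private GroupByOrdinalSketch() {}

    // Resolves a one-based ordinal (modelled as Long) to its SELECT expression.
    // Any other expression (modelled as String) is returned unchanged.
    static String resolve(Object expression, List<String> selectExpressions) {
        if (expression instanceof Long) {
            long ordinal = (Long) expression;
            if (ordinal < 1 || ordinal > selectExpressions.size()) {
                throw new IllegalStateException("GROUP BY ordinal out of range: " + ordinal);
            }
            return selectExpressions.get((int) (ordinal - 1));
        }
        return (String) expression;
    }

    // Returns the resolved grouping expressions when every one of them is both a
    // grouping key of the view and present in the base-to-view column map.
    // An empty result means the query keeps reading the base table.
    static Optional<List<String>> validate(
            List<Object> groupingExpressions,
            List<String> selectExpressions,
            Set<String> viewGroupBy,
            Map<String, String> baseToViewColumnMap) {
        List<String> resolved = new ArrayList<>();
        for (Object expression : groupingExpressions) {
            String candidate = resolve(expression, selectExpressions);
            if (!viewGroupBy.contains(candidate) || !baseToViewColumnMap.containsKey(candidate)) {
                return Optional.empty();
            }
            resolved.add(candidate);
        }
        return Optional.of(resolved);
    }
}
```

For `SELECT a, SUM(b) FROM t GROUP BY 1`, the ordinal resolves to `a`, which can pass both checks. The raw literal `1` would appear in neither collection, which is consistent with the silent fallback described in the summary.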

Updated class diagram for MaterializedViewQueryOptimizer ordinal handling

classDiagram
    class MaterializedViewQueryOptimizer {
    }

    class MaterializedViewInfo {
        +Optional~Set~Expression~~ getGroupBy()
        +Map~Expression, Expression~ getBaseToViewColumnMap()
    }

    class MaterializedViewVisitor {
        -Optional~Identifier~ removablePrefix
        -MaterializedViewInfo materializedViewInfo
        +visitQuerySpecification(QuerySpecification node, Void context) Node
        +visitOrderBy(OrderBy node, Void context) Node
        +visitSortItem(SortItem node, Void context) Node
        +visitSimpleGroupBy(SimpleGroupBy node, Void context) Node
    }

    class QuerySpecification {
        +Select getSelect()
        +Optional~GroupBy~ getGroupBy()
    }

    class GroupBy {
        +List~GroupingElement~ getGroupingElements()
    }

    class GroupingElement {
        +List~Expression~ getExpressions()
    }

    class OrderBy {
        +List~SortItem~ getSortItems()
    }

    class SortItem {
        +Expression getSortKey()
        +SortItemOrdering getOrdering()
        +SortItemNullOrdering getNullOrdering()
    }

    class SimpleGroupBy {
        +List~Expression~ getExpressions()
    }

    class Select {
        +List~SelectItem~ getSelectItems()
    }

    class SelectItem {
    }

    class SingleColumn {
        +Expression getExpression()
    }

    class Expression {
    }

    class LongLiteral {
        +long getValue()
    }

    class Identifier {
    }

    MaterializedViewQueryOptimizer --> MaterializedViewVisitor
    MaterializedViewVisitor --> MaterializedViewInfo
    MaterializedViewVisitor ..> QuerySpecification
    MaterializedViewVisitor ..> GroupBy
    MaterializedViewVisitor ..> GroupingElement
    MaterializedViewVisitor ..> OrderBy
    MaterializedViewVisitor ..> SortItem
    MaterializedViewVisitor ..> SimpleGroupBy
    MaterializedViewVisitor ..> Select
    Select --> SelectItem
    SelectItem <|-- SingleColumn
    Expression <|-- LongLiteral
    MaterializedViewVisitor ..> Expression
    MaterializedViewVisitor ..> LongLiteral
    MaterializedViewVisitor ..> Identifier

File-Level Changes

| Change | Details | Files |
| --- | --- | --- |
| Resolve GROUP BY ordinals to SELECT expressions during MV validation and track resolved expressions for later matching. | <ul><li>Capture the list of SELECT items at the start of GROUP BY handling in visitQuerySpecification.</li><li>When validating GROUP BY, detect LongLiteral ordinal expressions, map them to the corresponding SingleColumn select item, and strip any removable prefix from the underlying expression.</li><li>Validate MV compatibility using the resolved expression instead of the raw ordinal literal.</li><li>Record the resolved expressions in expressionsInGroupBy so visitSingleColumn can match against them, falling back to original expressions when MV has no GROUP BY metadata.</li></ul> | `presto-main-base/src/main/java/com/facebook/presto/sql/analyzer/MaterializedViewQueryOptimizer.java` |
| Treat ORDER BY ordinals as already-validated positional references and pass them through unchanged during rewriting. | <ul><li>In visitOrderBy, skip MV column-map validation for sort keys that are LongLiteral ordinals, continuing to validate only non-ordinal expressions against the base-to-view column map.</li><li>In visitSortItem, return sort items whose sort key is a LongLiteral unchanged instead of rewriting the sort key expression; continue rewriting non-ordinal sort keys as before.</li></ul> | `presto-main-base/src/main/java/com/facebook/presto/sql/analyzer/MaterializedViewQueryOptimizer.java` |
| Preserve GROUP BY ordinal literals during GROUP BY rewriting while still rewriting non-ordinal grouping expressions. | <ul><li>In visitSimpleGroupBy, detect LongLiteral grouping expressions and add them directly to the rewritten GROUP BY without prefix removal or further processing.</li><li>Continue to remove prefixes and recursively rewrite non-ordinal grouping expressions as before.</li></ul> | `presto-main-base/src/main/java/com/facebook/presto/sql/analyzer/MaterializedViewQueryOptimizer.java` |
| Add regression tests covering MV rewrites with GROUP BY and ORDER BY ordinals. | <ul><li>Add testWithGroupByOrdinals validating that GROUP BY ordinal references follow SELECT item rewriting from base table to MV columns.</li><li>Add testWithOrderByOrdinals validating that ORDER BY ordinals are preserved while SELECT list is rewritten to MV columns.</li><li>Add testWithGroupByAndOrderByOrdinals validating combined GROUP BY and ORDER BY ordinal handling over an MV with aggregation and GROUP BY.</li></ul> | `presto-main-base/src/test/java/com/facebook/presto/sql/analyzer/TestMaterializedViewQueryOptimizer.java` |
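
The pass-through behaviour described for `visitSortItem` and `visitSimpleGroupBy` can be illustrated with a small model. As a loudly stated assumption, this is not the optimizer's code: a `Long` stands in for a `LongLiteral` ordinal, a `String` for a column reference, and `OrdinalPassThroughSketch` and `rewriteKeys` are hypothetical names. It assumes validation has already run, so every non-ordinal key has an entry in the column map.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Illustrative model only; none of these names come from the Presto code base.
final class OrdinalPassThroughSketch {
    private OrdinalPassThroughSketch() {}

    // Rewrites sort or grouping keys from base-table columns to view columns.
    // Ordinals (modelled as Long) are copied through untouched.
    static List<Object> rewriteKeys(List<Object> keys, Map<String, String> baseToViewColumnMap) {
        List<Object> rewritten = new ArrayList<>();
        for (Object key : keys) {
            if (key instanceof Long) {
                // Positional reference: leave as-is.
                rewritten.add(key);
                continue;
            }
            String viewColumn = baseToViewColumnMap.get((String) key);
            if (viewColumn == null) {
                // Should not happen if validation ran first (assumption of this sketch).
                throw new IllegalStateException("No view column for " + key);
            }
            rewritten.add(viewColumn);
        }
        return rewritten;
    }
}
```

Leaving the ordinal alone works because the SELECT list is rewritten separately: position 2 in the rewritten query still refers to the second select item, now expressed in terms of view columns.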

Contributor

@sourcery-ai sourcery-ai bot left a comment


Hey - I've found 1 issue, and left some high level feedback:

  • When resolving GROUP BY ordinals, selectItems.get(ordinal - 1) can throw an IndexOutOfBoundsException; consider explicitly validating the ordinal against the selectItems.size() and throwing a clearer IllegalStateException with context about the invalid ordinal.
  • The new IllegalStateException for GROUP BY ordinals referencing non-SingleColumn select items could be made more informative by including the offending SELECT item and ordinal to aid debugging.

## Individual Comments

### Comment 1
<location path="presto-main-base/src/main/java/com/facebook/presto/sql/analyzer/MaterializedViewQueryOptimizer.java" line_range="477-478" />
<code_context>
-                            if (!groupByOfMaterializedView.get().contains(expression) || !materializedViewInfo.getBaseToViewColumnMap().containsKey(expression)) {
+                            // Resolve ordinal references (e.g. GROUP BY 1) to the corresponding SELECT expression
+                            Expression resolved = expression;
+                            if (expression instanceof LongLiteral) {
+                                int ordinal = toIntExact(((LongLiteral) expression).getValue());
+                                SelectItem selectItem = selectItems.get(ordinal - 1);
+                                if (selectItem instanceof SingleColumn) {
</code_context>
<issue_to_address>
**issue (bug_risk):** Guard against invalid or out-of-range GROUP BY ordinals for clearer failures

This treats all `LongLiteral` GROUP BY expressions as valid ordinals and directly indexes `selectItems.get(ordinal - 1)`. If `ordinal <= 0` or `ordinal > selectItems.size()`, this will throw an `IndexOutOfBoundsException` with an unhelpful message. Please add an explicit bounds check and raise a descriptive error (or reuse the engine’s standard invalid-ordinal error path) to make violations easier to diagnose.
</issue_to_address>
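
A minimal sketch of such a guard, assuming a hypothetical helper named `toSelectIndex`. The class name, method name, and message wording are illustrative and not taken from the PR or from the engine's existing error path.

```java
// Illustrative helper only; not part of the Presto code base.
final class OrdinalBoundsSketch {
    private OrdinalBoundsSketch() {}

    // Converts a one-based GROUP BY ordinal to a zero-based SELECT index,
    // failing with a descriptive message when the ordinal is out of range.
    // The range check runs on the long value, so oversized literals are
    // rejected before any narrowing cast.
    static int toSelectIndex(long ordinal, int selectItemCount) {
        if (ordinal < 1 || ordinal > selectItemCount) {
            throw new IllegalStateException(String.format(
                    "GROUP BY ordinal %d is out of range: select list has %d item(s)",
                    ordinal, selectItemCount));
        }
        return (int) (ordinal - 1);
    }
}
```

A caller would then index the select list with the returned value instead of computing `ordinal - 1` inline.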


@meta-codesync meta-codesync bot changed the title from "[presto] Support GROUP BY and ORDER BY ordinals in MV query rewriting" to "feat(planner): Support GROUP BY and ORDER BY ordinals in MV query rewriting" Mar 24, 2026
@meta-codesync meta-codesync bot changed the title from "feat(planner): Support GROUP BY and ORDER BY ordinals in MV query rewriting" to "feat(planner): Support GROUP BY and ORDER BY ordinals in MV query rewriting (#27422)" Mar 24, 2026
ceekay47 added a commit to ceekay47/presto that referenced this pull request Mar 24, 2026
ceekay47 added a commit to ceekay47/presto that referenced this pull request Mar 24, 2026
@steveburnett
Contributor

Please add a release note - or NO RELEASE NOTE - following the Release Notes Guidelines to pass the failing but not required CI check.

ceekay47 added a commit to ceekay47/presto that referenced this pull request Mar 24, 2026
ceekay47 added a commit to ceekay47/presto that referenced this pull request Mar 27, 2026
ceekay47 added a commit to ceekay47/presto that referenced this pull request Mar 27, 2026
@ceekay47 ceekay47 requested a review from amitkdutta March 27, 2026 19:15
@meta-codesync meta-codesync bot changed the title from "feat(planner): Support GROUP BY and ORDER BY ordinals in MV query rewriting (#27422)" to "feat(planner): Support GROUP BY and ORDER BY ordinals in MV query rewriting (#27422) (#27422)" Apr 2, 2026
ceekay47 added a commit to ceekay47/presto that referenced this pull request Apr 2, 2026
@ceekay47 ceekay47 changed the title from "feat(planner): Support GROUP BY and ORDER BY ordinals in MV query rewriting (#27422) (#27422)" to "feat: Support GROUP BY and ORDER BY ordinals in MV query rewriting (#27422) (#27422)" Apr 2, 2026
@steveburnett
Contributor

  • Please edit the PR title to follow semantic commit style to pass the failing and required CI check. See the failure in the test for advice.

" feat: Support GROUP BY and ORDER BY ordinals in MV query rewriting (#27422) (#27422)"
I think you just need to delete the space before feat and the CI check should pass.

@ceekay47 ceekay47 changed the title from "feat: Support GROUP BY and ORDER BY ordinals in MV query rewriting (#27422) (#27422)" to "feat:Support GROUP BY and ORDER BY ordinals in MV query rewriting (#27422) (#27422)" Apr 2, 2026
@ceekay47 ceekay47 changed the title from "feat:Support GROUP BY and ORDER BY ordinals in MV query rewriting (#27422) (#27422)" to "feat: Support GROUP BY and ORDER BY ordinals in MV query rewriting (#27422)" Apr 2, 2026
Member

@hantangwangd hantangwangd left a comment


Thanks @ceekay47 for this feature. Overall looks good to me — just a few nits and one suggestion about test case additions.

@meta-codesync meta-codesync bot changed the title from "feat: Support GROUP BY and ORDER BY ordinals in MV query rewriting (#27422)" to "feat(planner): Support GROUP BY and ORDER BY ordinals in MV query rewriting (#27422) (#27422)" Apr 3, 2026
ceekay47 added a commit to ceekay47/presto that referenced this pull request Apr 3, 2026
@ceekay47 ceekay47 changed the title from "feat(planner): Support GROUP BY and ORDER BY ordinals in MV query rewriting (#27422) (#27422)" to "feat(planner): Support GROUP BY and ORDER BY ordinals in MV query rewriting (#27422) (#27422)" Apr 3, 2026
@meta-codesync meta-codesync bot changed the title from "feat(planner): Support GROUP BY and ORDER BY ordinals in MV query rewriting (#27422) (#27422)" to "feat(planner): Support GROUP BY and ORDER BY ordinals in MV query rewriting (#27422) (#27422) (#27422)" Apr 4, 2026
@ceekay47 ceekay47 changed the title from "feat(planner): Support GROUP BY and ORDER BY ordinals in MV query rewriting (#27422) (#27422) (#27422)" to "feat(planner): Support GROUP BY and ORDER BY ordinals in MV query rewriting (#27422)" Apr 4, 2026
@ceekay47 ceekay47 merged commit ee0eec7 into prestodb:master Apr 6, 2026
118 of 124 checks passed