feat(analyzer): Add session property to gate CTAS IF NOT EXISTS query analysis#27504

Open
kevintang2022 wants to merge 1 commit into prestodb:master from kevintang2022:export-D95135442

Conversation

@kevintang2022
Contributor

@kevintang2022 kevintang2022 commented Apr 3, 2026

Differential Revision: D95135442

Description

Add session property always_analyze_create_table_query_enabled to gate
analyzing inner queries on CTAS IF NOT EXISTS statements. When enabled,
the analyzer processes the inner SELECT query even when the target table
already exists, ensuring view definitions are populated for access
control checks.

Motivation and Context

When a CREATE TABLE IF NOT EXISTS ... AS SELECT statement targets an
existing table, the analyzer short-circuits and skips the inner query.
As a result, the definitions of views referenced in the SELECT are
never collected, and access control checks on those views are
skipped.

Impact

No impact when the session property is disabled (default: false).
When enabled, CTAS IF NOT EXISTS queries against existing tables will
now correctly populate view definitions and enforce access control
checks on referenced views.
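
The behavior described above can be sketched with a toy model. This is a minimal illustration, not the StatementAnalyzer code: the class `CtasGateSketch`, the method `collectViews`, and its parameters are hypothetical names, and the statement is assumed to carry IF NOT EXISTS.

```java
import java.util.List;
import java.util.Set;

// Toy model only: these names are illustrative and are not Presto classes.
final class CtasGateSketch
{
    private CtasGateSketch() {}

    // Returns the views whose definitions analysis would collect for a
    // CREATE TABLE IF NOT EXISTS ... AS SELECT statement.
    static List<String> collectViews(
            String targetTable,
            List<String> viewsInQuery,
            Set<String> existingTables,
            boolean alwaysAnalyzeCreateTableQuery)
    {
        boolean noOp = existingTables.contains(targetTable);
        if (noOp && !alwaysAnalyzeCreateTableQuery) {
            // Default behavior: short-circuit without analyzing the inner SELECT.
            return List.of();
        }
        // Either the table is new (normal CTAS) or the gate forces analysis.
        return List.copyOf(viewsInQuery);
    }

    public static void main(String[] args)
    {
        Set<String> existing = Set.of("t1");
        List<String> views = List.of("view_definer1", "view_invoker2");
        System.out.println(collectViews("t1", views, existing, false)); // []
        System.out.println(collectViews("t1", views, existing, true));  // [view_definer1, view_invoker2]
        System.out.println(collectViews("t2", views, existing, false)); // [view_definer1, view_invoker2]
    }
}
```

Only the first case, an existing table with the property left at its default, leaves the views uncollected; that is the gap this PR addresses.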

Test Plan

Added testCreateTableAsSelectIfNotExistsWithViews in
TestViewDefinitionCollector to verify view definitions are collected
on the CTAS IF NOT EXISTS no-op path. Existing tests pass with the
session property enabled in CLIENT_SESSION.

Contributor checklist

  • Please make sure your submission complies with our contributing guide, in particular code style and commit standards.
  • PR description addresses the issue accurately and concisely.
  • Documented new properties (with their default values), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.
  • If adding new dependencies, verified they have an OpenSSF Scorecard score of 5.0 or higher (or obtained explicit TSC approval for lower scores).

Release Notes

== RELEASE NOTES ==

General Changes
* Add session property ``always_analyze_create_table_query_enabled`` to analyze the inner query of ``CREATE TABLE IF NOT EXISTS ... AS SELECT`` even when the target table already exists.

@prestodb-ci prestodb-ci added the from:Meta PR from Meta label Apr 3, 2026
@sourcery-ai
Contributor

sourcery-ai bot commented Apr 3, 2026

Reviewer's Guide

Adds a new session property to always analyze CTAS IF NOT EXISTS inner queries so view definitions are collected for access control, wires it into StatementAnalyzer, and adds tests to validate view collection under this mode.

Sequence diagram for CTAS IF NOT EXISTS analysis with new session property

sequenceDiagram
    actor User
    participant Client
    participant Coordinator
    participant StatementAnalyzer
    participant Session

    User->>Client: submit CREATE TABLE IF NOT EXISTS ... AS SELECT ...
    Client->>Coordinator: send query
    Coordinator->>StatementAnalyzer: analyze CreateTableAsSelect
    activate StatementAnalyzer
    StatementAnalyzer->>Session: getSystemProperty ALWAYS_ANALYZE_CREATE_TABLE_QUERY_ENABLED
    Session-->>StatementAnalyzer: property value (true or false)
    alt table already exists
        StatementAnalyzer-->>Coordinator: warning TableAlreadyExists (skipping creation)
        alt ALWAYS_ANALYZE_CREATE_TABLE_QUERY_ENABLED is true
            StatementAnalyzer->>StatementAnalyzer: process inner SELECT query
            StatementAnalyzer-->>Coordinator: analysis result with view definitions collected
        else property is false
            StatementAnalyzer-->>Coordinator: analysis result without analyzing inner query
        end
    else table does not exist
        StatementAnalyzer->>Coordinator: proceed with normal CTAS analysis and creation
    end
    Coordinator-->>Client: analysis complete / plan generated
    Client-->>User: return result and warnings

Class diagram for updated SystemSessionProperties and StatementAnalyzer

classDiagram
    class SystemSessionProperties {
        <<final>>
        +static String NATIVE_ENFORCE_JOIN_BUILD_INPUT_PARTITION
        +static String NATIVE_EXECUTION_SCALE_WRITER_THREADS_ENABLED
        +static String TRY_FUNCTION_CATCHABLE_ERRORS
        +static String ALWAYS_ANALYZE_CREATE_TABLE_QUERY_ENABLED
        +SystemSessionProperties(FeaturesConfig featuresConfig)
        +static String getTryFunctionCatchableErrors(Session session)
        +static boolean isAlwaysAnalyzeCreateTableQueryEnabled(Session session)
    }

    class StatementAnalyzer {
        +Scope visitCreateTableAsSelect(CreateTableAsSelect node, Optional~Scope~ context)
        -Scope process(Query query, Scope scope)
    }

    class Session
    class FeaturesConfig
    class Scope
    class Query
    class CreateTableAsSelect

    SystemSessionProperties ..> Session : uses
    SystemSessionProperties ..> FeaturesConfig : uses
    StatementAnalyzer ..> SystemSessionProperties : calls isAlwaysAnalyzeCreateTableQueryEnabled
    StatementAnalyzer ..> Query : analyzes
    StatementAnalyzer ..> Scope : manages
    StatementAnalyzer ..> CreateTableAsSelect : visits

File-Level Changes

Change: Introduce a session property to always analyze CTAS inner queries and expose a getter for it.
Details:
  • Add ALWAYS_ANALYZE_CREATE_TABLE_QUERY_ENABLED session property name constant.
  • Register the new boolean session property with description and default value.
  • Expose isAlwaysAnalyzeCreateTableQueryEnabled accessor to read the property from Session.
Files:
  • presto-main-base/src/main/java/com/facebook/presto/SystemSessionProperties.java

Change: Ensure CTAS IF NOT EXISTS no-op still analyzes the inner query when the new session property is enabled.
Details:
  • In CreateTableAsSelect analysis, when the target table exists and IF NOT EXISTS is used, conditionally invoke analysis of the inner query based on the new session property before returning the synthetic scope.
Files:
  • presto-main-base/src/main/java/com/facebook/presto/sql/analyzer/StatementAnalyzer.java

Change: Enable the new CTAS analysis behavior in analyzer tests and add coverage for view collection in CTAS IF NOT EXISTS.
Details:
  • Turn on ALWAYS_ANALYZE_CREATE_TABLE_QUERY_ENABLED in the default session used by analyzer tests.
  • Add a test ensuring view definitions are collected for a CREATE TABLE IF NOT EXISTS AS SELECT query whose target table already exists.
Files:
  • presto-main-base/src/test/java/com/facebook/presto/sql/analyzer/AbstractAnalyzerTest.java
  • presto-main-base/src/test/java/com/facebook/presto/sql/analyzer/TestViewDefinitionCollector.java
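
As a rough illustration of the first change (a name constant, a boolean property defaulting to false, and an accessor), the sketch below uses a plain map in place of Presto's session machinery. `SessionPropertySketch` and its constructor are hypothetical stand-ins; only the property name and the accessor name come from this PR.

```java
import java.util.Map;

// Toy stand-in; not the real com.facebook.presto.Session or SystemSessionProperties.
final class SessionPropertySketch
{
    // Property name as given in the PR description.
    static final String ALWAYS_ANALYZE_CREATE_TABLE_QUERY_ENABLED =
            "always_analyze_create_table_query_enabled";

    // The PR states that the property defaults to false.
    private static final boolean DEFAULT_VALUE = false;

    private final Map<String, String> systemProperties;

    SessionPropertySketch(Map<String, String> systemProperties)
    {
        this.systemProperties = Map.copyOf(systemProperties);
    }

    // Mirrors the accessor name from the PR; falls back to the default when unset.
    boolean isAlwaysAnalyzeCreateTableQueryEnabled()
    {
        String raw = systemProperties.get(ALWAYS_ANALYZE_CREATE_TABLE_QUERY_ENABLED);
        return raw == null ? DEFAULT_VALUE : Boolean.parseBoolean(raw);
    }

    public static void main(String[] args)
    {
        SessionPropertySketch defaults = new SessionPropertySketch(Map.of());
        SessionPropertySketch enabled = new SessionPropertySketch(
                Map.of(ALWAYS_ANALYZE_CREATE_TABLE_QUERY_ENABLED, "true"));
        System.out.println(defaults.isAlwaysAnalyzeCreateTableQueryEnabled()); // false
        System.out.println(enabled.isAlwaysAnalyzeCreateTableQueryEnabled());  // true
    }
}
```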


Contributor

@sourcery-ai sourcery-ai bot left a comment


Hey - I've found 2 issues, and left some high level feedback:

  • The new test testCreateTableAsSelectIfNotExistsWithViews in TestViewDefinitionCollector is missing a @Test annotation (and appropriate visibility), so it will not run as part of the suite in its current form.
  • In StatementAnalyzer.visitCreateTableAsSelect, consider guarding the process(node.getQuery(), scope) call with any existing view-definition-collection or access-control-related flags to avoid extra analysis work when view tracking is disabled.
Prompt for AI Agents
Please address the comments from this code review:

## Individual Comments

### Comment 1
<location path="presto-main-base/src/test/java/com/facebook/presto/sql/analyzer/TestViewDefinitionCollector.java" line_range="65-62" />
<code_context>
         ), ImmutableMap.of());
     }

+    public void testCreateTableAsSelectIfNotExistsWithViews()
+    {
+        // t1 already exists, so this hits the IF NOT EXISTS no-op path.
+        // View definitions should still be populated from the inner SELECT.
+        @Language("SQL") String query = "CREATE TABLE IF NOT EXISTS t1 AS SELECT view_definer1.a, view_definer1.c, view_invoker2.y FROM view_definer1 left join view_invoker2 on view_invoker2.y = view_definer1.c";
+
+        assertViewDefinitions(query, ImmutableMap.of(
+                "tpch.s1.view_invoker2", "select x, y, z from t13",
+                "tpch.s1.view_definer1", "select a,b,c from t1"
+        ), ImmutableMap.of());
+    }
+
</code_context>
<issue_to_address>
**suggestion (testing):** Add complementary coverage for CTAS IF NOT EXISTS when the new session property is disabled and for the table-does-not-exist path.

This test covers the "table already exists" no-op path with the property enabled. To complete coverage, please also add:

1) A case where the same CTAS IF NOT EXISTS is analyzed with `ALWAYS_ANALYZE_CREATE_TABLE_QUERY_ENABLED` disabled, asserting that view definitions are not collected (or whatever the intended behavior is).

2) A CTAS IF NOT EXISTS where the target table does not exist, confirming that the normal CTAS path still collects view definitions and that the new property does not change successful creation behavior.
</issue_to_address>

### Comment 2
<location path="presto-main-base/src/test/java/com/facebook/presto/sql/analyzer/AbstractAnalyzerTest.java" line_range="159" />
<code_context>
             .setSchema("s1")
             .setSystemProperty(CHECK_ACCESS_CONTROL_ON_UTILIZED_COLUMNS_ONLY, "true")
             .setSystemProperty(CHECK_ACCESS_CONTROL_WITH_SUBFIELDS, "true")
+            .setSystemProperty(ALWAYS_ANALYZE_CREATE_TABLE_QUERY_ENABLED, "true")
             .build();

</code_context>
<issue_to_address>
**suggestion (testing):** Consider adding tests that run with the new session property disabled instead of only enabling it globally for all analyzer tests.

Since the shared session builder now forces `ALWAYS_ANALYZE_CREATE_TABLE_QUERY_ENABLED` to `true`, analyzer tests no longer exercise the default (`false`) behavior for CTAS IF NOT EXISTS. Please either add at least one analyzer test that builds a session with this property set to `false`, or limit the `true` setting to only the tests that require it, so the default configuration remains covered by automated tests.

Suggested implementation:

```java
            .setSchema("s1")
            .setSystemProperty(CHECK_ACCESS_CONTROL_ON_UTILIZED_COLUMNS_ONLY, "true")
            .setSystemProperty(CHECK_ACCESS_CONTROL_WITH_SUBFIELDS, "true")
            .build();

```

To fully implement the review comment:

1. In the analyzer tests that specifically exercise `CREATE TABLE IF NOT EXISTS ... AS SELECT` behavior with the new analysis, build a `Session` that sets `ALWAYS_ANALYZE_CREATE_TABLE_QUERY_ENABLED` to `"true"` explicitly. For example (in the relevant test class):
   - Start from the base session (e.g. `getSession()` or the shared session builder in this file).
   - Call `.setSystemProperty(ALWAYS_ANALYZE_CREATE_TABLE_QUERY_ENABLED, "true")` on the `Session.SessionBuilder`.
2. Ensure at least one analyzer test that covers CTAS IF NOT EXISTS uses the default session (without overriding this property), so that the default `false` behavior remains covered by automated tests.
3. If useful, you can add a helper in this file such as `protected Session getSessionWithAnalyzeCreateTableEnabled()` that wraps the base session builder and sets this property to `"true"`, and then update only the tests that need the non-default behavior to use that helper.
</issue_to_address>
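
The helper idea in step 3 could look roughly like the following. This is a self-contained sketch that uses a hypothetical immutable `TestSession` rather than Presto's `Session` and builder; `baseSession()` and `withSystemProperty` are illustrative names, while `getSessionWithAnalyzeCreateTableEnabled` is the helper name suggested in the comment above.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the suggested test setup; none of these types are Presto classes.
final class AnalyzerTestSessions
{
    static final String ALWAYS_ANALYZE_CREATE_TABLE_QUERY_ENABLED =
            "always_analyze_create_table_query_enabled";

    private AnalyzerTestSessions() {}

    // Hypothetical immutable session holding system property overrides.
    static final class TestSession
    {
        private final Map<String, String> properties;

        TestSession(Map<String, String> properties)
        {
            this.properties = Map.copyOf(properties);
        }

        // Returns a copy with one property overridden; the receiver is unchanged.
        TestSession withSystemProperty(String name, String value)
        {
            Map<String, String> copy = new HashMap<>(properties);
            copy.put(name, value);
            return new TestSession(copy);
        }

        boolean isEnabled(String name)
        {
            return Boolean.parseBoolean(properties.getOrDefault(name, "false"));
        }
    }

    // Shared session: leaves the new property at its default (false).
    static TestSession baseSession()
    {
        return new TestSession(Map.of());
    }

    // Opt-in helper for the tests that need the non-default behavior.
    static TestSession getSessionWithAnalyzeCreateTableEnabled()
    {
        return baseSession().withSystemProperty(ALWAYS_ANALYZE_CREATE_TABLE_QUERY_ENABLED, "true");
    }

    public static void main(String[] args)
    {
        System.out.println(baseSession().isEnabled(ALWAYS_ANALYZE_CREATE_TABLE_QUERY_ENABLED)); // false
        System.out.println(getSessionWithAnalyzeCreateTableEnabled()
                .isEnabled(ALWAYS_ANALYZE_CREATE_TABLE_QUERY_ENABLED)); // true
    }
}
```

Keeping the shared session at the default and deriving an enabled copy means both code paths stay exercised without every analyzer test inheriting the non-default setting.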



@kevintang2022 kevintang2022 changed the title Fix error message for view definitions on CTAS if not exists feat(analyzer): Add session property to gate CTAS IF NOT EXISTS query analysis Apr 3, 2026
@meta-codesync meta-codesync bot changed the title feat(analyzer): Add session property to gate CTAS IF NOT EXISTS query analysis Fix error message for view definitions on CTAS if not exists (#27504) Apr 3, 2026
kevintang2022 added a commit to kevintang2022/presto that referenced this pull request Apr 3, 2026
…b#27504)

Summary: Pull Request resolved: prestodb#27504

Differential Revision: D95135442
@kevintang2022 kevintang2022 changed the title Fix error message for view definitions on CTAS if not exists (#27504) feat(analyzer): Add session property to gate CTAS IF NOT EXISTS query analysis Apr 3, 2026
kevintang2022 added a commit to kevintang2022/presto that referenced this pull request Apr 3, 2026
kevintang2022 added a commit to kevintang2022/presto that referenced this pull request Apr 3, 2026

2 participants