feat(analyzer): Add session property to gate CTAS IF NOT EXISTS query analysis#27504
feat(analyzer): Add session property to gate CTAS IF NOT EXISTS query analysis#27504kevintang2022 wants to merge 1 commit intoprestodb:masterfrom
Conversation
Reviewer's GuideAdds a new session property to always analyze CTAS IF NOT EXISTS inner queries so view definitions are collected for access control, wires it into StatementAnalyzer, and adds tests to validate view collection under this mode. Sequence diagram for CTAS IF NOT EXISTS analysis with new session propertysequenceDiagram
actor User
participant Client
participant Coordinator
participant StatementAnalyzer
participant Session
User->>Client: submit CREATE TABLE IF NOT EXISTS ... AS SELECT ...
Client->>Coordinator: send query
Coordinator->>StatementAnalyzer: analyze CreateTableAsSelect
activate StatementAnalyzer
StatementAnalyzer->>Session: getSystemProperty ALWAYS_ANALYZE_CREATE_TABLE_QUERY_ENABLED
Session-->>StatementAnalyzer: property value (true or false)
alt table already exists
StatementAnalyzer-->>Coordinator: warning TableAlreadyExists (skipping creation)
alt ALWAYS_ANALYZE_CREATE_TABLE_QUERY_ENABLED is true
StatementAnalyzer->>StatementAnalyzer: process inner SELECT query
StatementAnalyzer-->>Coordinator: analysis result with view definitions collected
else property is false
StatementAnalyzer-->>Coordinator: analysis result without analyzing inner query
end
else table does not exist
StatementAnalyzer->>Coordinator: proceed with normal CTAS analysis and creation
end
Coordinator-->>Client: analysis complete / plan generated
Client-->>User: return result and warnings
Class diagram for updated SystemSessionProperties and StatementAnalyzerclassDiagram
class SystemSessionProperties {
<<final>>
+static String NATIVE_ENFORCE_JOIN_BUILD_INPUT_PARTITION
+static String NATIVE_EXECUTION_SCALE_WRITER_THREADS_ENABLED
+static String TRY_FUNCTION_CATCHABLE_ERRORS
+static String ALWAYS_ANALYZE_CREATE_TABLE_QUERY_ENABLED
+SystemSessionProperties(FeaturesConfig featuresConfig)
+static String getTryFunctionCatchableErrors(Session session)
+static boolean isAlwaysAnalyzeCreateTableQueryEnabled(Session session)
}
class StatementAnalyzer {
+Scope visitCreateTableAsSelect(CreateTableAsSelect node, Optional~Scope~ context)
-Scope process(Query query, Scope scope)
}
class Session
class FeaturesConfig
class Scope
class Query
class CreateTableAsSelect
SystemSessionProperties ..> Session : uses
SystemSessionProperties ..> FeaturesConfig : uses
StatementAnalyzer ..> SystemSessionProperties : calls isAlwaysAnalyzeCreateTableQueryEnabled
StatementAnalyzer ..> Query : analyzes
StatementAnalyzer ..> Scope : manages
StatementAnalyzer ..> CreateTableAsSelect : visits
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Hey - I've found 2 issues, and left some high level feedback:
- The new test
testCreateTableAsSelectIfNotExistsWithViewsinTestViewDefinitionCollectoris missing a@Testannotation (and appropriate visibility), so it will not run as part of the suite in its current form. - In
StatementAnalyzer.visitCreateTableAsSelect, consider guarding theprocess(node.getQuery(), scope)call with any existing view-definition-collection or access-control-related flags to avoid extra analysis work when view tracking is disabled.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The new test `testCreateTableAsSelectIfNotExistsWithViews` in `TestViewDefinitionCollector` is missing a `@Test` annotation (and appropriate visibility), so it will not run as part of the suite in its current form.
- In `StatementAnalyzer.visitCreateTableAsSelect`, consider guarding the `process(node.getQuery(), scope)` call with any existing view-definition-collection or access-control-related flags to avoid extra analysis work when view tracking is disabled.
## Individual Comments
### Comment 1
<location path="presto-main-base/src/test/java/com/facebook/presto/sql/analyzer/TestViewDefinitionCollector.java" line_range="65-62" />
<code_context>
), ImmutableMap.of());
}
+ public void testCreateTableAsSelectIfNotExistsWithViews()
+ {
+ // t1 already exists, so this hits the IF NOT EXISTS no-op path.
+ // View definitions should still be populated from the inner SELECT.
+ @Language("SQL") String query = "CREATE TABLE IF NOT EXISTS t1 AS SELECT view_definer1.a, view_definer1.c, view_invoker2.y FROM view_definer1 left join view_invoker2 on view_invoker2.y = view_definer1.c";
+
+ assertViewDefinitions(query, ImmutableMap.of(
+ "tpch.s1.view_invoker2", "select x, y, z from t13",
+ "tpch.s1.view_definer1", "select a,b,c from t1"
+ ), ImmutableMap.of());
+ }
+
</code_context>
<issue_to_address>
**suggestion (testing):** Add complementary coverage for CTAS IF NOT EXISTS when the new session property is disabled and for the table-does-not-exist path.
This test covers the "table already exists" no-op path with the property enabled. To complete coverage, please also add:
1) A case where the same CTAS IF NOT EXISTS is analyzed with `ALWAYS_ANALYZE_CREATE_TABLE_QUERY_ENABLED` disabled, asserting that view definitions are not collected (or whatever the intended behavior is).
2) A CTAS IF NOT EXISTS where the target table does not exist, confirming that the normal CTAS path still collects view definitions and that the new property does not change successful creation behavior.
</issue_to_address>
### Comment 2
<location path="presto-main-base/src/test/java/com/facebook/presto/sql/analyzer/AbstractAnalyzerTest.java" line_range="159" />
<code_context>
.setSchema("s1")
.setSystemProperty(CHECK_ACCESS_CONTROL_ON_UTILIZED_COLUMNS_ONLY, "true")
.setSystemProperty(CHECK_ACCESS_CONTROL_WITH_SUBFIELDS, "true")
+ .setSystemProperty(ALWAYS_ANALYZE_CREATE_TABLE_QUERY_ENABLED, "true")
.build();
</code_context>
<issue_to_address>
**suggestion (testing):** Consider adding tests that run with the new session property disabled instead of only enabling it globally for all analyzer tests.
Since the shared session builder now forces `ALWAYS_ANALYZE_CREATE_TABLE_QUERY_ENABLED` to `true`, analyzer tests no longer exercise the default (`false`) behavior for CTAS IF NOT EXISTS. Please either add at least one analyzer test that builds a session with this property set to `false`, or limit the `true` setting to only the tests that require it, so the default configuration remains covered by automated tests.
Suggested implementation:
```java
.setSchema("s1")
.setSystemProperty(CHECK_ACCESS_CONTROL_ON_UTILIZED_COLUMNS_ONLY, "true")
.setSystemProperty(CHECK_ACCESS_CONTROL_WITH_SUBFIELDS, "true")
.build();
```
To fully implement the review comment:
1. In the analyzer tests that specifically exercise `CREATE TABLE AS ... IF NOT EXISTS` behavior with the new analysis, build a `Session` that sets `ALWAYS_ANALYZE_CREATE_TABLE_QUERY_ENABLED` to `"true"` explicitly. For example (in the relevant test class):
- Start from the base session (e.g. `getSession()` or the shared session builder in this file).
- Call `.setSystemProperty(ALWAYS_ANALYZE_CREATE_TABLE_QUERY_ENABLED, "true")` on the `Session.SessionBuilder`.
2. Ensure at least one analyzer test that covers CTAS IF NOT EXISTS uses the default session (without overriding this property), so that the default `false` behavior remains covered by automated tests.
3. If useful, you can add a helper in this file such as `protected Session getSessionWithAnalyzeCreateTableEnabled()` that wraps the base session builder and sets this property to `"true"`, and then update only the tests that need the non-default behavior to use that helper.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| @@ -62,6 +62,18 @@ public void testCreateTableAsSelectWithViews() | |||
| ), ImmutableMap.of()); | |||
There was a problem hiding this comment.
suggestion (testing): Add complementary coverage for CTAS IF NOT EXISTS when the new session property is disabled and for the table-does-not-exist path.
This test covers the "table already exists" no-op path with the property enabled. To complete coverage, please also add:
-
A case where the same CTAS IF NOT EXISTS is analyzed with
ALWAYS_ANALYZE_CREATE_TABLE_QUERY_ENABLEDdisabled, asserting that view definitions are not collected (or whatever the intended behavior is). -
A CTAS IF NOT EXISTS where the target table does not exist, confirming that the normal CTAS path still collects view definitions and that the new property does not change successful creation behavior.
| .setSchema("s1") | ||
| .setSystemProperty(CHECK_ACCESS_CONTROL_ON_UTILIZED_COLUMNS_ONLY, "true") | ||
| .setSystemProperty(CHECK_ACCESS_CONTROL_WITH_SUBFIELDS, "true") | ||
| .setSystemProperty(ALWAYS_ANALYZE_CREATE_TABLE_QUERY_ENABLED, "true") |
There was a problem hiding this comment.
suggestion (testing): Consider adding tests that run with the new session property disabled instead of only enabling it globally for all analyzer tests.
Since the shared session builder now forces ALWAYS_ANALYZE_CREATE_TABLE_QUERY_ENABLED to true, analyzer tests no longer exercise the default (false) behavior for CTAS IF NOT EXISTS. Please either add at least one analyzer test that builds a session with this property set to false, or limit the true setting to only the tests that require it, so the default configuration remains covered by automated tests.
Suggested implementation:
.setSchema("s1")
.setSystemProperty(CHECK_ACCESS_CONTROL_ON_UTILIZED_COLUMNS_ONLY, "true")
.setSystemProperty(CHECK_ACCESS_CONTROL_WITH_SUBFIELDS, "true")
.build();To fully implement the review comment:
- In the analyzer tests that specifically exercise
CREATE TABLE AS ... IF NOT EXISTSbehavior with the new analysis, build aSessionthat setsALWAYS_ANALYZE_CREATE_TABLE_QUERY_ENABLEDto"true"explicitly. For example (in the relevant test class):- Start from the base session (e.g.
getSession()or the shared session builder in this file). - Call
.setSystemProperty(ALWAYS_ANALYZE_CREATE_TABLE_QUERY_ENABLED, "true")on theSession.SessionBuilder.
- Start from the base session (e.g.
- Ensure at least one analyzer test that covers CTAS IF NOT EXISTS uses the default session (without overriding this property), so that the default
falsebehavior remains covered by automated tests. - If useful, you can add a helper in this file such as
protected Session getSessionWithAnalyzeCreateTableEnabled()that wraps the base session builder and sets this property to"true", and then update only the tests that need the non-default behavior to use that helper.
…b#27504) Summary: Pull Request resolved: prestodb#27504 Differential Revision: D95135442
59ff293 to
300f988
Compare
300f988 to
2e02fbb
Compare
2e02fbb to
19208cd
Compare
19208cd to
06fe44e
Compare
Differential Revision: D95135442
Description
Add session property
always_analyze_create_table_query_enabledto gateanalyzing inner queries on CTAS IF NOT EXISTS statements. When enabled,
the analyzer processes the inner SELECT query even when the target table
already exists, ensuring view definitions are populated for access
control checks.
Motivation and Context
When a
CREATE TABLE AS SELECT IF NOT EXISTSstatement targets anexisting table, the analyzer short-circuits and skips analyzing the
inner query. This means view definitions referenced in the SELECT are
never collected, causing access control checks on those views to be
skipped.
Impact
No impact when the session property is disabled (default:
false).When enabled, CTAS IF NOT EXISTS queries against existing tables will
now correctly populate view definitions and enforce access control
checks on referenced views.
Test Plan
Added
testCreateTableAsSelectIfNotExistsWithViewsinTestViewDefinitionCollectorto verify view definitions are collectedon the CTAS IF NOT EXISTS no-op path. Existing tests pass with the
session property enabled in
CLIENT_SESSION.Contributor checklist
Release Notes