[Delta Core 2.4][Spark 3.4] Execute MERGE using the DataFrame API in Scala so that the merge command appears in the logical execution plan and is picked up by QueryExecutionListener #4825
(Cherry-pick of #3456 and #3585)
This change ensures that a MERGE command executed through the Scala API appears in the logical execution plan and is seen by QueryExecutionListener. Spark 3.5.x and 4.x support lineage capture from the logical plan, but earlier versions (3.1–3.4) do not, so a backward-compatible solution is needed.
This update resolves the merge plan manually and then executes it through the DataFrame API, so the command flows through Spark's standard analysis and execution pipeline. As a result, Spark data lineage can be captured by tools such as the Spline Spark Agent.
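For reviewers who want to check the behavior locally, the following is a minimal sketch, not part of this change. It registers a QueryExecutionListener, runs a MERGE through the Scala `DeltaTable` API, and prints any merge-like node found in the analyzed plan. The object name `MergeListenerSketch`, the helper `isMergeNode`, the table path, and the sample data are all hypothetical. Matching on the node name is an assumption made to avoid depending on internal Delta classes, and the exact node name reported may differ between versions.

```scala
import io.delta.tables.DeltaTable
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.QueryExecution
import org.apache.spark.sql.util.QueryExecutionListener

object MergeListenerSketch {

  // Hypothetical helper: treat any plan node whose name mentions "Merge"
  // as a merge command. This is a heuristic for illustration only.
  def isMergeNode(nodeName: String): Boolean = nodeName.contains("Merge")

  // Prints the merge-like nodes found in the analyzed plan of each
  // successfully executed query.
  class MergeCapturingListener extends QueryExecutionListener {
    override def onSuccess(funcName: String, qe: QueryExecution, durationNs: Long): Unit = {
      val mergeNodes = qe.analyzed.collect {
        case node if isMergeNode(node.nodeName) => node.nodeName
      }
      mergeNodes.foreach(name => println(s"listener saw merge node: $name (func=$funcName)"))
    }

    override def onFailure(funcName: String, qe: QueryExecution, exception: Exception): Unit = ()
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("merge-listener-sketch")
      .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
      .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog")
      .getOrCreate()
    import spark.implicits._

    spark.listenerManager.register(new MergeCapturingListener)

    // Hypothetical scratch location and sample rows.
    val path = "/tmp/delta-merge-listener-sketch"
    Seq((1, "a"), (2, "b")).toDF("id", "value")
      .write.format("delta").mode("overwrite").save(path)
    val updates = Seq((2, "B"), (3, "c")).toDF("id", "value")

    // MERGE through the Scala API, the code path this change affects.
    DeltaTable.forPath(spark, path).as("t")
      .merge(updates.as("s"), "t.id = s.id")
      .whenMatched().updateAll()
      .whenNotMatched().insertAll()
      .execute()

    spark.stop()
  }
}
```

Listener callbacks may be delivered asynchronously, so output from the listener can appear after `execute()` returns. Other queries, such as the initial write, also reach the listener, which is why the sketch filters on the node name.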
Resolves #1521 (original issue). Covered by existing tests.
References:
(Cherry-pick of #3456)
(Original issue: #1521)