-
Notifications
You must be signed in to change notification settings - Fork 1.9k
[Kernel]support RowTrackingDomainMetadata in Transaction.addDomainMetadata #4856
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
[Kernel]support RowTrackingDomainMetadata in Transaction.addDomainMetadata #4856
Conversation
@@ -93,6 +94,7 @@ public class TransactionImpl implements Transaction { | |||
private int maxRetries; | |||
private int logCompactionInterval; | |||
private Optional<CRCInfo> currentCrcInfo; | |||
private Optional<Long> providedRowIdHighWatermark = Optional.empty(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you move this to domainMetadataState?
@@ -252,7 +261,8 @@ private long commitWithRetry( | |||
protocol, | |||
Optional.empty() /* winningTxnRowIdHighWatermark */, | |||
dataActions, | |||
resolvedDomainMetadatas); | |||
resolvedDomainMetadatas, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we do the check outside updateRowIdHighWatermarkIfNeeded
so we don't even need to change the signature of pdateRowIdHighWatermarkIfNeeded
kernel/kernel-api/src/main/java/io/delta/kernel/internal/DeltaErrors.java
Show resolved
Hide resolved
Set<String> SUPPORTED_SYSTEM_DOMAINS = new HashSet<>(); | ||
SUPPORTED_SYSTEM_DOMAINS.add(RowTrackingMetadataDomain.DOMAIN_NAME); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make this a static variable
@@ -74,18 +74,21 @@ public class ConflictChecker { | |||
|
|||
// Helper states during conflict resolution | |||
private Optional<Long> lastWinningRowIdHighWatermark = Optional.empty(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move this into the private final
section
} | ||
} | ||
}); | ||
|
||
if (currRowIdHighWatermark.get() != prevRowIdHighWatermark) { | ||
// If the txn builder has explicitly provided a row ID high watermark, we should use that value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to do a conflict resolution when the user provides a high watermark? I think the intention when they provide a high watermark is the connector wants to explicitly dictate the high watermark and the commit contents. Rebasing when conflicts could create unintended problems and can make the code too complex.
Which Delta project/connector is this regarding?
Description
Support the pass the system domain metadata
RowTrackingDomainMetadata
in the Transaction.addDomianMetadata API, which would be used to set the highWaterMark of row tracking feature. This provided high water mark must be larger or same as the current largest row id, otherwise we would throw the exception.This is required for supporting IcebergCompatV3 table as we need to keep the Iceberg nextRowId and Delta highWaterMark in sync.
How was this patch tested?
Unit tests
Does this PR introduce any user-facing changes?