
[DRAFT PR] Add a test case for fix of DB-10138 #1


Draft: wants to merge 1 commit into master

Conversation

ashetkar (Owner) commented Mar 7, 2025

ashetkar pushed a commit that referenced this pull request Apr 11, 2025
Summary:
This revision adds the procedure `yb_index_check()`, which checks whether a given index is consistent with its base relation. This operation doesn't support GIN indexes yet.
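A minimal usage sketch (hypothetical table and index names; the exact invocation form is assumed here, shown as a function call taking the index's regclass):

```
CREATE TABLE accounts (id int PRIMARY KEY, balance int);
CREATE INDEX accounts_balance_idx ON accounts (balance);
INSERT INTO accounts VALUES (1, 100), (2, 200);

-- Assumed invocation: reports an error if the index disagrees with its
-- base relation. GIN indexes are not supported yet.
SELECT yb_index_check('accounts_balance_idx'::regclass);
```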

#### Context
An index row has the following components:
- index attributes: key and non-key columns (if any)
- ybbasectid: system attribute storing the ybctid of base relation row
- ybuniqueidxkeysuffix: system attribute, present only for unique indexes (it is non-null only when the index is in nulls-are-distinct mode and the key columns contain at least one NULL; see the sketch below)
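For context on why the suffix exists: in nulls-are-distinct mode a unique index must admit multiple rows whose key columns are all NULL, and each such index row still needs a distinct DocKey. A standard-SQL illustration:

```
CREATE TABLE u (k int);
CREATE UNIQUE INDEX u_k ON u (k);

-- Both inserts succeed: NULLs are distinct, so the unique index ends up
-- holding two rows with identical (NULL) key columns. Their DocKeys are
-- disambiguated by ybuniqueidxkeysuffix (equal to each row's ybctid).
INSERT INTO u VALUES (NULL);
INSERT INTO u VALUES (NULL);
```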

The structure of an index row is as follows (in `<DocKey>` -> `<DocValue>` format):
- Non-unique index: (key cols, ybbasectid) -> (non-key cols)
- Unique index: (key cols, ybuniqueidxkeysuffix) -> (non-key cols, ybbasectid)

#### Consistency Definition
An index row is consistent if all of its attributes are consistent. An index attribute is consistent if its value and the corresponding base relation value are
- for key attributes: binary equal (semantic equality without binary equality runs the risk of allowing multiple index rows for a given base table row if the key column can have multiple binary representations).
- for non-key attributes: binary or semantically equal.

Note: if both the values are NULL, they are consistent.
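As a concrete illustration of semantic equality without binary equality, `numeric` values can compare equal while carrying different stored representations:

```
SELECT 1.0::numeric = 1.00::numeric AS semantically_equal;  -- true
SELECT 1.0::numeric::text, 1.00::numeric::text;             -- '1.0', '1.00'
```

If only semantic equality were required for key columns, the two representations above could produce two distinct index entries for the same base row.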

#### Consistency Check
Index consistency check is done in two steps:
  # Check for spurious index rows
  # Check for missing index rows

**Part 1: Check for spurious index rows**
Here, we check if the index contains a row that it should not. To do this:

For every index row, fetch the row in the base table (filtered by the partial index predicate) such that baserow.ybctid == indexrow.ybbasectid. If such a row doesn’t exist, use a baserow with all NULL values. The result is the same as a LEFT join on indexrow.ybbasectid = baserow.ybctid with the index table on the left (if the index were a regular relation).
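The join can be pictured with ordinary tables standing in for the index and the base relation (a hypothetical shadow schema; real index rows live in DocDB and cannot be queried like this):

```
-- Shadow model of a partial index ON (v1) WHERE active.
CREATE TABLE base_shadow (ybctid bytea PRIMARY KEY, v1 int, active bool);
CREATE TABLE idx_shadow  (v1 int, ybbasectid bytea);

-- Every index row, joined to its qualifying base row if one exists;
-- unmatched index rows surface with all-NULL base columns.
SELECT i.ybbasectid, i.v1, b.ybctid, b.v1 AS base_v1
FROM idx_shadow i
LEFT JOIN base_shadow b
  ON b.ybctid = i.ybbasectid
 AND b.active;                      -- partial index predicate
```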

Fetch the following columns as the join targetlist:
- from index row: ybbasectid, index attributes, ybuniqueidxkeysuffix (only for unique indexes)
- from base table row: ybctid, columns/expressions corresponding to index attributes

On the joined result, make the following checks:

  # ybbasectid should be non-null
  # ybbasectid should be equal to ybctid
  # index attributes and the corresponding base relation column/expression should be consistent as per the above definition
  # for unique indexes, ybuniqueidxkeysuffix should be non-null iff the index uses nulls-are-distinct mode and the key columns contain at least one NULL. When non-null, it should be equal to ybbasectid
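Continuing the shadow-table sketch above, the per-row checks for a non-unique index can be phrased as a query that returns only violating rows (check #4 is omitted because it applies to unique indexes only):

```
SELECT i.ybbasectid, i.v1
FROM idx_shadow i
LEFT JOIN base_shadow b
  ON b.ybctid = i.ybbasectid
 AND b.active
WHERE i.ybbasectid IS NULL                -- violates check #1
   OR b.ybctid IS NULL                    -- violates check #2
   OR i.v1 IS DISTINCT FROM b.v1;         -- violates check #3 (non-key)
```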

If the above checks pass for every row in the index, it implies that the index does not contain any spurious rows. This can be proved by contradiction as follows:

Let’s assume that the above checks passed for every row in the index, yet it contains a spurious row, namely indexrow1. This index row must satisfy the following:
- indexrow1.ybbasectid != null (from check #1)
- base table has a row, namely baserow, such that baserow.ybctid == indexrow1.ybbasectid (otherwise ybctid would be null and check #2 would have failed)
- index attributes of indexrow1 are consistent with baserow (from check #3)
- If the index is unique, indexrow1.ybuniqueidxkeysuffix is either null or equal to ybbasectid, depending on the index mode and key cols (from check #4)

The above shows that indexrow1 has a valid counterpart in the baserow. Given this, the only possible reason why indexrow1 should not have been present in the index is that another index row, namely indexrow2, must exist such that the pair (indexrow2, baserow) also satisfies the above checks.

We can say that indexrow1 and indexrow2
- have the same ybbasectid (baserow.ybctid == indexrow2.ybbasectid == indexrow1.ybbasectid).
- have binary equal values for key columns. This is because key cols of both index rows are binary equal to the corresponding baserow values (from check #3 and the definition of consistency).
- have identical ybuniqueidxkeysuffix (it depends on index type, mode, and key cols - all of these are already established to be the same for the two index rows).

The DocKey of the index row is created by a subset of (key cols, ybbasectid, ybuniqueidxkeysuffix). Each component is identical for the two index rows, implying identical DocKeys. This is not possible because DocDB does not allow duplicate DocKeys. Hence, such an indexrow1 does not exist.

**Part 2: Check for missing index rows**
This part checks that no entries are missing from the index. Given that it is already established that the index does not contain any spurious rows, it suffices to check that the index row count is what it should be. That is, for every qualifying row in the base table (filtered by the partial index predicate), the index should contain exactly one row.
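In the same shadow model, the Part 2 check reduces to a count comparison (valid only because Part 1 has already ruled out spurious rows):

```
SELECT (SELECT count(*) FROM idx_shadow) =
       (SELECT count(*) FROM base_shadow WHERE active) AS no_missing_rows;
```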

#### Implementation
- To fetch the index row and the corresponding base table row efficiently, batch nested loop join is used (details below).
- Both parts of the check use a single read time. This works out of the box because the entire check is executed as a single YSQL statement.

**Batch Nested Loop Join usage**
Batchable join clauses must be of the form `inner_indexed_var = expression on (outer_vars)` and the expression must not involve functions.

To satisfy the above requirement,
- join condition: baserow.ybctid == indexrow.ybbasectid.
- outer subplan: index relation scan
- inner subplan: base relation scan. BNL expects an index on the var referenced in the join clause (ybctid, in this case). So, a dummy primary key index object on the ybctid column is temporarily created (not persisted in the PG catalog). Like any other PK index in YB, this index points to the base relation and doesn't have a separate DocDB table. Because such an index object doesn't actually exist, the planner is bypassed and the join plan is hardcoded.
Jira: DB-15118

Test Plan:
   ./yb_build.sh --java-test org.yb.pgsql.TestPgRegressYbIndexCheck

Reviewers: amartsinchyk, tnayak

Reviewed By: amartsinchyk

Subscribers: smishra, yql

Differential Revision: https://phorge.dev.yugabyte.com/D41376
ashetkar pushed a commit that referenced this pull request Apr 11, 2025
Summary:
After f85bbca, TServerSharedData is set up as the prepare user data of the shared memory allocator. TServerSharedData is initially mapped to a temporary address before address negotiation runs, then remapped to the negotiated address once the postmaster starts and negotiation completes. But other code may access TServerSharedData both before and after the remap, so there is a data race on the TServerSharedData pointer between code using it and code remapping it.

Additionally, it is possible for a pointer to the temporary address to be retrieved, the shared memory then remapped, and the stale pointer then used, resulting in a SIGSEGV.

This diff puts access to the TServerSharedData pointer behind URCU (ConcurrentValue) and
additionally delays unmapping the temporary address until we have reached a quiescent state after
updating the TServerSharedData pointer.

This fixes TSAN failures from the above data race, with the following trace:
```
[ts-2] WARNING: ThreadSanitizer: data race (pid=10713)
[ts-2]   Write of size 8 at 0x721c00001db8 by thread T48:
[ts-2]     #0 yb::tserver::SharedMemoryManager::InitializeParentAllocatorsAndObjects() ${YB_SRC_ROOT}/src/yb/tserver/tserver_shared_mem.cc:328:9 (libtserver_shared.so+0x11f66)
[ts-2]     #1 yb::tserver::SharedMemoryManager::ExecuteParentNegotiator(std::shared_ptr<yb::AddressSegmentNegotiator> const&) ${YB_SRC_ROOT}/src/yb/tserver/tserver_shared_mem.cc:318:3 (libtserver_shared.so+0x11f66)

[ts-2]   Previous read of size 8 at 0x721c00001db8 by thread T41 (mutexes: write M0):
[ts-2]     #0 yb::tserver::SharedMemoryManager::SharedData() const ${YB_SRC_ROOT}/src/yb/tserver/tserver_shared_mem.h:182:13 (libtserver.so+0x26623d)
[ts-2]     #1 yb::tserver::DbServerBase::shared_object() const ${YB_SRC_ROOT}/src/yb/tserver/db_server_base.cc:136:31 (libtserver.so+0x26623d)
[ts-2]     #2 yb::tserver::TabletServer::SetYsqlDBCatalogVersions(yb::master::DBCatalogVersionDataPB const&) ${YB_SRC_ROOT}/src/yb/tserver/tablet_server.cc:1126:7 (libtserver.so+0x3fafc8)
[ts-2]     #3 yb::tserver::HeartbeatPoller::TryHeartbeat() ${YB_SRC_ROOT}/src/yb/tserver/heartbeater.cc:499:15 (libtserver.so+0x278e87)
[ts-2]     #4 yb::tserver::HeartbeatPoller::Poll() ${YB_SRC_ROOT}/src/yb/tserver/heartbeater.cc:273:19 (libtserver.so+0x276143)
[ts-2]     #5 yb::tserver::MasterLeaderPollScheduler::Impl::Run() ${YB_SRC_ROOT}/src/yb/tserver/master_leader_poller.cc:133:25 (libtserver.so+0x27ec33)
```
Jira: DB-15671

Test Plan: Jenkins

Reviewers: sergei

Reviewed By: sergei

Subscribers: zdrudi, ybase, yql

Differential Revision: https://phorge.dev.yugabyte.com/D42395
ashetkar pushed a commit that referenced this pull request Apr 11, 2025
…select statement with key columns only

Summary:
There are several scenarios where rows are requested by key columns only (including constraints and aggregates like count). The doc reader has an optimization for this kind of read: since the actual values do not matter (key column values can be extracted from the row's doc key), there is no need to decode row values; it is enough to check whether the row is tombstoned.

Unfortunately, this optimization was not updated to take fast backward scan into account: the iterator stops reading as soon as the first record it encounters is not marked with a tombstone value. That is a problem for fast backward scan, which reads a row in reverse order, from the oldest to the newest change (per sub doc key), and the oldest record generally corresponds to the initially inserted values.

The change updates the row existence check: scanning continues until all changes to the row have been taken into account, yielding the actual state of the row.

**Examples of row existence check before the fix**

Example #1
```
Record #1: KEY, T1 => PACKED ROW
Record #2: KEY, COL1, T2 => COL1 NEW VALUE 1

Forward scan / old backward scan:
#1 (alive, T1) => stop
result: row exists.

Fast backward scan:
#2 (alive, T2) => stop
result: row exists.
```

Example #2
```
Record #1: KEY, T3 => DEL
Record #2: KEY, T1 => PACKED ROW
Record #3: KEY, COL1, T2 => COL1 NEW VALUE

Forward scan / old backward scan:
#1 (deleted, T3), #2 (skipped: T1 < T3), #3 (skipped: T2 < T3) => stop
result: row does not exist.

Fast backward scan:
#3 (alive, T2) => stop
result: row exists.
```

Example #3
```
Record #1: KEY, T4 => PACKED ROW
Record #2: KEY, T3 => DEL
Record #3: KEY, T1 => PACKED ROW
Record #4: KEY, COL1, T2 => COL1 NEW VALUE

Forward scan / old backward scan:
#1 (alive, T4) => stop
result: row exists.

Fast backward scan:
#4 (alive, T2) => stop
result: row exists.
```

**Examples of row existence check with the fix**

Example #1
```
Record #1: KEY, T1 => PACKED ROW
Record #2: KEY, COL1, T2 => COL1 NEW VALUE 1

Fast backward scan:
#2 (alive, T2) => #1 (skipped, T1 < T2) => stop
result: row exists.
```

Example #2
```
Record #1: KEY, T3 => DEL
Record #2: KEY, T1 => PACKED ROW
Record #3: KEY, COL1, T2 => COL1 NEW VALUE

Fast backward scan:
#3 (alive, T2) => #2 (skipped: T1 < T2) => #1 (deleted: T3 > T2) => stop
result: row does not exist.
```

Example #3
```
Record #1: KEY, T4 => PACKED ROW
Record #2: KEY, T3 => DEL
Record #3: KEY, T1 => PACKED ROW
Record #4: KEY, COL1, T2 => COL1 NEW VALUE

Fast backward scan:
#4 (alive, T2) => #3 (skipped: T1 < T2) => #2 (deleted: T3 > T2) => #1 (alive: T4 > T3) => stop
result: row exists.
```
Jira: DB-15387

Test Plan:
./yb_build.sh --cxx-test pgwrapper_pg_single_tserver-test --gtest_filter PgSingleTServerTest/PgFastBackwardScanTest.Simple/Fast_WithNulls_kV1
./yb_build.sh --cxx-test pgwrapper_pg_single_tserver-test --gtest_filter PgSingleTServerTest/PgFastBackwardScanTest.Simple/Fast_WithNulls_kV2
./yb_build.sh --cxx-test pgwrapper_pg_single_tserver-test --gtest_filter PgSingleTServerTest/PgFastBackwardScanTest.Simple/Fast_WithNulls_kNone
./yb_build.sh --cxx-test pgwrapper_pg_single_tserver-test --gtest_filter PgSingleTServerTest/PgFastBackwardScanTest.Simple/Fast_WithoutNulls_kV1
./yb_build.sh --cxx-test pgwrapper_pg_single_tserver-test --gtest_filter PgSingleTServerTest/PgFastBackwardScanTest.Simple/Fast_WithoutNulls_kV2
./yb_build.sh --cxx-test pgwrapper_pg_single_tserver-test --gtest_filter PgSingleTServerTest/PgFastBackwardScanTest.Simple/Fast_WithoutNulls_kNone
./yb_build.sh --cxx-test pgwrapper_pg_single_tserver-test --gtest_filter PgSingleTServerTest/PgFastBackwardScanTest.Simple/Slow_WithNulls_kV1
./yb_build.sh --cxx-test pgwrapper_pg_single_tserver-test --gtest_filter PgSingleTServerTest/PgFastBackwardScanTest.Simple/Slow_WithNulls_kV2
./yb_build.sh --cxx-test pgwrapper_pg_single_tserver-test --gtest_filter PgSingleTServerTest/PgFastBackwardScanTest.Simple/Slow_WithNulls_kNone
./yb_build.sh --cxx-test pgwrapper_pg_single_tserver-test --gtest_filter PgSingleTServerTest/PgFastBackwardScanTest.Simple/Slow_WithoutNulls_kV1
./yb_build.sh --cxx-test pgwrapper_pg_single_tserver-test --gtest_filter PgSingleTServerTest/PgFastBackwardScanTest.Simple/Slow_WithoutNulls_kV2
./yb_build.sh --cxx-test pgwrapper_pg_single_tserver-test --gtest_filter PgSingleTServerTest/PgFastBackwardScanTest.Simple/Slow_WithoutNulls_kNone

**manual testing:**
```
SET yb_use_hash_splitting_by_default = FALSE;
CREATE TABLE t1(c0 int UNIQUE DEFAULT 1);
INSERT INTO t1(c0) VALUES (2);
DELETE FROM t1;
SELECT * FROM t1;
SELECT * FROM t1 GROUP BY t1.c0 ORDER BY t1.c0 DESC;
```

Reviewers: sergei, rthallam

Reviewed By: sergei, rthallam

Subscribers: ybase, yql

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D42212
ashetkar pushed a commit that referenced this pull request Apr 11, 2025
Summary:
After commit f85bbca, the vmodule flag is no longer respected by the postgres process; for example:
```
ybd release --cxx-test pgwrapper_pg_analyze-test --gtest_filter PgAnalyzeTest.AnalyzeSamplingColocated --test-args '--vmodule=pg_sample=1' -n 2 -- -p 1 -k
zgrep pg_sample ~/logs/latest_test/1.log
```
shows no vlogs.

The reason is that `VLOG(1)` is used early by
```
#0  0x00007f7e1b48b090 in google::InitVLOG3__(google::SiteFlag*, int*, char const*, int)@plt () from /net/dev-server-timur/share/code/yugabyte-db/build/debug-clang19-dynamic-ninja/lib/libyb_util_shmem.so
#1  0x00007f7e1b47616e in yb::(anonymous namespace)::NegotiatorSharedState::WaitProposal (this=0x7f7e215e8000) at ../../src/yb/util/shmem/reserved_address_segment.cc:108
#2  0x00007f7e1b4781e0 in yb::AddressSegmentNegotiator::Impl::NegotiateChild (fd=45) at ../../src/yb/util/shmem/reserved_address_segment.cc:252
#3  0x00007f7e1b4737ce in yb::AddressSegmentNegotiator::NegotiateChild (fd=45) at ../../src/yb/util/shmem/reserved_address_segment.cc:376
#4  0x00007f7e1b742b7b in yb::tserver::SharedMemoryManager::InitializePostmaster (this=0x7f7e202e9788 <yb::pggate::PgSharedMemoryManager()::shared_mem_manager>, fd=45) at ../../src/yb/tserver/tserver_shared_mem.cc:252
#5  0x00007f7e2023588f in yb::pggate::PgSetupSharedMemoryAddressSegment () at ../../src/yb/yql/pggate/pg_shared_mem.cc:29
#6  0x00007f7e202788e9 in YBCSetupSharedMemoryAddressSegment () at ../../src/yb/yql/pggate/ybc_pg_shared_mem.cc:22
#7  0x000055636b8956f5 in PostmasterMain (argc=21, argv=0x52937fe4e790) at ../../../../../../src/postgres/src/backend/postmaster/postmaster.c:1083
#8  0x000055636b774bfe in PostgresServerProcessMain (argc=21, argv=0x52937fe4e790) at ../../../../../../src/postgres/src/backend/main/main.c:209
#9  0x000055636b7751f2 in main ()
```
and caches the `vmodule` value before `InitGFlags` sets it from the environment.

The fix is to explicitly call `UpdateVmodule` from `InitGFlags` after setting `vmodule`.
Jira: DB-15888

Test Plan:
```
ybd release --cxx-test pgwrapper_pg_analyze-test --gtest_filter PgAnalyzeTest.AnalyzeSamplingColocated --test-args '--vmodule=pg_sample=1' -n 2 -- -p 1 -k
zgrep pg_sample ~/logs/latest_test/1.log
```

Reviewers: hsunder

Reviewed By: hsunder

Subscribers: ybase, yql

Tags: #jenkins-ready, #jenkins-trigger

Differential Revision: https://phorge.dev.yugabyte.com/D42731
ashetkar pushed a commit that referenced this pull request Jun 4, 2025
…rdup for tablegroup_name

Summary:
As part of D36859 / 0dbe7d6, backup and restore support for colocated tables when multiple tablespaces exist was introduced. Upon fetching the tablegroup_name from `pg_yb_tablegroup`, the value was read and assigned via `PQgetvalue` without copying. This led to a use-after-free bug when the tablegroup_name was later read in dumpTableSchema, because the result of the SQL query is cleared immediately afterwards (`PQclear`).

```
[P-yb-controller-1] ==3037==ERROR: AddressSanitizer: heap-use-after-free on address 0x51d0002013e6 at pc 0x55615b0a1f92 bp 0x7fff92475970 sp 0x7fff92475118
[P-yb-controller-1] READ of size 8 at 0x51d0002013e6 thread T0
[P-yb-controller-1]     #0 0x55615b0a1f91 in strcmp ${YB_LLVM_TOOLCHAIN_DIR}/src/llvm-project/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:470:5
[P-yb-controller-1]     #1 0x55615b1b90ba in dumpTableSchema ${YB_SRC_ROOT}/src/postgres/src/bin/pg_dump/pg_dump.c:15789:8
[P-yb-controller-1]     #2 0x55615b178163 in dumpTable ${YB_SRC_ROOT}/src/postgres/src/bin/pg_dump/pg_dump.c:15299:4
[P-yb-controller-1]     #3 0x55615b178163 in dumpDumpableObject ${YB_SRC_ROOT}/src/postgres/src/bin/pg_dump/pg_dump.c:10216:4
[P-yb-controller-1]     #4 0x55615b178163 in main ${YB_SRC_ROOT}/src/postgres/src/bin/pg_dump/pg_dump.c:1019:3
[P-yb-controller-1]     #5 0x7f3c0184e7e4 in __libc_start_main (/lib64/libc.so.6+0x3a7e4) (BuildId: fd70eb98f80391a177070fcb8d757a63fe49b802)
[P-yb-controller-1]     #6 0x55615b0894bd in _start (${BUILD_ROOT}/postgres/bin/ysql_dump+0x10d4bd)
[P-yb-controller-1]
[P-yb-controller-1] 0x51d0002013e6 is located 358 bytes inside of 2048-byte region [0x51d000201280,0x51d000201a80)
[P-yb-controller-1] freed by thread T0 here:
[P-yb-controller-1]     #0 0x55615b127196 in free ${YB_LLVM_TOOLCHAIN_DIR}/src/llvm-project/compiler-rt/lib/asan/asan_malloc_linux.cpp:52:3
[P-yb-controller-1]     #1 0x7f3c02d65e85 in PQclear ${YB_SRC_ROOT}/src/postgres/src/interfaces/libpq/fe-exec.c:755:3
[P-yb-controller-1]     #2 0x55615b1c0103 in getYbTablePropertiesAndReloptions ${YB_SRC_ROOT}/src/postgres/src/bin/pg_dump/pg_dump.c:19108:4
[P-yb-controller-1]     #3 0x55615b1b8fab in dumpTableSchema ${YB_SRC_ROOT}/src/postgres/src/bin/pg_dump/pg_dump.c:15765:3
[P-yb-controller-1]     #4 0x55615b178163 in dumpTable ${YB_SRC_ROOT}/src/postgres/src/bin/pg_dump/pg_dump.c:15299:4
[P-yb-controller-1]     #5 0x55615b178163 in dumpDumpableObject ${YB_SRC_ROOT}/src/postgres/src/bin/pg_dump/pg_dump.c:10216:4
[P-yb-controller-1]     #6 0x55615b178163 in main ${YB_SRC_ROOT}/src/postgres/src/bin/pg_dump/pg_dump.c:1019:3
[P-yb-controller-1]     #7 0x7f3c0184e7e4 in __libc_start_main (/lib64/libc.so.6+0x3a7e4) (BuildId: fd70eb98f80391a177070fcb8d757a63fe49b802)
[P-yb-controller-1]
[P-yb-controller-1] previously allocated by thread T0 here:
[P-yb-controller-1]     #0 0x55615b12742f in malloc ${YB_LLVM_TOOLCHAIN_DIR}/src/llvm-project/compiler-rt/lib/asan/asan_malloc_linux.cpp:68:3
[P-yb-controller-1]     #1 0x7f3c02d680a7 in pqResultAlloc ${YB_SRC_ROOT}/src/postgres/src/interfaces/libpq/fe-exec.c:633:28
[P-yb-controller-1]     #2 0x7f3c02d81294 in getRowDescriptions ${YB_SRC_ROOT}/src/postgres/src/interfaces/libpq/fe-protocol3.c:544:4
[P-yb-controller-1]     #3 0x7f3c02d7f793 in pqParseInput3 ${YB_SRC_ROOT}/src/postgres/src/interfaces/libpq/fe-protocol3.c:324:11
[P-yb-controller-1]     #4 0x7f3c02d6bcc8 in parseInput ${YB_SRC_ROOT}/src/postgres/src/interfaces/libpq/fe-exec.c:2014:2
[P-yb-controller-1]     #5 0x7f3c02d6bcc8 in PQgetResult ${YB_SRC_ROOT}/src/postgres/src/interfaces/libpq/fe-exec.c:2100:3
[P-yb-controller-1]     #6 0x7f3c02d6cd87 in PQexecFinish ${YB_SRC_ROOT}/src/postgres/src/interfaces/libpq/fe-exec.c:2417:19
[P-yb-controller-1]     #7 0x7f3c02d6cd87 in PQexec ${YB_SRC_ROOT}/src/postgres/src/interfaces/libpq/fe-exec.c:2256:9
[P-yb-controller-1]     #8 0x55615b1f45df in ExecuteSqlQuery ${YB_SRC_ROOT}/src/postgres/src/bin/pg_dump/pg_backup_db.c:296:8
[P-yb-controller-1]     #9 0x55615b1f4213 in ExecuteSqlQueryForSingleRow ${YB_SRC_ROOT}/src/postgres/src/bin/pg_dump/pg_backup_db.c:311:8
[P-yb-controller-1]     #10 0x55615b1c008d in getYbTablePropertiesAndReloptions ${YB_SRC_ROOT}/src/postgres/src/bin/pg_dump/pg_dump.c:19102:10
[P-yb-controller-1]     #11 0x55615b1b8fab in dumpTableSchema ${YB_SRC_ROOT}/src/postgres/src/bin/pg_dump/pg_dump.c:15765:3
[P-yb-controller-1]     #12 0x55615b178163 in dumpTable ${YB_SRC_ROOT}/src/postgres/src/bin/pg_dump/pg_dump.c:15299:4
[P-yb-controller-1]     #13 0x55615b178163 in dumpDumpableObject ${YB_SRC_ROOT}/src/postgres/src/bin/pg_dump/pg_dump.c:10216:4
[P-yb-controller-1]     #14 0x55615b178163 in main ${YB_SRC_ROOT}/src/postgres/src/bin/pg_dump/pg_dump.c:1019:3
[P-yb-controller-1]     #15 0x7f3c0184e7e4 in __libc_start_main (/lib64/libc.so.6+0x3a7e4) (BuildId: fd70eb98f80391a177070fcb8d757a63fe49b802)
```

This revision fixes the issue by using `pg_strdup` to make a copy of the string.
Jira: DB-15915

Test Plan: ./yb_build.sh asan --cxx-test integration-tests_xcluster_ddl_replication-test --gtest_filter XClusterDDLReplicationTest.DDLReplicationTablesNotColocated

Reviewers: aagrawal, skumar, mlillibridge, sergei

Reviewed By: aagrawal, sergei

Subscribers: sergei, yql

Differential Revision: https://phorge.dev.yugabyte.com/D43386
ashetkar pushed a commit that referenced this pull request Jun 4, 2025
…terTSConfig

Summary:
Fixing this ASAN error:
```
../../../../../../src/postgres/src/backend/commands/event_trigger.c:1817:36: runtime error: null pointer passed as argument 2, which is declared to never be null
/usr/include/string.h:44:28: note: nonnull attribute specified here
```
Call stack:
```
#0 0x55597fecb32d in EventTriggerCollectAlterTSConfig : src/postgres/src/backend/commands/event_trigger.c:1817:2
#1 0x55598004305e in DropConfigurationMapping : src/postgres/src/backend/commands/tsearchcmds.c:1518:2
```
The function `EventTriggerCollectAlterTSConfig()` is called with a `NULL` argument in the PG code:
`static void DropConfigurationMapping(......) {`
`EventTriggerCollectAlterTSConfig(stmt, cfgId, NULL, 0);`
This `NULL` can reach `memcpy()`:
`memcpy(command->d.atscfg.dictIds, dictIds, sizeof(Oid) * ndicts);    <<<  dictIds==NULL here`

This case was not anticipated by the code; the fix handles it.

Test Plan: ./yb_build.sh asan --java-test org.yb.pgsql.TestPgRegressTypesString#schedule

Reviewers: mihnea, kfranz

Reviewed By: kfranz

Subscribers: yql

Differential Revision: https://phorge.dev.yugabyte.com/D43395
ashetkar pushed a commit that referenced this pull request Jun 4, 2025
…lations

Summary:
This revision fixes two bugs in foreign keys that reference partitioned tables when the root and the leaf PK relations have different column orders. The details of the bugs are as follows:

  # When the foreign key is created using ALTER TABLE, the cleanup after the constraint validation is not done properly, leading to `TupleDesc reference leak` warnings. Specifically, the tuples in `EState.es_tupleTable` are not reset. It contains the ResultRelInfo.ri_PartitionTupleSlot when the root and leaf relations have different column orders.

  # When the root and leaf relations have different column orders, ExecGetChildToRootMap() allocates the tuple conversion map, if not already done. `YbFindReferencedPartition()` incorrectly calls this function in the current memory context; rather, it should be called in the query context. This leads to use-after-free when the foreign key constraint is in DEFERRED mode. Below is a detailed explanation:

When inserting into a relation with fk constraint, YbFindReferencedPartition() is invoked in two code paths with different memory context in each case:

invocation #1: AfterTriggerSaveEvent -> YbAddTriggerFKReferenceIntent -> YBCBuildYBTupleIdDescriptor -> YbFindReferencedPartition: In this case, it is invoked in per-query memory context.
invocation #2: afterTriggerInvokeEvents -> AfterTriggerExecute -> ExecCallTriggerFunc -> RI_FKey_check_ins -> RI_FKey_check -> YBCBuildYBTupleIdDescriptor -> YbFindReferencedPartition: In this case, it is invoked in per-tuple memory context.

In the IMMEDIATE mode, the same executor state is used in both invocations. Invocation #1 allocates the map in the per-query memory context, and invocation #2 just reads it, without needing to reallocate it.
In the DEFERRED case, the executor states are different. afterTriggerInvokeEvents() creates a new local executor state to execute the triggers. Hence, invocation #2 needs to allocate the map, but it ends up doing so in the (incorrect) per-tuple memory context. Consider the case when two (or more) tuples are batch-inserted into the FK relation. The first trigger execution invokes ExecGetChildToRootMap(), which marks ri_ChildToRootMapValid as valid and allocates the map in the per-tuple context. This context is reset before the next trigger execution. The second trigger execution segfaults because ri_ChildToRootMapValid indicates the map is valid, but that is no longer the case.
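A hypothetical shape of the triggering scenario (all names invented for illustration; the essential ingredients are a leaf partition whose column order differs from the root and a DEFERRED foreign key checked for a batch of rows):

```
CREATE TABLE pk_root (a int PRIMARY KEY, b int) PARTITION BY RANGE (a);
CREATE TABLE pk_leaf (b int, a int NOT NULL);           -- columns reordered
ALTER TABLE pk_root ATTACH PARTITION pk_leaf FOR VALUES FROM (0) TO (100);
INSERT INTO pk_root VALUES (1, 10), (2, 20);

CREATE TABLE fk_tab (a int REFERENCES pk_root (a)
                     DEFERRABLE INITIALLY DEFERRED);
BEGIN;
INSERT INTO fk_tab VALUES (1), (2);  -- two FK checks are queued
COMMIT;  -- deferred triggers fire here; before the fix, the second check
         -- could read the conversion map freed with the per-tuple context
```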

In passing, also initialise estate->yb_es_pk_proutes to NIL to keep the behaviour the same as other fields of list type in EState.
Jira: DB-16152

Test Plan:
./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressForeignKey'

Close: #26772

Reviewers: dmitry, kramanathan, #db-approvers

Reviewed By: kramanathan, #db-approvers

Subscribers: svc_phabricator, yql

Differential Revision: https://phorge.dev.yugabyte.com/D43893
ashetkar pushed a commit that referenced this pull request Jun 4, 2025
Summary:
Added a `check_data_blocks` command to the sst_dump tool. It checks the consistency of all data blocks using the SST file index and per-block checksums.
The `--skip_uncompress` option makes the `check_data_blocks` command skip uncompressing data blocks (false by default).

If any corrupt data blocks are found, the tool generates `dd` commands that can be used to save specific blocks (the corrupted block plus one block before and after it) for analysis.

Example usage:
```
./sst_dump --command=check_data_blocks --file=000046.sst 2>&1 | tee sst_dump.out
cat sst_dump.out | grep '^dd '
```
Jira: DB-16720

Test Plan:
Tested manually on artificially corrupt SST file:
```
~/code/yugabyte5/build/latest/bin/sst_dump --command=check_data_blocks --file=000046.sst 2>&1 | tee sst_dump.out
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0517 07:21:12.046017  3481 sst_dump_tool.cc:433] Checking data block #0 handle: BlockHandle { offset: 0 size: 3927 }
W0517 07:21:12.051733  3481 file_reader_writer.cc:101] Read attempt #1 failed in file 000046.sst.sblock.0 : Corruption (yb/rocksdb/table/format.cc:345): Block checksum mismatch in file: 000046.sst.sblock.0, block handle: BlockHandle { offset: 996452 size: 4024 }, expected checksum: 1118200703, actual checksum: 1023486239.
W0517 07:21:12.051852  3481 file_reader_writer.cc:101] Read attempt #2 failed in file 000046.sst.sblock.0 : Corruption (yb/rocksdb/table/format.cc:345): Block checksum mismatch in file: 000046.sst.sblock.0, block handle: BlockHandle { offset: 996452 size: 4024 }, expected checksum: 1118200703, actual checksum: 1023486239.
E0517 07:21:12.051862  3481 format.cc:466] ReadBlockContents: Corruption (yb/rocksdb/table/format.cc:345): Block checksum mismatch in file: 000046.sst.sblock.0, block handle: BlockHandle { offset: 996452 size: 4024 }, expected checksum: 1118200703, actual checksum: 1023486239.
    @     0x7fd68e2eafde  rocksdb::SstFileReader::CheckDataBlocks()
    @     0x7fd68e2ec8ca  rocksdb::SSTDumpTool::Run()
    @     0x555855a53f78  main
    @     0x7fd688e45ca3  __libc_start_main
    @     0x555855a53eae  _start
W0517 07:21:12.056023  3481 sst_dump_tool.cc:441] Failed to read block with handle: BlockHandle { offset: 996452 size: 4024 }. Corruption (yb/rocksdb/table/format.cc:345): Block checksum mismatch in file: 000046.sst.sblock.0, block handle: BlockHandle { offset: 996452 size: 4024 }, expected checksum: 1118200703, actual checksum: 1023486239.
from [] to []
Process 000046.sst
Sst file format: block-based
dd if="000046.sst.sblock.0" bs=1 skip=992433 count=4014 of="000046.sst.sblock.0.offset_992433.size_4014.part"
dd if="000046.sst.sblock.0" bs=1 skip=996452 count=4024 of="000046.sst.sblock.0.offset_996452.size_4024.part"
dd if="000046.sst.sblock.0" bs=1 skip=1000481 count=4019 of="000046.sst.sblock.0.offset_1000481.size_4019.part"
W0517 07:21:12.634256  3481 file_reader_writer.cc:101] Read attempt #1 failed in file 000046.sst.sblock.0 : Corruption (yb/rocksdb/table/format.cc:345): Block checksum mismatch in file: 000046.sst.sblock.0, block handle: BlockHandle { offset: 119754675 size: 11766 }, expected checksum: 84894407, actual checksum: 2295535311.
W0517 07:21:12.634332  3481 file_reader_writer.cc:101] Read attempt #2 failed in file 000046.sst.sblock.0 : Corruption (yb/rocksdb/table/format.cc:345): Block checksum mismatch in file: 000046.sst.sblock.0, block handle: BlockHandle { offset: 119754675 size: 11766 }, expected checksum: 84894407, actual checksum: 2295535311.
E0517 07:21:12.634339  3481 format.cc:466] ReadBlockContents: Corruption (yb/rocksdb/table/format.cc:345): Block checksum mismatch in file: 000046.sst.sblock.0, block handle: BlockHandle { offset: 119754675 size: 11766 }, expected checksum: 84894407, actual checksum: 2295535311.
    @     0x7fd68e2eafde  rocksdb::SstFileReader::CheckDataBlocks()
    @     0x7fd68e2ec8ca  rocksdb::SSTDumpTool::Run()
    @     0x555855a53f78  main
    @     0x7fd688e45ca3  __libc_start_main
    @     0x555855a53eae  _start
W0517 07:21:12.638135  3481 sst_dump_tool.cc:441] Failed to read block with handle: BlockHandle { offset: 119754675 size: 11766 }. Corruption (yb/rocksdb/table/format.cc:345): Block checksum mismatch in file: 000046.sst.sblock.0, block handle: BlockHandle { offset: 119754675 size: 11766 }, expected checksum: 84894407, actual checksum: 2295535311.
dd if="000046.sst.sblock.0" bs=1 skip=119742901 count=11769 of="000046.sst.sblock.0.offset_119742901.size_11769.part"
dd if="000046.sst.sblock.0" bs=1 skip=119754675 count=11766 of="000046.sst.sblock.0.offset_119754675.size_11766.part"
dd if="000046.sst.sblock.0" bs=1 skip=119766446 count=1384 of="000046.sst.sblock.0.offset_119766446.size_1384.part"
```

```
cat sst_dump.out | grep '^dd '
dd if="000046.sst.sblock.0" bs=1 skip=992433 count=4014 of="000046.sst.sblock.0.offset_992433.size_4014.part"
dd if="000046.sst.sblock.0" bs=1 skip=996452 count=4024 of="000046.sst.sblock.0.offset_996452.size_4024.part"
dd if="000046.sst.sblock.0" bs=1 skip=1000481 count=4019 of="000046.sst.sblock.0.offset_1000481.size_4019.part"
dd if="000046.sst.sblock.0" bs=1 skip=119742901 count=11769 of="000046.sst.sblock.0.offset_119742901.size_11769.part"
dd if="000046.sst.sblock.0" bs=1 skip=119754675 count=11766 of="000046.sst.sblock.0.offset_119754675.size_11766.part"
dd if="000046.sst.sblock.0" bs=1 skip=119766446 count=1384 of="000046.sst.sblock.0.offset_119766446.size_1384.part"
```

Reviewers: hsunder, yyan

Reviewed By: yyan

Subscribers: ybase

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D44035
ashetkar pushed a commit that referenced this pull request Jun 4, 2025
Summary:
Currently, `CREATE RULE` does not bump the catalog version under the assumption that creating a rule can only insert entries into system catalog tables but not update or delete them.

However, this assumption is false; we need to bump the catalog version in order to maintain behavior parity with vanilla PG.

### What are rules?

Rules allow users to specify additional commands that are to be performed automatically whenever an insert/update/delete is executed on a given table. For example:

```
CREATE RULE redirect_to_destination AS ON INSERT TO intermediate_table
    WHERE NEW.id > 25 DO INSTEAD
INSERT INTO destination_table VALUES (NEW.id, NEW.name);
```

This rule redirects inserts from `intermediate_table` to `destination_table` instead when the inserted values fall into the given range.

### Rules internals

PG stores rules in the `pg_rewrite` table. Additionally, there is a column in `pg_class` (`relhasrules`) that specifies whether the relation has rules on it.

The rules and the relhasrules flag are stored in the relcache. When we build the relcache entry for a table, first we check if `relhasrules` is true. If so, we load the rules from `pg_rewrite`; otherwise, we just set the rules to null:

```
lang=c,name=relcache.c
if (relation->rd_rel->relhasrules)
	RelationBuildRuleLock(relation);
else
{
	relation->rd_rules = NULL;
	relation->rd_rulescxt = NULL;
}
```

We read the rules during query planning in `RewriteQuery()`:

```
lang=c,name=rewriteHandler.c
locks = matchLocks(event, rt_entry_relation->rd_rules,
					result_relation, parsetree, &hasUpdate);

product_orig_rt_length = list_length(parsetree->rtable);
product_queries = fireRules(parsetree,
                            result_relation,
                            event,
                            locks,
                            &instead,
                            &returning,
                            &qual_product);
```

### How the issue occurs

The issue comes up during the following scenario:
1. On backend #1, we build the relcache entry for `intermediate_table` before there are any rules on it. This could happen during relcache preloading, when we build the relcache entries for all user tables, or during an ad-hoc load of the table's relcache entry.
2. Create the rule on `intermediate_table` on backend #2. This creates a row in `pg_rewrite` and flips `relhasrules` to `true`, but the relcache entries in backend #1 stay the same because we don't bump the catalog version.
3. On backend #1, do an insert on `intermediate_table` that should trigger the rule. Because the relcache entry still has `rd_rules=NULL`, none of the rules fire; see the sketch below.
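A hedged end-to-end SQL sketch of the scenario (session labels as comments; the rule and table names are taken from the example above):

```
-- Session 1: build the relcache entry while no rules exist.
CREATE TABLE intermediate_table (id int, name text);
CREATE TABLE destination_table (id int, name text);
SELECT count(*) FROM intermediate_table;  -- loads the entry, rd_rules = NULL

-- Session 2: create the rule; relhasrules flips to true in pg_class.
CREATE RULE redirect_to_destination AS ON INSERT TO intermediate_table
    WHERE NEW.id > 25 DO INSTEAD
INSERT INTO destination_table VALUES (NEW.id, NEW.name);

-- Session 1 again: without a catalog version bump, the stale relcache
-- entry still has rd_rules = NULL, so the rule does not fire.
INSERT INTO intermediate_table VALUES (30, 'x');
SELECT count(*) FROM destination_table;  -- expected 1, got 0 before the fix
```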
Jira: DB-16686

Test Plan:
```
./yb_build.sh release --sj --cxx-test pg_catalog_version-test --gtest-filter "*CreateRule*"
```

Reviewers: myang

Reviewed By: myang

Subscribers: yql

Differential Revision: https://phorge.dev.yugabyte.com/D44279