Fix potential hash collisions of traceql StringArray + compare() improvement#5835
Fix potential hash collisions of traceql StringArray + compare() improvement#5835mdisibio merged 4 commits intografana:mainfrom
Conversation
…re-computations in compare() function, replace usage of unsafe.StringData
| _, _ = h.Write(seedBytes) | ||
| _, _ = h.Write([]byte(str)) |
There was a problem hiding this comment.
We use a similar approach for avoiding hash collisions for the dedicated columns in block_meta.go:L377
I'm not sure what the best method is, but it would be nice to use the same in both places
There was a problem hiding this comment.
Can you clarify - are you thinking the same seed/separator bytes is enough? Or perhaps shared code. There is a small difference here in that an empty string is ok for attribute values, but not for attribute name in the dedicated columns config.
There was a problem hiding this comment.
Yeah, just suggesting to use the same separator and seed for consistency, not necessarily sharing the same code. It could also be a nice touch to switch from xxhash to fnv (as we don't use xxhash anywhere else).
Mainly suggesting this to make things a bit more consistent. Totally fine if you think it’s out of scope for this PR - don’t want to block it
What this PR does:
This fixes a potential hash collision of StringArray static MapKey. Similar to #5716 (comment) we need to write a separator between values, so that
"foo","bar"hashes to a different value than"foobar". Also ports the missing clone of attribute buffers to vparquet5, which was causing panics that caused me to come across this potential hash collision.A few other changes:
[]byte(string)which the compiler can optimize in this case to be zero alloc (confirmed with benchmark).Which issue(s) this PR fixes:
Fixes n/a
Checklist
CHANGELOG.mdupdated - the order of entries should be[CHANGE],[FEATURE],[ENHANCEMENT],[BUGFIX]