[vParquet3] Add dedicated columns to overrides module #2551
Merged
mapno merged 1 commit into grafana:main-vparquet3 on Jun 13, 2023
stoewer pushed a commit that referenced this pull request on Jun 22, 2023
stoewer pushed a commit that referenced this pull request on Jul 13, 2023
stoewer pushed a commit to stoewer/tempo that referenced this pull request on Jul 13, 2023
stoewer pushed a commit that referenced this pull request on Jul 14, 2023
stoewer pushed a commit to stoewer/tempo that referenced this pull request on Jul 26, 2023
mapno added a commit that referenced this pull request on Jul 26, 2023
)
* [vParquet3] create new block encoding by copying vParquet2
* vParquet3: add dedicated columns to parquet schema and block meta (#2517)
* Re-order schema to keep columns affected by column index changes low
* Add spare columns for dedicated attributes to schema struct
* Add dedicated column config to block meta
* Read and write attributes in dedicated columns
* Make order of dedicated attributes predictable when reading
* Fix existing tests and benchmark
* Run exiting benchmarks and tests with dedicated columns

Co-authored-by: Mario <mariorvinas@gmail.com>

* Add dedicated columns to overrides module (#2551)
* [vParquet3] Write path (#2555)
* Add dedicated columns to overrides and blocks
* Improvements
* Change test
* Fix tests
* Extend ingester_test:
* Add dedicated columns config to storage block
* Review comments
* Add comment
* [vParquet3] dedicated columns read path (#2592)
* Refactor and rename function blockMetaToDedicatedColumnMapping
* Query dedicated attribute columns with TraceQL
* Search tag values in dedicated attribute columns
* Search tags in dedicated attribute columns
* Search for values in dedicated attribute columns in tests
* More consistent naming
* Update block and meta.json in vparquet2/test-data
* Test dedicated column in traceToParquet test
* Format Go code
* Introduce types for dedicated column type and scope
  Replace StaticTypeFromString() with DedicatedColumnType.ToStaticType()
* The function dedicatedColumnsToColumnMapping() can receive multiple scopes
* [vParquet3] Add support for dedicated columns in compactor (#2561)
* Re-order schema to keep columns affected by column index changes low
* Add spare columns for dedicated attributes to schema struct
* Add dedicated column config to block meta
* Read and write attributes in dedicated columns
* Make order of dedicated attributes predictable when reading
* Fix existing tests and benchmark
* Run exiting benchmarks and tests with dedicated columns
* Add dedicated columns to overrides and blocks
* Support dedicated columns in compactor block selection
* Changes to hash
* More tests

---------

Co-authored-by: A. Stoewer <adrian@stoewer.me>

* [vParquet3] pass dedicated columns to querier (#2603)
* Add dedicated columns to SearchBlockRequest message
* Assign SearchBlockRequest dedicated cols from BlockMeta and vice versa
* Encode SearchBlockRequest to http request and vice versa
* Don't add empty dedicated columns when building a search request
* Unit tests with dedicated columns
* Implement dedicated column scope and type as protobuf enums
* [vParquet3] validate dedicated columns configuration (#2616)
* Add validate function
* Refactor: use DedicatedColumns type instead of []DedicatedColumn
* Initialize logger before verifying the config
  This fixes the config verification output
* Check for invalid dedicated columns with '-config.verify true'
* Use ToTempopb() to validate dedicated column scope and type
* [vParquet3] mention feature in CHANGELOG.md
* [vParquet3] Address review comments
* Remove TODO comment about caching the dedicated column hash
* Shorten url param for dedicated columns to 'dc'
* Add function to get latest encoding and use it in tests
* Fix name DedicateColumnsFromTempopb
* [vParquet3] Address more review comments
* Remove 'Test' columns from vParquet3 schema
* Rename async iterator environment variable
* Do not export methods of dedicatedColumnMapping
* Skip dedicated attribute lookup depending on scope in searchTagValues
* Validate maximum number of configured dedicated columns
* Test data for vparquet3 uses dedicated columns
* Reduce size of block meta JSON
* Use 'parquet_' prefix for dedicated column configuration
* [vParquet3] Integration tests with dedicated attribute columns
* Add e2e tests for encodings and dedicated attribute columns
* Use dedicated attribute columns in TestSearchCompleteBlock
* Add support for v2 in encodings test

---------

Co-authored-by: Mario <mariorvinas@gmail.com>
What this PR does:
Adds the dedicated columns config to overrides and wires up the write path to work with dynamic columns.
Which issue(s) this PR fixes:
Fixes #2548
Checklist
CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]