Skip to content

Store trielog as blobdb#6289

Merged
usmansaleem merged 20 commits intobesu-eth:mainfrom
usmansaleem:trielog_blobdb_fixed
Jan 31, 2024
Merged

Store trielog as blobdb#6289
usmansaleem merged 20 commits intobesu-eth:mainfrom
usmansaleem:trielog_blobdb_fixed

Conversation

@usmansaleem
Copy link
Copy Markdown
Contributor

@usmansaleem usmansaleem commented Dec 13, 2023

PR description

Store trielog as blobdb to improve performance. #5475 has shown that blobdb is better choice for static data as it reduced the rocksdb write amplifications during compaction. Manually tested upgrade and downgrade.

Graph showing bytes written during compaction. The yellow bar represents the node with "trielog as blobdb" PR applied (blobdb-2), the green bar represents the node without the blobdb patch (blobdb-3). On average, bytes written during compaction is lower for "trielog as blobdb vs trielog w/o blobdb":

image

Following graph shows fork choice update time (24 hours). There doesn't seem to be any negative impact of writing trielog as blobdb.

image

Fixed Issue(s)

See #5866

Signed-off-by: Usman Saleem <usman@usmans.info>
@github-actions
Copy link
Copy Markdown

github-actions bot commented Dec 13, 2023

  • I thought about documentation and added the doc-change-required label to this PR if updates are required.
  • I thought about the changelog and included a changelog update if required.
  • If my PR includes database changes (e.g. KeyValueSegmentIdentifier) I have thought about compatibility and performed forwards and backwards compatibility tests

@usmansaleem usmansaleem marked this pull request as ready for review January 16, 2024 22:04
Signed-off-by: Usman Saleem <usman@usmans.info>
@usmansaleem usmansaleem added TeamGroot GH issues worked on by Groot Team and removed TeamGroot GH issues worked on by Groot Team labels Jan 16, 2024
@siladu
Copy link
Copy Markdown
Contributor

siladu commented Jan 19, 2024

If my PR includes database changes (e.g. KeyValueSegmentIdentifier) I have thought about compatibility and performed forwards and backwards compatibility tests

How was this part tested?

@siladu
Copy link
Copy Markdown
Contributor

siladu commented Jan 19, 2024

@usmansaleem I'm not really sure what the graph in the description is showing, can you add some explanation please?

Also, have a feeling we discussed it last year, but can't remember what the outcome of the following test was?

Verify if pruning has an impact on performance due to compaction #5866

@siladu
Copy link
Copy Markdown
Contributor

siladu commented Jan 19, 2024

@usmansaleem I'm not really sure what the graph in the description is showing, can you add some explanation please?

ah, just spotted #5866 (comment)

Have you got a graph for the trie log pruning enabled?


Last one...

Should we have a feature flag for this or should it be enabled by default?

I guess the answer to this was enabled by default because it's possible to downgrade to an older version of the db?

@jframe
Copy link
Copy Markdown
Contributor

jframe commented Jan 19, 2024

Store trielog as blobdb to improve performance. #5475 has shown that blobdb is better choice for static data as it reduced the rocksdb write amplifications during compaction. Manually tested upgrade and downgrade.

Do you have a graph on the impact on the engine API FCU and new payload?

Signed-off-by: Usman Saleem <usman@usmans.info>
Signed-off-by: Usman Saleem <usman@usmans.info>
Signed-off-by: Usman Saleem <usman@usmans.info>
Copy link
Copy Markdown
Contributor

@fab-10 fab-10 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since after #6445 there can be deletions to this column family, there is the need to enable, and tune, blob garbage collection for it, since now it is it disabled by default (blockchain data cf is meant to be append only with no deletions*)

  • there is an experimental feature for pruning the blockchain data as well, so enabling blob garbage collections is useful also there

Signed-off-by: Usman Saleem <usman@usmans.info>
Signed-off-by: Usman Saleem <usman@usmans.info>
Signed-off-by: Usman Saleem <usman@usmans.info>
@usmansaleem usmansaleem requested a review from fab-10 January 29, 2024 01:21
Signed-off-by: Usman Saleem <usman@usmans.info>
Signed-off-by: Usman Saleem <usman@usmans.info>
Signed-off-by: Usman Saleem <usman@usmans.info>
…nterface has introduced new method

Signed-off-by: Usman Saleem <usman@usmans.info>
Signed-off-by: Usman Saleem <usman@usmans.info>
Copy link
Copy Markdown
Contributor

@fab-10 fab-10 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.
It maybe be required to tune the following two options if the trielog purge is enabled and the space is not reclaimed quicky enough (cc @siladu )

  • blob_garbage_collection_age_cutoff: the cutoff that the GC logic uses to determine which blob files should be considered “old.” For example, the default value of 0.25 signals to RocksDB that blobs residing in the oldest 25% of blob files should be relocated by GC. This parameter can be tuned to adjust the trade-off between write amplification and space amplification.

  • blob_garbage_collection_force_threshold: if the ratio of garbage in the oldest blob files exceeds this threshold, targeted compactions are scheduled in order to force garbage collecting the blob files in question, assuming they are all eligible based on the value of blob_garbage_collection_age_cutoff above. This can help reduce space amplification in the case of skewed workloads where the affected files would not otherwise be picked up for compaction. This option is currently only supported with leveled compactions.

Signed-off-by: Usman Saleem <usman@usmans.info>

# Conflicts:
#	CHANGELOG.md
#	plugin-api/build.gradle
…nterface has introduced new method

Signed-off-by: Usman Saleem <usman@usmans.info>
Signed-off-by: Usman Saleem <usman@usmans.info>
@usmansaleem usmansaleem enabled auto-merge (squash) January 31, 2024 09:46
@usmansaleem usmansaleem merged commit c10cb3d into besu-eth:main Jan 31, 2024
Gabriel-Trintinalia pushed a commit to Gabriel-Trintinalia/besu that referenced this pull request Feb 1, 2024
Store trielog as blobdb to improve performance.

* Store trie_log_storage as blob
* changelog
* Add gc flag for static data
* Updating plugin-api build.gradle with new hash as SegmentIdentifier interface has introduced new method

Signed-off-by: Usman Saleem <usman@usmans.info>
Signed-off-by: Gabriel-Trintinalia <gabriel.trintinalia@consensys.net>
@usmansaleem usmansaleem deleted the trielog_blobdb_fixed branch February 1, 2024 23:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants