Memoize the signature algorithm in Transaction #8590

Closed
siladu wants to merge 2 commits into besu-eth:main from siladu:transaction-decode-performance

@siladu
Contributor

@siladu siladu commented May 6, 2025

** Testing ongoing **

Profiling the prune-pre-merge-blocks subcommand (e255296) highlighted a performance issue in the call path getBlockBody -> TransactionDecoder.decodeRLP -> Transaction.<init>.

The subcommand loads block bodies (about 15 million for mainnet) and decodes their transactions in order to get the transaction hashes to prune.

You can see that nearly all the time spent in getBlockBody is loading the signature algorithm. Perhaps this impacts other Besu operations beyond the pruning subcommand.

[Image: Screenshot 2025-05-06 at 2 46 09 pm]

We also save 8 bytes of memory for every Transaction (4 bytes for the signatureAlgorithm object reference + 4 bytes of alignment padding that is no longer needed).
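For readers unfamiliar with the pattern, here is a minimal sketch of holding the algorithm in one class-level memoized supplier instead of one field per object. It is not the PR's diff: `Algorithm`, `Tx`, `expensiveLookup` and the hand-written `memoize` helper are hypothetical stand-ins, and the helper assumes the delegate never returns null.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class MemoizeSketch {
  /** Counts how often the (hypothetical) expensive lookup actually runs. */
  static final AtomicInteger LOOKUPS = new AtomicInteger();

  /** Hypothetical stand-in for the real signature algorithm type. */
  static final class Algorithm {
    final String curveName;

    Algorithm(String curveName) {
      this.curveName = curveName;
    }
  }

  /** Hypothetical stand-in for an expensive factory call. */
  static Algorithm expensiveLookup() {
    LOOKUPS.incrementAndGet();
    return new Algorithm("secp256k1");
  }

  /** Thread-safe memoizing wrapper; assumes the delegate never returns null. */
  static <T> Supplier<T> memoize(Supplier<T> delegate) {
    return new Supplier<T>() {
      private volatile T value;

      @Override
      public T get() {
        T result = value;
        if (result == null) {
          synchronized (this) {
            result = value;
            if (result == null) {
              result = delegate.get();
              value = result;
            }
          }
        }
        return result;
      }
    };
  }

  /** One shared supplier for the whole class, not one reference per object. */
  static final Supplier<Algorithm> SIGNATURE_ALGORITHM =
      memoize(MemoizeSketch::expensiveLookup);

  /** Hypothetical transaction with no per-instance algorithm field. */
  static final class Tx {
    final long nonce;

    Tx(long nonce) {
      this.nonce = nonce;
    }

    Algorithm algorithm() {
      return SIGNATURE_ALGORITHM.get();
    }
  }

  public static void main(String[] args) {
    for (long i = 0; i < 1_000; i++) {
      new Tx(i).algorithm();
    }
    System.out.println("lookups = " + LOOKUPS.get());
  }
}
```

Because the supplier is static, constructing many `Tx` objects triggers the lookup at most once, and each object carries one reference fewer.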

Previously:

org.hyperledger.besu.ethereum.core.Transaction object internals:
OFF  SZ                                             TYPE DESCRIPTION                           VALUE
  0   8                                                  (object header: mark)                 0x0000000000000001 (non-biasable; age: 0)
  8   4                                                  (object header: class)                0x001b6c98
 12   4                                              int Transaction.size                      101
 16   8                                             long Transaction.nonce                     1
 24   8                                             long Transaction.gasLimit                  5000
 32   4                               java.util.Optional Transaction.gasPrice                  (object)
 36   4                               java.util.Optional Transaction.maxPriorityFeePerGas      (object)
 40   4                               java.util.Optional Transaction.maxFeePerGas              (object)
 44   4                               java.util.Optional Transaction.maxFeePerBlobGas          (object)
 48   4                               java.util.Optional Transaction.to                        (object)
 52   4               org.hyperledger.besu.datatypes.Wei Transaction.value                     (object)
 56   4        org.hyperledger.besu.crypto.SECPSignature Transaction.signature                 (object)
 60   4       org.hyperledger.besu.ethereum.core.Payload Transaction.payload                   (object)
 64   4                               java.util.Optional Transaction.maybeAccessList           (object)
 68   4                               java.util.Optional Transaction.chainId                   (object)
 72   4                  org.apache.tuweni.bytes.Bytes32 Transaction.hashNoSignature           (object)
 76   4           org.hyperledger.besu.datatypes.Address Transaction.sender                    (object)
 80   4              org.hyperledger.besu.datatypes.Hash Transaction.hash                      (object)
 84   4   org.hyperledger.besu.datatypes.TransactionType Transaction.transactionType           (object)
 88   4   org.hyperledger.besu.crypto.SignatureAlgorithm Transaction.signatureAlgorithm        (object)
 92   4                               java.util.Optional Transaction.versionedHashes           (object)
 96   4                               java.util.Optional Transaction.blobsWithCommitments      (object)
100   4                               java.util.Optional Transaction.maybeCodeDelegationList   (object)
104   4                               java.util.Optional Transaction.rawRlp                    (object)
108   4                                                  (object alignment gap)
Instance size: 112 bytes
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total

With this PR:

OFF  SZ                                             TYPE DESCRIPTION                           VALUE
  0   8                                                  (object header: mark)                 0x0000000000000001 (non-biasable; age: 0)
  8   4                                                  (object header: class)                0x001b70c0
 12   4                                              int Transaction.size                      101
 16   8                                             long Transaction.nonce                     1
 24   8                                             long Transaction.gasLimit                  5000
 32   4                               java.util.Optional Transaction.gasPrice                  (object)
 36   4                               java.util.Optional Transaction.maxPriorityFeePerGas      (object)
 40   4                               java.util.Optional Transaction.maxFeePerGas              (object)
 44   4                               java.util.Optional Transaction.maxFeePerBlobGas          (object)
 48   4                               java.util.Optional Transaction.to                        (object)
 52   4               org.hyperledger.besu.datatypes.Wei Transaction.value                     (object)
 56   4        org.hyperledger.besu.crypto.SECPSignature Transaction.signature                 (object)
 60   4       org.hyperledger.besu.ethereum.core.Payload Transaction.payload                   (object)
 64   4                               java.util.Optional Transaction.maybeAccessList           (object)
 68   4                               java.util.Optional Transaction.chainId                   (object)
 72   4                  org.apache.tuweni.bytes.Bytes32 Transaction.hashNoSignature           (object)
 76   4           org.hyperledger.besu.datatypes.Address Transaction.sender                    (object)
 80   4              org.hyperledger.besu.datatypes.Hash Transaction.hash                      (object)
 84   4   org.hyperledger.besu.datatypes.TransactionType Transaction.transactionType           (object)
 88   4                               java.util.Optional Transaction.versionedHashes           (object)
 92   4                               java.util.Optional Transaction.blobsWithCommitments      (object)
 96   4                               java.util.Optional Transaction.maybeCodeDelegationList   (object)
100   4                               java.util.Optional Transaction.rawRlp                    (object)
Instance size: 104 bytes
Space losses: 0 bytes internal + 0 bytes external = 0 bytes total
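The 8-byte difference between the two dumps can be checked with a little arithmetic. The sketch below assumes what the dumps show (a 12-byte header and 4-byte compressed references) plus the usual 8-byte object alignment; the class and method names are illustrative only, and it ignores internal field gaps, which both dumps report as zero.

```java
public class LayoutArithmetic {
  static final int HEADER_BYTES = 12;   // 8-byte mark word + 4-byte class pointer, as in the dumps
  static final int REFERENCE_BYTES = 4; // compressed object references, as in the dumps
  static final int ALIGNMENT = 8;       // assumed default object alignment

  /** Rounds a raw size up to the next multiple of the alignment. */
  static int alignUp(int size) {
    return (size + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
  }

  /** Instance size for a class with the given field counts and no internal gaps. */
  static int instanceSize(int intFields, int longFields, int referenceFields) {
    int raw = HEADER_BYTES
        + 4 * intFields
        + 8 * longFields
        + REFERENCE_BYTES * referenceFields;
    return alignUp(raw);
  }

  public static void main(String[] args) {
    // Before: 1 int, 2 longs, 19 references (including signatureAlgorithm).
    int before = instanceSize(1, 2, 19);
    // After: the signatureAlgorithm reference is gone, leaving 18 references.
    int after = instanceSize(1, 2, 18);
    System.out.println(before + " -> " + after + " bytes, saving " + (before - after));
  }
}
```

With 19 references the raw size is 108 bytes, which pads to 112; with 18 it is exactly 104, so removing one 4-byte reference also removes the 4-byte alignment gap.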

siladu added 2 commits May 6, 2025 15:48
Signed-off-by: Simon Dudley <simon.dudley@consensys.net>
Signed-off-by: Simon Dudley <simon.dudley@consensys.net>
Contributor

@lu-pinto lu-pinto left a comment

I was looking a bit more closely at SignatureAlgorithmFactory::getInstance and there is a setInstance method which basically does the same thing as the memoize. I think we should use that instead to set it everywhere across the board.

I wonder if we are missing any initialization step to set the factory to the default signing algorithm?

@siladu
Contributor Author

siladu commented May 6, 2025

This PR speeds up the prune subcommand from 6+ days (and still running) to around 5 minutes for the 15 million pre-merge mainnet blocks on a Standard_D4s_v5. This is similar performance to skipping the call to getBlockBody entirely.

TEST: This PR used with prune-pre-merge-blocks - Load blockBodies and prune transaction indexes as well as block bodies
https://github.com/siladu/besu/tree/prune-all-fast-disable-gc

dev-elc-bu-pm-mainnet-simon-4444-prune-all-fast-disable-gc-01
2025-05-06 10:35:13.646+00:00 | main | INFO  | PrunePreMergeBlockDataSubCommand | Starting pruning of block range 1 to 10001...
2025-05-06 10:39:16.017+00:00 | main | INFO  | PrunePreMergeBlockDataSubCommand | Pruning pre-merge blocks and transaction receipts completed

dev-elc-bu-pm-mainnet-simon-4444-prune-all-fast-disable-gc-02
2025-05-06 10:35:12.759+00:00 | main | INFO  | PrunePreMergeBlockDataSubCommand | Starting pruning of block range 1 to 10001...
2025-05-06 10:39:22.607+00:00 | main | INFO  | PrunePreMergeBlockDataSubCommand | Pruning pre-merge blocks and transaction receipts completed

CONTROL: Prune block bodies, skip pruning transactions indexes (i.e. skip calling getBlockBody to get the transaction hashes)
https://github.com/siladu/besu/tree/prune-fast-disable-gc

dev-elc-bu-pm-mainnet-simon-4444-prune-fast-disable-gc-01
2025-05-06 10:35:15.121+00:00 | main | INFO  | PrunePreMergeBlockDataSubCommand | Starting pruning of block range 1 to 10001...
2025-05-06 10:40:51.768+00:00 | main | INFO  | PrunePreMergeBlockDataSubCommand | Pruning pre-merge blocks and transaction receipts completed

dev-elc-bu-pm-mainnet-simon-4444-prune-fast-disable-gc-02
2025-05-06 10:35:12.830+00:00 | main | INFO  | PrunePreMergeBlockDataSubCommand | Starting pruning of block range 1 to 10001...
2025-05-06 10:39:38.330+00:00 | main | INFO  | PrunePreMergeBlockDataSubCommand | Pruning pre-merge blocks and transaction receipts completed
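The elapsed time of each run can be read off the log timestamps above. As a small sketch (the class name and helper are illustrative, and the timestamp pattern is inferred from the log lines shown), the following computes the duration between the "Starting pruning" and "completed" entries:

```java
import java.time.Duration;
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;

public class PruneElapsed {
  /** Timestamp layout as it appears in the log lines, e.g. 2025-05-06 10:35:13.646+00:00. */
  static final DateTimeFormatter LOG_TIME =
      DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss.SSSxxx");

  /** Duration between two log timestamps. */
  static Duration elapsed(String start, String end) {
    return Duration.between(
        OffsetDateTime.parse(start, LOG_TIME), OffsetDateTime.parse(end, LOG_TIME));
  }

  public static void main(String[] args) {
    String[][] runs = {
      {"test-01", "2025-05-06 10:35:13.646+00:00", "2025-05-06 10:39:16.017+00:00"},
      {"test-02", "2025-05-06 10:35:12.759+00:00", "2025-05-06 10:39:22.607+00:00"},
      {"control-01", "2025-05-06 10:35:15.121+00:00", "2025-05-06 10:40:51.768+00:00"},
      {"control-02", "2025-05-06 10:35:12.830+00:00", "2025-05-06 10:39:38.330+00:00"},
    };
    for (String[] run : runs) {
      Duration d = elapsed(run[1], run[2]);
      System.out.println(run[0] + ": " + d.toMillis() / 1000.0 + " s");
    }
  }
}
```

All four runs finish in roughly four to six minutes, which is the basis for saying the test build performs like the control that skips getBlockBody.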

@ahamlat
Contributor

ahamlat commented May 6, 2025

Yeah, given that the initialization is now static and it is a singleton, I don't see the need to use memoize.

@siladu
Contributor Author

siladu commented May 6, 2025

I would like to address the wider static issue in another PR, as it is all over the codebase.

@ahamlat
Contributor

ahamlat commented May 6, 2025

Not sure I understand your comment @siladu. With this PR, the SIGNATURE_ALGORITHM initialization is now static, and as it is a singleton, there is no need to use memoize.

@lu-pinto
Contributor

lu-pinto commented May 6, 2025

> Not sure I understand your comment @siladu. With this PR, the SIGNATURE_ALGORITHM initialization is now static, and as it is a singleton, there is no need to use memoize.

What memoize is doing is just lazy loading, but honestly the signature algorithm will be used in any case IMO, so it in fact does nothing.

Even the comment in getInstance is very telling that not setting the singleton is not production behaviour: https://github.com/hyperledger/besu/blob/main/crypto/algorithms/src/main/java/org/hyperledger/besu/crypto/SignatureAlgorithmFactory.java#L57-L67

Take a look at examples from evmtool: https://github.com/hyperledger/besu/blob/main/ethereum/evmtool/src/main/java/org/hyperledger/besu/evmtool/StateTestSubCommand.java#L145

@daniellehrner
Contributor

We should really think if we shouldn't just deprecate the R1 curve from the SignatureAlgorithm. It was added because of compliance reasons for institutions, but no wallet or frontend library supports it AFAIK. So not sure if anybody really uses it nowadays.

Maybe @matthew1001 knows if anybody is using it in production.

@siladu
Contributor Author

siladu commented May 7, 2025

> Not sure I understand your comment @siladu. With this PR, the SIGNATURE_ALGORITHM initialization is now static, and as it is a singleton, there is no need to use memoize.

Since this pattern is used in quite a few places: https://github.com/search?q=repo%3Ahyperledger%2Fbesu%20memoize(SignatureAlgorithmFactory%3A%3AgetInstance)&type=code

I think we should wholesale refactor this pattern and use the SIGNATURE_ALGORITHM as a singleton as suggested. We actually already set it on startup: https://github.com/hyperledger/besu/blob/14cdd1f618da9d31e70629c2707c0d2d47b28334/besu/src/main/java/org/hyperledger/besu/cli/BesuCommand.java#L2671

But changing some of these use cases could affect anyone using the R1 curve in various parts of Besu, so I think it makes sense to consider this separately.

@matthew1001
Contributor

> We should really think if we shouldn't just deprecate the R1 curve from the SignatureAlgorithm. It was added because of compliance reasons for institutions, but no wallet or frontend library supports it AFAIK. So not sure if anybody really uses it nowadays.
>
> Maybe @matthew1001 knows if anybody is using it in production.

How much code is required to continue supporting SECP256R1? If it's just another constant in a source file I think I'd suggest leaving it for now. We just happened to have a conversation with an HSM vendor recently who was talking about SECP256R1 as being preferred over SECP256K1. I don't think any/many people will be using it, but it's hard to tell. So it feels like a trade-off of source code to maintain vs cost to a user if they're using that curve.

Memoizing it feels like the right trade-off given the performance benefits described in the PR. We might still find someone using e.g. an HSM for some transactions and a different signer for others, but that feels much less likely.

@siladu
Contributor Author

siladu commented May 9, 2025

I will close this PR and explain why in the next comment, but for posterity, since the tests were complete, here are the results. TL;DR: no difference to mainnet performance.


I ran two rounds of testing using Standard_D8as_v5: 3 control versus 3 test BONSAI + CHECKPOINT mainnet syncs.
The first set showed slightly longer sync times for the test nodes, but sync time is a very noisy metric because it depends on peers, so I reran to verify.

First set:

dev-elc-bu-nb-mainnet-simon-tx-decode-ctl-01  = 25 hours
dev-elc-bu-nb-mainnet-simon-tx-decode-ctl-02  = 23 hours
dev-elc-bu-nb-mainnet-simon-tx-decode-ctl-03  = 23 hours

dev-elc-bu-nb-mainnet-simon-tx-decode-test-01 = 26.5 hours
dev-elc-bu-nb-mainnet-simon-tx-decode-test-02 = 27 hours
dev-elc-bu-nb-mainnet-simon-tx-decode-test-03 = 25.5 hours

Second set:

# Control:
TOTAL_SYNC_DURATION: 27h 16m
TOTAL_SYNC_DURATION: 22h 36m
TOTAL_SYNC_DURATION: 28h 57m

# Test this PR:
TOTAL_SYNC_DURATION: 25h 3m
TOTAL_SYNC_DURATION: 23h 6m
TOTAL_SYNC_DURATION: 22h 7m
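To make the noise concrete, the per-group means of the two sets above can be compared. This is only a sketch over the figures quoted in this comment (the class and helper names are illustrative), and with three samples per group it shows direction rather than statistical significance:

```java
public class SyncAverages {
  /** Converts hours and minutes to total minutes. */
  static int minutes(int hours, int mins) {
    return hours * 60 + mins;
  }

  /** Arithmetic mean of the durations, rounded to the nearest minute. */
  static long meanMinutes(int... durations) {
    long total = 0;
    for (int d : durations) {
      total += d;
    }
    return Math.round((double) total / durations.length);
  }

  /** Formats total minutes as "Xh Ym". */
  static String format(long totalMinutes) {
    return (totalMinutes / 60) + "h " + (totalMinutes % 60) + "m";
  }

  public static void main(String[] args) {
    // First set: 25, 23, 23 hours (control) vs 26.5, 27, 25.5 hours (test).
    long firstControl = meanMinutes(minutes(25, 0), minutes(23, 0), minutes(23, 0));
    long firstTest = meanMinutes(minutes(26, 30), minutes(27, 0), minutes(25, 30));
    // Second set: the TOTAL_SYNC_DURATION values listed above.
    long secondControl = meanMinutes(minutes(27, 16), minutes(22, 36), minutes(28, 57));
    long secondTest = meanMinutes(minutes(25, 3), minutes(23, 6), minutes(22, 7));

    System.out.println("first set:  control " + format(firstControl) + ", test " + format(firstTest));
    System.out.println("second set: control " + format(secondControl) + ", test " + format(secondTest));
  }
}
```

The test group is slower on average in the first set and faster in the second, which is consistent with the conclusion that the PR makes no measurable difference to sync time.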

No difference in post-sync performance:
[Image: Screenshot 2025-05-09 at 7 34 12 pm]

@siladu
Contributor Author

siladu commented May 9, 2025

After further investigation, it turns out this issue is specific to the storage prune-premerge-blocks subcommand, although there are some wider compounding issues which don't help.

The normal besu command executes this, which sets the instance:
https://github.com/hyperledger/besu/blob/7cab4bfab5e79ff1b880c6f7a08965593daa6509/besu/src/main/java/org/hyperledger/besu/cli/BesuCommand.java#L2665-L2668
So even though we still call getInstance() upon constructing every Transaction, we use the cached singleton.

The subcommands override the run() method, so they don't execute this setup code.
The design of the SignatureAlgorithmFactory doesn't cache the instance upon retrieving it.
It also retrieves a method reference instead of initialising each algorithm once at startup: https://github.com/hyperledger/besu/blob/7cab4bfab5e79ff1b880c6f7a08965593daa6509/crypto/algorithms/src/main/java/org/hyperledger/besu/crypto/SignatureAlgorithmType.java#L30
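The behaviour described above can be illustrated with a simplified model. This is not Besu's actual SignatureAlgorithmFactory: `Factory`, `Algorithm` and the counter are hypothetical, and the factory is modelled as an ordinary object rather than static state to keep the example self-contained. It only shows the shape of the problem, where the fallback path builds a new object on every call unless setInstance has been called first.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class FactorySketch {
  /** Hypothetical stand-in for a signature algorithm that is costly to build. */
  static final class Algorithm {
    static final AtomicInteger CONSTRUCTED = new AtomicInteger();

    Algorithm() {
      CONSTRUCTED.incrementAndGet();
    }
  }

  /** Hypothetical factory: only setInstance caches, the fallback path does not. */
  static final class Factory {
    // A constructor reference, so the default is built on demand each time.
    private final Supplier<Algorithm> defaultAlgorithm = Algorithm::new;
    private Algorithm instance; // stays null until setInstance is called

    void setInstance(Algorithm algorithm) {
      instance = algorithm;
    }

    Algorithm getInstance() {
      // The fallback creates a fresh object on every call and never stores it.
      return instance != null ? instance : defaultAlgorithm.get();
    }
  }

  public static void main(String[] args) {
    Factory subcommandPath = new Factory(); // setup step skipped
    for (int i = 0; i < 1_000; i++) {
      subcommandPath.getInstance();
    }
    System.out.println("without setInstance: " + Algorithm.CONSTRUCTED.get() + " constructions");

    Factory normalPath = new Factory();
    normalPath.setInstance(new Algorithm()); // done once at startup
    int before = Algorithm.CONSTRUCTED.get();
    for (int i = 0; i < 1_000; i++) {
      normalPath.getInstance();
    }
    System.out.println("with setInstance: " + (Algorithm.CONSTRUCTED.get() - before) + " extra constructions");
  }
}
```

In this model, a code path that skips the setup step pays the construction cost on every getInstance() call, which matches the symptom seen in the pruning subcommand.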


Closing this PR in favour of

  1. A prune-premerge-blocks specific fix to set the singleton, so I don't touch Transaction
  2. A wider refactor/tidy-up of the singleton issues: "Tidy up SignatureAlgorithm Singleton and Usage" #8619 (the minor memory savings for Transaction would be reaped here)

I will leave the removal of R1 as a separate conversation.


5 participants