Add SHL, SHR and SAR shift operations for EVM v2 #10154

Merged
ahamlat merged 13 commits into besu-eth:main from ahamlat:stack_artithmetic on Apr 7, 2026

Conversation

@ahamlat ahamlat commented Apr 1, 2026

Description

Continues the EVM v2 implementation started in #10105.
This PR adds the SHL (0x1b), SHR (0x1c), and SAR (0x1d) shift opcodes to the EVM v2 long[]-based stack.

Changes

  • Implement SHL (0x1b), SHR (0x1c), and SAR (0x1d) shift opcodes for the EVM v2 long[]-based stack, with zero heap allocation on the hot path
  • Introduce StackArithmetic — a dedicated utility class for 256-bit binary and arithmetic operations operating directly on the flat long[] operand stack
  • Add ShlOperationV2, ShrOperationV2, SarOperationV2 operation classes wired into the EVM v2 switch dispatch (with Constantinople gate)
  • Add JMH benchmarks covering all shift execution paths: zero shift, small/medium/max shifts, overflow (>=256), sign-specific scenarios (SAR positive/negative), and fully random inputs
  • Reorganize v2 operation classes under evm.v2.operation package
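
The in-place shift idea described in the list above can be sketched as follows. This is a minimal illustration, not the code from this PR: the class name `ShiftSketch`, the method names, and the limb layout (four consecutive longs per 256-bit word, most significant limb first) are all assumptions made for the example. The actual StackArithmetic class may use a different layout and signatures.

```java
/** Hypothetical sketch of allocation-free 256-bit shifts on a flat long[] stack. */
final class ShiftSketch {
  // Assumed layout: each stack item is 4 consecutive longs, most significant first.
  static final int LIMBS = 4;

  /** Reduces a 256-bit shift operand to an int; any value >= 256 is clamped to 256. */
  static int shiftAmount(long[] s, int off) {
    boolean overflow =
        (s[off] | s[off + 1] | s[off + 2]) != 0
            || Long.compareUnsigned(s[off + 3], 256L) >= 0;
    return overflow ? 256 : (int) s[off + 3];
  }

  /** SHL: logical left shift of the word at s[off..off+3], in place. */
  static void shl(long[] s, int off, int shift) {
    if (shift == 0) return;
    if (shift >= 256) {
      for (int i = 0; i < LIMBS; i++) s[off + i] = 0L;
      return;
    }
    int limbShift = shift >>> 6; // whole 64-bit limbs to move
    int bitShift = shift & 63; // remaining bits within a limb
    // Ascending order: every source limb is read before it is overwritten.
    for (int i = 0; i < LIMBS; i++) {
      int src = i + limbShift;
      long hi = src < LIMBS ? s[off + src] : 0L;
      long lo = src + 1 < LIMBS ? s[off + src + 1] : 0L;
      s[off + i] = bitShift == 0 ? hi : (hi << bitShift) | (lo >>> (64 - bitShift));
    }
  }

  /** SHR: logical right shift, vacated bits become zero. */
  static void shr(long[] s, int off, int shift) {
    shiftRight(s, off, shift, 0L);
  }

  /** SAR: arithmetic right shift, vacated bits copy the sign bit. */
  static void sar(long[] s, int off, int shift) {
    shiftRight(s, off, shift, s[off] >> 63); // 0 for positive, -1 for negative
  }

  private static void shiftRight(long[] s, int off, int shift, long fill) {
    if (shift == 0) return;
    if (shift >= 256) {
      for (int i = 0; i < LIMBS; i++) s[off + i] = fill;
      return;
    }
    int limbShift = shift >>> 6;
    int bitShift = shift & 63;
    // Descending order: every source limb is read before it is overwritten.
    for (int i = LIMBS - 1; i >= 0; i--) {
      int src = i - limbShift;
      long lo = src >= 0 ? s[off + src] : fill;
      long hi = src >= 1 ? s[off + src - 1] : fill;
      s[off + i] = bitShift == 0 ? lo : (lo >>> bitShift) | (hi << (64 - bitShift));
    }
  }
}
```

In this sketch SHR and SAR share one loop and differ only in the fill value, and the overflow case (shift >= 256) is handled by a single early return. The methods write back into the same array slots, so no objects are created per operation.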

Benchmarks

With current main

Benchmark                                                  (caseName)  Mode  Cnt   Score   Error  Units
SarOperationOptimizedBenchmark.executeOperation               SHIFT_0  avgt   15  20.751 ± 0.233  ns/op
SarOperationOptimizedBenchmark.executeOperation      NEGATIVE_SHIFT_1  avgt   15  25.100 ± 0.224  ns/op
SarOperationOptimizedBenchmark.executeOperation      POSITIVE_SHIFT_1  avgt   15  24.940 ± 0.276  ns/op
SarOperationOptimizedBenchmark.executeOperation      ALL_BITS_SHIFT_1  avgt   15  17.057 ± 0.232  ns/op
SarOperationOptimizedBenchmark.executeOperation    NEGATIVE_SHIFT_128  avgt   15  23.680 ± 1.255  ns/op
SarOperationOptimizedBenchmark.executeOperation    NEGATIVE_SHIFT_255  avgt   15  23.860 ± 1.090  ns/op
SarOperationOptimizedBenchmark.executeOperation    POSITIVE_SHIFT_128  avgt   15  23.114 ± 0.534  ns/op
SarOperationOptimizedBenchmark.executeOperation    POSITIVE_SHIFT_255  avgt   15  22.724 ± 0.143  ns/op
SarOperationOptimizedBenchmark.executeOperation    OVERFLOW_SHIFT_256  avgt   15  23.510 ± 0.178  ns/op
SarOperationOptimizedBenchmark.executeOperation  OVERFLOW_LARGE_SHIFT  avgt   15  23.466 ± 0.186  ns/op
SarOperationOptimizedBenchmark.executeOperation           FULL_RANDOM  avgt   15  52.545 ± 0.274  ns/op

With this PR

Benchmark                                           (caseName)  Mode  Cnt   Score   Error  Units
SarOperationBenchmarkV2.executeOperation               SHIFT_0  avgt   15   6.049 ± 0.026  ns/op
SarOperationBenchmarkV2.executeOperation      NEGATIVE_SHIFT_1  avgt   15   7.862 ± 0.023  ns/op
SarOperationBenchmarkV2.executeOperation      POSITIVE_SHIFT_1  avgt   15   7.867 ± 0.069  ns/op
SarOperationBenchmarkV2.executeOperation      ALL_BITS_SHIFT_1  avgt   15   7.698 ± 0.025  ns/op
SarOperationBenchmarkV2.executeOperation    NEGATIVE_SHIFT_128  avgt   15   6.806 ± 0.018  ns/op
SarOperationBenchmarkV2.executeOperation    NEGATIVE_SHIFT_255  avgt   15   7.295 ± 0.025  ns/op
SarOperationBenchmarkV2.executeOperation    POSITIVE_SHIFT_128  avgt   15   6.821 ± 0.074  ns/op
SarOperationBenchmarkV2.executeOperation    POSITIVE_SHIFT_255  avgt   15   6.745 ± 0.019  ns/op
SarOperationBenchmarkV2.executeOperation    OVERFLOW_SHIFT_256  avgt   15   6.982 ± 0.331  ns/op
SarOperationBenchmarkV2.executeOperation  OVERFLOW_LARGE_SHIFT  avgt   15   6.974 ± 0.041  ns/op
SarOperationBenchmarkV2.executeOperation           FULL_RANDOM  avgt   15  13.682 ± 0.294  ns/op
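
To read the two tables side by side, the per-case speedup is the main score divided by the PR score. The snippet below is a small sketch of that arithmetic for three rows copied from the tables above; the class name `SpeedupSketch` is made up for the example, and the reported error margins are ignored.

```java
/** Hypothetical helper that compares ns/op scores from the two benchmark tables. */
final class SpeedupSketch {
  /** Ratio of baseline time to candidate time; values above 1 mean the candidate is faster. */
  static double speedup(double baselineNsPerOp, double candidateNsPerOp) {
    return baselineNsPerOp / candidateNsPerOp;
  }

  public static void main(String[] args) {
    String[] cases = {"SHIFT_0", "NEGATIVE_SHIFT_1", "FULL_RANDOM"};
    double[] baseline = {20.751, 25.100, 52.545}; // "With current main"
    double[] candidate = {6.049, 7.862, 13.682}; // "With this PR"
    for (int i = 0; i < cases.length; i++) {
      System.out.printf("%-18s %.2fx%n", cases[i], speedup(baseline[i], candidate[i]));
    }
  }
}
```

For these three rows the ratio comes out at a little over 3x. Note that the two tables come from different benchmark classes (SarOperationOptimizedBenchmark and SarOperationBenchmarkV2), so the comparison assumes both measure equivalent work.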

Thanks for sending a pull request! Have you done the following?

  • Checked out our contribution guidelines?
  • Considered documentation and added the doc-change-required label to this PR if updates are required.
  • Considered the changelog and included an update if required.
  • For database changes (e.g. KeyValueSegmentIdentifier) considered compatibility and performed forwards and backwards compatibility tests

Locally, you can run these tests to catch failures early:

  • spotless: ./gradlew spotlessApply
  • unit tests: ./gradlew build
  • acceptance tests: ./gradlew acceptanceTest
  • integration tests: ./gradlew integrationTest
  • reference tests: ./gradlew ethereum:referenceTests:referenceTests
  • hive tests: Engine or other RPCs modified?

ahamlat added 3 commits April 1, 2026 16:07
Signed-off-by: Ameziane H. <ameziane.hamlat@consensys.net>
Signed-off-by: Ameziane H. <ameziane.hamlat@consensys.net>
@siladu siladu left a comment

Should we compare benchmarks to stack_long_array as well?

@siladu siladu mentioned this pull request Apr 2, 2026
Signed-off-by: Ameziane H. <ameziane.hamlat@consensys.net>
@ahamlat ahamlat force-pushed the stack_artithmetic branch from 0f1d4e6 to 45f61e5 Compare April 2, 2026 13:34
ahamlat added 5 commits April 2, 2026 16:43
Signed-off-by: Ameziane H. <ameziane.hamlat@consensys.net>
Signed-off-by: Ameziane H. <ameziane.hamlat@consensys.net>
Signed-off-by: Ameziane H. <ameziane.hamlat@consensys.net>
Signed-off-by: Ameziane H. <ameziane.hamlat@consensys.net>
Signed-off-by: Ameziane H. <ameziane.hamlat@consensys.net>
@ahamlat ahamlat marked this pull request as ready for review April 2, 2026 19:31
nit: I would append V2 to the end of the filename to keep a consistent pattern for this project, i.e. ShiftOperationsPropertyBasedTestV2.java

.isTrue();
}

private static UInt256 getV2StackItem(final MessageFrame frame, final int offset) {
nit: I would append V2 to the end of the method name to keep a consistent pattern for this project, i.e. getStackItemV2. Though since this is already in a V2 class, maybe it could just be getStackItem here?

Suggested change
- private static UInt256 getV2StackItem(final MessageFrame frame, final int offset) {
+ private static UInt256 getStackItem(final MessageFrame frame, final int offset) {

parthdagia05 commented Apr 3, 2026

@siladu
I've been implementing v2 operations under evm.operation.v2 from the original skeleton in #10105 (AND, OR, XOR, NOT in #10148, SUB, LT, GT, SLT, SGT, EQ, ISZERO in #10171). Should I move my files to evm.v2.operation to match this PR, or would you prefer to handle the migration after merge?

ahamlat added 2 commits April 7, 2026 12:17
Signed-off-by: Ameziane H. <ameziane.hamlat@consensys.net>
@siladu siladu left a comment

Just the V2 method name should change ideally. #10154 (comment)

@ahamlat ahamlat merged commit 4007a7e into besu-eth:main Apr 7, 2026
33 checks passed
daniellehrner added a commit that referenced this pull request Apr 8, 2026
* Add SHL, SHR and SAR shift operations for EVM v2 (#10154)

* Add SHL, SHR and SAR implementations and benchmarks for EVM v2

Signed-off-by: Ameziane H. <ameziane.hamlat@consensys.net>

* Upgrade RocksDB version from 9.7.3 to 10.6.2 (#9767)

* Upgrade RocksDB version from 9.7.3 to 10.6.2
* Fix JNI SIGSEGV crashes

Signed-off-by: Ameziane H. <ameziane.hamlat@consensys.net>
Co-authored-by: Sally MacFarlane <macfarla.github@gmail.com>

* Add missing verification metadata (#10198)

Signed-off-by: Fabio Di Fabio <fabio.difabio@consensys.net>

* Stream debug_traceBlock* responses directly to avoid OOM on large blocks (#9848)

* stream block traces on op code level

Signed-off-by: daniellehrner <daniel.lehrner@consensys.net>

* correctly parse default setting for memory tracing

Signed-off-by: daniellehrner <daniel.lehrner@consensys.net>

* fix initcode capture for failed create op codes

Signed-off-by: daniellehrner <daniel.lehrner@consensys.net>

* created separate streaming debug tracer, for batch request fall back to accumulation in memory, adddress pr comments

Signed-off-by: daniellehrner <daniel.lehrner@consensys.net>

* execute tests from genesis and verify full trace

Signed-off-by: daniellehrner <daniel.lehrner@consensys.net>

* addressed pr comments

Signed-off-by: daniellehrner <daniel.lehrner@consensys.net>

* spotless

Signed-off-by: daniellehrner <daniel.lehrner@consensys.net>

* optimize trace streaming and struct log handling

Signed-off-by: Ameziane H. <ameziane.hamlat@consensys.net>

* spotless

Signed-off-by: Ameziane H. <ameziane.hamlat@consensys.net>

* Fix remaining issues and add unit tests

Signed-off-by: Ameziane H. <ameziane.hamlat@consensys.net>

* added back pressure when writing to the socket and reduced the buffer size to work better with netty's default buffer size

Signed-off-by: daniellehrner <daniel.lehrner@consensys.net>

* improve error handling by deferring to send the header only when data is available, allows to send the proper error codes during setup

Signed-off-by: daniellehrner <daniel.lehrner@consensys.net>

* compactHex candidate comparison

Signed-off-by: daniellehrner <daniel.lehrner@consensys.net>

* wire in more performant hex writer

Signed-off-by: daniellehrner <daniel.lehrner@consensys.net>

* introduce separate timeout for streaming calls, defaults to 10 minutes

Signed-off-by: daniellehrner <daniel.lehrner@consensys.net>

* spotless

Signed-off-by: daniellehrner <daniel.lehrner@consensys.net>

* Fix streamin/accumulating output parity, added missing refund field, corrected error format, reason encoding, returnValue prefix, and precompile gasCost, with equivalence tests between both

Signed-off-by: daniellehrner <daniel.lehrner@consensys.net>

* revert accidental removal of 0x prefix

Signed-off-by: daniellehrner <daniel.lehrner@consensys.net>

* pad memory bytes to 32 bytes

Signed-off-by: daniellehrner <daniel.lehrner@consensys.net>

---------

Signed-off-by: daniellehrner <daniel.lehrner@consensys.net>
Signed-off-by: Ameziane H. <ameziane.hamlat@consensys.net>
Co-authored-by: Ameziane H. <ameziane.hamlat@consensys.net>

---------

Signed-off-by: Ameziane H. <ameziane.hamlat@consensys.net>
Signed-off-by: Fabio Di Fabio <fabio.difabio@consensys.net>
Signed-off-by: daniellehrner <daniel.lehrner@consensys.net>
Co-authored-by: ahamlat <ameziane.hamlat@consensys.net>
Co-authored-by: Sally MacFarlane <macfarla.github@gmail.com>
Co-authored-by: Fabio Di Fabio <fabio.difabio@consensys.net>
daniellehrner added a commit that referenced this pull request Apr 9, 2026
* Add SHL, SHR and SAR shift operations for EVM v2 (#10154)

* Add SHL, SHR and SAR implementations and benchmarks for EVM v2

Signed-off-by: Ameziane H. <ameziane.hamlat@consensys.net>

* Upgrade RocksDB version from 9.7.3 to 10.6.2 (#9767)

* Upgrade RocksDB version from 9.7.3 to 10.6.2
* Fix JNI SIGSEGV crashes

Signed-off-by: Ameziane H. <ameziane.hamlat@consensys.net>
Co-authored-by: Sally MacFarlane <macfarla.github@gmail.com>

* Add missing verification metadata (#10198)

Signed-off-by: Fabio Di Fabio <fabio.difabio@consensys.net>

* Stream debug_traceBlock* responses directly to avoid OOM on large blocks (#9848)

* stream block traces on op code level

Signed-off-by: daniellehrner <daniel.lehrner@consensys.net>

* correctly parse default setting for memory tracing

Signed-off-by: daniellehrner <daniel.lehrner@consensys.net>

* fix initcode capture for failed create op codes

Signed-off-by: daniellehrner <daniel.lehrner@consensys.net>

* created separate streaming debug tracer, for batch request fall back to accumulation in memory, adddress pr comments

Signed-off-by: daniellehrner <daniel.lehrner@consensys.net>

* execute tests from genesis and verify full trace

Signed-off-by: daniellehrner <daniel.lehrner@consensys.net>

* addressed pr comments

Signed-off-by: daniellehrner <daniel.lehrner@consensys.net>

* spotless

Signed-off-by: daniellehrner <daniel.lehrner@consensys.net>

* optimize trace streaming and struct log handling

Signed-off-by: Ameziane H. <ameziane.hamlat@consensys.net>

* spotless

Signed-off-by: Ameziane H. <ameziane.hamlat@consensys.net>

* Fix remaining issues and add unit tests

Signed-off-by: Ameziane H. <ameziane.hamlat@consensys.net>

* added back pressure when writing to the socket and reduced the buffer size to work better with netty's default buffer size

Signed-off-by: daniellehrner <daniel.lehrner@consensys.net>

* improve error handling by deferring to send the header only when data is available, allows to send the proper error codes during setup

Signed-off-by: daniellehrner <daniel.lehrner@consensys.net>

* compactHex candidate comparison

Signed-off-by: daniellehrner <daniel.lehrner@consensys.net>

* wire in more performant hex writer

Signed-off-by: daniellehrner <daniel.lehrner@consensys.net>

* introduce separate timeout for streaming calls, defaults to 10 minutes

Signed-off-by: daniellehrner <daniel.lehrner@consensys.net>

* spotless

Signed-off-by: daniellehrner <daniel.lehrner@consensys.net>

* Fix streamin/accumulating output parity, added missing refund field, corrected error format, reason encoding, returnValue prefix, and precompile gasCost, with equivalence tests between both

Signed-off-by: daniellehrner <daniel.lehrner@consensys.net>

* revert accidental removal of 0x prefix

Signed-off-by: daniellehrner <daniel.lehrner@consensys.net>

* pad memory bytes to 32 bytes

Signed-off-by: daniellehrner <daniel.lehrner@consensys.net>

---------

Signed-off-by: daniellehrner <daniel.lehrner@consensys.net>
Signed-off-by: Ameziane H. <ameziane.hamlat@consensys.net>
Co-authored-by: Ameziane H. <ameziane.hamlat@consensys.net>

* Optimize performance and reduce memory when creating Quantity from scalar (#10134)

* Optimize performance and reduce memory when creating Quantity from scalar

Signed-off-by: Fabio Di Fabio <fabio.difabio@consensys.net>

* Benchmark other implementations

Signed-off-by: Fabio Di Fabio <fabio.difabio@consensys.net>

---------

Signed-off-by: Fabio Di Fabio <fabio.difabio@consensys.net>

* snap sync - apply BALs before flat db heal (#10151)

Signed-off-by: Miroslav Kovar <miroslavkovar@protonmail.com>

* Remove dryRunDetector workaround methods from unit tests (#10201)

* Remove dryRunDetector workaround methods from unit tests

The dryRunDetector methods were added as a workaround for a Gradle issue
that prevented @ParameterizedTest classes from being selected when running
with --dry-run. Since the issue is fixed and --dry-run is no longer used,
these methods are no longer needed.

Signed-off-by: Fabio Di Fabio <fabio.difabio@consensys.net>

* Remove dryRunDetector workaround from acceptance tests too

The Gradle issue is confirmed fixed, so the workaround is no longer
needed anywhere, including acceptance tests.

Signed-off-by: Fabio Di Fabio <fabio.difabio@consensys.net>

---------

Signed-off-by: Fabio Di Fabio <fabio.difabio@consensys.net>

* preserve state gas reservoir for the top level frame in case of OOG (#10205)

Signed-off-by: daniellehrner <daniel.lehrner@consensys.net>

---------

Signed-off-by: Ameziane H. <ameziane.hamlat@consensys.net>
Signed-off-by: Fabio Di Fabio <fabio.difabio@consensys.net>
Signed-off-by: daniellehrner <daniel.lehrner@consensys.net>
Signed-off-by: Miroslav Kovar <miroslavkovar@protonmail.com>
Co-authored-by: ahamlat <ameziane.hamlat@consensys.net>
Co-authored-by: Sally MacFarlane <macfarla.github@gmail.com>
Co-authored-by: Fabio Di Fabio <fabio.difabio@consensys.net>
Co-authored-by: Miroslav Kovář <miroslavkovar@protonmail.com>
3 participants