UInt256 long digits record by thomas-quadratic · Pull Request #9677 · besu-eth/besu

thomas-quadratic · 2026-01-23T12:53:43Z

PR description

This PR bundles several improvements and refactorings for UInt256 modular arithmetic:

Digits (limbs) are ordered big-endian.
Digits use long rather than int as their primitive type.
UInt256 is a record that stores its digits as fields rather than in an array.
The division algorithm follows methods from the GMP division paper, since widening from long is not possible.
fromBytesBE is improved for better worst-case performance.
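
As a rough illustration of the layout described in the list above, here is a minimal sketch of a record with four long fields in big-endian order. The name `U256Sketch`, its method bodies, and the zero-extension behaviour are assumptions made for illustration; only the field naming (`u3` most significant, `u0` least significant) and the `(u1 | u2 | u3) == 0` check are taken from the review discussion further down. It is not the PR's actual code.

```java
// Hypothetical sketch: an illustrative stand-in for the record-based layout.
record U256Sketch(long u3, long u2, long u1, long u0) {

  /** Builds a value from up to 32 big-endian bytes; shorter inputs are zero-extended on the left. */
  static U256Sketch fromBytesBE(byte[] bytes) {
    if (bytes.length > 32) {
      throw new IllegalArgumentException("at most 32 bytes expected");
    }
    long[] limbs = new long[4]; // limbs[0] is the most significant limb
    int pad = 32 - bytes.length; // number of implicit leading zero bytes
    for (int i = 0; i < bytes.length; i++) {
      int pos = pad + i; // position inside the padded 32-byte value
      limbs[pos >>> 3] = (limbs[pos >>> 3] << 8) | (bytes[i] & 0xFFL);
    }
    return new U256Sketch(limbs[0], limbs[1], limbs[2], limbs[3]);
  }

  /** True when the value fits in the least significant limb. */
  boolean isUInt64() {
    return (u1 | u2 | u3) == 0;
  }
}
```

Storing the limbs as fields means the value carries no array header or bounds checks, which is part of what makes a later move to value records plausible.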

Long limbs are more efficient because they halve the number of digits, which reduces the number of steps in arithmetic operations. However, they complicate the division implementation, because widening is no longer possible. Fortunately, quotient-estimate algorithms that avoid widening exist; see the division paper.

Records are chosen to represent the fixed-width UInt256 because of potential future support for Valhalla value records, which would require almost no change to the code.

Here are the benchmarks compared to the current implementation:

Operation	Case	current (ns/op)	new (ns/op)	gain (%)
Mod	FULL_RANDOM	106.042	84.005	21%
Mod	WORST	149.179	95.7	36%

Current status

Currently working and tested ops:

AddMod
Mod
MulMod
SMod

Thanks for sending a pull request! Have you done the following?

Checked out our contribution guidelines?
Considered documentation and added the doc-change-required label to this PR if updates are required.
Considered the changelog and included an update if required.
For database changes (e.g. KeyValueSegmentIdentifier) considered compatibility and performed forwards and backwards compatibility tests

Locally, you can run these tests to catch failures early:

spotless: ./gradlew spotlessApply
unit tests: ./gradlew build
acceptance tests: ./gradlew acceptanceTest
integration tests: ./gradlew integrationTest
reference tests: ./gradlew ethereum:referenceTests:referenceTests
hive tests: Engine or other RPCs modified?

thomas-quadratic · 2026-01-28T22:30:55Z

Updated benchmarks with the new commits that support the other operations: AddMod/MulMod/SMod

Operation	Case	current (ns/op)	new (ns/op)	gain (%)
Mod	FULL_RANDOM	106.042	84.005	21%
Mod	WORST	149.179	95.7	36%
SMod	FULL_RANDOM	141.778	101.1	29%
SMod	WORST	170.822	101.1	41%
AddMod	FULL_RANDOM	179.559	137.073	24%
AddMod	WORST	187.805	137.073	27%
MulMod	FULL_RANDOM	259.265	186.501	29%
MulMod	WORST	344.663	222.37	41%

thomas-quadratic · 2026-01-29T22:53:14Z

Changed the single-digit quotient estimate from 3 digits by 2 digits (div3by2) to 2 digits by 1 digit (div2by1).
Benchmarks improved across the board:

Operation	Case	current (ns/op)	new (ns/op)	gain (%)
Mod	FULL_RANDOM	106.042	81.032	24%
Mod	WORST	149.179	86.942	42%
SMod	FULL_RANDOM	141.778	97.15	31%
SMod	WORST	170.822	97.15	43%
AddMod	FULL_RANDOM	179.559	130.147	28%
AddMod	WORST	187.805	130.147	31%
MulMod	FULL_RANDOM	259.265	177.442	32%
MulMod	WORST	344.663	209.105	39%
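
For readers unfamiliar with a 2-by-1 quotient step, the following is a minimal sketch of dividing a two-limb value by one normalised limb using a precomputed reciprocal, in the style of the GMP division literature. It is not the PR's implementation: the class and method names are hypothetical, and the reciprocal is computed with BigInteger purely for brevity, whereas the PR appears to use a lookup-table-based reciprocal.

```java
import java.math.BigInteger;

// Hypothetical sketch of a reciprocal-based 2-by-1 division on 64-bit limbs.
final class Div2By1Sketch {
  private static final BigInteger TWO_64 = BigInteger.ONE.shiftLeft(64);

  /** Interprets a long as an unsigned 64-bit value. */
  static BigInteger unsigned(long x) {
    return new BigInteger(Long.toUnsignedString(x));
  }

  /** High 64 bits of the unsigned 128-bit product a * b. */
  static long unsignedMultiplyHigh(long a, long b) {
    return Math.multiplyHigh(a, b) + ((a >> 63) & b) + ((b >> 63) & a);
  }

  /** v = floor((2^128 - 1) / d) - 2^64 for a normalised d (top bit set). BigInteger is a shortcut here. */
  static long reciprocal(long d) {
    BigInteger all = BigInteger.ONE.shiftLeft(128).subtract(BigInteger.ONE);
    return all.divide(unsigned(d)).subtract(TWO_64).longValue();
  }

  /** Divides (u1, u0) by d; requires d normalised and u1 < d (unsigned). Returns {quotient, remainder}. */
  static long[] div2by1(long u1, long u0, long d, long v) {
    long qLo = v * u1;                      // low half of v * u1
    long qHi = unsignedMultiplyHigh(v, u1); // high half of v * u1
    long sumLo = qLo + u0;                  // add (u1, u0) to (qHi, qLo)
    long carry = Long.compareUnsigned(sumLo, qLo) < 0 ? 1L : 0L;
    qLo = sumLo;
    qHi = qHi + u1 + carry + 1;             // candidate quotient
    long r = u0 - qHi * d;                  // candidate remainder mod 2^64
    if (Long.compareUnsigned(r, qLo) > 0) { // candidate was one too large
      qHi--;
      r += d;
    }
    if (Long.compareUnsigned(r, d) >= 0) {  // rare final correction
      qHi++;
      r -= d;
    }
    return new long[] {qHi, r};
  }
}
```

The step uses only 64-bit multiplications, additions and comparisons, which is what makes it usable when no wider primitive type is available.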

siladu · 2026-02-12T01:43:03Z

Code review

Found 1 issue:

UInt512.isUInt64() uses bitwise AND (&) instead of OR (|) to check if all high limbs are zero. With AND, the expression (u7 & u6 & u5 & u4 & u3 & u2 & u1) == 0 can return true even when multiple high limbs are non-zero, as long as they don't share a common bit (e.g., u7=1, u6=2 gives 1 & 2 = 0). This causes incorrect results in Modulus64 arithmetic shortcuts (addMod, mod, mulMod) that rely on this check. The correct operator is |, consistent with UInt256.isUInt64() at line 295 which already uses (u1 | u2 | u3) == 0.

https://github.com/hyperledger/besu/blob/d86a15e7a7a9e3441d2d85ef4f96a4dc58413519/evm/src/main/java/org/hyperledger/besu/evm/UInt256.java#L941-L944

🤖 Generated with Claude Code

If this code review was useful, please react with 👍. Otherwise, react with 👎.

siladu · 2026-02-12T03:46:04Z

Code review

Found 1 issue:

UInt512.isUInt64() uses bitwise AND (&) instead of OR (|) to check if all high limbs are zero. With AND, the expression (u7 & u6 & u5 & u4 & u3 & u2 & u1) == 0 can return true even when multiple high limbs are non-zero, as long as they don't share a common bit (e.g., u7=1, u6=2 gives 1 & 2 = 0). This causes incorrect results in Modulus64 arithmetic shortcuts (addMod, mod, mulMod) that rely on this check. The correct operator is |, consistent with UInt256.isUInt64() at line 295 which already uses (u1 | u2 | u3) == 0.

Unit test showing bug here https://github.com/thomas-quadratic/besu/pull/5/changes
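
The failure mode described in this review can be reproduced in isolation. The sketch below is not the Besu code; the class and method names are hypothetical and the helpers take an arbitrary list of high limbs. It only contrasts the two operators on the example values from the comment (u7=1, u6=2).

```java
// Hypothetical sketch contrasting the AND-based and OR-based zero checks.
final class HighLimbCheckSketch {

  /** Buggy variant: the AND of the limbs is zero whenever they share no common set bit. */
  static boolean highLimbsZeroWithAnd(long... high) {
    long acc = -1L;
    for (long limb : high) {
      acc &= limb;
    }
    return acc == 0;
  }

  /** Correct variant: the OR of the limbs is zero only when every limb is zero. */
  static boolean highLimbsZeroWithOr(long... high) {
    long acc = 0L;
    for (long limb : high) {
      acc |= limb;
    }
    return acc == 0;
  }
}
```

With the limbs 1 and 2, the AND-based check reports "all zero" while the OR-based check does not, which matches the reviewer's description of the shortcut being taken for values that do not fit in 64 bits.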

lu-pinto · 2026-02-14T22:06:25Z

evm/src/main/java/org/hyperledger/besu/evm/UInt256.java

+    long x63 = (x + 1) >>> 1;
+    long v0 = LUT[x9 - 256] & 0xFFFFL;
+    long v1 = (v0 << 11) - ((v0 * v0 * x40) >>> 40) - 1;
+    long v2 = (v1 << 13) + ((v1 * ((1L << 60) - v1 * x40)) >>> 47);


No need to compute a constant here:

Suggested change

long v2 = (v1 << 13) + ((v1 * ((1L << 60) - v1 * x40)) >>> 47);

long v2 = (v1 << 13) + ((v1 * (0x1000000000000000L - v1 * x40)) >>> 47);

If you prefer, a static constant declared just above the reciprocal method would work as well.
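
A minimal sketch of the static-constant variant mentioned in this comment follows. The class name, the constant name `TWO_POW_60` and the method name `refine` are hypothetical; only the arithmetic expression is taken from the diff. Both spellings of the constant denote the same value, 2^60.

```java
// Hypothetical sketch: the refinement step from the diff with the constant hoisted out.
final class ReciprocalConstantSketch {
  // Assumed name; equal to 0x1000000000000000L.
  static final long TWO_POW_60 = 1L << 60;

  /** Same expression as the v2 line in the diff, with the constant named. */
  static long refine(long v1, long x40) {
    return (v1 << 13) + ((v1 * (TWO_POW_60 - v1 * x40)) >>> 47);
  }
}
```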

evm/src/main/java/org/hyperledger/besu/evm/UInt256.java

Previously, limbs were stored in little-endian order. To take advantage of Arrays.mismatch, big-endian order is better. This commit makes the limbs in UInt256.java big-endian. The tests and benchmarks still need to be migrated. Signed-off-by: Thomas Zamojski <thomas.zamojski@quadratic-labs.com>
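
To show why big-endian order pairs well with Arrays.mismatch, here is a minimal sketch of an unsigned comparison over limb arrays. The class and method names are hypothetical, and the sketch assumes both arrays have the same length; it is an illustration of the idea in this commit message, not the code it refers to.

```java
import java.util.Arrays;

// Hypothetical sketch: unsigned comparison of big-endian limb arrays.
final class CompareLimbsSketch {

  /** Compares two equal-length big-endian limb arrays as unsigned integers. */
  static int compare(long[] a, long[] b) {
    if (a.length != b.length) {
      throw new IllegalArgumentException("equal lengths expected");
    }
    int i = Arrays.mismatch(a, b); // first differing index, or -1 if identical
    if (i < 0) {
      return 0;
    }
    // With big-endian order the first differing limb is the most significant one.
    return Long.compareUnsigned(a[i], b[i]);
  }
}
```

With little-endian limbs the first mismatch would be the least significant differing limb, so the scan would have to run from the other end.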

UInt256 used int[] for limbs, primarily for simplicity, e.g. having the possibility to widen to long. However, methods exist to work with long[] without widening. This commit implements long limbs. To avoid widening, we do:
1. add: overflow check
2. mul: native multiplyHigh (compiled to assembly mulq)
3. div: more complicated, see the GMP division paper.
Signed-off-by: Thomas Zamojski <thomas.zamojski@quadratic-labs.com>
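
The first two techniques named in this commit message can be sketched as follows. The class and method names are hypothetical, and the code is an illustration under the assumption that limbs are treated as unsigned 64-bit values; it is not taken from the PR.

```java
// Hypothetical sketch: limb addition and multiplication without a wider primitive type.
final class LongLimbOpsSketch {

  /** Adds a + b + carryIn (carryIn is 0 or 1); returns {sum, carryOut}. */
  static long[] addWithCarry(long a, long b, long carryIn) {
    long s = a + b;
    long c1 = Long.compareUnsigned(s, a) < 0 ? 1L : 0L; // wrapped around on a + b
    long t = s + carryIn;
    long c2 = Long.compareUnsigned(t, s) < 0 ? 1L : 0L; // wrapped around on + carryIn
    return new long[] {t, c1 | c2};                     // at most one of the two can be set
  }

  /** Full unsigned 128-bit product of a and b as {high, low}. */
  static long[] mulFull(long a, long b) {
    long low = a * b;
    // Math.multiplyHigh is signed; the two masked terms convert it to the unsigned high half.
    long high = Math.multiplyHigh(a, b) + ((a >> 63) & b) + ((b >> 63) & a);
    return new long[] {high, low};
  }
}
```

On Java 18 and later, Math.unsignedMultiplyHigh returns the unsigned high half directly, so the correction terms are not needed there.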

Add refactor-focused jqwik properties for the UInt256 4xlong record representation.

Run: ./gradlew :evm:test --tests org.hyperledger.besu.evm.UInt256RecordProp

Observed failures (at introduction time):

- property_shiftLeft_matches_big_integer_mod_2_256 (seed 1123581321)
  Shrunk: a = 0x00000000000000000000000000000000000000000000000000000000000000ff, shift = -512
  Expected: 0x00..00
  Got: 0xff..ff
- property_shiftRight_matches_big_integer_mod_2_256 (seed 867530900)
  Shrunk: a = 0x00000000000000000000000000000000000000000000000000000000000000ff, shift = -512
  Expected: 0x00..00
  Got: 0x00..ff
- property_mod_matches_big_integer_unsigned (seed 123456789)
  Sample: arg0 = byte[] [-128, 0, 0, 0, 0, 0, 0, 0, -128], arg1 = 64-byte MSB-heavy pattern (truncated to 32)
  Expected: 0x00..00
  Got: 0x00..80
- property_signedMod_matches_evm_semantics (seed 987654321)
  Sample: arg0 = byte[] [-128, 0, 0, 0, 0, 0, 0, 0, -128], arg1 = byte[] [-128]
  Expected: 0x00..00
  Got: 0x00..80
- property_addMod_matches_big_integer_unsigned (seed 42424242)
  Sample: arg0 = byte[] [], arg1 = byte[] [-128, 0, 0, 0, 0, 0, 0, 0, -128], arg2 = 64-byte MSB-heavy pattern (truncated to 32)
  Expected: 0x00..00
  Got: 0x00..80

Signed-off-by: Nikos Baxevanis <nikos.baxevanis@gmail.com>

UInt256.shiftLeft/shiftRight are only defined for 0 <= shift < 64. Gate the corresponding properties with an assumption so they don't assert EVM-wide shift semantics. This removes shift-related false positives; remaining failures are isolated to mod/signedMod/addMod. Signed-off-by: Nikos Baxevanis <nikos.baxevanis@gmail.com>
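
One possible reason shifts outside 0 <= shift < 64 behave unexpectedly, assuming the implementation relies on Java's primitive shift operators, is that Java uses only the low six bits of the shift distance for long operands. The sketch below illustrates that language rule with a hypothetical guarded helper; it does not reproduce the UInt256 methods or the jqwik properties.

```java
// Hypothetical sketch: guarding a single-limb shift against out-of-range distances.
final class ShiftRangeSketch {

  /** Raw Java shift: the distance is reduced modulo 64 by the language. */
  static long rawShiftLeft(long x, int shift) {
    return x << shift;
  }

  /** Guarded shift: rejects negative distances and returns 0 once every bit is shifted out. */
  static long guardedShiftLeft(long x, int shift) {
    if (shift < 0) {
      throw new IllegalArgumentException("negative shift");
    }
    return shift >= 64 ? 0L : x << shift;
  }
}
```

For example, a raw shift by 64 or by -512 leaves the operand unchanged, because both distances reduce to 0.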

lu-pinto · 2026-02-24T18:17:54Z

I'm running a Hoodi sync as a smoke test. Will merge when that succeeds.

Copilot

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

thomas-quadratic changed the title ~~Feat/uint256 as record~~ UInt256 long digits record Jan 23, 2026

This was referenced Jan 27, 2026

Migrating UInt256 to big-endian limbs #9546

Closed

Optimisation for UInt256.fromBytesBE #9547

Closed

thomas-quadratic marked this pull request as ready for review January 30, 2026 07:44

macfarla added the performance label Jan 30, 2026

github-project-automation bot added this to Performance Jan 30, 2026

macfarla moved this to In Progress in Performance Feb 1, 2026

siladu mentioned this pull request Feb 12, 2026

Expose bug thomas-quadratic/besu#5

Merged

lu-pinto reviewed Feb 14, 2026

lu-pinto reviewed Feb 16, 2026

evm/src/main/java/org/hyperledger/besu/evm/UInt256.java (resolved)

lu-pinto reviewed Feb 16, 2026

evm/src/main/java/org/hyperledger/besu/evm/UInt256.java (resolved)

lu-pinto reviewed Feb 16, 2026

evm/src/main/java/org/hyperledger/besu/evm/UInt256.java (resolved)

lu-pinto reviewed Feb 16, 2026

evm/src/main/java/org/hyperledger/besu/evm/UInt256.java (resolved)

Copilot AI review requested due to automatic review settings February 24, 2026 14:27

thomas-quadratic added 12 commits February 24, 2026 14:35

FIX: array indices bugs for fromBytesBE and mulMod.

1d36468

Also added tests that were failing and now pass. Signed-off-by: Thomas Zamojski <thomas.zamojski@quadratic-labs.com>

FIX: offset indices in mulMod and fromBytesBE

e717b09

Signed-off-by: Thomas Zamojski <thomas.zamojski@quadratic-labs.com>

FIX: spotless

097773d

Signed-off-by: Thomas Zamojski <thomas.zamojski@quadratic-labs.com>

ENH: cleaning addition and compareLimbs

d0033c3

Small cleaning up of the private methods for addition and compareLimbs. Should be easier for the compiler. Signed-off-by: Thomas Zamojski <thomas.zamojski@quadratic-labs.com>

ENH: migrated mod to record based UInt256

772c0ed

Signed-off-by: Thomas Zamojski <thomas.zamojski@quadratic-labs.com>

FIX: uint256 mod bugs

f544979

Signed-off-by: Thomas Zamojski <thomas.zamojski@quadratic-labs.com>

ENH: improve on worst case fromBytesBE

ddcbf5f

Signed-off-by: Thomas Zamojski <thomas.zamojski@quadratic-labs.com>

FIX: spotless

785d721

Signed-off-by: Thomas Zamojski <thomas.zamojski@quadratic-labs.com>

REF: records for addMod and mulMod

e0f7ff0

Signed-off-by: Thomas Zamojski <thomas.zamojski@quadratic-labs.com>

FIX: spotless and mod bugs

9a8366d

Signed-off-by: Thomas Zamojski <thomas.zamojski@quadratic-labs.com>

thomas-quadratic and others added 18 commits February 24, 2026 14:35

FIX: javadoc

d96c9ac

Signed-off-by: Thomas Zamojski <thomas.zamojski@quadratic-labs.com>

FIX: first reduction step overflow fix + add/sub/mul ops

f2fdbbc

Signed-off-by: Thomas Zamojski <thomas.zamojski@quadratic-labs.com>

ADD: prop tests for add/sub/mul and refactor prop tests

294795f

Signed-off-by: Thomas Zamojski <thomas.zamojski@quadratic-labs.com>

Expose bug

76f716c

Signed-off-by: Simon Dudley <simon.dudley@consensys.net>

UInt256 with better code layout and perf optimizations

bba0e64

Signed-off-by: Luis Pinto <luis.pinto@consensys.net>

Branchless code for multiplication

f2cf0cc

Signed-off-by: Luis Pinto <luis.pinto@consensys.net>

spotless

12a4ae6

Signed-off-by: Luis Pinto <luis.pinto@consensys.net>

add branchless code in div2by1 and mod2by1

42af362

Signed-off-by: Luis Pinto <luis.pinto@consensys.net>

Make adc(UInt256) branchless

9b0806f

Signed-off-by: Luis Pinto <luis.pinto@consensys.net>

Make mac* methods carry branchless

328bd3e

Signed-off-by: Luis Pinto <luis.pinto@consensys.net>

Make mul64 method carry branchless

b3df0fc

Signed-off-by: Luis Pinto <luis.pinto@consensys.net>

Make reciprocal, div2by and mod2by1 method branchless

9cfc476

Signed-off-by: Luis Pinto <luis.pinto@consensys.net>

Make mulSub* and addBack methods branchless

a9a39e5

Signed-off-by: Luis Pinto <luis.pinto@consensys.net>

Implement fast/slow paths for reduceNormalised methods

3197d9b

Signed-off-by: Luis Pinto <luis.pinto@consensys.net>

Fix issue in mulSubOverflow

4a28898

Signed-off-by: Luis Pinto <luis.pinto@consensys.net>

Fix issue with mulSub borrows

c0dc03f

Signed-off-by: Luis Pinto <luis.pinto@consensys.net>

lu-pinto force-pushed the feat/uint256-as-record branch from 4ac107c to c0dc03f Compare February 24, 2026 14:35

lu-pinto approved these changes Feb 24, 2026

FIX: removed test for code that became dead and fixed spotless

7e69743

Signed-off-by: Thomas Zamojski <thomas.zamojski@quadratic-labs.com>

Copilot AI reviewed Feb 24, 2026

Merge branch 'main' into feat/uint256-as-record

feb732a

lu-pinto enabled auto-merge (squash) March 1, 2026 10:35

lu-pinto merged commit 57a7e7d into besu-eth:main Mar 1, 2026
46 checks passed

github-project-automation bot moved this from In Progress to Done in Performance Mar 1, 2026

lu-pinto mentioned this pull request Mar 2, 2026

Implement div sdiv with long limbs #9923

Merged

10 tasks

This was referenced Mar 19, 2026

Reimplement UInt256.limbs fully in Big endian instead of Little endian #9475

Closed

Investigate long limbs on UInt256 #9356

Closed

lu-pinto merged 36 commits into besu-eth:main from thomas-quadratic:feat/uint256-as-record

7 participants
