Algorithm designs #19

Open

tabatkins opened this issue Jun 11, 2025 · 1 comment

Comments

@tabatkins (Collaborator)

I've been reviewing the literature and existing implementations for guidance on how to specify the algorithms for .random(), .number(), .int(), and .bigint(), and I think I've come to some conclusions.

First, my writeup of a few of the algorithms is documented at https://github.com/tc39/proposal-random-functions/blob/main/generation-methods.md.

Second, my conclusions about each method:

.random()

(Number, between 0 and 1.)

I think I'm going to just use the standard "create a 53-bit integer, load it into a float, divide it by 2^53" approach used by many libraries. This uses a single 64-bit block, is fast, and is perfectly uniform.

The only downside to me is that it can generate 0, and I think it's probably most useful to default to a double-open range instead. But for .random() I can compromise and stick with the simple, widely-used solution.

I don't think it's worthwhile to take the second approach in the doc, which uses 128 bits to extend the range closer to zero and fill in every possible float, but I could be convinced. ^_^
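As a concrete illustration, here's a minimal sketch of that standard 53-bit approach. Note that `nextUint64` is a hypothetical stand-in for the generator's 64-bit block output (faked here from Math.random purely for demonstration; a real implementation would pull the block straight from the underlying PRNG):

```javascript
// Hypothetical 64-bit source: two 32-bit draws glued into a BigInt.
// (Stand-in only; a real engine would use its internal generator.)
function nextUint64() {
  const hi = BigInt(Math.floor(Math.random() * 2 ** 32));
  const lo = BigInt(Math.floor(Math.random() * 2 ** 32));
  return (hi << 32n) | lo;
}

function random53() {
  // Keep the top 53 bits of the 64-bit block...
  const bits = nextUint64() >> 11n;
  // ...and scale into [0, 1). Every multiple of 2^-53 is equally likely,
  // so the result is perfectly uniform (and can be exactly 0).
  return Number(bits) / 2 ** 53;
}
```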

.number()

(Number, between min and max)

The standard approach (generate a float in [1, 2), subtract one, multiply by the range, add min) is cheap and simple to understand, but terrible for uniformity, and only ends up exploiting 52 bits of randomness.
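For comparison, a minimal sketch of that cheap approach (again with a hypothetical `nextUint64` stand-in, not part of the proposal):

```javascript
// Hypothetical 64-bit source, faked from Math.random for demonstration.
function nextUint64() {
  const hi = BigInt(Math.floor(Math.random() * 2 ** 32));
  const lo = BigInt(Math.floor(Math.random() * 2 ** 32));
  return (hi << 32n) | lo;
}

function naiveNumber(min, max) {
  // Drop 52 random bits into the mantissa of a float with a fixed
  // exponent, yielding a uniform value in [1, 2)...
  const bits = nextUint64() >> 12n;
  const view = new DataView(new ArrayBuffer(8));
  view.setBigUint64(0, 0x3ff0000000000000n | bits);
  const r = view.getFloat64(0) - 1; // uniform in [0, 1), 52-bit grid
  // ...then scale into [min, max). The rounding in the multiply and add
  // is what collapses distinct outputs and skips representable floats.
  return min + r * (max - min);
}
```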

Instead, I think it's worthwhile to go with the approach in https://github.com/tc39/proposal-random-functions/blob/main/generation-methods.md#number-in-min-max-1, using 3 blocks of 64 bits. This approach uses two blocks to generate a perfectly uniform value in the range (with values separated by the largest distance between neighboring floats in the range), then uses the third block to fill in some additional entropy while maintaining value uniformity. It generates all possible floats in the range until you get sufficiently close to zero (where "sufficiently close" means "52 exponents down from the larger of the endpoints").

Probably defaulting to a double-closed range, but with options to exclude either or both ends. (The difference is trivial in the algorithm.)

.int() and .bigint() (and stepped .number())

(Integer or BigInt, between min and max)

Going with the two approaches listed in the doc. For smaller values (where the range is less than 64 bits wide), use Canon's method: 2 64-bit blocks, with a bias less than 2^-64. For larger values, use the method I've developed in the doc: use Canon's method to generate the most significant 63 bits of the range, then fill the rest randomly and do a rejection at the end if needed (less than a 2^-62 chance of rejection occurring). This ensures that you generate every possible integer between the endpoints, with uniform probability, even outside the safe-integer range.
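For the small-range case, here's a hedged sketch of Canon's multiply-based method as I understand it (a candidate from the high bits of one 64-bit product, debiased by a carry from a second draw). The `nextUint64` helper is again a hypothetical stand-in, and this is an illustration of the technique, not the proposal's normative algorithm:

```javascript
// Hypothetical 64-bit source, faked from Math.random for demonstration.
function nextUint64() {
  const hi = BigInt(Math.floor(Math.random() * 2 ** 32));
  const lo = BigInt(Math.floor(Math.random() * 2 ** 32));
  return (hi << 32n) | lo;
}

const MASK64 = (1n << 64n) - 1n;

// Returns a BigInt in [0n, range), for 0n < range <= 2^64.
function canonBounded(range) {
  // First draw: the high 64 bits of rand * range are an almost-uniform
  // candidate; the low 64 bits measure how close it sits to a boundary.
  const prod = nextUint64() * range;
  let result = prod >> 64n;
  const frac = prod & MASK64;
  // Second draw: add the high 64 bits of another product to the fraction.
  // A carry out of 64 bits bumps the candidate up by one, shrinking the
  // residual bias below 2^-64.
  const frac2 = (nextUint64() * range) >> 64n;
  if (frac + frac2 > MASK64) result += 1n;
  return result;
}
```

Shifting the result by min then gives a bounded draw in [min, max); including or excluding endpoints just adjusts the range passed in.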

Probably defaulting to a double-closed range, but with options to exclude either or both ends. (The difference is trivial in the algorithm.)

@tabatkins (Collaborator, Author)

Actually, I think I need to give more thought to the .number() algo. I think, if there are more than 2^52 floats in the |step| interval, that filling in the mantissa with bits isn't actually uniform (it'll bias towards zero, instead). But I might be able to shift into a math-based approach instead.
