Skip to content

Numbers should be allowed to contain underscores. #1844

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 12 commits into from
May 20, 2021

Conversation

mrmonroe
Copy link
Contributor

@mrmonroe mrmonroe commented Oct 3, 2020

Description

  • Source branch in your fork has meaningful name (not master)

Fixes Issue:
#1836 Numbers should be allowed to contain underscores.

Before Merge Checklist

These items can be completed after PR is created.

(Check any items that are not applicable (NA) for this PR)

  • JavaScript implementation
  • Python implementation (NA if HTML beautifier)
  • Added Tests to data file(s)
  • [NA] Added command-line option(s) (NA if
  • [NA] README.md documents new feature/option(s)

… plus tests

in s/src/javascript/tokenizer.js and test/data/javascript/tests.js
… plus tests

in s/src/javascript/tokenizer.js and test/data/javascript/tests.js

	modified:   js/src/javascript/tokenizer.js
	modified:   js/test/generated/beautify-javascript-tests.js
	modified:   python/jsbeautifier/tests/generated/tests.py
	modified:   test/data/javascript/tests.js
	modified:   python/jsbeautifier/javascript/tokenizer.py
	modified:   test/data/javascript/tests.js

Issue beautifier#1836 updated number_pattern regex to include underscore in binary
literal.
…js-beautify into issue/underscores_1836

Merging earlier commit to my local branch no changes made. Rerunning
tests.
	modified:   test/data/javascript/tests.js

Test suite made change to tests.js file due to style issue.
@mrmonroe mrmonroe changed the title Issue/underscores 1836 #1836 Numbers should be allowed to contain underscores. Oct 4, 2020
@mrmonroe
Copy link
Contributor Author

mrmonroe commented Oct 4, 2020

Hey guys,
Looks like the code climate failure is a long-standing issue as the regex line is 104 characters long. I'll be glad to break it down but not sure if you guys have been overriding it and don't want to break style or readability for the code.

The other failures are from the ./tools/git-status-clear.sh script running in the Azure pipeline which looks like an outstanding issue, although I don't see it logged. I ran the script on my Mac and Linux boxes and have no problems with it. I'd be glad to help you guys with it but of course, I don't have access to the Azure account.

Let me know how you want to proceed.

@bitwiseman
Copy link
Member

Looks like the code climate failure is a long-standing issue as the regex line is 104 characters long. I'll be glad to break it down but not sure if you guys have been overriding it and don't want to break style or readability for the code.

Please feel free to break the regex to multiple lines/variables for clarity.

Thanks for the heads up on the CI failure, I'll look into that.

Copy link
Member

@bitwiseman bitwiseman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add tests as noted.

Also, the current change only addresses 0b numbers. Will you be updating to handle all number forms? If so, please include tests for similar scenarios to the ones noted for other numbers.

Keep in mind _ is the start of a variable name, so ._001 cannot be treated as a number during tokenizing. It will need to be treated as . and a variable _001.

There's a bunch of edge cases.

Thanks for contributing this PR!

@@ -66,7 +66,7 @@ var TOKEN = {

var directives_core = new Directives(/\/\*/, /\*\//);

var number_pattern = /0[xX][0123456789abcdefABCDEF]*|0[oO][01234567]*|0[bB][01]*|\d+n|(?:\.\d+|\d+\.?\d*)(?:[eE][+-]?\d+)?/;
var number_pattern = /0[xX][0123456789abcdefABCDEF]*|0[oO][01234567]*|0[bB][01_]*|\d+n|(?:\.\d+|\d+\.?\d*)(?:[eE][+-]?\d+)?/;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Numbers still have to start with a number. 0b_101 is not valid, 0b1_01 is valid. Also, 0b1__1 is also not valid syntax either.
Note, 0b is not valid and the current regex will match it, and the result can be seen here: bt('a=0Bg0b0o0', 'a = 0B g0b0o0');

The question is whether the beautifier should "fix" problems with underscores the it does with other characters, or if it should be more permissive. :sad: g is never a valid character, but underscore is.

I guess for 0x, 0o, and 0b, the regex you have above is fine, but please add tests for 0b_101 and 0b1__01 so that the behavior doesn't change unexpectedly later.

Base automatically changed from master to main January 11, 2021 21:55
@mrmonroe
Copy link
Contributor Author

mrmonroe commented Mar 2, 2021

@bitwiseman been down with Covid for a long time and just getting my life back together. Will get to this PR in a few days.

@bitwiseman
Copy link
Member

@mrmonroe
Sorry to hear that, but welcome back. Thanks for picking this back up. Take your time.

We chose to be more permissive with underscore and not reformate invalid placements of them.
Leaving unchanged is slightly odd but it also prevents the beautifier from trying to guess
what the user actually meant.
@bitwiseman bitwiseman changed the title #1836 Numbers should be allowed to contain underscores. Numbers should be allowed to contain underscores. May 17, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants