Description
Does this issue occur when all extensions are disabled?: Yes
- VS Code Version: Version: 1.90.2; Commit: 5437499
- OS Version: macOS Big Sur 11.6
I am working on improving Lua's auto-indentation rules so that keywords inside string literals no longer trigger false positives, as described in #199223. If you are interested, you can follow my progress in that issue. However, I found another problem during development.
My regex should be able to handle this case:
if s == [[then]]
- my regex should be able to tell that `then` is not followed by pure trailing space / comment => should not be indented; however, VS Code will still indent it on enter
- I then used the debugger in the developer tools and found that the string actually being evaluated differs from the actual text ‼️
Root Cause Analysis
After some searching, I found that issue #209862 proposed removing brackets from strings and comments before passing the text to the regex module, which fixes auto-indent issues related to bracket detection. It was implemented and released in PR #210641.
I see some discussion going on in #209519 (comment), and I understand the rationale behind it. Nonetheless, Lua has a multiline quote `[[...]]` which uses the bracket character, and it can even be nested: `[==[...]==]`.
ref: http://lua-users.org/wiki/StringsTutorial
I believe the current logic for removing brackets from strings and comments is to detect a string token (in the above case, `[[then]]`) and then remove every `[` / `]` from it (as they are defined as brackets in Lua's language config). But after stripping the surrounding `[[]]`, the `then` token stands alone, and my regex has no way to deal with this case, which causes a false positive.
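The behaviour described above can be reproduced with a minimal sketch. Everything here is an assumption made for illustration: `LineToken`, `stripBrackets`, and the simplified `increaseIndent` pattern are hypothetical stand-ins, not VS Code's actual tokenizer API or the real Lua indentation rule.

```typescript
// Hypothetical token shape; VS Code's real tokenizer data looks different.
interface LineToken {
  text: string;
  kind: "code" | "string" | "comment";
}

// Assumed behaviour: delete bracket characters inside string/comment tokens only.
function stripBrackets(tokens: LineToken[], brackets: string[]): string {
  return tokens
    .map((t) =>
      t.kind === "code"
        ? t.text
        : t.text.split("").filter((ch) => brackets.indexOf(ch) === -1).join("")
    )
    .join("");
}

// Simplified stand-in for an increase-indent rule: `then` followed only by whitespace.
const increaseIndent = /\bthen\b\s*$/;

const line: LineToken[] = [
  { text: "if s == ", kind: "code" },
  { text: "[[then]]", kind: "string" },
];

const evaluated = stripBrackets(line, ["[", "]"]);
console.log(evaluated);                                // if s == then
console.log(increaseIndent.test(evaluated));           // true  (false positive)
console.log(increaseIndent.test("if s == [[then]]"));  // false (what the real text gives)
```

The regex behaves correctly on the text the user typed; it only misfires because the string it receives has already lost the `[[` and `]]` delimiters.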
Another example
To make things worse, Lua also has a multiline comment `--[[...]]` that can also be written on a single line. 😂
Take the following as example:
if s == --[[then]] then
- this should indent the next line
- but after `[[` and `]]` are stripped, the text becomes `if s == --then then`
- and the regex will just skip the trailing comment part, because both `then`s come after a `--` comment
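The comment case can be sketched the same way. `shouldIndent` below is a hypothetical, simplified stand-in for a comment-aware indentation rule (not my actual regex), and the bracket stripping is simulated by hand on the comment token only.

```typescript
// Hypothetical rule: drop inline block comments, then drop a trailing `--` line
// comment, then check whether `then` ends the remaining text.
function shouldIndent(text: string): boolean {
  const withoutBlockComments = text.replace(/--\[\[.*?\]\]/g, "");
  const withoutLineComment = withoutBlockComments.replace(/--.*$/, "");
  return /\bthen\b\s*$/.test(withoutLineComment);
}

const original = "if s == --[[then]] then";

// Assumed effect of bracket stripping: only the comment token loses `[` and `]`.
const commentToken = "--[[then]]";
const stripped = original.replace(commentToken, commentToken.replace(/[\[\]]/g, ""));

console.log(stripped);                // if s == --then then
console.log(shouldIndent(original));  // true  (the next line should be indented)
console.log(shouldIndent(stripped));  // false (the indent is lost)
```

Once the brackets are gone, the block comment is indistinguishable from a line comment, so the real `then` keyword is swallowed along with it.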
Quick thought
I don't have any bright ideas right now. But just from reading the discussion mentioned above, I guess removing all content in strings / comments, including the quotes / comment prefix, might be a better solution? Indentation rules generally do not depend on the content inside strings / comments, do they? 🤔
Using the examples above:
- `if s == [[then]]` => becomes `if s ==` => no more false positive ✅
- `if s == --[[then]] then` => becomes `if s == then` => again, just checking for `\bthen\b\s*$` is enough to match this ✅
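A rough sketch of this idea, under the same assumptions as before: `LineToken` and `dropStringsAndComments` are hypothetical names, and the token lists are written by hand rather than produced by a real tokenizer.

```typescript
// Hypothetical token shape; VS Code's real tokenizer data looks different.
interface LineToken {
  text: string;
  kind: "code" | "string" | "comment";
}

// Proposed behaviour: drop string/comment tokens entirely, delimiters included.
function dropStringsAndComments(tokens: LineToken[]): string {
  return tokens
    .filter((t) => t.kind === "code")
    .map((t) => t.text)
    .join("");
}

// With strings and comments gone, a plain keyword check is enough.
const increaseIndent = /\bthen\b\s*$/;

const stringCase: LineToken[] = [
  { text: "if s == ", kind: "code" },
  { text: "[[then]]", kind: "string" },
];
const commentCase: LineToken[] = [
  { text: "if s == ", kind: "code" },
  { text: "--[[then]]", kind: "comment" },
  { text: " then", kind: "code" },
];

console.log(increaseIndent.test(dropStringsAndComments(stringCase)));   // false
console.log(increaseIndent.test(dropStringsAndComments(commentCase)));  // true
```

A side effect of this approach is that the language's indentation regex no longer needs any string or comment handling of its own.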