-
Notifications
You must be signed in to change notification settings - Fork 1.4k
[pylint
] Implement missing-maxsplit-arg
(PLC0207
)
#17454
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[pylint
] Implement missing-maxsplit-arg
(PLC0207
)
#17454
Conversation
Implements `use-maxsplit-arg` (`PLC0207`) https://pylint.readthedocs.io/en/latest/user_guide/messages/convention/use-maxsplit-arg.html > Emitted when accessing only the first or last element of str.split(). The first and last element can be accessed by using str.split(sep, maxsplit=1)[0] or str.rsplit(sep, maxsplit=1)[-1] instead.
Does the original pylint handle slicing too (ie calling split but only using the first n elements)? If not, that would be a great generalization of this rule, maybe one that can be made ruff specific |
No, it only checks for indexes -1 and 0, but making it more general is a very neat idea. Having looked into the Pylint source just now, I also realize that this PR is missing the following functionality:
I'm happy to add this functionality to stay aligned with Pylint and would appreciate maintainer input on whether or not to implement the generalized rule. |
I think making the rule more general (both by handling slices and by handling variables) sounds nice, but it's also okay to add a simpler form of a rule first and extend it in follow-up PRs. |
|
code | total | + violation | - violation | + fix | - fix |
---|---|---|---|---|---|
PLC0207 | 47 | 47 | 0 | 0 | 0 |
if keywords.iter().any(|kw| { | ||
kw.arg.as_deref() == Some("maxsplit") | ||
&& matches!( | ||
kw.value, | ||
ast::Expr::NumberLiteral(ast::ExprNumberLiteral { | ||
value: ast::Number::Int(ast::Int::ONE), | ||
.. | ||
}) | ||
) | ||
}) { | ||
return; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ecosystem check turned up some results like this:
ext = file_path.rsplit(".", 2)[-1].lower()
The maxsplit
argument doesn't have to be passed as a keyword. We have a find_argument_value
helper that should handle this and also extract the value for your comparison to 1:
ruff/crates/ruff_python_ast/src/nodes.rs
Lines 2854 to 2859 in 1108f66
/// Return the value for the argument with the given name or at the given position, or `None` if no such | |
/// argument exists. Used to retrieve argument values that can be provided _either_ as keyword or | |
/// positional arguments. | |
pub fn find_argument_value(&self, name: &str, position: usize) -> Option<&Expr> { | |
self.find_argument(name, position).map(ArgOrKeyword::value) | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I see, you're doing the positional comparison below, and this is a true positive because the maxsplit
value also needs to be 1. In any case, this should help to combine the keyword and positional checks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's way neater with find_argument_value
, so thanks for that!
However, this ecosystem check failure had me re-evaluate the test included, so I compared it to the Pylint one and noticed that maxsplit=2
should be a passing case. The relevant Pylint source is here.
Following that discovery, I've updated the test cases to closer match the Pylint ones, which left me with a few that require additional implementation. These will require identifying:
- when split is called on slices of strings
- kwargs set via unpacked dicts
- if index is set via variable
- if the index variable is modified in a loop (that includes the split call)
- if split is called on a static class member that happens to be a string (I'm uncertain about this one, but think it should be possible.)
- if split is called on a method that returns a string (I'm uncertain that this one is possible.)
As before, any advice is appreciated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if split is called on a static class member that happens to be a string
I've just managed to figure this one out, and its given me a bit of a confidence boost on the rest of them, so I've converted this PR to a Draft until they're done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome, thanks for the update and for your work on this. Ping me again if you end up needing help. I'm also happy merging a simpler first draft of the rule that we can expand later, if these end up being trickier than you want to deal with!
If it comes to that, let's just include tests for the unhandled cases with clear TODO comments for future work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @ntBre , hope all is well. Sorry for the delay on getting back to this.
As you can see from my most recent commits, I've managed to get through a few of the cases I had listed. However, I think I've hit a bit of a wall with the rest. The remaining ones are:
- if index is set via a variable
- if the index variable is modified in a loop (that includes the split call)
- if split is called on a method that returns a string
- if kwargs are set via an unpacked dict variable
I've documented these as TODO in the test, and am happy to go ahead with merging this simpler first draft if you are.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, this is looking great! I think this would be a good place to finish this PR and leave other extensions as follow-ups, as you said. It could be fun to add an auto-fix for this rule, as another follow-up, if you're interested :)
crates/ruff_linter/src/rules/pylint/rules/missing_maxsplit_arg.rs
Outdated
Show resolved
Hide resolved
crates/ruff_linter/src/rules/pylint/rules/missing_maxsplit_arg.rs
Outdated
Show resolved
Hide resolved
crates/ruff_linter/src/rules/pylint/rules/missing_maxsplit_arg.rs
Outdated
Show resolved
Hide resolved
crates/ruff_linter/src/rules/pylint/rules/missing_maxsplit_arg.rs
Outdated
Show resolved
Hide resolved
crates/ruff_linter/resources/test/fixtures/pylint/missing_maxsplit_arg.py
Outdated
Show resolved
Hide resolved
crates/ruff_linter/resources/test/fixtures/pylint/missing_maxsplit_arg.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, this looks great!
I had two minor nits but this is otherwise good to go.
crates/ruff_linter/src/rules/pylint/rules/missing_maxsplit_arg.rs
Outdated
Show resolved
Hide resolved
crates/ruff_linter/resources/test/fixtures/pylint/missing_maxsplit_arg.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks again, this is great!
I merged |
[pylint]
Implement missing-maxsplit-arg
(PLC0207
)pylint
] Implement missing-maxsplit-arg
(PLC0207
)
Summary
Implements
use-maxsplit-arg
(PLC0207
)https://pylint.readthedocs.io/en/latest/user_guide/messages/convention/use-maxsplit-arg.html
This is part of #970
Test Plan
cargo test
Additionally compared Ruff output to Pylint: