-
Notifications
You must be signed in to change notification settings - Fork 1.7k
[ruff] Add non-empty-init-module (RUF067)
#22143
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
| code | total | + violation | - violation | + fix | - fix |
|---|---|---|---|---|---|
| RUF067 | 2016 | 2016 | 0 | 0 | 0 |
| impl Violation for NonEmptyInitModule { | ||
| #[derive_message_formats] | ||
| fn message(&self) -> String { | ||
| "Code detected".to_string() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely open to suggestions here, this feels a bit too short. Adding in an __init__.py file seems redundant with the filename in the output, though. The upstream linter at least customizes the message based on the strictness (e.g. Non-import code detected for the default).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that the message is a bit surprising. It's like, yeah, this is code, what's wrong with it?
Using separate messages based on strictness probably makes sense. For strict mode, maybe init module must be empty (although it doesn't really say what's wrong)?
|
Based on the ecosystem results, and my conversation with Micha today, I added a few more allowed statement types in non-strict mode:
I also fixed my top-level statement check, which was previously only checking the scope and thus emitting diagnostics for statements within That will clear out 10/17 of the airflow diagnostics rendered in the comment, so I expect it to drop the total number of ecosystem hits substantially too. |
|
A few remaining common patterns in the ecosystem:
I think I'll add support for the last two at least since I'm less convinced about the loggers since they can be hard to detect and also seem a bit less common. |
Summary -- This PR adds a new rule, `non-empty-init-module`, which restricts the kind of code that can be included in an `__init__.py` file. By default, docstrings, imports, and assignments to `__all__` are allowed. When the new configuration option `lint.ruff.strictly-empty-init-modules` is enabled, no code at all is allowed. This closes #9848, where these two variants correspond to different rules in the [`flake8-empty-init-modules`](https://github.com/samueljsb/flake8-empty-init-modules/) linter. The upstream rules are EIM001, which bans all code, and EIM002, which bans non-import/docstring/`__all__` code. Since we discussed folding these into one rule on [Discord], I just added the rule to the `RUF` group instead of adding a new `EIM` plugin. I'm not really sure we need to flag docstrings even when the strict setting is enabled, but I just followed upstream for now. Similarly, as I noted in a TODO comment, we could also allow more statements involving `__all__`, such as `__all__.append(...)` or `__all__.extend(...)`. The current version only allows assignments, like upstream, as well as annotated and augmented assignments, unlike upstream. I think when we discussed this previously, we considered flagging the module itself as containing code, but for now I followed the upstream implementation of flagging each statement in the module that breaks the rule (actually the upstream linter flags each _line_, including comments). This will obviously be a bit noisier, emitting many diagnostics for the same module. But this also seems preferable because it flags every statement that needs to be fixed up front instead of only emitting one diagnostic for the whole file that persists as you keep removing more lines. It was also easy to implement in `analyze::statement` without a separate visitor. The first commit adds the rule and baseline tests, the second commit adds the option and a diff test showing the additional diagnostics when the setting is enabled. Test Plan -- New tests adapted from the upstream linter [Discord]: https://discord.com/channels/1039017663004942429/1082324250112823306/1440086001035771985
this is actually the only reason Checker::path() was used and nearly the only reason the field Checker::path exists. there's only one remaining reference to the field in a debug logging call
|
Haven't started reviewing but we should probably allow all listed here https://peps.python.org/pep-0008/#module-level-dunder-names |
MichaReiser
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you. This is great. The bigest open question for me is whether this rule should also apply to __init__.pyi files
crates/ruff_linter/src/codes.rs
Outdated
| (Ruff, "064") => rules::ruff::rules::NonOctalPermissions, | ||
| (Ruff, "065") => rules::ruff::rules::LoggingEagerConversion, | ||
| (Ruff, "066") => rules::ruff::rules::PropertyWithoutReturn, | ||
| (Ruff, "070") => rules::ruff::rules::NonEmptyInitModule, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: I prefer avoiding "holes" and instead assign rule ids on the rule that comes first
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just felt bad because another rule currently using RUF067 is sitting in my review queue (along with RUF069), but I'll update this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I do is that I change the rule code myself if I end up approving a contributor PR, so that we don't need to feel bad about it
crates/ruff_linter/src/rules/ruff/rules/non_empty_init_module.rs
Outdated
Show resolved
Hide resolved
crates/ruff_linter/src/rules/ruff/rules/non_empty_init_module.rs
Outdated
Show resolved
Hide resolved
crates/ruff_linter/src/rules/ruff/rules/non_empty_init_module.rs
Outdated
Show resolved
Hide resolved
| impl Violation for NonEmptyInitModule { | ||
| #[derive_message_formats] | ||
| fn message(&self) -> String { | ||
| "Code detected".to_string() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that the message is a bit surprising. It's like, yeah, this is code, what's wrong with it?
Using separate messages based on strictness probably makes sense. For strict mode, maybe init module must be empty (although it doesn't really say what's wrong)?
Co-authored-by: Micha Reiser <[email protected]>
I'm not sure that this is an exhaustive list, and from when I searched around last week, even |
ruff] Add non-empty-init-module (RUF070)ruff] Add non-empty-init-module (RUF067)
|
Part of the rationale is currently that “Including code in |
|
Good catch, thanks! That rationale works for the strict form of the rule, but I agree that it doesn't make much sense next to the first example I have here. I'll try to come up with something better. |
Summary
This PR adds a new rule,
non-empty-init-module, which restricts the kind ofcode that can be included in an
__init__.pyfile. By default, docstrings,imports, and assignments to
__all__are allowed. When the new configurationoption
lint.ruff.strictly-empty-init-modulesis enabled, no code at all isallowed.
This closes #9848, where these two variants correspond to different rules in the
flake8-empty-init-moduleslinter. The upstream rules are EIM001, which bans all code, and EIM002, which
bans non-import/docstring/
__all__code. Since we discussed folding these intoone rule on Discord, I just added the rule to the
RUFgroup instead ofadding a new
EIMplugin.I'm not really sure we need to flag docstrings even when the strict setting is
enabled, but I just followed upstream for now. Similarly, as I noted in a TODO
comment, we could also allow more statements involving
__all__, such as__all__.append(...)or__all__.extend(...). The current version only allowsassignments, like upstream, as well as annotated and augmented assignments,
unlike upstream.
I think when we discussed this previously, we considered flagging the module
itself as containing code, but for now I followed the upstream implementation of
flagging each statement in the module that breaks the rule (actually the
upstream linter flags each line, including comments). This will obviously be a
bit noisier, emitting many diagnostics for the same module. But this also seems
preferable because it flags every statement that needs to be fixed up front
instead of only emitting one diagnostic for the whole file that persists as you
keep removing more lines. It was also easy to implement in
analyze::statementwithout a separate visitor.
The first commit adds the rule and baseline tests, the second commit adds the
option and a diff test showing the additional diagnostics when the setting is
enabled.
I noticed a small (~2%) performance regression on our largest benchmark, so I also added a cached
Checker::in_init_modulefield and method instead of theChecker::pathmethod. This was almost the only reason for theChecker::pathfield at all, but there's one remaining reference in awarn_user!call.Test Plan
New tests adapted from the upstream linter
Ecosystem Report
I've spot-checked the ecosystem report, and the results look "correct." This is obviously a very noisy rule if you do include code in
__init__.pyfiles. We could make it less noisy by adding more exceptions (e.g. allowingif TYPE_CHECKINGblocks, allowing__getattr__functions, allowing imports fromimportlibassignments), but I'm sort of inclined just to start simple and see what users need.