Add `failfast` option #133
I like the feature and I agree with the current API. 👍
I didn't manage to get through all of this today, will continue tomorrow.
This looks good to me!
It's a bit sad that we can't guarantee actual fast failing when `nworkers > 1`, since then the test items currently scheduled on other test workers might take up to `timeout` seconds to finish. What if we pretend that the `timeout` is effectively zero and just kill any other workers to guarantee truly fast failure? The user likely doesn't care about all the test results when they run `runtests` with `failfast`... especially if we'd call the kwarg `failfastandfurious` :)
@@ -278,7 +301,7 @@ end
 # By tracking and reusing test environments, we can avoid this issue.
 const TEST_ENVS = Dict{String, String}()

-function _runtests(ti_filter, paths, nworkers::Int, nworker_threads::String, worker_init_expr::Expr, test_end_expr::Expr, testitem_timeout::Int, retries::Int, memory_threshold::Real, verbose_results::Bool, debug::Int, report::Bool, logs::Symbol, timeout_profile_wait::Int, gc_between_testitems::Bool)
+function _runtests(ti_filter, paths, nworkers::Int, nworker_threads::String, worker_init_expr::Expr, test_end_expr::Expr, testitem_timeout::Int, retries::Int, memory_threshold::Real, verbose_results::Bool, debug::Int, report::Bool, logs::Symbol, timeout_profile_wait::Int, gc_between_testitems::Bool, failfast::Bool, testitem_failfast::Bool)
Not necessarily for this PR, but I think we should just bundle all these args into a Context struct
for sure! #186
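For illustration only, bundling the arguments could look roughly like the sketch below. `RunContext`, `describe`, and every default value here are hypothetical and are not taken from ReTestItems.jl or from #186; only the field names come from the `_runtests` signature in the diff above.

```julia
# Hypothetical sketch: `RunContext` and its default values are illustrative
# assumptions, not ReTestItems.jl's actual type or defaults.
Base.@kwdef struct RunContext
    nworkers::Int = 0
    nworker_threads::String = "1"
    testitem_timeout::Int = 60
    retries::Int = 0
    failfast::Bool = false
    testitem_failfast::Bool = false
end

# A function can then take one context value instead of a long
# positional argument list.
describe(ctx::RunContext) = "nworkers=$(ctx.nworkers), failfast=$(ctx.failfast)"

# Only the options that differ from the defaults need to be spelled out.
ctx = RunContext(; nworkers=2, failfast=true)
```

Adding a new option such as `failfast` would then mean adding one field rather than touching every call site of `_runtests`.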
Yeah, i wonder about that too 🤔 It was so long ago that i did this i can't remember why i didn't go kill the other workers... I'll need to look into it. I suspect/hope it was laziness/simplicity (i.e. this implementation is so simple, because we just have
In fact, i wonder if it's even worse than "might take up to timeout seconds to finish", because of retries? in which case that could be pretty bad, and we'd have to just name this
Buuut, also i've no time to work on this at the minute, and it's a pain to keep rebasing and a bit of a shame not to have it at least in its current form... so i think i might just update the documentation to call out that we wait for testitems on other workers to finish in the current implementation, and say this may change in future releases to proactively cancel other running testitems to enable even faster failures, and merge what's here (i.e. let us land that improvement in a follow-up, non-breaking release) -- what do you think?
To make clear this may change in a non-breaking release
Good point about the retries! I wonder if retries could be skipped relatively easily by checking first whether the run has been canceled. But I think it's fine to refine the behavior of this feature in the future 👍
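A minimal sketch of that idea, assuming a shared cancellation flag that is checked before every attempt. `run_with_retries`, `run_once`, and `cancelled` are hypothetical names for illustration and do not reflect how ReTestItems.jl schedules retries internally.

```julia
# Hypothetical sketch, not ReTestItems.jl internals.
# `run_once` runs one attempt and returns `true` on success.
# `cancelled` is a flag that a failfast run would set after the first failure.
function run_with_retries(run_once::Function, retries::Int, cancelled::Threads.Atomic{Bool})
    attempts = 0
    for _ in 0:retries                 # one initial attempt plus `retries` retries
        cancelled[] && break           # skip remaining attempts once the run is cancelled
        attempts += 1
        run_once() && return (passed=true, attempts=attempts)
    end
    return (passed=false, attempts=attempts)
end

cancelled = Threads.Atomic{Bool}(false)
always_fails() = false

# Not cancelled: the initial attempt and both retries are used.
r1 = run_with_retries(always_fails, 2, cancelled)

# Cancelled before starting: no attempt is made at all.
cancelled[] = true
r2 = run_with_retries(always_fails, 2, cancelled)
```

This only avoids starting new attempts; an attempt that is already running would still have to finish or time out.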
- `go test` has `-failfast`
- `cargo test` has `--no-fail-fast`
- `pytest` has `--exitfirst`
- `rspec` has `--fail-fast`
- `jest` has `bail` (https://jestjs.io/docs/configuration#bail-number--boolean)
- Julia's `Test` stdlib has `failfast` since Julia v1.9 (JuliaLang/julia@88def1a)

Since for us "running tests" involves running test-items which themselves can run multiple tests, we can "fail fast" at two levels:
- stop `runtests` from running new testitems as soon as one returns as a failure/error (mark that whole run as a failure)
- stop `@testitem` from running the tests inside as soon as there is a failure/error (mark that test-item as a failure)

Originally this PR added just a `failfast` keyword to `runtests`:
- `runtests(..., failfast=true)` stops as soon as any testitem fails.

But i've extended it to add the ability for an `@testitem` to set `failfast=true`:
- `@testitem "foo" failfast=true` stops on the first test error/failure.

This can be set for all test-items by a second new `runtests` keyword named `testitem_failfast` (i.e. this can be used to set the default for all testitems). This matches how `@testitem "foo" timeout=60` corresponds to `runtests(...; testitem_timeout=60)`.

`failfast` defaults to `false`. I've set `testitem_failfast` to default to the same value as given to `failfast` (if not set explicitly on a `@testitem`).

So the proposed behaviour is:
- `runtests(...)` => neither runtests nor individual testitems stop early
- `runtests(...; failfast=true)` => both stop early
- `runtests(...; testitem_failfast=true)` => only testitems stop early
- `runtests(...; failfast=true, testitem_failfast=false)` => only runtests stops early

API question

Would it be simpler to have the separate keywords operate independently?
The downside would be you have to set both to `true` to get "fail fastest", i.e. to both have individual testitems stop when they hit an error/failure, and to have no new testitems run once one has hit an error/failure, you have to run with `runtests(...; failfast=true, testitem_failfast=true)`.

Alternative (keywords operate independently):
- `runtests(...)` => neither runtests nor individual testitems stop early
- `runtests(...; failfast=true)` => only runtests stops early
- `runtests(...; testitem_failfast=true)` => only testitems stop early
- `runtests(...; failfast=true, testitem_failfast=true)` => both stop early

Really the question is whether `failfast=true` should default to turning on both forms of early stopping (as currently proposed by this PR), or if it should mean only runtests fails fast?

I would quite like the separation of the two... one keyword controls one, the other controls the other... BUT I think in practice it is more ergonomic for users to just pass `failfast=true` to get the "fastest" failures (hence proposing that)

...i'd love some feedback on this decision!
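
To make the difference between the two options concrete, here is a small sketch of how the keyword defaults could resolve under each one. `resolve_proposed` and `resolve_independent` are hypothetical helpers written for this comparison and are not part of the PR; the per-`@testitem` override is left out for brevity.

```julia
# Hypothetical helpers for comparing the two options; not code from this PR.

# Proposed: `testitem_failfast` follows `failfast` unless set explicitly.
# `nothing` stands for "not set by the user".
function resolve_proposed(; failfast::Bool=false,
                            testitem_failfast::Union{Nothing,Bool}=nothing)
    ti = testitem_failfast === nothing ? failfast : testitem_failfast
    return (runtests_stops_early=failfast, testitems_stop_early=ti)
end

# Alternative: the two keywords are independent and both default to `false`.
function resolve_independent(; failfast::Bool=false, testitem_failfast::Bool=false)
    return (runtests_stops_early=failfast, testitems_stop_early=testitem_failfast)
end

proposed = resolve_proposed(; failfast=true)        # both levels stop early
independent = resolve_independent(; failfast=true)  # only runtests stops early
```

The two options only disagree when `failfast=true` is passed without `testitem_failfast`, which is exactly the case the API question is about.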