rustbuild: Don't re-run tests suites that already passed #36385

New issue

Closed

rustbuild: Don't re-run tests suites that already passed#36385

Assignees

Labels

T-bootstrap

alexcrichton

opened

on Sep 10, 2016

Member

For example if the run-pass test suite passes entirely then we shouldn't re-run it if none of the dependencies have changed.

We should also be sure though that if there's a test filter that we don't consider the run-pass test suite as passing, an annoyance with the makefiles!

added

assigned

MemberAuthor

@sanxiyn would love to tackle!

Ericson2314

Contributor

Thinking back to #36032 I think the true solution is to teach cargo to test the whole dependency closure of the given package and cache suites exit statuses. Then cargo -p --recursive --cache-result std; cargo -p --recursive --cache-result rustc Just Works™.

eddyb

Member

What if this was done in compiletest? Doesn't it already record success of each test? It would significantly speed up fixing up a failing test, for example.

alexcrichton

MemberAuthor

Using compiletest sounds plausible, yeah! It'd have to check the mtime of not only the input files, though, but also other inputs like the compiler and standard libraries (just to be certain)

petrochenkov

Contributor

I'd like to complain about this once more, because this is blocking my work unless I switch to the old build system.

Rustbuild is not usable for testing in presence of transient failures ("can't create directory" or something like this).
With the old build system make check can be run few times in order until it "saturates", then it's clear that all tests are passing.
x.py test doesn't saturate, it just runs the whole test suite again and again. So,

you need to check the error code returned by x.py test to make sure tests passed. Thankfully, $? exists in MSYS2, so this check can be done post-factum.
if some test fails due to a spurious failure you have to rerun the whole test suite again to make sure there's no legit failures, and there's no guarantees it won't fail spuriously again. I've made an experiment and run x.py test 5 times (2 times yesterday and 3 today), it took a whole lot of time and each time it failed with failed to create dir in different places, so I'm yet to observe a single successful completion of x.py test.

@alexcrichton
Despite not using the horrible make language anymore most of complexity is still there because it's specific to the build system and infrastructure themselves rather than to language. So I guess you'll have to continue fixing most of the issues yourself.
Do you have any plans to implement a fix for this specific issue?
Can you at least design and describe the solution implementable by someone without much understanding of rustbuild or build systems in general?

eddyb

Member

@petrochenkov IMO we can fix this at the compiletest level - AFAIK it already outputs some files in case of success, all it needs to do is check timestamps on them and compare to the test source(s).
rustbuild relies primarily on Cargo for knowing what to rebuild, so it seems fitting to have compiletest track tests, not to mention the huge benefit of retesting only what hasn't already passed.

petrochenkov

Contributor

IMO we can fix this at the compiletest level - AFAIK it already outputs some files in case of success, all it needs to do is check timestamps on them and compare to the test source(s)

I'm continuing to investigate how rustbuild works and will try to look at compiletest as well today, maybe I'll come up with some simple solution if it exists.

petrochenkov

Contributor

LLVM build has very similar problem - if spurious failure (or Ctrl+C) happens, then the whole LLVM build is done once more instead of building only remaining parts. Looks like at least few dependency tracking babies were thrown out with make bathwater.

eddyb

Member

@petrochenkov That's surprising, we invoke cmake and it should have everything it needs to restart.

alexcrichton

MemberAuthor

@petrochenkov

Rustbuild is not usable for testing in presence of transient failures

We definitely need to investigate and fix these problems. Transient failures are unacceptable and need to always be weeded out ASAP. We shouldn't architect everything around making these sorts of test failures easier to work with.

I do realize, though, that debugging these issues are often quite difficult. I've spent many hours/days of my life tracking down spurious failures on the bots through PRs but 100% of the time it's always been an important bug we needed to fix regardless!

... most of complexity is still there because it's specific to the build system and infrastructure themselves rather than to language. So I guess you'll have to continue fixing most of the issues yourself.

One of the primary goals of rustbuild was to make the build system far more approachable than the old Makefiles. Out of curiosity, have you taken a look at the internals of the build system? I've tried to be quite thorough in terms of documentation both at src/bootstrap/README.md and src/bootstrap/*.rs, but I'm sure there's always opportunity for more documentation!

Another major goal of rustbuild is to precisely dispel the notion that I personally must fix all build system issues. It's not scalable to rely on me to fix issues or really reasonable to have all information silo'd in one place. My hope is that rustbuild's barrier to entry is vastly lower than the makefiles and with some motivation and documentation it should be just as easy to get contributors ramped up on the build system as it is with the compiler/standard library.

Along those lines I'd like to challenge the notion of "so I guess you'll have to continue fixing most of the issues yourself". I'm sorry if you're feeling bad about the makefiles to rustbuild transition (you've opened quite a few issues where you seem quite annoyed), but this is a great time to help identify all these problems and work towards solutions. The makefiles I'm sure had something on the order of years of man-hours put into them to get everything working just the way it did. Unfortunately they reached the limit where no one could understand them and we still had more ambitions for our build system and change was needed.

My goal right now is to make rustbuild as easy as possible to understand and modify and if it's not there yet then there's work still to be done. I'd love for constructive feedback about how to improve the situation or any documentation/refactorings needed!

Do you have any plans to implement a fix for this specific issue? Can you at least design and describe the solution implementable by someone without much understanding of rustbuild or build systems in general?

I believe @eddyb's solution is the most promising so far. I would imagine the steps for this look like:

Inside compiletest, we need to learn about the dependencies of each test. This includes the source file (including aux files) of the test itself plus the compiler used to compile the test. This may likely also include the standard library of the compiler being used.
For each test, we'll have a "stamp" file of a successful test run. These stamps will likely go into a temporary directory that compiletest works with.
When each test completes successfully, the stamp is created.
When running a test, if any dependency file has a later modification time than the stamp, the test is rerun. Otherwise the test is skipped and assumed successful.

We'd want to be careful that make clean definitely removes all these stamp files to ensure we don't accidentally skip tests on buildbot, but otherwise we can flesh out bugs with this as they come up.

LLVM build has very similar problem - if spurious failure (or Ctrl+C) happens, then the whole LLVM build is done once more instead of building only remaining parts.

Yes, very little effort has been made to ensure this incremental rebuild works. Right now the "let's always start from a guaranteed state" hammer is applied but it should be easy to modify this. The llvm function in src/bootstrap/native.rs is entirely responsible for compiling LLVM, and you'll fine the relevant remove_dir_all in there for where this happens.

We can likely take a similar approach with reconfiguring and such to avoid deleting this directory, though. Or perhaps we can just avoid deletion unless src/rustllvm/llvm-auto-clean-trigger changes.

Again, though, this should be relatively easy to fix, and shouldn't be blocked on me! In any case this should also become a separate issue, it's likely too distracting to discuss this here further.

petrochenkov

Contributor

@alexcrichton

I'm sorry if you're feeling bad about the makefiles to rustbuild transition (you've opened quite a few issues where you seem quite annoyed)...

Of course I'm annoyed, I have to fight with infra instead of doing what I was going to do.
I don't feel bad however, it's still for the better - MIR annoyed people with ICEs for quite some time too, but the result is certainly worth it.

...but this is a great time to help identify all these problems...

That's what I do!

...and work towards solutions

... and that's what I don't do, probably. I wish I were a student with plenty of free time and energy. I've looked through the code though and it's clear enough, unless some hairy cross-compiling stuff, or unfamiliar platforms, or distribution are involved, so maybe I'll still fix something quickly fixable.

petrochenkov

Contributor

Transient failures are unacceptable and need to always be weeded out ASAP. We shouldn't architect everything around making these sorts of test failures easier to work with.

Solving concrete transient failures and solving this issue are orthogonal IMO. I agree that "failed to create dir" issues would better be fixed, but there are non-transient issues as well, not all platforms are equally well and timely supported (e.g. half of gdb tests fail on my Linux machine with older Ubuntu). Also, there's always chance you'll urgently need to Ctrl+C in the middle of tests for some reason. So, we shouldn't necessarily "architect everything around" these issues, but a workaround should certainly exist.

petrochenkov

mentioned this

on Jan 31, 2017

Delete the makefile build system #39431

petrochenkov

Contributor

From testing 39321aa:

It almost works. For some reason rsbegin.o and rsend.o are always more fresh than timestamps, so tests are still rerun unconditionally.
With an extra patch excluding rsbegin.o/rsend.o I finally managed to successfully run the full test suite with rustbuild. Yay.
Disabling individual tests by touching their timestamps works.
Caching and disabling isn't supported for tests in crates.
"Dry" run of ../x.py test with all tests previously passed is currently 3m17s on my machine (around 5s for the old build system).