You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
For example if the run-pass test suite passes entirely then we shouldn't re-run it if none of the dependencies have changed.
We should also be sure though that if there's a test filter that we don't consider the run-pass test suite as passing, an annoyance with the makefiles!
Thinking back to #36032 I think the true solution is to teach cargo to test the whole dependency closure of the given package and cache suites exit statuses. Then cargo -p --recursive --cache-result std; cargo -p --recursive --cache-result rustc Just Works™.
What if this was done in compiletest? Doesn't it already record success of each test? It would significantly speed up fixing up a failing test, for example.
Using compiletest sounds plausible, yeah! It'd have to check the mtime of not only the input files, though, but also other inputs like the compiler and standard libraries (just to be certain)
I'd like to complain about this once more, because this is blocking my work unless I switch to the old build system.
Rustbuild is not usable for testing in presence of transient failures ("can't create directory" or something like this).
With the old build system make check can be run few times in order until it "saturates", then it's clear that all tests are passing. x.py test doesn't saturate, it just runs the whole test suite again and again. So,
you need to check the error code returned by x.py test to make sure tests passed. Thankfully, $? exists in MSYS2, so this check can be done post-factum.
if some test fails due to a spurious failure you have to rerun the whole test suite again to make sure there's no legit failures, and there's no guarantees it won't fail spuriously again. I've made an experiment and run x.py test 5 times (2 times yesterday and 3 today), it took a whole lot of time and each time it failed with failed to create dir in different places, so I'm yet to observe a single successful completion of x.py test.
@alexcrichton
Despite not using the horrible make language anymore most of complexity is still there because it's specific to the build system and infrastructure themselves rather than to language. So I guess you'll have to continue fixing most of the issues yourself.
Do you have any plans to implement a fix for this specific issue?
Can you at least design and describe the solution implementable by someone without much understanding of rustbuild or build systems in general?
@petrochenkov IMO we can fix this at the compiletest level - AFAIK it already outputs some files in case of success, all it needs to do is check timestamps on them and compare to the test source(s).
rustbuild relies primarily on Cargo for knowing what to rebuild, so it seems fitting to have compiletest track tests, not to mention the huge benefit of retesting only what hasn't already passed.
IMO we can fix this at the compiletest level - AFAIK it already outputs some files in case of success, all it needs to do is check timestamps on them and compare to the test source(s)
I'm continuing to investigate how rustbuild works and will try to look at compiletest as well today, maybe I'll come up with some simple solution if it exists.
LLVM build has very similar problem - if spurious failure (or Ctrl+C) happens, then the whole LLVM build is done once more instead of building only remaining parts. Looks like at least few dependency tracking babies were thrown out with make bathwater.
Rustbuild is not usable for testing in presence of transient failures
We definitely need to investigate and fix these problems. Transient failures are unacceptable and need to always be weeded out ASAP. We shouldn't architect everything around making these sorts of test failures easier to work with.
I do realize, though, that debugging these issues are often quite difficult. I've spent many hours/days of my life tracking down spurious failures on the bots through PRs but 100% of the time it's always been an important bug we needed to fix regardless!
... most of complexity is still there because it's specific to the build system and infrastructure themselves rather than to language. So I guess you'll have to continue fixing most of the issues yourself.
One of the primary goals of rustbuild was to make the build system far more approachable than the old Makefiles. Out of curiosity, have you taken a look at the internals of the build system? I've tried to be quite thorough in terms of documentation both at src/bootstrap/README.md and src/bootstrap/*.rs, but I'm sure there's always opportunity for more documentation!
Another major goal of rustbuild is to precisely dispel the notion that I personally must fix all build system issues. It's not scalable to rely on me to fix issues or really reasonable to have all information silo'd in one place. My hope is that rustbuild's barrier to entry is vastly lower than the makefiles and with some motivation and documentation it should be just as easy to get contributors ramped up on the build system as it is with the compiler/standard library.
Along those lines I'd like to challenge the notion of "so I guess you'll have to continue fixing most of the issues yourself". I'm sorry if you're feeling bad about the makefiles to rustbuild transition (you've opened quite a few issues where you seem quite annoyed), but this is a great time to help identify all these problems and work towards solutions. The makefiles I'm sure had something on the order of years of man-hours put into them to get everything working just the way it did. Unfortunately they reached the limit where no one could understand them and we still had more ambitions for our build system and change was needed.
My goal right now is to make rustbuild as easy as possible to understand and modify and if it's not there yet then there's work still to be done. I'd love for constructive feedback about how to improve the situation or any documentation/refactorings needed!
Do you have any plans to implement a fix for this specific issue? Can you at least design and describe the solution implementable by someone without much understanding of rustbuild or build systems in general?
I believe @eddyb's solution is the most promising so far. I would imagine the steps for this look like:
Inside compiletest, we need to learn about the dependencies of each test. This includes the source file (including aux files) of the test itself plus the compiler used to compile the test. This may likely also include the standard library of the compiler being used.
For each test, we'll have a "stamp" file of a successful test run. These stamps will likely go into a temporary directory that compiletest works with.
When each test completes successfully, the stamp is created.
When running a test, if any dependency file has a later modification time than the stamp, the test is rerun. Otherwise the test is skipped and assumed successful.
We'd want to be careful that make clean definitely removes all these stamp files to ensure we don't accidentally skip tests on buildbot, but otherwise we can flesh out bugs with this as they come up.
LLVM build has very similar problem - if spurious failure (or Ctrl+C) happens, then the whole LLVM build is done once more instead of building only remaining parts.
Yes, very little effort has been made to ensure this incremental rebuild works. Right now the "let's always start from a guaranteed state" hammer is applied but it should be easy to modify this. The llvm function in src/bootstrap/native.rs is entirely responsible for compiling LLVM, and you'll fine the relevant remove_dir_all in there for where this happens.
We can likely take a similar approach with reconfiguring and such to avoid deleting this directory, though. Or perhaps we can just avoid deletion unless src/rustllvm/llvm-auto-clean-trigger changes.
Again, though, this should be relatively easy to fix, and shouldn't be blocked on me! In any case this should also become a separate issue, it's likely too distracting to discuss this here further.
I'm sorry if you're feeling bad about the makefiles to rustbuild transition (you've opened quite a few issues where you seem quite annoyed)...
Of course I'm annoyed, I have to fight with infra instead of doing what I was going to do.
I don't feel bad however, it's still for the better - MIR annoyed people with ICEs for quite some time too, but the result is certainly worth it.
...but this is a great time to help identify all these problems...
That's what I do!
...and work towards solutions
... and that's what I don't do, probably. I wish I were a student with plenty of free time and energy. I've looked through the code though and it's clear enough, unless some hairy cross-compiling stuff, or unfamiliar platforms, or distribution are involved, so maybe I'll still fix something quickly fixable.
Transient failures are unacceptable and need to always be weeded out ASAP. We shouldn't architect everything around making these sorts of test failures easier to work with.
Solving concrete transient failures and solving this issue are orthogonal IMO. I agree that "failed to create dir" issues would better be fixed, but there are non-transient issues as well, not all platforms are equally well and timely supported (e.g. half of gdb tests fail on my Linux machine with older Ubuntu). Also, there's always chance you'll urgently need to Ctrl+C in the middle of tests for some reason. So, we shouldn't necessarily "architect everything around" these issues, but a workaround should certainly exist.
Activity
alexcrichton commentedon Sep 10, 2016
@sanxiyn would love to tackle!
Ericson2314 commentedon Sep 22, 2016
Thinking back to #36032 I think the true solution is to teach cargo to test the whole dependency closure of the given package and cache suites exit statuses. Then
cargo -p --recursive --cache-result std; cargo -p --recursive --cache-result rustc
Just Works™.eddyb commentedon Sep 27, 2016
What if this was done in compiletest? Doesn't it already record success of each test? It would significantly speed up fixing up a failing test, for example.
alexcrichton commentedon Sep 27, 2016
Using compiletest sounds plausible, yeah! It'd have to check the mtime of not only the input files, though, but also other inputs like the compiler and standard libraries (just to be certain)
petrochenkov commentedon Dec 10, 2016
I'd like to complain about this once more, because this is blocking my work unless I switch to the old build system.
Rustbuild is not usable for testing in presence of transient failures ("can't create directory" or something like this).
With the old build system
make check
can be run few times in order until it "saturates", then it's clear that all tests are passing.x.py test
doesn't saturate, it just runs the whole test suite again and again. So,x.py test
to make sure tests passed. Thankfully,$?
exists in MSYS2, so this check can be done post-factum.x.py test
5 times (2 times yesterday and 3 today), it took a whole lot of time and each time it failed withfailed to create dir
in different places, so I'm yet to observe a single successful completion ofx.py test
.@alexcrichton
Despite not using the horrible
make
language anymore most of complexity is still there because it's specific to the build system and infrastructure themselves rather than to language. So I guess you'll have to continue fixing most of the issues yourself.Do you have any plans to implement a fix for this specific issue?
Can you at least design and describe the solution implementable by someone without much understanding of rustbuild or build systems in general?
eddyb commentedon Dec 10, 2016
@petrochenkov IMO we can fix this at the compiletest level - AFAIK it already outputs some files in case of success, all it needs to do is check timestamps on them and compare to the test source(s).
rustbuild relies primarily on Cargo for knowing what to rebuild, so it seems fitting to have compiletest track tests, not to mention the huge benefit of retesting only what hasn't already passed.
petrochenkov commentedon Dec 10, 2016
I'm continuing to investigate how rustbuild works and will try to look at compiletest as well today, maybe I'll come up with some simple solution if it exists.
petrochenkov commentedon Dec 10, 2016
LLVM build has very similar problem - if spurious failure (or Ctrl+C) happens, then the whole LLVM build is done once more instead of building only remaining parts. Looks like at least few dependency tracking babies were thrown out with
make
bathwater.eddyb commentedon Dec 10, 2016
@petrochenkov That's surprising, we invoke
cmake
and it should have everything it needs to restart.alexcrichton commentedon Dec 11, 2016
@petrochenkov
We definitely need to investigate and fix these problems. Transient failures are unacceptable and need to always be weeded out ASAP. We shouldn't architect everything around making these sorts of test failures easier to work with.
I do realize, though, that debugging these issues are often quite difficult. I've spent many hours/days of my life tracking down spurious failures on the bots through PRs but 100% of the time it's always been an important bug we needed to fix regardless!
One of the primary goals of rustbuild was to make the build system far more approachable than the old Makefiles. Out of curiosity, have you taken a look at the internals of the build system? I've tried to be quite thorough in terms of documentation both at
src/bootstrap/README.md
andsrc/bootstrap/*.rs
, but I'm sure there's always opportunity for more documentation!Another major goal of rustbuild is to precisely dispel the notion that I personally must fix all build system issues. It's not scalable to rely on me to fix issues or really reasonable to have all information silo'd in one place. My hope is that rustbuild's barrier to entry is vastly lower than the makefiles and with some motivation and documentation it should be just as easy to get contributors ramped up on the build system as it is with the compiler/standard library.
Along those lines I'd like to challenge the notion of "so I guess you'll have to continue fixing most of the issues yourself". I'm sorry if you're feeling bad about the makefiles to rustbuild transition (you've opened quite a few issues where you seem quite annoyed), but this is a great time to help identify all these problems and work towards solutions. The makefiles I'm sure had something on the order of years of man-hours put into them to get everything working just the way it did. Unfortunately they reached the limit where no one could understand them and we still had more ambitions for our build system and change was needed.
My goal right now is to make rustbuild as easy as possible to understand and modify and if it's not there yet then there's work still to be done. I'd love for constructive feedback about how to improve the situation or any documentation/refactorings needed!
I believe @eddyb's solution is the most promising so far. I would imagine the steps for this look like:
We'd want to be careful that
make clean
definitely removes all these stamp files to ensure we don't accidentally skip tests on buildbot, but otherwise we can flesh out bugs with this as they come up.Yes, very little effort has been made to ensure this incremental rebuild works. Right now the "let's always start from a guaranteed state" hammer is applied but it should be easy to modify this. The
llvm
function insrc/bootstrap/native.rs
is entirely responsible for compiling LLVM, and you'll fine the relevantremove_dir_all
in there for where this happens.We can likely take a similar approach with reconfiguring and such to avoid deleting this directory, though. Or perhaps we can just avoid deletion unless
src/rustllvm/llvm-auto-clean-trigger
changes.Again, though, this should be relatively easy to fix, and shouldn't be blocked on me! In any case this should also become a separate issue, it's likely too distracting to discuss this here further.
petrochenkov commentedon Dec 11, 2016
@alexcrichton
Of course I'm annoyed, I have to fight with infra instead of doing what I was going to do.
I don't feel bad however, it's still for the better - MIR annoyed people with ICEs for quite some time too, but the result is certainly worth it.
That's what I do!
... and that's what I don't do, probably. I wish I were a student with plenty of free time and energy. I've looked through the code though and it's clear enough, unless some hairy cross-compiling stuff, or unfamiliar platforms, or distribution are involved, so maybe I'll still fix something quickly fixable.
petrochenkov commentedon Dec 11, 2016
Solving concrete transient failures and solving this issue are orthogonal IMO. I agree that "failed to create dir" issues would better be fixed, but there are non-transient issues as well, not all platforms are equally well and timely supported (e.g. half of
gdb
tests fail on my Linux machine with older Ubuntu). Also, there's always chance you'll urgently need to Ctrl+C in the middle of tests for some reason. So, we shouldn't necessarily "architect everything around" these issues, but a workaround should certainly exist.petrochenkov commentedon Feb 3, 2017
From testing 39321aa:
rsbegin.o
andrsend.o
are always more fresh than timestamps, so tests are still rerun unconditionally.rsbegin.o
/rsend.o
I finally managed to successfully run the full test suite with rustbuild. Yay.touch
ing their timestamps works.../x.py test
with all tests previously passed is currently3m17s
on my machine (around5s
for the old build system).alexcrichton commentedon Feb 3, 2017
Thanks for testing @petrochenkov! Want to send a PR w/ the rsbegin/rsend patch after that PR lands?
Rollup merge of rust-lang#39892 - petrochenkov:rt, r=alexcrichton
Rollup merge of rust-lang#39892 - petrochenkov:rt, r=alexcrichton
Rollup merge of rust-lang#39892 - petrochenkov:rt, r=alexcrichton
Auto merge of #39892 - petrochenkov:rt, r=alexcrichton
Rollup merge of rust-lang#39892 - petrochenkov:rt, r=alexcrichton
alexcrichton commentedon May 30, 2017
I believe this is more-or-less done now, so closing.