Description
The new runtime should be able to figure out (a) when all tasks are blocked, in which case it should report a deadlock, and (b) when a task is stuck in a "tight" infinite loop (i.e., not hitting the scheduler). The former can be done precisely; the latter will probably have to be done heuristically with some sort of watchdog thread. This maybe should be two different bugs.
The former will work with 2 global reference counts - one which tracks the number of non-sleeping schedulers, and one which tracks the number of tasks blocked on I/O. When the last scheduler goes to sleep, if the I/O-blocking refcount is zero, it means all tasks are either exited or blocked on pipes. If the latter, the runtime should emit a "your tasks deadlocked and now the process is hanging" message and exit. This will build on the refcounts we'll probably use for #7702.
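A minimal sketch of the two-counter check described above, written against today's `std::sync::atomic` rather than the 2013 runtime. Every name here (`DeadlockMonitor`, `SleepOutcome`, `scheduler_sleep`, the `live_tasks` argument) is hypothetical and only illustrates the idea; it is not the runtime's actual design.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical global bookkeeping: the two counts from the description.
pub struct DeadlockMonitor {
    awake_schedulers: AtomicUsize, // number of non-sleeping schedulers
    io_blocked_tasks: AtomicUsize, // number of tasks blocked on I/O
}

#[derive(Debug, PartialEq)]
pub enum SleepOutcome {
    /// Another scheduler is still running; nothing can be concluded.
    OthersStillAwake,
    /// Everyone is asleep, but pending I/O can still wake a task.
    WaitingOnIo,
    /// Everyone is asleep, no I/O pending, and no task is left alive.
    AllExited,
    /// Everyone is asleep, no I/O pending, yet tasks remain: they must
    /// all be blocked on pipes, so nothing can ever wake them.
    Deadlock,
}

impl DeadlockMonitor {
    pub fn new(schedulers: usize) -> Self {
        DeadlockMonitor {
            awake_schedulers: AtomicUsize::new(schedulers),
            io_blocked_tasks: AtomicUsize::new(0),
        }
    }

    pub fn io_block(&self) {
        self.io_blocked_tasks.fetch_add(1, Ordering::SeqCst);
    }

    pub fn io_unblock(&self) {
        self.io_blocked_tasks.fetch_sub(1, Ordering::SeqCst);
    }

    pub fn scheduler_wake(&self) {
        self.awake_schedulers.fetch_add(1, Ordering::SeqCst);
    }

    /// Called by a scheduler just before it goes to sleep.
    /// `live_tasks` (an assumption of this sketch) is the number of tasks
    /// that have not exited yet.
    pub fn scheduler_sleep(&self, live_tasks: usize) -> SleepOutcome {
        let previously_awake = self.awake_schedulers.fetch_sub(1, Ordering::SeqCst);
        if previously_awake > 1 {
            SleepOutcome::OthersStillAwake
        } else if self.io_blocked_tasks.load(Ordering::SeqCst) > 0 {
            SleepOutcome::WaitingOnIo
        } else if live_tasks == 0 {
            SleepOutcome::AllExited
        } else {
            SleepOutcome::Deadlock
        }
    }
}
```

On `SleepOutcome::Deadlock` the runtime would print the "your tasks deadlocked" message and exit. A real implementation would also have to close the race between the last scheduler going to sleep and an I/O completion arriving, which this sketch ignores.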
Activity
glaebhoerl commented on Jul 19, 2013
Haskell throws an exception to the main thread in this case.
Edit: Oh, but Rust doesn't have async exceptions I think. Never mind me.
Thiez commented on Sep 17, 2013
If you keep track of the tasks that are blocked on acquiring a resource you can detect deadlocks even when they don't block the entire program. This would require the scheduler to somehow keep track of resources, but this could be cheap (it only needs sufficient information to construct a wait-for graph). There could be a low priority task that periodically constructs the graph and performs a cycle detection.
Even if one is unwilling to pay the price of such a task, the wait-for graph could still be constructed by the scheduler when the 'all tasks are deadlocked' scenario occurs. Having an error message that describes the scenario could be really helpful with debugging.
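To make the wait-for graph idea concrete, here is a rough sketch of the cycle detection step in present-day Rust. It assumes the scheduler can already produce a map from each blocked task to the tasks it is waiting on; `TaskId`, `find_cycle`, and the map layout are all hypothetical names chosen for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical task identifier.
pub type TaskId = u32;

#[derive(Clone, Copy, PartialEq)]
enum Mark {
    InProgress, // task is on the current DFS path
    Done,       // task fully explored, no cycle reachable from it
}

/// Returns the tasks forming one wait-for cycle, if any exists.
/// `waits_for[a]` lists the tasks that `a` is blocked on.
pub fn find_cycle(waits_for: &HashMap<TaskId, Vec<TaskId>>) -> Option<Vec<TaskId>> {
    let mut marks: HashMap<TaskId, Mark> = HashMap::new();
    let mut path: Vec<TaskId> = Vec::new();

    // Sort the starting points so the reported cycle is deterministic.
    let mut starts: Vec<TaskId> = waits_for.keys().copied().collect();
    starts.sort();

    for start in starts {
        if let Some(cycle) = visit(start, waits_for, &mut marks, &mut path) {
            return Some(cycle);
        }
    }
    None
}

fn visit(
    task: TaskId,
    waits_for: &HashMap<TaskId, Vec<TaskId>>,
    marks: &mut HashMap<TaskId, Mark>,
    path: &mut Vec<TaskId>,
) -> Option<Vec<TaskId>> {
    match marks.get(&task) {
        Some(Mark::Done) => return None,
        Some(Mark::InProgress) => {
            // Reached a task already on the path: the tail of the path is a cycle.
            let pos = path
                .iter()
                .position(|&t| t == task)
                .expect("in-progress task is on the path");
            return Some(path[pos..].to_vec());
        }
        None => {}
    }

    marks.insert(task, Mark::InProgress);
    path.push(task);
    if let Some(blockers) = waits_for.get(&task) {
        for &next in blockers {
            if let Some(cycle) = visit(next, waits_for, marks, path) {
                return Some(cycle);
            }
        }
    }
    path.pop();
    marks.insert(task, Mark::Done);
    None
}
```

The returned task list is the kind of information a deadlock error message could include. The search visits each task and edge at most once, so running it only when the "all tasks are deadlocked" condition fires would cost little.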
bblum commented on Sep 17, 2013
How would you track wait-for information in the case of pipes? I think this would require tasks to record whenever they give a pipe endpoint away to another task.
eholk commented on Oct 18, 2013
One of the features about pipes that hopefully is still there is that if you were blocking on a receive and the task with the other end of the pipe fails then you would be woken up. Assuming this behavior is still there, instead of crashing when there's a pipe-related deadlock, the scheduler could just pick a task at random and fail it. Rust programs were originally meant to use the crash-only software philosophy, where they would be designed to restart a failed task and recover.
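As an illustration of the wake-on-failure behaviour, here is a small sketch using today's `std::sync::mpsc` channels and OS threads as a stand-in for the pipes and tasks discussed in this thread (an assumption: the 2013 pipe API was different). The function name `recv_after_peer_failure` is made up for the example.

```rust
use std::sync::mpsc;
use std::thread;

/// Blocks on a receive while the only sender's thread fails.
/// Returns whatever `recv` reports once the receiver is woken up.
pub fn recv_after_peer_failure() -> Result<u32, mpsc::RecvError> {
    let (tx, rx) = mpsc::channel::<u32>();

    let peer = thread::spawn(move || {
        // The sending end is owned by this thread and is dropped
        // when the panic unwinds its stack.
        let _tx = tx;
        panic!("simulated task failure");
    });

    // Instead of hanging forever, `recv` returns an error once every
    // sender has been dropped.
    let result = rx.recv();

    // The join result only carries the panic payload; ignore it here.
    let _ = peer.join();
    result
}
```

With default unwinding panics, the blocked receiver gets an `Err` rather than waiting forever, which is the property that would let a scheduler break a pipe deadlock by failing one of the participating tasks.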
bblum commented on Oct 18, 2013
The feature is still there.
thestinger commented on Sep 19, 2014
Closing in favour of #1930 (Thread Sanitizer), since #17325 means Rust will no longer need any special tooling for this kind of debugging.
bblum commented on Sep 19, 2014
I disagree. Detecting data races and detecting deadlocks or infinite loops are totally different challenges.
thestinger commented on Sep 19, 2014
Thread sanitizer does detect deadlocks. Detecting an infinite loop would definitely require using ptrace hacks and doesn't seem to be a challenge specific to Rust. It would need to operate at a machine code / system call level as it would need to detect side effects to find a no-op infinite loop.
bblum commented on Sep 19, 2014
Does #17325 mean the green thread scheduler is gone completely, and rust tasks will be 1:1 with pthreads instead?
thestinger commented on Sep 19, 2014
Yes, it's going to be removed from the standard libraries and likely moved out to a third party repository. It would need to provide a new native IO / concurrency library in addition to a green one if it wants to keep doing the dynamic runtime selection.
bblum commented on Sep 19, 2014
Is there a plan to transfer these scheduler-related issues to the other repository's issue tracker, or is the plan to just forget about them?
thestinger commented on Sep 19, 2014
@bblum: Well, I'm linking every one to #17325 so that GitHub makes a list of the relevant issues and they can then be transferred over. I think many are not going to be relevant to a new implementation without tight integration into the standard libraries.
bblum commented on Sep 19, 2014
I see. Thanks for clarifying.