
Tracking issue for PR 40537 (hint_core_should_pause) #41196

@mstewartgallus

Description

@mstewartgallus
Contributor

Activity

changed the title [-]Tracking issue for PR https://github.com/rust-lang/rust/pull/40537#issuecomment-292972023[/-] [+]Tracking issue for PR 40537 (hint_core_should_pause)[/+] on Apr 10, 2017
added
B-unstable: Blocker: Implemented in the nightly compiler and unstable.
T-libs-api: Relevant to the library API team, which will review and decide on the PR/issue.
on Apr 20, 2017
clarfonthey

clarfonthey commented on May 2, 2017

@clarfonthey
Contributor

Reposting my comment from #41207: the spin crate calls this function cpu_relax; maybe we should do something similar?

I'd much prefer spin_loop_hint or something similar to the current function name. We could even potentially have a warning that suggests using this function, or even better, have the compiler automatically add these instructions inside busy loops.
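For readers unfamiliar with the pattern, here is a minimal sketch of the kind of busy loop under discussion. It assumes the hint is reachable as `std::sync::atomic::spin_loop_hint`, the name this thread later converges on (newer toolchains deprecate it in favor of `std::hint::spin_loop`, hence the `allow`). `wait_for_flag` and `demo` are hypothetical names used only for illustration.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;

/// Busy-waits until `flag` becomes true and returns how many times the
/// loop body ran. Hypothetical helper, not part of any library.
#[allow(deprecated)] // newer toolchains prefer std::hint::spin_loop
pub fn wait_for_flag(flag: &AtomicBool) -> u64 {
    let mut spins = 0;
    while !flag.load(Ordering::Acquire) {
        // Tell the CPU we are in a spin loop; this is only a hint and
        // does not block or yield to the OS scheduler.
        std::sync::atomic::spin_loop_hint();
        spins += 1;
    }
    spins
}

/// Spawns a thread that sets the flag, spins until the store is visible,
/// and returns the final value of the flag.
pub fn demo() -> bool {
    let flag = Arc::new(AtomicBool::new(false));
    let setter = {
        let flag = Arc::clone(&flag);
        thread::spawn(move || flag.store(true, Ordering::Release))
    };
    wait_for_flag(&flag);
    setter.join().unwrap();
    flag.load(Ordering::Acquire)
}
```

The hint goes inside the loop body by hand here; whether a compiler could place it automatically is exactly the open question raised above.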

added
C-tracking-issue: Category: An issue tracking the progress of something like the implementation of an RFC
on Jul 22, 2017
ghost

ghost commented on Oct 5, 2017

@ghost

I like the spin_loop_hint name, but would be okay with spin_wait as well.
Java calls this onSpinWait, in .NET it's SpinWait, and Linux kernel has cpu_relax.

Would it be a good time to propose stabilization after some name bikeshedding?
I've wanted this function in stable Rust for quite a while. :)

alexcrichton

alexcrichton commented on Oct 5, 2017

@alexcrichton
Member

@rfcbot fcp merge

Sounds like a great issue to consider for stabilization!

Proposed name

spin_loop_hint

rfcbot

rfcbot commented on Oct 5, 2017

@rfcbot
Collaborator

Team member @alexcrichton has proposed to merge this. The next step is review by the rest of the tagged teams:

No concerns currently listed.

Once these reviewers reach consensus, this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

added
proposed-final-comment-period: Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off.
on Oct 5, 2017
alexcrichton

alexcrichton commented on Oct 5, 2017

@alexcrichton
Member

To be clear, I also don't have much of a preference on names; @stjepang's preference of spin_loop_hint also seems fine by me.

sfackler

sfackler commented on Oct 5, 2017

@sfackler
Member

spin_loop_hint or spin_wait both seem reasonable.

dtolnay

dtolnay commented on Oct 8, 2017

@dtolnay
Member

I am on board with stabilizing a function with this behavior. But in general, the more a function's behavior may be confusing, the more careful we need to be in selecting a name.

  • 👎 spin_wait because the spin loop code would read as though this call internally does a spin wait, rather than identifies the spin wait containing the call.
  • 👎 cpu_relax because the word "relax" is already claimed in the context of atomics for Ordering::Relaxed.
  • 👎 hint_core_should_pause because "pause" means do nothing for a while, where doing something else for a while may be more what you want.
  • 👍 spin_loop_hint which is what the Intel® 64 and IA-32 Architectures Software Developer’s Manual names the pause instruction.

I also sort of like the naming that WebKit uses -- pause is called YIELD_PROCESSOR and thread::yield_now is called YIELD_THREAD. This frames them as a pair of lower/higher level instructions. Yield processor is a hint to the processor, and yield thread is a hint to the OS's scheduler. Go uses the same naming -- pause is called runtime.procyield. So I'll add another name:

  • 👍 yield_processor.
clarfonthey

clarfonthey commented on Oct 8, 2017

@clarfonthey
Contributor

👍 to spin_loop_hint

dtolnay

dtolnay commented on Oct 18, 2017

@dtolnay
Member

Looking at this again, I prefer yield_processor over spin_loop_hint. It seems like both would need to be documented equally carefully, but yield_processor results in code that is more pleasant to read. Unless yield_processor mischaracterizes how people conceptualize this instruction, I would go with that one. Also, precedent in Go and WebKit is encouraging.

It could also work as std::sync::atomic::yield. Is everything that runs Rust a "processor"?


ghost

ghost commented on Oct 21, 2017

@ghost

@clarcharr wrote:

"Also this is a general question, but how reasonable would it be for the compiler itself to notice these cases and simply insert the instruction into busy loops?"

I don't think the compiler can automatically decide when and where to insert the instruction into a loop. This is something only the programmer knows, really.

Inserting the instruction into a loop is a very delicate matter. Consider how parking_lot does it:

if self.counter <= 10 {
    cpu_relax(4 << self.counter);
} else {
    thread_yield();
}

Here, cpu_relax(t) will execute the instruction t times. With each successive step of the loop, the instruction is executed twice as many times. There's a lot going on in that snippet, and it is something that's very hard for the compiler to do correctly.

There's no recipe for how to properly use the instruction. You just have to carefully tweak the parameters, profile the program, and see what works best. There's no other way. :)
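To make the doubling concrete, here is a self-contained sketch of the backoff described above. It is not parking_lot's actual code: `Backoff`, `Step`, and `step` are hypothetical names, the point at which the counter is incremented is an assumption, and the hint is assumed to be `std::sync::atomic::spin_loop_hint` (deprecated on newer toolchains in favor of `std::hint::spin_loop`). Only the threshold of 10 and the `4 << counter` factor come from the quoted snippet.

```rust
use std::thread;

/// What a single backoff step did (hypothetical type for illustration).
#[derive(Debug, PartialEq, Eq)]
pub enum Step {
    /// Issued the spin hint this many times.
    Spun(u32),
    /// Gave up the time slice to the OS scheduler instead.
    Yielded,
}

/// Hypothetical exponential backoff modeled on the quoted snippet.
pub struct Backoff {
    counter: u32,
}

impl Backoff {
    pub fn new() -> Self {
        Backoff { counter: 0 }
    }

    /// Spins with an exponentially growing number of hints while the
    /// counter is at most 10, then falls back to yielding the thread.
    #[allow(deprecated)] // newer toolchains prefer std::hint::spin_loop
    pub fn step(&mut self) -> Step {
        if self.counter <= 10 {
            let iterations = 4u32 << self.counter;
            for _ in 0..iterations {
                std::sync::atomic::spin_loop_hint();
            }
            // Assumption: increment after spinning, so the first step issues 4 hints.
            self.counter += 1;
            Step::Spun(iterations)
        } else {
            thread::yield_now();
            Step::Yielded
        }
    }
}
```

A caller would invoke `step()` once per failed attempt to acquire a resource. The constants are tuning parameters of the kind that, as noted above, have to be found by profiling.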

Ixrec

Ixrec commented on Oct 24, 2017

@Ixrec
Contributor

I'm not familiar with this function, but after spending a few minutes reading up on it, I believe spin_loop_hint is the only name suggested so far that would not be misleading to me. All of the others contain a word like yield, pause, wait, relax, etc. that implies significantly more impact on execution semantics than a mere hint to the CPU.

main--

main-- commented on Oct 25, 2017

@main--
Contributor

From the Intel Optimization Manual, section 8.4.7:

The PAUSE instruction is typically used with software threads executing on two logical processors located in the same processor core, waiting for a lock to be released. Such short wait loops tend to last between tens and a few hundreds of cycles, so performance-wise it is more beneficial to wait while occupying the CPU than yielding to the OS. When the wait loop is expected to last for thousands of cycles or more, it is preferable to yield to the operating system by calling one of the OS synchronization API functions, such as WaitForSingleObject on Windows* OS.

The PAUSE instruction is intended to:

  • Temporarily provide the sibling logical processor (ready to make forward progress exiting the spin loop) with competitively shared hardware resources. [...]
  • Save power consumed by the processor core compared to executing equivalent spin loop instruction sequence [...]

The latency of PAUSE instruction in prior generation microarchitecture is about 10 cycles, whereas on Skylake microarchitecture it has been extended to as many as 140 cycles.

The increased latency (allowing more effective utilization of competitively-shared microarchitectural resources to the logical processor ready to make forward progress) has a small positive performance impact of 1-2% on highly threaded applications. [...]

As the PAUSE latency has been increased significantly, workloads that are sensitive to PAUSE latency will suffer some performance loss.

The instruction's original function (hint to avoid memory order violation) has been extended quite a bit. So yes, it does yield resources.
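As an illustration of the short wait loops the manual describes, here is a sketch of a test-and-test-and-set spin lock that issues the hint while waiting. `SpinLock` and `count_with_lock` are hypothetical names, and the hint is assumed to be `std::sync::atomic::spin_loop_hint` (deprecated on newer toolchains in favor of `std::hint::spin_loop`). It never yields to the OS, so it only suits waits expected to be very short.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical minimal spin lock, for illustration only.
pub struct SpinLock {
    locked: AtomicBool,
}

impl SpinLock {
    pub const fn new() -> Self {
        SpinLock { locked: AtomicBool::new(false) }
    }

    #[allow(deprecated)] // newer toolchains prefer std::hint::spin_loop
    pub fn lock(&self) {
        // Try to take the lock; on failure, wait on plain loads so the
        // cache line is not hammered with writes.
        while self
            .locked
            .compare_exchange_weak(false, true, Ordering::Acquire, Ordering::Relaxed)
            .is_err()
        {
            while self.locked.load(Ordering::Relaxed) {
                // Hint that this is a spin-wait; on x86 this is where a
                // PAUSE instruction would be emitted.
                std::sync::atomic::spin_loop_hint();
            }
        }
    }

    pub fn unlock(&self) {
        self.locked.store(false, Ordering::Release);
    }
}

/// Increments a shared counter from several threads under the lock and
/// returns the final total.
pub fn count_with_lock(threads: usize, per_thread: usize) -> usize {
    let lock = Arc::new(SpinLock::new());
    let total = Arc::new(AtomicUsize::new(0));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let lock = Arc::clone(&lock);
            let total = Arc::clone(&total);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    lock.lock();
                    // Separate load and store: correct only because the
                    // lock serializes this critical section.
                    let v = total.load(Ordering::Relaxed);
                    total.store(v + 1, Ordering::Relaxed);
                    lock.unlock();
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    total.load(Ordering::Relaxed)
}
```

For waits that may last thousands of cycles, the quoted guidance suggests handing control to the OS instead, for example through a blocking primitive such as `std::sync::Mutex`.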

alexcrichton

alexcrichton commented on Oct 31, 2017

@alexcrichton
Member

I'm going to go ahead and check @BurntSushi's checkbox as it's been a while and I think he's quite busy!

added
final-comment-period: In the final comment period and will be merged soon unless new substantive objections are raised.
and removed
proposed-final-comment-period: Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off.
on Oct 31, 2017
rfcbot

rfcbot commented on Oct 31, 2017

@rfcbot
Collaborator

🔔 This is now entering its final comment period, as per the review above. 🔔

rfcbot

rfcbot commented on Nov 10, 2017

@rfcbot
Collaborator

The final comment period is now complete.

bstrie

bstrie commented on Nov 17, 2017

@bstrie
Contributor

I marginally prefer spin_loop_hint over yield_processor, if only because it has zero chance of confusion with, er, whatever the yield keyword ends up doing.

added a commit that references this issue on Nov 27, 2017
ghost

ghost commented on Jan 11, 2018

@ghost

In #43751, there has been some talk of stabilizing module std::intrinsics or introducing std::hints. I wonder if such a module would be a better fit for spin_loop_hint than std::sync::atomic - what does everyone think?

(just a reminder: this feature is currently in beta, scheduled for stabilization in 1.24)

ghost

ghost commented on Apr 12, 2018

@ghost

This feature was stabilized. Let's close the issue.

