# Simple Arc implementation (without Weak refs) (#253)
# Base Code

Now that we've decided the layout for our implementation of `Arc`, let's create
some basic code.

## Constructing the Arc

We'll first need a way to construct an `Arc<T>`.

This is pretty simple, as we just need to box the `ArcInner<T>` and get a
`NonNull<T>` pointer to it.

We start the reference counter at 1, as that first reference is the current
pointer. As the `Arc` is cloned or dropped, the count is updated. It is okay to
call `unwrap()` on the `Option` returned by `NonNull::new`, as `Box::into_raw`
guarantees that the pointer it returns is not null.

```rust,ignore
impl<T> Arc<T> {
    pub fn new(data: T) -> Arc<T> {
        // We start the reference count at 1, as that first reference is the
        // current pointer.
        let boxed = Box::new(ArcInner {
            rc: AtomicUsize::new(1),
            data,
        });
        Arc {
            // It is okay to call `.unwrap()` here as we get a pointer from
            // `Box::into_raw` which is guaranteed to not be null.
            ptr: NonNull::new(Box::into_raw(boxed)).unwrap(),
            _marker: PhantomData,
        }
    }
}
```
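The `Box::into_raw` / `NonNull::new` pattern above can be sanity-checked in
isolation. This small standalone sketch (using a plain `i32` in place of our
`ArcInner<T>`) shows that the pointer from `Box::into_raw` is indeed non-null,
and that the allocation can later be reclaimed with `Box::from_raw`, as our
`Drop` implementation will do:

```rust
use std::ptr::NonNull;

fn main() {
    let boxed = Box::new(42_i32);
    // Box::into_raw never returns a null pointer, so NonNull::new succeeds.
    let ptr: NonNull<i32> = NonNull::new(Box::into_raw(boxed)).unwrap();
    // While we hold the unique pointer, reading through it is valid.
    assert_eq!(unsafe { *ptr.as_ref() }, 42);
    // Reclaim the allocation so it is freed (mirroring what Drop will do).
    unsafe { drop(Box::from_raw(ptr.as_ptr())) };
}
```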

## Send and Sync

Since we're building a concurrency primitive, we'll need to be able to send it
across threads. Thus, we can implement the `Send` and `Sync` marker traits. For
more information on these, see [the section on `Send` and
`Sync`](send-and-sync.md).

This is okay because:

* You can get a mutable reference to the value inside an `Arc` only if it is
  the only `Arc` referencing that data
* We use atomic counters for reference counting

```rust,ignore
unsafe impl<T: Sync + Send> Send for Arc<T> {}
unsafe impl<T: Sync + Send> Sync for Arc<T> {}
```

We need the bound `T: Sync + Send` because without it, it would be possible to
share values that are thread-unsafe across a thread boundary via an `Arc`,
which could cause data races or unsoundness.
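As a quick illustration of what these impls buy us, here is a sketch using the
standard library's `Arc` (which carries the same `T: Sync + Send` bounds) as a
stand-in for the version we're writing: because `Arc<String>` is `Send`, a
clone of it can be moved into another thread.

```rust
use std::sync::Arc;
use std::thread;

fn main() {
    let shared = Arc::new(String::from("hello"));
    let for_thread = Arc::clone(&shared);
    // Arc<String> is Send because String is Send + Sync, so the clone can
    // cross the thread boundary.
    let handle = thread::spawn(move || for_thread.len());
    assert_eq!(handle.join().unwrap(), 5);
    // The original Arc is still usable on this thread.
    assert_eq!(shared.as_str(), "hello");
}
```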

## Getting the `ArcInner`

We'll now want to make a private helper function, `inner()`, which just returns
the dereferenced `NonNull` pointer.

To dereference the `NonNull<T>` pointer into a `&T`, we can call
`NonNull::as_ref`. This is unsafe, unlike the typical `as_ref` function, so we
must call it like this:

```rust,ignore
// inside the impl<T> Arc<T> block from before:
fn inner(&self) -> &ArcInner<T> {
    unsafe { self.ptr.as_ref() }
}
```

This unsafety is okay because while this `Arc` is alive, we're guaranteed that
the inner pointer is valid.

Here's all the code from this section:

```rust,ignore
impl<T> Arc<T> {
    pub fn new(data: T) -> Arc<T> {
        // We start the reference count at 1, as that first reference is the
        // current pointer.
        let boxed = Box::new(ArcInner {
            rc: AtomicUsize::new(1),
            data,
        });
        Arc {
            // It is okay to call `.unwrap()` here as we get a pointer from
            // `Box::into_raw` which is guaranteed to not be null.
            ptr: NonNull::new(Box::into_raw(boxed)).unwrap(),
            _marker: PhantomData,
        }
    }

    fn inner(&self) -> &ArcInner<T> {
        // This unsafety is okay because while this Arc is alive, we're
        // guaranteed that the inner pointer is valid.
        unsafe { self.ptr.as_ref() }
    }
}

unsafe impl<T: Sync + Send> Send for Arc<T> {}
unsafe impl<T: Sync + Send> Sync for Arc<T> {}
```
# Cloning

Now that we've got some basic code set up, we'll need a way to clone the `Arc`.

Basically, we need to:

1. Get the `ArcInner` value of the `Arc`
2. Increment the atomic reference count
3. Construct a new instance of the `Arc` from the inner pointer

We can update the atomic reference count as follows:

```rust,ignore
self.inner().rc.fetch_add(1, Ordering::Relaxed);
```

As described in [the standard library's implementation of `Arc` cloning][2]:

> Using a relaxed ordering is alright here, as knowledge of the original
> reference prevents other threads from erroneously deleting the object.
>
> As explained in the [Boost documentation][1]:
> > Increasing the reference counter can always be done with
> > memory_order_relaxed: New references to an object can only be formed from an
> > existing reference, and passing an existing reference from one thread to
> > another must already provide any required synchronization.
>
> [1]: https://www.boost.org/doc/libs/1_55_0/doc/html/atomic/usage_examples.html

[2]: https://github.com/rust-lang/rust/blob/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/alloc/src/sync.rs#L1171-L1181

We'll need to add another import to use `Ordering`:

```rust,ignore
use std::sync::atomic::Ordering;
```

It is possible in some contrived programs (e.g. using `mem::forget`) that the
reference count could overflow, but this won't happen in any reasonable
program.
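The behavior of `fetch_add` itself can be observed directly on a bare
`AtomicUsize`; this small sketch shows that it atomically bumps the counter and
returns the *previous* value:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

fn main() {
    let rc = AtomicUsize::new(1);
    // fetch_add returns the previous value and increments the counter
    // atomically; Relaxed is enough because we only need atomicity here,
    // not ordering with other memory accesses.
    let prev = rc.fetch_add(1, Ordering::Relaxed);
    assert_eq!(prev, 1);
    assert_eq!(rc.load(Ordering::Relaxed), 2);
}
```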
Then, we need to return a new instance of the `Arc`:

```rust,ignore
Self {
    ptr: self.ptr,
    _marker: PhantomData,
}
```

Now, let's wrap this all up inside the `Clone` implementation:

```rust,ignore
use std::sync::atomic::Ordering;

impl<T> Clone for Arc<T> {
    fn clone(&self) -> Arc<T> {
        // Using a relaxed ordering is alright here as knowledge of the original
        // reference prevents other threads from wrongly deleting the object.
        self.inner().rc.fetch_add(1, Ordering::Relaxed);
        Self {
            ptr: self.ptr,
            _marker: PhantomData,
        }
    }
}
```
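The counting behavior we just implemented can be observed on the standard
library's `Arc`, which we use here as a stand-in since it exposes the count via
`Arc::strong_count`:

```rust
use std::sync::Arc;

fn main() {
    let a = Arc::new(10);
    assert_eq!(Arc::strong_count(&a), 1);
    // Cloning bumps the shared reference count.
    let b = a.clone();
    assert_eq!(Arc::strong_count(&a), 2);
    // Dropping a clone decrements it again.
    drop(b);
    assert_eq!(Arc::strong_count(&a), 1);
}
```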
# Deref

Alright. We now have a way to make, clone, and destroy `Arc`s, but how do we get
to the data inside?

What we need now is an implementation of `Deref`.

We'll need to import the trait:

```rust,ignore
use std::ops::Deref;
```

And here's the implementation:

```rust,ignore
impl<T> Deref for Arc<T> {
    type Target = T;

    fn deref(&self) -> &T {
        &self.inner().data
    }
}
```

Pretty simple, eh? This simply dereferences the `NonNull` pointer to the
`ArcInner<T>`, then gets a reference to the data inside.
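Here's what `Deref` buys us in practice, again sketched with the standard
library's `Arc` as a stand-in: method calls pass straight through to the inner
value, and `*` gives the value itself.

```rust
use std::sync::Arc;

fn main() {
    let a = Arc::new(vec![1, 2, 3]);
    // Deref coercion lets us call Vec's methods directly through the Arc...
    assert_eq!(a.len(), 3);
    // ...and an explicit deref yields the inner value.
    assert_eq!(*a, vec![1, 2, 3]);
}
```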
# Dropping

We now need a way to decrease the reference count and drop the data once it is
low enough, otherwise the data will live forever on the heap.

To do this, we can implement `Drop`.

Basically, we need to:

1. Get the `ArcInner` value of the `Arc`
2. Decrement the reference count
3. If there is only one reference remaining to the data, then:
4. Atomically fence the data to prevent reordering of the use and deletion of
   the data, then:
5. Drop the inner data

First, we need to decrement the reference count. We can also fold in step 3 by
returning early if the count is not equal to 1 (since `fetch_sub` returns the
previous value):

```rust,ignore
if self.inner().rc.fetch_sub(1, Ordering::Release) != 1 {
    return;
}
```
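Because `fetch_sub` returns the value *before* the subtraction, a return value
of 1 means the count has just hit 0 and we hold the last reference. A small
sketch on a bare `AtomicUsize`:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

fn main() {
    let rc = AtomicUsize::new(2);
    // First drop: previous value was 2, so another reference still exists.
    assert_eq!(rc.fetch_sub(1, Ordering::Release), 2);
    // Second drop: previous value was 1, so we were the last reference and
    // it is now our job to free the data.
    assert_eq!(rc.fetch_sub(1, Ordering::Release), 1);
    assert_eq!(rc.load(Ordering::Relaxed), 0);
}
```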

We then need to create an atomic fence to prevent reordering of the use of the
data and deletion of the data. As described in [the standard library's
implementation of `Arc`][3]:
> This fence is needed to prevent reordering of use of the data and deletion of
> the data. Because it is marked `Release`, the decreasing of the reference
> count synchronizes with this `Acquire` fence. This means that use of the data
> happens before decreasing the reference count, which happens before this
> fence, which happens before the deletion of the data.
>
> As explained in the [Boost documentation][1],
>
> > It is important to enforce any possible access to the object in one
> > thread (through an existing reference) to *happen before* deleting
> > the object in a different thread. This is achieved by a "release"
> > operation after dropping a reference (any access to the object
> > through this reference must obviously happened before), and an
> > "acquire" operation before deleting the object.
>
> In particular, while the contents of an Arc are usually immutable, it's
> possible to have interior writes to something like a Mutex<T>. Since a Mutex
> is not acquired when it is deleted, we can't rely on its synchronization logic
> to make writes in thread A visible to a destructor running in thread B.
>
> Also note that the Acquire fence here could probably be replaced with an
> Acquire load, which could improve performance in highly-contended situations.
> See [2].
>
> [1]: https://www.boost.org/doc/libs/1_55_0/doc/html/atomic/usage_examples.html
> [2]: https://github.com/rust-lang/rust/pull/41714

[3]: https://github.com/rust-lang/rust/blob/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/alloc/src/sync.rs#L1440-L1467

To do this, we do the following:

```rust,ignore
atomic::fence(Ordering::Acquire);
```

We'll need to import `std::sync::atomic` itself:

```rust,ignore
use std::sync::atomic;
```
Finally, we can drop the data itself. We use `Box::from_raw` to drop the boxed
`ArcInner<T>` and its data. This takes a `*mut T` and not a `NonNull<T>`, so we
must convert using `NonNull::as_ptr`.

```rust,ignore
unsafe { Box::from_raw(self.ptr.as_ptr()); }
```

This is safe as we know we have the last pointer to the `ArcInner` and that its
pointer is valid.

Now, let's wrap this all up inside the `Drop` implementation:

```rust,ignore
impl<T> Drop for Arc<T> {
    fn drop(&mut self) {
        if self.inner().rc.fetch_sub(1, Ordering::Release) != 1 {
            return;
        }
        // This fence is needed to prevent reordering of the use and deletion
        // of the data.
        atomic::fence(Ordering::Acquire);
        // This is safe as we know we have the last pointer to the `ArcInner`
        // and that its pointer is valid.
        unsafe { Box::from_raw(self.ptr.as_ptr()); }
    }
}
```
# Final Code

Here's the final code, with some added comments and re-ordered imports:

```rust
use std::marker::PhantomData;
use std::ops::Deref;
use std::ptr::NonNull;
use std::sync::atomic::{self, AtomicUsize, Ordering};

pub struct Arc<T> {
    ptr: NonNull<ArcInner<T>>,
    _marker: PhantomData<ArcInner<T>>,
}

pub struct ArcInner<T> {
    rc: AtomicUsize,
    data: T,
}

impl<T> Arc<T> {
    pub fn new(data: T) -> Arc<T> {
        // We start the reference count at 1, as that first reference is the
        // current pointer.
        let boxed = Box::new(ArcInner {
            rc: AtomicUsize::new(1),
            data,
        });
        Arc {
            // It is okay to call `.unwrap()` here as we get a pointer from
            // `Box::into_raw` which is guaranteed to not be null.
            ptr: NonNull::new(Box::into_raw(boxed)).unwrap(),
            _marker: PhantomData,
        }
    }

    fn inner(&self) -> &ArcInner<T> {
        // This unsafety is okay because while this Arc is alive, we're
        // guaranteed that the inner pointer is valid. Also, ArcInner<T> is
        // Sync if T is Sync.
        unsafe { self.ptr.as_ref() }
    }
}

unsafe impl<T: Sync + Send> Send for Arc<T> {}
unsafe impl<T: Sync + Send> Sync for Arc<T> {}

impl<T> Clone for Arc<T> {
    fn clone(&self) -> Arc<T> {
        // Using a relaxed ordering is alright here as knowledge of the original
        // reference prevents other threads from wrongly deleting the object.
        self.inner().rc.fetch_add(1, Ordering::Relaxed);
        Self {
            ptr: self.ptr,
            _marker: PhantomData,
        }
    }
}

impl<T> Drop for Arc<T> {
    fn drop(&mut self) {
        if self.inner().rc.fetch_sub(1, Ordering::Release) != 1 {
            return;
        }
        // This fence is needed to prevent reordering of the use and deletion
        // of the data.
        atomic::fence(Ordering::Acquire);
        // This is safe as we know we have the last pointer to the `ArcInner`
        // and that its pointer is valid.
        unsafe { Box::from_raw(self.ptr.as_ptr()); }
    }
}

impl<T> Deref for Arc<T> {
    type Target = T;

    fn deref(&self) -> &T {
        &self.inner().data
    }
}
```
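To see the whole thing exercised end to end, here's a usage sketch. It uses the
standard library's `std::sync::Arc` as a stand-in, since it has the same
interface for the subset we implemented (`new`, `clone`, `Deref`, `Drop`):

```rust
use std::sync::Arc;
use std::thread;

fn main() {
    let data = Arc::new(vec![1, 2, 3]);
    let mut handles = Vec::new();
    for _ in 0..4 {
        // Each thread gets its own clone; the count is bumped atomically.
        let data = Arc::clone(&data);
        // Deref lets each thread read the Vec directly through its Arc.
        handles.push(thread::spawn(move || data.iter().sum::<i32>()));
    }
    for handle in handles {
        assert_eq!(handle.join().unwrap(), 6);
    }
    // When the last clone is dropped, the Vec is freed exactly once.
}
```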