Description
#60611 shows that the Drop impl of Vec is not idempotent. That is, if Drop fails, the Vec is left in an "un-droppable" state, and trying to re-drop the vector invokes undefined behavior - that is, the vector must be leaked.
It might be possible to make it idempotent without adding significant overhead [0], but I don't know whether we should do this. I think we should be clearer about whether the Drop impl of a type is idempotent or not, since making the wrong assumption can lead to UB, so I believe we should document this somewhere.
We could document this for the Vec type, but maybe this could also be documented at the top of the libcore/liballoc/libstd crates for all types (e.g. Drop impls of standard types are not idempotent).
[0] Maybe setting the vector len field to zero before dropping the slice elements and having the Drop impl of RawVec set the capacity to zero before exiting is enough to avoid trying to re-drop the elements or trying to deallocate the memory (which is always freed on the first drop attempt).
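The idea in [0] can be sketched on a toy type. This is only an illustration under stated assumptions: ToyVec, Counted, release and release_three_times are hypothetical names, the real Vec and RawVec are two separate types, and this sketch collapses them into one fixed-capacity struct. The cleanup zeroes len and the capacity before it touches the elements, so running it again finds nothing left to do.

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};
use std::cell::Cell;
use std::mem;
use std::ptr;
use std::rc::Rc;

/// Hypothetical fixed-capacity buffer; NOT the real `Vec`/`RawVec`.
pub struct ToyVec<T> {
    ptr: *mut T,
    cap: usize, // 0 means "nothing allocated any more"
    len: usize,
}

impl<T> ToyVec<T> {
    pub fn with_capacity(cap: usize) -> Self {
        // Keep the sketch small: no zero-sized types, no empty buffers.
        assert!(mem::size_of::<T>() > 0 && cap > 0);
        let layout = Layout::array::<T>(cap).expect("capacity overflow");
        let ptr = unsafe { alloc(layout) } as *mut T;
        if ptr.is_null() {
            handle_alloc_error(layout);
        }
        ToyVec { ptr, cap, len: 0 }
    }

    pub fn push(&mut self, value: T) {
        assert!(self.len < self.cap, "this sketch never grows");
        unsafe { self.ptr.add(self.len).write(value) };
        self.len += 1;
    }

    /// Cleanup that may run any number of times.
    pub fn release(&mut self) {
        // Zero both fields *before* running any destructor, so a repeated
        // call (or a call after a panic) sees an empty, unallocated buffer.
        let len = mem::replace(&mut self.len, 0);
        let cap = mem::replace(&mut self.cap, 0);
        if cap == 0 {
            return; // already released
        }
        // Guard that frees the allocation even if an element destructor panics.
        struct Free(*mut u8, Layout);
        impl Drop for Free {
            fn drop(&mut self) {
                unsafe { dealloc(self.0, self.1) }
            }
        }
        let _free = Free(self.ptr as *mut u8, Layout::array::<T>(cap).unwrap());
        unsafe { ptr::drop_in_place(ptr::slice_from_raw_parts_mut(self.ptr, len)) };
    }
}

impl<T> Drop for ToyVec<T> {
    fn drop(&mut self) {
        self.release();
    }
}

/// Element type that counts how often its destructor runs.
pub struct Counted(pub Rc<Cell<usize>>);

impl Drop for Counted {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

/// Runs the cleanup three times (twice explicitly, once via `Drop`) and
/// returns how many element destructors ran in total.
pub fn release_three_times() -> usize {
    let drops = Rc::new(Cell::new(0));
    let mut v = ToyVec::with_capacity(2);
    v.push(Counted(drops.clone()));
    v.push(Counted(drops.clone()));
    v.release();
    v.release();
    drop(v);
    drops.get()
}
```

In this sketch each element destructor runs exactly once no matter how often the cleanup is invoked. Whether the real Vec should pay for the extra stores is the open question of this issue.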
Activity
gnzlbg commented on May 14, 2019
cc @rust-lang/libs
hanna-kruppe commented on May 14, 2019
"It's not idempotent" should be the safe default assumption, and I know of no case so far where we've documented that a drop glue is idempotent. We might want to mention the general principle in suitable places (e.g., nomicon, UCG) but I don't think we should repeat "as usual, double-dropping this type causes UB" again on all specific types. We should only document the exceptions, if any.
(I am also quite unsure whether going through the effort of guaranteeing that certain types' drop glue is idempotent is a good use of time, as it seems extremely niche and I don't know any use cases for that. But that just means I personally won't invest time in that discussion.)
SimonSapin commented on May 14, 2019
I think this is not an issue with Vec. Rather, any unsafe code needs to be careful about panic safety, especially if it is also generic.
FWIW I’ve always assumed that double drop could be as much a memory safety issue as use-after-free. (“One and a half” drop even more so.) I don’t think Drop::drop idempotency should be expected for any type, but that sounds like a decision for @rust-lang/wg-unsafe-code-guidelines more than @rust-lang/libs.
RalfJung commented on May 14, 2019
Fully agreed.
The best you can hope for, IMO, is that a type will make an effort to drop as much as possible even when there's a panic during dropping. Vec actually does that by using the slice drop glue, which, if dropping one element panics, will keep dropping the other elements. This seems more useful than allowing idempotent dropping (it minimizes leakage even without any catch_unwind being involved), and it makes idempotent dropping unnecessary (if you catch a panic, the involved types already did everything they can to drop as much as possible, so there's no point in calling drop again).
I think it makes more sense to improve panic-resilience of our drop impls than to make them idempotent. For example, AFAIK VecDeque drops two Vecs; if dropping the first panics, the second one will be leaked. This could be improved.
One interesting question I see here is the interaction with the pinning drop guarantee. Does Vec deallocate the backing store if dropping one of the elements panicked? If yes, is that a violation of the drop guarantee (assuming we extended Vec with pinning projections -- fn pin_get(Pin<&mut Vec<T>>, idx: usize) -> Option<Pin<&mut T>>)?
gnzlbg commented on May 14, 2019
Today double-drops are not a form of undefined behavior. They could, however, lead to undefined behavior depending on the types involved and how Drop is implemented for those types.
As part of the UCGs one could try to make double-drops UB per se. That would mean that unsafe code needs to make sure that double-drops don't happen, period. That might be a breaking change. If we don't do that, then we have some types for which double-drops are ok, and some types for which they are not, and the only way to tell is either by reading the documentation, or inspecting the type's source code.
I'd rather not recommend that people rely on what the source code does. It suffices that one crate starts relying on some Drop impl being "accidentally idempotent" in libstd today, for us to not be able to evolve that impl in the future without breaking code.
An "Unless stated otherwise, Drop impls of standard library types are not guaranteed to be idempotent, and if they are, we reserve the right to change that without maintaining backwards compatibility" note somewhere might be enough to prevent that.
I expect that to happen in an idempotent Drop impl for Vec as well. Right now, the issue is that all elements that can be dropped are dropped, the backing allocation of the Vec is deallocated, but the len of the Vec is not set to zero, so a double drop will try to double drop the elements again (and then the drop impl of RawVec will be invoked, whose capacity has not been set to zero, which will try to double-free memory).
Yes. When drop panics, the destructors of the fields are invoked. For Vec this means that the contained RawVec is dropped, which deallocates the backing storage of the Vec without setting the capacity of the Vec to zero.
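The "all elements that can be dropped are dropped" behaviour can be observed with a small experiment. This is a sketch, not a statement about guaranteed behaviour: Noisy and drops_after_one_panic are hypothetical names, and it assumes the default unwinding panic strategy (with panic=abort the process would simply abort).

```rust
use std::cell::Cell;
use std::panic::{self, AssertUnwindSafe};
use std::rc::Rc;

/// Hypothetical element type: counts destructor runs and optionally panics.
struct Noisy {
    drops: Rc<Cell<usize>>,
    panic_on_drop: bool,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
        if self.panic_on_drop {
            panic!("destructor failed");
        }
    }
}

/// Drops a three-element Vec whose middle element panics in its destructor.
/// Returns whether the drop panicked and how many destructors ran.
pub fn drops_after_one_panic() -> (bool, usize) {
    let drops = Rc::new(Cell::new(0));
    let make = |panic_on_drop| Noisy { drops: drops.clone(), panic_on_drop };
    let v = vec![make(false), make(true), make(false)];
    // The Vec is dropped exactly once, inside the closure.
    let result = panic::catch_unwind(AssertUnwindSafe(move || drop(v)));
    (result.is_err(), drops.get())
}
```

With the current implementation this returns (true, 3): the panic propagates, but the element after the panicking one is still dropped. The Vec must not be dropped a second time afterwards, for the reasons given above.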
AFAICT, no. If you have a Vec<Pin<Box<T>>> the only thing that will be deallocated without running destructors when that Vec is dropped is the storage of the Pin<Box<T>>. Since the destructor of the Pin<Box<T>> does not run, the destructor of the Box<T> does not run, and the backing allocation of the Box<T> gets leaked, which is fine (some other thread could still write to it).
SimonSapin commented on May 14, 2019
https://rust-lang.github.io/rfcs/0320-nonzeroing-dynamic-drop.html (both the proposal that is now implemented and the description of the status quo before that) is relevant to the efforts that the language makes to not let double-Drop happen in safe Rust. This guarantee has existed for years, since before 1.0, so writers of unsafe code rely on it.
Although calling Drop::drop twice is not inherently UB in the Rust language, many unsafe libraries are written with the assumption that it never happens. Therefore, writers of generic unsafe code should assume that double drop of a value of a type parameter can cause UB.
Calling Vec::drop twice with the Unix system allocator causes a double free, which is UB. But Vec is only an example.
RalfJung commented on May 14, 2019
Drop is not the only operation that has such issues; pretty much any function you call can be UB or not under certain circumstances depending on implementation details. One example is relying on Vec::push not to reallocate -- if we did a reserve before, is that guaranteed? What if we did reserve and then pop? And so on.
If we really consider such details to be stable just because they are observable by "this program was not UB but now it is", we'd have to bake an abstract model of all of these data structures into our operational semantics, just to make some more code UB. I think that is a bad approach. UB is for enabling compiler optimizations. It is not for catching clients that exploit unstable details of the current implementation of some library. Conflating these two problems makes the definition of UB much, much more complicated, and I think that's a mistake.
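The reserve/push question can be made concrete with a small check. This is a sketch with a hypothetical function name (push_within_capacity_keeps_pointer); it only observes what the current implementation does and does not show which of these properties is a documented guarantee.

```rust
/// Reserves space, then pushes up to the reported capacity, pops once and
/// pushes again, checking that the buffer pointer and capacity never change.
pub fn push_within_capacity_keeps_pointer() -> bool {
    let mut v: Vec<u32> = Vec::new();
    v.reserve(16);
    let cap = v.capacity(); // at least 16, possibly more
    let before = v.as_ptr();

    // Fill the vector exactly up to its reported capacity.
    for i in 0..cap as u32 {
        v.push(i);
    }
    let same_after_push = before == v.as_ptr() && v.capacity() == cap;

    // The "reserve and then pop" variant of the question.
    v.pop();
    v.push(0);
    let same_after_pop = before == v.as_ptr() && v.capacity() == cap;

    same_after_push && same_after_pop
}
```

A client that depends on the pointer staying put is relying on exactly the kind of library-level detail discussed here, so it should check the type's documentation rather than the observed result.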
Just because some interaction with Vec is not UB currently, doesn't mean we are not allowed to ever change it to be UB. There is no implicit stabilization of the full set of UB-free interactions. I agree catching such "overfit" clients is an important problem, but let's keep that library-level discussion separate from the language-level discussion of what is and is not UB.
I don't understand why you'd even want to call drop again, if we already agree that the first drop should do the maximal amount of dropping that it can. You are basically saying "make Vec's drop idempotent by making the second drop a NOP". I don't see the point.
Seems like you want to write generic code that drops stuff again if the first drop panicked. But what would be a situation where that is ever a good idea? From all I can see, the only place where this can help is if the second drop drops stuff that the first drop "missed" because of the panic. I am saying, if that is the problem, then fix drop to not "miss" stuff.
This misses the point, because Box<T>: Unpin.
The interesting case is a Vec<IntrusiveListElement>. Then we could pin the Vec, and from our Pin<&mut Vec<IntrusiveListElement>> get a Pin<&mut IntrusiveListElement>, and insert that into the list. Now we are in a situation where if the Vec's backing store gets deallocated without dropping the IntrusiveListElement, we have a safety violation. But it seems to me if IntrusiveListElement::drop can guarantee that it itself does not panic, then we are okay: if an earlier element in the list panics while dropping (maybe it's a heterogeneous list through an enum or trait objects), we know we still get dropped properly.
gnzlbg commented on May 14, 2019
How can you drop the Vec<IntrusiveListElement> while Pin<&mut IntrusiveListElement>s into the vector are still live?
The documentation of Vec guarantees these details, e.g., see Vec's capacity-and-reallocation section, so AFAICT Rust unsafe code can rely on these details, and breaking that code would be an API breaking change.
RalfJung commented on May 14, 2019
I was talking about the implicit drop that happens when the vector goes out of scope.
I am aware. When libraries decide to document such details, clients may of course rely on them. VecDeque has no such section even though many similar questions apply; thus, clients may not rely on the same properties for VecDeque even if they happen to be true currently.
This matches the situation for idempotent drop: clients may rely on this if and only if the library decides to document this as a guarantee.
I don't think we should change our language spec to detect clients relying on VecDeque implementation details, and similarly, I don't think we should change our language spec to detect clients that rely on some drop being idempotent even though that is not documented (e.g., double-drop of an empty Vec that had shrink_to_fit called on it).
gnzlbg commented on May 14, 2019
@SimonSapin The problem is that safe Rust can be called from unsafe Rust, and it isn't clear to me from that RFC whether writers of unsafe code can rely on other unsafe code not performing a double-drop. It isn't clear either whether unsafe code can assume that double-dropping something is ok.
gnzlbg commented on May 14, 2019
Are you saying that this is analogous to whether users should be able to rely on double-drops invoking / not invoking UB? These data-structure properties feel quite obvious to me, but I have no idea how users can today learn that, at least when using the libstd types, they should always assume that Drop is not idempotent unless a type guarantees otherwise. AFAICT they can just try it for some type, and if it works, deduce that it is ok. Then they publish their crate, and some time later we break their code. Saying that "we did not guarantee that it worked anywhere" does not change the fact that the code now is broken, and we are not warning them about this either. So even if we might be right in that technically we are allowed to break that code, it might turn out that in practice, now we cannot, and that user has somehow managed to, by accident, specify that double-drops for some type must be ok. We'd have a stronger case if we explicitly call this out in the docs, warn when users do this (or panic or similar), etc.