RFC: add copy #495

Closed
@jakirkham

Description

@jakirkham
Member

Some array libraries (NumPy, for example) implement a .copy() method. There are indirect ways to get a copy today (asarray, reshape, etc.), but the API currently lacks a way to do this directly and without other potential side effects. Adding a method (rather than a function) would also simply ensure that the new array has the original array's type. Curious whether there is appetite for including this in the API.

Related is the question of what interplay there is (if any) with __copy__ and/or __deepcopy__.

Activity

rgommers commented on Oct 12, 2022

@rgommers
Member

Thanks @jakirkham. A couple of thoughts:

  • NumPy has both a function and a method. The function has an extra keyword (subok)
  • Either way, the new keywords don't seem appropriate, so it would simply be copy(x) or x.copy()
  • At that point, it's the same as copy.copy and copy.deepcopy, I'd think (at least for NumPy)
  • PyTorch calls it clone, it doesn't have copy
  • Is there a reason to add it? Autograd or compilers related perhaps - better than the stdlib function?
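
As a rough illustration of the copy.copy / copy.deepcopy point above, here is a minimal sketch. It uses the standard library's array.array as a stand-in for a flat numeric array (an assumption made for illustration; it is not NumPy and not part of the array API), where a shallow and a deep copy both end up with an independent buffer:

```python
import copy
from array import array

# Stand-in for a flat numeric array (stdlib array.array, not NumPy).
original = array("d", [1.0, 2.0, 3.0])

# For a flat buffer of numbers there is nothing nested to recurse into,
# so the shallow and the deep copy both get their own data.
shallow = copy.copy(original)
deep = copy.deepcopy(original)

# Mutating either copy leaves the original untouched.
shallow[0] = -1.0
deep[1] = -2.0

print(original.tolist())
print(shallow.tolist())
print(deep.tolist())
```

Whether a given array library behaves the same way under copy.copy is up to that library, which is part of what the questions above are probing.
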

asmeurer commented on Oct 12, 2022

@asmeurer
Member

Isn't this asarray(x, copy=True)? Does that have side-effects?

jakirkham commented on Oct 20, 2022

@jakirkham
Member · Author

To summarize the ask: library functions that handle general arrays may want to copy their input so that they have an array they can mutate safely without affecting user-provided data.

We concluded that we want a function (not a method), and that one needs to do a namespace lookup (x.__array_namespace__) to find it (as the library likely does not know the array's type).

There is a separate question of what we call it: either a new function (copy, clone) or an existing one (asarray(x, copy=True)). We would also want to spell out that some libraries (Dask, JAX, maybe others) may not actually copy the underlying data.
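
A minimal sketch of the namespace-lookup pattern described above, using the existing asarray(x, copy=True) spelling. ToyArray, toy_namespace, and zero_first_element are hypothetical names invented for this illustration; a real library would get the namespace from whatever array type the user passes in:

```python
from types import SimpleNamespace


class ToyArray:
    """Hypothetical stand-in for an array type that follows the standard."""

    def __init__(self, data):
        self._data = list(data)  # always stores its own list

    def __array_namespace__(self, api_version=None):
        # Returns the module-like object that holds the array functions.
        return toy_namespace

    def __getitem__(self, index):
        return self._data[index]

    def __setitem__(self, index, value):
        self._data[index] = value


def _asarray(obj, /, *, copy=None):
    # Simplified: only the copy=True path matters for this sketch.
    if isinstance(obj, ToyArray) and not copy:
        return obj
    data = obj._data if isinstance(obj, ToyArray) else obj
    return ToyArray(data)


toy_namespace = SimpleNamespace(asarray=_asarray)


def zero_first_element(x):
    """Library code that needs a scratch array it can mutate safely."""
    xp = x.__array_namespace__()    # the library does not know x's type
    out = xp.asarray(x, copy=True)  # explicit copy request
    out[0] = 0                      # mutation stays local to `out`
    return out


user_data = ToyArray([1, 2, 3])
result = zero_first_element(user_data)
print(user_data[0], result[0])
```

The library function never names the array type; it only relies on the namespace lookup and on the copy request being honored.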

__copy__ and __deepcopy__ are different enough (Dask would copy graphs, JAX something similar, etc.) that it is worth specifying this path may not copy the array data (if that is the user's concern) and that using the function above (name to be decided) would be preferred for data copying.
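
To make the distinction concrete, here is a hedged sketch of a lazy array whose __copy__ duplicates only the pending operations while sharing the source buffer. LazyArray and data_copy are hypothetical; this is not how Dask or JAX are actually implemented, only an illustration of why copy.copy may not copy array data:

```python
import copy


class LazyArray:
    """Hypothetical lazy array: a source buffer plus pending operations."""

    def __init__(self, source, graph=None):
        self.source = source            # underlying data, stored by reference
        self.graph = list(graph or [])  # pending element-wise functions

    def __copy__(self):
        # Copies the graph (a new list) but shares the source buffer.
        return LazyArray(self.source, self.graph)

    def compute(self):
        values = list(self.source)
        for fn in self.graph:
            values = [fn(v) for v in values]
        return values


def data_copy(x):
    """Hypothetical explicit copy that also duplicates the data."""
    return LazyArray(list(x.source), x.graph)


buffer = [1, 2, 3]
lazy = LazyArray(buffer, [lambda v: v * 2])

graph_only = copy.copy(lazy)  # new graph, same data
with_data = data_copy(lazy)   # new graph, new data

buffer[0] = 10                # the user mutates the original buffer

print(graph_only.compute())   # sees the mutation
print(with_data.compute())    # does not
```
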

asmeurer commented on Oct 20, 2022

@asmeurer
Member

Right now asarray says that copy=True MUST copy the data, but maybe it should say that the copy may be skipped if the library knows it solely owns the data and disallows mutation. Or can there be situations where a library thinks it solely owns the data but actually doesn't, so it really has to do a real memory copy?

rgommers commented on Nov 4, 2022

@rgommers
Member

@asmeurer good point. Maybe something like "must ensure that the returned array does not share data with another array, either by copying the data to a new memory location or in some other way (e.g., this property is guaranteed by design)".
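
One way to read "guaranteed by design" is an array type with no in-place mutation at all. The sketch below is an assumption-laden illustration (FrozenArray, with_item, and copy_array are hypothetical names, not part of any library): because every "update" builds a new object, returning the input unchanged still satisfies the no-shared-mutable-data property:

```python
class FrozenArray:
    """Hypothetical immutable array: it has no in-place update method."""

    def __init__(self, data):
        self._data = tuple(data)  # tuples cannot be modified in place

    def __getitem__(self, index):
        return self._data[index]

    def with_item(self, index, value):
        # Functional update: builds a new array instead of mutating.
        items = list(self._data)
        items[index] = value
        return FrozenArray(items)


def copy_array(x):
    """Hypothetical copy for FrozenArray: no memory copy is needed,
    since no operation can change x's data after creation."""
    return x


a = FrozenArray([1, 2, 3])
b = copy_array(a)        # same object, yet safe to hand out
c = b.with_item(0, 99)   # "mutation" produces a separate array

print(a[0], b[0], c[0])
```
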

added this to the v2023 milestone on Mar 9, 2023
modified the milestones: v2023, v2024 on Jan 25, 2024
changed the title [-]Adding `.copy()` method[/-] [+]RFC: add `copy()`[/+] on Apr 4, 2024
added the RFC label (Request for comments. Feature requests and proposed changes.) on Apr 4, 2024
changed the title [-]RFC: add `copy()`[/-] [+]RFC: add `copy`[/+] on Apr 18, 2024


Metadata

Assignees

No one assigned

Labels

API extension (Adds new functions or objects to the API.)
Needs Discussion (Needs further discussion.)
RFC (Request for comments. Feature requests and proposed changes.)
topic: Creation (Array creation.)

Participants

@asmeurer @rgommers @kgryte @jakirkham @lucascolley

RFC: add `copy` · Issue #495 · data-apis/array-api