
Common APIs across array libraries (1 year later) #187

Open

@kgryte (Contributor)

Overview

Similar to gh-6, this issue looks to identify commonalities across array libraries, but only addresses those APIs which are not already included in the array API specification.

Since gh-6 and its analysis,

  1. Every array library added more APIs.
  2. Libraries continued to converge toward NumPy APIs (CuPy, PyTorch, TensorFlow.experimental, MXNet).
  3. Accelerator libraries (e.g., CuPy, MXNet, Torch, TF) showed greater agreement on special functions that are available in SciPy but not in NumPy.

Method

Similar to gh-6, the list was compiled by doing the following:

  1. Generating a list of APIs based on publicly documented array APIs (e.g., by scraping website documentation).
  2. Computing the intersection across the individual datasets.

The following libraries were analyzed:

  • numpy
  • cupy
  • dask.array
  • jax
  • mxnet
  • pytorch
  • tensorflow
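
The two steps above amount to a set intersection followed by a set difference. A minimal sketch, assuming each library's documented API names have already been collected into a set; `documented_apis`, `in_specification`, and `common_apis` are hypothetical names, and the tiny sets below are stand-ins rather than the data used for this issue:

```python
from functools import reduce

# Hypothetical, heavily truncated stand-ins for the scraped API lists.
documented_apis = {
    "numpy": {"abs", "add", "clip", "diff", "pad", "einsum"},
    "cupy": {"abs", "add", "clip", "diff", "pad", "einsum"},
    "dask.array": {"abs", "add", "clip", "diff", "pad"},
    "jax": {"abs", "add", "clip", "diff", "pad", "einsum"},
}

# Hypothetical subset of names already in the specification.
in_specification = {"abs", "add"}


def common_apis(apis_by_library, already_specified):
    """Return names documented by every library but not yet specified."""
    shared = reduce(set.intersection, apis_by_library.values())
    return sorted(shared - already_specified)


candidates = common_apis(documented_apis, in_specification)
print(candidates)  # ['clip', 'diff', 'pad']
```

Here `einsum` drops out because one library lacks it, and `abs` and `add` drop out because they are already specified.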

APIs

The following APIs were found to be common across the above libraries, but not already included in the array API specification:

cbrt
clip
copysign
count_nonzero

deg2rad
diff

erf (scipy)
erfc (scipy)
erfinv (scipy)
erfcinv (scipy)
exp2

gamma (scipy)
gammaln (scipy)

histogram
hypot

i0 (bessel)

logsumexp (accelerators)

nextafter

pad

rad2deg
reciprocal
repeat
rot90
rsqrt (accelerators)
rcbrt (accelerators)

sigmoid (accelerators)

take
tile
top_k (accelerators+dask)

xlogy (scipy)
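
For readers less familiar with the entries marked "accelerators", here is a rough NumPy sketch of what a few of them compute. The function names mirror the list above, but the signatures and return conventions are assumptions made for illustration; the actual libraries differ in argument names, supported dtypes, and what `top_k` returns.

```python
import numpy as np


def rsqrt(x):
    """Reciprocal square root, 1 / sqrt(x)."""
    return 1.0 / np.sqrt(x)


def sigmoid(x):
    """Logistic function, 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))


def logsumexp(x, axis=None):
    """log(sum(exp(x))), shifted by the maximum to avoid overflow."""
    x = np.asarray(x, dtype=float)
    m = np.max(x, axis=axis, keepdims=True)
    out = m + np.log(np.sum(np.exp(x - m), axis=axis, keepdims=True))
    return np.squeeze(out, axis=axis)


def top_k(x, k):
    """Return the k largest values of a 1-D array (largest first) and their indices."""
    x = np.asarray(x)
    idx = np.argpartition(x, -k)[-k:]    # indices of the k largest, unordered
    idx = idx[np.argsort(x[idx])[::-1]]  # order those indices by value, descending
    return x[idx], idx


print(rsqrt(4.0))    # 0.5
print(sigmoid(0.0))  # 0.5
print(logsumexp([1000.0, 1000.0]))  # finite, whereas a naive log(sum(exp(x))) overflows
values, indices = top_k([3, 1, 4, 1, 5], 2)
print(values, indices)
```

Each of these can be composed from functions that already exist, which is part of why they appear mainly in accelerator libraries as dedicated (typically fused) operations.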

We can split the APIs into the following categories...

Array Manipulation

pad
repeat
rot90
tile

Special Functions

cbrt
clip
copysign
deg2rad
erf
erfc
erfcinv
erfinv
exp2
gamma
gammaln
hypot
i0
nextafter
rad2deg
reciprocal
rsqrt
rcbrt
sigmoid
xlogy

Reductions

count_nonzero
histogram
logsumexp
top_k

Indexing

take

Other

diff

Next Steps

  • Identify which APIs could be suitable candidates for standardization in the next version of the specification.

Activity

rgommers (Member) commented on May 29, 2021

Someone just proposed to add topk (or top_k) to NumPy, and I'm trying to find the best way to use the data that generated the above list. It looks like none of the make ... commands in https://github.com/data-apis/array-api-comparison show this?

added this to the v2022 milestone on Oct 4, 2021

rgommers (Member) commented on Dec 14, 2022

This issue is still useful; I'll remove the v2022 milestone given that we're done adding new APIs to that version.

The list of common APIs here is probably longer than the list of things that make sense to add. The content here can be used as a reference: one data point in future API extension conversations.

removed this from the v2022 milestone on Dec 14, 2022

TomNicholas commented on May 18, 2023

FYI, of this list, xarray currently uses:

np.clip
np.diff
np.pad
np.repeat
np.take
np.tile

rgommers (Member) commented on May 19, 2023

Thanks @TomNicholas. take is implemented, and for clip there's gh-482; overall, my reading is that we'll add clip in the next version.

The others need looking into, but seem to me to be among the most-often used numpy functions that we haven't included yet. repeat and tile are straightforward to implement, diff isn't too bad either, pad is a bit painful with its many options.
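
To illustrate the difference in surface area, here is a small sketch using the existing NumPy functions (only a few of `np.pad`'s modes are shown; this says nothing about what a standardized signature would look like):

```python
import numpy as np

x = np.array([1, 2, 3])

print(np.repeat(x, 2))  # each element repeated: [1 1 2 2 3 3]
print(np.tile(x, 2))    # whole array repeated:  [1 2 3 1 2 3]
print(np.diff(x ** 2))  # adjacent differences of [1 4 9]: [3 5]

# pad: the result depends on `mode` and on mode-specific keyword arguments.
print(np.pad(x, 1))                                           # [0 1 2 3 0]
print(np.pad(x, (1, 2), mode="constant", constant_values=9))  # [9 1 2 3 9 9]
print(np.pad(x, 1, mode="edge"))                              # [1 1 2 3 3]
print(np.pad(x, 1, mode="reflect"))                           # [2 1 2 3 2]
```

repeat, tile, and diff each have a small number of parameters, while pad combines a per-axis width specification with a mode and further mode-dependent keywords.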

tirthasheshpatel (Contributor) commented on Jul 17, 2023

SciPy has a use case for the nextafter function, which is present in this list. Since it is a very basic function and is implemented by all the array libraries, I think it should be added to the Array API.
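
For reference, nextafter(x1, x2) returns the next representable floating-point value after x1 in the direction of x2. A minimal NumPy illustration of that behavior (this is not the SciPy use case itself, which isn't spelled out here):

```python
import numpy as np

one = np.float64(1.0)
eps = np.finfo(np.float64).eps

up = np.nextafter(one, np.inf)     # smallest float64 greater than 1.0
down = np.nextafter(one, -np.inf)  # largest float64 less than 1.0

print(up - one == eps)        # True: the gap above 1.0 is machine epsilon
print(one - down == eps / 2)  # True: the gap below 1.0 is half of that
```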

rgommers (Member) commented on Jul 25, 2023

> SciPy has a use case for the nextafter function, which is present in this list. Since it is a very basic function and is implemented by all the array libraries, I think it should be added to the Array API.

Thanks for the proposal @tirthasheshpatel! That seems reasonable, and now that there's an identified need we can prioritize it. @steff456 volunteered to dig into this one.

oleksandr-pavlyk (Contributor) commented on Jul 25, 2023

What about binary elementwise functions that give the smaller or larger of two inputs?

These can be implemented with where and less, sure, but direct implementation is much faster for non-JIT-ting implementations.
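
A small sketch of the two approaches, using NumPy's existing `np.minimum` and `np.maximum` as the direct versions (whether and under what names such functions would be standardized is not decided here):

```python
import numpy as np

a = np.array([1.0, 5.0, 3.0])
b = np.array([4.0, 2.0, 3.0])

# Composed from comparison + selection: a boolean temporary plus a second pass.
mask = np.less(a, b)
smallest = np.where(mask, a, b)
largest = np.where(mask, b, a)

# Dedicated elementwise functions give the same result for these inputs.
print(np.array_equal(smallest, np.minimum(a, b)))  # True
print(np.array_equal(largest, np.maximum(a, b)))   # True
```

The two forms are not interchangeable when NaNs are present: `np.minimum` and `np.maximum` propagate NaN, whereas the where/less composition returns the second operand whenever the comparison is false.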

