uarray

uarray is built around a back-end protocol and overridable multimethods. Multimethods must be defined before back-ends can override them. See the documentation of generate_multimethod on how to write multimethods.

Let’s start with the simplest:

__ua_domain__ defines the back-end domain. The domain is a period-separated string consisting of the modules you extend plus the submodule. For example, if a submodule module2.submodule extends module1 (i.e., it exposes dispatchables marked as types available in module1), then the domain string should be "module1.module2.submodule".

For the purpose of this demonstration, we’ll be creating an object and setting its attributes directly. However, note that you can use a module or your own type as a backend as well.

>>> class Backend: pass
>>> be = Backend()
>>> be.__ua_domain__ = "ua_examples"

It might be useful at this point to sidetrack to the documentation of generate_multimethod to find out how to generate a multimethod overridable by uarray. Writing a backend and creating multimethods are mostly orthogonal activities: knowing one doesn’t require knowledge of the other, although it certainly helps. We expect core API designers/specifiers to write the multimethods, and implementors to override them. But, as is often the case, the same people end up writing both.

Without further ado, here’s an example multimethod:

>>> import uarray as ua
>>> from uarray import Dispatchable
>>> def override_me(a, b):
...     return (Dispatchable(a, int),)
>>> def override_replacer(args, kwargs, dispatchables):
...     return (dispatchables[0], args[1]), {}
>>> overridden_me = ua.generate_multimethod(
...     override_me, override_replacer, "ua_examples"
... )
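
The round trip between the argument extractor and the argument replacer can be traced by hand. The following is only a sketch that runs without uarray installed: `Marked` is a hypothetical stand-in for `Dispatchable`, and the "conversion" step is a made-up placeholder for whatever a backend would do.

```python
from collections import namedtuple

# Hypothetical stand-in for uarray.Dispatchable (assumption, not the real class).
Marked = namedtuple("Marked", ["value", "type"])


def override_me(a, b):
    # Argument extractor: report which arguments can be dispatched on.
    return (Marked(a, int),)


def override_replacer(args, kwargs, dispatchables):
    # Argument replacer: rebuild the call with the converted dispatchables.
    return (dispatchables[0], args[1]), {}


args, kwargs = (1, "2"), {}
marked = override_me(*args, **kwargs)

# Placeholder conversion: pretend a backend turned each marked value into a str.
converted = tuple(str(m.value) for m in marked)

new_args, new_kwargs = override_replacer(args, kwargs, converted)
print(new_args, new_kwargs)  # ('1', '2') {}
```

Only the first argument is marked, so only it is replaced; the second argument passes through untouched.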

Next comes the part about overriding the multimethod. This requires the __ua_function__ protocol, and the __ua_convert__ protocol. The __ua_function__ protocol has the signature (method, args, kwargs), where method is the passed multimethod and args/kwargs specify the arguments, with any dispatchables converted by __ua_convert__ already substituted in.

>>> def __ua_function__(method, args, kwargs):
...     return method.__name__, args, kwargs
>>> be.__ua_function__ = __ua_function__

The other protocol of interest is the __ua_convert__ protocol. It has the signature (dispatchables, coerce). When coerce is False, conversion between the formats should ideally be an O(1) operation; that is, no memory copying should be involved, only views of the existing data.

>>> def __ua_convert__(dispatchables, coerce):
...     for d in dispatchables:
...         if d.type is int:
...             if coerce and d.coercible:
...                 yield str(d.value)
...             else:
...                 yield d.value
>>> be.__ua_convert__ = __ua_convert__

Now that we have defined the backend, the next thing to do is to call the multimethod.

>>> with ua.set_backend(be):
...     overridden_me(1, "2")
('override_me', (1, '2'), {})

Note that the marked type has no effect on the actual type of the passed object. We can also coerce the type of the input.

>>> with ua.set_backend(be, coerce=True):
...     overridden_me(1, "2")
...     overridden_me(1.0, "2")
('override_me', ('1', '2'), {})
('override_me', ('1.0', '2'), {})

Another feature is that if you remove __ua_convert__, the arguments are not converted at all and it’s up to the backend to handle that.

>>> del be.__ua_convert__
>>> with ua.set_backend(be):
...     overridden_me(1, "2")
('override_me', (1, '2'), {})

You also have the option to return NotImplemented, in which case processing moves on to the next back-end, which, in this case, doesn’t exist. The same applies to __ua_convert__.

>>> be.__ua_function__ = lambda *a, **kw: NotImplemented
>>> with ua.set_backend(be):
...     overridden_me(1, "2")
Traceback (most recent call last):
    ...
uarray.backend.BackendNotImplementedError: ...
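
The fall-through behaviour can be pictured with a toy loop. This is a simplified model, not uarray's actual dispatch code: `call_first_supported`, `NoBackendError` and both backends are hypothetical names introduced only for illustration.

```python
class NoBackendError(Exception):
    """Hypothetical stand-in for uarray's BackendNotImplementedError."""


def call_first_supported(backends, method_name, args, kwargs):
    # Try each backend in order; NotImplemented means "ask the next one".
    for backend in backends:
        result = backend(method_name, args, kwargs)
        if result is not NotImplemented:
            return result
    # Every backend declined, so there is nothing left to try.
    raise NoBackendError(f"no backend implements {method_name!r}")


def declining_backend(method_name, args, kwargs):
    return NotImplemented


def echo_backend(method_name, args, kwargs):
    return method_name, args, kwargs


result = call_first_supported(
    [declining_backend, echo_backend], "override_me", (1, "2"), {}
)
print(result)  # ('override_me', (1, '2'), {})

try:
    call_first_supported([declining_backend], "override_me", (1, "2"), {})
except NoBackendError as exc:
    error_message = str(exc)
    print(error_message)
```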

The last possibility is that the backend has no __ua_convert__ at all. In that case the job of conversion is left up to __ua_function__, but putting things back into arrays after conversion will not be possible.

Functions

all_of_type(arg_type)

Marks all unmarked arguments as a given type.

create_multimethod(*args, **kwargs)

Creates a decorator for generating multimethods.

generate_multimethod(argument_extractor, …)

Generates a multimethod.

mark_as(dispatch_type)

Creates a utility function to mark something as a specific type.

set_backend(backend[, coerce, only])

A context manager that sets the preferred backend.

set_global_backend(backend[, coerce, only])

This utility method replaces the default backend for permanent use.

register_backend(backend)

This utility method registers a backend for permanent use.

clear_backends(domain[, registered, globals])

This utility method clears registered backends.

skip_backend(backend)

A context manager that allows one to skip a given backend from processing entirely.

wrap_single_convertor(convert_single)

Wraps a __ua_convert__ defined for a single element so that it applies to all elements.
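
As a rough illustration of what wrapping a single-element converter involves, here is a sketch that runs without uarray installed. It assumes the per-element function receives (value, type, coerce); `wrap_single`, `int_to_str` and `Marked` are hypothetical names, and this is not the library's own implementation.

```python
from collections import namedtuple

# Hypothetical stand-in for uarray.Dispatchable (assumption, not the real class).
Marked = namedtuple("Marked", ["value", "type", "coercible"])


def wrap_single(convert_single):
    # Lift a per-element converter to one that handles every dispatchable.
    def convert_all(dispatchables, coerce):
        converted = []
        for d in dispatchables:
            out = convert_single(d.value, d.type, coerce and d.coercible)
            if out is NotImplemented:
                # One unsupported element makes the whole conversion decline.
                return NotImplemented
            converted.append(out)
        return converted

    return convert_all


def int_to_str(value, dispatch_type, coerce):
    # Only int-marked values are understood; coerce decides whether to convert.
    if dispatch_type is not int:
        return NotImplemented
    return str(value) if coerce else value


convert = wrap_single(int_to_str)
first = convert([Marked(1, int, True), Marked(2, int, False)], coerce=True)
second = convert([Marked(1, int, True), Marked("x", str, True)], coerce=True)
print(first)   # ['1', 2]
print(second)  # NotImplemented
```

The second element of the first call is left unconverted because it is marked as not coercible, even though coercion was requested.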

Classes

Dispatchable(value, dispatch_type[, coercible])

A utility class which marks an argument with a specific dispatch type.

Exceptions

BackendNotImplementedError

An exception that is raised when no compatible backend is found for a method.

Design Philosophies

The following section discusses the design philosophies of uarray, and the reasoning behind some of these philosophies.

Modularity

uarray (and its sister modules, unumpy and others to come) was designed from the ground up to be modular. This is part of why uarray itself holds the core backend and dispatch machinery, while unumpy holds the actual multimethods. unumpy can also be developed completely separately from uarray, although the ideal place to have it would be NumPy itself.

However, the benefit of keeping it separate is that it can span multiple NumPy versions, even those released before NEP-18 (or even NEP-13) was available. Another benefit is that it can have a faster release cycle to help it achieve this.

Separate Imports

Code wishing to use the backend machinery for NumPy (as an example) will use the statement import unumpy as np instead of the usual import numpy as np. This is deliberate: it makes dispatching, and the overhead associated with it, opt-in rather than forced. However, a package is free to define its main methods as the dispatchable versions, thereby allowing dispatch on the default implementation.

Extensibility and Choice

If some effort is put into the dispatch machinery, it’s possible to dispatch over arbitrary objects, including arrays, dtypes, and so on. A method defines the type of each dispatchable argument, and when a backend is asked whether it can be used, it is only passed the types it knows how to dispatch over. For example, if a backend doesn’t know how to dispatch over dtypes, it won’t be asked to decide on that basis.

Methods can have a default implementation in terms of other methods, but they’re still overridable.
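
A toy model may make the idea of an overridable default concrete. Here a backend is just a dict of implementations, and `dispatch`, `DEFAULTS` and the `add`/`double` methods are hypothetical; this is not how uarray stores or calls defaults.

```python
def dispatch(backend, name, *args):
    # Prefer the backend's own implementation of `name` when it has one.
    impl = backend.get(name)
    if impl is not None:
        return impl(*args)
    # Otherwise fall back to the default, itself written via `dispatch`.
    return DEFAULTS[name](backend, *args)


def default_double(backend, x):
    # Default for "double", expressed in terms of the "add" method.
    return dispatch(backend, "add", x, x)


DEFAULTS = {"double": default_double}

# One backend implements only "add"; the other also overrides "double".
only_add = {"add": lambda a, b: a + b}
with_double = {"add": lambda a, b: a + b, "double": lambda x: ("fast", 2 * x)}

from_default = dispatch(only_add, "double", 21)
from_override = dispatch(with_double, "double", 21)
print(from_default)   # 42
print(from_override)  # ('fast', 42)
```

A backend that implements only `add` gets `double` for free, while a backend with a faster `double` can still supply its own.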

This means that only one framework is needed to, for example, dispatch over ufuncs, arrays, dtypes and all other primitive objects in NumPy, while keeping the core uarray code independent of NumPy and even unumpy.

Backends can span modules, so SciPy could jump in and define its own methods on NumPy objects and make them overridable within the NumPy backend.

User Choice

The users of unumpy or uarray can choose which backend they prefer with a simple context manager. They also have the ability to force a backend, and to skip a backend. This is useful for array-like objects that provide other array-like objects by composing them. For example, Dask could perform all its blockwise function calls with the following pseudocode (obviously, this is simplified):

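# Pseudocode: blockwise_call is a hypothetical wrapper; the helper functions are not defined here.
def blockwise_call(blockwise_function, input_arrays, args, kwargs):
    # Split each input into its inner (per-block) arrays.
    in_arrays = extract_inner_arrays(input_arrays)
    out_arrays = []
    for input_arrays_single in in_arrays:
        # Substitute this block's arrays, leaving the original args/kwargs intact.
        block_args, block_kwargs = blockwise_function.replace_args_kwargs(
            args, kwargs, input_arrays_single)
        # Skip the Dask backend so the inner call is handled by another backend.
        with ua.skip_backend(DaskBackend):
            out_arrays_single = blockwise_function(*block_args, **block_kwargs)
        out_arrays.append(out_arrays_single)

    return combine_arrays(out_arrays)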

A user would simply do the following:

with ua.set_backend(DaskBackend):
    # Write all your code here
    # It will prefer the Dask backend
    ...

There is no default backend: to unumpy, NumPy is just another backend. One can register backends, which will all be tried in indeterminate order when no backend is selected.
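
The interplay of preferring and skipping backends described in this section can be pictured with a toy model built on contextlib. It assumes the innermost preference is tried first; `prefer`, `skip` and `candidates` are hypothetical names, and uarray's real state handling is more involved than this.

```python
from contextlib import contextmanager

_preferred = []   # stack of preferred backend names, innermost last
_skipped = set()  # backend names currently excluded


@contextmanager
def prefer(name):
    _preferred.append(name)
    try:
        yield
    finally:
        _preferred.pop()


@contextmanager
def skip(name):
    # Remember whether this block added the name, so nested skips unwind safely.
    added = name not in _skipped
    _skipped.add(name)
    try:
        yield
    finally:
        if added:
            _skipped.discard(name)


def candidates():
    # Innermost preference first, minus anything currently skipped.
    return [b for b in reversed(_preferred) if b not in _skipped]


with prefer("numpy"), prefer("dask"):
    outer = candidates()
    with skip("dask"):
        inner = candidates()
after = candidates()
print(outer, inner, after)  # ['dask', 'numpy'] ['numpy'] []
```

Inside the skip block the Dask-like backend disappears from the candidate list, which is what lets a composing backend hand its inner calls to someone else.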

Addressing past flaws

Progress on NumPy’s side toward defining an override mechanism has been slow, with NEP-13 first introduced in 2013. Given the wealth of dispatchable objects (including arrays, ufuncs and dtypes) and the advent of libraries like Dask, CuPy, Xarray, PyData/Sparse and XND, it has become clear that the need for alternative array-like implementations is growing. There are even other libraries, like PyTorch and TensorFlow, that could be expressed in NumPy API-like terms. Another example is the Keras API, for which an overridable ukeras could be created, similar to unumpy.

uarray is intended to develop quickly to fill the need posed by these communities while staying as general as possible, and to reach maturity quickly, after which backwards compatibility will be guaranteed.

Performance considerations will come only after such a state has been reached.