Arithmetic and Numeric Operations

Vector and Scalar Arithmetic

A large subset of Python’s binary and in-place operators is supported on pdarray objects. Where supported, these operators behave identically to their counterparts on NumPy ndarray objects.

>>> A = ak.arange(10)
>>> A += 2
>>> A
array([2, 3, 4, 5, 6, 7, 8, 9, 10, 11])
>>> A + A
array([4, 6, 8, 10, 12, 14, 16, 18, 20, 22])
>>> 2 * A
array([4, 6, 8, 10, 12, 14, 16, 18, 20, 22])
>>> A == A
array([True, True, True, True, True, True, True, True, True, True])

Operations that are not implemented will raise a RuntimeError. In-place operations that would change the dtype of the pdarray are not implemented.
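The dtype restriction on in-place operations can be illustrated locally. The sketch below uses NumPy arrays as a stand-in for pdarray objects (an assumption made only so the snippet runs without a server). It shows that in-place integer addition keeps the dtype, while true division of integers produces floats and therefore has to be written out of place.

```python
import numpy as np

# NumPy stand-in for ak.arange(10); int64 is requested explicitly.
a = np.arange(10, dtype=np.int64)

# Adding an integer in place keeps the int64 dtype.
a += 2

# True division of integers yields floats, which cannot be stored back into
# an int64 array, so the out-of-place form is used to get a new float64 array.
halved = a / 2
```

Here `halved` is a hypothetical variable name; the point is only that the result of `a / 2` has a different dtype than `a`.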

Element-wise Functions

Arrays support several mathematical functions that operate element-wise and return a pdarray of the same length.

arkouda.abs(pda: pdarray) → pdarray

Return the element-wise absolute value of the array.

Parameters:

pda (pdarray)

Returns:

A pdarray containing absolute values of the input array elements

Return type:

pdarray

Raises:

TypeError – Raised if the parameter is not a pdarray

Examples

>>> ak.abs(ak.arange(-5,-1))
array([5, 4, 3, 2])
>>> ak.abs(ak.linspace(-5,-1,5))
array([5, 4, 3, 2, 1])

arkouda.log(pda: pdarray) → pdarray

Return the element-wise natural log of the array.

Parameters:

pda (pdarray)

Returns:

A pdarray containing natural log values of the input array elements

Return type:

pdarray

Raises:

TypeError – Raised if the parameter is not a pdarray

Notes

Logarithms with other bases can be computed as follows:

Examples

>>> A = ak.array([1, 10, 100])
# Natural log
>>> ak.log(A)
array([0, 2.3025850929940459, 4.6051701859880918])
# Log base 10
>>> ak.log(A) / np.log(10)
array([0, 1, 2])
# Log base 2
>>> ak.log(A) / np.log(2)
array([0, 3.3219280948873626, 6.6438561897747253])

arkouda.exp(pda: pdarray) → pdarray

Return the element-wise exponential of the array.

Parameters:

pda (pdarray)

Returns:

A pdarray containing exponential values of the input array elements

Return type:

pdarray

Raises:

TypeError – Raised if the parameter is not a pdarray

Examples

>>> ak.exp(ak.arange(1,5))
array([2.7182818284590451, 7.3890560989306504, 20.085536923187668, 54.598150033144236])
>>> ak.exp(ak.uniform(5,1.0,5.0))
array([11.84010843172504, 46.454368507659211, 5.5571769623557188,
       33.494295836924771, 13.478894913238722])

arkouda.sin(pda: pdarray, where: bool | pdarray = True) → pdarray

Return the element-wise sine of the array.

Parameters:
  • pda (pdarray)

  • where (Boolean or pdarray) – This condition is broadcast over the input. At locations where the condition is True, the sine will be applied to the corresponding value. Elsewhere, it will retain its original value. Default set to True.

Returns:

A pdarray containing sin for each element of the original pdarray

Return type:

pdarray

Raises:

TypeError – Raised if the parameter is not a pdarray

arkouda.cos(pda: pdarray, where: bool | pdarray = True) → pdarray

Return the element-wise cosine of the array.

Parameters:
  • pda (pdarray)

  • where (Boolean or pdarray) – This condition is broadcast over the input. At locations where the condition is True, the cosine will be applied to the corresponding value. Elsewhere, it will retain its original value. Default set to True.

Returns:

A pdarray containing cosine for each element of the original pdarray

Return type:

pdarray

Raises:

TypeError – Raised if the parameter is not a pdarray
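The effect of the where parameter described for sin and cos can be sketched locally. The snippet below is a NumPy stand-in, not Arkouda code: it reproduces the documented behavior (apply the function where the condition is True, keep the original value elsewhere) with np.where. The variable names are illustrative.

```python
import numpy as np

x = np.array([0.0, np.pi / 2, np.pi, 3 * np.pi / 2])
mask = np.array([True, False, True, False])

# Apply sine only where mask is True; keep the original value elsewhere.
masked_sin = np.where(mask, np.sin(x), x)
```

With this mask, positions 0 and 2 hold sin(x) (both approximately 0), while positions 1 and 3 keep their input values.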

Scans

Scans perform a cumulative reduction over a pdarray, returning a pdarray of the same size.

arkouda.cumsum(pda: pdarray) → pdarray

Return the cumulative sum over the array.

The sum is inclusive, such that the i-th element of the result is the sum of elements up to and including i.

Parameters:

pda (pdarray)

Returns:

A pdarray containing cumulative sums for each element of the original pdarray

Return type:

pdarray

Raises:

TypeError – Raised if the parameter is not a pdarray

Examples

>>> ak.cumsum(ak.arange(1,5))
array([1, 3, 6, 10])
>>> ak.cumsum(ak.uniform(5,1.0,5.0))
array([3.1598310770203937, 5.4110385860243131, 9.1622479306453748,
       12.710615785506533, 13.945880905466208])
>>> ak.cumsum(ak.randint(0, 1, 5, dtype=ak.bool))
array([0, 1, 1, 2, 3])

arkouda.cumprod(pda: pdarray) → pdarray

Return the cumulative product over the array.

The product is inclusive, such that the i-th element of the result is the product of elements up to and including i.

Parameters:

pda (pdarray)

Returns:

A pdarray containing cumulative products for each element of the original pdarray

Return type:

pdarray

Raises:

TypeError – Raised if the parameter is not a pdarray

Examples

>>> ak.cumprod(ak.arange(1,5))
array([1, 2, 6, 24])
>>> ak.cumprod(ak.uniform(5,1.0,5.0))
array([1.5728783400481925, 7.0472855509390593, 33.78523998586553,
       134.05309592737584, 450.21589865655358])
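The inclusive definition used by both scans can be checked locally. The following is a minimal sketch using NumPy as a stand-in for pdarray (an assumption made so it runs without a server): element i of each scan equals the reduction over the first i + 1 input elements.

```python
import numpy as np

values = np.arange(1, 5)            # [1, 2, 3, 4]
running_sum = np.cumsum(values)     # inclusive scan: [1, 3, 6, 10]
running_prod = np.cumprod(values)   # inclusive scan: [1, 2, 6, 24]

# Element i of an inclusive scan equals the reduction of values[: i + 1].
i = 2
assert running_sum[i] == values[: i + 1].sum()
assert running_prod[i] == values[: i + 1].prod()
```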

Reductions

Reductions return a scalar value.

arkouda.any(pda: pdarray) → bool_

Return True iff any element of the array evaluates to True.

Parameters:

pda (pdarray) – The pdarray instance to be evaluated

Returns:

Indicates whether at least one pdarray element evaluates to True

Return type:

bool

Raises:
  • TypeError – Raised if pda is not a pdarray instance

  • RuntimeError – Raised if there’s a server-side error thrown

arkouda.all(pda: pdarray) → bool_

Return True iff all elements of the array evaluate to True.

Parameters:

pda (pdarray) – The pdarray instance to be evaluated

Returns:

Indicates if all pdarray elements evaluate to True

Return type:

bool

Raises:
  • TypeError – Raised if pda is not a pdarray instance

  • RuntimeError – Raised if there’s a server-side error thrown

arkouda.is_sorted(pda: pdarray) → bool_

Return True iff the array is monotonically non-decreasing.

Parameters:

pda (pdarray) – The pdarray instance to be evaluated

Returns:

Indicates if the array is monotonically non-decreasing

Return type:

bool

Raises:
  • TypeError – Raised if pda is not a pdarray instance

  • RuntimeError – Raised if there’s a server-side error thrown
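"Monotonically non-decreasing" means that each element is greater than or equal to its predecessor, so repeated values are allowed. The helper below is a hypothetical NumPy reference for that definition, not the Arkouda implementation.

```python
import numpy as np


def is_non_decreasing(arr: np.ndarray) -> bool:
    """Return True if every element is >= the element before it."""
    # Compare each element with its successor; an empty comparison is True.
    return bool(np.all(arr[:-1] <= arr[1:]))


sorted_with_ties = is_non_decreasing(np.array([1, 2, 2, 5]))  # True
unsorted = is_non_decreasing(np.array([3, 1, 2]))             # False
```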

arkouda.sum(pda: pdarray) → float64

Return the sum of all elements in the array.

Parameters:

pda (pdarray) – Values for which to calculate the sum

Returns:

The sum of all elements in the array

Return type:

np.float64

Raises:
  • TypeError – Raised if pda is not a pdarray instance

  • RuntimeError – Raised if there’s a server-side error thrown

arkouda.prod(pda: pdarray) → float64

Return the product of all elements in the array. The return value is always an np.float64 or np.int64.

Parameters:

pda (pdarray) – Values for which to calculate the product

Returns:

The product calculated from the pda

Return type:

numpy_scalars

Raises:
  • TypeError – Raised if pda is not a pdarray instance

  • RuntimeError – Raised if there’s a server-side error thrown

arkouda.min(pda: pdarray) → float64 | float32 | int8 | int16 | int32 | int64 | bool_ | str_ | uint8 | uint16 | uint32 | uint64

Return the minimum value of the array.

Parameters:

pda (pdarray) – Values for which to calculate the min

Returns:

The min calculated from the pda

Return type:

numpy_scalars

Raises:
  • TypeError – Raised if pda is not a pdarray instance

  • RuntimeError – Raised if there’s a server-side error thrown

arkouda.max(pda: pdarray) → float64 | float32 | int8 | int16 | int32 | int64 | bool_ | str_ | uint8 | uint16 | uint32 | uint64

Return the maximum value of the array.

Parameters:

pda (pdarray) – Values for which to calculate the max

Returns:

The max calculated from the pda

Return type:

numpy_scalars

Raises:
  • TypeError – Raised if pda is not a pdarray instance

  • RuntimeError – Raised if there’s a server-side error thrown

arkouda.argmin(pda: pdarray) → int64 | uint64

Return the index of the first occurrence of the array min value.

Parameters:

pda (pdarray) – Values for which to calculate the argmin

Returns:

The index of the argmin calculated from the pda

Return type:

Union[np.int64, np.uint64]

Raises:
  • TypeError – Raised if pda is not a pdarray instance

  • RuntimeError – Raised if there’s a server-side error thrown

arkouda.argmax(pda: pdarray) → int64 | uint64

Return the index of the first occurrence of the array max value.

Parameters:

pda (pdarray) – Values for which to calculate the argmax

Returns:

The index of the argmax calculated from the pda

Return type:

Union[np.int64, np.uint64]

Raises:
  • TypeError – Raised if pda is not a pdarray instance

  • RuntimeError – Raised if there’s a server-side error thrown
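The "first occurrence" rule for argmin and argmax matters when the extreme value is repeated. The sketch below uses NumPy, whose argmin and argmax follow the same first-occurrence rule, as a local stand-in; the sample values are illustrative.

```python
import numpy as np

# The minimum (1) appears at indices 1 and 3; the maximum (5) at 4 and 5.
values = np.array([3, 1, 4, 1, 5, 5])

first_min = int(np.argmin(values))  # index of the first 1
first_max = int(np.argmax(values))  # index of the first 5
```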

arkouda.mean(pda: pdarray) → float64

Return the mean of the array.

Parameters:

pda (pdarray) – Values for which to calculate the mean

Returns:

The mean calculated from the pda sum and size

Return type:

np.float64

Raises:
  • TypeError – Raised if pda is not a pdarray instance

  • RuntimeError – Raised if there’s a server-side error thrown

arkouda.var(pda: pdarray, ddof: int | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64 = 0) → float64

Return the variance of values in the array.

Parameters:
  • pda (pdarray) – Values for which to calculate the variance

  • ddof (int_scalars) – “Delta Degrees of Freedom” used in calculating var

Returns:

The scalar variance of the array

Return type:

np.float64

Raises:
  • TypeError – Raised if pda is not a pdarray instance

  • ValueError – Raised if the ddof >= pdarray size

  • RuntimeError – Raised if there’s a server-side error thrown

See also

mean, std

Notes

The variance is the average of the squared deviations from the mean, i.e., var = mean((x - x.mean())**2).

The mean of the squared deviations is normally calculated as their sum divided by N, where N = len(x). If, however, ddof is specified, the divisor N - ddof is used instead. In standard statistical practice, ddof=1 provides an unbiased estimator of the variance of a hypothetical infinite population. ddof=0 provides a maximum likelihood estimate of the variance for normally distributed variables.

arkouda.std(pda: pdarray, ddof: int | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64 = 0) → float64

Return the standard deviation of values in the array. The standard deviation is implemented as the square root of the variance.

Parameters:
  • pda (pdarray) – values for which to calculate the standard deviation

  • ddof (int_scalars) – “Delta Degrees of Freedom” used in calculating std

Returns:

The scalar standard deviation of the array

Return type:

np.float64

Raises:
  • TypeError – Raised if pda is not a pdarray instance or ddof is not an integer

  • ValueError – Raised if ddof is an integer < 0

  • RuntimeError – Raised if there’s a server-side error thrown

See also

mean, var

Notes

The standard deviation is the square root of the average of the squared deviations from the mean, i.e., std = sqrt(mean((x - x.mean())**2)).

The average squared deviation is normally calculated as ((x - x.mean())**2).sum() / N, where N = len(x). If, however, ddof is specified, the divisor N - ddof is used instead. In standard statistical practice, ddof=1 provides an unbiased estimator of the variance of the infinite population. ddof=0 provides a maximum likelihood estimate of the variance for normally distributed variables. The standard deviation computed in this function is the square root of the estimated variance, so even with ddof=1, it will not be an unbiased estimate of the standard deviation per se.
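The role of ddof in var and std can be worked through by hand. The following sketch uses NumPy as a local stand-in (an assumption made so it runs without a server) and computes the variance with divisors N and N - 1 from the squared deviations; the sample data and variable names are illustrative.

```python
import numpy as np

x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
n = len(x)

# Squared deviations from the mean (the mean here is 5.0, their sum is 32.0).
squared_dev = (x - x.mean()) ** 2

var_population = squared_dev.sum() / n    # ddof=0: divisor N
var_sample = squared_dev.sum() / (n - 1)  # ddof=1: divisor N - 1
std_sample = np.sqrt(var_sample)          # std is the square root of var
```

For this data the ddof=0 variance is 4.0 and the ddof=1 variance is 32/7, so the choice of ddof changes the result noticeably for small arrays.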

arkouda.mink(pda: pdarray, k: int | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64) → pdarray

Find the k minimum values of an array.

Returns the smallest k values of an array, sorted.

Parameters:
  • pda (pdarray) – Input array.

  • k (int_scalars) – The desired count of minimum values to be returned by the output.

Returns:

The minimum k values from pda, sorted

Return type:

pdarray

Raises:
  • TypeError – Raised if pda is not a pdarray

  • ValueError – Raised if the pda is empty or k < 1

Notes

This call is equivalent in value to:

a[ak.argsort(a)[:k]]

and generally outperforms this operation.

This reduction will see a significant drop in performance as k grows beyond a certain value. This value is system dependent, but generally about a k of 5 million is where performance degradation has been observed.

Examples

>>> A = ak.array([10,5,1,3,7,2,9,0])
>>> ak.mink(A, 3)
array([0, 1, 2])
>>> ak.mink(A, 4)
array([0, 1, 2, 3])

arkouda.maxk(pda: pdarray, k: int | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64) → pdarray

Find the k maximum values of an array.

Returns the largest k values of an array, sorted.

Parameters:
  • pda (pdarray) – Input array.

  • k (int_scalars) – The desired count of maximum values to be returned by the output.

Returns:

The maximum k values from pda, sorted

Return type:

pdarray, int

Raises:
  • TypeError – Raised if pda is not a pdarray or k is not an integer

  • ValueError – Raised if the pda is empty or k < 1

Notes

This call is equivalent in value to:

a[ak.argsort(a)[-k:]]

and generally outperforms this operation.

This reduction will see a significant drop in performance as k grows beyond a certain value. This value is system dependent, but generally about a k of 5 million is where performance degradation has been observed.

Examples

>>> A = ak.array([10,5,1,3,7,2,9,0])
>>> ak.maxk(A, 3)
array([7, 9, 10])
>>> ak.maxk(A, 4)
array([5, 7, 9, 10])

arkouda.argmink(pda: pdarray, k: int | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64) → pdarray

Find the indices corresponding to the k minimum values of an array.

Parameters:
  • pda (pdarray) – Input array.

  • k (int_scalars) – The desired count of indices corresponding to minimum array values

Returns:

The indices of the minimum k values from the pda, sorted

Return type:

pdarray, int

Raises:
  • TypeError – Raised if pda is not a pdarray or k is not an integer

  • ValueError – Raised if the pda is empty or k < 1

Notes

This call is equivalent in value to:

ak.argsort(a)[:k]

and generally outperforms this operation.

This reduction will see a significant drop in performance as k grows beyond a certain value. This value is system dependent, but generally about a k of 5 million is where performance degradation has been observed.

Examples

>>> A = ak.array([10,5,1,3,7,2,9,0])
>>> ak.argmink(A, 3)
array([7, 2, 5])
>>> ak.argmink(A, 4)
array([7, 2, 5, 3])

arkouda.argmaxk(pda: pdarray, k: int | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64) → pdarray

Find the indices corresponding to the k maximum values of an array.

Returns the indices of the largest k values of an array, sorted.

Parameters:
  • pda (pdarray) – Input array.

  • k (int_scalars) – The desired count of indices corresponding to maximum array values

Returns:

The indices of the maximum k values from the pda, sorted

Return type:

pdarray, int

Raises:
  • TypeError – Raised if pda is not a pdarray or k is not an integer

  • ValueError – Raised if the pda is empty or k < 1

Notes

This call is equivalent in value to:

ak.argsort(a)[-k:]

and generally outperforms this operation.

This reduction will see a significant drop in performance as k grows beyond a certain value. This value is system dependent, but generally about a k of 5 million is where performance degradation has been observed.

Examples

>>> A = ak.array([10,5,1,3,7,2,9,0])
>>> ak.argmaxk(A, 3)
array([4, 6, 0])
>>> ak.argmaxk(A, 4)
array([1, 4, 6, 0])
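The argsort-based equivalents of the four k-selection functions can be checked against the example array used above. The sketch below uses NumPy as a local stand-in for pdarray; the names ending in _ref are hypothetical reference values, not Arkouda calls. The minimum-side functions take the first k entries of the ascending sort order, and the maximum-side functions take the last k.

```python
import numpy as np

a = np.array([10, 5, 1, 3, 7, 2, 9, 0])
k = 3

# Indices that sort a in ascending order: [7, 2, 5, 3, 1, 4, 6, 0].
order = np.argsort(a, kind="stable")

argmink_ref = order[:k]     # indices of the k smallest values
argmaxk_ref = order[-k:]    # indices of the k largest values
mink_ref = a[argmink_ref]   # the k smallest values, ascending
maxk_ref = a[argmaxk_ref]   # the k largest values, ascending
```

A full sort does more work than selecting k elements, which is consistent with the note that the dedicated functions generally outperform the argsort form.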

Where

The where function is a way to multiplex two pdarray (or a pdarray and a scalar) based on a condition:

arkouda.where(condition: pdarray, A: str | float | float64 | float32 | int | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64 | pdarray | Strings | Categorical, B: str | float | float64 | float32 | int | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64 | pdarray | Strings | Categorical) → pdarray | Strings | Categorical

Returns an array with elements chosen from A and B based upon a conditioning array. As is the case with numpy.where, the return array consists of values from the first array (A) where the conditioning array elements are True and from the second array (B) where the conditioning array elements are False.

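Parameters:
  • condition (pdarray) – Used to choose values from A or B

  • A (str, numeric scalar, pdarray, Strings, or Categorical) – Value(s) used when the condition is True

  • B (str, numeric scalar, pdarray, Strings, or Categorical) – Value(s) used when the condition is False

Returns: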

Values chosen from A where the condition is True and B where the condition is False

Return type:

pdarray

Raises:
  • TypeError – Raised if the condition object is not a pdarray; if A or B is not an int, np.int64, float, np.float64, pdarray, str, Strings, or Categorical; if pdarray dtypes are not supported or do not match; or if multiple condition clauses (see Notes section) are applied

  • ValueError – Raised if the shapes of the condition, A, and B pdarrays are unequal

Examples

>>> a1 = ak.arange(1,10)
>>> a2 = ak.ones(9, dtype=np.int64)
>>> cond = a1 < 5
>>> ak.where(cond,a1,a2)
array([1, 2, 3, 4, 1, 1, 1, 1, 1])
>>> a1 = ak.arange(1,10)
>>> a2 = ak.ones(9, dtype=np.int64)
>>> cond = a1 == 5
>>> ak.where(cond,a1,a2)
array([1, 1, 1, 1, 5, 1, 1, 1, 1])
>>> a1 = ak.arange(1,10)
>>> a2 = 10
>>> cond = a1 < 5
>>> ak.where(cond,a1,a2)
array([1, 2, 3, 4, 10, 10, 10, 10, 10])
>>> s1 = ak.array([f'str {i}' for i in range(10)])
>>> s2 = 'str 21'
>>> cond = (ak.arange(10) % 2 == 0)
>>> ak.where(cond,s1,s2)
array(['str 0', 'str 21', 'str 2', 'str 21', 'str 4', 'str 21', 'str 6', 'str 21', 'str 8', 'str 21'])
>>> c1 = ak.Categorical(ak.array([f'str {i}' for i in range(10)]))
>>> c2 = ak.Categorical(ak.array([f'str {i}' for i in range(9, -1, -1)]))
>>> cond = (ak.arange(10) % 2 == 0)
>>> ak.where(cond,c1,c2)
array(['str 0', 'str 8', 'str 2', 'str 6', 'str 4', 'str 4', 'str 6', 'str 2', 'str 8', 'str 0'])

Notes

A and B must have the same dtype, and only one conditional clause is supported. Passing multiple clauses (e.g., n < 5, n > 1), which is supported in numpy, is not currently supported in Arkouda.
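The same-dtype requirement can be illustrated with a local sketch. The snippet below uses NumPy as a stand-in for pdarray, so it does not exercise Arkouda itself. NumPy would upcast mismatched inputs silently; the explicit cast is shown only as one way to make both value arrays share a dtype before selecting, and the corresponding cast on a pdarray is not shown here. The variable names are illustrative.

```python
import numpy as np

a1 = np.arange(1, 10, dtype=np.int64)
fallback = np.full(9, 0.5)   # float64, which does not match a1's dtype

cond = a1 < 5                # a single conditional clause

# Cast explicitly so both value arrays share one dtype before selecting.
chosen = np.where(cond, a1.astype(np.float64), fallback)
```

The result takes values from the first array where cond is True and from the second where it is False, mirroring the selection rule described for arkouda.where.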