Arithmetic and Numeric Operations
Vector and Scalar Arithmetic
A large subset of Python’s binary and in-place operators are supported on pdarray
objects. Where supported, the behavior of these operators is identical to that of NumPy ndarray
objects.
>>> A = ak.arange(10)
>>> A += 2
>>> A
array([2, 3, 4, 5, 6, 7, 8, 9, 10, 11])
>>> A + A
array([4, 6, 8, 10, 12, 14, 16, 18, 20, 22])
>>> 2 * A
array([4, 6, 8, 10, 12, 14, 16, 18, 20, 22])
>>> A == A
array([True, True, True, True, True, True, True, True, True, True])
Operations that are not implemented raise a RuntimeError. In-place operations that would change the dtype of the pdarray are not implemented.
Element-wise Functions
Arrays support several mathematical functions that operate element-wise and return a pdarray of the same length.
- arkouda.abs(pda)
Return the element-wise absolute value of the array.
- Parameters:
pda (pdarray)
- Returns:
A pdarray containing absolute values of the input array elements
- Return type:
pdarray
- Raises:
TypeError – Raised if the parameter is not a pdarray
Examples
>>> ak.abs(ak.arange(-5,-1))
array([5, 4, 3, 2])
>>> ak.abs(ak.linspace(-5,-1,5))
array([5, 4, 3, 2, 1])
- arkouda.log(pda)
Return the element-wise natural log of the array.
- Parameters:
pda (pdarray)
- Returns:
A pdarray containing natural log values of the input array elements
- Return type:
pdarray
- Raises:
TypeError – Raised if the parameter is not a pdarray
Notes
Logarithms with other bases can be computed as follows:
Examples
>>> A = ak.array([1, 10, 100])
# Natural log
>>> ak.log(A)
array([0, 2.3025850929940459, 4.6051701859880918])
# Log base 10
>>> ak.log(A) / np.log(10)
array([0, 1, 2])
# Log base 2
>>> ak.log(A) / np.log(2)
array([0, 3.3219280948873626, 6.6438561897747253])
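The change-of-base identity behind these examples, log_b(x) = ln(x) / ln(b), can be checked locally with Python's standard math module. The following is a plain-Python sketch of the arithmetic only; it does not call Arkouda, and the helper name log_base is hypothetical.

```python
import math


def log_base(values, base):
    """Element-wise logarithm in an arbitrary base via change of base."""
    denom = math.log(base)  # natural log of the target base
    return [math.log(v) / denom for v in values]


values = [1, 10, 100]
log10_vals = log_base(values, 10)  # approximately [0, 1, 2]
log2_vals = log_base(values, 2)
```

Dividing by a precomputed scalar mirrors the pattern in the examples, where the array-valued natural log is divided by a single NumPy scalar.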
- arkouda.exp(pda)
Return the element-wise exponential of the array.
- Parameters:
pda (pdarray)
- Returns:
A pdarray containing exponential values of the input array elements
- Return type:
pdarray
- Raises:
TypeError – Raised if the parameter is not a pdarray
Examples
>>> ak.exp(ak.arange(1,5))
array([2.7182818284590451, 7.3890560989306504, 20.085536923187668, 54.598150033144236])
>>> ak.exp(ak.uniform(5,1.0,5.0))
array([11.84010843172504, 46.454368507659211, 5.5571769623557188, 33.494295836924771, 13.478894913238722])
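- arkouda.sin(pda, where=True)
Return the element-wise sine of the array.
- Parameters:
pda (pdarray)
where (bool or pdarray, default True) – Condition broadcast over the input. Where the condition is True, the sine is applied to the corresponding value; elsewhere, the original value is retained.
- Returns:
A pdarray containing the sine of each element of the original pdarray
- Return type:
pdarray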
- Raises:
TypeError – Raised if the parameter is not a pdarray
Scans
Scans perform a cumulative reduction over a pdarray, returning a pdarray of the same size.
- arkouda.cumsum(pda)
Return the cumulative sum over the array.
The sum is inclusive, such that the i-th element of the result is the sum of elements up to and including i.
- Parameters:
pda (pdarray)
- Returns:
A pdarray containing cumulative sums for each element of the original pdarray
- Return type:
pdarray
- Raises:
TypeError – Raised if the parameter is not a pdarray
Examples
>>> ak.cumsum(ak.arange(1,5))
array([1, 3, 6, 10])
>>> ak.cumsum(ak.uniform(5,1.0,5.0))
array([3.1598310770203937, 5.4110385860243131, 9.1622479306453748, 12.710615785506533, 13.945880905466208])
>>> ak.cumsum(ak.randint(0, 1, 5, dtype=ak.bool_))
array([0, 1, 1, 2, 3])
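The inclusive-scan semantics described for cumsum can be illustrated with a short plain-Python sketch built on itertools.accumulate. It models the arithmetic only and is not Arkouda's implementation; the helper name inclusive_cumsum is hypothetical.

```python
from itertools import accumulate


def inclusive_cumsum(values):
    """Running sum where result[i] includes values[i] itself."""
    return list(accumulate(values))


ints = inclusive_cumsum([1, 2, 3, 4])

# Booleans are summed as 0/1, so the scan is a running count of True values.
flags = inclusive_cumsum(int(f) for f in [False, True, False, True, True])
```

Because the scan is inclusive, the last element of the result equals the sum of the whole input.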
- arkouda.cumprod(pda)
Return the cumulative product over the array.
The product is inclusive, such that the i-th element of the result is the product of elements up to and including i.
- Parameters:
pda (pdarray)
- Returns:
A pdarray containing cumulative products for each element of the original pdarray
- Return type:
pdarray
- Raises:
TypeError – Raised if the parameter is not a pdarray
Examples
>>> ak.cumprod(ak.arange(1,5))
array([1, 2, 6, 24])
>>> ak.cumprod(ak.uniform(5,1.0,5.0))
array([1.5728783400481925, 7.0472855509390593, 33.78523998586553, 134.05309592737584, 450.21589865655358])
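As a rough local check of the inclusive product scan, the sketch below uses itertools.accumulate with operator.mul and compares the final entry with the full reduction from math.prod. It is plain Python under the assumption that cumprod follows the inclusive definition given above; inclusive_cumprod is a hypothetical helper name.

```python
from itertools import accumulate
from math import prod
from operator import mul


def inclusive_cumprod(values):
    """Running product where result[i] includes values[i] itself."""
    return list(accumulate(values, mul))


values = [1, 2, 3, 4]
running = inclusive_cumprod(values)

# The final entry of an inclusive scan equals the full reduction.
total = prod(values)
```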
Reductions
Reductions return a scalar value; those that accept an axis argument return a pdarray when reducing along specific axes.
- arkouda.any(pda, axis=None, keepdims=False)
Return True iff any element of the array evaluates to True.
- Parameters:
pda (pdarray) – The pdarray instance to be evaluated.
axis (int or Tuple[int, ...], optional) – The axis or axes along which to compute the reduction. If None, the reduction of the entire array is computed (returning a scalar).
keepdims (bool, optional) – Whether to keep the singleton dimension(s) along axis in the result.
- Returns:
Indicates if any pdarray element evaluates to True.
- Return type:
pdarray or bool
- Raises:
TypeError – Raised if pda is not a pdarray instance
RuntimeError – Raised if there’s a server-side error thrown
- arkouda.all(pda, axis=None, keepdims=False)
Return True iff all elements of the array evaluate to True.
- Parameters:
pda (pdarray) – The pdarray instance to be evaluated.
axis (int or Tuple[int, ...], optional) – The axis or axes along which to compute the reduction. If None, the reduction of the entire array is computed (returning a scalar).
keepdims (bool, optional) – Whether to keep the singleton dimension(s) along axis in the result.
- Returns:
Indicates if all pdarray elements evaluate to True.
- Return type:
pdarray or bool
- Raises:
TypeError – Raised if pda is not a pdarray instance
RuntimeError – Raised if there’s a server-side error thrown
- arkouda.is_sorted(pda, axis=None, keepdims=False)
Return True iff the array is monotonically non-decreasing.
- Parameters:
pda (pdarray) – The pdarray instance to be evaluated.
axis (int or Tuple[int, ...], optional) – The axis or axes along which to compute the reduction. If None, the reduction of the entire array is computed (returning a scalar).
keepdims (bool, optional) – Whether to keep the singleton dimension(s) along axis in the result.
- Returns:
Indicates if the array is monotonically non-decreasing.
- Return type:
pdarray or bool
- Raises:
TypeError – Raised if pda is not a pdarray instance
RuntimeError – Raised if there’s a server-side error thrown
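"Monotonically non-decreasing" means every element is less than or equal to its successor, so repeated values are allowed. A minimal plain-Python sketch of that check for the one-dimensional case follows; it is an illustration of the definition rather than Arkouda's implementation, and is_non_decreasing is a hypothetical name.

```python
def is_non_decreasing(values):
    """True when every element is <= the element that follows it."""
    return all(a <= b for a, b in zip(values, values[1:]))


with_ties = is_non_decreasing([1, 2, 2, 5])   # ties are allowed
out_of_order = is_non_decreasing([3, 1, 2])
descending = is_non_decreasing([5, 4, 3])
```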
- arkouda.sum(pda, axis=None, keepdims=False)
Return the sum of all elements in the array.
- Parameters:
pda (pdarray) – The pdarray instance to be evaluated.
axis (int or Tuple[int, ...], optional) – The axis or axes along which to compute the sum. If None, the reduction of the entire array is computed (returning a scalar).
keepdims (bool, optional) – Whether to keep the singleton dimension(s) along axis in the result.
- Returns:
The sum of all elements in the array.
- Return type:
- Raises:
TypeError – Raised if pda is not a pdarray instance
RuntimeError – Raised if there’s a server-side error thrown
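The axis and keepdims parameters can be pictured with a small plain-Python model of a two-dimensional sum. This sketch assumes the NumPy-style convention that the parameter descriptions suggest (axis=None reduces everything, and keepdims retains reduced dimensions with size one); it operates on nested lists, does not call Arkouda, and sum_2d is a hypothetical helper.

```python
def sum_2d(rows, axis=None, keepdims=False):
    """Sum a rectangular list of rows for axis in {None, 0, 1}."""
    if axis is None:
        # Reduce over every element and return a scalar.
        total = sum(sum(row) for row in rows)
        return [[total]] if keepdims else total
    if axis == 0:
        # Collapse the row dimension: one sum per column.
        sums = [sum(col) for col in zip(*rows)]
        return [sums] if keepdims else sums
    if axis == 1:
        # Collapse the column dimension: one sum per row.
        sums = [sum(row) for row in rows]
        return [[s] for s in sums] if keepdims else sums
    raise ValueError("axis must be None, 0, or 1 in this sketch")


data = [[1, 2, 3],
        [4, 5, 6]]

total = sum_2d(data)                          # scalar
per_column = sum_2d(data, axis=0)             # shape (3,)
per_row = sum_2d(data, axis=1)                # shape (2,)
per_row_kept = sum_2d(data, axis=1, keepdims=True)  # shape (2, 1)
```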
- arkouda.prod(pda, axis=None, keepdims=False)
Return the product of all elements in the array. The return value is always a np.float64 or np.int64.
- Parameters:
pda (pdarray) – The pdarray instance to be evaluated.
axis (int or Tuple[int, ...], optional) – The axis or axes along which to compute the reduction. If None, the reduction of the entire array is computed (returning a scalar).
keepdims (bool, optional) – Whether to keep the singleton dimension(s) along axis in the result.
- Returns:
The product calculated from the pda.
- Return type:
- Raises:
TypeError – Raised if pda is not a pdarray instance
RuntimeError – Raised if there’s a server-side error thrown
- arkouda.min(pda, axis=None, keepdims=False)
Return the minimum value of the array.
- Parameters:
pda (pdarray) – The pdarray instance to be evaluated.
axis (int or Tuple[int, ...], optional) – The axis or axes along which to compute the reduction. If None, the reduction of the entire array is computed (returning a scalar).
keepdims (bool, optional) – Whether to keep the singleton dimension(s) along axis in the result.
- Returns:
The min calculated from the pda.
- Return type:
- Raises:
TypeError – Raised if pda is not a pdarray instance
RuntimeError – Raised if there’s a server-side error thrown
- arkouda.max(pda, axis=None, keepdims=False)
Return the maximum value of the array.
- Parameters:
pda (pdarray) – The pdarray instance to be evaluated.
axis (int or Tuple[int, ...], optional) – The axis or axes along which to compute the reduction. If None, the reduction of the entire array is computed (returning a scalar).
keepdims (bool, optional) – Whether to keep the singleton dimension(s) along axis in the result.
- Returns:
The max calculated from the pda.
- Return type:
- Raises:
TypeError – Raised if pda is not a pdarray instance
RuntimeError – Raised if there’s a server-side error thrown
- arkouda.argmin(pda, axis=None, keepdims=False)
Return the argmin of the array along the specified axis. This is returned as the ordered index.
- Parameters:
pda (pdarray) – The pdarray instance to be evaluated.
axis (int, optional) – The axis along which to compute the index reduction. If None, the reduction of the entire array is computed (returning a scalar).
keepdims (bool, optional) – Whether to keep the singleton dimension(s) along axis in the result.
- Returns:
The argmin of the array.
- Return type:
- Raises:
TypeError – Raised if pda is not a pdarray instance or if axis is not an int.
RuntimeError – Raised if there’s a server-side error thrown.
- arkouda.argmax(pda, axis=None, keepdims=False)
Return the argmax of the array along the specified axis. This is returned as the ordered index.
- Parameters:
pda (pdarray) – The pdarray instance to be evaluated.
axis (int, optional) – The axis along which to compute the index reduction. If None, the reduction of the entire array is computed (returning a scalar).
keepdims (bool, optional) – Whether to keep the singleton dimension(s) along axis in the result.
- Returns:
The argmax of the array.
- Return type:
- Raises:
TypeError – Raised if pda is not a pdarray instance or if axis is not an int.
RuntimeError – Raised if there’s a server-side error thrown.
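For a one-dimensional array with axis=None, argmin and argmax return the position of the extreme value rather than the value itself. The plain-Python sketch below illustrates that relationship; it uses distinct values because tie-breaking behavior is not stated here, and the helper names are hypothetical stand-ins rather than Arkouda calls.

```python
def argmin(values):
    """Index of the smallest element."""
    return min(range(len(values)), key=values.__getitem__)


def argmax(values):
    """Index of the largest element."""
    return max(range(len(values)), key=values.__getitem__)


values = [10, 5, 1, 3, 7, 2, 9, 0]
lo_idx = argmin(values)
hi_idx = argmax(values)
```

Indexing the input with the returned position recovers the result of the corresponding min or max reduction.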
- arkouda.mean(pda)
Return the mean of the array.
- Parameters:
pda (pdarray) – Values for which to calculate the mean
- Returns:
The mean calculated from the pda sum and size
- Return type:
np.float64
- Raises:
TypeError – Raised if pda is not a pdarray instance
RuntimeError – Raised if there’s a server-side error thrown
- arkouda.var(pda, ddof=0)
Return the variance of values in the array.
- Parameters:
pda (pdarray) – Values for which to calculate the variance
ddof (int_scalars) – “Delta Degrees of Freedom” used in calculating var
- Returns:
The scalar variance of the array
- Return type:
np.float64
- Raises:
TypeError – Raised if pda is not a pdarray instance
ValueError – Raised if the ddof >= pdarray size
RuntimeError – Raised if there’s a server-side error thrown
Notes
The variance is the average of the squared deviations from the mean, i.e., var = mean((x - x.mean())**2).
The mean is normally calculated as x.sum() / N, where N = len(x). If, however, ddof is specified, the divisor N - ddof is used instead. In standard statistical practice, ddof=1 provides an unbiased estimator of the variance of a hypothetical infinite population. ddof=0 provides a maximum likelihood estimate of the variance for normally distributed variables.
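The effect of ddof on the divisor can be reproduced with a few lines of plain Python and cross-checked against the standard statistics module. This is a sketch of the formula in the notes, not Arkouda's implementation; the function name variance and the sample data are illustrative.

```python
import statistics


def variance(values, ddof=0):
    """Average squared deviation from the mean, with divisor N - ddof."""
    n = len(values)
    if ddof >= n:
        raise ValueError("ddof must be smaller than the number of values")
    mean = sum(values) / n
    squared_dev = [(x - mean) ** 2 for x in values]
    return sum(squared_dev) / (n - ddof)


sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
population_var = variance(sample)       # divisor N
sample_var = variance(sample, ddof=1)   # divisor N - 1
```

With this data the squared deviations sum to 32, so the two estimates are 32 / 8 and 32 / 7 respectively.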
- arkouda.std(pda, ddof=0)
Return the standard deviation of values in the array. The standard deviation is implemented as the square root of the variance.
- Parameters:
pda (pdarray) – values for which to calculate the standard deviation
ddof (int_scalars) – “Delta Degrees of Freedom” used in calculating std
- Returns:
The scalar standard deviation of the array
- Return type:
np.float64
- Raises:
TypeError – Raised if pda is not a pdarray instance or ddof is not an integer
ValueError – Raised if ddof is an integer < 0
RuntimeError – Raised if there’s a server-side error thrown
Notes
The standard deviation is the square root of the average of the squared deviations from the mean, i.e., std = sqrt(mean((x - x.mean())**2)).
The average squared deviation is normally calculated as x.sum() / N, where N = len(x). If, however, ddof is specified, the divisor N - ddof is used instead. In standard statistical practice, ddof=1 provides an unbiased estimator of the variance of the infinite population. ddof=0 provides a maximum likelihood estimate of the variance for normally distributed variables. The standard deviation computed in this function is the square root of the estimated variance, so even with ddof=1, it will not be an unbiased estimate of the standard deviation per se.
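Since the standard deviation is defined here as the square root of the variance, it can be sketched in plain Python by taking math.sqrt of the ddof-adjusted average squared deviation. The code below is an illustration of that definition only; std_dev is a hypothetical helper, and the results are compared with the standard statistics module.

```python
import math
import statistics


def std_dev(values, ddof=0):
    """Square root of the variance computed with divisor N - ddof."""
    n = len(values)
    mean = sum(values) / n
    var = sum((x - mean) ** 2 for x in values) / (n - ddof)
    return math.sqrt(var)


sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
std_population = std_dev(sample)        # ddof=0
std_sample = std_dev(sample, ddof=1)    # ddof=1
```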
- arkouda.mink(pda, k)
Find the k minimum values of an array.
Returns the smallest k values of an array, sorted.
- Parameters:
pda (pdarray) – Input array.
k (int_scalars) – The desired count of minimum values to be returned by the output.
- Returns:
The minimum k values from pda, sorted
- Return type:
pdarray
- Raises:
TypeError – Raised if pda is not a pdarray
ValueError – Raised if the pda is empty or k < 1
Notes
This call is equivalent in value to:
a[ak.argsort(a)[:k]]
and generally outperforms this operation.
This reduction will see a significant drop in performance as k grows beyond a certain value. This value is system dependent, but generally about a k of 5 million is where performance degradation has been observed.
Examples
>>> A = ak.array([10,5,1,3,7,2,9,0])
>>> ak.mink(A, 3)
array([0, 1, 2])
>>> ak.mink(A, 4)
array([0, 1, 2, 3])
- arkouda.maxk(pda, k)
Find the k maximum values of an array.
Returns the largest k values of an array, sorted.
- Parameters:
pda (pdarray) – Input array.
k (int_scalars) – The desired count of maximum values to be returned by the output.
- Returns:
The maximum k values from pda, sorted
- Return type:
pdarray, int
- Raises:
TypeError – Raised if pda is not a pdarray or k is not an integer
ValueError – Raised if the pda is empty or k < 1
Notes
This call is equivalent in value to:
a[ak.argsort(a)[-k:]]
and generally outperforms this operation.
This reduction will see a significant drop in performance as k grows beyond a certain value. This value is system dependent, but generally about a k of 5 million is where performance degradation has been observed.
Examples
>>> A = ak.array([10,5,1,3,7,2,9,0])
>>> ak.maxk(A, 3)
array([7, 9, 10])
>>> ak.maxk(A, 4)
array([5, 7, 9, 10])
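The value-level equivalence between a top-k selection and slicing a full sort can be checked in plain Python. The sketch below uses heapq for the selection, which is an assumption made for illustration and says nothing about how Arkouda implements mink or maxk; the function names are local stand-ins, not Arkouda calls.

```python
import heapq


def mink(values, k):
    """k smallest values in ascending order."""
    return heapq.nsmallest(k, values)


def maxk(values, k):
    """k largest values in ascending order."""
    # nlargest yields descending order, so re-sort to ascending.
    return sorted(heapq.nlargest(k, values))


a = [10, 5, 1, 3, 7, 2, 9, 0]
smallest3 = mink(a, 3)
largest3 = maxk(a, 3)

# Full-sort equivalents: first k and last k elements of the sorted data.
smallest3_by_sort = sorted(a)[:3]
largest3_by_sort = sorted(a)[-3:]
```

Selecting k items avoids ordering the entire input, which is one plausible reason a dedicated top-k routine can outperform a full argsort for small k.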
- arkouda.argmink(pda, k)
Find the indices corresponding to the k minimum values of an array.
- Parameters:
pda (pdarray) – Input array.
k (int_scalars) – The desired count of indices corresponding to minimum array values
- Returns:
The indices of the minimum k values from the pda, sorted
- Return type:
pdarray, int
- Raises:
TypeError – Raised if pda is not a pdarray or k is not an integer
ValueError – Raised if the pda is empty or k < 1
Notes
This call is equivalent in value to:
ak.argsort(a)[:k]
and generally outperforms this operation.
This reduction will see a significant drop in performance as k grows beyond a certain value. This value is system dependent, but generally about a k of 5 million is where performance degradation has been observed.
Examples
>>> A = ak.array([10,5,1,3,7,2,9,0])
>>> ak.argmink(A, 3)
array([7, 2, 5])
>>> ak.argmink(A, 4)
array([7, 2, 5, 3])
- arkouda.argmaxk(pda, k)
Find the indices corresponding to the k maximum values of an array.
Returns the indices of the largest k values of an array, sorted.
- Parameters:
pda (pdarray) – Input array.
k (int_scalars) – The desired count of indices corresponding to maximum array values
- Returns:
The indices of the maximum k values from the pda, sorted
- Return type:
pdarray, int
- Raises:
TypeError – Raised if pda is not a pdarray or k is not an integer
ValueError – Raised if the pda is empty or k < 1
Notes
This call is equivalent in value to:
ak.argsort(a)[-k:]
and generally outperforms this operation.
This reduction will see a significant drop in performance as k grows beyond a certain value. This value is system dependent, but generally about a k of 5 million is where performance degradation has been observed.
Examples
>>> A = ak.array([10,5,1,3,7,2,9,0])
>>> ak.argmaxk(A, 3)
array([4, 6, 0])
>>> ak.argmaxk(A, 4)
array([1, 4, 6, 0])
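The index-returning variants can be modeled in plain Python as slices of an argsort: the first k positions for argmink and the last k for argmaxk. This sketch is only a reference model of the values returned, with hypothetical local helpers standing in for the Arkouda functions.

```python
def argsort(values):
    """Indices that would sort values in ascending order."""
    return sorted(range(len(values)), key=values.__getitem__)


def argmink(values, k):
    """Indices of the k smallest values, ordered by ascending value."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return argsort(values)[:k]


def argmaxk(values, k):
    """Indices of the k largest values, ordered by ascending value."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return argsort(values)[-k:]


a = [10, 5, 1, 3, 7, 2, 9, 0]
min_idx = argmink(a, 3)
max_idx = argmaxk(a, 3)
```

Indexing the input with these positions gives the same values as the corresponding mink or maxk call.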
Where
The where function is a way to multiplex two pdarray objects (or a pdarray and a scalar) based on a condition:
- arkouda.where(condition, A, B)
Returns an array with elements chosen from A and B based upon a conditioning array. As is the case with numpy.where, the return array consists of values from the first array (A) where the conditioning array elements are True and from the second array (B) where the conditioning array elements are False.
- Parameters:
condition (pdarray) – Used to choose values from A or B
A (Union[numeric_scalars, str, pdarray, Strings, Categorical]) – Value(s) used when condition is True
B (Union[numeric_scalars, str, pdarray, Strings, Categorical]) – Value(s) used when condition is False
- Returns:
Values chosen from A where the condition is True and B where the condition is False
- Return type:
- Raises:
TypeError – Raised if the condition object is not a pdarray; if A or B is not an int, np.int64, float, np.float64, pdarray, str, Strings, or Categorical; if pdarray dtypes are not supported or do not match; or if multiple condition clauses (see Notes section) are applied
ValueError – Raised if the shapes of the condition, A, and B pdarrays are unequal
Examples
>>> a1 = ak.arange(1,10)
>>> a2 = ak.ones(9, dtype=np.int64)
>>> cond = a1 < 5
>>> ak.where(cond,a1,a2)
array([1, 2, 3, 4, 1, 1, 1, 1, 1])
>>> a1 = ak.arange(1,10)
>>> a2 = ak.ones(9, dtype=np.int64)
>>> cond = a1 == 5
>>> ak.where(cond,a1,a2)
array([1, 1, 1, 1, 5, 1, 1, 1, 1])
>>> a1 = ak.arange(1,10)
>>> a2 = 10
>>> cond = a1 < 5
>>> ak.where(cond,a1,a2)
array([1, 2, 3, 4, 10, 10, 10, 10, 10])
>>> s1 = ak.array([f'str {i}' for i in range(10)])
>>> s2 = 'str 21'
>>> cond = (ak.arange(10) % 2 == 0)
>>> ak.where(cond,s1,s2)
array(['str 0', 'str 21', 'str 2', 'str 21', 'str 4', 'str 21', 'str 6', 'str 21', 'str 8','str 21'])
>>> c1 = ak.Categorical(ak.array([f'str {i}' for i in range(10)]))
>>> c2 = ak.Categorical(ak.array([f'str {i}' for i in range(9, -1, -1)]))
>>> cond = (ak.arange(10) % 2 == 0)
>>> ak.where(cond,c1,c2)
array(['str 0', 'str 8', 'str 2', 'str 6', 'str 4', 'str 4', 'str 6', 'str 2', 'str 8', 'str 0'])
Notes
A and B must have the same dtype. Only one conditional clause is supported; for example, n < 5, n > 1, which is supported in NumPy, is not currently supported in Arkouda.
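The selection rule for where, including the case in which one argument is a scalar, can be modeled with a short plain-Python function. This is a sketch of the element-wise semantics on lists, written under the assumption that a scalar is applied at every position; it is not Arkouda's implementation, and the local name where only mirrors the API for readability.

```python
def where(condition, a, b):
    """Pick a[i] where condition[i] is true, else b[i]; scalars are repeated."""
    n = len(condition)
    a_vals = a if isinstance(a, list) else [a] * n
    b_vals = b if isinstance(b, list) else [b] * n
    if not (len(a_vals) == len(b_vals) == n):
        raise ValueError("condition, a, and b must have the same length")
    return [x if c else y for c, x, y in zip(condition, a_vals, b_vals)]


a1 = list(range(1, 10))
cond = [v < 5 for v in a1]
picked = where(cond, a1, 10)

s1 = [f"str {i}" for i in range(4)]
even = [i % 2 == 0 for i in range(4)]
picked_strings = where(even, s1, "str 21")
```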