arkouda.numpy.pdarraysetops

Functions

concatenate(...)

Concatenate a list or tuple of pdarray or Strings objects into one pdarray or Strings object, respectively.

in1d(...)

Test whether each element of a 1-D array is also present in a second array.

indexof1d(→ arkouda.numpy.pdarrayclass.pdarray)

Return indices of query items in a search list of items. Items not found will be excluded.

intersect1d(...)

Find the intersection of two arrays.

setdiff1d(→ Union[arkouda.numpy.pdarrayclass.pdarray, ...)

Find the set difference of two arrays.

setxor1d(→ Union[arkouda.numpy.pdarrayclass.pdarray, ...)

Find the set exclusive-or (symmetric difference) of two arrays.

union1d(→ arkouda.pandas.groupbyclass.groupable)

Find the union of two arrays or lists of arrays.

Module Contents

arkouda.numpy.pdarraysetops.concatenate(arrays: Sequence[arkouda.numpy.pdarrayclass.pdarray | arkouda.numpy.strings.Strings | arkouda.pandas.categorical.Categorical], axis: int = 0, ordered: bool = True) arkouda.numpy.pdarrayclass.pdarray | arkouda.numpy.strings.Strings | arkouda.pandas.categorical.Categorical | Sequence[arkouda.pandas.categorical.Categorical][source]

Concatenate a list or tuple of pdarray or Strings objects into one pdarray or Strings object, respectively.

Parameters:
  • arrays (Sequence[Union[pdarray,Strings,Categorical]]) – The arrays to concatenate. Must all have the same dtype.

  • axis (int, default = 0) – The axis along which the arrays will be joined. If axis is None, arrays are flattened before use. Only supported for pdarray inputs, and only when ordered is True.

  • ordered (bool) – If True (default), the arrays will be appended in the order given. If False, array data may be interleaved in blocks, which can greatly improve performance but results in non-deterministic ordering of elements.

Returns:

A single pdarray, Strings, or Categorical object containing all values. When ordered is True, the values appear in the order the arrays were given.

Return type:

Union[pdarray,Strings,Categorical]

Raises:
  • ValueError – Raised if arrays is empty or if pdarrays have differing dtypes

  • TypeError – Raised if arrays is not a Python sequence (such as a list or tuple) of pdarray or Strings objects

  • RuntimeError – Raised if any array elements are dtypes for which concatenate has not been implemented.

Examples

>>> import arkouda as ak
>>> ak.concatenate([ak.array([1, 2, 3]), ak.array([4, 5, 6])])
array([1 2 3 4 5 6])
>>> ak.concatenate([ak.array([True,False,True]),ak.array([False,True,True])])
array([True False True False True True])
>>> ak.concatenate([ak.array(['one','two']),ak.array(['three','four','five'])])
array(['one', 'two', 'three', 'four', 'five'])
arkouda.numpy.pdarraysetops.in1d(A: arkouda.pandas.groupbyclass.groupable, B: arkouda.pandas.groupbyclass.groupable, assume_unique: bool = ..., symmetric: Literal[False] = ..., invert: bool = ...) arkouda.numpy.pdarrayclass.pdarray[source]
arkouda.numpy.pdarraysetops.in1d(A: arkouda.pandas.groupbyclass.groupable, B: arkouda.pandas.groupbyclass.groupable, assume_unique: bool = ..., symmetric: Literal[True] = ..., invert: bool = ...) Tuple[arkouda.numpy.pdarrayclass.pdarray, arkouda.numpy.pdarrayclass.pdarray]

Test whether each element of a 1-D array is also present in a second array.

If symmetric=False (default), returns a boolean pdarray of the same shape as A indicating whether each element of A is in B.

If symmetric=True, returns a tuple (maskA, maskB) where:

  • maskA[i] is True iff A[i] is in B

  • maskB[j] is True iff B[j] is in A

If invert=True, the returned mask(s) are logically inverted.

Parameters:
  • A (list of pdarrays, pdarray, Strings, or Categorical) – Entries will be tested for membership in B

  • B (list of pdarrays, pdarray, Strings, or Categorical) – The set of elements in which to test membership

  • assume_unique (bool, optional, defaults to False) – If True, assume the rows of A and B are each unique and sorted. By default, they are sorted and deduplicated explicitly.

  • symmetric (bool, optional, defaults to False) – If True, return the pair (in1d(A, B), in1d(B, A)) when A and B are single arrays rather than lists of arrays.

  • invert (bool, optional, defaults to False) – If True, the values in the returned array are inverted (that is, False where an element of A is in B and True otherwise). ak.in1d(a, b, invert=True) is equivalent to, but faster than, ~ak.in1d(a, b).

Returns:

If symmetric=False (default), returns maskA:

  • maskA : pdarray Boolean array indicating whether each element of A is in B.

If symmetric=True, returns (maskA, maskB):

  • maskA : pdarray Boolean array indicating whether each element of A is in B.

  • maskB : pdarray Boolean array indicating whether each element of B is in A.

Return type:

Union[pdarray, Tuple[pdarray, pdarray]]

Raises:
  • TypeError – Raised if either A or B is not a pdarray, Strings, or Categorical object, or if both are pdarrays and either has rank > 1, or if invert is not a bool

  • RuntimeError – Raised if the dtype of either array is not supported

Examples

>>> import arkouda as ak
>>> ak.in1d(ak.array([-1, 0, 1]), ak.array([-2, 0, 2]))
array([False True False])
>>> ak.in1d(ak.array(['one','two']),ak.array(['two', 'three','four','five']))
array([False True])

Notes

in1d can be considered as an element-wise function version of the python keyword in, for 1-D sequences. in1d(a, b) is logically equivalent to ak.array([item in b for item in a]), but is much faster and scales to arbitrarily large a.

ak.in1d is not supported for bool or float64 pdarrays.
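The symmetric and invert behavior described above can be modeled in plain Python. This is only a sketch of the documented semantics, not Arkouda code: in1d_reference is a hypothetical helper, it works on ordinary Python lists, and it has none of the performance or scaling properties of ak.in1d.

```python
def in1d_reference(a, b, symmetric=False, invert=False):
    """Pure-Python model of the documented in1d semantics (hypothetical helper)."""
    b_set = set(b)
    # `(x in b_set) != invert` flips the membership result when invert is True.
    mask_a = [(x in b_set) != invert for x in a]
    if not symmetric:
        return mask_a
    a_set = set(a)
    mask_b = [(y in a_set) != invert for y in b]
    return mask_a, mask_b


print(in1d_reference([-1, 0, 1], [-2, 0, 2]))
# [False, True, False]
print(in1d_reference([-1, 0, 1], [-2, 0, 2], symmetric=True))
# ([False, True, False], [False, True, False])
print(in1d_reference(["one", "two"], ["two", "three", "four", "five"], invert=True))
# [True, False]
```

The first call mirrors the integer example above, and the inverted call is the logical negation of the Strings example.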

arkouda.numpy.pdarraysetops.indexof1d(query: arkouda.pandas.groupbyclass.groupable, space: arkouda.pandas.groupbyclass.groupable) arkouda.numpy.pdarrayclass.pdarray[source]

Return indices of query items in a search list of items. Items not found will be excluded. When duplicate terms are present in the search space, the indices of all occurrences are returned.

Parameters:
  • query ((sequence of) pdarray or Strings or Categorical) – The items to search for. If multiple arrays, each “row” is an item.

  • space ((sequence of) pdarray or Strings or Categorical) – The set of items in which to search. Must have same shape/dtype as query.

Returns:

For each item in query that is found in space, its index in space.

Return type:

pdarray

Notes

This is an alias of ak.find(query, space, all_occurrences=True, remove_missing=True).values
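A plain-Python sketch of this behavior may help when reading the example output. indexof1d_reference is a hypothetical helper operating on Python lists, not part of Arkouda. The result ordering it produces (query order, then ascending index within each query item) is an assumption inferred from the documented example rather than a stated guarantee.

```python
from collections import defaultdict


def indexof1d_reference(query, space):
    """Pure-Python model of indexof1d for 1-D inputs (hypothetical helper)."""
    # Record every position at which each item occurs in the search space.
    positions = defaultdict(list)
    for idx, item in enumerate(space):
        positions[item].append(idx)

    result = []
    for item in query:
        # Items missing from the space contribute nothing (they are excluded);
        # duplicated items contribute all of their positions.
        result.extend(positions.get(item, []))
    return result


print(indexof1d_reference([7, 2, 9, 2], [2, 7, 2, 5]))
# [1, 0, 2, 0, 2]
```

Here 7 is found at index 1, each 2 is found at indices 0 and 2, and 9 is dropped because it does not occur in the space.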

Examples

>>> import arkouda as ak
>>> select_from = ak.arange(10)
>>> query = select_from[ak.randint(0, select_from.size, 20, seed=10)]
>>> space = select_from[ak.randint(0, select_from.size, 20, seed=11)]

Remove some values to ensure that query has entries that do not appear in space:

>>> space = space[space != 9]
>>> space = space[space != 3]
>>> ak.indexof1d(query, space)
array([0 4 1 3 10 2 6 12 13 5 7 8 9 14 5 7 11 15 5 7 0 4])
Raises:
  • TypeError – Raised if either query or space is not a pdarray, Strings, or Categorical object

  • RuntimeError – Raised if the dtype of either array is not supported

arkouda.numpy.pdarraysetops.intersect1d(ar1: arkouda.pandas.groupbyclass.groupable, ar2: arkouda.pandas.groupbyclass.groupable, assume_unique: bool = False) arkouda.numpy.pdarrayclass.pdarray | arkouda.pandas.groupbyclass.groupable[source]

Find the intersection of two arrays.

Return the sorted, unique values that are present in both input arrays.

Parameters:
  • ar1 (list of pdarrays, pdarray, Strings, or Categorical) – First input array or list of arrays.

  • ar2 (list of pdarrays, pdarray, Strings, or Categorical) – Second input array or list of arrays.

  • assume_unique (bool, default=False) – If True, the input arrays are assumed to contain unique values, which can speed up the calculation.

Returns:

Sorted 1D array of common unique elements. If the inputs are lists of arrays, a list of sorted pdarrays is returned.

Return type:

pdarray or groupable

Raises:
  • TypeError – Raised if either ar1 or ar2 is not groupable.

  • RuntimeError – Raised if the dtype of either pdarray is not supported.

Examples

Basic 1D example:

>>> import arkouda as ak
>>> ak.intersect1d(ak.array([1, 3, 4, 3]), ak.array([3, 1, 2, 1]))
array([1 3])

Multi-array example:

>>> a = ak.arange(5)
>>> b = ak.array([1, 5, 3, 4, 2])
>>> c = ak.array([1, 4, 3, 2, 5])
>>> d = ak.array([1, 2, 3, 5, 4])
>>> multia = [a, a, a]
>>> multib = [b, c, d]
>>> ak.intersect1d(multia, multib)
[array([1 3]), array([1 3]), array([1 3])]
arkouda.numpy.pdarraysetops.setdiff1d(ar1: arkouda.pandas.groupbyclass.groupable, ar2: arkouda.pandas.groupbyclass.groupable, assume_unique: bool = False) arkouda.numpy.pdarrayclass.pdarray | arkouda.pandas.groupbyclass.groupable[source]

Find the set difference of two arrays.

Return the sorted, unique values in ar1 that are not in ar2.

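Parameters:
  • ar1 (list of pdarrays, pdarray, Strings, or Categorical) – First input array or list of arrays.

  • ar2 (list of pdarrays, pdarray, Strings, or Categorical) – Second input array or list of arrays.

  • assume_unique (bool, default=False) – If True, the input arrays are assumed to contain unique values, which can speed up the calculation.
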
Returns:

Sorted 1D array/List of sorted pdarrays of values in ar1 that are not in ar2.

Return type:

pdarray/groupable

Raises:
  • TypeError – Raised if either ar1 or ar2 is not a pdarray

  • RuntimeError – Raised if the dtype of either pdarray is not supported

Notes

ak.setdiff1d is not supported for bool pdarrays

Examples

>>> import arkouda as ak
>>> a = ak.array([1, 2, 3, 2, 4, 1])
>>> b = ak.array([3, 4, 5, 6])
>>> ak.setdiff1d(a, b)
array([1 2])

Multi-array example:

>>> a = ak.arange(1, 6)
>>> b = ak.array([1, 5, 3, 4, 2])
>>> c = ak.array([1, 4, 3, 2, 5])
>>> d = ak.array([1, 2, 3, 5, 4])
>>> multia = [a, a, a]
>>> multib = [b, c, d]
>>> ak.setdiff1d(multia, multib)
[array([2 4 5]), array([2 4 5]), array([2 4 5])]
arkouda.numpy.pdarraysetops.setxor1d(ar1: arkouda.pandas.groupbyclass.groupable, ar2: arkouda.pandas.groupbyclass.groupable, assume_unique: bool = False) arkouda.numpy.pdarrayclass.pdarray | arkouda.pandas.groupbyclass.groupable[source]

Find the set exclusive-or (symmetric difference) of two arrays.

Return the sorted, unique values that are in only one (not both) of the input arrays.

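Parameters:
  • ar1 (list of pdarrays, pdarray, Strings, or Categorical) – First input array or list of arrays.

  • ar2 (list of pdarrays, pdarray, Strings, or Categorical) – Second input array or list of arrays.

  • assume_unique (bool, default=False) – If True, the input arrays are assumed to contain unique values, which can speed up the calculation.
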
Returns:

Sorted 1D array/List of sorted pdarrays of unique values that are in only one of the input arrays.

Return type:

pdarray/groupable

Raises:
  • TypeError – Raised if either ar1 or ar2 is not a groupable

  • RuntimeError – Raised if the dtype of either pdarray is not supported

Examples

>>> import arkouda as ak
>>> a = ak.array([1, 2, 3, 2, 4])
>>> b = ak.array([2, 3, 5, 7, 5])
>>> ak.setxor1d(a,b)
array([1 4 5 7])

Multi-array example:

>>> a = ak.arange(1, 6)
>>> b = ak.array([1, 5, 3, 4, 2])
>>> c = ak.array([1, 4, 3, 2, 5])
>>> d = ak.array([1, 2, 3, 5, 4])
>>> multia = [a, a, a]
>>> multib = [b, c, d]
>>> ak.setxor1d(multia, multib)
[array([2 2 4 4 5 5]), array([2 5 2 4 4 5]), array([2 4 5 4 2 5])]
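
For 1-D inputs, the symmetric difference can be read as the union of the two one-sided set differences. The sketch below checks that identity on the 1-D example above using plain Python sets. setdiff1d_reference and setxor1d_reference are hypothetical helpers for illustration only and do not reflect how Arkouda computes the result.

```python
def setdiff1d_reference(a, b):
    """Sorted, unique values of a that are not in b (hypothetical helper)."""
    return sorted(set(a) - set(b))


def setxor1d_reference(a, b):
    """Sorted, unique values that are in exactly one of a and b (hypothetical helper)."""
    only_a = setdiff1d_reference(a, b)
    only_b = setdiff1d_reference(b, a)
    return sorted(set(only_a) | set(only_b))


a = [1, 2, 3, 2, 4]
b = [2, 3, 5, 7, 5]
print(setdiff1d_reference(a, b))  # [1, 4]
print(setdiff1d_reference(b, a))  # [5, 7]
print(setxor1d_reference(a, b))   # [1, 4, 5, 7]
```

The final line matches the output of the 1-D ak.setxor1d example.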
arkouda.numpy.pdarraysetops.union1d(ar1: arkouda.pandas.groupbyclass.groupable, ar2: arkouda.pandas.groupbyclass.groupable) arkouda.pandas.groupbyclass.groupable[source]

Find the union of two arrays or lists of arrays.

Return the unique, sorted array of values that appear in either of the input arrays.

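Parameters:
  • ar1 (list of pdarrays, pdarray, Strings, or Categorical) – First input array or list of arrays.

  • ar2 (list of pdarrays, pdarray, Strings, or Categorical) – Second input array or list of arrays.
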
Returns:

Unique, sorted union of the input arrays. If the inputs are lists of arrays, a list of pdarrays is returned.

Return type:

groupable

Raises:
  • TypeError – Raised if either ar1 or ar2 is not groupable.

  • RuntimeError – Raised if the dtype of either input is not supported.

Examples

Basic 1D example:

>>> import arkouda as ak
>>> ak.union1d(ak.array([-1, 0, 1]), ak.array([-2, 0, 2]))
array([-2 -1 0 1 2])

Multi-array example:

>>> a = ak.arange(1, 6)
>>> b = ak.array([1, 5, 3, 4, 2])
>>> c = ak.array([1, 4, 3, 2, 5])
>>> d = ak.array([1, 2, 3, 5, 4])
>>> multia = [a, a, a]
>>> multib = [b, c, d]
>>> ak.union1d(multia, multib)
[array([1 2 2 3 4 4 5 5]), array([1 2 5 3 2 4 4 5]), array([1 2 4 3 5 4 2 5])]
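
The multi-array output is easier to read once each position across the list of arrays is viewed as one row (a tuple). The documented result is consistent with taking the union of those rows, sorting them lexicographically, and splitting them back into columns. The sketch below reproduces that interpretation with plain Python lists. union1d_rows_reference is a hypothetical helper, and the lexicographic row ordering is an assumption inferred from the example output.

```python
def union1d_rows_reference(multia, multib):
    """Row-wise union of two lists of equal-length columns (hypothetical helper)."""
    # zip(*columns) turns a list of columns into a sequence of row tuples.
    rows_a = set(zip(*multia))
    rows_b = set(zip(*multib))
    # Unique rows from either input, sorted lexicographically.
    merged = sorted(rows_a | rows_b)
    # Transpose the rows back into one list per column.
    return [list(col) for col in zip(*merged)]


a = [1, 2, 3, 4, 5]
multia = [a, a, a]
multib = [[1, 5, 3, 4, 2], [1, 4, 3, 2, 5], [1, 2, 3, 5, 4]]
print(union1d_rows_reference(multia, multib))
# [[1, 2, 2, 3, 4, 4, 5, 5], [1, 2, 5, 3, 2, 4, 4, 5], [1, 2, 4, 3, 5, 4, 2, 5]]
```

Reading the result column-wise gives the rows (1, 1, 1), (2, 2, 2), (2, 5, 4), (3, 3, 3), (4, 2, 5), (4, 4, 4), (5, 4, 2), and (5, 5, 5), which is why the individual output arrays are not sorted on their own.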