arkouda.pdarraysetops

Functions

`concatenate`(→ Union[arkouda.pdarrayclass.pdarray, ...)	Concatenate a list or tuple of `pdarray` or `Strings` objects into one `pdarray` or `Strings` object, respectively.
`in1d`(→ Union[arkouda.pdarrayclass.pdarray, ...)	Test whether each element of a 1-D array is also present in a second array.
`indexof1d`(→ arkouda.pdarrayclass.pdarray)	Return indices of query items in a search list of items. Items not found will be excluded.
`intersect1d`(→ Union[arkouda.pdarrayclass.pdarray, ...)	Find the intersection of two arrays.
`setdiff1d`(→ Union[arkouda.pdarrayclass.pdarray, ...)	Find the set difference of two arrays.
`setxor1d`(→ Union[arkouda.pdarrayclass.pdarray, ...)	Find the set exclusive-or (symmetric difference) of two arrays.
`union1d`(→ Union[arkouda.pdarrayclass.pdarray, ...)	Find the union of two arrays/List of Arrays.

Module Contents

arkouda.pdarraysetops.concatenate(arrays: Sequence[arkouda.pdarrayclass.pdarray | arkouda.strings.Strings | Categorical], ordered: bool = True) → arkouda.pdarrayclass.pdarray | arkouda.strings.Strings | Categorical

Concatenate a list or tuple of pdarray or Strings objects into one pdarray or Strings object, respectively.

Parameters:

arrays (Sequence[Union[pdarray,Strings,Categorical]]) – The arrays to concatenate. Must all have same dtype.
ordered (bool) – If True (default), the arrays will be appended in the order given. If False, array data may be interleaved in blocks, which can greatly improve performance but results in non-deterministic ordering of elements.

Returns:

Single pdarray or Strings object containing all values, in the order given when ordered is True

Return type:

Union[pdarray,Strings,Categorical]

Raises:

ValueError – Raised if arrays is empty or if the pdarrays have differing dtypes
TypeError – Raised if arrays is not a Python Sequence (such as a list or tuple) of pdarray or Strings objects
RuntimeError – Raised if any array has a dtype for which concatenate has not been implemented

Examples

>>> ak.concatenate([ak.array([1, 2, 3]), ak.array([4, 5, 6])])
array([1, 2, 3, 4, 5, 6])

>>> ak.concatenate([ak.array([True,False,True]),ak.array([False,True,True])])
array([True, False, True, False, True, True])

>>> ak.concatenate([ak.array(['one','two']),ak.array(['three','four','five'])])
array(['one', 'two', 'three', 'four', 'five'])

arkouda.pdarraysetops.in1d(pda1: arkouda.groupbyclass.groupable, pda2: arkouda.groupbyclass.groupable, assume_unique: bool = False, symmetric: bool = False, invert: bool = False) → arkouda.pdarrayclass.pdarray | arkouda.groupbyclass.groupable

Test whether each element of a 1-D array is also present in a second array.

Returns a boolean array the same length as pda1 that is True where an element of pda1 is in pda2 and False otherwise.

Supports multi-level input: tests membership of rows of pda1 in the set of rows of pda2.

Parameters:

pda1 (list of pdarrays, pdarray, Strings, or Categorical) – Rows are elements for which to test membership in pda2
pda2 (list of pdarrays, pdarray, Strings, or Categorical) – Rows are elements of the set in which to test membership
assume_unique (bool) – If True, assume rows of pda1 and pda2 are each unique and sorted. By default, they are sorted and deduplicated explicitly.
symmetric (bool) – If True, return both in1d(pda1, pda2) and in1d(pda2, pda1) when pda1 and pda2 are single items.
invert (bool, optional) – If True, the values in the returned array are inverted (that is, False where an element of pda1 is in pda2 and True otherwise). Default is False. ak.in1d(a, b, invert=True) is equivalent to (but is faster than) ~ak.in1d(a, b).

Returns:

True for each row in pda1 that is contained in pda2

Return type:

pdarray, bool

Notes

Only works for pdarrays of int64 or float64 dtype, Strings, or Categorical
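
The membership semantics described above, including invert and the multi-level (row-wise) case, can be sketched in plain Python. This is only a mental model with lists standing in for pdarrays; in1d_model and in1d_rows_model are hypothetical helper names, not part of Arkouda, and the sketch says nothing about how the server computes the result.

```python
def in1d_model(pda1, pda2, invert=False):
    """Illustrative model of the membership test; not Arkouda's implementation."""
    members = set(pda2)
    # `!= invert` flips every flag when invert is True.
    return [(item in members) != invert for item in pda1]


def in1d_rows_model(cols1, cols2, invert=False):
    """Multi-level case: position i across the columns forms one row (a tuple)."""
    return in1d_model(list(zip(*cols1)), list(zip(*cols2)), invert)


flags = in1d_model([1, 2, 3, 2, 4], [2, 3, 5])
inverted = in1d_model([1, 2, 3, 2, 4], [2, 3, 5], invert=True)
row_flags = in1d_rows_model([[1, 2, 3], [1, 2, 3]], [[1, 3], [1, 4]])
print(flags)      # [False, True, True, True, False]
print(inverted)   # [True, False, False, False, True]
print(row_flags)  # [True, False, False]
```

In the row-wise call, only the row (1, 1) appears in both inputs, so only the first flag is True.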

arkouda.pdarraysetops.indexof1d(query: arkouda.groupbyclass.groupable, space: arkouda.groupbyclass.groupable) → arkouda.pdarrayclass.pdarray

Return indices of query items in a search list of items. Items not found will be excluded. When duplicate terms are present in the search space, the indices of all occurrences are returned.

Parameters:

query ((sequence of) pdarray or Strings or Categorical) – The items to search for. If multiple arrays, each “row” is an item.
space ((sequence of) pdarray or Strings or Categorical) – The set of items in which to search. Must have same shape/dtype as query.

Returns:

indices – For each item in query, its index in space.

Return type:

pdarray, int64

Notes

This is an alias of ak.find(query, space, all_occurrences=True, remove_missing=True).values

Examples

>>> select_from = ak.arange(10)
>>> arr1 = select_from[ak.randint(0, select_from.size, 20, seed=10)]
>>> arr2 = select_from[ak.randint(0, select_from.size, 20, seed=11)]
# remove some values to ensure we have some values
# which don't appear in the search space
>>> arr2 = arr2[arr2 != 9]
>>> arr2 = arr2[arr2 != 3]

>>> ak.indexof1d(arr1, arr2)
array([0 4 1 3 10 2 6 12 13 5 7 8 9 14 5 7 11 15 5 7 0 4])

Raises:

TypeError – Raised if either query or space is not a pdarray, Strings, or Categorical object
RuntimeError – Raised if the dtype of either array is not supported
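
As a rough illustration of the behaviour described for indexof1d (missing items dropped, every occurrence of a duplicate reported), here is a plain-Python model. The name indexof1d_model is hypothetical, lists stand in for pdarrays, and the output ordering (grouped by query item, in query order) is an assumption read off the example above rather than a documented guarantee.

```python
def indexof1d_model(query, space):
    """Illustrative model only: indices in `space` of each item in `query`."""
    # Map each distinct value in the search space to all positions where it occurs.
    positions = {}
    for index, item in enumerate(space):
        positions.setdefault(item, []).append(index)

    found = []
    for item in query:
        # Items absent from the search space contribute nothing to the result.
        found.extend(positions.get(item, []))
    return found


indices = indexof1d_model([4, 9, 7, 4], [7, 4, 7, 5])
print(indices)  # [1, 0, 2, 1]
```

Here 9 does not occur in the search space and is skipped, 7 occurs twice and yields both positions, and the repeated query item 4 is reported once per appearance in the query.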

arkouda.pdarraysetops.intersect1d(pda1: arkouda.groupbyclass.groupable, pda2: arkouda.groupbyclass.groupable, assume_unique: bool = False) → arkouda.pdarrayclass.pdarray | arkouda.groupbyclass.groupable

Find the intersection of two arrays.

Return the sorted, unique values that are in both of the input arrays.

Parameters:

pda1 (pdarray/Sequence[pdarray, Strings, Categorical]) – Input array/Sequence of groupable objects
pda2 (pdarray/List) – Input array/sequence of groupable objects
assume_unique (bool) – If True, the input arrays are both assumed to be unique, which can speed up the calculation. Default is False.

Returns:

Sorted 1D array/List of sorted pdarrays of common and unique elements.

Return type:

pdarray/groupable

Raises:

TypeError – Raised if either pda1 or pda2 is not a pdarray
RuntimeError – Raised if the dtype of either pdarray is not supported
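
For the single-array case, the result of intersect1d can be pictured with a small plain-Python model. The helper intersect1d_model is hypothetical and lists stand in for pdarrays; it mirrors only the "sorted, unique values that are in both inputs" description, not the library's algorithm.

```python
def intersect1d_model(pda1, pda2):
    """Sorted, unique values present in both inputs (illustrative model only)."""
    return sorted(set(pda1) & set(pda2))


common = intersect1d_model([1, 2, 3, 2, 4, 1], [3, 4, 5, 6])
print(common)  # [3, 4]
```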

arkouda.pdarraysetops.setdiff1d

Find the set difference of two arrays.

Notes

ak.setdiff1d is not supported for bool or float64 pdarrays

Examples

>>> a = ak.array([1, 2, 3, 2, 4, 1])
>>> b = ak.array([3, 4, 5, 6])
>>> ak.setdiff1d(a, b)
array([1, 2])
# Multi-Array Example
>>> a = ak.arange(1, 6)
>>> b = ak.array([1, 5, 3, 4, 2])
>>> c = ak.array([1, 4, 3, 2, 5])
>>> d = ak.array([1, 2, 3, 5, 4])
>>> multia = [a, a, a]
>>> multib = [b, c, d]
>>> ak.setdiff1d(multia, multib)
[array([2, 4, 5]), array([2, 4, 5]), array([2, 4, 5])]
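
The first ak.setdiff1d example above can be reproduced with a plain-Python model, which also makes the asymmetry of the operation visible. The helper setdiff1d_model is hypothetical and uses lists in place of pdarrays; it is a sketch of the semantics, not of the implementation.

```python
def setdiff1d_model(pda1, pda2):
    """Sorted, unique values of pda1 that do not occur in pda2 (model only)."""
    return sorted(set(pda1) - set(pda2))


a = [1, 2, 3, 2, 4, 1]
b = [3, 4, 5, 6]
only_in_a = setdiff1d_model(a, b)
only_in_b = setdiff1d_model(b, a)
print(only_in_a)  # [1, 2]
print(only_in_b)  # [5, 6]
```

Swapping the arguments changes the answer, unlike the intersection, union, and exclusive-or operations in this module.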

arkouda.pdarraysetops.setxor1d(pda1: arkouda.groupbyclass.groupable, pda2: arkouda.groupbyclass.groupable, assume_unique: bool = False) → arkouda.pdarrayclass.pdarray | arkouda.groupbyclass.groupable

Find the set exclusive-or (symmetric difference) of two arrays.

Return the sorted, unique values that are in only one (not both) of the input arrays.

Parameters:

pda1 (pdarray/Sequence[pdarray, Strings, Categorical]) – Input array/Sequence of groupable objects
pda2 (pdarray/List) – Input array/sequence of groupable objects
assume_unique (bool) – If True, the input arrays are both assumed to be unique, which can speed up the calculation. Default is False.

Returns:

Sorted 1D array/List of sorted pdarrays of unique values that are in only one of the input arrays.

Return type:

pdarray/groupable

Raises:

TypeError – Raised if either pda1 or pda2 is not a pdarray
RuntimeError – Raised if the dtype of either pdarray is not supported

Notes

ak.setxor1d is not supported for bool or float64 pdarrays

Examples

>>> a = ak.array([1, 2, 3, 2, 4])
>>> b = ak.array([2, 3, 5, 7, 5])
>>> ak.setxor1d(a,b)
array([1, 4, 5, 7])
# Multi-Array Example
>>> a = ak.arange(1, 6)
>>> b = ak.array([1, 5, 3, 4, 2])
>>> c = ak.array([1, 4, 3, 2, 5])
>>> d = ak.array([1, 2, 3, 5, 4])
>>> multia = [a, a, a]
>>> multib = [b, c, d]
>>> ak.setxor1d(multia, multib)
[array([2, 2, 4, 4, 5, 5]), array([2, 5, 2, 4, 4, 5]), array([2, 4, 5, 4, 2, 5])]
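
The multi-array output above is easier to read once each list of arrays is viewed as a table whose rows are compared as whole tuples. The following plain-Python sketch reproduces that result under this interpretation; setxor1d_rows_model is a hypothetical helper, and lists stand in for pdarrays.

```python
def setxor1d_rows_model(cols1, cols2):
    """Row-wise symmetric difference of two column lists (illustrative model)."""
    # Position i across the columns forms one row; rows are compared as tuples.
    rows = sorted(set(zip(*cols1)) ^ set(zip(*cols2)))
    # Transpose the surviving rows back into one list per column.
    return [[row[k] for row in rows] for k in range(len(cols1))]


a = list(range(1, 6))
b = [1, 5, 3, 4, 2]
c = [1, 4, 3, 2, 5]
d = [1, 2, 3, 5, 4]
result = setxor1d_rows_model([a, a, a], [b, c, d])
print(result)
# [[2, 2, 4, 4, 5, 5], [2, 5, 2, 4, 4, 5], [2, 4, 5, 4, 2, 5]]
```

Rows (1, 1, 1) and (3, 3, 3) occur in both inputs and drop out; the six remaining rows are sorted lexicographically before being split back into columns.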

arkouda.pdarraysetops.union1d(pda1: arkouda.groupbyclass.groupable, pda2: arkouda.groupbyclass.groupable) → arkouda.pdarrayclass.pdarray | arkouda.groupbyclass.groupable

Find the union of two arrays/List of Arrays.

Return the unique, sorted array of values that are in either of the two input arrays.

Parameters:

pda1 (pdarray/Sequence[pdarray, Strings, Categorical]) – Input array/Sequence of groupable objects
pda2 (pdarray/List) – Input array/sequence of groupable objects

Returns:

Unique, sorted union of the input arrays.

Return type:

pdarray/groupable

Raises:

TypeError – Raised if either pda1 or pda2 is not a pdarray
RuntimeError – Raised if the dtype of either array is not supported
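
Examples

A minimal plain-Python model of the single-array case is sketched below. The helper union1d_model is hypothetical and lists stand in for pdarrays; it illustrates the "unique, sorted" contract and its relationship to the intersection and exclusive-or, not how Arkouda computes the result.

```python
def union1d_model(pda1, pda2):
    """Unique, sorted values found in either input (illustrative model only)."""
    return sorted(set(pda1) | set(pda2))


a = [3, 1, 2, 2]
b = [2, 5, 3, 7]
union = union1d_model(a, b)
print(union)  # [1, 2, 3, 5, 7]

# The union is the intersection together with the symmetric difference.
rebuilt = sorted((set(a) & set(b)) | (set(a) ^ set(b)))
print(rebuilt == union)  # True
```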