arkouda.pandas.join
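Functions
compute_join_size(a, b) | Compute the internal size of a hypothetical join between a and b.
gen_ranges(starts, ends[, stride, return_lengths]) | Generate a segmented array of variable-length, contiguous ranges between pairs of start- and end-points.
join_on_eq_with_dt(a1, a2, t1, t2, dt, pred[, result_limit]) | Inner-join on equality between two integer arrays where the time-window predicate is also true.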
Module Contents
- arkouda.pandas.join.compute_join_size(a: arkouda.numpy.pdarrayclass.pdarray, b: arkouda.numpy.pdarrayclass.pdarray) → Tuple[int, int]
Compute the internal size of a hypothetical join between a and b. Returns both the number of elements and number of bytes required for the join.
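As a rough illustration of what "the size of a hypothetical join" means, the following pure-Python sketch counts the rows an equality inner join would produce, using plain lists in place of pdarrays. The name `join_size_reference` and the `bytes_per_row` parameter are hypothetical; the byte estimate assumes two int64 indices per output row, which is an assumption and not a documented arkouda value.

```python
from collections import Counter


def join_size_reference(a, b, bytes_per_row=16):
    """Estimate (num_rows, num_bytes) for an equality inner join of a and b.

    Hypothetical helper for illustration only; it is not arkouda's code.
    bytes_per_row=16 ASSUMES two int64 indices are stored per output row.
    """
    count_a = Counter(a)
    count_b = Counter(b)
    # Each occurrence of a shared value in a pairs with each occurrence in b.
    num_rows = sum(n * count_b[v] for v, n in count_a.items() if v in count_b)
    return num_rows, num_rows * bytes_per_row


# Value 1 appears 2x in a and 1x in b; value 2 appears 1x in a and 2x in b.
print(join_size_reference([1, 1, 2, 3], [1, 2, 2, 4]))  # (4, 64)
```

Checking the size first is useful because the join output can be much larger than either input when values repeat.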
- arkouda.pandas.join.gen_ranges(starts, ends, stride=1, return_lengths=False)
Generate a segmented array of variable-length, contiguous ranges between pairs of start- and end-points.
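- Parameters:
starts – Start point of each range.
ends – End point of each range.
stride – Step between successive values within each range. Defaults to 1.
return_lengths – Whether to also return the length of each segment. Defaults to False.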
- Returns:
- segments : pdarray, int64
The starting index of each range in the resulting array
- ranges : pdarray, int64
The actual ranges, flattened into a single array
- lengths : pdarray, int64
The length of each segment. Only returned if return_lengths=True.
- Return type:
(segments, ranges), or (segments, ranges, lengths) if return_lengths=True
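The segmented layout described above can be sketched in pure Python, with lists standing in for pdarrays. The function name `gen_ranges_reference` is hypothetical, and the sketch assumes each end point is exclusive (as in Python's built-in `range`); the documentation above does not state this, so treat it as an assumption.

```python
def gen_ranges_reference(starts, ends, stride=1, return_lengths=False):
    """List-based model of a segmented array of ranges (illustration only).

    ASSUMPTION: each end point is exclusive, mirroring Python's range().
    """
    segments, ranges, lengths = [], [], []
    for start, end in zip(starts, ends):
        # A segment begins where the flattened array currently ends.
        segments.append(len(ranges))
        values = list(range(start, end, stride))
        ranges.extend(values)
        lengths.append(len(values))
    if return_lengths:
        return segments, ranges, lengths
    return segments, ranges


segments, ranges, lengths = gen_ranges_reference(
    [0, 10, 20], [3, 12, 24], return_lengths=True
)
print(segments)  # [0, 3, 5]
print(ranges)    # [0, 1, 2, 10, 11, 20, 21, 22, 23]
print(lengths)   # [3, 2, 4]
```

Range number i can be recovered as `ranges[segments[i] : segments[i] + lengths[i]]`.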
- arkouda.pandas.join.join_on_eq_with_dt(a1: arkouda.numpy.pdarrayclass.pdarray, a2: arkouda.numpy.pdarrayclass.pdarray, t1: arkouda.numpy.pdarrayclass.pdarray, t2: arkouda.numpy.pdarrayclass.pdarray, dt: int | numpy.int64, pred: str, result_limit: int | numpy.int64 = 1000) → Tuple[arkouda.numpy.pdarrayclass.pdarray, arkouda.numpy.pdarrayclass.pdarray]
Inner-join on equality between two integer arrays where the time-window predicate is also true.
- Parameters:
a1 (pdarray) – Values to join (must be int64 dtype).
a2 (pdarray) – Values to join (must be int64 dtype).
t1 (pdarray) – Timestamps in milliseconds corresponding to a1 (must be int64 dtype).
t2 (pdarray) – Timestamps in milliseconds corresponding to a2 (must be int64 dtype).
dt (Union[int,np.int64]) – Time delta used by the time-window predicate.
pred (str) – Time-window predicate; one of ‘true_dt’, ‘abs_dt’, or ‘pos_dt’.
result_limit (Union[int,np.int64]) – Size limit for the returned result. Defaults to 1000.
- Returns:
- result_array_one : pdarray, int64
a1 indices where a1 == a2
- result_array_two : pdarray, int64
a2 indices where a2 == a1
- Return type:
Tuple[pdarray, pdarray]
- Raises:
TypeError – Raised if a1, a2, t1, or t2 is not a pdarray, or if dt or result_limit is not an int
ValueError – Raised if the dtype of a1, a2, t1, or t2 is not int64, if pred is not one of ‘true_dt’, ‘abs_dt’, or ‘pos_dt’, or if result_limit is < 0
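To make the shape of the result concrete, here is a minimal pure-Python sketch of an equality join with a time-window filter, using lists in place of pdarrays. The function name `join_on_eq_with_dt_reference` is hypothetical. The meanings given to the three predicates are assumptions inferred from their names (they are not defined in this documentation), as are the output ordering and the way `result_limit` truncates the result.

```python
def join_on_eq_with_dt_reference(a1, a2, t1, t2, dt, pred, result_limit=1000):
    """Nested-loop model of an equality join with a time window (illustration only)."""
    # Checks mirroring the documented Raises section (lists stand in for pdarrays).
    if not isinstance(dt, int) or not isinstance(result_limit, int):
        raise TypeError("dt and result_limit must be int")
    if pred not in ("true_dt", "abs_dt", "pos_dt"):
        raise ValueError("pred must be 'true_dt', 'abs_dt', or 'pos_dt'")
    if result_limit < 0:
        raise ValueError("result_limit must be >= 0")

    # ASSUMED predicate meanings, where delta = t2[j] - t1[i]:
    predicates = {
        "true_dt": lambda delta: True,             # no time constraint
        "abs_dt": lambda delta: abs(delta) <= dt,  # within dt in either direction
        "pos_dt": lambda delta: 0 <= delta <= dt,  # t2 at or after t1, within dt
    }
    keep = predicates[pred]

    idx1, idx2 = [], []
    for i, (v1, s1) in enumerate(zip(a1, t1)):
        for j, (v2, s2) in enumerate(zip(a2, t2)):
            if v1 == v2 and keep(s2 - s1):
                if len(idx1) >= result_limit:
                    # ASSUMPTION: stop once the limit is reached.
                    return idx1, idx2
                idx1.append(i)
                idx2.append(j)
    return idx1, idx2


a1, t1 = [1, 2, 3], [100, 200, 300]
a2, t2 = [3, 1, 1], [310, 90, 150]
print(join_on_eq_with_dt_reference(a1, a2, t1, t2, 20, "true_dt"))  # ([0, 0, 2], [1, 2, 0])
print(join_on_eq_with_dt_reference(a1, a2, t1, t2, 20, "abs_dt"))   # ([0, 2], [1, 0])
print(join_on_eq_with_dt_reference(a1, a2, t1, t2, 20, "pos_dt"))   # ([2], [0])
```

The two returned lists are parallel: position k pairs index `idx1[k]` in a1 with index `idx2[k]` in a2. The quadratic loop is chosen for readability only and says nothing about how the server performs the join.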