arkouda.groupbyclass
====================

.. py:module:: arkouda.groupbyclass


Attributes
----------

.. autoapisummary::

   arkouda.groupbyclass.GROUPBY_REDUCTION_TYPES
   arkouda.groupbyclass.groupable


Classes
-------

.. autoapisummary::

   arkouda.groupbyclass.GroupBy


Functions
---------

.. autoapisummary::

   arkouda.groupbyclass.broadcast
   arkouda.groupbyclass.unique


Package Contents
----------------

.. py:data:: GROUPBY_REDUCTION_TYPES

.. py:class:: GroupBy(keys: Optional[groupable] = None, assume_sorted: bool = False, dropna: bool = True, **kwargs)

   Group an array or list of arrays by value.

   This is usually done in preparation
   for aggregating the within-group values of another array.

   :param keys: The array to group by value, or if list, the column arrays to group by row
   :type keys: (list of) pdarray, Strings, or Categorical
   :param assume_sorted: If True, assume keys is already sorted (Default: False)
   :type assume_sorted: bool

   .. attribute:: nkeys

      The number of key arrays (columns)

      :type: int

   .. attribute:: permutation

      The permutation that sorts the keys array(s) by value (row)

      :type: pdarray

   .. attribute:: unique_keys

      The unique values of the keys array(s), in grouped order

      :type: pdarray, Strings, or Categorical

   .. attribute:: ngroups

      The length of the unique_keys array(s), i.e. number of groups

      :type: int_scalars

   .. attribute:: segments

      The start index of each group in the grouped array(s)

      :type: pdarray

   .. attribute:: logger

      Used for all logging operations

      :type: ArkoudaLogger

   .. attribute:: dropna

      If True, and the groupby keys contain NaN values,
      the NaN values together with the corresponding row will be dropped.
      Otherwise, the rows corresponding to NaN values will be kept.
      The default is True

      :type: bool (default=True)

   :raises TypeError: Raised if keys is a pdarray with a dtype other than int64

   .. rubric:: Notes

   Integral pdarrays, Strings, and Categoricals are natively supported, but
   float64 and bool arrays are not.

   For a user-defined class to be groupable, it must inherit from pdarray
   and define or overload the grouping API:

   1) a ``._get_grouping_keys()`` method that returns a list of pdarrays
      that can be (co)argsorted.
   2) (Optional) a ``.group()`` method that returns the permutation that
      groups the array.

   If the input is a single array with a ``.group()`` method defined, method 2
   will be used; otherwise, method 1 will be used.
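
   As a rough illustration of how ``permutation``, ``segments``, and
   ``unique_keys`` relate, the following pure-Python sketch groups a small key
   list with a stable argsort. It is not Arkouda's implementation:
   ``group_components`` is a hypothetical helper, and the order of elements
   within a group is an assumption of the sketch.

   ```python
   def group_components(keys):
       """Return (permutation, segments, unique_keys) for a list of keys."""
       # Stable argsort: indices of keys ordered by key value.
       permutation = sorted(range(len(keys)), key=lambda i: keys[i])
       grouped = [keys[i] for i in permutation]
       # A segment starts wherever the grouped key changes.
       segments = [
           i for i in range(len(grouped))
           if i == 0 or grouped[i] != grouped[i - 1]
       ]
       unique_keys = [grouped[s] for s in segments]
       return permutation, segments, unique_keys


   keys = [2, 4, 4, 2, 1, 4, 1, 2, 4, 3]
   permutation, segments, unique_keys = group_components(keys)
   print(permutation)  # [4, 6, 0, 3, 7, 9, 1, 2, 5, 8]
   print(segments)     # [0, 2, 5, 6]
   print(unique_keys)  # [1, 2, 3, 4]
   ```

   In this sketch the number of groups is ``len(unique_keys)``, and each entry
   of ``segments`` is the start of one group in the grouped ordering.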


   .. py:method:: AND(values: arkouda.numpy.pdarrayclass.pdarray) -> Tuple[Union[arkouda.numpy.pdarrayclass.pdarray, List[Union[arkouda.numpy.pdarrayclass.pdarray, arkouda.numpy.strings.Strings]]], arkouda.numpy.pdarrayclass.pdarray]

      Bitwise AND of values in each segment.

      Group another array of values and perform a bitwise AND reduction on each group.

      Group using the permutation stored in the GroupBy instance.

      :param values: The values to group and reduce with AND
      :type values: pdarray, int64

      :returns:

                unique_keys : (list of) pdarray or Strings
                    The unique keys, in grouped order
                result : pdarray, int64
                    Bitwise AND of values in segments corresponding to keys
      :rtype: Tuple[Union[pdarray, List[Union[pdarray, Strings]]], pdarray]

      :raises TypeError: Raised if the values array is not a pdarray or if the pdarray
          dtype is not int64
      :raises ValueError: Raised if the key array size does not match the values size or
          if the operator is not in the GroupBy.Reductions array
      :raises RuntimeError: Raised if AND is not supported for the values dtype


   .. py:method:: OR(values: arkouda.numpy.pdarrayclass.pdarray) -> Tuple[Union[arkouda.numpy.pdarrayclass.pdarray, List[Union[arkouda.numpy.pdarrayclass.pdarray, arkouda.numpy.strings.Strings]]], arkouda.numpy.pdarrayclass.pdarray]

      Bitwise OR of values in each segment.

      Group another array of values and perform a bitwise OR reduction on each group.

      Group using the permutation stored in the GroupBy instance.

      :param values: The values to group and reduce with OR
      :type values: pdarray, int64

      :returns:

                unique_keys : (list of) pdarray or Strings
                    The unique keys, in grouped order
                result : pdarray, int64
                    Bitwise OR of values in segments corresponding to keys
      :rtype: Tuple[Union[pdarray, List[Union[pdarray, Strings]]], pdarray]

      :raises TypeError: Raised if the values array is not a pdarray or if the pdarray
          dtype is not int64
      :raises ValueError: Raised if the key array size does not match the values size or
          if the operator is not in the GroupBy.Reductions array
      :raises RuntimeError: Raised if OR is not supported for the values dtype


   .. py:attribute:: Reductions


   .. py:method:: XOR(values: arkouda.numpy.pdarrayclass.pdarray) -> Tuple[Union[arkouda.numpy.pdarrayclass.pdarray, List[Union[arkouda.numpy.pdarrayclass.pdarray, arkouda.numpy.strings.Strings]]], arkouda.numpy.pdarrayclass.pdarray]

      Bitwise XOR of values in each segment.

      Group another array of values and perform a bitwise XOR reduction on each group.

      Group using the permutation stored in the GroupBy instance.

      :param values: The values to group and reduce with XOR
      :type values: pdarray, int64

      :returns:

                unique_keys : (list of) pdarray or Strings
                    The unique keys, in grouped order
                result : pdarray, int64
                    Bitwise XOR of values in segments corresponding to keys
      :rtype: Tuple[Union[pdarray, List[Union[pdarray, Strings]]], pdarray]

      :raises TypeError: Raised if the values array is not a pdarray or if the pdarray
          dtype is not int64
      :raises ValueError: Raised if the key array size does not match the values size or
          if the operator is not in the GroupBy.Reductions array
      :raises RuntimeError: Raised if XOR is not supported for the values dtype
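
      The semantics of the bitwise reductions (``AND``, ``OR``, ``XOR``) can be
      sketched in pure Python as below. This only illustrates the expected
      per-group results and is not how Arkouda computes them;
      ``segmented_reduce`` is a hypothetical helper name.

      ```python
      import operator
      from functools import reduce


      def segmented_reduce(keys, values, op):
          """Reduce the values of each group with the binary operator op."""
          groups = {}
          for k, v in zip(keys, values):
              groups.setdefault(k, []).append(v)
          unique_keys = sorted(groups)
          return unique_keys, [reduce(op, groups[k]) for k in unique_keys]


      keys = [0, 1, 0, 1, 0]
      values = [0b110, 0b101, 0b011, 0b100, 0b111]

      _, and_result = segmented_reduce(keys, values, operator.and_)
      _, or_result = segmented_reduce(keys, values, operator.or_)
      _, xor_result = segmented_reduce(keys, values, operator.xor)
      print(and_result)  # [2, 4]
      print(or_result)   # [7, 5]
      print(xor_result)  # [2, 1]
      ```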


   .. py:method:: aggregate(values: groupable, operator: str, skipna: bool = True, ddof: arkouda.numpy.dtypes.int_scalars = 1) -> Tuple[groupable, groupable]

      Group another array of values and apply a reduction to each group's values.

      Group using the permutation stored in the GroupBy instance.

      :param values: The values to group and reduce
      :type values: pdarray
      :param operator: The name of the reduction operator to use
      :type operator: str
      :param skipna: Whether NaN values should be skipped
      :type skipna: bool
      :param ddof: "Delta Degrees of Freedom" used in calculating std
      :type ddof: int_scalars

      :returns:

                unique_keys : groupable
                    The unique keys, in grouped order
                aggregates : groupable
                    One aggregate value per unique key in the GroupBy instance
      :rtype: Tuple[groupable, groupable]

      :raises TypeError: Raised if the values array is not a pdarray
      :raises ValueError: Raised if the key array size does not match the values size or
          if the operator is not in the GroupBy.Reductions array
      :raises RuntimeError: Raised if the requested operator is not supported for the
          values dtype

      .. rubric:: Examples

      >>> import arkouda as ak
      >>> keys = ak.arange(0, 5)
      >>> vals = ak.linspace(-1, 1, 5)
      >>> g = ak.GroupBy(keys)
      >>> g.aggregate(vals, 'sum')
      (array([0 1 2 3 4]),
       array([-1.00000000000000000 -0.5 0.00000000000000000 0.5 1.00000000000000000]))
      >>> g.aggregate(vals, 'min')
      (array([0 1 2 3 4]),
       array([-1.00000000000000000 -0.5 0.00000000000000000 0.5 1.00000000000000000]))
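
      The role of ``ddof`` can be sketched in pure Python as follows. The
      sketch assumes the usual convention in which the sum of squared
      deviations is divided by ``N - ddof``; ``group_std`` is a hypothetical
      helper, not part of Arkouda.

      ```python
      import math


      def group_std(keys, values, ddof=1):
          """Per-group standard deviation with an N - ddof divisor."""
          groups = {}
          for k, v in zip(keys, values):
              groups.setdefault(k, []).append(v)
          unique_keys = sorted(groups)
          result = []
          for k in unique_keys:
              vals = groups[k]
              mean = sum(vals) / len(vals)
              squared_dev = sum((v - mean) ** 2 for v in vals)
              # Assumes every group has more than ddof elements.
              result.append(math.sqrt(squared_dev / (len(vals) - ddof)))
          return unique_keys, result


      keys = [0, 0, 0, 0, 1, 1]
      values = [1.0, 2.0, 3.0, 4.0, 10.0, 14.0]
      _, sample_std = group_std(keys, values, ddof=1)
      _, population_std = group_std(keys, values, ddof=0)
      print(population_std[1])  # 2.0
      ```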


   .. py:method:: all(values: arkouda.numpy.pdarrayclass.pdarray) -> Tuple[Union[arkouda.numpy.pdarrayclass.pdarray, List[Union[arkouda.numpy.pdarrayclass.pdarray, arkouda.numpy.strings.Strings]]], arkouda.numpy.pdarrayclass.pdarray]

      Group another array of values and perform an "and" reduction on each group.

      Group using the permutation stored in the GroupBy instance.

      :param values: The values to group and reduce with "and"
      :type values: pdarray, bool

      :returns:

                unique_keys : (list of) pdarray or Strings
                    The unique keys, in grouped order
                group_all : pdarray, bool
                    One bool per unique key in the GroupBy instance
      :rtype: Tuple[Union[pdarray, List[Union[pdarray, Strings]]], pdarray]

      :raises TypeError: Raised if the values array is not a pdarray or if the pdarray
          dtype is not bool
      :raises ValueError: Raised if the key array size does not match the values size or
          if the operator is not in the GroupBy.Reductions array
      :raises RuntimeError: Raised if all is not supported for the values dtype


   .. py:method:: any(values: arkouda.numpy.pdarrayclass.pdarray) -> Tuple[Union[arkouda.numpy.pdarrayclass.pdarray, List[Union[arkouda.numpy.pdarrayclass.pdarray, arkouda.numpy.strings.Strings]]], arkouda.numpy.pdarrayclass.pdarray]

      Group another array of values and perform an "or" reduction on each group.

      Group using the permutation stored in the GroupBy instance.

      :param values: The values to group and reduce with "or"
      :type values: pdarray, bool

      :returns:

                unique_keys : (list of) pdarray or Strings
                    The unique keys, in grouped order
                group_any : pdarray, bool
                    One bool per unique key in the GroupBy instance
      :rtype: Tuple[Union[pdarray, List[Union[pdarray, Strings]]], pdarray]

      :raises TypeError: Raised if the values array is not a pdarray or if the pdarray
          dtype is not bool
      :raises ValueError: Raised if the key array size does not match the values size or
          if the operator is not in the GroupBy.Reductions array
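
      The per-group results of ``all`` and ``any`` can be illustrated with a
      short pure-Python sketch. ``group_bool_reduce`` is a hypothetical helper
      used only to show the expected output, not Arkouda code.

      ```python
      def group_bool_reduce(keys, values, reducer):
          """Apply reducer (the built-in all or any) to each group's booleans."""
          groups = {}
          for k, v in zip(keys, values):
              groups.setdefault(k, []).append(v)
          unique_keys = sorted(groups)
          return unique_keys, [reducer(groups[k]) for k in unique_keys]


      keys = [0, 0, 1, 1, 2]
      values = [True, False, True, True, False]
      _, group_all = group_bool_reduce(keys, values, all)
      _, group_any = group_bool_reduce(keys, values, any)
      print(group_all)  # [False, True, False]
      print(group_any)  # [True, True, False]
      ```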


   .. py:method:: argmax(values: arkouda.numpy.pdarrayclass.pdarray) -> Tuple[groupable, arkouda.numpy.pdarrayclass.pdarray]

      Group another array of values and return the location of the first maximum of each group.

      Group using the permutation stored in the GroupBy instance.

      :param values: The values to group and find argmax
      :type values: pdarray

      :returns:

                unique_keys : (list of) pdarray or Strings
                    The unique keys, in grouped order
                group_argmaxima : pdarray, int64
                    One index per unique key in the GroupBy instance
      :rtype: Tuple[groupable, pdarray]

      :raises TypeError: Raised if the values array is not a pdarray object or if argmax
          is not supported for the values dtype
      :raises ValueError: Raised if the key array size does not match the values size or
          if the operator is not in the GroupBy.Reductions array

      .. rubric:: Notes

      The returned indices refer to the original values array as passed in,
      not the permutation applied by the GroupBy instance.

      .. rubric:: Examples

      >>> import arkouda as ak
      >>> a = ak.randint(1, 5, 10, seed=1)
      >>> a
      array([2 4 4 2 1 4 1 2 4 3])
      >>> g = ak.GroupBy(a)
      >>> g.keys
      array([2 4 4 2 1 4 1 2 4 3])
      >>> b = ak.randint(1, 5, 10, seed=1)
      >>> b
      array([2 4 4 2 1 4 1 2 4 3])
      >>> g.argmax(b)
      (array([1 2 3 4]), array([4 0 9 1]))


   .. py:method:: argmin(values: arkouda.numpy.pdarrayclass.pdarray) -> Tuple[groupable, arkouda.numpy.pdarrayclass.pdarray]

      Group another array of values and return the location of the first minimum of each group.

      Group using the permutation stored in the GroupBy instance.

      :param values: The values to group and find argmin
      :type values: pdarray

      :returns:

                unique_keys : (list of) pdarray or Strings
                    The unique keys, in grouped order
                group_argminima : pdarray, int64
                    One index per unique key in the GroupBy instance
      :rtype: Tuple[groupable, pdarray]

      :raises TypeError: Raised if the values array is not a pdarray object or if argmin
          is not supported for the values dtype
      :raises ValueError: Raised if the key array size does not match the values
          size or if the operator is not in the GroupBy.Reductions array
      :raises RuntimeError: Raised if argmin is not supported for the values dtype

      .. rubric:: Notes

      The returned indices refer to the original values array as
      passed in, not the permutation applied by the GroupBy instance.

      .. rubric:: Examples

      >>> import arkouda as ak
      >>> a = ak.randint(1, 5, 10, seed=1)
      >>> a
      array([2 4 4 2 1 4 1 2 4 3])
      >>> g = ak.GroupBy(a)
      >>> g.keys
      array([2 4 4 2 1 4 1 2 4 3])
      >>> b = ak.randint(1, 5, 10, seed=1)
      >>> b
      array([2 4 4 2 1 4 1 2 4 3])
      >>> g.argmin(b)
      (array([1 2 3 4]), array([4 0 9 1]))
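
      Because the examples for ``argmax`` and ``argmin`` use the keys as
      values, every group is constant and both methods return the same
      indices. The pure-Python sketch below uses distinct values to show the
      documented behavior: the first extremum of each group, reported as an
      index into the original array. ``group_arg`` is a hypothetical helper,
      not Arkouda's implementation.

      ```python
      def group_arg(keys, values, pick_max=True):
          """Index (into the original array) of each group's first max or min."""
          best = {}
          for i, (k, v) in enumerate(zip(keys, values)):
              if k not in best:
                  best[k] = i
                  continue
              current = values[best[k]]
              # Strict comparison keeps the first occurrence on ties.
              if (v > current) if pick_max else (v < current):
                  best[k] = i
          unique_keys = sorted(best)
          return unique_keys, [best[k] for k in unique_keys]


      keys = [2, 4, 4, 2, 1, 4, 1, 2, 4, 3]
      values = [5, 1, 9, 5, 3, 9, 7, 2, 0, 4]
      unique_keys, argmaxima = group_arg(keys, values, pick_max=True)
      _, argminima = group_arg(keys, values, pick_max=False)
      print(unique_keys)  # [1, 2, 3, 4]
      print(argmaxima)    # [6, 0, 9, 2]
      print(argminima)    # [4, 7, 9, 8]
      ```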


   .. py:attribute:: assume_sorted
      :value: False


   .. py:method:: broadcast(values: Union[arkouda.numpy.pdarrayclass.pdarray, arkouda.numpy.strings.Strings], permute: bool = True) -> Union[arkouda.numpy.pdarrayclass.pdarray, arkouda.numpy.strings.Strings]

      Fill each group's segment with a constant value.

      :param values: The values to put in each group's segment
      :type values: pdarray, Strings
      :param permute: If True (default), permute broadcast values back to the ordering
                      of the original array on which GroupBy was called. If False, the
                      broadcast values are grouped by value.
      :type permute: bool

      :returns: The broadcasted values
      :rtype: pdarray, Strings

      :raises TypeError: Raised if values is not a pdarray object
      :raises ValueError: Raised if the values array does not have one
          value per segment

      .. rubric:: Notes

      This function is a sparse analog of ``np.broadcast``. If a
      GroupBy object represents a sparse matrix (tensor), then
      this function takes a (dense) column vector and replicates
      each value to the non-zero elements in the corresponding row.

      .. rubric:: Examples

      >>> import arkouda as ak
      >>> a = ak.array([0, 1, 0, 1, 0])
      >>> values = ak.array([3, 5])
      >>> g = ak.GroupBy(a)

      By default, result is in original order

      >>> g.broadcast(values)
      array([3 5 3 5 3])

      With permute=False, result is in grouped order

      >>> g.broadcast(values, permute=False)
      array([3 3 3 5 5])
      >>> a = ak.randint(1, 5, 10, seed=1)
      >>> a
      array([2 4 4 2 1 4 1 2 4 3])
      >>> g = ak.GroupBy(a)
      >>> keys,counts = g.size()
      >>> g.broadcast(counts > 2)
      array([True True True True False True False True True False])
      >>> g.broadcast(counts == 3)
      array([True False False True False False False True False False])
      >>> g.broadcast(counts < 4)
      array([True False False True True False True True False True])
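
      The effect of ``permute`` can also be sketched without a server. The
      pure-Python function below is a hypothetical stand-in
      (``broadcast_sketch``) that mirrors the first example above; it is not
      Arkouda's implementation.

      ```python
      def broadcast_sketch(keys, values, permute=True):
          """Give every element the value assigned to its group."""
          unique_keys = sorted(set(keys))
          if len(values) != len(unique_keys):
              raise ValueError("values must have one entry per group")
          value_of = dict(zip(unique_keys, values))
          # permute=True keeps the original order; otherwise use grouped order.
          ordered = keys if permute else sorted(keys)
          return [value_of[k] for k in ordered]


      keys = [0, 1, 0, 1, 0]
      original_order = broadcast_sketch(keys, [3, 5])
      grouped_order = broadcast_sketch(keys, [3, 5], permute=False)
      print(original_order)  # [3, 5, 3, 5, 3]
      print(grouped_order)   # [3, 3, 3, 5, 5]
      ```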


   .. py:method:: build_from_components(user_defined_name: Optional[str] = None, **kwargs) -> GroupBy
      :staticmethod:


      Build a new GroupBy object from component keys and permutation.

      :param user_defined_name: Passing a name will init the new GroupBy
                                and assign it the given name
      :type user_defined_name: str, optional
      :param kwargs: Dictionary of components required for rebuilding the GroupBy.
                     Expected keys are "orig_keys", "permutation", "unique_keys", and "segments"
      :type kwargs: dict

      :returns: The GroupBy object created by using the given components
      :rtype: GroupBy


   .. py:method:: count(values: arkouda.numpy.pdarrayclass.pdarray) -> Tuple[groupable, arkouda.numpy.pdarrayclass.pdarray]

      Count the number of elements in each group.

      NaN values will be excluded from the total.

      :param values: The values to be counted by group (excluding NaN values).
      :type values: pdarray

      :returns:

                unique_keys : (list of) pdarray or Strings
                    The unique keys, in grouped order
                counts : pdarray, int64
                    The number of times each unique key appears (excluding NaN values).
      :rtype: Tuple[groupable, pdarray]

      .. rubric:: Examples

      >>> import arkouda as ak
      >>> import numpy as np
      >>> a = ak.array([1, 0, -1, 1, 0, -1])
      >>> a
      array([1 0 -1 1 0 -1])
      >>> b = ak.array([1, np.nan, -1, np.nan, np.nan, -1], dtype = "float64")
      >>> b
      array([1.00000000000000000 nan -1.00000000000000000 nan nan -1.00000000000000000])
      >>> g = ak.GroupBy(a)
      >>> keys,counts = g.count(b)
      >>> keys
      array([-1 0 1])
      >>> counts
      array([2 0 1])
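
      The NaN handling shown above can be reproduced with a small pure-Python
      sketch. ``group_count`` is a hypothetical helper that only illustrates
      the counting rule; it is not Arkouda code.

      ```python
      import math


      def group_count(keys, values):
          """Count the non-NaN values that fall into each group."""
          counts = {k: 0 for k in keys}
          for k, v in zip(keys, values):
              if not (isinstance(v, float) and math.isnan(v)):
                  counts[k] += 1
          unique_keys = sorted(counts)
          return unique_keys, [counts[k] for k in unique_keys]


      nan = float("nan")
      keys = [1, 0, -1, 1, 0, -1]
      values = [1.0, nan, -1.0, nan, nan, -1.0]
      unique_keys, counts = group_count(keys, values)
      print(unique_keys)  # [-1, 0, 1]
      print(counts)       # [2, 0, 1]
      ```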


   .. py:attribute:: dropna
      :type:  bool


   .. py:method:: first(values: groupable_element_type) -> Tuple[groupable, groupable_element_type]

      First value in each group.

      :param values: The values from which to take the first of each group
      :type values: pdarray-like

      :returns:

                unique_keys : (list of) pdarray-like
                    The unique keys, in grouped order
                result : pdarray-like
                    The first value of each group
      :rtype: Tuple[groupable, groupable_element_type]


   .. py:method:: from_return_msg(rep_msg)
      :staticmethod:


   .. py:method:: head(values: groupable_element_type, n: int = 5, return_indices: bool = True) -> Tuple[groupable, groupable_element_type]

      Return the first n values from each group.

      :param values: The values from which to select, according to their group membership.
      :type values: (list of) pdarray-like
      :param n: Maximum number of items to return for each group.
                If the number of values in a group is less than n,
                all the values from that group will be returned.
      :type n: int, optional, default = 5
      :param return_indices: If True, return the indices of the selected values.
                             Otherwise, return the selected values.
      :type return_indices: bool, default True

      :returns:

                unique_keys : (list of) pdarray-like
                    The unique keys, in grouped order
                result : pdarray-like
                    The first n items of each group.
                    If return_indices is True, the result contains indices;
                    otherwise, it contains values.
      :rtype: Tuple[groupable, groupable_element_type]

      .. rubric:: Examples

      >>> import arkouda as ak
      >>> a = ak.arange(10) %3
      >>> a
      array([0 1 2 0 1 2 0 1 2 0])
      >>> v = ak.arange(10)
      >>> v
      array([0 1 2 3 4 5 6 7 8 9])
      >>> g = ak.GroupBy(a)
      >>> unique_keys, idx = g.head(v, 2, return_indices=True)
      >>> _, values = g.head(v, 2, return_indices=False)
      >>> unique_keys
      array([0 1 2])
      >>> idx
      array([0 3 1 4 2 5])
      >>> values
      array([0 3 1 4 2 5])

      >>> v2 =  -2 * ak.arange(10)
      >>> v2
      array([0 -2 -4 -6 -8 -10 -12 -14 -16 -18])
      >>> _, idx2 = g.head(v2, 2, return_indices=True)
      >>> _, values2 = g.head(v2, 2, return_indices=False)
      >>> idx2
      array([0 3 1 4 2 5])
      >>> values2
      array([0 -6 -2 -8 -4 -10])
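
      The selection rule can be sketched in pure Python. ``group_head`` below
      is a hypothetical helper that reproduces the second example above; it is
      not Arkouda's implementation.

      ```python
      def group_head(keys, values, n=5, return_indices=True):
          """Take up to n leading elements of each group, in grouped order."""
          taken = {}
          for i, k in enumerate(keys):
              indices_for_key = taken.setdefault(k, [])
              if len(indices_for_key) < n:
                  indices_for_key.append(i)
          unique_keys = sorted(taken)
          indices = [i for k in unique_keys for i in taken[k]]
          result = indices if return_indices else [values[i] for i in indices]
          return unique_keys, result


      keys = [i % 3 for i in range(10)]
      values = [-2 * i for i in range(10)]
      unique_keys, idx = group_head(keys, values, n=2, return_indices=True)
      _, vals = group_head(keys, values, n=2, return_indices=False)
      print(idx)   # [0, 3, 1, 4, 2, 5]
      print(vals)  # [0, -6, -2, -8, -4, -10]
      ```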


   .. py:method:: is_registered() -> bool

      Return True if the object is contained in the registry.

      :returns: Indicates if the object is contained in the registry
      :rtype: bool

      :raises RegistrationError: Raised if there's a server-side error or a mismatch of registered components

      .. seealso:: :py:obj:`register`, :py:obj:`attach`, :py:obj:`unregister`, :py:obj:`unregister_groupby_by_name`

      .. rubric:: Notes

      Objects registered with the server are immune to deletion until
      they are unregistered.


   .. py:attribute:: length


   .. py:attribute:: logger
      :type:  arkouda.logger.ArkoudaLogger


   .. py:method:: max(values: arkouda.numpy.pdarrayclass.pdarray, skipna: bool = True) -> Tuple[groupable, arkouda.numpy.pdarrayclass.pdarray]

      Group another array of values and return the maximum of each group's values.

      Group using the permutation stored in the GroupBy instance.

      :param values: The values to group and find maxima
      :type values: pdarray
      :param skipna: Whether NaN values should be skipped
      :type skipna: bool

      :returns:

                unique_keys : (list of) pdarray or Strings
                    The unique keys, in grouped order
                group_maxima : pdarray
                    One maximum per unique key in the GroupBy instance
      :rtype: Tuple[groupable, pdarray]

      :raises TypeError: Raised if the values array is not a pdarray object or if max is
          not supported for the values dtype
      :raises ValueError: Raised if the key array size does not match the values size or
          if the operator is not in the GroupBy.Reductions array
      :raises RuntimeError: Raised if max is not supported for the values dtype

      .. rubric:: Examples

      >>> import arkouda as ak
      >>> a = ak.randint(1, 5, 10, seed=1)
      >>> a
      array([2 4 4 2 1 4 1 2 4 3])
      >>> g = ak.GroupBy(a)
      >>> g.keys
      array([2 4 4 2 1 4 1 2 4 3])
      >>> b = ak.randint(1, 5, 10, seed=1)
      >>> b
      array([2 4 4 2 1 4 1 2 4 3])
      >>> g.max(b)
      (array([1 2 3 4]), array([1 2 3 4]))


   .. py:method:: mean(values: arkouda.numpy.pdarrayclass.pdarray, skipna: bool = True) -> Tuple[groupable, arkouda.numpy.pdarrayclass.pdarray]

      Group another array of values and compute the mean of each group's values.

      Group using the permutation stored in the GroupBy instance.

      :param values: The values to group and average
      :type values: pdarray
      :param skipna: Whether NaN values should be skipped
      :type skipna: bool

      :returns:

                unique_keys : (list of) pdarray or Strings
                    The unique keys, in grouped order
                group_means : pdarray, float64
                    One mean value per unique key in the GroupBy instance
      :rtype: Tuple[groupable, pdarray]

      :raises TypeError: Raised if the values array is not a pdarray object
      :raises ValueError: Raised if the key array size does not match the values size
          or if the operator is not in the GroupBy.Reductions array

      .. rubric:: Notes

      The return dtype is always float64.

      .. rubric:: Examples

      >>> import arkouda as ak
      >>> a = ak.randint(1, 5, 10, seed=1)
      >>> a
      array([2 4 4 2 1 4 1 2 4 3])
      >>> g = ak.GroupBy(a)
      >>> g.keys
      array([2 4 4 2 1 4 1 2 4 3])
      >>> b = ak.randint(1, 5, 10, seed=1)
      >>> b
      array([2 4 4 2 1 4 1 2 4 3])
      >>> g.mean(b)
      (array([1 2 3 4]),
      array([1.00000000000000000 2.00000000000000000 3.00000000000000000 4.00000000000000000]))
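
      The following pure-Python sketch illustrates what ``skipna`` is expected
      to change. It assumes that, with ``skipna=False``, a NaN in a group
      propagates to that group's mean, as ordinary floating-point arithmetic
      would; this is an assumption, not documented Arkouda behavior, and
      ``group_mean`` is a hypothetical helper.

      ```python
      import math


      def group_mean(keys, values, skipna=True):
          """Per-group mean, optionally ignoring NaN values."""
          groups = {}
          for k, v in zip(keys, values):
              groups.setdefault(k, []).append(v)
          unique_keys = sorted(groups)
          means = []
          for k in unique_keys:
              vals = groups[k]
              if skipna:
                  vals = [v for v in vals if not math.isnan(v)]
              # A group with no remaining values yields NaN in this sketch.
              means.append(sum(vals) / len(vals) if vals else math.nan)
          return unique_keys, means


      keys = [0, 0, 1, 1]
      values = [1.0, math.nan, 2.0, 4.0]
      _, skipped = group_mean(keys, values, skipna=True)
      _, kept = group_mean(keys, values, skipna=False)
      print(skipped)  # [1.0, 3.0]
      print(kept)     # [nan, 3.0]
      ```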


   .. py:method:: median(values: arkouda.numpy.pdarrayclass.pdarray, skipna: bool = True) -> Tuple[groupable, arkouda.numpy.pdarrayclass.pdarray]

      Group another array of values and compute the median of each group's values.

      Group using the permutation stored in the GroupBy instance.

      :param values: The values to group and find median
      :type values: pdarray
      :param skipna: Whether NaN values should be skipped
      :type skipna: bool

      :returns:

                unique_keys : (list of) pdarray or Strings
                    The unique keys, in grouped order
                group_medians : pdarray, float64
                    One median value per unique key in the GroupBy instance
      :rtype: Tuple[groupable, pdarray]

      :raises TypeError: Raised if the values array is not a pdarray object
      :raises ValueError: Raised if the key array size does not match the values size
          or if the operator is not in the GroupBy.Reductions array

      .. rubric:: Notes

      The return dtype is always float64.

      .. rubric:: Examples

      >>> import arkouda as ak
      >>> a = ak.randint(1, 5, 9, seed=1)
      >>> a
      array([2 4 4 2 1 4 1 2 4])
      >>> g = ak.GroupBy(a)
      >>> g.keys
      array([2 4 4 2 1 4 1 2 4])
      >>> b = ak.linspace(-5, 5, 9)
      >>> b
      array([-5.00000000000000000 -3.75 -2.5 -1.25 0.00000000000000000
          1.25 2.5 3.75 5.00000000000000000])
      >>> g.median(b)
      (array([1 2 4]), array([1.25 -1.25 -0.625]))


   .. py:method:: min(values: arkouda.numpy.pdarrayclass.pdarray, skipna: bool = True) -> Tuple[groupable, arkouda.numpy.pdarrayclass.pdarray]

      Group another array of values and return the minimum of each group's values.

      Group using the permutation stored in the GroupBy instance.

      :param values: The values to group and find minima
      :type values: pdarray
      :param skipna: Whether NaN values should be skipped
      :type skipna: bool

      :returns:

                unique_keys : (list of) pdarray or Strings
                    The unique keys, in grouped order
                group_minima : pdarray
                    One minimum per unique key in the GroupBy instance
      :rtype: Tuple[groupable, pdarray]

      :raises TypeError: Raised if the values array is not a pdarray object or if min is
          not supported for the values dtype
      :raises ValueError: Raised if the key array size does not match the values size
          or if the operator is not in the GroupBy.Reductions array
      :raises RuntimeError: Raised if min is not supported for the values dtype

      .. rubric:: Examples

      >>> import arkouda as ak
      >>> a = ak.randint(1, 5, 10, seed=1)
      >>> a
      array([2 4 4 2 1 4 1 2 4 3])
      >>> g = ak.GroupBy(a)
      >>> g.keys
      array([2 4 4 2 1 4 1 2 4 3])
      >>> b = ak.randint(1, 5, 10, seed=1)
      >>> b
      array([2 4 4 2 1 4 1 2 4 3])
      >>> g.min(b)
      (array([1 2 3 4]), array([1 2 3 4]))


   .. py:method:: mode(values: groupable) -> Tuple[groupable, groupable]

      Return the most common value in each group.

      If a group is multi-modal, return the
      modal value that occurs first.

      :param values: The values from which to take the mode of each group
      :type values: (list of) pdarray-like

      :returns:

                unique_keys : (list of) pdarray-like
                    The unique keys, in grouped order
                result : (list of) pdarray-like
                    The most common value of each group
      :rtype: Tuple[groupable, groupable]
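
      The tie-breaking rule can be sketched in pure Python. The sketch reads
      "occurs first" as the first position within the group in the original
      order, which is an assumption; ``group_mode`` is a hypothetical helper,
      not Arkouda code.

      ```python
      def group_mode(keys, values):
          """Most common value per group; ties go to the earliest value."""
          groups = {}
          for k, v in zip(keys, values):
              groups.setdefault(k, []).append(v)
          unique_keys = sorted(groups)
          modes = []
          for k in unique_keys:
              vals = groups[k]
              counts = {}
              for v in vals:
                  counts[v] = counts.get(v, 0) + 1
              top = max(counts.values())
              # First value, in original order, that reaches the top count.
              modes.append(next(v for v in vals if counts[v] == top))
          return unique_keys, modes


      keys = [0, 0, 0, 0, 1, 1, 1]
      values = [7, 5, 5, 7, 3, 3, 9]
      unique_keys, modes = group_mode(keys, values)
      print(modes)  # [7, 3]
      ```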


   .. py:attribute:: ngroups
      :type:  arkouda.numpy.dtypes.int_scalars


   .. py:attribute:: nkeys
      :type:  int


   .. py:method:: nunique(values: groupable) -> Tuple[groupable, arkouda.numpy.pdarrayclass.pdarray]

      Group another array of values and return the number of unique values in each group.

      Group using the permutation stored in the GroupBy instance.

      :param values: The values to group and find unique values
      :type values: pdarray, int64

      :returns:

                unique_keys : groupable
                    The unique keys, in grouped order
                group_nunique : pdarray, int64
                    Number of unique values per unique key in the GroupBy instance
      :rtype: Tuple[groupable, pdarray]

      :raises TypeError: Raised if the dtype(s) of values array(s) does/do not support
          the nunique method
      :raises ValueError: Raised if the key array size does not match the values size or
          if the operator is not in the GroupBy.Reductions array
      :raises RuntimeError: Raised if nunique is not supported for the values dtype

      .. rubric:: Examples

      >>> import arkouda as ak
      >>> data = ak.array([3, 4, 3, 1, 1, 4, 3, 4, 1, 4])
      >>> data
      array([3 4 3 1 1 4 3 4 1 4])
      >>> labels = ak.array([1, 1, 1, 2, 2, 2, 3, 3, 3, 4])
      >>> labels
      array([1 1 1 2 2 2 3 3 3 4])
      >>> g = ak.GroupBy(labels)
      >>> g.keys
      array([1 1 1 2 2 2 3 3 3 4])
      >>> g.nunique(data)
      (array([1 2 3 4]), array([2 2 3 1]))

      - Group (1,1,1) has values [3,4,3] -> 2 unique values (3 and 4)
      - Group (2,2,2) has values [1,1,4] -> 2 unique values (1 and 4)
      - Group (3,3,3) has values [3,4,1] -> 3 unique values
      - Group (4) has values [4] -> 1 unique value
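
      The same counts can be reproduced with a pure-Python sketch.
      ``group_nunique`` is a hypothetical helper that only illustrates the
      expected output.

      ```python
      def group_nunique(keys, values):
          """Number of distinct values observed in each group."""
          seen = {}
          for k, v in zip(keys, values):
              seen.setdefault(k, set()).add(v)
          unique_keys = sorted(seen)
          return unique_keys, [len(seen[k]) for k in unique_keys]


      data = [3, 4, 3, 1, 1, 4, 3, 4, 1, 4]
      labels = [1, 1, 1, 2, 2, 2, 3, 3, 3, 4]
      unique_keys, nunique = group_nunique(labels, data)
      print(unique_keys)  # [1, 2, 3, 4]
      print(nunique)      # [2, 2, 3, 1]
      ```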


   .. py:attribute:: objType
      :value: 'GroupBy'


   .. py:attribute:: permutation
      :type:  arkouda.numpy.pdarrayclass.pdarray


   .. py:method:: prod(values: arkouda.numpy.pdarrayclass.pdarray, skipna: bool = True) -> Tuple[groupable, arkouda.numpy.pdarrayclass.pdarray]

      Group another array of values and compute the product of each group's values.

      Group using the permutation stored in the GroupBy instance.

      :param values: The values to group and multiply
      :type values: pdarray
      :param skipna: Whether NaN values should be skipped
      :type skipna: bool

      :returns:

                unique_keys : (list of) pdarray or Strings
                    The unique keys, in grouped order
                group_products : pdarray, float64
                    One product per unique key in the GroupBy instance
      :rtype: Tuple[groupable, pdarray]

      :raises TypeError: Raised if the values array is not a pdarray object
      :raises ValueError: Raised if the key array size does not match the values size
          or if the operator is not in the GroupBy.Reductions array
      :raises RuntimeError: Raised if prod is not supported for the values dtype

      .. rubric:: Notes

      The return dtype is always float64.

      .. rubric:: Examples

      >>> import arkouda as ak
      >>> a = ak.randint(1, 5, 10, seed=1)
      >>> a
      array([2 4 4 2 1 4 1 2 4 3])
      >>> g = ak.GroupBy(a)
      >>> g.keys
      array([2 4 4 2 1 4 1 2 4 3])
      >>> b = ak.randint(1, 5, 10, seed=1)
      >>> b
      array([2 4 4 2 1 4 1 2 4 3])
      >>> g.prod(b)
      (array([1 2 3 4]),
      array([1.00000000000000000 7.9999999999999982 3.0000000000000004 255.99999999999994]))


   .. py:method:: register(user_defined_name: str) -> GroupBy

      Register this GroupBy object and underlying components with the Arkouda server.

      :param user_defined_name: User-defined name the GroupBy is to be registered under.
                                This will be the root name for underlying components.
      :type user_defined_name: str

      :returns: The same GroupBy, which is now registered with the Arkouda server and has an updated name.
                This is an in-place modification; the original is returned to support a
                fluent programming style.
                Note that you cannot register two different GroupBys with the same name.
      :rtype: GroupBy

      :raises TypeError: Raised if user_defined_name is not a str
      :raises RegistrationError: Raised if the server was unable to register the GroupBy with the user_defined_name

      .. seealso:: :py:obj:`unregister`, :py:obj:`attach`, :py:obj:`unregister_groupby_by_name`, :py:obj:`is_registered`

      .. rubric:: Notes

      Objects registered with the server are immune to deletion until
      they are unregistered.


   .. py:attribute:: registered_name
      :type:  Optional[str]
      :value: None


   .. py:method:: sample(values: groupable, n=None, frac=None, replace=False, weights=None, random_state=None, return_indices=False, permute_samples=False)

      Return a random sample from each group.

      You can specify either the number of elements
      or the fraction of elements to be sampled. Use random_state for reproducibility.

      :param values: The values from which to sample, according to their group membership.
      :type values: (list of) pdarray-like
      :param n: Number of items to return for each group.
                Cannot be used with frac and must be no larger than
                the smallest group unless replace is True.
                Default is one if frac is None.
      :type n: int, optional
      :param frac: Fraction of items to return. Cannot be used with n.
      :type frac: float, optional
      :param replace: Allow or disallow sampling of the value more than once.
      :type replace: bool, default False
      :param weights: Default None results in equal probability weighting.
                      If passed a pdarray, it must have the same length as the groupby keys;
                      its entries are used as sampling probabilities after normalization within each group.
                      Weights must be non-negative with at least one positive element within each group.
      :type weights: pdarray, optional
      :param random_state: If int, seed for random number generator.
                           If ak.random.Generator, use as given.
      :type random_state: int or ak.random.Generator, optional
      :param return_indices: If True, return the indices of the sampled values.
                             Otherwise, return the sampled values.
      :type return_indices: bool, default False
      :param permute_samples: If True, permute the samples according to group.
                              Otherwise, keep samples in original order.
      :type permute_samples: bool, default False

      :returns: If return_indices is True, the indices of the sampled values.
                Otherwise, the sampled values.
      :rtype: pdarray
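
      The per-group sampling described above can be sketched in plain Python.
      This only illustrates the semantics of ``n`` and ``replace``; it is not
      Arkouda's implementation. The helper name ``sample_per_group`` is
      hypothetical, and the standard-library ``random`` module stands in for
      ``ak.random.Generator``.

      .. code-block:: python

         import random
         from collections import defaultdict


         def sample_per_group(keys, values, n=1, replace=False, seed=None):
             """Hypothetical helper: draw n values from each group of equal keys."""
             rng = random.Random(seed)
             groups = defaultdict(list)
             for k, v in zip(keys, values):
                 groups[k].append(v)
             out = {}
             for k in sorted(groups):
                 members = groups[k]
                 if replace:
                     # With replacement, a value may be drawn more than once.
                     out[k] = [rng.choice(members) for _ in range(n)]
                 else:
                     if n > len(members):
                         raise ValueError("n is larger than a group and replace is False")
                     out[k] = rng.sample(members, n)
             return out


         keys = [0, 1, 2, 0, 1, 2, 0, 1, 2, 0]
         values = list(range(10))
         # Two values per group; the same seed reproduces the same draw.
         picked = sample_per_group(keys, values, n=2, seed=1)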


   .. py:attribute:: segments
      :type:  arkouda.numpy.pdarrayclass.pdarray


   .. py:method:: size() -> Tuple[groupable, arkouda.numpy.pdarrayclass.pdarray]

      Count the number of elements in each group, i.e. the number of times each key appears.

      This counts the total number of rows (including NaN values).

      :returns:

                unique_keys : (list of) pdarray or Strings
                    The unique keys, in grouped order
                counts : pdarray, int64
                    The number of times each unique key appears
      :rtype: List[pdarray|Strings], pdarray|int64

      .. seealso:: :py:obj:`count`

      .. rubric:: Examples

      >>> import arkouda as ak
      >>> a = ak.randint(1, 5, 10, seed=1)
      >>> a
      array([2 4 4 2 1 4 1 2 4 3])
      >>> g = ak.GroupBy(a)
      >>> keys,counts = g.size()
      >>> keys
      array([1 2 3 4])
      >>> counts
      array([2 3 1 4])


   .. py:method:: std(values: arkouda.numpy.pdarrayclass.pdarray, skipna: bool = True, ddof: arkouda.numpy.dtypes.int_scalars = 1) -> Tuple[groupable, arkouda.numpy.pdarrayclass.pdarray]

      Group another array of values and compute the standard deviation of each group's values.

      Group using the permutation stored in the GroupBy instance.

      :param values: The values to group and find standard deviation
      :type values: pdarray
      :param skipna: Whether NaN values should be skipped
      :type skipna: bool
      :param ddof: "Delta Degrees of Freedom" used in calculating std
      :type ddof: int_scalars

      :returns:

                unique_keys : (list of) pdarray or Strings
                    The unique keys, in grouped order
                group_stds : pdarray, float64
                    One std value per unique key in the GroupBy instance
      :rtype: Tuple[groupable, pdarray]

      :raises TypeError: Raised if the values array is not a pdarray object
      :raises ValueError: Raised if the key array size does not match the values size
          or if the operator is not in the GroupBy.Reductions array

      .. rubric:: Notes

      The return dtype is always float64.

      The standard deviation is the square root of the average of the squared
      deviations from the mean, i.e., ``std = sqrt(mean((x - x.mean())**2))``.

      The average squared deviation is normally calculated as
      ``sum((x - x.mean())**2) / N``, where ``N = len(x)``.  If, however, `ddof` is specified,
      the divisor ``N - ddof`` is used instead. In standard statistical
      practice, ``ddof=1`` provides an unbiased estimator of the variance
      of the infinite population. ``ddof=0`` provides a maximum likelihood
      estimate of the variance for normally distributed variables. The
      standard deviation computed in this function is the square root of
      the estimated variance, so even with ``ddof=1``, it will not be an
      unbiased estimate of the standard deviation per se.
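
      As a rough illustration of the ``N - ddof`` divisor, the following
      plain-Python sketch computes a per-group standard deviation. The helper
      ``grouped_std`` is hypothetical and is not how Arkouda computes the
      result; returning NaN when ``N - ddof`` is not positive is an assumption
      of this sketch.

      .. code-block:: python

         import math
         from collections import defaultdict


         def grouped_std(keys, values, ddof=1):
             """Hypothetical helper: per-group std using the divisor N - ddof."""
             groups = defaultdict(list)
             for k, v in zip(keys, values):
                 groups[k].append(v)
             result = {}
             for k in sorted(groups):
                 x = groups[k]
                 n = len(x)
                 if n - ddof <= 0:
                     # Too few observations for this ddof; NaN in this sketch.
                     result[k] = math.nan
                     continue
                 mean = sum(x) / n
                 result[k] = math.sqrt(sum((xi - mean) ** 2 for xi in x) / (n - ddof))
             return result


         keys = [2, 4, 4, 2, 1, 4, 1, 2, 4, 3]
         # Using the keys as values, every group holds identical numbers, so each
         # std is 0.0, except group 3, which has one element and yields NaN.
         stds = grouped_std(keys, keys, ddof=1)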

      .. rubric:: Examples

      >>> import arkouda as ak
      >>> a = ak.randint(1, 5, 10, seed=1)
      >>> a
      array([2 4 4 2 1 4 1 2 4 3])
      >>> g = ak.GroupBy(a)
      >>> g.keys
      array([2 4 4 2 1 4 1 2 4 3])
      >>> b = ak.randint(1, 5, 10, seed=1)
      >>> b
      array([2 4 4 2 1 4 1 2 4 3])
      >>> g.std(b)
      (array([1 2 3 4]), array([0.00000000000000000 0.00000000000000000 nan 0.00000000000000000]))


   .. py:method:: sum(values: arkouda.numpy.pdarrayclass.pdarray, skipna: bool = True) -> Tuple[groupable, arkouda.numpy.pdarrayclass.pdarray]

      Group another array of values and sum each group's values.

      Group using the permutation stored in the GroupBy instance.

      :param values: The values to group and sum
      :type values: pdarray
      :param skipna: Whether NaN values should be skipped
      :type skipna: bool

      :returns:

                unique_keys : (list of) pdarray or Strings
                    The unique keys, in grouped order
                group_sums : pdarray
                    One sum per unique key in the GroupBy instance
      :rtype: Tuple[groupable, pdarray]

      :raises TypeError: Raised if the values array is not a pdarray object
      :raises ValueError: Raised if the key array size does not match the values size or
          if the operator is not in the GroupBy.Reductions array

      .. rubric:: Notes

      The grouped sum of a boolean ``pdarray`` returns integers.

      .. rubric:: Examples

      >>> import arkouda as ak
      >>> a = ak.randint(1, 5, 10, seed=1)
      >>> a
      array([2 4 4 2 1 4 1 2 4 3])
      >>> g = ak.GroupBy(a)
      >>> g.keys
      array([2 4 4 2 1 4 1 2 4 3])
      >>> b = ak.randint(1, 5, 10, seed=1)
      >>> b
      array([2 4 4 2 1 4 1 2 4 3])
      >>> g.sum(b)
      (array([1 2 3 4]), array([2 6 3 16]))


   .. py:method:: tail(values: groupable_element_type, n: int = 5, return_indices: bool = True) -> Tuple[groupable, groupable_element_type]

      Return the last n values from each group.

      :param values: The values from which to select, according to their group membership.
      :type values: (list of) pdarray-like
      :param n: Maximum number of items to return for each group.
                If the number of values in a group is less than n,
                all the values from that group will be returned.
      :type n: int, optional, default = 5
      :param return_indices: If True, return the indices of the selected values.
                             Otherwise, return the selected values.
      :type return_indices: bool, default True

      :returns:

                unique_keys : (list of) pdarray-like
                    The unique keys, in grouped order
                result : pdarray-like
                    The last n items of each group.
                    If return_indices is True, the result contains indices.
                    Otherwise, the result contains values.
      :rtype: Tuple[groupable, groupable_element_type]

      .. rubric:: Examples

      >>> import arkouda as ak
      >>> a = ak.arange(10) %3
      >>> a
      array([0 1 2 0 1 2 0 1 2 0])
      >>> v = ak.arange(10)
      >>> v
      array([0 1 2 3 4 5 6 7 8 9])
      >>> g = ak.GroupBy(a)
      >>> unique_keys, idx = g.tail(v, 2, return_indices=True)
      >>> _, values = g.tail(v, 2, return_indices=False)
      >>> unique_keys
      array([0 1 2])
      >>> idx
      array([6 9 4 7 5 8])
      >>> values
      array([6 9 4 7 5 8])

      >>> v2 =  -2 * ak.arange(10)
      >>> v2
      array([0 -2 -4 -6 -8 -10 -12 -14 -16 -18])
      >>> _, idx2 = g.tail(v2, 2, return_indices=True)
      >>> _, values2 = g.tail(v2, 2, return_indices=False)
      >>> idx2
      array([6 9 4 7 5 8])
      >>> values2
      array([-12 -18 -8 -14 -10 -16])
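
      The selection rule can also be written as a plain-Python sketch, with
      lists standing in for pdarrays. The helper ``tail_per_group`` is
      hypothetical and only mirrors the documented behavior; it is not
      Arkouda's implementation.

      .. code-block:: python

         from collections import defaultdict


         def tail_per_group(keys, values, n=5, return_indices=True):
             """Hypothetical helper: last n entries of each group, in sorted key order."""
             positions = defaultdict(list)
             for i, k in enumerate(keys):
                 positions[k].append(i)
             unique_keys = sorted(positions)
             result = []
             for k in unique_keys:
                 members = positions[k]
                 # Groups shorter than n contribute all of their entries.
                 start = max(len(members) - n, 0)
                 for i in members[start:]:
                     result.append(i if return_indices else values[i])
             return unique_keys, result


         keys = [i % 3 for i in range(10)]
         v2 = [-2 * i for i in range(10)]
         _, idx2 = tail_per_group(keys, v2, 2, return_indices=True)
         _, values2 = tail_per_group(keys, v2, 2, return_indices=False)
         # idx2    -> [6, 9, 4, 7, 5, 8]
         # values2 -> [-12, -18, -8, -14, -10, -16]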


   .. py:method:: to_hdf(prefix_path, dataset='groupby', mode='truncate', file_type='distribute')

      Save the GroupBy to HDF5.

      The result is a collection of HDF5 files, one file
      per locale of the arkouda server, where each filename starts with prefix_path.

      :param prefix_path: Directory and filename prefix that all output files will share
      :type prefix_path: str
      :param dataset: Name prefix for saved data within the HDF5 file
      :type dataset: str
      :param mode: By default, truncate (overwrite) output files, if they exist.
                   If 'append', add data as a new column to existing files.
      :type mode: str {'truncate' | 'append'}
      :param file_type: Default: "distribute".
                        When set to "single", the dataset is written to a single file.
                        When set to "distribute", the dataset is written to one file per locale.
                        This is only supported by HDF5 files and has no impact on Parquet files.
      :type file_type: str ("single" | "distribute")

      .. rubric:: Notes

      GroupBy is not currently supported by Parquet.


   .. py:method:: unique(values: groupable)

      Return the set of unique values in each group, as a SegArray.

      :param values: The values whose unique elements are found within each group
      :type values: (list of) pdarray-like

      :returns:

                unique_keys : (list of) pdarray-like
                    The unique keys, in grouped order
                result : (list of) SegArray
                    The unique values of each group
      :rtype: (list of) pdarray-like, (list of) SegArray

      :raises TypeError: Raised if values is or contains Strings or Categorical
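
      A segmented result of this kind can be pictured as a flat array of
      values plus the start offset of each group's slice. The plain-Python
      sketch below illustrates that layout; the helper ``unique_per_group`` is
      hypothetical, and sorting the values within each group is an assumption
      made only to keep the output deterministic.

      .. code-block:: python

         from collections import defaultdict


         def unique_per_group(keys, values):
             """Hypothetical helper: unique values per group as (keys, segments, flat)."""
             groups = defaultdict(set)
             for k, v in zip(keys, values):
                 groups[k].add(v)
             unique_keys = sorted(groups)
             segments, flat = [], []
             for k in unique_keys:
                 segments.append(len(flat))      # start offset of this group's values
                 flat.extend(sorted(groups[k]))  # within-group order is an assumption
             return unique_keys, segments, flat


         keys = [0, 1, 0, 1, 0]
         values = [7, 5, 7, 6, 8]
         unique_keys, segments, flat = unique_per_group(keys, values)
         # unique_keys -> [0, 1]
         # segments    -> [0, 2]
         # flat        -> [7, 8, 5, 6]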


   .. py:attribute:: unique_keys
      :type:  Union[arkouda.numpy.pdarrayclass.pdarray, arkouda.numpy.strings.Strings, arkouda.pandas.categorical.Categorical, Tuple[Union[arkouda.numpy.pdarrayclass.pdarray, arkouda.numpy.strings.Strings, arkouda.pandas.categorical.Categorical], Ellipsis]]


   .. py:method:: unregister()

      Unregister this GroupBy object.

      Remove this GroupBy object, previously registered using register()
      and/or attached to using attach(), from the Arkouda server's registry.

      :raises RegistrationError: If the object is already unregistered or if there is a server error
          when attempting to unregister

      .. seealso:: :py:obj:`register`, :py:obj:`attach`, :py:obj:`unregister_groupby_by_name`, :py:obj:`is_registered`

      .. rubric:: Notes

      Objects registered with the server are immune to deletion until
      they are unregistered.


   .. py:method:: update_hdf(prefix_path: str, dataset: str = 'groupby', repack: bool = True)


   .. py:method:: var(values: arkouda.numpy.pdarrayclass.pdarray, skipna: bool = True, ddof: arkouda.numpy.dtypes.int_scalars = 1) -> Tuple[groupable, arkouda.numpy.pdarrayclass.pdarray]

      Group another array of values and compute the variance of each group's values.

      Group using the permutation stored in the GroupBy instance.

      :param values: The values to group and find variance
      :type values: pdarray
      :param skipna: Whether NaN values should be skipped
      :type skipna: bool
      :param ddof: "Delta Degrees of Freedom" used in calculating var
      :type ddof: int_scalars

      :returns:

                unique_keys : (list of) pdarray or Strings
                    The unique keys, in grouped order
                group_vars : pdarray, float64
                    One var value per unique key in the GroupBy instance
      :rtype: Tuple[groupable, pdarray]

      :raises TypeError: Raised if the values array is not a pdarray object
      :raises ValueError: Raised if the key array size does not match the values size
          or if the operator is not in the GroupBy.Reductions array

      .. rubric:: Notes

      The return dtype is always float64.

      The variance is the average of the squared deviations from the mean,
      i.e.,  ``var = mean((x - x.mean())**2)``.

      The mean is normally calculated as ``x.sum() / N``, where ``N = len(x)``.
      If, however, `ddof` is specified, the divisor ``N - ddof`` is used
      instead.  In standard statistical practice, ``ddof=1`` provides an
      unbiased estimator of the variance of a hypothetical infinite population.
      ``ddof=0`` provides a maximum likelihood estimate of the variance for
      normally distributed variables.
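
      The effect of the divisor can be checked on a small list with Python's
      standard ``statistics`` module, used here only as a stand-in to show the
      two conventions: ``pvariance`` divides by ``N`` (``ddof=0``) and
      ``variance`` divides by ``N - 1`` (``ddof=1``).

      .. code-block:: python

         import statistics

         x = [1.0, 2.0, 4.0, 5.0]  # mean 3.0, squared deviations sum to 10.0

         # Divisor N (ddof=0): 10.0 / 4
         var_ddof0 = statistics.pvariance(x)
         # Divisor N - 1 (ddof=1): 10.0 / 3
         var_ddof1 = statistics.variance(x)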

      .. rubric:: Examples

      >>> import arkouda as ak
      >>> a = ak.randint(1, 5, 10, seed=1)
      >>> a
      array([2 4 4 2 1 4 1 2 4 3])
      >>> g = ak.GroupBy(a)
      >>> g.keys
      array([2 4 4 2 1 4 1 2 4 3])
      >>> b = ak.randint(1, 5, 10, seed=1)
      >>> b
      array([2 4 4 2 1 4 1 2 4 3])
      >>> g.var(b)
      (array([1 2 3 4]), array([0.00000000000000000 0.00000000000000000 nan 0.00000000000000000]))


.. py:function:: broadcast(segments: arkouda.numpy.pdarrayclass.pdarray, values: Union[arkouda.numpy.pdarrayclass.pdarray, arkouda.numpy.strings.Strings], size: Union[int, numpy.int64, numpy.uint64] = -1, permutation: Union[arkouda.numpy.pdarrayclass.pdarray, None] = None)

   Broadcast a dense column vector to the rows of a sparse matrix or grouped array.

   :param segments: Offsets of the start of each row in the sparse matrix or grouped array.
                    Must be sorted in ascending order.
   :type segments: pdarray, int64
   :param values: The values to broadcast, one per row (or group)
   :type values: pdarray, Strings
   :param size: The total number of nonzeros in the matrix. If permutation is given, this
                argument is ignored and the size is inferred from the permutation array.
   :type size: int
   :param permutation: The permutation to go from the original ordering of nonzeros to the ordering
                       grouped by row. To broadcast values back to the original ordering, this
                       permutation will be inverted. If no permutation is supplied, it is assumed
                       that the original nonzeros were already grouped by row. In this case, the
                       size argument must be given.
   :type permutation: pdarray, int64

   :returns: The broadcast values, one per nonzero
   :rtype: pdarray, Strings

   :raises ValueError: - If segments and values are different sizes
       - If segments are empty
       - If number of nonzeros (either user-specified or inferred from permutation)
         is less than one

   .. rubric:: Examples

   >>> import arkouda as ak
   >>> # Define a sparse matrix with 3 rows and 7 nonzeros
   >>> row_starts = ak.array([0, 2, 5])
   >>> nnz = 7

   Broadcast the row number to each nonzero element:

   >>> row_number = ak.arange(3)
   >>> ak.broadcast(row_starts, row_number, nnz)
   array([0 0 1 1 1 2 2])

   If the original nonzeros were in reverse order:

   >>> permutation = ak.arange(6, -1, -1)
   >>> ak.broadcast(row_starts, row_number, permutation=permutation)
   array([2 2 1 1 1 0 0])
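
   The same behavior can be reproduced with a plain-Python sketch, using lists
   in place of pdarrays. The helper ``broadcast_sketch`` is hypothetical and
   only mirrors the documented semantics, including how a supplied permutation
   is inverted to return to the original ordering.

   .. code-block:: python

      def broadcast_sketch(segments, values, size=-1, permutation=None):
          """Hypothetical helper: repeat one value per segment across its rows."""
          if permutation is not None:
              # The size argument is ignored when a permutation is given.
              size = len(permutation)
          bounds = list(segments) + [size]
          grouped = []
          for g, val in enumerate(values):
              grouped.extend([val] * (bounds[g + 1] - bounds[g]))
          if permutation is None:
              return grouped
          # Invert the permutation: grouped position i came from
          # original position permutation[i].
          original = [None] * size
          for i, p in enumerate(permutation):
              original[p] = grouped[i]
          return original


      row_starts = [0, 2, 5]
      row_number = [0, 1, 2]
      in_grouped_order = broadcast_sketch(row_starts, row_number, size=7)
      in_original_order = broadcast_sketch(
          row_starts, row_number, permutation=list(range(6, -1, -1))
      )
      # in_grouped_order  -> [0, 0, 1, 1, 1, 2, 2]
      # in_original_order -> [2, 2, 1, 1, 1, 0, 0]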


.. py:data:: groupable

.. py:function:: unique(pda: groupable, return_groups: bool = False, assume_sorted: bool = False, return_indices: bool = False) -> Union[groupable, Tuple[groupable, arkouda.numpy.pdarrayclass.pdarray, arkouda.numpy.pdarrayclass.pdarray, int]]

   Find the unique elements of an array.

   Returns the unique elements of an array, sorted if the values are integers.
   If ``return_groups`` is True, grouping information (the permutation and
   segments listed under the return values) is returned in addition to the
   unique elements.

   :param pda: Input array.
   :type pda: (list of) pdarray, Strings, or Categorical
   :param return_groups: If True, also return grouping information for the array.
   :type return_groups: bool, optional
   :param assume_sorted: If True, assume pda is sorted and skip sorting step
   :type assume_sorted: bool, optional
   :param return_indices: Only applicable if return_groups is True.
                          If True, return unique key indices along with other groups
   :type return_indices: bool, optional

   :returns:

             unique : (list of) pdarray, Strings, or Categorical
                 The unique values. If input dtype is int64, return values will be sorted.
             permutation : pdarray, optional
                 Permutation that groups equivalent values together (only when return_groups=True)
             segments : pdarray, optional
                 The offset of each group in the permuted array (only when return_groups=True)
   :rtype: Union[groupable, Tuple[groupable, pdarray, pdarray, int]]

   :raises TypeError: Raised if pda is not a pdarray or Strings object
   :raises RuntimeError: Raised if the pdarray or Strings dtype is unsupported

   .. rubric:: Notes

   For integer arrays, this function checks to see whether `pda` is sorted
   and, if so, whether it is already unique. This step can save considerable
   computation. Otherwise, this function will sort `pda`.

   .. rubric:: Examples

   >>> import arkouda as ak
   >>> A = ak.array([3, 2, 1, 1, 2, 3])
   >>> ak.unique(A)
   array([1 2 3])
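
   To picture the grouping information returned when ``return_groups`` is
   True, the plain-Python sketch below derives a permutation and segment
   offsets for the same input. The helper ``unique_with_groups`` is
   hypothetical; the order of indices within a group produced by Arkouda may
   differ from the stable sort assumed here.

   .. code-block:: python

      def unique_with_groups(values):
          """Hypothetical helper: unique values, grouping permutation, and segments."""
          # A stable argsort places equal values next to each other.
          permutation = sorted(range(len(values)), key=lambda i: values[i])
          unique, segments = [], []
          for pos, i in enumerate(permutation):
              if not unique or values[i] != unique[-1]:
                  unique.append(values[i])
                  segments.append(pos)  # offset of this group in the permuted array
          return unique, permutation, segments


      A = [3, 2, 1, 1, 2, 3]
      unique, permutation, segments = unique_with_groups(A)
      # unique      -> [1, 2, 3]
      # permutation -> [2, 3, 1, 4, 0, 5]
      # segments    -> [0, 2, 4]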