arkouda.pdarrayclass ==================== .. py:module:: arkouda.pdarrayclass Exceptions ---------- .. autoapisummary:: arkouda.pdarrayclass.RegistrationError Classes ------- .. autoapisummary:: arkouda.pdarrayclass.pdarray Functions --------- .. autoapisummary:: arkouda.pdarrayclass.all arkouda.pdarrayclass.any arkouda.pdarrayclass.argmax arkouda.pdarrayclass.argmaxk arkouda.pdarrayclass.argmin arkouda.pdarrayclass.argmink arkouda.pdarrayclass.attach_pdarray arkouda.pdarrayclass.broadcast_to_shape arkouda.pdarrayclass.clear arkouda.pdarrayclass.clz arkouda.pdarrayclass.corr arkouda.pdarrayclass.cov arkouda.pdarrayclass.ctz arkouda.pdarrayclass.divmod arkouda.pdarrayclass.dot arkouda.pdarrayclass.fmod arkouda.pdarrayclass.is_sorted arkouda.pdarrayclass.max arkouda.pdarrayclass.maxk arkouda.pdarrayclass.mean arkouda.pdarrayclass.min arkouda.pdarrayclass.mink arkouda.pdarrayclass.mod arkouda.pdarrayclass.parity arkouda.pdarrayclass.popcount arkouda.pdarrayclass.power arkouda.pdarrayclass.prod arkouda.pdarrayclass.rotl arkouda.pdarrayclass.rotr arkouda.pdarrayclass.sqrt arkouda.pdarrayclass.std arkouda.pdarrayclass.sum arkouda.pdarrayclass.unregister_pdarray_by_name arkouda.pdarrayclass.var Module Contents --------------- .. py:exception:: RegistrationError Bases: :py:obj:`Exception` Error/Exception used when the Arkouda Server cannot register an object .. py:function:: all(pda: pdarray) -> numpy.bool_ Return True iff all elements of the array evaluate to True. :param pda: The pdarray instance to be evaluated :type pda: pdarray :returns: Indicates if all pdarray elements evaluate to True :rtype: bool :raises TypeError: Raised if pda is not a pdarray instance :raises RuntimeError: Raised if there's a server-side error thrown .. py:function:: any(pda: pdarray) -> numpy.bool_ Return True iff any element of the array evaluates to True. 
:param pda: The pdarray instance to be evaluated :type pda: pdarray :returns: Indicates if at least one pdarray element evaluates to True :rtype: bool :raises TypeError: Raised if pda is not a pdarray instance :raises RuntimeError: Raised if there's a server-side error thrown .. py:function:: argmax(pda: pdarray) -> Union[numpy.int64, numpy.uint64] Return the index of the first occurrence of the array max value. :param pda: Values for which to calculate the argmax :type pda: pdarray :returns: The index of the argmax calculated from the pda :rtype: Union[np.int64, np.uint64] :raises TypeError: Raised if pda is not a pdarray instance :raises RuntimeError: Raised if there's a server-side error thrown .. py:function:: argmaxk(pda: pdarray, k: arkouda.numpy.dtypes.int_scalars) -> pdarray Find the indices corresponding to the `k` maximum values of an array. The indices are ordered so that the corresponding values are ascending. :param pda: Input array. :type pda: pdarray :param k: The desired count of indices corresponding to maximum array values :type k: int_scalars :returns: The indices of the maximum `k` values from the pda, sorted :rtype: pdarray, int :raises TypeError: Raised if pda is not a pdarray or k is not an integer :raises ValueError: Raised if the pda is empty or k < 1 .. rubric:: Notes This call is equivalent in value to: ak.argsort(a)[-k:] and generally outperforms this operation. This reduction will see a significant drop in performance as `k` grows beyond a certain value. This value is system dependent, but generally about a `k` of 5 million is where performance degradation has been observed. .. rubric:: Examples >>> A = ak.array([10,5,1,3,7,2,9,0]) >>> ak.argmaxk(A, 3) array([4, 6, 0]) >>> ak.argmaxk(A, 4) array([1, 4, 6, 0]) .. py:function:: argmin(pda: pdarray) -> Union[numpy.int64, numpy.uint64] Return the index of the first occurrence of the array min value.
:param pda: Values for which to calculate the argmin :type pda: pdarray :returns: The index of the argmin calculated from the pda :rtype: Union[np.int64, np.uint64] :raises TypeError: Raised if pda is not a pdarray instance :raises RuntimeError: Raised if there's a server-side error thrown .. py:function:: argmink(pda: pdarray, k: arkouda.numpy.dtypes.int_scalars) -> pdarray Find the indices corresponding to the `k` minimum values of an array. :param pda: Input array. :type pda: pdarray :param k: The desired count of indices corresponding to minimum array values :type k: int_scalars :returns: The indices of the minimum `k` values from the pda, sorted :rtype: pdarray, int :raises TypeError: Raised if pda is not a pdarray or k is not an integer :raises ValueError: Raised if the pda is empty or k < 1 .. rubric:: Notes This call is equivalent in value to: ak.argsort(a)[:k] and generally outperforms this operation. This reduction will see a significant drop in performance as `k` grows beyond a certain value. This value is system dependent, but generally about a `k` of 5 million is where performance degradation has been observed. .. rubric:: Examples >>> A = ak.array([10,5,1,3,7,2,9,0]) >>> ak.argmink(A, 3) array([7, 2, 5]) >>> ak.argmink(A, 4) array([7, 2, 5, 3]) .. py:function:: attach_pdarray(user_defined_name: str) -> pdarray Return a pdarray attached to the registered name in the arkouda server, which was registered using register(). :param user_defined_name: The user-defined name under which the array was registered :type user_defined_name: str :returns: pdarray which is bound to the corresponding server-side component that was registered with user_defined_name :rtype: pdarray :raises TypeError: Raised if user_defined_name is not a str .. seealso:: :obj:`attach`, :obj:`register`, :obj:`unregister`, :obj:`is_registered`, :obj:`unregister_pdarray_by_name`, :obj:`list_registry` ..
rubric:: Notes Registered names/pdarrays in the server are immune to deletion until they are unregistered. .. rubric:: Examples >>> a = ak.zeros(100) >>> a.register("my_zeros") >>> # potentially disconnect from server and reconnect to server >>> b = ak.attach_pdarray("my_zeros") >>> # ...other work... >>> b.unregister() .. py:function:: broadcast_to_shape(pda: pdarray, shape: Tuple[int, Ellipsis]) -> pdarray Expand an array's rank to the specified shape using broadcasting. .. py:function:: clear() -> None Send a clear message to clear all unregistered data from the server symbol table :rtype: None :raises RuntimeError: Raised if there is a server-side error in executing the clear request .. py:function:: clz(pda: pdarray) -> pdarray Count leading zeros for each integer in an array. :param pda: Input array (must be integral). :type pda: pdarray, int64, uint64, bigint :returns: **lz** -- The number of leading zeros of each element. :rtype: pdarray :raises TypeError: If input array is not int64, uint64, or bigint .. rubric:: Examples >>> A = ak.arange(10) >>> ak.clz(A) array([64, 63, 62, 62, 61, 61, 61, 61, 60, 60]) .. py:function:: corr(x: pdarray, y: pdarray) -> numpy.float64 Return the correlation between x and y :param x: One of the pdarrays used to calculate correlation :type x: pdarray :param y: One of the pdarrays used to calculate correlation :type y: pdarray :returns: The scalar correlation of the two pdarrays :rtype: np.float64 :raises TypeError: Raised if x or y is not a pdarray instance :raises RuntimeError: Raised if there's a server-side error thrown .. seealso:: :obj:`std`, :obj:`cov` .. rubric:: Notes The correlation is calculated by cov(x, y) / (x.std(ddof=1) * y.std(ddof=1)) ..
py:function:: cov(x: pdarray, y: pdarray) -> numpy.float64 Return the covariance of x and y :param x: One of the pdarrays used to calculate covariance :type x: pdarray :param y: One of the pdarrays used to calculate covariance :type y: pdarray :returns: The scalar covariance of the two pdarrays :rtype: np.float64 :raises TypeError: Raised if x or y is not a pdarray instance :raises RuntimeError: Raised if there's a server-side error thrown .. seealso:: :obj:`mean`, :obj:`var` .. rubric:: Notes The covariance is calculated by ``cov = ((x - x.mean()) * (y - y.mean())).sum() / (x.size - 1)``. .. py:function:: ctz(pda: pdarray) -> pdarray Count trailing zeros for each integer in an array. :param pda: Input array (must be integral). :type pda: pdarray, int64, uint64, bigint :returns: **tz** -- The number of trailing zeros of each element. :rtype: pdarray .. rubric:: Notes ctz(0) is defined to be zero. :raises TypeError: If input array is not int64, uint64, or bigint .. rubric:: Examples >>> A = ak.arange(10) >>> ak.ctz(A) array([0, 0, 1, 0, 2, 0, 1, 0, 3, 0]) .. py:function:: divmod(x: Union[arkouda.numpy.dtypes.numeric_scalars, pdarray], y: Union[arkouda.numpy.dtypes.numeric_scalars, pdarray], where: Union[bool, pdarray] = True) -> Tuple[pdarray, pdarray] :param x: The dividend array: the values that will be the numerator of the floor division and will be acted on by the bases for modular division. :type x: numeric_scalars(float_scalars, int_scalars) or pdarray :param y: The divisor array: the values that will be the denominator of the division and will be the bases for the modular division. :type y: numeric_scalars(float_scalars, int_scalars) or pdarray :param where: This condition is broadcast over the input. At locations where the condition is True, the corresponding value will be divided using floor and modular division. Elsewhere, it will retain its original value. Defaults to True.
:type where: Boolean or pdarray :returns: Returns a tuple that contains quotient and remainder of the division :rtype: (pdarray, pdarray) :raises TypeError: At least one entry must be a pdarray :raises ValueError: If both inputs are pdarrays, their sizes must match :raises ZeroDivisionError: No entry in y is allowed to be 0, to prevent division by zero .. rubric:: Notes The div is calculated by x // y. The mod is calculated by x % y. .. rubric:: Examples >>> x = ak.arange(5, 10) >>> y = ak.array([2, 1, 4, 5, 8]) >>> ak.divmod(x,y) (array([2 6 1 1 1]), array([1 0 3 3 1])) >>> ak.divmod(x,y, x % 2 == 0) (array([5 6 7 1 9]), array([5 0 7 3 9])) .. py:function:: dot(pda1: Union[numpy.int64, numpy.float64, numpy.uint64, pdarray], pda2: Union[numpy.int64, numpy.float64, numpy.uint64, pdarray]) -> Union[numpy.int64, numpy.float64, numpy.uint64, pdarray] Returns the sum of the elementwise product of two arrays of the same size (the dot product) or the product of a singleton element and an array. :param pda1: :type pda1: Union[numeric_scalars, pdarray] :param pda2: :type pda2: Union[numeric_scalars, pdarray] :returns: The sum of the elementwise product of pda1 and pda2, or the product of a singleton element and an array. :rtype: Union[numeric_scalars, pdarray] :raises ValueError: Raised if the size of pda1 is not the same as pda2 .. rubric:: Examples >>> x = ak.array([2, 3]) >>> y = ak.array([4, 5]) >>> ak.dot(x,y) 23 >>> ak.dot(x,2) array([4 6]) .. py:function:: fmod(dividend: Union[pdarray, arkouda.numpy.dtypes.numeric_scalars], divisor: Union[pdarray, arkouda.numpy.dtypes.numeric_scalars]) -> pdarray Returns the element-wise remainder of division. It is equivalent to np.fmod; the remainder has the same sign as the dividend. :param dividend: The array being acted on by the bases for the modular division. :type dividend: numeric scalars or pdarray :param divisor: The array that will be the bases for the modular division.
:type divisor: numeric scalars or pdarray :returns: Returns an array that contains the element-wise remainder of division. :rtype: pdarray .. py:function:: is_sorted(pda: pdarray) -> numpy.bool_ Return True iff the array is monotonically non-decreasing. :param pda: The pdarray instance to be evaluated :type pda: pdarray :returns: Indicates if the array is monotonically non-decreasing :rtype: bool :raises TypeError: Raised if pda is not a pdarray instance :raises RuntimeError: Raised if there's a server-side error thrown .. py:function:: max(pda: pdarray) -> arkouda.numpy.dtypes.numpy_scalars Return the maximum value of the array. :param pda: Values for which to calculate the max :type pda: pdarray :returns: The max calculated from the pda :rtype: numpy_scalars :raises TypeError: Raised if pda is not a pdarray instance :raises RuntimeError: Raised if there's a server-side error thrown .. py:function:: maxk(pda: pdarray, k: arkouda.numpy.dtypes.int_scalars) -> pdarray Find the `k` maximum values of an array. Returns the largest `k` values of an array, sorted :param pda: Input array. :type pda: pdarray :param k: The desired count of maximum values to be returned by the output. :type k: int_scalars :returns: The maximum `k` values from pda, sorted :rtype: pdarray, int :raises TypeError: Raised if pda is not a pdarray or k is not an integer :raises ValueError: Raised if the pda is empty or k < 1 .. rubric:: Notes This call is equivalent in value to: a[ak.argsort(a)[-k:]] and generally outperforms this operation. This reduction will see a significant drop in performance as `k` grows beyond a certain value. This value is system dependent, but generally about a `k` of 5 million is where performance degradation has been observed. .. rubric:: Examples >>> A = ak.array([10,5,1,3,7,2,9,0]) >>> ak.maxk(A, 3) array([7, 9, 10]) >>> ak.maxk(A, 4) array([5, 7, 9, 10]) .. py:function:: mean(pda: pdarray) -> numpy.float64 Return the mean of the array.
:param pda: Values for which to calculate the mean :type pda: pdarray :returns: The mean calculated from the pda sum and size :rtype: np.float64 :raises TypeError: Raised if pda is not a pdarray instance :raises RuntimeError: Raised if there's a server-side error thrown .. py:function:: min(pda: pdarray) -> arkouda.numpy.dtypes.numpy_scalars Return the minimum value of the array. :param pda: Values for which to calculate the min :type pda: pdarray :returns: The min calculated from the pda :rtype: numpy_scalars :raises TypeError: Raised if pda is not a pdarray instance :raises RuntimeError: Raised if there's a server-side error thrown .. py:function:: mink(pda: pdarray, k: arkouda.numpy.dtypes.int_scalars) -> pdarray Find the `k` minimum values of an array. Returns the smallest `k` values of an array, sorted :param pda: Input array. :type pda: pdarray :param k: The desired count of minimum values to be returned by the output. :type k: int_scalars :returns: The minimum `k` values from pda, sorted :rtype: pdarray :raises TypeError: Raised if pda is not a pdarray :raises ValueError: Raised if the pda is empty or k < 1 .. rubric:: Notes This call is equivalent in value to: a[ak.argsort(a)[:k]] and generally outperforms this operation. This reduction will see a significant drop in performance as `k` grows beyond a certain value. This value is system dependent, but generally about a `k` of 5 million is where performance degradation has been observed. .. rubric:: Examples >>> A = ak.array([10,5,1,3,7,2,9,0]) >>> ak.mink(A, 3) array([0, 1, 2]) >>> ak.mink(A, 4) array([0, 1, 2, 3]) .. py:function:: mod(dividend, divisor) -> pdarray Returns the element-wise remainder of division. Computes the remainder complementary to the floor_divide function. It is equivalent to np.mod; the remainder has the same sign as the divisor. :param dividend: The array being acted on by the bases for the modular division. :param divisor: The array that will be the bases for the modular division.
:returns: Returns an array that contains the element-wise remainder of division. :rtype: pdarray .. py:function:: parity(pda: pdarray) -> pdarray Find the bit parity (XOR of all bits) for each integer in an array. :param pda: Input array (must be integral). :type pda: pdarray, int64, uint64, bigint :returns: **parity** -- The parity of each element: 0 if even number of bits set, 1 if odd. :rtype: pdarray :raises TypeError: If input array is not int64, uint64, or bigint .. rubric:: Examples >>> A = ak.arange(10) >>> ak.parity(A) array([0, 1, 1, 0, 1, 0, 0, 1, 1, 0]) .. py:class:: pdarray(name: str, mydtype: Union[numpy.dtype, str], size: arkouda.numpy.dtypes.int_scalars, ndim: arkouda.numpy.dtypes.int_scalars, shape: Sequence[int], itemsize: arkouda.numpy.dtypes.int_scalars, max_bits: Optional[int] = None) The basic arkouda array class. This class contains only the attributes of the array; the data resides on the arkouda server. When a server operation results in a new array, arkouda will create a pdarray instance that points to the array data on the server. As such, the user should not initialize pdarray instances directly. .. attribute:: name The server-side identifier for the array :type: str .. attribute:: dtype The element type of the array :type: dtype .. attribute:: size The number of elements in the array :type: int_scalars .. attribute:: ndim The rank of the array (currently only rank 1 arrays supported) :type: int_scalars .. attribute:: shape A list or tuple containing the sizes of each dimension of the array :type: Sequence[int] .. attribute:: itemsize The size in bytes of each element :type: int_scalars .. py:attribute:: BinOps .. py:attribute:: OpEqOps .. py:method:: all() -> numpy.bool_ Return True iff all elements of the array evaluate to True. .. py:method:: any() -> numpy.bool_ Return True iff any element of the array evaluates to True. ..
py:method:: argmax() -> Union[numpy.int64, numpy.uint64] Return the index of the first occurrence of the array max value. .. py:method:: argmaxk(k: arkouda.numpy.dtypes.int_scalars) -> pdarray Find the indices corresponding to the maximum "k" values. :param k: The desired count of maximum values to be returned by the output. :type k: int_scalars :returns: Indices corresponding to the maximum `k` values, sorted :rtype: pdarray, int :raises TypeError: Raised if pda is not a pdarray .. py:method:: argmin() -> Union[numpy.int64, numpy.uint64] Return the index of the first occurrence of the array min value. .. py:method:: argmink(k: arkouda.numpy.dtypes.int_scalars) -> pdarray Find the indices corresponding to the minimum "k" values. :param k: The desired count of minimum values to be returned by the output. :type k: int_scalars :returns: Indices corresponding to the minimum `k` values from pda :rtype: pdarray, int :raises TypeError: Raised if pda is not a pdarray .. py:method:: astype(dtype) -> pdarray Cast values of pdarray to provided dtype :param dtype: Dtype to cast to :type dtype: np.dtype or str :returns: An arkouda pdarray with values converted to the specified data type :rtype: ak.pdarray .. rubric:: Notes This is essentially shorthand for ``ak.cast(x, '<dtype>')`` where x is a pdarray. .. py:method:: attach(user_defined_name: str) -> pdarray :staticmethod: Return a pdarray attached to the registered name in the arkouda server, which was registered using register(). :param user_defined_name: The user-defined name under which the array was registered :type user_defined_name: str :returns: pdarray which is bound to the corresponding server-side component that was registered with user_defined_name :rtype: pdarray :raises TypeError: Raised if user_defined_name is not a str .. seealso:: :obj:`register`, :obj:`unregister`, :obj:`is_registered`, :obj:`unregister_pdarray_by_name`, :obj:`list_registry` ..
rubric:: Notes Registered names/pdarrays in the server are immune to deletion until they are unregistered. .. rubric:: Examples >>> a = ak.zeros(100) >>> a.register("my_zeros") >>> # potentially disconnect from server and reconnect to server >>> b = ak.pdarray.attach("my_zeros") >>> # ...other work... >>> b.unregister() .. py:method:: bigint_to_uint_arrays() -> List[pdarray] Creates a list of uint pdarrays from a bigint pdarray. The first item in the returned list holds the highest 64 bits of the bigint pdarray and the last item holds the lowest 64 bits. :returns: A list of uint pdarrays, ordered from the highest 64 bits of the bigint pdarray (first item) to the lowest 64 bits (last item). :rtype: List[pdarray] :raises RuntimeError: Raised if there is a server-side error thrown .. seealso:: :obj:`pdarraycreation.bigint_from_uint_arrays` .. rubric:: Examples >>> a = ak.arange(2**64, 2**64 + 5) >>> a array(["18446744073709551616" "18446744073709551617" "18446744073709551618" "18446744073709551619" "18446744073709551620"]) >>> a.bigint_to_uint_arrays() [array([1 1 1 1 1]), array([0 1 2 3 4])] .. py:method:: clz() -> pdarray Count the number of leading zeros in each element. See `ak.clz`. .. py:method:: corr(y: pdarray) -> numpy.float64 Compute the correlation between self and y using the Pearson correlation coefficient. :param y: Other pdarray used to calculate correlation :type y: pdarray :returns: The scalar correlation of the two arrays :rtype: np.float64 :raises TypeError: Raised if y is not a pdarray instance :raises RuntimeError: Raised if there's a server-side error thrown .. py:method:: cov(y: pdarray) -> numpy.float64 Compute the covariance between self and y. :param y: Other pdarray used to calculate covariance :type y: pdarray :returns: The scalar covariance of the two arrays :rtype: np.float64 :raises TypeError: Raised if y is not a pdarray instance :raises RuntimeError: Raised if there's a server-side error thrown ..
py:method:: ctz() -> pdarray Count the number of trailing zeros in each element. See `ak.ctz`. .. py:attribute:: dtype .. py:method:: equals(other) -> bool Whether pdarrays are the same size and all entries are equal. :param other: object to compare. :type other: object :returns: True if the pdarrays are the same, otherwise False. :rtype: bool .. rubric:: Examples >>> import arkouda as ak >>> ak.connect() >>> a = ak.array([1, 2, 3]) >>> a_cpy = ak.array([1, 2, 3]) >>> a.equals(a_cpy) True >>> a2 = ak.array([1, 2, 5]) >>> a.equals(a2) False .. py:method:: fill(value: arkouda.numpy.dtypes.numeric_scalars) -> None Fill the array (in place) with a constant value. :param value: :type value: numeric_scalars :raises TypeError: Raised if value is not an int, int64, float, or float64 .. py:method:: format_other(other) -> str Attempt to cast scalar other to the element dtype of this pdarray, and print the resulting value to a string (e.g. for sending to a server command). The user should not call this function directly. :param other: The scalar to be cast to the pdarray.dtype :type other: object :rtype: string representation of np.dtype corresponding to the other parameter :raises TypeError: Raised if the other parameter cannot be converted to Numpy dtype .. py:property:: inferred_type :type: Union[str, None] Return a string of the type inferred from the values. .. py:method:: info() -> str Returns a JSON formatted string containing information about all components of self :param None: :returns: JSON string containing information about all components of self :rtype: str .. py:method:: is_registered() -> numpy.bool_ Return True iff the object is contained in the registry :param None: :returns: Indicates if the object is contained in the registry :rtype: bool :raises RuntimeError: Raised if there's a server-side error thrown .. note:: This will return True if the object is registered itself or as a component of another object ..
py:method:: is_sorted() -> numpy.bool_ Return True iff the array is monotonically non-decreasing. :param None: :returns: Indicates if the array is monotonically non-decreasing :rtype: bool :raises TypeError: Raised if pda is not a pdarray instance :raises RuntimeError: Raised if there's a server-side error thrown .. py:attribute:: itemsize .. py:method:: max() -> arkouda.numpy.dtypes.numpy_scalars Return the maximum value of the array. .. py:property:: max_bits .. py:method:: maxk(k: arkouda.numpy.dtypes.int_scalars) -> pdarray Compute the maximum "k" values. :param k: The desired count of maximum values to be returned by the output. :type k: int_scalars :returns: The maximum `k` values from pda :rtype: pdarray, int :raises TypeError: Raised if pda is not a pdarray .. py:method:: mean() -> numpy.float64 Return the mean of the array. .. py:method:: min() -> arkouda.numpy.dtypes.numpy_scalars Return the minimum value of the array. .. py:method:: mink(k: arkouda.numpy.dtypes.int_scalars) -> pdarray Compute the minimum "k" values. :param k: The desired count of minimum values to be returned by the output. :type k: int_scalars :returns: The minimum `k` values from pda :rtype: pdarray, int :raises TypeError: Raised if pda is not a pdarray .. py:attribute:: name .. py:property:: nbytes The size of the pdarray in bytes. :returns: The size of the pdarray in bytes. :rtype: int .. py:attribute:: ndim .. py:attribute:: objType :value: 'pdarray' .. py:method:: opeq(other, op) .. py:method:: parity() -> pdarray Find the parity (XOR of all bits) in each element. See `ak.parity`. .. py:method:: popcount() -> pdarray Find the population (number of bits set) in each element. See `ak.popcount`. .. py:method:: pretty_print_info() -> None Prints information about all components of self in a human readable format :param None: :rtype: None .. py:method:: prod() -> numpy.float64 Return the product of all elements in the array. Return value is always a np.float64 or np.int64. ..
py:method:: register(user_defined_name: str) -> pdarray Register this pdarray with a user defined name in the arkouda server so it can be attached to later using pdarray.attach(). This is an in-place operation; registering a pdarray more than once will update the name in the registry and remove the previously registered name. A name can only be registered to one pdarray at a time. :param user_defined_name: user defined name array is to be registered under :type user_defined_name: str :returns: The same pdarray which is now registered with the arkouda server and has an updated name. This is an in-place modification; the original is returned to support a fluent programming style. Please note you cannot register two different pdarrays with the same name. :rtype: pdarray :raises TypeError: Raised if user_defined_name is not a str :raises RegistrationError: If the server was unable to register the pdarray with the user_defined_name. If the user is attempting to register more than one pdarray with the same name, the former should be unregistered first to free up the registration name. .. seealso:: :obj:`attach`, :obj:`unregister`, :obj:`is_registered`, :obj:`list_registry`, :obj:`unregister_pdarray_by_name` .. rubric:: Notes Registered names/pdarrays in the server are immune to deletion until they are unregistered. .. rubric:: Examples >>> a = ak.zeros(100) >>> a.register("my_zeros") >>> # potentially disconnect from server and reconnect to server >>> b = ak.pdarray.attach("my_zeros") >>> # ...other work... >>> b.unregister() .. py:attribute:: registered_name :type: Optional[str] :value: None .. py:method:: reshape(*shape) Gives a new shape to an array without changing its data. :param shape: The new shape should be compatible with the original shape. :type shape: int, tuple of ints, or pdarray :returns: a pdarray with the same data, reshaped to the new shape :rtype: pdarray .. py:method:: rotl(other) -> pdarray Rotate bits left by `other`. ..
py:method:: rotr(other) -> pdarray Rotate bits right by `other`. .. py:method:: save(prefix_path: str, dataset: str = 'array', mode: str = 'truncate', compression: Optional[str] = None, file_format: str = 'HDF5', file_type: str = 'distribute') -> str DEPRECATED Save the pdarray to HDF5 or Parquet. The result is a collection of files, one file per locale of the arkouda server, where each filename starts with prefix_path. HDF5 supports single files, in which case the file name will only be that provided. Each locale saves its chunk of the array to its corresponding file. :param prefix_path: Directory and filename prefix that all output files share :type prefix_path: str :param dataset: Name of the dataset to create in files (must not already exist) :type dataset: str :param mode: By default, truncate (overwrite) output files, if they exist. If 'append', attempt to create new dataset in existing files. :type mode: str {'truncate' | 'append'} :param compression: (None | "snappy" | "gzip" | "brotli" | "zstd" | "lz4") Sets the compression type used with Parquet files :type compression: str (Optional) :param file_format: By default, saved files will be written to the HDF5 file format. If 'Parquet', the files will be written to the Parquet file format. This is case insensitive. :type file_format: str {'HDF5', 'Parquet'} :param file_type: Default: "distribute". When set to single, the dataset is written to a single file. When distribute, the dataset is written to one file per locale. This is only supported by HDF5 files and will have no impact on Parquet files.
:type file_type: str ("single" | "distribute") :rtype: string message indicating result of save operation :raises RuntimeError: Raised if a server-side error is thrown saving the pdarray :raises ValueError: Raised if there is an error in parsing the prefix path pointing to file write location or if the mode parameter is neither truncate nor append :raises TypeError: Raised if any one of the prefix_path, dataset, or mode parameters is not a string .. seealso:: :obj:`save_all`, :obj:`load`, :obj:`read`, :obj:`to_parquet`, :obj:`to_hdf` .. rubric:: Notes The prefix_path must be visible to the arkouda server and the user must have write permission. Output files have names of the form ``<prefix_path>_LOCALE<i>``, where ``<i>`` ranges from 0 to ``numLocales``. If any of the output files already exist and the mode is 'truncate', they will be overwritten. If the mode is 'append' and the number of output files is less than the number of locales or a dataset with the same name already exists, a ``RuntimeError`` will result. Previously all files saved in Parquet format were saved with a ``.parquet`` file extension. This will require you to use load as if you saved the file with the extension. Try this if an older file is not being found. Any file extension can be used. The file I/O does not rely on the extension to determine the file format. ..
rubric:: Examples >>> a = ak.arange(25) >>> # Saving without an extension >>> a.save('path/prefix', dataset='array') Saves the array to numLocales HDF5 files with the name ``cwd/path/name_prefix_LOCALE####`` >>> # Saving with an extension (HDF5) >>> a.save('path/prefix.h5', dataset='array') Saves the array to numLocales HDF5 files with the name ``cwd/path/name_prefix_LOCALE####.h5`` where #### is replaced by each locale number >>> # Saving with an extension (Parquet) >>> a.save('path/prefix.parquet', dataset='array', file_format='Parquet') Saves the array in numLocales Parquet files with the name ``cwd/path/name_prefix_LOCALE####.parquet`` where #### is replaced by each locale number .. py:attribute:: shape .. py:attribute:: size .. py:method:: slice_bits(low, high) -> pdarray Returns a pdarray containing only bits from low to high of self. This is zero indexed and inclusive on both ends, so slicing the bottom 64 bits is pda.slice_bits(0, 63) :param low: The lowest bit included in the slice (inclusive) zero indexed, so the first bit is 0 :type low: int :param high: The highest bit included in the slice (inclusive) :type high: int :returns: A new pdarray containing the bits of self from low to high :rtype: pdarray :raises RuntimeError: Raised if there is a server-side error thrown .. rubric:: Examples >>> p = ak.array([2**65 + (2**64 - 1)]) >>> bin(p[0]) '0b101111111111111111111111111111111111111111111111111111111111111111' >>> bin(p.slice_bits(64, 65)[0]) '0b10' .. py:method:: std(ddof: arkouda.numpy.dtypes.int_scalars = 0) -> numpy.float64 Compute the standard deviation. See ``arkouda.std`` for details. :param ddof: "Delta Degrees of Freedom" used in calculating std :type ddof: int_scalars :returns: The scalar standard deviation of the array :rtype: np.float64 :raises TypeError: Raised if pda is not a pdarray instance :raises RuntimeError: Raised if there's a server-side error thrown .. 
py:method:: sum() -> arkouda.numpy.dtypes.numeric_and_bool_scalars Return the sum of all elements in the array. .. py:method:: to_csv(prefix_path: str, dataset: str = 'array', col_delim: str = ',', overwrite: bool = False) Write pdarray to CSV file(s). File will contain a single column with the pdarray data. All CSV Files written by Arkouda include a header denoting data types of the columns. Parameters ----------- prefix_path: str The filename prefix to be used for saving files. Files will have _LOCALE#### appended when they are written to disk. dataset: str Column name to save the pdarray under. Defaults to "array". col_delim: str Defaults to ",". Value to be used to separate columns within the file. Please be sure that the value used DOES NOT appear in your dataset. overwrite: bool Defaults to False. If True, any existing files matching your provided prefix_path will be overwritten. If False, an error will be returned if existing files are found. Returns -------- str response message Raises ------ ValueError Raised if all datasets are not present in all parquet files or if one or more of the specified files do not exist RuntimeError Raised if one or more of the specified files cannot be opened. If `allow_errors` is true this may be raised if no values are returned from the server. TypeError Raised if we receive an unknown arkouda_type returned from the server Notes ------ - CSV format is not currently supported by load/load_all operations - The column delimiter is expected to be the same for column names and data - Be sure that column delimiters are not found within your data. - All CSV files must delimit rows using newline (``\n``) at this time. .. py:method:: to_cuda() Convert the array to a Numba DeviceND array, transferring array data from the arkouda server to Python via ndarray. If the array exceeds a builtin size limit, a RuntimeError is raised.
:returns: A Numba ndarray with the same attributes and data as the pdarray; on GPU :rtype: numba.DeviceNDArray :raises ImportError: Raised if CUDA is not available :raises ModuleNotFoundError: Raised if Numba is either not installed or not enabled :raises RuntimeError: Raised if there is a server-side error thrown in the course of retrieving the pdarray. .. rubric:: Notes The number of bytes in the array cannot exceed ``client.maxTransferBytes``, otherwise a ``RuntimeError`` will be raised. This is to protect the user from overflowing the memory of the system on which the Python client is running, under the assumption that the server is running on a distributed system with much more memory than the client. The user may override this limit by setting client.maxTransferBytes to a larger value, but proceed with caution. .. seealso:: :obj:`array` .. rubric:: Examples >>> a = ak.arange(0, 5, 1) >>> a.to_cuda() array([0, 1, 2, 3, 4]) >>> type(a.to_cuda()) numpy.devicendarray .. py:method:: to_hdf(prefix_path: str, dataset: str = 'array', mode: str = 'truncate', file_type: str = 'distribute') -> str Save the pdarray to HDF5. The object can be saved to a collection of files or a single file. :param prefix_path: Directory and filename prefix that all output files share :type prefix_path: str :param dataset: Name of the dataset to create in files (must not already exist) :type dataset: str :param mode: By default, truncate (overwrite) output files, if they exist. If 'append', attempt to create new dataset in existing files. :type mode: str {'truncate' | 'append'} :param file_type: Default: "distribute". When set to "single", the dataset is written to a single file. When set to "distribute", the dataset is written to one file per locale. This is only supported by HDF5 files and has no impact on Parquet files.
:type file_type: str ("single" | "distribute") :rtype: string message indicating result of save operation :raises RuntimeError: Raised if a server-side error is thrown saving the pdarray .. rubric:: Notes - The prefix_path must be visible to the arkouda server and the user must have write permission. - Output files have names of the form ``<prefix_path>_LOCALE<i>``, where ``<i>`` ranges from 0 to ``numLocales`` for `file_type='distribute'`. Otherwise, the file name will be `prefix_path`. - If any of the output files already exist and the mode is 'truncate', they will be overwritten. If the mode is 'append' and the number of output files is less than the number of locales or a dataset with the same name already exists, a ``RuntimeError`` will result. - Any file extension can be used. The file I/O does not rely on the extension to determine the file format. .. rubric:: Examples >>> a = ak.arange(25) >>> # Saving without an extension >>> a.to_hdf('path/prefix', dataset='array') Saves the array to numLocales HDF5 files with the name ``cwd/path/name_prefix_LOCALE####`` >>> # Saving with an extension (HDF5) >>> a.to_hdf('path/prefix.h5', dataset='array') Saves the array to numLocales HDF5 files with the name ``cwd/path/name_prefix_LOCALE####.h5`` where #### is replaced by each locale number >>> # Saving to a single file >>> a.to_hdf('path/prefix.hdf5', dataset='array', file_type='single') Saves the array to a single HDF5 file on the root node: ``cwd/path/name_prefix.hdf5`` .. py:method:: to_list() -> List Convert the array to a list, transferring array data from the Arkouda server to client-side Python. Note: if the pdarray size exceeds client.maxTransferBytes, a RuntimeError is raised. :returns: A list with the same data as the pdarray :rtype: list :raises RuntimeError: Raised if there is a server-side error thrown, if the pdarray size exceeds the built-in client.maxTransferBytes size limit, or if the number of bytes received does not match the expected number of bytes ..
rubric:: Notes The number of bytes in the array cannot exceed ``client.maxTransferBytes``, otherwise a ``RuntimeError`` will be raised. This is to protect the user from overflowing the memory of the system on which the Python client is running, under the assumption that the server is running on a distributed system with much more memory than the client. The user may override this limit by setting client.maxTransferBytes to a larger value, but proceed with caution. .. seealso:: :obj:`to_ndarray` .. rubric:: Examples >>> a = ak.arange(0, 5, 1) >>> a.to_list() [0, 1, 2, 3, 4] >>> type(a.to_list()) list .. py:method:: to_ndarray() -> numpy.ndarray Convert the array to a np.ndarray, transferring array data from the Arkouda server to client-side Python. Note: if the pdarray size exceeds client.maxTransferBytes, a RuntimeError is raised. :returns: A numpy ndarray with the same attributes and data as the pdarray :rtype: np.ndarray :raises RuntimeError: Raised if there is a server-side error thrown, if the pdarray size exceeds the built-in client.maxTransferBytes size limit, or if the number of bytes received does not match the expected number of bytes .. rubric:: Notes The number of bytes in the array cannot exceed ``client.maxTransferBytes``, otherwise a ``RuntimeError`` will be raised. This is to protect the user from overflowing the memory of the system on which the Python client is running, under the assumption that the server is running on a distributed system with much more memory than the client. The user may override this limit by setting client.maxTransferBytes to a larger value, but proceed with caution. .. seealso:: :obj:`array`, :obj:`to_list` .. rubric:: Examples >>> a = ak.arange(0, 5, 1) >>> a.to_ndarray() array([0, 1, 2, 3, 4]) >>> type(a.to_ndarray()) numpy.ndarray .. py:method:: to_parquet(prefix_path: str, dataset: str = 'array', mode: str = 'truncate', compression: Optional[str] = None) -> str Save the pdarray to Parquet.
The result is a collection of files, one file per locale of the arkouda server, where each filename starts with prefix_path. Each locale saves its chunk of the array to its corresponding file. :param prefix_path: Directory and filename prefix that all output files share :type prefix_path: str :param dataset: Name of the dataset to create in files (must not already exist) :type dataset: str :param mode: By default, truncate (overwrite) output files, if they exist. If 'append', attempt to create new dataset in existing files. :type mode: str {'truncate' | 'append'} :param compression: (None | "snappy" | "gzip" | "brotli" | "zstd" | "lz4") Sets the compression type used with Parquet files :type compression: str (Optional) :rtype: string message indicating result of save operation :raises RuntimeError: Raised if a server-side error is thrown saving the pdarray .. rubric:: Notes - The prefix_path must be visible to the arkouda server and the user must have write permission. - Output files have names of the form ``<prefix_path>_LOCALE<i>``, where ``<i>`` ranges from 0 to ``numLocales`` for `file_type='distribute'`. - 'append' write mode is supported, but is not efficient. - If any of the output files already exist and the mode is 'truncate', they will be overwritten. If the mode is 'append' and the number of output files is less than the number of locales or a dataset with the same name already exists, a ``RuntimeError`` will result. - Any file extension can be used. The file I/O does not rely on the extension to determine the file format. .. rubric:: Examples >>> a = ak.arange(25) >>> # Saving without an extension >>> a.to_parquet('path/prefix', dataset='array') Saves the array to numLocales Parquet files with the name ``cwd/path/name_prefix_LOCALE####`` >>> # Saving with an extension (Parquet) >>> a.to_parquet('path/prefix.parquet', dataset='array') Saves the array to numLocales Parquet files with the name ``cwd/path/name_prefix_LOCALE####.parquet`` where #### is replaced by each locale number ..
py:method:: transfer(hostname: str, port: arkouda.numpy.dtypes.int_scalars) Sends a pdarray to a different Arkouda server. :param hostname: The hostname where the Arkouda server intended to receive the pdarray is running. :type hostname: str :param port: The port to send the array over. This needs to be an open port (i.e., not one that the Arkouda server is running on). This will open `numLocales` ports in succession, so ports in the range {port..(port+numLocales)} will be used (e.g., when running an Arkouda server of 4 nodes and passing 1234 as `port`, Arkouda will use ports 1234, 1235, 1236, and 1237 to send the array data). This port must match the port passed to the call to `ak.receive_array()`. :type port: int_scalars :rtype: A message indicating a complete transfer :raises ValueError: Raised if the op is not within the pdarray.BinOps set :raises TypeError: Raised if other is not a pdarray or the pdarray.dtype is not a supported dtype .. py:method:: unregister() -> None Unregister a pdarray in the arkouda server which was previously registered using register() and/or attached to using attach() :rtype: None :raises RuntimeError: Raised if the server could not find the internal name/symbol to remove .. seealso:: :obj:`register`, :obj:`unregister`, :obj:`is_registered`, :obj:`unregister_pdarray_by_name`, :obj:`list_registry` .. rubric:: Notes Registered names/pdarrays in the server are immune to deletion until they are unregistered. .. rubric:: Examples >>> a = ak.zeros(100) >>> a.register("my_zeros") >>> # potentially disconnect from server and reconnect to server >>> b = ak.pdarray.attach("my_zeros") >>> # ...other work... >>> b.unregister() .. py:method:: update_hdf(prefix_path: str, dataset: str = 'array', repack: bool = True) Overwrite the dataset with the name provided with this pdarray.
If the dataset does not exist, it is added. :param prefix_path: Directory and filename prefix that all output files share :type prefix_path: str :param dataset: Name of the dataset to create in files :type dataset: str :param repack: Default: True. HDF5 does not release memory on delete. When True, the inaccessible data (that was overwritten) is removed. When False, the data remains, but is inaccessible. Setting to False will yield better performance, but will cause file sizes to expand. :type repack: bool :rtype: str - success message if successful :raises RuntimeError: Raised if a server-side error is thrown saving the pdarray .. rubric:: Notes - If the file does not contain a File_Format attribute to indicate how it was saved, the file name is checked for _LOCALE#### to determine if it is distributed. - If the dataset provided does not exist, it will be added .. py:method:: value_counts() Count the occurrences of the unique values of self. :returns: * **unique_values** (*pdarray*) -- The unique values, sorted in ascending order * **counts** (*pdarray, int64*) -- The number of times the corresponding unique value occurs .. rubric:: Examples >>> ak.array([2, 0, 2, 4, 0, 0]).value_counts() (array([0, 2, 4]), array([3, 2, 1])) .. py:method:: var(ddof: arkouda.numpy.dtypes.int_scalars = 0) -> numpy.float64 Compute the variance. See ``arkouda.var`` for details. :param ddof: "Delta Degrees of Freedom" used in calculating var :type ddof: int_scalars :returns: The scalar variance of the array :rtype: np.float64 :raises TypeError: Raised if pda is not a pdarray instance :raises ValueError: Raised if the ddof >= pdarray size :raises RuntimeError: Raised if there's a server-side error thrown .. py:function:: popcount(pda: pdarray) -> pdarray Find the population (number of bits set) for each integer in an array. :param pda: Input array (must be integral).
:type pda: pdarray, int64, uint64, bigint :returns: **population** -- The number of bits set (1) in each element :rtype: pdarray :raises TypeError: If input array is not int64, uint64, or bigint .. rubric:: Examples >>> A = ak.arange(10) >>> ak.popcount(A) array([0, 1, 1, 2, 1, 2, 2, 3, 1, 2]) .. py:function:: power(pda: pdarray, pwr: Union[int, float, pdarray], where: Union[bool, pdarray] = True) -> pdarray Raises an array to a power. If where is given, the operation will only take place in the positions where the where condition is True. Note: Our implementation of the where argument deviates from numpy. The difference in behavior occurs at positions where the where argument contains a False. In numpy, these positions will have uninitialized memory (which can contain anything and will vary between runs). We have chosen to instead return the value of the original array in these positions. :param pda: A pdarray of values that will be raised to a power (pwr) :type pda: pdarray :param pwr: The power(s) that pda is raised to :type pwr: integer, float, or pdarray :param where: This condition is broadcast over the input. At locations where the condition is True, the corresponding value will be raised to the respective power. Elsewhere, it will retain its original value. Default set to True. :type where: Boolean or pdarray :returns: pdarray Returns a pdarray of values raised to a power, under the boolean where condition. .. rubric:: Examples >>> a = ak.arange(5) >>> ak.power(a, 3) array([0, 1, 8, 27, 64]) >>> ak.power(a, 3, a % 2 == 0) array([0, 1, 8, 3, 64]) .. py:function:: prod(pda: pdarray) -> numpy.float64 Return the product of all elements in the array. Return value is always a np.float64 or np.int64 :param pda: Values for which to calculate the product :type pda: pdarray :returns: The product calculated from the pda :rtype: numpy_scalars :raises TypeError: Raised if pda is not a pdarray instance :raises RuntimeError: Raised if there's a server-side error thrown ..
py:function:: rotl(x, rot) -> pdarray Rotate bits of ``x`` to the left by ``rot``. :param x: Value(s) to rotate left. :type x: pdarray(int64/uint64) or integer :param rot: Amount(s) to rotate by. :type rot: pdarray(int64/uint64) or integer :returns: **rotated** -- The rotated elements of x. :rtype: pdarray(int64/uint64) :raises TypeError: If input array is not int64 or uint64 .. rubric:: Examples >>> A = ak.arange(10) >>> ak.rotl(A, A) array([0, 2, 8, 24, 64, 160, 384, 896, 2048, 4608]) .. py:function:: rotr(x, rot) -> pdarray Rotate bits of ``x`` to the right by ``rot``. :param x: Value(s) to rotate right. :type x: pdarray(int64/uint64) or integer :param rot: Amount(s) to rotate by. :type rot: pdarray(int64/uint64) or integer :returns: **rotated** -- The rotated elements of x. :rtype: pdarray(int64/uint64) :raises TypeError: If input array is not int64 or uint64 .. rubric:: Examples >>> A = ak.arange(10) >>> ak.rotr(1024 * A, A) array([0, 512, 512, 384, 256, 160, 96, 56, 32, 18]) .. py:function:: sqrt(pda: pdarray, where: Union[bool, pdarray] = True) -> pdarray Takes the square root of array. If where is given, the operation will only take place in the positions where the where condition is True. :param pda: A pdarray of values that will be square rooted :type pda: pdarray :param where: This condition is broadcast over the input. At locations where the condition is True, the corresponding value will be square rooted. Elsewhere, it will retain its original value. Default set to True. :type where: Boolean or pdarray :returns: A pdarray of square rooted values, under the boolean where condition. :rtype: pdarray .. rubric:: Examples >>> a = ak.arange(5) >>> ak.sqrt(a) array([0 1 1.4142135623730951 1.7320508075688772 2]) >>> ak.sqrt(a, ak.array([True, True, False, False, True])) array([0, 1, 2, 3, 2]) .. py:function:: std(pda: pdarray, ddof: arkouda.numpy.dtypes.int_scalars = 0) -> numpy.float64 Return the standard deviation of values in the array.
The standard deviation is implemented as the square root of the variance. :param pda: values for which to calculate the standard deviation :type pda: pdarray :param ddof: "Delta Degrees of Freedom" used in calculating std :type ddof: int_scalars :returns: The scalar standard deviation of the array :rtype: np.float64 :raises TypeError: Raised if pda is not a pdarray instance or ddof is not an integer :raises ValueError: Raised if ddof is an integer < 0 :raises RuntimeError: Raised if there's a server-side error thrown .. seealso:: :obj:`mean`, :obj:`var` .. rubric:: Notes The standard deviation is the square root of the average of the squared deviations from the mean, i.e., ``std = sqrt(mean((x - x.mean())**2))``. The average squared deviation is normally calculated as ``x.sum() / N``, where ``N = len(x)``. If, however, `ddof` is specified, the divisor ``N - ddof`` is used instead. In standard statistical practice, ``ddof=1`` provides an unbiased estimator of the variance of the infinite population. ``ddof=0`` provides a maximum likelihood estimate of the variance for normally distributed variables. The standard deviation computed in this function is the square root of the estimated variance, so even with ``ddof=1``, it will not be an unbiased estimate of the standard deviation per se. .. py:function:: sum(pda: pdarray) -> arkouda.numpy.dtypes.numeric_and_bool_scalars Return the sum of all elements in the array. :param pda: Values for which to calculate the sum :type pda: pdarray :returns: The sum of all elements in the array :rtype: np.float64 :raises TypeError: Raised if pda is not a pdarray instance :raises RuntimeError: Raised if there's a server-side error thrown .. 
py:function:: unregister_pdarray_by_name(user_defined_name: str) -> None Unregister a named pdarray in the arkouda server which was previously registered using register() and/or attached to using attach_pdarray() :param user_defined_name: user defined name which array was registered under :type user_defined_name: str :rtype: None :raises RuntimeError: Raised if the server could not find the internal name/symbol to remove .. seealso:: :obj:`register`, :obj:`unregister`, :obj:`is_registered`, :obj:`list_registry`, :obj:`attach` .. rubric:: Notes Registered names/pdarrays in the server are immune to deletion until they are unregistered. .. rubric:: Examples >>> a = ak.zeros(100) >>> a.register("my_zeros") >>> # potentially disconnect from server and reconnect to server >>> b = ak.attach_pdarray("my_zeros") >>> # ...other work... >>> ak.unregister_pdarray_by_name("my_zeros") .. py:function:: var(pda: pdarray, ddof: arkouda.numpy.dtypes.int_scalars = 0) -> numpy.float64 Return the variance of values in the array. :param pda: Values for which to calculate the variance :type pda: pdarray :param ddof: "Delta Degrees of Freedom" used in calculating var :type ddof: int_scalars :returns: The scalar variance of the array :rtype: np.float64 :raises TypeError: Raised if pda is not a pdarray instance :raises ValueError: Raised if the ddof >= pdarray size :raises RuntimeError: Raised if there's a server-side error thrown .. seealso:: :obj:`mean`, :obj:`std` .. rubric:: Notes The variance is the average of the squared deviations from the mean, i.e., ``var = mean((x - x.mean())**2)``. The mean is normally calculated as ``x.sum() / N``, where ``N = len(x)``. If, however, `ddof` is specified, the divisor ``N - ddof`` is used instead. In standard statistical practice, ``ddof=1`` provides an unbiased estimator of the variance of a hypothetical infinite population. ``ddof=0`` provides a maximum likelihood estimate of the variance for normally distributed variables.
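The effect of the ``N - ddof`` divisor described in the Notes can be checked on the client side without a running server. The following is a minimal pure-Python sketch of the formula only, not Arkouda's server-side implementation; the helper names ``variance`` and ``std_dev`` are hypothetical and exist purely for illustration.

```python
import math
import statistics


def variance(values, ddof=0):
    """Sum of squared deviations from the mean, divided by N - ddof."""
    n = len(values)
    if ddof >= n:
        # Mirrors the documented ValueError condition (ddof >= array size).
        raise ValueError("ddof must be smaller than the number of values")
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - ddof)


def std_dev(values, ddof=0):
    """Standard deviation as the square root of the variance."""
    return math.sqrt(variance(values, ddof))


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

population_var = variance(data)        # divisor N = 8
sample_var = variance(data, ddof=1)    # divisor N - 1 = 7
population_std = std_dev(data)

print(population_var, sample_var, population_std)
```

With ``ddof=0`` the result agrees with ``statistics.pvariance`` from the Python standard library, and with ``ddof=1`` it agrees with ``statistics.variance``, which is a convenient way to sanity-check values returned by ``ak.var`` and ``ak.std`` on small arrays.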