Creating Arrays

There are several ways to initialize arkouda pdarray objects, most of which come from NumPy.

Constant

arkouda.zeros(size, dtype=<class 'numpy.float64'>, max_bits=None)

Create a pdarray filled with zeros.

Parameters:

size (int_scalars or tuple of int_scalars) – Size or shape of the array
dtype (all_scalars) – Type of resulting array, default ak.float64
max_bits (int) – Specifies the maximum number of bits; only used for bigint pdarrays. Included for consistency, as zeros are represented as all zeros regardless of the value of max_bits.

Returns:

Zeros of the requested size or shape and dtype

Return type:

pdarray

Raises:

TypeError – Raised if the supplied dtype is not supported
RuntimeError – Raised if the size parameter is neither an int nor a str that is parseable to an int.
ValueError – Raised if the rank of the given shape is not in get_array_ranks() or is empty, or if max_bits is not None and ndim does not equal 1

See also

ones, zeros_like

Examples

>>> import arkouda as ak
>>> ak.zeros(5, dtype=ak.int64)
array([0 0 0 0 0])

>>> ak.zeros(5, dtype=ak.float64)
array([0.00000000000000000 0.00000000000000000 0.00000000000000000
       0.00000000000000000 0.00000000000000000])

>>> ak.zeros(5, dtype=ak.bool_)
array([False False False False False])

arkouda.ones(size, dtype=<class 'numpy.float64'>, max_bits=None)

Create a pdarray filled with ones.

Parameters:

size (int_scalars or tuple of int_scalars) – Size or shape of the array
dtype (Union[float64, int64, bool]) – Resulting array type, default ak.float64
max_bits (int) – Specifies the maximum number of bits; only used for bigint pdarrays. Included for consistency, as a one is represented as all zero bits ending in a single one bit, regardless of max_bits.

Returns:

Ones of the requested size or shape and dtype

Return type:

pdarray

Raises:

TypeError – Raised if the supplied dtype is not supported
RuntimeError – Raised if the size parameter is neither an int nor a str that is parseable to an int.
ValueError – Raised if the rank of the given shape is not in get_array_ranks() or is empty

See also

zeros, ones_like

Examples

>>> import arkouda as ak
>>> ak.ones(5, dtype=ak.int64)
array([1 1 1 1 1])

>>> ak.ones(5, dtype=ak.float64)
array([1.00000000000000000 1.00000000000000000 1.00000000000000000
       1.00000000000000000 1.00000000000000000])

>>> ak.ones(5, dtype=ak.bool_)
array([True True True True True])

Notes

Logic for generating the pdarray is delegated to the ak.full method.
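The max_bits remarks for zeros and ones can be illustrated with a small pure-Python model. This is only a sketch: it assumes that max_bits acts by keeping the low max_bits bits of each value, which the text above does not spell out, and truncate_to_max_bits is a hypothetical helper rather than an Arkouda function.

```python
def truncate_to_max_bits(value, max_bits):
    """Keep only the low `max_bits` bits of a non-negative int.

    Assumed model of bigint truncation; not part of the Arkouda API.
    """
    if max_bits < 1:
        raise ValueError("max_bits must be at least 1")
    return value & ((1 << max_bits) - 1)


# Zero and one fit in a single bit, so no positive max_bits changes them.
for bits in (1, 8, 64, 200):
    assert truncate_to_max_bits(0, bits) == 0
    assert truncate_to_max_bits(1, bits) == 1

# A larger value, by contrast, would be affected under this model.
wrapped = truncate_to_max_bits(2**64 + 5, 64)
```

Under this assumption, the constant arrays come out the same for every max_bits, which is consistent with the parameter being described as present only for consistency.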

arkouda.zeros_like(pda)

Create a zero-filled pdarray of the same size and dtype as an existing pdarray.

Parameters:

pda (pdarray) – Array to use for size and dtype

Returns:

Equivalent to ak.zeros(pda.size, pda.dtype)

Return type:

pdarray

Raises:

TypeError – Raised if the pda parameter is not a pdarray.

See also

zeros, ones_like

Examples

>>> import arkouda as ak
>>> ak.zeros_like(ak.ones(5,dtype=ak.int64))
array([0 0 0 0 0])

>>> ak.zeros_like(ak.ones(5,dtype=ak.float64))
array([0.00000000000000000 0.00000000000000000 0.00000000000000000
       0.00000000000000000 0.00000000000000000])

>>> ak.zeros_like(ak.ones(5,dtype=ak.bool_))
array([False False False False False])

arkouda.ones_like(pda)

Create a one-filled pdarray of the same size and dtype as an existing pdarray.

Parameters:

pda (pdarray) – Array to use for size and dtype

Returns:

Equivalent to ak.ones(pda.size, pda.dtype)

Return type:

pdarray

Raises:

TypeError – Raised if the pda parameter is not a pdarray.

See also

ones, zeros_like

Notes

Logic for generating the pdarray is delegated to the ak.ones method. Accordingly, the supported dtypes are those defined by the ak.ones method.

Examples

>>> import arkouda as ak
>>> ak.ones_like(ak.zeros(5,dtype=ak.int64))
array([1 1 1 1 1])

>>> ak.ones_like(ak.zeros(5,dtype=ak.float64))
array([1.00000000000000000 1.00000000000000000 1.00000000000000000
       1.00000000000000000 1.00000000000000000])

>>> ak.ones_like(ak.zeros(5,dtype=ak.bool_))
array([True True True True True])

Regular

arkouda.arange(*args, dtype=None, max_bits=None)

arange([start,] stop[, step,] dtype=int64)

Create a pdarray of evenly spaced integers within the half-open interval [start, stop). If only one argument is given, it is the stop parameter. If two arguments are given, the first is start and the second is stop. If three arguments are given, the first is start, the second is stop, and the third is step.

The return value is cast to type dtype.

Parameters:

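start (int_scalars, optional) – Start of the interval (inclusive); defaults to 0 when omitted.
stop (int_scalars, optional) – End of the interval (exclusive).
step (int_scalars, optional) – Spacing between values; defaults to 1 when omitted. If one of these three is supplied, it is used as stop, with start = 0 and step = 1. If two are supplied, they are used as start and stop, with step = 1. If all three are supplied, they are used as start, stop, and step.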
dtype (np.dtype, type, or str) – The target dtype to cast values to
max_bits (int) – Specifies the maximum number of bits; only used for bigint pdarrays

Returns:

Integers from start (inclusive) to stop (exclusive) by step

Return type:

pdarray

Raises:

ValueError – Raised if none of start, stop, step was supplied
TypeError – Raised if start, stop, or step is not an int object
ZeroDivisionError – Raised if step == 0

See also

linspace, zeros, ones, randint

Notes

Negative steps result in decreasing values. Currently, only int64 pdarrays can be created with this method. For float64 arrays, use the linspace method.

Examples

>>> import arkouda as ak
>>> ak.arange(0, 5, 1)
array([0 1 2 3 4])

>>> ak.arange(5, 0, -1)
array([5 4 3 2 1])

>>> ak.arange(0, 10, 2)
array([0 2 4 6 8])

>>> ak.arange(-5, -10, -1)
array([-5 -6 -7 -8 -9])
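The way positional arguments map onto start, stop, and step can be sketched in pure Python with the built-in range, which follows the same half-open convention. This is a local illustration only, not Arkouda's implementation: resolve_arange_args and arange_values are hypothetical helpers, and rejecting more than three arguments is an assumption not stated above.

```python
def resolve_arange_args(*args):
    """Hypothetical helper: map 1-3 positional ints to (start, stop, step)."""
    if len(args) == 0:
        raise ValueError("at least one of start, stop, step must be supplied")
    if len(args) > 3:
        # Assumption: the documentation does not describe this case.
        raise ValueError("at most three positional arguments are accepted")
    if not all(isinstance(a, int) and not isinstance(a, bool) for a in args):
        raise TypeError("start, stop, and step must be ints")
    if len(args) == 1:
        start, stop, step = 0, args[0], 1
    elif len(args) == 2:
        start, stop, step = args[0], args[1], 1
    else:
        start, stop, step = args
    if step == 0:
        raise ZeroDivisionError("step must be nonzero")
    return start, stop, step


def arange_values(*args):
    """Return the integers an arange-style call would cover, as a list."""
    return list(range(*resolve_arange_args(*args)))


print(arange_values(5))            # [0, 1, 2, 3, 4]
print(arange_values(2, 5))         # [2, 3, 4]
print(arange_values(-5, -10, -1))  # [-5, -6, -7, -8, -9]
```

The sketch returns a Python list; the real function returns a distributed pdarray and additionally applies the dtype cast.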

arkouda.linspace(start, stop, num=50, endpoint=True, dtype=<class 'numpy.float64'>, axis=0)

Return evenly spaced numbers over a specified interval.

Returns num evenly spaced samples, calculated over the interval [start, stop].

The endpoint of the interval can optionally be excluded.

Parameters:

start (Union[numeric_scalars, pdarray]) – The starting value of the sequence.
stop (Union[numeric_scalars, pdarray]) – The end value of the sequence, unless endpoint is set to False. In that case, the sequence consists of all but the last of num + 1 evenly spaced samples, so that stop is excluded. Note that the step size changes when endpoint is False.
num (int, optional) – Number of samples to generate. Default is 50. Must be non-negative.
endpoint (bool, optional) – If True, stop is the last sample. Otherwise, it is not included. Default is True.
dtype (dtype, optional) – Allowed for compatibility with numpy linspace, but anything entered is ignored. The output is always ak.float64.
axis (int, optional) – The axis in the result to store the samples. Relevant only if start or stop are array-like. By default (0), the samples will be along a new axis inserted at the beginning. Use -1 to get an axis at the end.

Returns:

num equally spaced samples in the closed interval [start, stop] if endpoint is True, or in the half-open interval [start, stop) if endpoint is False.

Return type:

pdarray

Raises:

TypeError – Raised if start or stop is not a float or a pdarray, or if num is not an int, or if endpoint is not a bool, or if dtype is anything other than None or float64, or axis is not an integer.
ValueError – Raised if axis is not a valid axis for the given data.

Examples

>>> import arkouda as ak
>>> ak.linspace(0,1,3)
array([0.00000000000000000 0.5 1.00000000000000000])
>>> ak.linspace(1,0,3)
array([1.00000000000000000 0.5 0.00000000000000000])
>>> ak.linspace(0,1,3,endpoint=False)
array([0.00000000000000000 0.33333333333333331 0.66666666666666663])
>>> ak.linspace(0,ak.array([2,3]),3)
array([array([0.00000000000000000 0.00000000000000000])
    array([1.00000000000000000 1.5]) array([2.00000000000000000 3.00000000000000000])])
>>> ak.linspace(ak.array([0,1]),3,3)
array([array([0.00000000000000000 1.00000000000000000])
    array([1.5 2.00000000000000000]) array([3.00000000000000000 3.00000000000000000])])
>>> ak.linspace(ak.array([0,1]),ak.array([2,3]),3)
array([array([0.00000000000000000 1.00000000000000000])
    array([1.00000000000000000 2.00000000000000000])
    array([2.00000000000000000 3.00000000000000000])])
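The effect of endpoint on the step size can be sketched for scalar start and stop in pure Python. This is a simplified model of the usual linspace arithmetic, not Arkouda's server-side code; linspace_values is a hypothetical name, and the pdarray and axis cases are not covered.

```python
def linspace_values(start, stop, num=50, endpoint=True):
    """Hypothetical scalar-only model of linspace, returning a list of floats."""
    if num < 0:
        raise ValueError("num must be non-negative")
    if num == 0:
        return []
    # With endpoint=True the interval is split into num - 1 gaps,
    # otherwise into num gaps, which is why the step size changes.
    divisions = num - 1 if endpoint else num
    if divisions == 0:
        return [float(start)]
    step = (stop - start) / divisions
    values = [float(start + i * step) for i in range(num)]
    if endpoint:
        values[-1] = float(stop)  # pin the last sample to stop exactly
    return values


print(linspace_values(0, 1, 3))  # [0.0, 0.5, 1.0]
print(linspace_values(1, 0, 3))  # [1.0, 0.5, 0.0]
thirds = linspace_values(0, 1, 3, endpoint=False)  # step is 1/3, stop excluded
```

With endpoint=False the three samples are spaced 1/3 apart rather than 1/2, matching the third example above.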

Random

arkouda.randint(low, high, size=1, dtype=<class 'numpy.int64'>, seed=None)

Generate a pdarray of randomized int, float, or bool values in a specified range bounded by the low and high parameters.

Parameters:

low (numeric_scalars) – The low value (inclusive) of the range
high (numeric_scalars) – The high value (exclusive for int, inclusive for float) of the range
size (int_scalars or tuple of int_scalars) – The size or shape of the returned array
dtype (Union[int64, float64, bool]) – The dtype of the array
seed (int_scalars, optional) – Index for where to pull the first returned value

Returns:

Values drawn uniformly from the specified range having the desired dtype

Return type:

pdarray

Raises:

TypeError – Raised if dtype.name is not in DTypes, size is not an int, low or high is not an int or float, or seed is not an int
ValueError – Raised if size < 0 or if high < low

Notes

Calling randint with dtype=float64 will result in uniform non-integral floating point values.

Ranges of size >= 2**64 result in undefined behavior because they exceed the maximum value that can be stored on the server (uint64).

Examples

>>> import arkouda as ak
>>> ak.randint(0, 10, 5, seed=1701)
array([6 5 1 6 3])

>>> ak.randint(0, 1, 3, seed=1701, dtype=ak.float64)
array([0.011410423448327005 0.73618171558685619 0.12367222192448891])

>>> ak.randint(0, 1, 5, seed=1701, dtype=ak.bool_)
array([False True False True False])
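The range conventions described above (high exclusive for integers, inclusive for floats) can be sketched locally with Python's random module. This is an illustration under stated assumptions: randint_values is a hypothetical helper, it does not reproduce Arkouda's random stream for a given seed, and bool is left out because the text above does not say how the range maps onto booleans.

```python
import random


def randint_values(low, high, size=1, dtype=int, seed=None):
    """Hypothetical local model of the int and float range conventions."""
    if size < 0 or high < low:
        raise ValueError("size must be >= 0 and high must be >= low")
    rng = random.Random(seed)  # private generator, so seeding is reproducible
    if dtype is int:
        # randrange excludes the upper bound, like the int case above.
        return [rng.randrange(low, high) for _ in range(size)]
    if dtype is float:
        # uniform may return either bound, like the float case above.
        return [rng.uniform(low, high) for _ in range(size)]
    raise TypeError("this sketch only models int and float")


ints = randint_values(0, 10, 5, seed=1701)
floats = randint_values(0, 1, 3, dtype=float, seed=1701)
```

Passing the same seed twice yields the same list in this sketch, which mirrors the reproducibility the seed parameter is meant to provide.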

Concatenation

Performance note: in multi-locale settings, the default (ordered) mode of concatenate is very communication-intensive because the distributions of the original and resulting arrays are unrelated, so most data must be moved non-locally. If the application does not require the concatenated array to be ordered (e.g. if the result is going to be sorted anyway), then passing the keyword ordered=False will greatly speed up concatenation by minimizing non-local data movement.

arkouda.concatenate(arrays, axis=0, ordered=True)

Concatenate a list or tuple of pdarray or Strings objects into one pdarray or Strings object, respectively.

Parameters:

arrays (Sequence[Union[pdarray,Strings,Categorical]]) – The arrays to concatenate. Must all have same dtype.
axis (int, default = 0) – The axis along which the arrays will be joined. If axis is None, arrays are flattened before use. Only for use with pdarray, and when ordered is True. Default is 0.
ordered (bool) – If True (default), the arrays will be appended in the order given. If False, array data may be interleaved in blocks, which can greatly improve performance but results in non-deterministic ordering of elements.

Returns:

Single pdarray or Strings object containing all values, returned in the original order when ordered is True

Return type:

Union[pdarray,Strings,Categorical]

Raises:

ValueError – Raised if arrays is empty or if pdarrays have differing dtypes
TypeError – Raised if arrays is not a Python Sequence (such as a list or tuple) of pdarray or Strings objects
RuntimeError – Raised if any array elements are dtypes for which concatenate has not been implemented.

Examples

>>> import arkouda as ak
>>> ak.concatenate([ak.array([1, 2, 3]), ak.array([4, 5, 6])])
array([1 2 3 4 5 6])

>>> ak.concatenate([ak.array([True,False,True]),ak.array([False,True,True])])
array([True False True False True True])

>>> ak.concatenate([ak.array(['one','two']),ak.array(['three','four','five'])])
array(['one', 'two', 'three', 'four', 'five'])
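The performance note at the top of this section says ordering does not matter when the result will be sorted anyway. The pure-Python sketch below illustrates why. The round-robin block interleaving is a hypothetical stand-in for ordered=False; the actual element order on a multi-locale server depends on how the data is distributed and is not specified here.

```python
def concat_ordered(arrays):
    """Append the inputs in the order given (models ordered=True)."""
    result = []
    for arr in arrays:
        result.extend(arr)
    return result


def concat_interleaved(arrays, block=2):
    """Hypothetical model of ordered=False: emit blocks round-robin."""
    if block < 1:
        raise ValueError("block must be at least 1")
    result = []
    offsets = [0] * len(arrays)
    remaining = sum(len(arr) for arr in arrays)
    while remaining:
        for i, arr in enumerate(arrays):
            chunk = arr[offsets[i]:offsets[i] + block]
            result.extend(chunk)
            offsets[i] += len(chunk)
            remaining -= len(chunk)
    return result


a, b = [1, 2, 3], [4, 5, 6]
ordered = concat_ordered([a, b])          # [1, 2, 3, 4, 5, 6]
interleaved = concat_interleaved([a, b])  # [1, 2, 4, 5, 3, 6]

# Same elements either way, so a later sort gives identical results.
assert sorted(ordered) == sorted(interleaved)
```

With a connected server, the corresponding call would be ak.concatenate([a, b], ordered=False), with a and b as pdarrays of the same dtype.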