Creating Arrays

There are several ways to initialize arkouda pdarray objects, most of which come from NumPy.

Constant

arkouda.zeros(size: int | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64 | str, dtype: dtype | type | str | BigInt = dtype('float64'), max_bits: int | None = None) → pdarray

Create a pdarray filled with zeros.

Parameters:
  • size (int_scalars) – Size of the array (only rank-1 arrays supported)

  • dtype (all_scalars) – Type of resulting array, default float64

  • max_bits (int) – Specifies the maximum number of bits; only used for bigint pdarrays

Returns:

Zeros of the requested size and dtype

Return type:

pdarray

Raises:

TypeError – Raised if the supplied dtype is not supported or if the size parameter is neither an int nor a str that is parseable to an int.

See also

ones, zeros_like

Examples

>>> ak.zeros(5, dtype=ak.int64)
array([0, 0, 0, 0, 0])
>>> ak.zeros(5, dtype=ak.float64)
array([0, 0, 0, 0, 0])
>>> ak.zeros(5, dtype=ak.bool)
array([False, False, False, False, False])

arkouda.ones(size: int | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64 | str, dtype: dtype | type | str | BigInt = dtype('float64'), max_bits: int | None = None) → pdarray

Create a pdarray filled with ones.

Parameters:
  • size (int_scalars) – Size of the array (only rank-1 arrays supported)

  • dtype (Union[float64, int64, bool]) – Resulting array type, default float64

  • max_bits (int) – Specifies the maximum number of bits; only used for bigint pdarrays

Returns:

Ones of the requested size and dtype

Return type:

pdarray

Raises:

TypeError – Raised if the supplied dtype is not supported or if the size parameter is neither an int nor a str that is parseable to an int.

See also

zeros, ones_like

Examples

>>> ak.ones(5, dtype=ak.int64)
array([1, 1, 1, 1, 1])
>>> ak.ones(5, dtype=ak.float64)
array([1, 1, 1, 1, 1])
>>> ak.ones(5, dtype=ak.bool)
array([True, True, True, True, True])

arkouda.zeros_like(pda: pdarray) → pdarray

Create a zero-filled pdarray of the same size and dtype as an existing pdarray.

Parameters:

pda (pdarray) – Array to use for size and dtype

Returns:

Equivalent to ak.zeros(pda.size, pda.dtype)

Return type:

pdarray

Raises:

TypeError – Raised if the pda parameter is not a pdarray.

See also

zeros, ones_like

Examples

>>> zeros = ak.zeros(5, dtype=ak.int64)
>>> ak.zeros_like(zeros)
array([0, 0, 0, 0, 0])
>>> zeros = ak.zeros(5, dtype=ak.float64)
>>> ak.zeros_like(zeros)
array([0, 0, 0, 0, 0])
>>> zeros = ak.zeros(5, dtype=ak.bool)
>>> ak.zeros_like(zeros)
array([False, False, False, False, False])

arkouda.ones_like(pda: pdarray) → pdarray

Create a one-filled pdarray of the same size and dtype as an existing pdarray.

Parameters:

pda (pdarray) – Array to use for size and dtype

Returns:

Equivalent to ak.ones(pda.size, pda.dtype)

Return type:

pdarray

Raises:

TypeError – Raised if the pda parameter is not a pdarray.

See also

ones, zeros_like

Notes

Logic for generating the pdarray is delegated to the ak.ones method. Accordingly, the supported dtypes match those defined by the ak.ones method.

Examples

>>> ones = ak.ones(5, dtype=ak.int64)
>>> ak.ones_like(ones)
array([1, 1, 1, 1, 1])
>>> ones = ak.ones(5, dtype=ak.float64)
>>> ak.ones_like(ones)
array([1, 1, 1, 1, 1])
>>> ones = ak.ones(5, dtype=ak.bool)
>>> ak.ones_like(ones)
array([True, True, True, True, True])

Regular

arkouda.arange([start, ]stop, [stride, ]dtype=int64)

Create a pdarray of evenly spaced integers within the half-open interval [start, stop). If one argument is given, it is the stop parameter. If two arguments are given, the first is start and the second is stop. If three arguments are given, they are start, stop, and stride, in that order.

The return value is cast to type dtype.

Parameters:
  • start (int_scalars, optional) – Starting value (inclusive)

  • stop (int_scalars) – Stopping value (exclusive)

  • stride (int_scalars, optional) – The difference between consecutive elements. The default stride is 1. If stride is specified, then start must also be specified.

  • dtype (np.dtype, type, or str) – The target dtype to cast values to

  • max_bits (int) – Specifies the maximum number of bits; only used for bigint pdarrays

Returns:

Integers from start (inclusive) to stop (exclusive) by stride

Return type:

pdarray, dtype

Raises:
  • TypeError – Raised if start, stop, or stride is not an int object

  • ZeroDivisionError – Raised if stride == 0

See also

linspace, zeros, ones, randint

Notes

Negative strides result in decreasing values. Currently, only int64 pdarrays can be created with this method. For float64 arrays, use the linspace method.

Examples

>>> ak.arange(0, 5, 1)
array([0, 1, 2, 3, 4])
>>> ak.arange(5, 0, -1)
array([5, 4, 3, 2, 1])
>>> ak.arange(0, 10, 2)
array([0, 2, 4, 6, 8])
>>> ak.arange(-5, -10, -1)
array([-5, -6, -7, -8, -9])
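The positional-argument rules above can be sketched in plain Python. This is a minimal model built on the built-in range, not arkouda's implementation: arange_model is a hypothetical helper, and the default start of 0 for the one-argument form is an assumption carried over from NumPy.

```python
def arange_model(*args):
    """Hypothetical pure-Python model of arange's positional-argument rules."""
    if len(args) == 1:
        # One argument: it is stop. Assumption: start defaults to 0, as in NumPy.
        start, stop, stride = 0, args[0], 1
    elif len(args) == 2:
        # Two arguments: start, then stop. The stride defaults to 1.
        start, stop = args
        stride = 1
    elif len(args) == 3:
        # Three arguments: start, stop, stride.
        start, stop, stride = args
    else:
        raise TypeError("expected 1 to 3 positional arguments")
    if stride == 0:
        # Mirrors the documented ZeroDivisionError for stride == 0.
        raise ZeroDivisionError("stride must be nonzero")
    # The interval is half-open: start is included, stop is excluded.
    return list(range(start, stop, stride))


print(arange_model(5))         # [0, 1, 2, 3, 4]
print(arange_model(0, 10, 2))  # [0, 2, 4, 6, 8]
print(arange_model(5, 0, -1))  # [5, 4, 3, 2, 1]
```

Because stride is only recognized as the third positional argument, a stride cannot be supplied without also supplying start.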

arkouda.linspace(start: float | float64 | float32 | int | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64, stop: float | float64 | float32 | int | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64, length: int | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64) → pdarray

Create a pdarray of linearly-spaced floats in a closed interval.

Parameters:
  • start (numeric_scalars) – Start of interval (inclusive)

  • stop (numeric_scalars) – End of interval (inclusive)

  • length (int_scalars) – Number of points

Returns:

Array of evenly spaced float values along the interval

Return type:

pdarray, float64

Raises:

TypeError – Raised if start or stop is not a float or int or if length is not an int

See also

arange

Notes

If start is greater than stop, the pdarray values are generated in descending order.

Examples

>>> ak.linspace(0, 1, 5)
array([0, 0.25, 0.5, 0.75, 1])
>>> ak.linspace(start=1, stop=0, length=5)
array([1, 0.75, 0.5, 0.25, 0])
>>> ak.linspace(start=-5, stop=0, length=5)
array([-5, -3.75, -2.5, -1.25, 0])
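The closed-interval spacing can be worked through in plain Python. The sketch below is an illustrative model rather than arkouda's implementation: linspace_model is a hypothetical helper, it only covers length >= 2, and pinning the final point to stop is a design choice borrowed from NumPy to guard against rounding.

```python
def linspace_model(start, stop, length):
    """Hypothetical model: length evenly spaced floats from start to stop, inclusive."""
    if length < 2:
        # Shorter lengths are outside what this sketch models.
        raise ValueError("this sketch only models length >= 2")
    # length points on a closed interval are separated by length - 1 equal gaps.
    step = (stop - start) / (length - 1)
    values = [start + i * step for i in range(length)]
    # Pin the last point so that stop is included exactly despite rounding.
    values[-1] = float(stop)
    return values


print(linspace_model(0, 1, 5))   # [0.0, 0.25, 0.5, 0.75, 1.0]
print(linspace_model(1, 0, 5))   # [1.0, 0.75, 0.5, 0.25, 0.0]
print(linspace_model(-5, 0, 5))  # [-5.0, -3.75, -2.5, -1.25, 0.0]
```

The step is negative when start is greater than stop, which is why the values come out in descending order in that case.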

Random

arkouda.randint(low: float | float64 | float32 | int | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64, high: float | float64 | float32 | int | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64, size: int | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64 | Tuple[int | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64, ...] = 1, dtype=dtype('int64'), seed: int | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64 | None = None) → pdarray

Generate a pdarray of randomized int, float, or bool values in a specified range bounded by the low and high parameters.

Parameters:
  • low (numeric_scalars) – The low value (inclusive) of the range

  • high (numeric_scalars) – The high value (exclusive for int, inclusive for float) of the range

  • size (int_scalars) – The length of the returned array

  • dtype (Union[int64, float64, bool]) – The dtype of the array

  • seed (int_scalars, optional) – Seed to allow for reproducible random number generation

Returns:

Values drawn uniformly from the specified range having the desired dtype

Return type:

pdarray

Raises:
  • TypeError – Raised if dtype.name is not in DTypes, size is not an int, low or high is not an int or float, or seed is not an int

  • ValueError – Raised if size < 0 or if high < low

Notes

Calling randint with dtype=float64 will result in uniform non-integral floating point values.

A range of size >= 2**64 results in undefined behavior because it exceeds the maximum value that can be stored on the server (uint64).

Examples

>>> ak.randint(0, 10, 5)
array([5, 7, 4, 8, 3])
>>> ak.randint(0, 1, 3, dtype=ak.float64)
array([0.92176432277231968, 0.083130710959903542, 0.68894208386667544])
>>> ak.randint(0, 1, 5, dtype=ak.bool)
array([True, False, True, True, True])
>>> ak.randint(1, 5, 10, seed=2)
array([4, 3, 1, 3, 4, 4, 2, 4, 3, 2])
>>> ak.randint(1, 5, 3, dtype=ak.float64, seed=2)
array([2.9160772326374946, 4.353429832157099, 4.5392023718621486])
>>> ak.randint(1, 5, 10, dtype=ak.bool, seed=2)
array([False, True, True, True, True, False, True, True, True, True])
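The role of seed and the exclusive upper bound for integers can be illustrated with Python's standard random module. This is only an analogy: randint_model is a hypothetical helper, the standard-library generator is not the one the arkouda server uses, and the values it produces will not match the outputs shown above.

```python
import random


def randint_model(low, high, size, seed=None):
    """Hypothetical stand-in: size integers drawn uniformly from [low, high)."""
    # A private generator keeps the seed from affecting global random state.
    rng = random.Random(seed)
    # randrange excludes high, matching the integer case described above.
    return [rng.randrange(low, high) for _ in range(size)]


first = randint_model(1, 5, 10, seed=2)
second = randint_model(1, 5, 10, seed=2)

print(first == second)                  # True: the same seed reproduces the draw
print(all(1 <= v < 5 for v in first))   # True: low is included, high is excluded
```

Passing the same seed twice yields the same sequence, which is what makes seeded runs reproducible; leaving seed as None draws a fresh sequence each time.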

Concatenation

Performance note: in multi-locale settings, the default (ordered) mode of concatenate is very communication-intensive because the distributions of the original and resulting arrays are unrelated and most data must be moved non-locally. If the application does not require the concatenated array to be ordered (e.g. if the result is going to be sorted anyway), then using the keyword ordered=False will greatly speed up concatenation by minimizing non-local data movement.

arkouda.concatenate(arrays: Sequence[pdarray | Strings | Categorical], ordered: bool = True) → pdarray | Strings | Categorical

Concatenate a list or tuple of pdarray or Strings objects into one pdarray or Strings object, respectively.

Parameters:
  • arrays (Sequence[Union[pdarray,Strings,Categorical]]) – The arrays to concatenate. Must all have same dtype.

  • ordered (bool) – If True (default), the arrays will be appended in the order given. If False, array data may be interleaved in blocks, which can greatly improve performance but results in non-deterministic ordering of elements.

Returns:

Single pdarray or Strings object containing all values, returned in the original order when ordered=True

Return type:

Union[pdarray,Strings,Categorical]

Raises:
  • ValueError – Raised if arrays is empty or if any of the pdarrays have differing dtypes

  • TypeError – Raised if arrays is not a Python Sequence (such as a list or tuple) of pdarray or Strings objects

  • RuntimeError – Raised if any array elements have dtypes for which concatenate has not been implemented.

Examples

>>> ak.concatenate([ak.array([1, 2, 3]), ak.array([4, 5, 6])])
array([1, 2, 3, 4, 5, 6])
>>> ak.concatenate([ak.array([True,False,True]),ak.array([False,True,True])])
array([True, False, True, False, True, True])
>>> ak.concatenate([ak.array(['one','two']),ak.array(['three','four','five'])])
array(['one', 'two', 'three', 'four', 'five'])
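The difference between the two modes can be modeled in plain Python. The sketch below is a hypothetical illustration, not arkouda's implementation: concatenate_model, split_blocks, and num_blocks are invented names, each block stands in for the portion of an array held on one locale, and the interleaving here is deterministic whereas the real unordered result is not guaranteed to be.

```python
def split_blocks(values, num_blocks):
    """Split a list into num_blocks contiguous blocks of near-equal size."""
    base, extra = divmod(len(values), num_blocks)
    blocks, pos = [], 0
    for i in range(num_blocks):
        # The first `extra` blocks each take one additional element.
        width = base + (1 if i < extra else 0)
        blocks.append(values[pos:pos + width])
        pos += width
    return blocks


def concatenate_model(arrays, ordered=True, num_blocks=2):
    """Hypothetical model of ordered versus block-interleaved concatenation."""
    if ordered:
        # Ordered mode: append whole arrays in the order given.
        return [x for arr in arrays for x in arr]
    # Unordered mode: emit each array's block for locale 0, then locale 1, ...
    per_array = [split_blocks(list(arr), num_blocks) for arr in arrays]
    result = []
    for b in range(num_blocks):
        for blocks in per_array:
            result.extend(blocks[b])
    return result


a = [1, 2, 3, 4]
b = [5, 6, 7, 8]
in_order = concatenate_model([a, b])
interleaved = concatenate_model([a, b], ordered=False)

print(in_order)     # [1, 2, 3, 4, 5, 6, 7, 8]
print(interleaved)  # [1, 2, 5, 6, 3, 4, 7, 8]
print(sorted(interleaved) == sorted(in_order))  # True
```

Both modes return the same elements, so a later sort gives the same result either way. That is the situation in which ordered=False costs nothing.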