Creating Arrays
There are several ways to initialize arkouda pdarray objects, most of which come from NumPy.
Constant
- arkouda.zeros(size, dtype=<class 'numpy.float64'>, max_bits=None)
Create a pdarray filled with zeros.
- Parameters:
size (int_scalars or tuple of int_scalars) – Size of the array (only rank-1 arrays supported)
dtype (all_scalars) – Type of resulting array, default float64
max_bits (int) – Specifies the maximum number of bits; only used for bigint pdarrays
- Returns:
Zeros of the requested size and dtype
- Return type:
pdarray
- Raises:
TypeError – Raised if the supplied dtype is not supported or if the size parameter is neither an int nor a str that is parseable to an int.
Examples
>>> ak.zeros(5, dtype=ak.int64)
array([0, 0, 0, 0, 0])
>>> ak.zeros(5, dtype=ak.float64)
array([0, 0, 0, 0, 0])
>>> ak.zeros(5, dtype=ak.bool_)
array([False, False, False, False, False])
- arkouda.ones(size, dtype=<class 'numpy.float64'>, max_bits=None)
Create a pdarray filled with ones.
- Parameters:
size (int_scalars or tuple of int_scalars) – Size of the array (only rank-1 arrays supported)
dtype (Union[float64, int64, bool]) – Resulting array type, default float64
max_bits (int) – Specifies the maximum number of bits; only used for bigint pdarrays
- Returns:
Ones of the requested size and dtype
- Return type:
pdarray
- Raises:
TypeError – Raised if the supplied dtype is not supported or if the size parameter is neither an int nor a str that is parseable to an int.
Examples
>>> ak.ones(5, dtype=ak.int64)
array([1, 1, 1, 1, 1])
>>> ak.ones(5, dtype=ak.float64)
array([1, 1, 1, 1, 1])
>>> ak.ones(5, dtype=ak.bool_)
array([True, True, True, True, True])
- arkouda.zeros_like(pda)
Create a zero-filled pdarray of the same size and dtype as an existing pdarray.
- Parameters:
pda (pdarray) – Array to use for size and dtype
- Returns:
Equivalent to ak.zeros(pda.size, pda.dtype)
- Return type:
pdarray
- Raises:
TypeError – Raised if the pda parameter is not a pdarray.
Examples
>>> zeros = ak.zeros(5, dtype=ak.int64)
>>> ak.zeros_like(zeros)
array([0, 0, 0, 0, 0])
>>> zeros = ak.zeros(5, dtype=ak.float64)
>>> ak.zeros_like(zeros)
array([0, 0, 0, 0, 0])
>>> zeros = ak.zeros(5, dtype=ak.bool_)
>>> ak.zeros_like(zeros)
array([False, False, False, False, False])
- arkouda.ones_like(pda)
Create a one-filled pdarray of the same size and dtype as an existing pdarray.
- Parameters:
pda (pdarray) – Array to use for size and dtype
- Returns:
Equivalent to ak.ones(pda.size, pda.dtype)
- Return type:
pdarray
- Raises:
TypeError – Raised if the pda parameter is not a pdarray.
Notes
Logic for generating the pdarray is delegated to the ak.ones method. Accordingly, the supported dtypes match those defined by the ak.ones method.
Examples
>>> ones = ak.ones(5, dtype=ak.int64)
>>> ak.ones_like(ones)
array([1, 1, 1, 1, 1])
>>> ones = ak.ones(5, dtype=ak.float64)
>>> ak.ones_like(ones)
array([1, 1, 1, 1, 1])
>>> ones = ak.ones(5, dtype=ak.bool_)
>>> ak.ones_like(ones)
array([True, True, True, True, True])
Regular
- arkouda.arange(*args, **kwargs)
Create a pdarray of consecutive integers within the interval [start, stop). If only one arg is given then arg is the stop parameter. If two args are given, then the first arg is start and second is stop. If three args are given, then the first arg is start, second is stop, third is stride.
The return value is cast to the type given by dtype.
- Parameters:
start (int_scalars, optional) – Starting value (inclusive)
stop (int_scalars) – Stopping value (exclusive)
stride (int_scalars, optional) – The difference between consecutive elements. The default stride is 1. If stride is specified, then start must also be specified.
dtype (np.dtype, type, or str) – The target dtype to cast values to
max_bits (int) – Specifies the maximum number of bits; only used for bigint pdarrays
- Returns:
Integers from start (inclusive) to stop (exclusive) by stride
- Return type:
pdarray, dtype
- Raises:
TypeError – Raised if start, stop, or stride is not an int object
ZeroDivisionError – Raised if stride == 0
Notes
Negative strides result in decreasing values. Currently, only int64 pdarrays can be created with this method. For float64 arrays, use the linspace method.
Examples
>>> ak.arange(0, 5, 1)
array([0, 1, 2, 3, 4])
>>> ak.arange(5, 0, -1)
array([5, 4, 3, 2, 1])
>>> ak.arange(0, 10, 2)
array([0, 2, 4, 6, 8])
>>> ak.arange(-5, -10, -1)
array([-5, -6, -7, -8, -9])
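The positional-argument rules described for arange can be illustrated with a small pure-Python sketch. This is not arkouda's implementation: arange_sketch is a hypothetical helper that returns a plain list, and the default start of 0 for the one-argument form is an assumption borrowed from the NumPy convention.

```python
def arange_sketch(*args):
    """Illustrate how 1, 2, or 3 positional ints map to start, stop, stride."""
    if not 1 <= len(args) <= 3:
        raise TypeError("expected between 1 and 3 positional arguments")
    if not all(isinstance(a, int) for a in args):
        raise TypeError("start, stop, and stride must be ints")

    if len(args) == 1:
        # Only stop is given; start=0 is an assumption (NumPy convention).
        start, stop, stride = 0, args[0], 1
    elif len(args) == 2:
        start, stop, stride = args[0], args[1], 1
    else:
        start, stop, stride = args

    if stride == 0:
        raise ZeroDivisionError("stride must be nonzero")

    # range() already implements the half-open interval [start, stop).
    return list(range(start, stop, stride))


print(arange_sketch(5))
print(arange_sketch(0, 10, 2))
print(arange_sketch(5, 0, -1))
```

With a negative stride the values decrease and stop is still excluded, which matches the behavior shown in the examples above.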
- arkouda.linspace(start, stop, length)
Create a pdarray of linearly-spaced floats in a closed interval.
- Parameters:
start (numeric_scalars) – Start of interval (inclusive)
stop (numeric_scalars) – End of interval (inclusive)
length (int_scalars) – Number of points
- Returns:
Array of evenly spaced float values along the interval
- Return type:
- Raises:
TypeError – Raised if start or stop is not a float or int or if length is not an int
Notes
If start is greater than stop, the pdarray values are generated in descending order.
Examples
>>> ak.linspace(0, 1, 5)
array([0, 0.25, 0.5, 0.75, 1])
>>> ak.linspace(start=1, stop=0, length=5)
array([1, 0.75, 0.5, 0.25, 0])
>>> ak.linspace(start=-5, stop=0, length=5)
array([-5, -3.75, -2.5, -1.25, 0])
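The closed-interval spacing used in the linspace examples follows the usual formula value_i = start + i * (stop - start) / (length - 1). The sketch below is a pure-Python illustration of that formula only, not the server-side implementation; linspace_sketch is a hypothetical name, and rejecting length < 2 is a limitation of the sketch rather than documented arkouda behavior.

```python
def linspace_sketch(start, stop, length):
    """Return `length` evenly spaced floats covering [start, stop]."""
    if length < 2:
        # Sketch limitation: the step formula needs at least two points.
        raise ValueError("this sketch only handles length >= 2")

    # Both endpoints are included, so there are length - 1 gaps.
    step = (stop - start) / (length - 1)

    # The last point equals stop up to floating-point rounding.
    return [start + i * step for i in range(length)]


print(linspace_sketch(0, 1, 5))
print(linspace_sketch(1, 0, 5))
```

When start is greater than stop, the step is negative, so the values come out in descending order without any special handling.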
Random
- arkouda.randint(low, high, size=1, dtype=<class 'numpy.int64'>, seed=None)
Generate a pdarray of randomized int, float, or bool values in a specified range bounded by the low and high parameters.
- Parameters:
low (numeric_scalars) – The low value (inclusive) of the range
high (numeric_scalars) – The high value (exclusive for int, inclusive for float) of the range
size (int_scalars) – The length of the returned array
dtype (Union[int64, float64, bool]) – The dtype of the array
seed (int_scalars, optional) – Index for where to pull the first returned value
- Returns:
Values drawn uniformly from the specified range having the desired dtype
- Return type:
pdarray
- Raises:
TypeError – Raised if dtype.name not in DTypes, size is not an int, low or high is not an int or float, or seed is not an int
ValueError – Raised if size < 0 or if high < low
Notes
Calling randint with dtype=float64 will result in uniform non-integral floating point values.
Ranges of size >= 2**64 result in undefined behavior because they exceed the maximum value that can be stored on the server (uint64).
Examples
>>> ak.randint(0, 10, 5)
array([5, 7, 4, 8, 3])
>>> ak.randint(0, 1, 3, dtype=ak.float64)
array([0.92176432277231968, 0.083130710959903542, 0.68894208386667544])
>>> ak.randint(0, 1, 5, dtype=ak.bool_)
array([True, False, True, True, True])
>>> ak.randint(1, 5, 10, seed=2)
array([4, 3, 1, 3, 4, 4, 2, 4, 3, 2])
>>> ak.randint(1, 5, 3, dtype=ak.float64, seed=2)
array([2.9160772326374946, 4.353429832157099, 4.5392023718621486])
>>> ak.randint(1, 5, 10, dtype=ak.bool_, seed=2)
array([False, True, True, True, True, False, True, True, True, True])
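The bound and error rules listed for randint can be mimicked locally with Python's standard random module. The sketch below is only an analogy under stated assumptions: randint_sketch and its as_float flag are hypothetical, it uses a different generator than the arkouda server (so it will not reproduce the values shown above), it treats seed simply as a reproducibility seed, and it leaves out the bool dtype.

```python
import random


def randint_sketch(low, high, size=1, as_float=False, seed=None):
    """Draw `size` uniform values from the range bounded by low and high."""
    if size < 0 or high < low:
        raise ValueError("size must be >= 0 and high must be >= low")

    # A private generator keeps results reproducible for a given seed.
    rng = random.Random(seed)

    if as_float:
        # Float draws: non-integral values between low and high.
        return [rng.uniform(low, high) for _ in range(size)]

    # Integer draws: low is inclusive, high is exclusive. randrange rejects
    # low == high because that leaves no valid integer.
    return [rng.randrange(low, high) for _ in range(size)]


ints = randint_sketch(0, 10, 5, seed=2)
floats = randint_sketch(1, 5, 3, as_float=True, seed=2)
print(ints, floats)
```

Calling the function twice with the same seed returns the same list, which is the property the seeded examples above rely on.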
Concatenation
Performance note: in multi-locale settings, the default (ordered) mode of concatenate is very communication-intensive because the distributions of the original and resulting arrays are unrelated, so most data must be moved non-locally. If the application does not require the concatenated array to be ordered (e.g. if the result is going to be sorted anyway), then passing the keyword ordered=False will greatly speed up concatenation by minimizing non-local data movement.
- arkouda.concatenate(arrays, ordered=True)
Concatenate a list or tuple of pdarray or Strings objects into one pdarray or Strings object, respectively.
- Parameters:
arrays (Sequence[Union[pdarray,Strings,Categorical]]) – The arrays to concatenate. Must all have same dtype.
ordered (bool) – If True (default), the arrays will be appended in the order given. If False, array data may be interleaved in blocks, which can greatly improve performance but results in non-deterministic ordering of elements.
- Returns:
Single pdarray or Strings object containing all values, returned in the original order
- Return type:
pdarray or Strings
- Raises:
ValueError – Raised if arrays is empty or if 1..n pdarrays have differing dtypes
TypeError – Raised if arrays is not a Python Sequence (such as a list or tuple) of pdarray or Strings objects
RuntimeError – Raised if 1..n array elements are dtypes for which concatenate has not been implemented.
Examples
>>> ak.concatenate([ak.array([1, 2, 3]), ak.array([4, 5, 6])])
array([1, 2, 3, 4, 5, 6])
>>> ak.concatenate([ak.array([True,False,True]),ak.array([False,True,True])])
array([True, False, True, False, True, True])
>>> ak.concatenate([ak.array(['one','two']),ak.array(['three','four','five'])])
array(['one', 'two', 'three', 'four', 'five'])
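The trade-off behind ordered=False can be pictured with plain Python lists. The sketch below is a toy model, not how the server distributes data: concat_ordered and concat_interleaved are hypothetical helpers, and the round-robin block scheme with a block size of 2 is an assumption chosen only to show that interleaving changes element order but not the set of elements.

```python
from itertools import zip_longest


def concat_ordered(arrays):
    """Append the arrays one after another, preserving the given order."""
    out = []
    for arr in arrays:
        out.extend(arr)
    return out


def concat_interleaved(arrays, block=2):
    """Toy model of ordered=False: emit blocks from each array in turn."""
    # Split every input into fixed-size blocks.
    blocks = [
        [arr[i:i + block] for i in range(0, len(arr), block)]
        for arr in arrays
    ]
    out = []
    # Round-robin over the inputs, taking one block from each per pass.
    for group in zip_longest(*blocks, fillvalue=[]):
        for chunk in group:
            out.extend(chunk)
    return out


a, b = [1, 2, 3], [4, 5, 6]
ordered = concat_ordered([a, b])
unordered = concat_interleaved([a, b])
print(ordered, unordered)
print(sorted(unordered) == sorted(ordered))
```

Because both results hold the same elements, a subsequent sort yields the same output either way, which is the situation in which the performance note suggests ordered=False.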