arkouda.pdarraycreation

Module Contents

Functions

arange(→ arkouda.pdarrayclass.pdarray)

arange([start,] stop[, stride,] dtype=int64)

array(→ Union[arkouda.pdarrayclass.pdarray, ...)

Convert a Python or Numpy Iterable to a pdarray or Strings object, sending the corresponding data to the arkouda server.

bigint_from_uint_arrays(arrays[, max_bits])

Create a bigint pdarray from an iterable of uint pdarrays.

from_series(→ Union[arkouda.pdarrayclass.pdarray, ...)

Converts a Pandas Series to an Arkouda pdarray or Strings object.

full(→ Union[arkouda.pdarrayclass.pdarray, ...)

Create a pdarray filled with fill_value.

full_like(→ arkouda.pdarrayclass.pdarray)

Create a pdarray filled with fill_value of the same size and dtype as an existing pdarray.

linspace(→ arkouda.pdarrayclass.pdarray)

Create a pdarray of linearly-spaced floats in a closed interval.

ones(→ arkouda.pdarrayclass.pdarray)

Create a pdarray filled with ones.

ones_like(→ arkouda.pdarrayclass.pdarray)

Create a one-filled pdarray of the same size and dtype as an existing pdarray.

randint(→ arkouda.pdarrayclass.pdarray)

Generate a pdarray of randomized int, float, or bool values in a specified range bounded by the low and high parameters.

random_strings_lognormal(→ arkouda.strings.Strings)

Generate random strings with log-normally distributed lengths and with characters drawn from a specified set.

random_strings_uniform(→ arkouda.strings.Strings)

Generate random strings with lengths uniformly distributed between minlen and maxlen, and with characters drawn from a specified set.

standard_normal(→ arkouda.pdarrayclass.pdarray)

Draw real numbers from the standard normal distribution.

uniform(→ arkouda.pdarrayclass.pdarray)

Generate a pdarray with uniformly distributed random float values in a specified range.

zeros(→ arkouda.pdarrayclass.pdarray)

Create a pdarray filled with zeros.

zeros_like(→ arkouda.pdarrayclass.pdarray)

Create a zero-filled pdarray of the same size and dtype as an existing pdarray.

arkouda.pdarraycreation.arange(*args, **kwargs) → arkouda.pdarrayclass.pdarray

arange([start,] stop[, stride,] dtype=int64)

Create a pdarray of consecutive integers within the interval [start, stop). If one argument is given, it is stop. If two are given, they are start and stop. If three are given, they are start, stop, and stride.

The return value is cast to type dtype.

Parameters:
  • start (int_scalars, optional) – Starting value (inclusive)

  • stop (int_scalars) – Stopping value (exclusive)

  • stride (int_scalars, optional) – The difference between consecutive elements. The default stride is 1. If stride is specified, start must also be specified.

  • dtype (np.dtype, type, or str) – The target dtype to cast values to

  • max_bits (int) – Specifies the maximum number of bits; only used for bigint pdarrays

Returns:

Integers from start (inclusive) to stop (exclusive) by stride

Return type:

pdarray, dtype

Raises:
  • TypeError – Raised if start, stop, or stride is not an int object

  • ZeroDivisionError – Raised if stride == 0

See also

linspace, zeros, ones, randint

Notes

Negative strides result in decreasing values. Currently, only int64 pdarrays can be created with this method. For float64 arrays, use the linspace method.
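The positional-argument rules and the effect of a negative stride can be sketched in plain Python. This is an illustration only, not arkouda's implementation: `resolve_arange_args` and `arange_values` are hypothetical helpers, the built-in `range` stands in for the server-side pdarray, and a start of 0 for the one-argument form is an assumption borrowed from NumPy.

```python
def resolve_arange_args(*args):
    """Hypothetical helper: map 1-3 positional ints to (start, stop, stride)."""
    if len(args) == 1:
        start, stop, stride = 0, args[0], 1  # assumption: start defaults to 0
    elif len(args) == 2:
        start, stop, stride = args[0], args[1], 1
    elif len(args) == 3:
        start, stop, stride = args
    else:
        raise TypeError("expected 1 to 3 positional arguments")
    if stride == 0:
        # Mirrors the documented ZeroDivisionError for a zero stride.
        raise ZeroDivisionError("stride must be nonzero")
    return start, stop, stride


def arange_values(*args):
    """Return the values as a Python list; range() stands in for the pdarray."""
    return list(range(*resolve_arange_args(*args)))


print(arange_values(5))         # [0, 1, 2, 3, 4]
print(arange_values(2, 6))      # [2, 3, 4, 5]
print(arange_values(5, 0, -1))  # [5, 4, 3, 2, 1]
```

Because the interval is half-open, the stop value itself never appears in the result, in either direction.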

Examples

>>> ak.arange(0, 5, 1)
array([0, 1, 2, 3, 4])
>>> ak.arange(5, 0, -1)
array([5, 4, 3, 2, 1])
>>> ak.arange(0, 10, 2)
array([0, 2, 4, 6, 8])
>>> ak.arange(-5, -10, -1)
array([-5, -6, -7, -8, -9])
arkouda.pdarraycreation.array(a: arkouda.pdarrayclass.pdarray | numpy.ndarray | Iterable, dtype: numpy.dtype | type | str | None = None, max_bits: int = -1) → arkouda.pdarrayclass.pdarray | arkouda.strings.Strings

Convert a Python or Numpy Iterable to a pdarray or Strings object, sending the corresponding data to the arkouda server.

Parameters:
  • a (Union[pdarray, np.ndarray]) – Rank-1 array of a supported dtype

  • dtype (np.dtype, type, or str) – The target dtype to cast values to

  • max_bits (int) – Specifies the maximum number of bits; only used for bigint pdarrays

Returns:

A pdarray instance stored on arkouda server or Strings instance, which is composed of two pdarrays stored on arkouda server

Return type:

pdarray or Strings

Raises:
  • TypeError – Raised if a is not a pdarray, np.ndarray, or Python Iterable such as a list, array, tuple, or deque

  • RuntimeError – Raised if a is not one-dimensional, nbytes > maxTransferBytes, a.dtype is not supported (not in DTypes), or if the product of a.size and a.itemsize > maxTransferBytes

  • ValueError – Raised if the returned message is malformed or does not contain the fields required to generate the array.

See also

pdarray.to_ndarray

Notes

The number of bytes in the input array cannot exceed ak.client.maxTransferBytes, otherwise a RuntimeError will be raised. This is to protect the user from overwhelming the connection between the Python client and the arkouda server, under the assumption that it is a low-bandwidth connection. The user may override this limit by setting ak.client.maxTransferBytes to a larger value, but should proceed with caution.

If the pdarray or ndarray is of type U, this method is called twice recursively to create the Strings object and the two corresponding pdarrays for string bytes and offsets, respectively.
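The transfer-size rule described above amounts to comparing size times itemsize against the configured limit. A minimal sketch of such a pre-flight check follows; `check_transfer_size` is a hypothetical helper rather than part of arkouda, and the 1 GiB limit in the usage line is an illustrative value, not a statement about the library's default.

```python
def check_transfer_size(size, itemsize, max_transfer_bytes):
    """Hypothetical pre-flight check: raise if size * itemsize exceeds the limit."""
    nbytes = size * itemsize
    if nbytes > max_transfer_bytes:
        raise RuntimeError(
            f"array of {nbytes} bytes exceeds the transfer limit of "
            f"{max_transfer_bytes} bytes"
        )
    return nbytes


# One million int64 values (8 bytes each) against an illustrative 1 GiB limit.
print(check_transfer_size(1_000_000, 8, 2**30))  # 8000000
```

Running a check like this on the client before calling ak.array makes it possible to split or subsample a large input instead of hitting the RuntimeError.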

Examples

>>> ak.array(np.arange(1,10))
array([1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> ak.array(range(1,10))
array([1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> strings = ak.array([f'string {i}' for i in range(0,5)])
>>> type(strings)
<class 'arkouda.strings.Strings'>
arkouda.pdarraycreation.bigint_from_uint_arrays(arrays, max_bits=-1)

Create a bigint pdarray from an iterable of uint pdarrays. The first item in arrays will be the highest 64 bits and the last item will be the lowest 64 bits.

Parameters:
  • arrays (Sequence[pdarray]) – An iterable of uint pdarrays used to construct the bigint pdarray. The first item in arrays will be the highest 64 bits and the last item will be the lowest 64 bits.

  • max_bits (int) – Specifies the maximum number of bits; only used for bigint pdarrays

Returns:

bigint pdarray constructed from uint arrays

Return type:

pdarray

Raises:
  • TypeError – Raised if any pdarray in arrays has a dtype other than uint or if the pdarrays are not the same size.

  • RuntimeError – Raised if there is a server-side error thrown

See also

pdarray.bigint_to_uint_arrays

Examples

>>> a = ak.bigint_from_uint_arrays([ak.ones(5, dtype=ak.uint64), ak.arange(5, dtype=ak.uint64)])
>>> a
array(["18446744073709551616" "18446744073709551617" "18446744073709551618"
"18446744073709551619" "18446744073709551620"])
>>> a.dtype
dtype(bigint)
>>> all(a[i] == 2**64 + i for i in range(5))
True
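The highest-bits-first ordering can be checked with ordinary Python integers, which are arbitrary precision. The sketch below is not arkouda code: `combine_uint64_limbs` is a hypothetical helper that treats each input as one 64-bit limb and shifts the running value left by 64 bits before adding the next limb.

```python
def combine_uint64_limbs(limbs):
    """Combine 64-bit unsigned limbs, highest first, into one Python int."""
    value = 0
    for limb in limbs:
        if not 0 <= limb < 2**64:
            raise ValueError("each limb must fit in an unsigned 64-bit integer")
        value = (value << 64) | limb
    return value


# Element-wise analogue of the example above: high limbs all 1, low limbs 0..4.
high = [1] * 5
low = list(range(5))
combined = [combine_uint64_limbs(pair) for pair in zip(high, low)]
print(combined[0])  # 18446744073709551616
```

Each result equals 2**64 + i, matching the values shown in the doctest.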
arkouda.pdarraycreation.from_series(series: pandas.Series, dtype: type | str | None = None) → arkouda.pdarrayclass.pdarray | arkouda.strings.Strings

Converts a Pandas Series to an Arkouda pdarray or Strings object. If dtype is None, the dtype is inferred from the Pandas Series. Otherwise, set the dtype parameter to override the Series dtype or to supply it when it is unknown (for example, when the Series dtype is object).

Parameters:
  • series (Pandas Series) – The Pandas Series with a dtype of bool, float64, int64, or string

  • dtype (Optional[type]) – The valid dtype types are np.bool, np.float64, np.int64, and np.str

Return type:

Union[pdarray,Strings]

Raises:
  • TypeError – Raised if series is not a Pandas Series object

  • ValueError – Raised if the Series dtype is not bool, float64, int64, string, datetime, or timedelta

Examples

>>> ak.from_series(pd.Series(np.random.randint(0,10,5)))
array([9, 0, 4, 7, 9])
>>> ak.from_series(pd.Series(['1', '2', '3', '4', '5']),dtype=np.int64)
array([1, 2, 3, 4, 5])
>>> ak.from_series(pd.Series(np.random.uniform(low=0.0,high=1.0,size=3)))
array([0.57600036956445599, 0.41619265571741659, 0.6615356693784662])
>>> ak.from_series(pd.Series(['0.57600036956445599', '0.41619265571741659',
                   '0.6615356693784662']), dtype=np.float64)
array([0.57600036956445599, 0.41619265571741659, 0.6615356693784662])
>>> ak.from_series(pd.Series(np.random.choice([True, False],size=5)))
array([True, False, True, True, True])
>>> ak.from_series(pd.Series(['True', 'False', 'False', 'True', 'True']), dtype=np.bool)
array([True, True, True, True, True])
>>> ak.from_series(pd.Series(['a', 'b', 'c', 'd', 'e'], dtype="string"))
array(['a', 'b', 'c', 'd', 'e'])
>>> ak.from_series(pd.Series(['a', 'b', 'c', 'd', 'e']),dtype=np.str)
array(['a', 'b', 'c', 'd', 'e'])
>>> ak.from_series(pd.Series(pd.to_datetime(['1/1/2018', np.datetime64('2018-01-01')])))
array([1514764800000000000, 1514764800000000000])

Notes

The supported datatypes are bool, float64, int64, string, and datetime64[ns]. The data type is either inferred from the Series or is set via the dtype parameter.

Series of datetime or timedelta are converted to Arkouda arrays of dtype int64 (nanoseconds).

A Pandas Series containing strings has a dtype of object. Arkouda assumes the Series contains strings and sets the dtype to str.
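The int64 values produced for datetimes are nanoseconds since the Unix epoch. The following standard-library sketch reproduces the number shown in the datetime example; `to_epoch_nanoseconds` is a hypothetical helper, and it assumes the timestamp is interpreted as UTC.

```python
from datetime import datetime, timezone


def to_epoch_nanoseconds(dt):
    """Hypothetical helper: integer nanoseconds since the Unix epoch (UTC)."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    delta = dt - epoch
    # Integer arithmetic throughout avoids float rounding at this magnitude.
    whole_seconds = delta.days * 86_400 + delta.seconds
    return whole_seconds * 1_000_000_000 + delta.microseconds * 1_000


print(to_epoch_nanoseconds(datetime(2018, 1, 1, tzinfo=timezone.utc)))
# 1514764800000000000
```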

arkouda.pdarraycreation.full(size: arkouda.dtypes.int_scalars | str, fill_value: arkouda.dtypes.numeric_scalars | str, dtype: numpy.dtype | type | str | arkouda.dtypes.BigInt = float64, max_bits: int | None = None) → arkouda.pdarrayclass.pdarray | arkouda.strings.Strings

Create a pdarray filled with fill_value.

Parameters:
  • size (int_scalars) – Size of the array (only rank-1 arrays supported)

  • fill_value (numeric_scalars or str) – Value with which the array will be filled

  • dtype (all_scalars) – Resulting array type, default float64

  • max_bits (int) – Specifies the maximum number of bits; only used for bigint pdarrays

Returns:

array of the requested size and dtype filled with fill_value

Return type:

pdarray or Strings

Raises:

TypeError – Raised if the supplied dtype is not supported or if the size parameter is neither an int nor a str that is parseable to an int.

See also

zeros, ones

Examples

>>> ak.full(5, 7, dtype=ak.int64)
array([7, 7, 7, 7, 7])
>>> ak.full(5, 9, dtype=ak.float64)
array([9, 9, 9, 9, 9])
>>> ak.full(5, 5, dtype=ak.bool)
array([True, True, True, True, True])
arkouda.pdarraycreation.full_like(pda: arkouda.pdarrayclass.pdarray, fill_value: arkouda.dtypes.numeric_scalars) → arkouda.pdarrayclass.pdarray

Create a pdarray filled with fill_value of the same size and dtype as an existing pdarray.

Parameters:
  • pda (pdarray) – Array to use for size and dtype

  • fill_value (numeric_scalars) – Value with which the array will be filled

Returns:

Equivalent to ak.full(pda.size, fill_value, pda.dtype)

Return type:

pdarray

Raises:

TypeError – Raised if the pda parameter is not a pdarray.

See also

ones_like, zeros_like

Notes

Logic for generating the pdarray is delegated to the ak.full method. Accordingly, the supported dtypes are those defined by the ak.full method.

Examples

>>> full = ak.full(5, 7, dtype=ak.int64)
>>> ak.full_like(full, 7)
array([7, 7, 7, 7, 7])
>>> full = ak.full(5, 9, dtype=ak.float64)
>>> ak.full_like(full, 9)
array([9, 9, 9, 9, 9])
>>> full = ak.full(5, 5, dtype=ak.bool)
>>> ak.full_like(full, 5)
array([True, True, True, True, True])
arkouda.pdarraycreation.linspace(start: arkouda.dtypes.numeric_scalars, stop: arkouda.dtypes.numeric_scalars, length: arkouda.dtypes.int_scalars) → arkouda.pdarrayclass.pdarray

Create a pdarray of linearly-spaced floats in a closed interval.

Parameters:
  • start (numeric_scalars) – Start of interval (inclusive)

  • stop (numeric_scalars) – End of interval (inclusive)

  • length (int_scalars) – Number of points

Returns:

Array of evenly spaced float values along the interval

Return type:

pdarray, float64

Raises:

TypeError – Raised if start or stop is not a float or int or if length is not an int

See also

arange

Notes

If start is greater than stop, the pdarray values are generated in descending order.
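The closed-interval spacing, including the descending case, follows from a single step size of (stop - start) / (length - 1). A plain-Python sketch of that formula is shown here; `linspace_values` is a hypothetical stand-in rather than the arkouda implementation, and it requires at least two points so that the step is well defined.

```python
def linspace_values(start, stop, length):
    """Return `length` evenly spaced floats from start to stop, both inclusive."""
    if length < 2:
        raise ValueError("this sketch needs at least two points")
    step = (stop - start) / (length - 1)  # negative when start > stop
    return [start + i * step for i in range(length)]


print(linspace_values(0, 1, 5))   # [0.0, 0.25, 0.5, 0.75, 1.0]
print(linspace_values(1, 0, 5))   # [1.0, 0.75, 0.5, 0.25, 0.0]
print(linspace_values(-5, 0, 5))  # [-5.0, -3.75, -2.5, -1.25, 0.0]
```

A negative step is what produces the descending order when start exceeds stop; no separate code path is needed.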

Examples

>>> ak.linspace(0, 1, 5)
array([0, 0.25, 0.5, 0.75, 1])
>>> ak.linspace(start=1, stop=0, length=5)
array([1, 0.75, 0.5, 0.25, 0])
>>> ak.linspace(start=-5, stop=0, length=5)
array([-5, -3.75, -2.5, -1.25, 0])
arkouda.pdarraycreation.ones(size: arkouda.dtypes.int_scalars | str, dtype: numpy.dtype | type | str | arkouda.dtypes.BigInt = float64, max_bits: int | None = None) → arkouda.pdarrayclass.pdarray

Create a pdarray filled with ones.

Parameters:
  • size (int_scalars) – Size of the array (only rank-1 arrays supported)

  • dtype (Union[float64, int64, bool]) – Resulting array type, default float64

  • max_bits (int) – Specifies the maximum number of bits; only used for bigint pdarrays

Returns:

Ones of the requested size and dtype

Return type:

pdarray

Raises:

TypeError – Raised if the supplied dtype is not supported or if the size parameter is neither an int nor a str that is parseable to an int.

See also

zeros, ones_like

Examples

>>> ak.ones(5, dtype=ak.int64)
array([1, 1, 1, 1, 1])
>>> ak.ones(5, dtype=ak.float64)
array([1, 1, 1, 1, 1])
>>> ak.ones(5, dtype=ak.bool)
array([True, True, True, True, True])
arkouda.pdarraycreation.ones_like(pda: arkouda.pdarrayclass.pdarray) → arkouda.pdarrayclass.pdarray

Create a one-filled pdarray of the same size and dtype as an existing pdarray.

Parameters:

pda (pdarray) – Array to use for size and dtype

Returns:

Equivalent to ak.ones(pda.size, pda.dtype)

Return type:

pdarray

Raises:

TypeError – Raised if the pda parameter is not a pdarray.

See also

ones, zeros_like

Notes

Logic for generating the pdarray is delegated to the ak.ones method. Accordingly, the supported dtypes are those defined by the ak.ones method.

Examples

>>> ones = ak.ones(5, dtype=ak.int64)
>>> ak.ones_like(ones)
array([1, 1, 1, 1, 1])
>>> ones = ak.ones(5, dtype=ak.float64)
>>> ak.ones_like(ones)
array([1, 1, 1, 1, 1])
>>> ones = ak.ones(5, dtype=ak.bool)
>>> ak.ones_like(ones)
array([True, True, True, True, True])
arkouda.pdarraycreation.randint(low: arkouda.dtypes.numeric_scalars, high: arkouda.dtypes.numeric_scalars, size: arkouda.dtypes.int_scalars | Tuple[arkouda.dtypes.int_scalars, Ellipsis] = 1, dtype=akint64, seed: arkouda.dtypes.int_scalars | None = None) → arkouda.pdarrayclass.pdarray

Generate a pdarray of randomized int, float, or bool values in a specified range bounded by the low and high parameters.

Parameters:
  • low (numeric_scalars) – The low value (inclusive) of the range

  • high (numeric_scalars) – The high value (exclusive for int, inclusive for float) of the range

  • size (int_scalars) – The length of the returned array

  • dtype (Union[int64, float64, bool]) – The dtype of the array

  • seed (int_scalars, optional) – Index for where to pull the first returned value

Returns:

Values drawn uniformly from the specified range having the desired dtype

Return type:

pdarray

Raises:
  • TypeError – Raised if dtype.name not in DTypes, size is not an int, low or high is not an int or float, or seed is not an int

  • ValueError – Raised if size < 0 or if high < low

Notes

Calling randint with dtype=float64 will result in uniform non-integral floating point values.

Ranges of size >= 2**64 result in undefined behavior because the size exceeds the maximum value that can be stored on the server (uint64).

Examples

>>> ak.randint(0, 10, 5)
array([5, 7, 4, 8, 3])
>>> ak.randint(0, 1, 3, dtype=ak.float64)
array([0.92176432277231968, 0.083130710959903542, 0.68894208386667544])
>>> ak.randint(0, 1, 5, dtype=ak.bool)
array([True, False, True, True, True])
>>> ak.randint(1, 5, 10, seed=2)
array([4, 3, 1, 3, 4, 4, 2, 4, 3, 2])
>>> ak.randint(1, 5, 3, dtype=ak.float64, seed=2)
array([2.9160772326374946, 4.353429832157099, 4.5392023718621486])
>>> ak.randint(1, 5, 10, dtype=ak.bool, seed=2)
array([False, True, True, True, True, False, True, True, True, True])
arkouda.pdarraycreation.random_strings_lognormal(logmean: arkouda.dtypes.numeric_scalars, logstd: arkouda.dtypes.numeric_scalars, size: arkouda.dtypes.int_scalars, characters: str = 'uppercase', seed: arkouda.dtypes.int_scalars | None = None) → arkouda.strings.Strings

Generate random strings with log-normally distributed lengths and with characters drawn from a specified set.

Parameters:
  • logmean (numeric_scalars) – The log-mean of the length distribution

  • logstd (numeric_scalars) – The log-standard-deviation of the length distribution

  • size (int_scalars) – The number of strings to generate

  • characters ((uppercase, lowercase, numeric, printable, binary)) – The set of characters to draw from

  • seed (int_scalars, optional) – Value used to initialize the random number generator

Returns:

The Strings object encapsulating a pdarray of random strings

Return type:

Strings

Raises:
  • TypeError – Raised if logmean is neither a float nor an int, logstd is not a float, size is not an int, or if characters is not a str

  • ValueError – Raised if logstd <= 0 or size < 0

Notes

The lengths of the generated strings are distributed \(\mathrm{Lognormal}(\mu, \sigma^2)\), with \(\mu = \mathrm{logmean}\) and \(\sigma = \mathrm{logstd}\). Thus, the strings will have an average length of \(\exp(\mu + 0.5\sigma^2)\), a minimum length of zero, and a heavy tail towards longer strings.
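The average-length formula can be evaluated directly to choose logmean and logstd for a target string length. The helper below, `expected_length`, is a hypothetical name used only for this sketch; it computes exp(logmean + 0.5 * logstd**2) with the standard library.

```python
import math


def expected_length(logmean, logstd):
    """Mean of a log-normal distribution with the given log-mean and log-std."""
    return math.exp(logmean + 0.5 * logstd**2)


# Parameters used in the examples for this function: logmean=2, logstd=0.25.
print(round(expected_length(2, 0.25), 2))
```

For logmean=2 and logstd=0.25 the mean is a little over 7.6 characters, in line with the 6 to 9 character strings in the seeded examples.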

Examples

>>> ak.random_strings_lognormal(2, 0.25, 5, seed=1)
array(['TVKJTE', 'ABOCORHFM', 'LUDMMGTB', 'KWOQNPHZ', 'VSXRRL'])
>>> ak.random_strings_lognormal(2, 0.25, 5, seed=1, characters='printable')
array(['+5"fp-', ']3Q4kC~HF', '=F=`,IE!', 'DjkBa'9(', '5oZ1)='])
arkouda.pdarraycreation.random_strings_uniform(minlen: arkouda.dtypes.int_scalars, maxlen: arkouda.dtypes.int_scalars, size: arkouda.dtypes.int_scalars, characters: str = 'uppercase', seed: None | arkouda.dtypes.int_scalars = None) → arkouda.strings.Strings

Generate random strings with lengths uniformly distributed between minlen and maxlen, and with characters drawn from a specified set.

Parameters:
  • minlen (int_scalars) – The minimum allowed length of string

  • maxlen (int_scalars) – The maximum allowed length of string

  • size (int_scalars) – The number of strings to generate

  • characters ((uppercase, lowercase, numeric, printable, binary)) – The set of characters to draw from

  • seed (Union[None, int_scalars], optional) – Value used to initialize the random number generator

Returns:

The array of random strings

Return type:

Strings

Raises:

ValueError – Raised if minlen < 0, maxlen < minlen, or size < 0

Examples

>>> ak.random_strings_uniform(minlen=1, maxlen=5, seed=1, size=5)
array(['TVKJ', 'EWAB', 'CO', 'HFMD', 'U'])
>>> ak.random_strings_uniform(minlen=1, maxlen=5, seed=1, size=5,
... characters='printable')
array(['+5"f', '-P]3', '4k', '~HFF', 'F'])
arkouda.pdarraycreation.standard_normal(size: arkouda.dtypes.int_scalars, seed: None | arkouda.dtypes.int_scalars = None) → arkouda.pdarrayclass.pdarray

Draw real numbers from the standard normal distribution.

Parameters:
  • size (int_scalars) – The number of samples to draw (size of the returned array)

  • seed (int_scalars) – Value used to initialize the random number generator

Returns:

The array of random numbers

Return type:

pdarray, float64

Raises:
  • TypeError – Raised if size is not an int

  • ValueError – Raised if size < 0

See also

randint

Notes

For random samples from \(N(\mu, \sigma^2)\), use:

(sigma * standard_normal(size)) + mu
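The same shift-and-scale transformation can be tried without a server by substituting the standard library's generator for standard_normal. In this sketch `normal_samples` is a hypothetical helper and `random.Random.gauss(0.0, 1.0)` is the stand-in for a standard normal draw; the sample statistics should land near the requested mu and sigma.

```python
import random
import statistics


def normal_samples(mu, sigma, size, seed=None):
    """Draw from N(mu, sigma^2) by scaling and shifting standard normal draws."""
    rng = random.Random(seed)
    return [sigma * rng.gauss(0.0, 1.0) + mu for _ in range(size)]


samples = normal_samples(mu=10.0, sigma=2.0, size=10_000, seed=1)
print(statistics.fmean(samples), statistics.stdev(samples))
```

With arkouda the multiplication and addition apply element-wise to the returned pdarray, so the expression in the note needs no loop.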

Examples

>>> ak.standard_normal(3,1)
array([-0.68586185091150265, 1.1723810583573375, 0.567584107142031])
arkouda.pdarraycreation.uniform(size: arkouda.dtypes.int_scalars, low: arkouda.dtypes.numeric_scalars = float(0.0), high: arkouda.dtypes.numeric_scalars = 1.0, seed: None | arkouda.dtypes.int_scalars = None) → arkouda.pdarrayclass.pdarray

Generate a pdarray with uniformly distributed random float values in a specified range.

Parameters:
  • low (float_scalars) – The low value (inclusive) of the range, defaults to 0.0

  • high (float_scalars) – The high value (inclusive) of the range, defaults to 1.0

  • size (int_scalars) – The length of the returned array

  • seed (int_scalars, optional) – Value used to initialize the random number generator

Returns:

Values drawn uniformly from the specified range

Return type:

pdarray, float64

Raises:
  • TypeError – Raised if dtype.name not in DTypes, size is not an int, or if either low or high is not an int or float

  • ValueError – Raised if size < 0 or if high < low

Notes

The logic for uniform is delegated to the ak.randint method, which is invoked with a dtype of float64.

Examples

>>> ak.uniform(3)
array([0.92176432277231968, 0.083130710959903542, 0.68894208386667544])
>>> ak.uniform(size=3,low=0,high=5,seed=0)
array([0.30013431967121934, 0.47383036230759112, 1.0441791878997098])
arkouda.pdarraycreation.zeros(size: arkouda.dtypes.int_scalars | str, dtype: numpy.dtype | type | str | arkouda.dtypes.BigInt = float64, max_bits: int | None = None) → arkouda.pdarrayclass.pdarray

Create a pdarray filled with zeros.

Parameters:
  • size (int_scalars) – Size of the array (only rank-1 arrays supported)

  • dtype (all_scalars) – Type of resulting array, default float64

  • max_bits (int) – Specifies the maximum number of bits; only used for bigint pdarrays

Returns:

Zeros of the requested size and dtype

Return type:

pdarray

Raises:

TypeError – Raised if the supplied dtype is not supported or if the size parameter is neither an int nor a str that is parseable to an int.

See also

ones, zeros_like

Examples

>>> ak.zeros(5, dtype=ak.int64)
array([0, 0, 0, 0, 0])
>>> ak.zeros(5, dtype=ak.float64)
array([0, 0, 0, 0, 0])
>>> ak.zeros(5, dtype=ak.bool)
array([False, False, False, False, False])
arkouda.pdarraycreation.zeros_like(pda: arkouda.pdarrayclass.pdarray) → arkouda.pdarrayclass.pdarray

Create a zero-filled pdarray of the same size and dtype as an existing pdarray.

Parameters:

pda (pdarray) – Array to use for size and dtype

Returns:

Equivalent to ak.zeros(pda.size, pda.dtype)

Return type:

pdarray

Raises:

TypeError – Raised if the pda parameter is not a pdarray.

See also

zeros, ones_like

Examples

>>> zeros = ak.zeros(5, dtype=ak.int64)
>>> ak.zeros_like(zeros)
array([0, 0, 0, 0, 0])
>>> zeros = ak.zeros(5, dtype=ak.float64)
>>> ak.zeros_like(zeros)
array([0, 0, 0, 0, 0])
>>> zeros = ak.zeros(5, dtype=ak.bool)
>>> ak.zeros_like(zeros)
array([False, False, False, False, False])