arkouda.array_view

Module Contents

Classes

ArrayView

A multi-dimensional view of a pdarray. Arkouda ArrayView behaves similarly to numpy's ndarray.

class arkouda.array_view.ArrayView(base: arkouda.pdarrayclass.pdarray, shape, order='row_major')[source]

A multi-dimensional view of a pdarray. Arkouda ArrayView behaves similarly to numpy's ndarray. The base pdarray is stored in one dimension but can be indexed and treated logically as if it were multi-dimensional.

base

The base pdarray that is being viewed as a multi-dimensional object

Type:

pdarray

dtype

The element type of the base pdarray (equivalent to base.dtype)

Type:

dtype

size

The number of elements in the base pdarray (equivalent to base.size)

Type:

int_scalars

shape

A pdarray specifying the sizes of each dimension of the array

Type:

pdarray[int]

ndim

Number of dimensions (equivalent to shape.size)

Type:

int_scalars

itemsize

The size in bytes of each element (equivalent to base.itemsize)

Type:

int_scalars

order

Index order used to read and write the elements. By default, or if 'C'/'row_major', data is read and written in row-major order. If 'F'/'column_major', data is read and written in column-major order.

Type:

str {'C'/'row_major' | 'F'/'column_major'}
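
The difference between the two orders can be illustrated with a small pure-Python sketch. It is not Arkouda's implementation; `flat_index` is a hypothetical helper that shows how a multi-dimensional index could map onto a one-dimensional base under each order.

```python
def flat_index(index, shape, order="row_major"):
    """Map a multi-dimensional index to a position in a flat 1-D buffer.

    Hypothetical helper for illustration only; not part of Arkouda.
    """
    ndim = len(shape)
    strides = [1] * ndim
    if order in ("C", "row_major"):
        # Row-major: the last dimension varies fastest.
        for i in range(ndim - 2, -1, -1):
            strides[i] = strides[i + 1] * shape[i + 1]
    elif order in ("F", "column_major"):
        # Column-major: the first dimension varies fastest.
        for i in range(1, ndim):
            strides[i] = strides[i - 1] * shape[i - 1]
    else:
        raise ValueError(f"unknown order: {order!r}")
    return sum(i * s for i, s in zip(index, strides))


base = list(range(6))  # stands in for a 1-D base array of 6 elements
shape = (2, 3)

# The same logical index (1, 0) lands on different base positions.
row_major_value = base[flat_index((1, 0), shape, "row_major")]        # 3
column_major_value = base[flat_index((1, 0), shape, "column_major")]  # 1
```

In row-major order the strides for shape (2, 3) are (3, 1), while in column-major order they are (1, 2), which is why the same index reads a different element of the base.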

objType = 'ArrayView'
to_hdf(prefix_path: str, dataset: str = 'ArrayView', mode: str = 'truncate', file_type: str = 'distribute')[source]

Save the current ArrayView object to an HDF5 file.

Parameters:
  • prefix_path (str) – Path to the file to write the dataset to

  • dataset (str) – Name of the dataset to write

  • mode (str (truncate | append)) – Mode in which to write the dataset. Default: truncate. Truncate will overwrite any existing files. Append will add the dataset to an existing file.

  • file_type (str (single | distribute)) – Format in which to save the file. Default: distribute. Single will store the data in a single file. Distribute will store the data in one file per locale.

to_list() list[source]

Convert the ArrayView to a list, transferring array data from the Arkouda server to client-side Python. Note: if the ArrayView size exceeds client.maxTransferBytes, a RuntimeError is raised.

Returns:

A list with the same data as the ArrayView

Return type:

list

Raises:

RuntimeError – Raised if a server-side error is thrown, if the ArrayView size exceeds the built-in client.maxTransferBytes size limit, or if the number of bytes received does not match the expected number of bytes

Notes

The number of bytes in the array cannot exceed client.maxTransferBytes, otherwise a RuntimeError will be raised. This is to protect the user from overflowing the memory of the system on which the Python client is running, under the assumption that the server is running on a distributed system with much more memory than the client. The user may override this limit by setting client.maxTransferBytes to a larger value, but proceed with caution.
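
As a rough illustration of the check described above, the following sketch compares the byte count implied by `size` and `itemsize` against a limit. The helper name `check_transfer_size` and its parameters are hypothetical; this is not the Arkouda client code, only an assumption about what such a guard could look like.

```python
def check_transfer_size(size, itemsize, max_transfer_bytes):
    """Return the number of bytes to transfer, or raise if over the limit.

    Hypothetical helper for illustration only; not part of Arkouda.
    """
    nbytes = size * itemsize
    if nbytes > max_transfer_bytes:
        raise RuntimeError(
            f"array of {nbytes} bytes exceeds the transfer limit "
            f"of {max_transfer_bytes} bytes"
        )
    return nbytes


# Six 8-byte integers need 48 bytes, which fits under a 1024-byte limit.
nbytes = check_transfer_size(size=6, itemsize=8, max_transfer_bytes=1024)
```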

See also

to_ndarray

Examples

>>> a = ak.arange(6).reshape(2,3)
>>> a.to_list()
[[0, 1, 2], [3, 4, 5]]
>>> type(a.to_list())
list
to_ndarray() numpy.ndarray[source]

Convert the ArrayView to a np.ndarray, transferring array data from the Arkouda server to client-side Python. Note: if the ArrayView size exceeds client.maxTransferBytes, a RuntimeError is raised.

Returns:

A numpy ndarray with the same attributes and data as the ArrayView

Return type:

np.ndarray

Raises:

RuntimeError – Raised if a server-side error is thrown, if the ArrayView size exceeds the built-in client.maxTransferBytes size limit, or if the number of bytes received does not match the expected number of bytes

Notes

The number of bytes in the array cannot exceed client.maxTransferBytes, otherwise a RuntimeError will be raised. This is to protect the user from overflowing the memory of the system on which the Python client is running, under the assumption that the server is running on a distributed system with much more memory than the client. The user may override this limit by setting client.maxTransferBytes to a larger value, but proceed with caution.

See also

array, to_list

Examples

>>> a = ak.arange(6).reshape(2,3)
>>> a.to_ndarray()
array([[0, 1, 2],
       [3, 4, 5]])
>>> type(a.to_ndarray())
numpy.ndarray
update_hdf(prefix_path: str, dataset: str = 'ArrayView', repack: bool = True)[source]

Overwrite the dataset of the given name with this ArrayView object. If the dataset does not exist, it is added.

Parameters:
  • prefix_path (str) – Directory and filename prefix that all output files share

  • dataset (str) – Name of the dataset to create in files

  • repack (bool) – Default: True. HDF5 does not release memory on delete. When True, the inaccessible data (that was overwritten) is removed. When False, the data remains but is inaccessible. Setting this to False will yield better performance but will cause file sizes to expand.

Return type:

str - success message if successful

Raises:

RuntimeError – Raised if a server-side error is thrown saving the array view

Notes

  • If the file does not contain a File_Format attribute to indicate how it was saved, the file name is checked for _LOCALE#### to determine if it is distributed.

  • If the dataset provided does not exist, it will be added

  • Because HDF5 deletes do not release memory, this will create a copy of the file with the new data
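
The file name check mentioned in the first note could look roughly like the sketch below. It assumes that `####` stands for exactly four digits; the helper `looks_distributed` and the example file names are hypothetical and are not taken from Arkouda.

```python
import re

# Assumption: "####" in "_LOCALE####" means four decimal digits.
_LOCALE_SUFFIX = re.compile(r"_LOCALE\d{4}")


def looks_distributed(filename):
    """Guess from the file name whether a file is one of a per-locale set.

    Hypothetical helper for illustration only; not part of Arkouda.
    """
    return _LOCALE_SUFFIX.search(filename) is not None


# Hypothetical file names used only to exercise the pattern.
distributed = looks_distributed("my_data_LOCALE0000")
single = looks_distributed("my_data.hdf5")
```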