arkouda.client_dtypes

Arkouda client-defined dtypes and helper utilities for structured or specialized array semantics.

This module introduces specialized subclasses of pdarray for handling and representing data with specific interpretations or domain semantics. These include:

  • BitVector: For representing integers as sets of binary flags.

  • Fields: For displaying and interacting with named binary flags.

  • IPv4: For storing and displaying 32-bit integers as IPv4 addresses.

These classes enhance usability and improve readability when working with encoded or domain-specific data while preserving Arkouda’s performance model and distributed data structures.

Functions

  • BitVectorizer: Creates a partially applied BitVector constructor.

  • ip_address: Converts various formats to an Arkouda IPv4 object.

  • is_ipv4: Returns a boolean array indicating IPv4 addresses.

  • is_ipv6: Returns a boolean array indicating IPv6 addresses.

Examples

>>> import arkouda as ak
>>> from arkouda.client_dtypes import BitVector, Fields, IPv4, ip_address, is_ipv4

Create and use BitVectors:

>>> a = ak.array([3, 5, 7])
>>> bv = BitVector(a, width=4)
>>> print(bv)
BitVector([..||,
           .|.|,
           .|||],
          width=4, reverse=False)

Create Fields with named binary flags:

>>> f = Fields(ak.array([1, 2, 3]), names=['read', 'write', 'exec'], separator=':')
>>> print(f[0]) # doctest: +SKIP
--:--:read (1)

Convert and work with IP addresses:

>>> ips = ip_address(['192.168.0.1', '10.0.0.1'])
>>> print(ips)
IPv4([192.168.0.1,
      10.0.0.1],
     )

>>> is_ipv4(ips)
array([True True])

Classes

BitVector

Represent integers as bit vectors, e.g. a set of flags.

Fields

An integer-backed representation of a set of named binary fields, e.g. flags.

IPv4

Represent integers as IPv4 addresses.

Functions

BitVectorizer([width, reverse])

Make a callback (i.e. function) that can be called on an array to create a BitVector.

ip_address(values)

Convert values to an Arkouda array of IP addresses.

is_ipv4(→ arkouda.numpy.pdarrayclass.pdarray)

Indicate which values are IPv4 when passed data containing IPv4 and IPv6 values.

is_ipv6(→ arkouda.numpy.pdarrayclass.pdarray)

Indicate which values are IPv6 when passed data containing IPv4 and IPv6 values.

Module Contents

class arkouda.client_dtypes.BitVector(values, width=64, reverse=False)[source]

Bases: arkouda.numpy.pdarrayclass.pdarray

Represent integers as bit vectors, e.g. a set of flags.

Parameters:
  • values (pdarray, int64) – The integers to represent as bit vectors

  • width (int) – The number of bit fields in the vector

  • reverse (bool) – If True, display bits from least significant (left) to most significant (right). By default, the most significant bit is the left-most bit.

Returns:

bitvectors – The array of binary vectors

Return type:

BitVector

Notes

This class is a thin wrapper around pdarray that mostly affects how values are displayed to the user. Operators and methods will typically treat this class like a uint64 pdarray.

conserves
format(x)[source]

Format a single binary vector as a string.
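The display convention described above can be illustrated with a minimal pure-Python sketch. It is not Arkouda's implementation: `format_bits` is a hypothetical helper, and the `|` and `.` characters are taken from the BitVector example output earlier on this page.

```python
def format_bits(value: int, width: int = 64, reverse: bool = False) -> str:
    """Render the low `width` bits of `value`, using '|' for 1 and '.' for 0.

    Hypothetical helper for illustration only; not part of Arkouda.
    """
    # Collect bits from least significant to most significant.
    bits = ["|" if (value >> i) & 1 else "." for i in range(width)]
    # By default the most significant bit is displayed on the left.
    if not reverse:
        bits.reverse()
    return "".join(bits)


rendered = [format_bits(v, width=4) for v in (3, 5, 7)]
print(rendered)  # ['..||', '.|.|', '.|||']
print(format_bits(3, width=4, reverse=True))  # ||..
```

With `reverse=True` the least significant bit moves to the left, so the same integer 3 reads `||..` instead of `..||`.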

classmethod from_return_msg(rep_msg)[source]
opeq(other, op)[source]
register(user_defined_name)[source]

Register this BitVector object and underlying components with the Arkouda server.

Parameters:

user_defined_name (str) – User-defined name under which the BitVector is registered. This becomes the root name for its underlying components.

Returns:

The same BitVector, which is now registered with the Arkouda server and has an updated name. This is an in-place modification; the original is returned to support a fluent programming style. Note that you cannot register two different BitVectors with the same name.

Return type:

BitVector

Raises:
  • TypeError – Raised if user_defined_name is not a str

  • RegistrationError – If the server was unable to register the BitVector with the user_defined_name

See also

unregister, attach, is_registered

Notes

Objects registered with the server are immune to deletion until they are unregistered.

registered_name = None
reverse = False
special_objType = 'BitVector'
to_list()[source]

Export data to a list of string-formatted bit vectors.

to_ndarray()[source]

Export data to a numpy array of string-formatted bit vectors.

values
width = 64
arkouda.client_dtypes.BitVectorizer(width=64, reverse=False)[source]

Make a callback (i.e. function) that can be called on an array to create a BitVector.

Parameters:
  • width (int) – The number of bit fields in the vector

  • reverse (bool) – If True, display bits from least significant (left) to most significant (right). By default, the most significant bit is the left-most bit.

Returns:

bitvectorizer – A function that takes an array and returns a BitVector instance

Return type:

callable
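The idea of a partially applied constructor can be sketched with `functools.partial`. Everything here is an illustration under stated assumptions: `make_bitvector` is a hypothetical stand-in that returns a plain dict rather than a real BitVector, and `bitvectorizer_sketch` is not the library function.

```python
from functools import partial


def make_bitvector(values, width=64, reverse=False):
    """Hypothetical stand-in for a BitVector constructor; returns a plain dict."""
    return {"values": list(values), "width": width, "reverse": reverse}


def bitvectorizer_sketch(width=64, reverse=False):
    """Return a one-argument callable with `width` and `reverse` already bound."""
    return partial(make_bitvector, width=width, reverse=reverse)


to_flags = bitvectorizer_sketch(width=4)
result = to_flags([3, 5, 7])
print(result)  # {'values': [3, 5, 7], 'width': 4, 'reverse': False}
```

Binding the display options once is convenient when the same callable is handed to code that only knows how to pass an array.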

class arkouda.client_dtypes.Fields(values, names, MSB_left=True, pad='-', separator='', show_int=True)[source]

Bases: BitVector

An integer-backed representation of a set of named binary fields, e.g. flags.

Parameters:
  • values (pdarray or Strings) – The array of field values. If (u)int64, the values are used as-is for the binary representation of fields. If Strings, the values are converted to binary according to the mapping defined by the names and MSB_left arguments.

  • names (str or sequence of str) – The names of the fields, in order. A string is treated as a list of single-character field names. Multi-character field names are allowed, but they must be passed as a list or tuple, and the user must specify a separator.

  • MSB_left (bool) – Controls how field names are mapped to binary values. If True (default), the left-most field name corresponds to the most significant bit in the binary representation. If False, the left-most field name corresponds to the least significant bit.

  • pad (str) – Character to display when field is not present. Use empty string if no padding is desired.

  • separator (str) – Substring that separates fields. Used to parse input values (if ak.Strings) and to display output.

  • show_int (bool) – If True (default), display the integer value of the binary fields in output.

Returns:

fields – The array of field values

Return type:

Fields

Notes

This class is a thin wrapper around pdarray that mostly affects how values are displayed to the user. Operators and methods will typically treat this class like an int64 pdarray.

MSB_left = True
format(x)[source]

Format a single binary value as a string of named fields.
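The mapping between field names and bits controlled by MSB_left can be sketched in plain Python. This is a hedged illustration, not Arkouda's formatter: `format_fields` is a hypothetical helper, and it assumes that an unset field is shown as the pad character repeated to the length of that field's name and that the separator is placed only between fields.

```python
def format_fields(value, names, msb_left=True, pad="-", separator="", show_int=True):
    """Render an integer as named flags (illustrative helper, not Arkouda's)."""
    names = list(names)  # a plain string becomes single-character names
    n = len(names)
    parts = []
    for i, name in enumerate(names):
        # With msb_left, the first name maps to the highest of the n bits;
        # otherwise the first name maps to bit 0.
        shift = n - 1 - i if msb_left else i
        is_set = (value >> shift) & 1
        # Assumption: an unset field shows `pad` repeated to the name's length.
        parts.append(name if is_set else pad * len(name))
    text = separator.join(parts)
    return f"{text} ({value})" if show_int else text


print(format_fields(6, "rwx"))                  # rw- (6)
print(format_fields(6, "rwx", msb_left=False))  # -wx (6)
print(format_fields(6, "rwx", show_int=False))  # rw-
```

The value 6 is binary 110. With the first name on the most significant bit, `r` and `w` are set. With the first name on the least significant bit, `w` and `x` are set instead.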

name = None
names
namewidth
opeq(other, op)[source]
pad
padchar = '-'
separator = ''
show_int = True
class arkouda.client_dtypes.IPv4(values)[source]

Bases: arkouda.numpy.pdarrayclass.pdarray

Represent integers as IPv4 addresses.

Parameters:

values (pdarray, int64) – The integer IP addresses

Returns:

The same IP addresses

Return type:

IPv4

Notes

This class is a thin wrapper around pdarray that mostly affects how values are displayed to the user. Operators and methods will typically treat this class like an int64 pdarray.

export_uint()[source]
format(x)[source]

Format a single integer IP address as a string.

normalize(x)[source]

Normalize IP address.

Take in an IP address as a string, integer, or IPAddress object, and convert it to an integer.
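The conversion between integer and dotted-quad forms can be sketched with Python's standard `ipaddress` module. The helper names `normalize_sketch` and `format_sketch` are hypothetical and only mirror the behavior described for `normalize` and `format`; they are not the library's code.

```python
import ipaddress


def normalize_sketch(x) -> int:
    """Convert a dotted-quad string, an integer, or an IPv4Address to an integer."""
    if isinstance(x, ipaddress.IPv4Address):
        return int(x)
    # IPv4Address accepts strings and integers and rejects out-of-range input.
    return int(ipaddress.IPv4Address(x))


def format_sketch(n: int) -> str:
    """Render a 32-bit integer as a dotted-quad string."""
    return str(ipaddress.IPv4Address(n))


as_int = normalize_sketch("192.168.0.1")
print(as_int)                    # 3232235521
print(format_sketch(as_int))     # 192.168.0.1
print(format_sketch(167772161))  # 10.0.0.1
```

The integer is the four octets packed big-endian, so 192.168.0.1 becomes 192·2^24 + 168·2^16 + 0·2^8 + 1.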

opeq(other, op)[source]
register(user_defined_name)[source]

Register this IPv4 object and underlying components with the Arkouda server.

Parameters:

user_defined_name (str) – User-defined name under which the IPv4 is registered. This becomes the root name for its underlying components.

Returns:

The same IPv4, which is now registered with the Arkouda server and has an updated name. This is an in-place modification; the original is returned to support a fluent programming style. Note that you cannot register two different IPv4s with the same name.

Return type:

IPv4

Raises:
  • TypeError – Raised if user_defined_name is not a str

  • RegistrationError – If the server was unable to register the IPv4 with the user_defined_name

See also

unregister, attach, is_registered

Notes

Objects registered with the server are immune to deletion until they are unregistered.

special_objType = 'IPv4'
to_hdf(prefix_path: str, dataset: str = 'array', mode: str = 'truncate', file_type: str = 'distribute')[source]

Override of the pdarray to_hdf to store the special object type.

to_list()[source]

Export array as a list of integers.

to_ndarray()[source]

Export array as a numpy array of integers.

update_hdf(prefix_path: str, dataset: str = 'array', repack: bool = True)[source]

Override the pdarray implementation so that the special object type will be used.

values
arkouda.client_dtypes.ip_address(values)[source]

Convert values to an Arkouda array of IP addresses.

Parameters:

values (list-like, integer pdarray, or IPv4) – The integer IP addresses or IPv4 object.

Returns:

The same IP addresses as an Arkouda array

Return type:

IPv4

Notes

This helper is intended to future-proof code against changes made to accommodate IPv6 and to prevent errors if a user inadvertently casts an IPv4 instead of an int64 pdarray. It can also be used for importing Python lists of IP addresses into Arkouda.

arkouda.client_dtypes.is_ipv4(ip: arkouda.numpy.pdarrayclass.pdarray | IPv4, ip2: arkouda.numpy.pdarrayclass.pdarray | None = None) → arkouda.numpy.pdarrayclass.pdarray[source]

Indicate which values are IPv4 when passed data containing IPv4 and IPv6 values.

Parameters:
  • ip (pdarray (int64) or ak.IPv4) – IPv4 value, or the high bits of an IPv6 value if IPv6 is passed in.

  • ip2 (pdarray (int64), optional) – Low bits of IPv6. Supports data that contains IPv6 addresses as well.

Return type:

pdarray of bools indicating which indexes are IPv4.

See also

ak.is_ipv6
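How a two-array encoding might separate IPv4 from IPv6 entries can be sketched with plain Python lists. The rule used here is an assumption, not something this page confirms: an entry is treated as IPv4 when the value in the first array fits in 32 bits and the matching value in the second array is zero. `is_ipv4_sketch` is a hypothetical helper, not the library function.

```python
def is_ipv4_sketch(ip, ip2=None):
    """Return one bool per entry; True where the entry looks like IPv4.

    Assumed rule (not confirmed by the documentation above):
    the first value fits in 32 bits and the second value is zero.
    """
    if ip2 is None:
        # With no second array, every entry is treated as having zero there.
        ip2 = [0] * len(ip)
    if len(ip) != len(ip2):
        raise ValueError("ip and ip2 must have the same length")
    return [first < 2**32 and second == 0 for first, second in zip(ip, ip2)]


flags = is_ipv4_sketch([3232235521, 2**61], [0, 1])
print(flags)  # [True, False]
```

Under the same assumption, the complementary IPv6 mask would be the element-wise negation of this result.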

arkouda.client_dtypes.is_ipv6(ip: arkouda.numpy.pdarrayclass.pdarray | IPv4, ip2: arkouda.numpy.pdarrayclass.pdarray | None = None) → arkouda.numpy.pdarrayclass.pdarray[source]

Indicate which values are IPv6 when passed data containing IPv4 and IPv6 values.

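Parameters:
  • ip (pdarray or IPv4) – IPv4 values, or the high bits of IPv6 values when IPv6 data is passed in.

  • ip2 (pdarray, optional) – Low bits of IPv6 values.
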
Return type:

pdarray of bools indicating which indexes are IPv6.

See also

ak.is_ipv4