arkouda.series

Classes

Series

One-dimensional arkouda array with axis labels.

Package Contents

class arkouda.series.Series(data: Tuple | List | arkouda.pandas.groupbyclass.groupable_element_type | Series | arkouda.numpy.segarray.SegArray | pandas.Series | pandas.Categorical, name=None, index: arkouda.numpy.pdarrayclass.pdarray | arkouda.numpy.strings.Strings | Tuple | List | arkouda.pandas.index.Index | None = None)[source]

One-dimensional arkouda array with axis labels.

Parameters:
  • index (pdarray, Strings) – An array of indices associated with the data array. Optional; if omitted, it defaults to a range of ints whose size matches the size of the data.

  • data (Tuple, List, groupable_element_type, Series, SegArray) – a 1D array. Must not be None.

Raises:
  • TypeError – Raised if index is not a pdarray or Strings object. Raised if data is not a pdarray, Strings, or Categorical object.

  • ValueError – Raised if the index size does not match data size

Notes

The Series class accepts either positional arguments or keyword arguments.

If entering positional arguments:

  • 2 arguments entered: argument 1 is data, argument 2 is index.

  • 1 argument entered: argument 1 is data. It is assumed that this is the data argument, and the Index is generated automatically.

If entering keywords:

  • ‘data’ (see Parameters)

  • ‘index’ (optional) must match the size of ‘data’
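
The index rules above can be modelled locally. The following is a minimal sketch in plain Python, not Arkouda code: resolve_index is a hypothetical helper, and Python lists stand in for pdarray or Strings objects. It only illustrates the documented defaulting and size check.

```python
from typing import List, Optional, Sequence


def resolve_index(data: Sequence, index: Optional[Sequence] = None) -> List:
    """Hypothetical model of the documented index rules (not an Arkouda API)."""
    if data is None:
        raise TypeError("data must not be None")
    if index is None:
        # No index supplied: default to the range 0..len(data)-1.
        return list(range(len(data)))
    if len(index) != len(data):
        # Mirrors the documented ValueError for a size mismatch.
        raise ValueError("index size does not match data size")
    return list(index)


default_index = resolve_index([10, 20, 30])
explicit_index = resolve_index([10, 20], ["a", "b"])
```

Here default_index is a plain integer range and explicit_index is the supplied labels, unchanged.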

add(b: Series) Series[source]
property at: _LocIndexer

Accesses entries of a Series by label.

Returns:

An indexer for label-based access to Series entries.

Return type:

_LocIndexer

static concat(arrays: List, axis: int = 0, index_labels: List[str] | None = None, value_labels: List[str] | None = None, ordered: bool = False) arkouda.pandas.dataframe.DataFrame | Series[source]

Concatenate a list of Arkouda Series or grouped arrays horizontally or vertically.

If a list of grouped Arkouda arrays is passed, they are converted to Series. Each grouping is a 2-tuple where the first item is the key(s) and the second is the value. If concatenating horizontally (axis=1), all series/groupings must have the same length and the same index. The index is converted to a column in the resulting DataFrame; if it’s a MultiIndex, each level is converted to a separate column.

Parameters:
  • arrays (List) – A list of Series or groupings (tuples of index and values) to concatenate.

  • axis (int, default=0) – The axis to concatenate along: 0 = vertical (stack series into one); 1 = horizontal (align by index and produce a DataFrame).

  • index_labels (List of str or None, optional) – Column name(s) to label the index when axis=1.

  • value_labels (List of str or None, optional) – Column names to label the values of each Series.

  • ordered (bool, default=False) – Unused parameter. Reserved for future support of deterministic vs. performance-optimized concatenation.

Returns:

  • If axis=0: a new Series

  • If axis=1: a new DataFrame

Return type:

Series or DataFrame
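
To make the two axes concrete, here is a minimal local sketch in plain Python, not Arkouda code. Each series is modelled as an (index, values) tuple, concat_sketch is a hypothetical name, and the column names "index" and "val_0", "val_1", ... are assumptions chosen for illustration only.

```python
from typing import Dict, List, Sequence, Tuple, Union

Grouping = Tuple[Sequence, Sequence]  # (index, values), as in the text above


def concat_sketch(series: List[Grouping], axis: int = 0) -> Union[Grouping, Dict[str, list]]:
    """Hypothetical model of vertical vs. horizontal concatenation."""
    if axis == 0:
        # Vertical: stack every index and every value list end to end.
        index: list = []
        values: list = []
        for idx, vals in series:
            index.extend(idx)
            values.extend(vals)
        return index, values
    if axis == 1:
        # Horizontal: all inputs must share the same index and length.
        first_index = list(series[0][0])
        for idx, vals in series:
            if list(idx) != first_index or len(vals) != len(first_index):
                raise ValueError("all series must have the same index and length")
        # The shared index becomes a column next to one column per series.
        columns: Dict[str, list] = {"index": first_index}
        for i, (_, vals) in enumerate(series):
            columns[f"val_{i}"] = list(vals)
        return columns
    raise ValueError("axis must be 0 or 1")


a = ([0, 1], [10, 20])
b = ([0, 1], [30, 40])
stacked = concat_sketch([a, b], axis=0)
table = concat_sketch([a, b], axis=1)
```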

diff() Series[source]

Diffs consecutive values of the series.

Returns a new series with the same index and length. First value is set to NaN.
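
The behaviour described above (same length, first value NaN) can be sketched locally. This is plain Python on a list of floats, not Arkouda code, and diff_sketch is a hypothetical name.

```python
import math
from typing import List, Sequence


def diff_sketch(values: Sequence[float]) -> List[float]:
    """Hypothetical model: difference of consecutive values, NaN first."""
    if len(values) == 0:
        return []
    # The first element has no predecessor, so it is NaN.
    out = [math.nan]
    for prev, cur in zip(values, values[1:]):
        out.append(cur - prev)
    return out


result = diff_sketch([1.0, 4.0, 9.0])
```

The result has the same length as the input; only the first entry is NaN.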

dt
property dtype: numpy.dtype
fillna(value: supported_scalars | Series | arkouda.numpy.pdarrayclass.pdarray) Series[source]

Fill NA/NaN values using the specified method.

Parameters:

value (supported_scalars, Series, or pdarray) – Value to use to fill holes (e.g. 0), or alternatively a Series of values specifying which value to use for each index. Values not in the Series will not be filled. This value cannot be a list.

Returns:

Object with missing values filled.

Return type:

Series

Examples

>>> import arkouda as ak
>>> ak.connect()
>>> import numpy as np
>>> from arkouda import Series
>>> data = ak.Series([1, np.nan, 3, np.nan, 5])
>>> data
     0
0  1.0
1  nan
2  3.0
3  nan
4  5.0

>>> fill_values1 = ak.ones(5)
>>> data.fillna(fill_values1)
     0
0  1.0
1  1.0
2  3.0
3  1.0
4  5.0

>>> fill_values2 = Series(ak.ones(5))
>>> data.fillna(fill_values2)
     0
0  1.0
1  1.0
2  3.0
3  1.0
4  5.0

>>> fill_values3 = 100.0
>>> data.fillna(fill_values3)
       0
0    1.0
1  100.0
2    3.0
3  100.0
4    5.0
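
The fill rule shown in the examples above can be reproduced without a server. The following is a minimal sketch in plain Python, not Arkouda code: fillna_sketch is a hypothetical name, lists stand in for pdarray and Series values, and positional alignment of the fill values is an assumption.

```python
import math
from typing import List, Sequence, Union


def fillna_sketch(values: Sequence[float],
                  fill: Union[int, float, Sequence[float]]) -> List[float]:
    """Hypothetical model: replace NaN entries with a scalar or per-position value."""
    if isinstance(fill, (int, float)):
        # A scalar fill value is broadcast to every position.
        fills = [fill] * len(values)
    else:
        fills = list(fill)
        if len(fills) != len(values):
            raise ValueError("fill values must match the data length")
    # Keep existing values; only NaN positions take the fill value.
    return [f if math.isnan(v) else v for v, f in zip(values, fills)]


data = [1.0, math.nan, 3.0, math.nan, 5.0]
filled_scalar = fillna_sketch(data, 100.0)
filled_array = fillna_sketch(data, [1.0] * 5)
```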

classmethod from_return_msg(repMsg: str) Series[source]

Return a Series instance pointing to components created by the arkouda server. The user should not call this function directly.

Parameters:

repMsg (str) – Delimited string containing the values and indexes.

Returns:

A Series representing a set of pdarray components on the server

Return type:

Series

Raises:

RuntimeError – Raised if a server-side error is thrown in the process of creating the Series instance

has_repeat_labels() bool[source]

Return whether the Series has any labels that appear more than once.
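
As a minimal local illustration of this check, in plain Python rather than Arkouda code, and assuming the labels are hashable, a repeat exists exactly when the set of labels is smaller than the list of labels. The name has_repeat_labels_sketch is hypothetical.

```python
from typing import Hashable, Sequence


def has_repeat_labels_sketch(labels: Sequence[Hashable]) -> bool:
    """Hypothetical model: True if any label appears more than once."""
    # A set drops duplicates, so a size difference means a repeat exists.
    return len(set(labels)) != len(labels)


unique_case = has_repeat_labels_sketch([0, 1, 2])
repeat_case = has_repeat_labels_sketch(["a", "b", "a"])
```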

hasnans() arkouda.numpy.dtypes.bool_scalars[source]

Return True if there are any NaNs.

Return type:

bool

Examples

>>> import arkouda as ak
>>> ak.connect()
>>> from arkouda import Series
>>> import numpy as np
>>> s = ak.Series(ak.array([1, 2, 3, np.nan]))
>>> s
     0
0  1.0
1  2.0
2  3.0
3  nan

>>> s.hasnans()
True
head(n: int = 10) Series[source]

Return the first n values of the Series.

property iat: _iLocIndexer

Accesses entries of a Series by position.

Returns:

An indexer for position-based access to a single element.

Return type:

_iLocIndexer

property iloc: _iLocIndexer

Accesses entries of a Series by position.

Returns:

An indexer for position-based access to Series entries.

Return type:

_iLocIndexer

is_registered() bool[source]

Return True iff the object is contained in the registry or is a component of a registered object.

Returns:

Indicates if the object is contained in the registry

Return type:

bool

Raises:

RegistrationError – Raised if there’s a server-side error or a mismatch of registered components

See also

register, attach, unregister

Notes

Objects registered with the server are immune to deletion until they are unregistered.

isin(lst: arkouda.numpy.pdarrayclass.pdarray | arkouda.numpy.strings.Strings | List) Series[source]

Find Series elements whose values are in the specified list.

Parameters:

lst (pdarray, Strings, or List) – Either a Python list or an Arkouda array to check membership against.

Returns:

A Series of booleans that is True for elements found in the list, and False otherwise.

Return type:

Series

isna() Series[source]

Detect missing values.

Return a boolean same-sized object indicating if the values are NA. NA values, such as numpy.NaN, get mapped to True values. Everything else gets mapped to False values. Characters such as empty strings ‘’ are not considered NA values.

Returns:

Mask of bool values for each element in Series that indicates whether an element is an NA value.

Return type:

Series

Examples

>>> import arkouda as ak
>>> ak.connect()
>>> from arkouda import Series
>>> import numpy as np
>>> s = Series(ak.array([1, 2, np.nan]), index = ak.array([1, 2, 4]))
>>> s.isna()
       0
1  False
2  False
4   True
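
The mask semantics can be sketched locally. This is plain Python on a list of floats, not Arkouda code; isna_sketch and notna_sketch are hypothetical names, and NaN is the only NA value modelled.

```python
import math
from typing import List, Sequence


def isna_sketch(values: Sequence[float]) -> List[bool]:
    """Hypothetical model: True where the value is NaN."""
    return [math.isnan(v) for v in values]


def notna_sketch(values: Sequence[float]) -> List[bool]:
    """Hypothetical model: the element-wise complement of isna_sketch."""
    return [not flag for flag in isna_sketch(values)]


values = [1.0, 2.0, math.nan]
na_mask = isna_sketch(values)
present_mask = notna_sketch(values)
```

The two masks are always complements of each other, which is why notna needs no separate rule.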

isnull() Series[source]

Series.isnull is an alias for Series.isna.

Detect missing values.

Return a boolean same-sized object indicating if the values are NA. NA values, such as numpy.NaN, get mapped to True values. Everything else gets mapped to False values. Characters such as empty strings ‘’ are not considered NA values.

Returns:

Mask of bool values for each element in Series that indicates whether an element is an NA value.

Return type:

Series

Examples

>>> import arkouda as ak
>>> ak.connect()
>>> from arkouda import Series
>>> import numpy as np
>>> s = Series(ak.array([1, 2, np.nan]), index = ak.array([1, 2, 4]))
>>> s.isnull()
       0
1  False
2  False
4   True

property loc: _LocIndexer

Accesses entries of a Series by label.

Returns:

An indexer for label-based access to Series entries.

Return type:

_LocIndexer

locate(key: int | arkouda.numpy.pdarrayclass.pdarray | arkouda.pandas.index.Index | Series | List | Tuple) Series[source]

Lookup values by index label.

Parameters:

key (int, pdarray, Index, Series, List, or Tuple) –

The key or keys to look up. This can be:

  • A scalar

  • A list of scalars

  • A list of lists (for MultiIndex)

  • A Series (in which case labels are preserved, and its values are used as keys)

Keys will be converted to Arkouda arrays as needed.

Returns:

A Series containing the values corresponding to the key.

Return type:

Series

map(arg: dict | arkouda.Series) arkouda.Series[source]

Map values of Series according to an input mapping.

Parameters:

arg (dict or Series) – The mapping correspondence.

Returns:

A new series with the same index as the caller. When the input Series has Categorical values, the return Series will have Strings values. Otherwise, the return type will match the input type.

Return type:

Series

Raises:

TypeError – Raised if arg is not of type dict or arkouda.Series. Raised if the Series values are not of type pdarray, Categorical, or Strings.

Examples

>>> import arkouda as ak
>>> ak.connect()
>>> s = ak.Series(ak.array([2, 3, 2, 3, 4]))
>>> s
   0
0  2
1  3
2  2
3  3
4  4

>>> s.map({4: 25.0, 2: 30.0, 1: 7.0, 3: 5.0})
      0
0  30.0
1   5.0
2  30.0
3   5.0
4  25.0

>>> s2 = ak.Series(ak.array(["a","b","c","d"]), index = ak.array([4,2,1,3]))
>>> s.map(s2)
   0
0  b
1  d
2  b
3  d
4  a
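
When the mapping is a Series, the examples above suggest that its index plays the role of dictionary keys and its values the role of dictionary values. The following is a minimal local sketch of that reading in plain Python, not Arkouda code; map_sketch is a hypothetical name, and the handling of values missing from the mapping is not covered here.

```python
from typing import Hashable, List, Sequence


def map_sketch(values: Sequence[Hashable],
               mapping_index: Sequence[Hashable],
               mapping_values: Sequence) -> List:
    """Hypothetical model: look each value up in an index -> value mapping."""
    # Treat the mapping Series as a dict keyed by its index labels.
    lookup = dict(zip(mapping_index, mapping_values))
    # Every value in this sketch is assumed to be present in the mapping.
    return [lookup[v] for v in values]


mapped = map_sketch([2, 3, 2, 3, 4], [4, 2, 1, 3], ["a", "b", "c", "d"])
```

The caller's positions are preserved, so the result lines up with the original index.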

memory_usage(index: bool = True, unit: Literal['B', 'KB', 'MB', 'GB'] = 'B') int[source]

Return the memory usage of the Series.

The memory usage can optionally include the contribution of the index.

Parameters:
  • index (bool, default=True) – Specifies whether to include the memory usage of the Series index.

  • unit ({"B", "KB", "MB", "GB"}, default = "B") – Unit to return. One of {‘B’, ‘KB’, ‘MB’, ‘GB’}.

Returns:

Memory consumed, expressed in the requested unit (bytes by default).

Return type:

int

See also

arkouda.numpy.pdarrayclass.nbytes, arkouda.Index.memory_usage, arkouda.pandas.series.Series.memory_usage, arkouda.pandas.dataframe.DataFrame.memory_usage

Examples

>>> import arkouda as ak
>>> from arkouda.series import Series
>>> s = ak.Series(ak.arange(3))
>>> s.memory_usage()
48

Not including the index gives the size of the rest of the data, which is necessarily smaller:

>>> s.memory_usage(index=False)
24

Select the units:

>>> s = ak.Series(ak.arange(3000))
>>> s.memory_usage(unit="KB")
46.875
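
The numbers in the examples above can be reproduced with simple arithmetic. The sketch below is plain Python, not Arkouda code, and rests on assumptions not stated in this section: 8-byte int64 elements, a default index that is an integer array of the same length as the data, and units that are powers of 1024. The name memory_usage_sketch is hypothetical.

```python
# Assumed unit sizes: powers of 1024 (consistent with the 46.875 KB example).
UNIT_DIVISORS = {"B": 1, "KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}


def memory_usage_sketch(n: int, itemsize: int = 8, index: bool = True, unit: str = "B"):
    """Hypothetical model: bytes used by n values plus an optional n-label index."""
    nbytes = n * itemsize            # the values themselves
    if index:
        nbytes += n * itemsize       # assumed integer index of equal length
    if unit == "B":
        return nbytes
    return nbytes / UNIT_DIVISORS[unit]


with_index = memory_usage_sketch(3)                   # 3 * 8 + 3 * 8
without_index = memory_usage_sketch(3, index=False)   # 3 * 8
in_kb = memory_usage_sketch(3000, unit="KB")          # 48000 / 1024
```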
property ndim: int
notna() Series[source]

Detect existing (non-missing) values.

Return a boolean same-sized object indicating if the values are not NA. Non-missing values get mapped to True. Characters such as empty strings ‘’ are not considered NA values. NA values, such as numpy.NaN, get mapped to False values.

Returns:

Mask of bool values for each element in Series that indicates whether an element is not an NA value.

Return type:

Series

Examples

>>> import arkouda as ak
>>> ak.connect()
>>> from arkouda import Series
>>> import numpy as np
>>> s = Series(ak.array([1, 2, np.nan]), index = ak.array([1, 2, 4]))
>>> s.notna()
       0
1   True
2   True
4  False

notnull() Series[source]

Series.notnull is an alias for Series.notna.

Detect existing (non-missing) values.

Return a boolean same-sized object indicating if the values are not NA. Non-missing values get mapped to True. Characters such as empty strings ‘’ are not considered NA values. NA values, such as numpy.NaN, get mapped to False values.

Returns:

Mask of bool values for each element in Series that indicates whether an element is not an NA value.

Return type:

Series

Examples

>>> import arkouda as ak
>>> ak.connect()
>>> from arkouda import Series
>>> import numpy as np
>>> s = Series(ak.array([1, 2, np.nan]), index = ak.array([1, 2, 4]))
>>> s.notnull()
       0
1   True
2   True
4  False

objType = 'Series'
static pdconcat(arrays: List, axis: int = 0, labels: arkouda.numpy.strings.Strings | None = None) pandas.Series | pandas.DataFrame[source]

Concatenate a list of Arkouda Series or grouped arrays, returning a local pandas object.

If a list of grouped Arkouda arrays is passed, they are converted to Series. Each grouping is a 2-tuple with the first item being the key(s) and the second the value.

If axis=1 (horizontal), each Series or grouping must have the same length and the same index. The index is converted to a column in the resulting DataFrame. If it is a MultiIndex, each level is converted to a separate column.

Parameters:
  • arrays (List) – A list of Series or groupings (tuples of index and values) to concatenate.

  • axis (int, default=0) – The axis along which to concatenate: 0 = vertical (stack into a Series); 1 = horizontal (align by index into a DataFrame).

  • labels (Strings or None, optional) – Names to assign to the resulting columns in the DataFrame.

Returns:

  • If axis=0: a local pandas Series

  • If axis=1: a local pandas DataFrame

Return type:

pandas.Series or pandas.DataFrame

register(user_defined_name: str)[source]

Register this Series object and its underlying components with the Arkouda server.

Parameters:

user_defined_name (str) – User-defined name the Series is to be registered under. This will be the root name for the underlying components.

Returns:

The same Series, which is now registered with the arkouda server and has an updated name. This is an in-place modification; the original is returned to support a fluent programming style. Note that you cannot register two different Series with the same name.

Return type:

Series

Raises:
  • TypeError – Raised if user_defined_name is not a str

  • RegistrationError – If the server was unable to register the Series with the user_defined_name

See also

unregister, attach, is_registered

Notes

Objects registered with the server are immune to deletion until they are unregistered.

registered_name: str | None = None
property shape: Tuple[int]
size
sort_index(ascending: bool = True) Series[source]

Sort the Series by its index.

Parameters:

ascending (bool, default=True) – Whether to sort the index in ascending (default) or descending order.

Returns:

A new Series sorted by index.

Return type:

Series

sort_values(ascending: bool = True) Series[source]

Sort the Series by its values.

Parameters:

ascending (bool, default=True) – Whether to sort values in ascending (default) or descending order.

Returns:

A new Series sorted by its values.

Return type:

Series
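
Both sort methods reorder index and values together, so each label stays attached to its value. The following is a minimal local sketch in plain Python, not Arkouda code; sort_index_sketch and sort_values_sketch are hypothetical names, and lists stand in for the index and values arrays.

```python
from typing import List, Sequence, Tuple


def sort_index_sketch(index: Sequence, values: Sequence,
                      ascending: bool = True) -> Tuple[List, List]:
    """Hypothetical model: reorder (label, value) pairs by label."""
    pairs = sorted(zip(index, values), key=lambda p: p[0], reverse=not ascending)
    return [i for i, _ in pairs], [v for _, v in pairs]


def sort_values_sketch(index: Sequence, values: Sequence,
                       ascending: bool = True) -> Tuple[List, List]:
    """Hypothetical model: reorder (label, value) pairs by value."""
    pairs = sorted(zip(index, values), key=lambda p: p[1], reverse=not ascending)
    return [i for i, _ in pairs], [v for _, v in pairs]


by_index = sort_index_sketch([3, 1, 2], ["c", "a", "b"])
by_values_desc = sort_values_sketch([3, 1, 2], [0.5, 2.0, 1.0], ascending=False)
```

In both cases a new pair of lists is returned and the inputs are left unchanged, matching the "new Series" wording above.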

str
tail(n: int = 10) Series[source]

Return the last n values of the Series.

to_dataframe(index_labels: List[str] | None = None, value_label: str | None = None) arkouda.pandas.dataframe.DataFrame[source]

Convert the Series to an Arkouda DataFrame.

Parameters:
  • index_labels (list of str or None, optional) – Column name(s) to label the index.

  • value_label (str or None, optional) – Column name to label the values.

Returns:

An Arkouda DataFrame representing the Series.

Return type:

DataFrame

to_markdown(mode='wt', index=True, tablefmt='grid', storage_options=None, **kwargs)[source]

Print Series in Markdown-friendly format.

Parameters:
  • mode (str, optional) – Mode in which file is opened, “wt” by default.

  • index (bool, optional, default True) – Add index (row) labels.

  • tablefmt (str = "grid") – Table format to call from tabulate: https://pypi.org/project/tabulate/

  • storage_options (dict, optional) – Extra options that make sense for a particular storage connection, e.g. host, port, username, password, etc., if using a URL that will be parsed by fsspec, e.g., starting “s3://”, “gcs://”. An error will be raised if providing this argument with a non-fsspec URL. See the fsspec and backend storage implementation docs for the set of allowed keys and values.

  • **kwargs – These parameters will be passed to tabulate.

Note

This function should only be called on small Series as it calls pandas.Series.to_markdown: https://pandas.pydata.org/docs/reference/api/pandas.Series.to_markdown.html

Examples

>>> import arkouda as ak
>>> ak.connect()
>>> s = ak.Series(["elk", "pig", "dog", "quetzal"], name="animal")
>>> print(s.to_markdown())
|    | animal   |
|---:|:---------|
|  0 | elk      |
|  1 | pig      |
|  2 | dog      |
|  3 | quetzal  |

Output markdown with a tabulate option.

>>> print(s.to_markdown(tablefmt="grid"))
+----+----------+
|    | animal   |
+====+==========+
|  0 | elk      |
+----+----------+
|  1 | pig      |
+----+----------+
|  2 | dog      |
+----+----------+
|  3 | quetzal  |
+----+----------+
to_ndarray() numpy.ndarray[source]
to_pandas() pandas.Series[source]

Convert the Series to a local pandas Series.

tolist() list[source]
topn(n: int = 10) Series[source]

Return the top values of the Series.

Parameters:

n (int, default=10) – Number of values to return. The default of 10 returns the top 10 values.

Returns:

A new Series containing the top n values.

Return type:

Series

unregister()[source]

Unregister this Series object from the arkouda server. The object must previously have been registered using register() and/or attached to using attach().

Raises:

RegistrationError – If the object is already unregistered or if there is a server error when attempting to unregister

See also

register, attach, is_registered

Notes

Objects registered with the server are immune to deletion until they are unregistered.

validate_key(key: Series | arkouda.numpy.pdarrayclass.pdarray | arkouda.numpy.strings.Strings | arkouda.pandas.categorical.Categorical | List | supported_scalars | arkouda.numpy.segarray.SegArray) arkouda.numpy.pdarrayclass.pdarray | arkouda.numpy.strings.Strings | arkouda.pandas.categorical.Categorical | supported_scalars | arkouda.numpy.segarray.SegArray[source]

Validate type requirements for keys when reading or writing the Series. Also converts list and tuple arguments into pdarrays.

Parameters:

key (Series, pdarray, Strings, Categorical, List, supported_scalars, or SegArray) – The key or container of keys that might be used to index into the Series.

Return type:

The validated key(s), with lists and tuples converted to pdarrays

Raises:
  • TypeError – Raised if keys are not boolean values or the type of the labels. Raised if key is not one of the supported types.

  • KeyError – Raised if container of keys has keys not present in the Series

  • IndexError – Raised if the length of a boolean key array is different from the Series
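
The boolean-key rule in the Raises list can be illustrated locally. This is a minimal sketch in plain Python, not Arkouda code: select_with_mask is a hypothetical helper, lists stand in for pdarray objects, and only the boolean-mask case is modelled.

```python
from typing import List, Sequence


def select_with_mask(values: Sequence, mask: Sequence[bool]) -> List:
    """Hypothetical model of validating and applying a boolean key."""
    if not all(isinstance(m, bool) for m in mask):
        # Non-boolean entries are not a valid mask in this sketch.
        raise TypeError("mask must contain only boolean values")
    if len(mask) != len(values):
        # Mirrors the documented IndexError for a length mismatch.
        raise IndexError("boolean key length differs from the Series length")
    return [v for v, keep in zip(values, mask) if keep]


selected = select_with_mask([10, 20, 30], [True, False, True])
```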

validate_val(val: arkouda.numpy.pdarrayclass.pdarray | arkouda.numpy.strings.Strings | supported_scalars | List) arkouda.numpy.pdarrayclass.pdarray | arkouda.numpy.strings.Strings | supported_scalars[source]

Validate type requirements for values being written into the Series. Also converts list and tuple arguments into pdarrays.

Parameters:

val (pdarray, Strings, supported_scalars, or List) – The value or container of values that might be assigned into the Series.

Return type:

The validated value, with lists converted to pdarrays

Raises:

TypeError – Raised if val is not the same type as the Series, or a container with elements of the same type as the Series. Raised if val is a string or Strings type. Raised if val is not one of the supported types.

value_counts(sort: bool = True) Series[source]

Return a Series containing counts of unique values.

Parameters:

sort (bool, default=True) – Whether to sort the result by count in descending order. If False, the order of the results is not guaranteed.

Returns:

A Series where the index contains the unique values and the values are their counts in the original Series.

Return type:

Series
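
The shape of the result can be sketched locally with collections.Counter. This is plain Python, not Arkouda code; value_counts_sketch is a hypothetical name, and the returned pair of lists stands in for the index and values of the resulting Series.

```python
from collections import Counter
from typing import Hashable, List, Sequence, Tuple


def value_counts_sketch(values: Sequence[Hashable],
                        sort: bool = True) -> Tuple[List, List[int]]:
    """Hypothetical model: unique values as labels, their counts as values."""
    items = list(Counter(values).items())
    if sort:
        # Most frequent first, as described for sort=True.
        items.sort(key=lambda kv: kv[1], reverse=True)
    labels = [k for k, _ in items]
    counts = [c for _, c in items]
    return labels, counts


labels, counts = value_counts_sketch(["a", "b", "a", "c", "a", "b"])
```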