arkouda.series
Classes
Series: One-dimensional arkouda array with axis labels.
Module Contents
- class arkouda.series.Series
One-dimensional arkouda array with axis labels.
- Parameters:
- Raises:
TypeError – Raised if index is not a pdarray or Strings object. Raised if data is not a pdarray, Strings, or Categorical object.
ValueError – Raised if the index size does not match the data size.
Notes
The Series class accepts either positional arguments or keyword arguments.
With positional arguments:
- 2 arguments entered: argument 1 is data, argument 2 is index.
- 1 argument entered: argument 1 is data.
A single positional argument is assumed to be the data argument. If only the ‘data’ argument is passed in, an Index is generated automatically.
With keyword arguments:
- ‘data’ (see Parameters)
- ‘index’ (optional); must match the size of ‘data’.
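The argument rules in the Notes can be sketched in plain Python. This is only an illustration, not arkouda's implementation: `resolve_series_args` is a hypothetical helper, Python lists stand in for arkouda arrays, and the generated index is assumed to be 0..n-1, as the example outputs on this page suggest.

```python
def resolve_series_args(*args, **kwargs):
    """Hypothetical helper: return (data, index) following the Notes above."""
    if len(args) == 2:
        # Two positional arguments: data first, then index.
        data, index = args
    elif len(args) == 1:
        # One positional argument is taken to be the data.
        data, index = args[0], kwargs.get("index")
    else:
        # Keyword form: 'data' is required, 'index' is optional.
        data, index = kwargs.get("data"), kwargs.get("index")
    if data is None:
        raise TypeError("data is required")
    if index is None:
        # Assumption: the automatically generated index is 0..n-1.
        index = list(range(len(data)))
    if len(index) != len(data):
        raise ValueError("index size does not match data size")
    return list(data), list(index)


data, index = resolve_series_args([10, 20, 30])
print(index)  # [0, 1, 2]
```

A mismatched call such as `resolve_series_args([1, 2], ["a"])` raises `ValueError`, mirroring the Raises entry above.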
- argmax()
- argmin()
- property at
Accesses entries of a Series by label.
- attach(label: str, nkeys: int = 1) → Series
DEPRECATED. Retrieve a series registered with arkouda.
- Parameters:
label (str) – Name used to register the series.
nkeys (int) – Number of keys, if a multi-index was registered.
- concat(arrays: List, axis: int = 0, index_labels: List[str] | None = None, value_labels: List[str] | None = None, ordered=False) → arkouda.dataframe.DataFrame | Series
Concatenate, in arkouda, a list of arkouda Series or grouped arkouda arrays horizontally or vertically.
If a list of grouped arkouda arrays is passed, they are converted to a series. Each grouping is a 2-tuple whose first item is the key(s) and whose second item is the value. For a horizontal concatenation, each series or grouping must have the same length and the same index. The index of the series is converted to a column in the dataframe. If it is a multi-index, each level is converted to a column.
- Parameters:
arrays – The list of series/groupings to concatenate.
axis – Whether to perform a vertical (axis=0) or horizontal (axis=1) concatenation.
index_labels – Column name(s) to label the index.
value_labels – Column names to label the values of each series.
ordered – If True, the arrays are appended in the order given. If False (the default shown in the signature), array data may be interleaved in blocks, which can greatly improve performance but results in non-deterministic ordering of elements.
- Returns:
axis=0: an arkouda Series. axis=1: an arkouda DataFrame.
- diff() → Series
Diffs consecutive values of the series.
Returns a new series with the same index and length. The first value is set to NaN.
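The shape of the result can be illustrated in plain Python. This sketch assumes the usual convention that each entry is the current value minus the previous one. `diff_values` is a hypothetical name, and a Python list stands in for the Series values.

```python
import math


def diff_values(values):
    """Illustrative only: NaN first, then current-minus-previous differences."""
    if not values:
        return []
    # zip pairs each value with its successor: (v0, v1), (v1, v2), ...
    return [math.nan] + [float(b - a) for a, b in zip(values, values[1:])]


result = diff_values([1, 4, 9, 16])
print(result)  # [nan, 3.0, 5.0, 7.0]
```

The output has the same length as the input, which is why the first slot has to be filled with NaN.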
- dt(series)
- property dtype
- fillna(value) → Series
Fill NA/NaN values using the specified value.
- Parameters:
value (scalar, Series, or pdarray) – Value to use to fill holes (e.g. 0), or alternatively a Series of values specifying which value to use for each index. Values not in the Series will not be filled. This value cannot be a list.
- Returns:
Object with missing values filled.
- Return type:
Series
Examples
>>> import arkouda as ak
>>> import numpy as np
>>> ak.connect()
>>> from arkouda import Series
>>> data = ak.Series([1, np.nan, 3, np.nan, 5])
>>> data
     0
0    1
1  nan
2    3
3  nan
4    5
>>> fill_values1 = ak.ones(5)
>>> data.fillna(fill_values1)
   0
0  1
1  1
2  3
3  1
4  5
>>> fill_values2 = Series(ak.ones(5))
>>> data.fillna(fill_values2)
   0
0  1
1  1
2  3
3  1
4  5
>>> fill_values3 = 100.0
>>> data.fillna(fill_values3)
     0
0    1
1  100
2    3
3  100
4    5
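The filling rule shown in these examples can be mimicked without a server. The sketch below is a plain-Python illustration, not arkouda code: `fill_missing` is a hypothetical name, and a Python list stands in for a pdarray or Series of per-position fill values (the real method does not accept a list).

```python
import math


def fill_missing(values, fill):
    """Illustrative only: replace NaN entries with a scalar or a per-position value."""
    per_position = isinstance(fill, (list, tuple))
    filled = []
    for i, v in enumerate(values):
        if isinstance(v, float) and math.isnan(v):
            # Use the fill value at the same position, or the scalar itself.
            filled.append(fill[i] if per_position else fill)
        else:
            filled.append(v)
    return filled


data = [1.0, math.nan, 3.0, math.nan, 5.0]
print(fill_missing(data, 100.0))      # [1.0, 100.0, 3.0, 100.0, 5.0]
print(fill_missing(data, [1.0] * 5))  # [1.0, 1.0, 3.0, 1.0, 5.0]
```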
- from_return_msg(repMsg: str) → Series
Return a Series instance pointing to components created by the arkouda server. The user should not call this function directly.
- Parameters:
repMsg (str) – Delimited string containing the values and indexes.
- Returns:
A Series representing a set of pdarray components on the server.
- Return type:
Series
- Raises:
RuntimeError – Raised if a server-side error is thrown in the process of creating the Series instance.
- has_repeat_labels() → bool
Returns whether the Series has any labels that appear more than once.
- hasnans() → bool
Return True if there are any NaNs.
- Return type:
bool
Examples
>>> import arkouda as ak
>>> ak.connect()
>>> from arkouda import Series
>>> import numpy as np
>>> s = ak.Series(ak.array([1, 2, 3, np.nan]))
>>> s
>>> s.hasnans()
True
- property iat: Series
Accesses entries of a Series by position.
- Parameters:
key (int) – The positions or container of positions to access entries for.
- property iloc: Series
Accesses entries of a Series by position.
- Parameters:
key (int) – The positions or container of positions to access entries for.
- is_registered() → bool
Return True if and only if the object is contained in the registry or is a component of a registered object.
- Returns:
Indicates if the object is contained in the registry.
- Return type:
numpy.bool
- Raises:
RegistrationError – Raised if there’s a server-side error or a mismatch of registered components.
See also
register, attach, unregister
Notes
Objects registered with the server are immune to deletion until they are unregistered.
- isin(lst: pdarray | Strings | List) → Series
Find series elements whose values are in the specified list.
- Parameters:
lst (pdarray, Strings, or List) – Either a Python list or an arkouda array.
- Returns:
Arkouda boolean which is true for elements that are in the list and false otherwise.
- isna() → Series
Detect missing values.
Return a boolean same-sized object indicating if the values are NA. NA values, such as numpy.NaN, get mapped to True values. Everything else gets mapped to False values. Characters such as empty strings ‘’ are not considered NA values.
- Returns:
Mask of bool values for each element in Series that indicates whether an element is an NA value.
- Return type:
Series
Examples
>>> import arkouda as ak
>>> ak.connect()
>>> from arkouda import Series
>>> import numpy as np
>>> s = Series(ak.array([1, 2, np.nan]), index = ak.array([1, 2, 4]))
>>> s.isna()
       0
1  False
2  False
4   True
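The NA rule described above (NaN counts as missing, an empty string does not) can be checked with a small plain-Python sketch. `is_missing` is a hypothetical helper used only for illustration. It is not part of arkouda.

```python
import math


def is_missing(value) -> bool:
    """Illustrative only: NaN is treated as NA, everything else is not."""
    return isinstance(value, float) and math.isnan(value)


mask = [is_missing(v) for v in [1.0, 2.0, math.nan]]
string_mask = [is_missing(v) for v in ["a", "", "c"]]
print(mask)         # [False, False, True]
print(string_mask)  # [False, False, False]
```

Negating each entry of `mask` gives the complementary "not NA" mask.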
- isnull() → Series
Series.isnull is an alias for Series.isna.
Detect missing values.
Return a boolean same-sized object indicating if the values are NA. NA values, such as numpy.NaN, get mapped to True values. Everything else gets mapped to False values. Characters such as empty strings ‘’ are not considered NA values.
- Returns:
Mask of bool values for each element in Series that indicates whether an element is an NA value.
- Return type:
Series
Examples
>>> import arkouda as ak
>>> ak.connect()
>>> from arkouda import Series
>>> import numpy as np
>>> s = Series(ak.array([1, 2, np.nan]), index = ak.array([1, 2, 4]))
>>> s.isnull()
       0
1  False
2  False
4   True
- locate(key: int | pdarray | Index | Series | List | Tuple) → Series
Look up values by index label.
- Parameters:
key (int, pdarray, Index, Series, List, or Tuple) – The input can be a scalar, a list of scalars, or a list of lists (if the series has a MultiIndex). As a special case, if a Series is used as the key, the series labels are preserved and its values are used as the key. Keys will be turned into arkouda arrays as needed.
- Returns:
A Series containing the values corresponding to the key.
- map(arg: dict | Series) → Series
Map values of Series according to an input mapping.
- Parameters:
arg (dict or Series) – The mapping correspondence.
- Returns:
A new series with the same index as the caller. When the input Series has Categorical values, the return Series will have Strings values. Otherwise, the return type will match the input type.
- Return type:
Series
- Raises:
TypeError – Raised if arg is not of type dict or arkouda.Series. Raised if series values are not of type pdarray, Categorical, or Strings.
Examples
>>> import arkouda as ak
>>> ak.connect()
>>> s = ak.Series(ak.array([2, 3, 2, 3, 4]))
>>> display(s)
   0
0  2
1  3
2  2
3  3
4  4
>>> s.map({4: 25.0, 2: 30.0, 1: 7.0, 3: 5.0})
      0
0  30.0
1   5.0
2  30.0
3   5.0
4  25.0
>>> s2 = ak.Series(ak.array(["a","b","c","d"]), index = ak.array([4,2,1,3]))
>>> s.map(s2)
   0
0  b
1  d
2  b
3  d
4  a
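The dict case can be reproduced in plain Python to see the lookup that map performs for each element. This is only a sketch of the semantics, with hypothetical names (`map_values`, `series_to_lookup`) and lists standing in for arkouda arrays. How arkouda treats values missing from the mapping is not covered here.

```python
def map_values(values, mapping):
    """Illustrative only: look each value up in the mapping, keeping order."""
    return [mapping[v] for v in values]


def series_to_lookup(index, values):
    """Illustrative only: a Series used as a mapping pairs its labels with its values."""
    return dict(zip(index, values))


values = [2, 3, 2, 3, 4]
print(map_values(values, {4: 25.0, 2: 30.0, 1: 7.0, 3: 5.0}))
# [30.0, 5.0, 30.0, 5.0, 25.0]

lookup = series_to_lookup([4, 2, 1, 3], ["a", "b", "c", "d"])
print(lookup[4])  # a
```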
- max()
- mean()
- memory_usage(index: bool = True, unit='B') → int
Return the memory usage of the Series.
The memory usage can optionally include the contribution of the index.
- Parameters:
index (bool, default True) – Specifies whether to include the memory usage of the Series index.
unit (str, default "B") – Unit to return. One of {‘B’, ‘KB’, ‘MB’, ‘GB’}.
- Returns:
Memory consumed, expressed in the requested unit (bytes by default).
- Return type:
int
See also
arkouda.pdarrayclass.nbytes, arkouda.index.Index.memory_usage, arkouda.series.Series.memory_usage, arkouda.dataframe.DataFrame.memory_usage
Examples
>>> import arkouda as ak
>>> ak.connect()
>>> s = ak.Series(ak.arange(3))
>>> s.memory_usage()
48
Not including the index gives the size of the rest of the data, which is necessarily smaller:
>>> s.memory_usage(index=False)
24
Select the units:
>>> s = ak.Series(ak.arange(3000))
>>> s.memory_usage(unit="KB")
46.875
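The numbers in these examples follow from simple arithmetic. The sketch below assumes 8 bytes per element for both values and index (int64), and 1024 bytes per KB, which is consistent with the outputs shown. `estimate_memory_usage` is a hypothetical helper, not an arkouda function.

```python
# Assumption: binary units, i.e. 1 KB = 1024 bytes.
UNIT_FACTORS = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3}


def estimate_memory_usage(n_values, itemsize=8, include_index=True, unit="B"):
    """Illustrative only: bytes used by the values, plus the index if requested."""
    nbytes = n_values * itemsize
    if include_index:
        # Assumption: the index holds one element of the same size per value.
        nbytes += n_values * itemsize
    return nbytes / UNIT_FACTORS[unit]


print(estimate_memory_usage(3))                       # 48.0
print(estimate_memory_usage(3, include_index=False))  # 24.0
print(estimate_memory_usage(3000, unit="KB"))         # 46.875
```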
- min()
- property ndim
- notna() → Series
Detect existing (non-missing) values.
Return a boolean same-sized object indicating if the values are not NA. Non-missing values get mapped to True. Characters such as empty strings ‘’ are not considered NA values. NA values, such as numpy.NaN, get mapped to False values.
- Returns:
Mask of bool values for each element in Series that indicates whether an element is not an NA value.
- Return type:
Series
Examples
>>> import arkouda as ak
>>> ak.connect()
>>> from arkouda import Series
>>> import numpy as np
>>> s = Series(ak.array([1, 2, np.nan]), index = ak.array([1, 2, 4]))
>>> s.notna()
       0
1   True
2   True
4  False
- notnull() → Series
Series.notnull is an alias for Series.notna.
Detect existing (non-missing) values.
Return a boolean same-sized object indicating if the values are not NA. Non-missing values get mapped to True. Characters such as empty strings ‘’ are not considered NA values. NA values, such as numpy.NaN, get mapped to False values.
- Returns:
Mask of bool values for each element in Series that indicates whether an element is not an NA value.
- Return type:
Series
Examples
>>> import arkouda as ak
>>> ak.connect()
>>> from arkouda import Series
>>> import numpy as np
>>> s = Series(ak.array([1, 2, np.nan]), index = ak.array([1, 2, 4]))
>>> s.notnull()
       0
1   True
2   True
4  False
- objType(*args, **kwargs)
- pdconcat(arrays: List, axis: int = 0, labels: Strings | None = None) → pd.Series | pd.DataFrame
Concatenate a list of arkouda Series or grouped arkouda arrays, returning a pandas object.
If a list of grouped arkouda arrays is passed, they are converted to a series. Each grouping is a 2-tuple whose first item is the key(s) and whose second item is the value.
For a horizontal concatenation, each series or grouping must have the same length and the same index. The index of the series is converted to a column in the dataframe. If it is a multi-index, each level is converted to a column.
- Parameters:
arrays – The list of series/groupings to concatenate.
axis – Whether to perform a vertical (axis=0) or horizontal (axis=1) concatenation.
labels – Names to give the columns of the dataframe.
- Returns:
axis=0: a local pandas Series. axis=1: a local pandas DataFrame.
- prod()
- register(user_defined_name: str)
Register this Series object and underlying components with the Arkouda server.
- Parameters:
user_defined_name (str) – User-defined name the Series is to be registered under. This will be the root name for underlying components.
- Returns:
The same Series, which is now registered with the arkouda server and has an updated name. This is an in-place modification; the original is returned to support a fluent programming style. Please note you cannot register two different Series with the same name.
- Return type:
Series
- Raises:
TypeError – Raised if user_defined_name is not a str.
RegistrationError – Raised if the server was unable to register the Series with the user_defined_name.
See also
unregister, attach, is_registered
Notes
Objects registered with the server are immune to deletion until they are unregistered.
- property shape
- sort_index(ascending: bool = True) → Series
Sort the series by its index.
- Parameters:
ascending (bool) – Sort values in ascending (default) or descending order.
- Returns:
A new Series sorted by index.
- sort_values(ascending: bool = True) → Series
Sort the series numerically.
- Parameters:
ascending (bool) – Sort values in ascending (default) or descending order.
- Returns:
A new Series sorted smallest to largest (or largest to smallest when ascending is False).
- std()
- str_acc(series)
- sum()
- to_dataframe(index_labels: List[str] | None = None, value_label: str | None = None) → arkouda.dataframe.DataFrame
Converts the series to an arkouda dataframe.
- Parameters:
index_labels – Column name(s) to label the index.
value_label – Column name to label the values.
- Returns:
An arkouda dataframe.
- to_markdown(mode='wt', index=True, tablefmt='grid', storage_options=None, **kwargs)
Print Series in Markdown-friendly format.
- Parameters:
mode (str, optional) – Mode in which file is opened, “wt” by default.
index (bool, optional, default True) – Add index (row) labels.
tablefmt (str, default "grid") – Table format passed to tabulate: https://pypi.org/project/tabulate/
storage_options (dict, optional) – Extra options that make sense for a particular storage connection, e.g. host, port, username, password, etc., if using a URL that will be parsed by fsspec, e.g., starting “s3://”, “gcs://”. An error will be raised if providing this argument with a non-fsspec URL. See the fsspec and backend storage implementation docs for the set of allowed keys and values.
**kwargs – These parameters will be passed to tabulate.
Note
This function should only be called on small Series as it calls pandas.Series.to_markdown: https://pandas.pydata.org/docs/reference/api/pandas.Series.to_markdown.html
Examples
>>> import arkouda as ak
>>> ak.connect()
>>> s = ak.Series(["elk", "pig", "dog", "quetzal"], name="animal")
>>> print(s.to_markdown())
|    | animal   |
|---:|:---------|
|  0 | elk      |
|  1 | pig      |
|  2 | dog      |
|  3 | quetzal  |
Output markdown with a tabulate option.
>>> print(s.to_markdown(tablefmt="grid"))
+----+----------+
|    | animal   |
+====+==========+
|  0 | elk      |
+----+----------+
|  1 | pig      |
+----+----------+
|  2 | dog      |
+----+----------+
|  3 | quetzal  |
+----+----------+
- topn(n: int = 10) → Series
Return the top values of the series.
- Parameters:
n – Number of values to return.
- Returns:
A new Series with the top values.
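As a rough plain-Python picture of what "top values" means, the sketch below assumes the top values are the largest ones and that each keeps its index label. `top_n` is a hypothetical helper, and lists stand in for the Series index and values.

```python
def top_n(index, values, n=10):
    """Illustrative only: the n largest values, each paired with its label."""
    # Sort (label, value) pairs by value, largest first, then keep the first n.
    pairs = sorted(zip(index, values), key=lambda pair: pair[1], reverse=True)
    return pairs[:n]


print(top_n(["a", "b", "c", "d"], [3, 9, 1, 7], n=2))
# [('b', 9), ('d', 7)]
```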
- unregister()
Unregister this Series object in the arkouda server, which was previously registered using register() and/or attached to using attach().
- Raises:
RegistrationError – Raised if the object is already unregistered or if there is a server error when attempting to unregister.
See also
register, attach, is_registered
Notes
Objects registered with the server are immune to deletion until they are unregistered.
- validate_key(key: Series | pdarray | Strings | Categorical | List | supported_scalars | SegArray) → pdarray | Strings | Categorical | supported_scalars | SegArray
Validates type requirements for keys when reading or writing the Series. Also converts list and tuple arguments into pdarrays.
- Parameters:
key (Series, pdarray, Strings, Categorical, List, supported_scalars, SegArray) – The key or container of keys that might be used to index into the Series.
- Returns:
The validated key(s), with lists and tuples converted to pdarrays.
- Raises:
TypeError – Raised if keys are not boolean values or the type of the labels. Raised if key is not one of the supported types.
KeyError – Raised if container of keys has keys not present in the Series.
IndexError – Raised if the length of a boolean key array is different from the Series.
- validate_val(val: pdarray | Strings | supported_scalars | List) → pdarray | Strings | supported_scalars
Validates type requirements for values being written into the Series. Also converts list and tuple arguments into pdarrays.
- Parameters:
val (pdarray, Strings, list, supported_scalars) – The value or container of values that might be assigned into the Series.
- Returns:
The validated value, with lists converted to pdarrays.
- Raises:
TypeError – Raised if val is not the same type or a container with elements of the same type as the Series. Raised if val is a string or Strings type. Raised if val is not one of the supported types.
- value_counts(sort: bool = True) → Series
Return a Series containing counts of unique values.
The resulting object will be in descending order so that the first element is the most frequently-occurring element.
- Parameters:
sort (bool) – Whether or not to sort the results. Default is True.
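The counting and ordering can be pictured with the standard library. This is a plain-Python sketch of the semantics only: `count_values` is a hypothetical name, and the order of results when sort is False is an assumption here (first-seen order), not something this page specifies.

```python
from collections import Counter


def count_values(values, sort=True):
    """Illustrative only: (value, count) pairs, most frequent first when sorted."""
    items = list(Counter(values).items())
    if sort:
        # Descending by count, so the most frequent value comes first.
        items.sort(key=lambda item: item[1], reverse=True)
    return items


print(count_values(["b", "a", "a", "c", "a", "b"]))
# [('a', 3), ('b', 2), ('c', 1)]
```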
- var()