arkouda.pandas.conversion
Functions
| from_series | Convert a pandas Series to an Arkouda pdarray or Strings. |
Module Contents
- arkouda.pandas.conversion.from_series(series: pandas.Series, dtype: type | str | None = None) → arkouda.numpy.pdarrayclass.pdarray | arkouda.numpy.strings.Strings
Convert a pandas Series to an Arkouda pdarray or Strings.
If dtype is not provided, the dtype is inferred from the pandas Series (using pandas dtype metadata). If dtype is provided, it is used as an override and normalized via Arkouda's dtype resolution rules.
In addition to the core numeric and boolean types, this function supports datetime and timedelta Series of any resolution (ns, us, ms, etc.) by converting them to an int64 pdarray of nanoseconds.
- Parameters:
series (pd.Series) – The pandas Series to convert.
dtype (Optional[Union[type, str]], optional) – Optional dtype override. This may be a Python type (e.g. bool), a NumPy scalar type (e.g. np.int64), or a dtype string. String-like spellings are normalized to Arkouda string dtype, including "object", "str", "string", "string[python]", and "string[pyarrow]".
- Returns:
An Arkouda pdarray for numeric, boolean, datetime, or timedelta inputs, or an Arkouda Strings for string inputs.
- Return type:
pdarray | Strings
- Raises:
ValueError – Raised if the dtype cannot be interpreted or is unsupported for conversion.
Examples
>>> import arkouda as ak
>>> import numpy as np
>>> import pandas as pd
Integers:
>>> np.random.seed(1701)
>>> ak.from_series(pd.Series(np.random.randint(0, 10, 5)))
array([4 3 3 5 0])
>>> ak.from_series(pd.Series(['1', '2', '3', '4', '5']), dtype=np.int64)
array([1 2 3 4 5])
Floats:
>>> np.random.seed(1701)
>>> ak.from_series(pd.Series(np.random.uniform(low=0.0, high=1.0, size=3)))
array([0.089433234324597599 0.1153776854774361 0.51874393620990389])
Booleans:
>>> np.random.seed(1864)
>>> ak.from_series(pd.Series(np.random.choice([True, False], size=5)))
array([True True True False False])
Strings (pandas dtype spellings normalized to Arkouda Strings):
>>> ak.from_series(pd.Series(['a', 'b', 'c', 'd', 'e'], dtype="string"))
array(['a', 'b', 'c', 'd', 'e'])
>>> ak.from_series(pd.Series(['a', 'b', 'c'], dtype="string[pyarrow]"))
array(['a', 'b', 'c'])
Datetime (any resolution is accepted and returned as int64 nanoseconds):
>>> ak.from_series(pd.Series(pd.to_datetime(['1/1/2018', np.datetime64('2018-01-01')])))
array([1514764800000000000 1514764800000000000])
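The nanosecond value shown in the datetime example can be cross-checked without Arkouda or pandas. The following is a minimal standard-library sketch, not the library's implementation: `to_epoch_ns` is a hypothetical helper, and treating a naive `datetime` as UTC is an assumption made here for illustration.

```python
from datetime import datetime, timezone

# Unix epoch as a timezone-aware datetime.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_ns(dt: datetime) -> int:
    """Hypothetical helper: nanoseconds since the Unix epoch.

    Uses integer arithmetic only, so no precision is lost to floats.
    Naive datetimes are assumed to be UTC (an assumption of this sketch).
    """
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)
    delta = dt - EPOCH
    whole_seconds = delta.days * 86_400 + delta.seconds
    return whole_seconds * 1_000_000_000 + delta.microseconds * 1_000


ns = to_epoch_ns(datetime(2018, 1, 1))
print(ns)  # 1514764800000000000
```

The result matches the `int64` values in the example above, which is consistent with the documented behavior that datetime inputs are returned as nanoseconds since the epoch.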
Notes
Datetime and timedelta Series are converted to int64 nanoseconds.
String-like pandas dtypes (including object) are treated as string and converted to Arkouda Strings.
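To make the string-spelling rule concrete, here is a small pure-Python sketch of how the listed spellings could collapse to a single string dtype. It is illustrative only and does not reproduce Arkouda's dtype resolution: the helper `normalize_dtype_spelling`, the constant names, and the `"str"` label used for the normalized result are all assumptions of this sketch. Only the five spellings listed in the documentation are recognized, by exact match.

```python
# Spellings the documentation lists as string-like.
STRING_LIKE_SPELLINGS = frozenset(
    {"object", "str", "string", "string[python]", "string[pyarrow]"}
)

# Placeholder label for the normalized string dtype (an assumption).
NORMALIZED_STRING = "str"


def normalize_dtype_spelling(dtype):
    """Hypothetical helper: map string-like dtype spellings to one label.

    Anything that is not one of the listed spellings is returned unchanged,
    so numeric overrides such as "int64" pass through untouched.
    """
    if isinstance(dtype, str) and dtype in STRING_LIKE_SPELLINGS:
        return NORMALIZED_STRING
    return dtype


for spelling in ("object", "string[pyarrow]", "int64"):
    print(spelling, "->", normalize_dtype_spelling(spelling))
```

A lookup in a frozen set keeps the rule easy to audit: adding or removing an accepted spelling is a one-line change, and non-string overrides are never rewritten.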