arkouda.numpy.strings¶
Classes¶
Strings: Represents an array of strings whose data resides on the arkouda server.
Module Contents¶
- class arkouda.numpy.strings.Strings(strings_pdarray: arkouda.numpy.pdarrayclass.pdarray, bytes_size: arkouda.numpy.dtypes.int_scalars)[source]¶
Represents an array of strings whose data resides on the arkouda server. The user should not call this class directly; rather its instances are created by other arkouda functions.
- entry¶
Encapsulation of a Segmented Strings array contained on the arkouda server. This is a composite of
offsets array: starting indices for each string
bytes array: raw bytes of all strings joined by nulls
- Type:
- size¶
The number of strings in the array
- Type:
- nbytes¶
The total number of bytes in all strings
- Type:
- ndim¶
The rank of the array (currently only rank 1 arrays supported)
- Type:
- shape¶
The sizes of each dimension of the array
- Type:
tuple
- logger¶
Used for all logging operations
- Type:
ArkoudaLogger
Notes
Strings is composed of two pdarrays: (1) offsets, which contains the starting indices for each string and (2) bytes, which contains the raw bytes of all strings, delimited by nulls.
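The two-array layout described in the Notes can be sketched locally in plain Python. This is only an illustration of the data layout, not arkouda's implementation: the helper name `to_segments` is hypothetical, and it assumes UTF-8 bytes with a null byte after every string, which is consistent with the get_bytes and get_offsets examples on this page.

```python
def to_segments(strings):
    """Hypothetical helper: encode a list of str as (offsets, raw_bytes)."""
    offsets = []
    raw = bytearray()
    for s in strings:
        offsets.append(len(raw))       # starting index of this string
        raw.extend(s.encode("utf-8"))  # the string's raw bytes
        raw.append(0)                  # null byte that closes the string
    return offsets, list(raw)


offsets, raw_bytes = to_segments(["one", "two", "three"])
print(offsets)    # [0, 4, 8]
print(raw_bytes)  # [111, 110, 101, 0, 116, 119, 111, 0, 116, 104, 114, 101, 101, 0]
```

Storing one flat bytes array plus offsets avoids per-string objects, which is why the number of strings (size) and the total byte count (nbytes) are tracked separately.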
- BinOps¶
- argsort(algorithm: arkouda.numpy.sorting.SortingAlgorithm = SortingAlgorithm.RadixSortLSD, ascending: bool = True) arkouda.numpy.pdarrayclass.pdarray[source]¶
Return the permutation that sorts the Strings.
- Parameters:
algorithm (SortingAlgorithm, default SortingAlgorithm.RadixSortLSD) – The algorithm to use for sorting.
ascending (bool, default True) – Whether to sort in ascending order.
- Returns:
The indices that sort the Strings.
- Return type:
- astype(dtype: numpy.dtype | str) arkouda.numpy.pdarrayclass.pdarray | Strings[source]¶
Cast values of Strings object to provided dtype.
- Parameters:
dtype (np.dtype or str) – Dtype to cast to
- Returns:
An arkouda pdarray with values converted to the specified data type
- Return type:
Notes
This is essentially shorthand for ak.cast(x, '<dtype>') where x is a pdarray.
- cached_regex_patterns() List[source]¶
Returns the regex patterns for which Match objects have been cached.
- capitalize() Strings[source]¶
Return a new Strings in which each string has its first letter capitalized and the remaining letters lowercased.
- Returns:
Strings from the original replaced with the capitalized equivalent.
- Return type:
- Raises:
RuntimeError – Raised if there is a server-side error thrown.
See also
Strings.lower, Strings.upper, Strings.title
Examples
>>> import arkouda as ak
>>> strings = ak.array([f'StrINgS aRe Here {i}' for i in range(5)])
>>> strings
array(['StrINgS aRe Here 0', 'StrINgS aRe Here 1', 'StrINgS aRe Here 2', 'StrINgS aRe Here 3', 'StrINgS aRe Here 4'])
>>> strings.capitalize()
array(['Strings are here 0', 'Strings are here 1', 'Strings are here 2', 'Strings are here 3', 'Strings are here 4'])
- static concatenate_uniquely(strings: List[Strings]) Strings[source]¶
Concatenates a list of Strings into a single Strings object containing only unique strings. Order may not be preserved.
- contains(substr: bytes | arkouda.numpy.dtypes.str_scalars, regex: bool = False) arkouda.numpy.pdarrayclass.pdarray[source]¶
Check whether each element contains the given substring.
- Parameters:
substr (bytes or str_scalars) – The substring in the form of string or byte array to search for
regex (bool, default=False) – Indicates whether substr is a regular expression. Note: only regular expressions supported by re2 are handled (lookaheads and lookbehinds are not supported).
- Returns:
True for elements that contain substr, False otherwise
- Return type:
- Raises:
TypeError – Raised if the substr parameter is not bytes or str_scalars
ValueError – Raised if substr is not a valid regex
RuntimeError – Raised if there is a server-side error thrown
See also
Examples
>>> import arkouda as ak
>>> strings = ak.array([f'{i} string {i}' for i in range(1, 6)])
>>> strings
array(['1 string 1', '2 string 2', '3 string 3', '4 string 4', '5 string 5'])
>>> strings.contains('string')
array([True True True True True])
>>> strings.contains('string \\d', regex=True)
array([True True True True True])
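As a rough local model of what contains computes, the sketch below applies the same test to a plain Python list. The function name `contains_local` is hypothetical, and Python's `re` module stands in for re2 here; `re` accepts constructs (such as lookaheads) that re2 rejects, so the two only agree on patterns both engines support.

```python
import re


def contains_local(strings, substr, regex=False):
    """Hypothetical local model: one boolean per input string."""
    if regex:
        pattern = re.compile(substr)
        # A regex "contains" is a search anywhere in the string.
        return [pattern.search(s) is not None for s in strings]
    return [substr in s for s in strings]


strings = [f"{i} string {i}" for i in range(1, 6)]
plain = contains_local(strings, "string")
with_regex = contains_local(strings, r"string \d", regex=True)
print(plain)
print(with_regex)
```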
- copy() Strings[source]¶
Return a deep copy of the Strings object.
- Returns:
A deep copy of the Strings.
- Return type:
- decode(fromEncoding: str, toEncoding: str = 'UTF-8') Strings[source]¶
Return a new Strings object in toEncoding, expecting that the current Strings is encoded in fromEncoding.
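- Parameters:
fromEncoding (str) – The encoding the current Strings is expected to be in
toEncoding (str, default='UTF-8') – The encoding of the returned Strings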
- Returns:
A new Strings object in toEncoding
- Return type:
- Raises:
RuntimeError – Raised if there is a server-side error thrown
- property dtype: numpy.dtype¶
Return the dtype object of the underlying data.
- encode(toEncoding: str, fromEncoding: str = 'UTF-8') Strings[source]¶
Return a new strings object in toEncoding, expecting that the current Strings is encoded in fromEncoding.
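- Parameters:
toEncoding (str) – The encoding of the returned Strings
fromEncoding (str, default='UTF-8') – The encoding the current Strings is expected to be in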
- Returns:
A new Strings object in toEncoding
- Return type:
- Raises:
RuntimeError – Raised if there is a server-side error thrown
- endswith(substr: bytes | arkouda.numpy.dtypes.str_scalars, regex: bool = False) arkouda.numpy.pdarrayclass.pdarray[source]¶
Check whether each element ends with the given substring.
- Parameters:
substr (bytes or str_scalars) – The suffix to search for
regex (bool, default=False) – Indicates whether substr is a regular expression. Note: only regular expressions supported by re2 are handled (lookaheads and lookbehinds are not supported).
- Returns:
True for elements that end with substr, False otherwise
- Return type:
- Raises:
TypeError – Raised if the substr parameter is not bytes or str_scalars
ValueError – Raised if substr is not a valid regex
RuntimeError – Raised if there is a server-side error thrown
See also
Examples
>>> import arkouda as ak
>>> strings_start = ak.array([f'{i} string' for i in range(1,6)])
>>> strings_start
array(['1 string', '2 string', '3 string', '4 string', '5 string'])
>>> strings_start.endswith('ing')
array([True True True True True])
>>> strings_end = ak.array([f'string {i}' for i in range(1, 6)])
>>> strings_end
array(['string 1', 'string 2', 'string 3', 'string 4', 'string 5'])
>>> strings_end.endswith('ing \\d', regex = True)
array([True True True True True])
- equals(other) arkouda.numpy.dtypes.bool_scalars[source]¶
Whether Strings are the same size and all entries are equal.
- Parameters:
other (Any) – object to compare.
- Returns:
True if the Strings are the same, otherwise False.
- Return type:
Examples
>>> import arkouda as ak
>>> s = ak.array(["a", "b", "c"])
>>> s_cpy = ak.array(["a", "b", "c"])
>>> s.equals(s_cpy)
np.True_
>>> s2 = ak.array(["a", "x", "c"])
>>> s.equals(s2)
np.False_
- find_locations(pattern: bytes | arkouda.numpy.dtypes.str_scalars) Tuple[arkouda.numpy.pdarrayclass.pdarray, arkouda.numpy.pdarrayclass.pdarray, arkouda.numpy.pdarrayclass.pdarray][source]¶
Find pattern matches and return pdarrays containing the number, start positions, and lengths of matches.
- Parameters:
pattern (bytes or str_scalars) – The regex pattern used to find matches
- Returns:
- pdarray, int64
For each original string, the number of pattern matches
- pdarray, int64
The start positions of pattern matches
- pdarray, int64
The lengths of pattern matches
- Return type:
- Raises:
TypeError – Raised if the pattern parameter is not bytes or str_scalars
ValueError – Raised if pattern is not a valid regex
RuntimeError – Raised if there is a server-side error thrown
See also
Examples
>>> import arkouda as ak
>>> strings = ak.array([f'{i} string {i}' for i in range(1, 6)])
>>> num_matches, starts, lens = strings.find_locations('\\d')
>>> num_matches
array([2 2 2 2 2])
>>> starts
array([0 9 0 9 0 9 0 9 0 9])
>>> lens
array([1 1 1 1 1 1 1 1 1 1])
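The relationship between the three returned arrays may be easier to see in a local model. The sketch below uses Python's `re` in place of re2 and a hypothetical function name, `find_locations_local`; it counts positions in characters, which matches byte positions only for ASCII input.

```python
import re


def find_locations_local(strings, pattern):
    """Hypothetical local model returning (num_matches, starts, lens)."""
    compiled = re.compile(pattern)
    num_matches, starts, lens = [], [], []
    for s in strings:
        matches = list(compiled.finditer(s))
        num_matches.append(len(matches))                   # one entry per string
        starts.extend(m.start() for m in matches)          # one entry per match
        lens.extend(m.end() - m.start() for m in matches)  # one entry per match
    return num_matches, starts, lens


strings = [f"{i} string {i}" for i in range(1, 6)]
num_matches, starts, lens = find_locations_local(strings, r"\d")
print(num_matches)  # [2, 2, 2, 2, 2]
print(starts)       # [0, 9, 0, 9, 0, 9, 0, 9, 0, 9]
print(lens)         # [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
```

Note that num_matches has one entry per input string, while starts and lens are flat arrays with one entry per match, so sum(num_matches) equals len(starts).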
- findall(pattern: bytes | arkouda.numpy.dtypes.str_scalars, return_match_origins: bool = False) Strings | Tuple[source]¶
Return a new Strings containing all non-overlapping matches of pattern.
- Parameters:
pattern (bytes or str_scalars) – Regex used to find matches
return_match_origins (bool, default=False) – If True, return a pdarray containing the index of the original string each pattern match is from
- Returns:
- Strings
Strings object containing only pattern matches
- pdarray, int64 (optional)
The index of the original string each pattern match is from
- Return type:
- Raises:
TypeError – Raised if the pattern parameter is not bytes or str_scalars
ValueError – Raised if pattern is not a valid regex
RuntimeError – Raised if there is a server-side error thrown
See also
Examples
>>> import arkouda as ak
>>> strings = ak.array(['1_2___', '____', '3', '__4___5____6___7', ''])
>>> strings.findall('_+', return_match_origins=True)
(array(['_', '___', '____', '__', '___', '____', '___']), array([0 0 1 3 3 3 3]))
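A minimal local sketch of how the flat match array and the origin indices line up, assuming Python's `re` as a stand-in for re2. The name `findall_local` is hypothetical.

```python
import re


def findall_local(strings, pattern, return_match_origins=False):
    """Hypothetical local model of findall on a plain list of str."""
    compiled = re.compile(pattern)
    matches, origins = [], []
    for i, s in enumerate(strings):
        for m in compiled.finditer(s):
            matches.append(m.group(0))  # the matched text
            origins.append(i)           # index of the string it came from
    return (matches, origins) if return_match_origins else matches


strings = ["1_2___", "____", "3", "__4___5____6___7", ""]
matches, origins = findall_local(strings, "_+", return_match_origins=True)
print(matches)  # ['_', '___', '____', '__', '___', '____', '___']
print(origins)  # [0, 0, 1, 3, 3, 3, 3]
```

Strings with no match (indices 2 and 4 here) simply never appear in the origins array.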
- flatten() Strings[source]¶
Return a copy of the array collapsed into one dimension.
- Returns:
A copy of the input array, flattened to one dimension.
Note
As multidimensional Strings are not currently supported, flatten on a Strings object will always return itself.
- static from_parts(offset_attrib: arkouda.numpy.pdarrayclass.pdarray | str, bytes_attrib: arkouda.numpy.pdarrayclass.pdarray | str) Strings[source]¶
Assemble a Strings object from separate offset and bytes arrays.
This factory method constructs a segmented Strings array by sending two separate components—offsets and values—to the Arkouda server and instructing it to assemble them into a single Strings object. Use this when offsets and byte data are created or transported independently.
- Parameters:
offset_attrib (pdarray or str) – The array of starting positions for each string, or a string expression that can be passed to create_pdarray to build it.
bytes_attrib (pdarray or str) – The array of raw byte values (e.g., uint8 character codes), or a string expression that can be passed to create_pdarray to build it.
- Returns:
A Strings object representing the assembled segmented strings array on the Arkouda server.
- Return type:
- Raises:
RuntimeError – If conversion of offset_attrib or bytes_attrib to pdarray fails, or if the server is unable to assemble the parts into a Strings.
Notes
Both inputs can be existing pdarray instances or arguments suitable for create_pdarray.
Internally uses the CMD_ASSEMBLE command to merge offsets and values.
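To make the roles of the two parts concrete, the sketch below reassembles Python strings from an offsets list and a bytes list entirely on the client. It is not what the server does internally; the name `assemble_locally` is hypothetical, and it assumes UTF-8 data in which every string is followed by a null byte.

```python
def assemble_locally(offsets, raw_bytes):
    """Hypothetical helper: rebuild a list of str from offsets and bytes."""
    data = bytes(raw_bytes)
    # Each string ends where the next one starts; the last ends at the end of data.
    ends = list(offsets[1:]) + [len(data)]
    # end - 1 drops the null byte that closes each string.
    return [data[start:end - 1].decode("utf-8") for start, end in zip(offsets, ends)]


offsets = [0, 4, 8]
raw_bytes = [111, 110, 101, 0, 116, 119, 111, 0, 116, 104, 114, 101, 101, 0]
print(assemble_locally(offsets, raw_bytes))  # ['one', 'two', 'three']
```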
- static from_return_msg(rep_msg: str) Strings[source]¶
Create a Strings object from an Arkouda server response message.
Parse the server’s response descriptor and construct a Strings array with its underlying pdarray and total byte size.
- Parameters:
rep_msg (str) – Server response message of the form:
created <name> <type> <size> <ndim> <shape> <itemsize>+... bytes.size <total_bytes>
For example:
"created foo Strings 3 1 (3,) 8+created bytes.size 24"
- Returns:
A Strings object representing the segmented strings array on the server, initialized with the returned pdarray and byte-size metadata.
- Return type:
- Raises:
RuntimeError – If the response message cannot be parsed or does not match the expected format.
Examples
>>> import arkouda as ak
# Example response message (typically from generic_msg)
>>> rep_msg = "created foo Strings 3 1 (3,) 8+created bytes.size 24"
>>> s = ak.Strings.from_return_msg(rep_msg)
>>> isinstance(s, ak.Strings)
True
- fullmatch(pattern: bytes | arkouda.numpy.dtypes.str_scalars) arkouda.pandas.match.Match[source]¶
Return a match object where elements match only if the whole string matches the regular expression pattern.
- Parameters:
pattern (bytes or str_scalars) – Regex used to find matches
- Returns:
Match object where elements match only if the whole string matches the regular expression pattern
- Return type:
Examples
>>> import arkouda as ak
>>> strings = ak.array(['1_2___', '____', '3', '__4___5____6___7', ''])
>>> strings.fullmatch('_+')
<ak.Match object: matched=False; matched=True, span=(0, 4); matched=False; matched=False; matched=False>
- get_bytes() arkouda.numpy.pdarrayclass.pdarray[source]¶
Getter for the bytes component (uint8 pdarray) of this Strings.
- Returns:
Pdarray of bytes of the string accessed
- Return type:
Example
>>> import arkouda as ak
>>> x = ak.array(['one', 'two', 'three'])
>>> x.get_bytes()
array([111 110 101 0 116 119 111 0 116 104 114 101 101 0])
- get_lengths() arkouda.numpy.pdarrayclass.pdarray[source]¶
Return the length of each string in the array.
- Returns:
The length of each string
- Return type:
- Raises:
RuntimeError – Raised if there is a server-side error thrown
- get_offsets() arkouda.numpy.pdarrayclass.pdarray[source]¶
Getter for the offsets component (int64 pdarray) of this Strings.
- Returns:
Pdarray of offsets of the string accessed
- Return type:
Example
>>> import arkouda as ak
>>> x = ak.array(['one', 'two', 'three'])
>>> x.get_offsets()
array([0 4 8])
- get_prefixes(n: arkouda.numpy.dtypes.int_scalars, return_origins: bool = True, proper: bool = True) Strings | Tuple[Strings, arkouda.numpy.pdarrayclass.pdarray][source]¶
Return the n-long prefix of each string, where possible.
- Parameters:
n (int_scalars) – Length of prefix
return_origins (bool, default=True) – If True, return a logical index indicating which strings were long enough to return an n-prefix
proper (bool, default=True) – If True, only return proper prefixes, i.e. from strings that are at least n+1 long. If False, allow the entire string to be returned as a prefix.
- Returns:
- prefixes: Strings
The array of n-character prefixes; the number of elements is the number of True values in the returned mask.
- origin_indices: pdarray, bool
Boolean array that is True where the string was long enough to return an n-character prefix, False otherwise.
- Return type:
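The interaction between n, proper, and the returned mask can be sketched on a plain list. The function name `get_prefixes_local` is hypothetical, and lengths here are counted in Python characters, which is an assumption; the server may count bytes.

```python
def get_prefixes_local(strings, n, return_origins=True, proper=True):
    """Hypothetical local model of get_prefixes on a plain list of str."""
    # A proper prefix must be shorter than the string it comes from.
    min_len = n + 1 if proper else n
    mask = [len(s) >= min_len for s in strings]
    prefixes = [s[:n] for s, keep in zip(strings, mask) if keep]
    return (prefixes, mask) if return_origins else prefixes


strings = ["ab", "abc", "abcd"]
proper_prefixes, proper_mask = get_prefixes_local(strings, 3)
any_prefixes, any_mask = get_prefixes_local(strings, 3, proper=False)
print(proper_prefixes, proper_mask)  # ['abc'] [False, False, True]
print(any_prefixes, any_mask)        # ['abc', 'abc'] [False, True, True]
```

The prefixes array is shorter than the input whenever some strings are too short, which is why the boolean mask is needed to map results back to their origins. The suffix variant would differ only in slicing from the end of each string.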
- get_suffixes(n: arkouda.numpy.dtypes.int_scalars, return_origins: bool = True, proper: bool = True) Strings | Tuple[Strings, arkouda.numpy.pdarrayclass.pdarray][source]¶
Return the n-long suffix of each string, where possible.
- Parameters:
n (int_scalars) – Length of suffix
return_origins (bool, default=True) – If True, return a logical index indicating which strings were long enough to return an n-suffix
proper (bool, default=True) – If True, only return proper suffixes, i.e. from strings that are at least n+1 long. If False, allow the entire string to be returned as a suffix.
- Returns:
- suffixes: Strings
The array of n-character suffixes; the number of elements is the number of True values in the returned mask.
- origin_indices: pdarray, bool
Boolean array that is True where the string was long enough to return an n-character suffix, False otherwise.
- Return type:
- group() arkouda.numpy.pdarrayclass.pdarray[source]¶
Return the permutation that groups the array, placing equivalent strings together. All instances of the same string are guaranteed to lie in one contiguous block of the permuted array, but the blocks are not necessarily ordered.
- Returns:
The permutation that groups the array by value
- Return type:
See also
GroupBy, unique
Notes
If the arkouda server is compiled with “-sSegmentedString.useHash=true”, then arkouda uses 128-bit hash values to group strings, rather than sorting the strings directly. This method is fast, but the resulting permutation merely groups equivalent strings and does not sort them. If the “useHash” parameter is false, then a full sort is performed.
- Raises:
RuntimeError – Raised if there is a server-side error in executing group request or creating the pdarray encapsulating the return message
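The difference between a grouping permutation and a sorting permutation can be illustrated with a small local sketch. It buckets equal strings with a Python dict, which is only loosely analogous to the hash-based server path; the name `group_local` is hypothetical.

```python
def group_local(strings):
    """Hypothetical local model: a permutation that places equal strings together."""
    buckets = {}
    for i, s in enumerate(strings):
        buckets.setdefault(s, []).append(i)
    # Blocks appear in first-seen order, not in sorted order.
    return [i for indices in buckets.values() for i in indices]


strings = ["b", "a", "b", "c", "a"]
perm = group_local(strings)
grouped = [strings[i] for i in perm]
print(perm)     # [0, 2, 1, 4, 3]
print(grouped)  # ['b', 'b', 'a', 'a', 'c']
```

Every block is contiguous, but the blocks themselves ('b', 'a', 'c') are not in lexicographic order, which is the behavior the Notes describe for the hash-based mode.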
- hash() Tuple[arkouda.numpy.pdarrayclass.pdarray, arkouda.numpy.pdarrayclass.pdarray][source]¶
Compute a 128-bit hash of each string.
- Returns:
A tuple of two int64 pdarrays. The ith hash value is the concatenation of the ith values from each array.
- Return type:
Notes
The implementation uses SipHash128, a fast and balanced hash function (used by Python for dictionaries and sets). For realistic numbers of strings (up to about 10**15), the probability of a collision between two 128-bit hash values is negligible.
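The "negligible" collision claim can be checked with the standard birthday approximation, and the two-array return value can be illustrated by splitting a 128-bit digest into two 64-bit halves. In this sketch, BLAKE2b from `hashlib` is only a stand-in chosen because it is available in the standard library; it is not SipHash128, and the helper names are hypothetical.

```python
import hashlib


def hash128_local(s):
    """Hypothetical helper: a 128-bit digest split into two 64-bit halves.

    BLAKE2b is a stand-in here, not the SipHash128 function used by the server.
    """
    digest = hashlib.blake2b(s.encode("utf-8"), digest_size=16).digest()
    hi = int.from_bytes(digest[:8], "big")
    lo = int.from_bytes(digest[8:], "big")
    return hi, lo


def collision_probability(n, bits=128):
    """Birthday approximation p ~ n(n-1) / 2^(bits+1), valid while p is small."""
    return n * (n - 1) / 2 ** (bits + 1)


hi, lo = hash128_local("one")
combined = (hi << 64) | lo  # the full 128-bit value from the two halves
p = collision_probability(10**15)
print(f"{p:.2e}")  # on the order of 1e-9
```

Even at 10**15 strings, the estimated chance of any collision among 128-bit hashes is roughly one in a billion.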
- info() str[source]¶
Return a JSON formatted string containing information about all components of self.
- Returns:
JSON string containing information about all components of self
- Return type:
- is_registered() numpy.bool_[source]¶
Return True iff the object is contained in the registry.
- Returns:
Indicates if the object is contained in the registry
- Return type:
- Raises:
RuntimeError – Raised if there’s a server-side error thrown
- isalnum() arkouda.numpy.pdarrayclass.pdarray[source]¶
Return a boolean pdarray where index i indicates whether string i of the Strings is alphanumeric.
- Returns:
True for elements that are alphanumeric, False otherwise
- Return type:
- Raises:
RuntimeError – Raised if there is a server-side error thrown
See also
Examples
>>> import arkouda as ak
>>> not_alnum = ak.array([f'%Strings {i}' for i in range(3)])
>>> alnum = ak.array([f'Strings{i}' for i in range(3)])
>>> strings = ak.concatenate([not_alnum, alnum])
>>> strings
array(['%Strings 0', '%Strings 1', '%Strings 2', 'Strings0', 'Strings1', 'Strings2'])
>>> strings.isalnum()
array([False False False True True True])
- isalpha() arkouda.numpy.pdarrayclass.pdarray[source]¶
Return a boolean pdarray where index i indicates whether string i of the Strings is alphabetic. This means there is at least one character, and all the characters are alphabetic.
- Returns:
True for elements that are alphabetic, False otherwise
- Return type:
- Raises:
RuntimeError – Raised if there is a server-side error thrown
See also
Strings.islower, Strings.isupper, Strings.istitle, Strings.isalnum
Examples
>>> import arkouda as ak
>>> not_alpha = ak.array([f'%Strings {i}' for i in range(3)])
>>> alpha = ak.array(['StringA','StringB','StringC'])
>>> strings = ak.concatenate([not_alpha, alpha])
>>> strings
array(['%Strings 0', '%Strings 1', '%Strings 2', 'StringA', 'StringB', 'StringC'])
>>> strings.isalpha()
array([False False False True True True])
- isdecimal() arkouda.numpy.pdarrayclass.pdarray[source]¶
Return a boolean pdarray where index i indicates whether string i of the Strings has all decimal characters.
- Returns:
True for elements that are decimals, False otherwise
- Return type:
- Raises:
RuntimeError – Raised if there is a server-side error thrown
See also
Examples
>>> import arkouda as ak
>>> not_decimal = ak.array([f'Strings {i}' for i in range(3)])
>>> decimal = ak.array([f'12{i}' for i in range(3)])
>>> strings = ak.concatenate([not_decimal, decimal])
>>> strings
array(['Strings 0', 'Strings 1', 'Strings 2', '120', '121', '122'])
>>> strings.isdecimal()
array([False False False True True True])
Special Character Examples
>>> special_strings = ak.array(["3.14", "0", "²", "2³₇", "2³x₇"])
>>> special_strings
array(['3.14', '0', '²', '2³₇', '2³x₇'])
>>> special_strings.isdecimal()
array([False True False False False])
- isdigit() arkouda.numpy.pdarrayclass.pdarray[source]¶
Return a boolean pdarray where index i indicates whether string i of the Strings has all digit characters.
- Returns:
True for elements that are digits, False otherwise
- Return type:
- Raises:
RuntimeError – Raised if there is a server-side error thrown
See also
Examples
>>> import arkouda as ak
>>> not_digit = ak.array([f'Strings {i}' for i in range(3)])
>>> digit = ak.array([f'12{i}' for i in range(3)])
>>> strings = ak.concatenate([not_digit, digit])
>>> strings
array(['Strings 0', 'Strings 1', 'Strings 2', '120', '121', '122'])
>>> strings.isdigit()
array([False False False True True True])
Special Character Examples
>>> special_strings = ak.array(["3.14", "0", "²", "2³₇", "2³x₇"])
>>> special_strings
array(['3.14', '0', '²', '2³₇', '2³x₇'])
>>> special_strings.isdigit()
array([False True True True False])
- isempty() arkouda.numpy.pdarrayclass.pdarray[source]¶
Return a boolean pdarray where index i indicates whether string i of the Strings is empty.
- Returns:
True for elements that are the empty string, False otherwise
- Return type:
- Raises:
RuntimeError – Raised if there is a server-side error thrown
See also
Examples
>>> import arkouda as ak
>>> not_empty = ak.array([f'Strings {i}' for i in range(3)])
>>> empty = ak.array(['' for i in range(3)])
>>> strings = ak.concatenate([not_empty, empty])
>>> strings
array(['Strings 0', 'Strings 1', 'Strings 2', '', '', ''])
>>> strings.isempty()
array([False False False True True True])
- islower() arkouda.numpy.pdarrayclass.pdarray[source]¶
Return a boolean pdarray where index i indicates whether string i of the Strings is entirely lowercase.
- Returns:
True for elements that are entirely lowercase, False otherwise
- Return type:
- Raises:
RuntimeError – Raised if there is a server-side error thrown
See also
Examples
>>> import arkouda as ak
>>> lower = ak.array([f'strings {i}' for i in range(3)])
>>> upper = ak.array([f'STRINGS {i}' for i in range(3)])
>>> strings = ak.concatenate([lower, upper])
>>> strings
array(['strings 0', 'strings 1', 'strings 2', 'STRINGS 0', 'STRINGS 1', 'STRINGS 2'])
>>> strings.islower()
array([True True True False False False])
- isnumeric() arkouda.numpy.pdarrayclass.pdarray[source]¶
Return a boolean pdarray where index i indicates whether string i of the Strings has all numeric characters. There are 1922 unicode characters that qualify as numeric, including the digits 0 through 9, superscripts and subscripted digits, special characters with the digits encircled or enclosed in parens, “vulgar fractions,” and more.
- Returns:
True for elements that are numerics, False otherwise
- Return type:
- Raises:
RuntimeError – Raised if there is a server-side error thrown
See also
Examples
>>> import arkouda as ak
>>> not_numeric = ak.array([f'Strings {i}' for i in range(3)])
>>> numeric = ak.array([f'12{i}' for i in range(3)])
>>> strings = ak.concatenate([not_numeric, numeric])
>>> strings
array(['Strings 0', 'Strings 1', 'Strings 2', '120', '121', '122'])
>>> strings.isnumeric()
array([False False False True True True])
Special Character Examples
>>> special_strings = ak.array(["3.14", "0", "²", "2³₇", "2³x₇"])
>>> special_strings
array(['3.14', '0', '²', '2³₇', '2³x₇'])
>>> special_strings.isnumeric()
array([False True True True False])
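The distinction between isdecimal, isdigit, and isnumeric is easy to lose across three separate entries. The sketch below puts them side by side using Python's built-in `str` methods, on the assumption (consistent with the example outputs on this page) that the arkouda methods follow the same Unicode categories. The extra "½" case is added here for illustration.

```python
special = ["3.14", "0", "²", "2³₇", "2³x₇"]

# Python's built-in str predicates, applied element by element.
decimal = [s.isdecimal() for s in special]
digit = [s.isdigit() for s in special]
numeric = [s.isnumeric() for s in special]

print(decimal)  # [False, True, False, False, False]
print(digit)    # [False, True, True, True, False]
print(numeric)  # [False, True, True, True, False]

# A vulgar fraction is numeric but neither a digit nor a decimal.
half = "½"
print(half.isdecimal(), half.isdigit(), half.isnumeric())  # False False True
```

Each category contains the previous one: decimal characters are a subset of digits, and digits are a subset of numeric characters.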
- isspace() arkouda.numpy.pdarrayclass.pdarray[source]¶
Return a boolean pdarray where index i indicates whether string i has all whitespace characters (' ', '\t', '\n', '\v', '\f', '\r').
- Returns:
True for elements that are whitespace, False otherwise
- Return type:
pdarray
- Raises:
RuntimeError – Raised if there is a server-side error thrown
See also
Strings.islower, Strings.isupper, Strings.istitle
Examples
>>> import arkouda as ak
>>> not_space = ak.array([f'Strings {i}' for i in range(3)])
>>> space = ak.array([' ', '\t', '\n', '\v', '\f', '\r', ' \t\n\v\f\r'])
>>> strings = ak.concatenate([not_space, space])
>>> strings
array(['Strings 0', 'Strings 1', 'Strings 2', ' ', 'u0009', 'n', 'u000B', 'u000C', 'u000D', ' u0009nu000Bu000Cu000D'])
>>> strings.isspace()
array([False False False True True True True True True True])
- istitle() arkouda.numpy.pdarrayclass.pdarray[source]¶
Return a boolean pdarray where index i indicates whether string i of the Strings is titlecase.
- Returns:
True for elements that are titlecase, False otherwise
- Return type:
- Raises:
RuntimeError – Raised if there is a server-side error thrown
See also
Examples
>>> import arkouda as ak
>>> mixed = ak.array([f'sTrINgs {i}' for i in range(3)])
>>> title = ak.array([f'Strings {i}' for i in range(3)])
>>> strings = ak.concatenate([mixed, title])
>>> strings
array(['sTrINgs 0', 'sTrINgs 1', 'sTrINgs 2', 'Strings 0', 'Strings 1', 'Strings 2'])
>>> strings.istitle()
array([False False False True True True])
- isupper() arkouda.numpy.pdarrayclass.pdarray[source]¶
Return a boolean pdarray where index i indicates whether string i of the Strings is entirely uppercase.
- Returns:
True for elements that are entirely uppercase, False otherwise
- Return type:
- Raises:
RuntimeError – Raised if there is a server-side error thrown
See also
Examples
>>> import arkouda as ak
>>> lower = ak.array([f'strings {i}' for i in range(3)])
>>> upper = ak.array([f'STRINGS {i}' for i in range(3)])
>>> strings = ak.concatenate([lower, upper])
>>> strings
array(['strings 0', 'strings 1', 'strings 2', 'STRINGS 0', 'STRINGS 1', 'STRINGS 2'])
>>> strings.isupper()
array([False False False True True True])
- logger: arkouda.core.logger.ArkoudaLogger¶
- lower() Strings[source]¶
Return a new Strings with all uppercase characters from the original replaced with their lowercase equivalent.
- Returns:
Strings with all uppercase characters from the original replaced with their lowercase equivalent
- Return type:
- Raises:
RuntimeError – Raised if there is a server-side error thrown
See also
Examples
>>> import arkouda as ak
>>> strings = ak.array([f'StrINgS {i}' for i in range(5)])
>>> strings
array(['StrINgS 0', 'StrINgS 1', 'StrINgS 2', 'StrINgS 3', 'StrINgS 4'])
>>> strings.lower()
array(['strings 0', 'strings 1', 'strings 2', 'strings 3', 'strings 4'])
- lstick(other: Strings, delimiter: bytes | arkouda.numpy.dtypes.str_scalars = '') Strings[source]¶
Join the strings from another array onto the left of the strings of this array, optionally inserting a delimiter. Warning: This function is experimental and not guaranteed to work.
- Parameters:
other (Strings) – The strings to join onto self’s strings
delimiter (bytes or str_scalars, default="") – String inserted between self and other
- Returns:
The array of joined strings, as other + self
- Return type:
- Raises:
TypeError – Raised if the delimiter parameter is neither bytes nor a str or if the other parameter is not a Strings instance
RuntimeError – Raised if there is a server-side error thrown
Examples
>>> import arkouda as ak
>>> s = ak.array(['a', 'c', 'e'])
>>> t = ak.array(['b', 'd', 'f'])
>>> s.lstick(t, delimiter='.')
array(['b.a', 'd.c', 'f.e'])
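The argument order of lstick is easy to get backwards, so a plain-Python model of the element-wise join may help. The name `lstick_local` is hypothetical, and the length check is an assumption of this sketch rather than documented server behavior.

```python
def lstick_local(self_strings, other_strings, delimiter=""):
    """Hypothetical local model: join other onto the LEFT of self, element-wise."""
    if len(self_strings) != len(other_strings):
        # Assumed here: both arrays must have the same number of strings.
        raise ValueError("both arrays must have the same number of strings")
    return [o + delimiter + s for s, o in zip(self_strings, other_strings)]


s = ["a", "c", "e"]
t = ["b", "d", "f"]
joined = lstick_local(s, t, delimiter=".")
print(joined)  # ['b.a', 'd.c', 'f.e']
```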
- match(pattern: bytes | arkouda.numpy.dtypes.str_scalars) arkouda.pandas.match.Match[source]¶
Return a match object where elements match only if the beginning of the string matches the regular expression pattern.
- Parameters:
pattern (bytes or str_scalars) – Regex used to find matches
- Returns:
Match object where elements match only if the beginning of the string matches the regular expression pattern
- Return type:
Examples
>>> import arkouda as ak
>>> strings = ak.array(['1_2___', '____', '3', '__4___5____6___7', ''])
>>> strings.match('_+')
<ak.Match object: matched=False; matched=True, span=(0, 4); matched=False; matched=True, span=(0, 2); matched=False>
- objType = 'Strings'¶
- peel(delimiter: bytes | arkouda.numpy.dtypes.str_scalars, times: arkouda.numpy.dtypes.int_scalars = 1, includeDelimiter: bool = False, keepPartial: bool = False, fromRight: bool = False, regex: bool = False) Tuple[Strings, Strings][source]¶
Peel off one or more delimited fields from each string (similar to string.partition), returning two new arrays of strings. Warning: This function is experimental and not guaranteed to work.
- Parameters:
delimiter (bytes or str_scalars) – The separator where the split will occur
times (int_scalars, default=1) – The number of times the delimiter is sought, i.e. skip over the first (times-1) delimiters
includeDelimiter (bool, default=False) – If true, append the delimiter to the end of the first return array. By default, it is prepended to the beginning of the second return array.
keepPartial (bool, default=False) – If true, a string that does not contain <times> instances of the delimiter will be returned in the first array. By default, such strings are returned in the second array.
fromRight (bool, default=False) – If true, peel from the right instead of the left (see also rpeel)
regex (bool, default=False) – Indicates whether delimiter is a regular expression. Note: only regular expressions supported by re2 are handled (lookaheads and lookbehinds are not supported).
- Returns:
- left: Strings
The field(s) peeled from the start of each string (unless fromRight is true)
- right: Strings
The remainder of each string after peeling (unless fromRight is true)
- Return type:
- Raises:
TypeError – Raised if the delimiter parameter is not bytes or str_scalars, if times is not int64, or if includeDelimiter, keepPartial, or fromRight is not bool
ValueError – Raised if times is < 1 or if delimiter is not a valid regex
RuntimeError – Raised if there is a server-side error thrown
Examples
>>> import arkouda as ak
>>> s = ak.array(['a.b', 'c.d', 'e.f.g'])
>>> s.peel('.')
(array(['a', 'c', 'e']), array(['b', 'd', 'f.g']))
>>> s.peel('.', includeDelimiter=True)
(array(['a.', 'c.', 'e.']), array(['b', 'd', 'f.g']))
>>> s.peel('.', times=2)
(array(['', '', 'e.f']), array(['a.b', 'c.d', 'g']))
>>> s.peel('.', times=2, keepPartial=True)
(array(['a.b', 'c.d', 'e.f']), array(['', '', 'g']))
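How times, includeDelimiter, and keepPartial interact can be modeled per string in plain Python. The sketch below reproduces the outputs of the examples above for a literal (non-regex) delimiter peeled from the left; the name `peel_one` is hypothetical and this is not the server implementation.

```python
def peel_one(s, delimiter, times=1, include_delimiter=False, keep_partial=False):
    """Hypothetical local model of a left peel on a single string."""
    if times < 1:
        raise ValueError("times must be >= 1")
    pos, start = -1, 0
    for _ in range(times):
        pos = s.find(delimiter, start)
        if pos == -1:
            # Fewer than `times` delimiters: the whole string goes to one side.
            return (s, "") if keep_partial else ("", s)
        start = pos + len(delimiter)
    left_end = start if include_delimiter else pos
    return s[:left_end], s[start:]


s = ["a.b", "c.d", "e.f.g"]
once = [peel_one(x, ".") for x in s]
with_delim = [peel_one(x, ".", include_delimiter=True) for x in s]
twice = [peel_one(x, ".", times=2) for x in s]
twice_partial = [peel_one(x, ".", times=2, keep_partial=True) for x in s]
print(once)           # [('a', 'b'), ('c', 'd'), ('e', 'f.g')]
print(with_delim)     # [('a.', 'b'), ('c.', 'd'), ('e.', 'f.g')]
print(twice)          # [('', 'a.b'), ('', 'c.d'), ('e.f', 'g')]
print(twice_partial)  # [('a.b', ''), ('c.d', ''), ('e.f', 'g')]
```

The server returns two arrays rather than a list of pairs; taking the first and second elements of each pair gives the corresponding left and right arrays.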
- pretty_print_info() None[source]¶
Print information about all components of self in a human readable format.
- regex_split(pattern: bytes | arkouda.numpy.dtypes.str_scalars, maxsplit: int = 0, return_segments: bool = False) Strings | Tuple[source]¶
Return a new Strings split by the occurrences of pattern.
If maxsplit is nonzero, at most maxsplit splits occur.
- Parameters:
pattern (bytes or str_scalars) – Regex used to split strings into substrings
maxsplit (int, default=0) – The maximum number of pattern match occurrences in each element to split on. The default maxsplit=0 splits on all occurrences.
return_segments (bool, default=False) – If True, return mapping of original strings to first substring in return array.
- Returns:
- Strings
Substrings with pattern matches removed
- pdarray, int64 (optional)
For each original string, the index of first corresponding substring in the return array
- Return type:
Examples
>>> import arkouda as ak
>>> strings = ak.array(['1_2___', '____', '3', '__4___5____6___7', ''])
>>> strings.regex_split('_+', maxsplit=2, return_segments=True)
(array(['1', '2', '', '', '', '3', '', '4', '5____6___7', '']), array([0 3 5 6 9]))
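The segments array is the piece that lets the flat result be mapped back to the original strings. The sketch below models this with Python's `re.split`, which stands in for re2 and is assumed to agree with it for simple patterns without capture groups; the name `regex_split_local` is hypothetical.

```python
import re


def regex_split_local(strings, pattern, maxsplit=0, return_segments=False):
    """Hypothetical local model of regex_split on a plain list of str."""
    compiled = re.compile(pattern)
    flat, segments = [], []
    for s in strings:
        segments.append(len(flat))  # index of this string's first substring
        flat.extend(compiled.split(s, maxsplit=maxsplit))
    return (flat, segments) if return_segments else flat


strings = ["1_2___", "____", "3", "__4___5____6___7", ""]
flat, segments = regex_split_local(strings, "_+", maxsplit=2, return_segments=True)
print(flat)      # ['1', '2', '', '', '', '3', '', '4', '5____6___7', '']
print(segments)  # [0, 3, 5, 6, 9]
```

The substrings of original string i are flat[segments[i]:segments[i + 1]], with the last string running to the end of flat.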
- register(user_defined_name: str) Strings[source]¶
Register this Strings object with a user defined name in the arkouda server so it can be attached to later using Strings.attach().
This is an in-place operation. Registering a Strings object more than once updates the name in the registry and removes the previously registered name. A name can only be registered to one object at a time.
- Parameters:
user_defined_name (str) – user defined name which the Strings object is to be registered under
- Returns:
The same Strings object, which is now registered with the arkouda server and has an updated name. This is an in-place modification; the original is returned to support a fluent programming style. Note that two different objects cannot be registered under the same name.
- Return type:
- Raises:
TypeError – Raised if user_defined_name is not a str
RegistrationError – If the server was unable to register the Strings object with the user_defined_name. If the user is attempting to register more than one object with the same name, the former should be unregistered first to free up the registration name.
See also
attach, unregister
Notes
Registered names/Strings objects in the server are immune to deletion until they are unregistered.
- rpeel(delimiter: bytes | arkouda.numpy.dtypes.str_scalars, times: arkouda.numpy.dtypes.int_scalars = 1, includeDelimiter: bool = False, keepPartial: bool = False, regex: bool = False) Tuple[Strings, Strings][source]¶
Peel off one or more delimited fields from the end of each string (similar to string.rpartition), returning two new arrays of strings. Warning: This function is experimental and not guaranteed to work.
- Parameters:
delimiter (bytes or str_scalars) – The separator where the split will occur
times (int_scalars, default=1) – The number of times the delimiter is sought, i.e. skip over the last (times-1) delimiters
includeDelimiter (bool, default=False) – If true, prepend the delimiter to the start of the first return array. By default, it is appended to the end of the second return array.
keepPartial (bool, default=False) – If true, a string that does not contain <times> instances of the delimiter will be returned in the second array. By default, such strings are returned in the first array.
regex (bool, default=False) – Indicates whether delimiter is a regular expression. Note: only handles regular expressions supported by re2 (does not support lookaheads/lookbehinds).
- Returns:
- left: Strings
The remainder of the string after peeling
- right: Strings
The field(s) that were peeled from the right of each string
- Return type:
Tuple[Strings, Strings]
- Raises:
TypeError – Raised if the delimiter parameter is not bytes or str_scalars or if times is not int64
ValueError – Raised if times is < 1 or if delimiter is not a valid regex
RuntimeError – Raised if there is a server-side error thrown
Examples
>>> import arkouda as ak
>>> s = ak.array(['a.b', 'c.d', 'e.f.g'])
>>> s.rpeel('.')
(array(['a', 'c', 'e.f']), array(['b', 'd', 'g']))
Compared against peel
>>> s.peel('.')
(array(['a', 'c', 'e']), array(['b', 'd', 'f.g']))
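To make the peeling semantics concrete, here is a plain-Python sketch of what rpeel does to a single string with the default flags (includeDelimiter=False, keepPartial=False, regex=False). The helper name rpeel_one is hypothetical and not part of arkouda; the real operation runs on the server over the whole array. Returning an empty right-hand field when too few delimiters are found is an assumption based on the keepPartial description.

```python
from typing import Tuple


def rpeel_one(s: str, delimiter: str, times: int = 1) -> Tuple[str, str]:
    """Client-side illustration only; rpeel_one is a hypothetical helper."""
    if times < 1:
        raise ValueError("times must be >= 1")
    pos = len(s)
    for _ in range(times):
        # Search for the next delimiter to the left of the previous hit.
        pos = s.rfind(delimiter, 0, pos)
        if pos == -1:
            # Fewer than `times` delimiters: with keepPartial=False the whole
            # string stays on the left (assumed: the right field is empty).
            return s, ""
    return s[:pos], s[pos + len(delimiter):]


strings = ["a.b", "c.d", "e.f.g"]
left, right = zip(*(rpeel_one(s, ".") for s in strings))
print(list(left), list(right))
```

With times=2 the same helper skips the last delimiter, so 'e.f.g' is divided into 'e' and 'f.g'.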
- search(pattern: bytes | arkouda.numpy.dtypes.str_scalars) arkouda.pandas.match.Match[source]¶
Return a match object with the first location in each element where pattern produces a match. Elements match if any part of the string matches the regular expression pattern.
- Parameters:
pattern (bytes or str_scalars) – Regex used to find matches
- Returns:
Match object where elements match if any part of the string matches the regular expression pattern
- Return type:
Match
Examples
>>> import arkouda as ak
>>> strings = ak.array(['1_2___', '____', '3', '__4___5____6___7', ''])
>>> strings.search('_+')
<ak.Match object: matched=True, span=(1, 2); matched=True, span=(0, 4); matched=False; matched=True, span=(0, 2); matched=False>
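For readers who know Python's standard re module, the per-element behaviour of search can be approximated locally as below. This is only an analogy: arkouda evaluates the pattern on the server with re2, whose syntax differs from re in places (for example, no lookaheads or lookbehinds).

```python
import re

strings = ['1_2___', '____', '3', '__4___5____6___7', '']
pattern = re.compile('_+')

# For each element, record the span of the first match, or None if no match.
spans = []
for s in strings:
    m = pattern.search(s)
    spans.append(m.span() if m else None)

print(spans)
```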
- shape: Tuple[int]¶
- split(delimiter: str, return_segments: bool = False, regex: bool = False) Strings | Tuple[source]¶
Unpack delimiter-joined substrings into a flat array.
- Parameters:
delimiter (str) – Characters used to split strings into substrings
return_segments (bool, default=False) – If True, also return mapping of original strings to first substring in return array.
regex (bool, default=False) – Indicates whether delimiter is a regular expression. Note: only handles regular expressions supported by re2 (does not support lookaheads/lookbehinds).
- Returns:
- Strings
Flattened substrings with delimiters removed
- pdarray, int64 (optional)
For each original string, the index of the first corresponding substring in the return array
- Return type:
Strings | Tuple
Examples
>>> import arkouda as ak
>>> orig = ak.array(['one|two', 'three|four|five', 'six'])
>>> orig.split('|')
array(['one', 'two', 'three', 'four', 'five', 'six'])
>>> flat, mapping = orig.split('|', return_segments=True)
>>> mapping
array([0 2 5])
>>> under = ak.array(['one_two', 'three_____four____five', 'six'])
>>> under_split, under_map = under.split('_+', return_segments=True, regex=True)
>>> under_split
array(['one', 'two', 'three', 'four', 'five', 'six'])
>>> under_map
array([0 2 5])
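The relationship between the flattened substrings and the segments mapping can be sketched in plain Python as follows. The function split_flat is a hypothetical client-side stand-in, and it uses Python's re module where arkouda uses re2 on the server.

```python
import re
from typing import List, Tuple


def split_flat(strings: List[str], delimiter: str,
               regex: bool = False) -> Tuple[List[str], List[int]]:
    """Hypothetical helper mirroring split(..., return_segments=True)."""
    flat: List[str] = []
    segments: List[int] = []
    for s in strings:
        # The segment is the index in `flat` where this string's pieces begin.
        segments.append(len(flat))
        parts = re.split(delimiter, s) if regex else s.split(delimiter)
        flat.extend(parts)
    return flat, segments


flat, mapping = split_flat(['one|two', 'three|four|five', 'six'], '|')
print(flat, mapping)
```

Slicing flat between consecutive entries of mapping recovers the pieces that belonged to each original string.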
- startswith(substr: bytes | arkouda.numpy.dtypes.str_scalars, regex: bool = False) arkouda.numpy.pdarrayclass.pdarray[source]¶
Check whether each element starts with the given substring.
- Parameters:
substr (bytes or str_scalars) – The prefix to search for
regex (bool, default=False) – Indicates whether substr is a regular expression. Note: only handles regular expressions supported by re2 (does not support lookaheads/lookbehinds).
- Returns:
True for elements that start with substr, False otherwise
- Return type:
pdarray
- Raises:
TypeError – Raised if the substr parameter is not bytes or str_scalars
ValueError – Raised if substr is not a valid regex
RuntimeError – Raised if there is a server-side error thrown
See also
Examples
>>> import arkouda as ak
>>> strings_end = ak.array([f'string {i}' for i in range(1, 6)])
>>> strings_end
array(['string 1', 'string 2', 'string 3', 'string 4', 'string 5'])
>>> strings_end.startswith('string')
array([True True True True True])
>>> strings_start = ak.array([f'{i} string' for i in range(1, 6)])
>>> strings_start
array(['1 string', '2 string', '3 string', '4 string', '5 string'])
>>> strings_start.startswith('\\d str', regex=True)
array([True True True True True])
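The difference between the literal and regex modes can be illustrated per element with standard Python. The helper startswith_one is hypothetical, and re.match stands in for the server-side re2 matching anchored at the start of each string.

```python
import re


def startswith_one(s: str, substr: str, regex: bool = False) -> bool:
    """Hypothetical per-element analogue of Strings.startswith."""
    if regex:
        # re.match only succeeds if the pattern matches at position 0.
        return re.match(substr, s) is not None
    return s.startswith(substr)


strings_start = [f'{i} string' for i in range(1, 6)]
literal = [startswith_one(s, 'string') for s in strings_start]
as_regex = [startswith_one(s, '\\d str', regex=True) for s in strings_start]
print(literal, as_regex)
```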
- stick(other: Strings, delimiter: bytes | arkouda.numpy.dtypes.str_scalars = '', toLeft: bool = False) Strings[source]¶
Join the strings from another array onto one end of the strings of this array, optionally inserting a delimiter. Warning: This function is experimental and not guaranteed to work.
- Parameters:
other (Strings) – The strings to join onto self’s strings
delimiter (bytes or str_scalars, default="") – String inserted between self and other
toLeft (bool, default=False) – If true, join other strings to the left of self. By default, other is joined to the right of self.
- Returns:
The array of joined strings
- Return type:
Strings
- Raises:
TypeError – Raised if the delimiter parameter is not bytes or str_scalars or if the other parameter is not a Strings instance
ValueError – Raised if times is < 1
RuntimeError – Raised if there is a server-side error thrown
Examples
>>> import arkouda as ak
>>> s = ak.array(['a', 'c', 'e'])
>>> t = ak.array(['b', 'd', 'f'])
>>> s.stick(t, delimiter='.')
array(['a.b', 'c.d', 'e.f'])
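The effect of toLeft is easiest to see in a small element-wise sketch. The function below is a hypothetical plain-Python analogue operating on lists; the length check is an assumption about how mismatched inputs should be treated, not documented arkouda behaviour.

```python
from typing import List


def stick_lists(self_strings: List[str], other: List[str],
                delimiter: str = "", to_left: bool = False) -> List[str]:
    """Hypothetical list-based analogue of Strings.stick."""
    if len(self_strings) != len(other):
        raise ValueError("both inputs must have the same length")  # assumed
    if to_left:
        # other goes on the left of self
        return [o + delimiter + s for s, o in zip(self_strings, other)]
    # default: other goes on the right of self
    return [s + delimiter + o for s, o in zip(self_strings, other)]


s = ['a', 'c', 'e']
t = ['b', 'd', 'f']
print(stick_lists(s, t, delimiter='.'))
print(stick_lists(s, t, delimiter='.', to_left=True))
```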
- strip(chars: bytes | arkouda.numpy.dtypes.str_scalars | None = '') Strings[source]¶
Return a new Strings object with all leading and trailing occurrences of characters contained in chars removed. The chars argument is a string specifying the set of characters to be removed. If omitted, the chars argument defaults to removing whitespace. The chars argument is not a prefix or suffix; rather, all combinations of its values are stripped.
- Parameters:
chars (bytes or str_scalars, optional) – the set of characters to be removed
- Returns:
Strings object with the leading and trailing characters matching the set of characters in the chars argument removed
- Return type:
Strings
- Raises:
RuntimeError – Raised if there is a server-side error thrown
Examples
>>> import arkouda as ak
>>> strings = ak.array(['Strings ', ' StringS ', 'StringS '])
>>> s = strings.strip()
>>> s
array(['Strings', 'StringS', 'StringS'])
>>> strings = ak.array(['Strings 1', '1 StringS ', ' 1StringS 12 '])
>>> s = strings.strip(' 12')
>>> s
array(['Strings', 'StringS', 'StringS'])
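The "set of characters" behaviour matches Python's built-in str.strip, which makes for a convenient local illustration. The helper strip_one is hypothetical; it maps the empty default to whitespace stripping, because in plain Python an empty chars string would strip nothing.

```python
def strip_one(s: str, chars: str = "") -> str:
    """Hypothetical per-element analogue of Strings.strip."""
    # An empty chars value means "strip whitespace", as described above.
    return s.strip(chars if chars else None)


examples = ['Strings 1', '1 StringS ', ' 1StringS 12 ']
stripped = [strip_one(s, ' 12') for s in examples]
print(stripped)

# chars is a set, not a prefix or suffix: any mix of 'x' and 'y' is removed.
print(strip_one('xyyxhelloyx', 'xy'))
```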
- sub(pattern: bytes | arkouda.numpy.dtypes.str_scalars, repl: bytes | arkouda.numpy.dtypes.str_scalars, count: int = 0) Strings[source]¶
Return new Strings obtained by replacing non-overlapping occurrences of pattern with the replacement repl.
If count is nonzero, at most count substitutions occur.
- Parameters:
pattern (bytes or str_scalars) – The regex to substitute
repl (bytes or str_scalars) – The substring to replace pattern matches with
count (int, default=0) – The max number of pattern match occurrences in each element to replace. The default count=0 replaces all occurrences of pattern with repl
- Returns:
Strings with pattern matches replaced
- Return type:
Strings
- Raises:
TypeError – Raised if pattern or repl are not bytes or str_scalars
ValueError – Raised if pattern is not a valid regex
RuntimeError – Raised if there is a server-side error thrown
See also
Examples
>>> import arkouda as ak
>>> strings = ak.array(['1_2___', '____', '3', '__4___5____6___7', ''])
>>> strings.sub(pattern='_+', repl='-', count=2)
array(['1-2-', '-', '3', '-4-5____6___7', ''])
- subn(pattern: bytes | arkouda.numpy.dtypes.str_scalars, repl: bytes | arkouda.numpy.dtypes.str_scalars, count: int = 0) Tuple[Strings, arkouda.numpy.pdarrayclass.pdarray][source]¶
Perform the same operation as sub(), but return a tuple (new_Strings, number_of_substitutions).
- Parameters:
pattern (bytes or str_scalars) – The regex to substitute
repl (bytes or str_scalars) – The substring to replace pattern matches with
count (int, default=0) – The max number of pattern match occurrences in each element to replace. The default count=0 replaces all occurrences of pattern with repl
- Returns:
- Strings
Strings with pattern matches replaced
- pdarray, int64
The number of substitutions made for each element of Strings
- Return type:
Tuple[Strings, pdarray]
- Raises:
TypeError – Raised if pattern or repl are not bytes or str_scalars
ValueError – Raised if pattern is not a valid regex
RuntimeError – Raised if there is a server-side error thrown
See also
Examples
>>> import arkouda as ak
>>> strings = ak.array(['1_2___', '____', '3', '__4___5____6___7', ''])
>>> strings.subn(pattern='_+', repl='-', count=2)
(array(['1-2-', '-', '3', '-4-5____6___7', '']), array([2 1 0 2 0]))
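The count semantics shared by sub and subn mirror Python's re.sub and re.subn, so the example above can be reproduced locally as a rough check. This uses the standard re module rather than the server's re2 engine, so treat it as an analogy, not as the arkouda implementation.

```python
import re

strings = ['1_2___', '____', '3', '__4___5____6___7', '']

# re.subn returns (new_string, number_of_substitutions) for each element;
# count=2 caps the replacements per element, count=0 would replace all.
results = [re.subn('_+', '-', s, count=2) for s in strings]
new_strings = [r[0] for r in results]
num_subs = [r[1] for r in results]

print(new_strings)
print(num_subs)
```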
- take(indices: arkouda.numpy.dtypes.numeric_scalars | arkouda.numpy.pdarrayclass.pdarray, axis: int | None = None) Strings[source]¶
Take elements from the array along an axis.
When axis is not None, this function does the same thing as “fancy” indexing (indexing arrays using arrays); however, it can be easier to use if you need elements along a given axis. A call such as np.take(arr, indices, axis=3) is equivalent to arr[:,:,:,indices,...].
- Parameters:
indices (numeric_scalars or pdarray) – The indices of the values to extract. Also allow scalars for indices.
axis (int, optional) – The axis over which to select values. By default, the flattened input array is used.
- Returns:
A Strings containing the selected elements.
- Return type:
Strings
Examples
>>> import arkouda as ak
>>> a = ak.array(["a","b","c"])
>>> indices = [0, 1]
>>> a.take(indices)
array(['a', 'b'])
- title() Strings[source]¶
Return a new Strings in which each element of the original is converted to title case.
- Returns:
Strings with each element of the original converted to title case.
- Return type:
Strings
- Raises:
RuntimeError – Raised if there is a server-side error thrown.
See also
Strings.lower, Strings.upper
Examples
>>> import arkouda as ak
>>> strings = ak.array([f'StrINgS {i}' for i in range(5)])
>>> strings
array(['StrINgS 0', 'StrINgS 1', 'StrINgS 2', 'StrINgS 3', 'StrINgS 4'])
>>> strings.title()
array(['Strings 0', 'Strings 1', 'Strings 2', 'Strings 3', 'Strings 4'])
- to_csv(prefix_path: str, dataset: str = 'strings_array', col_delim: str = ',', overwrite: bool = False) str[source]¶
Write Strings to CSV file(s). The file will contain a single column with the Strings data. All CSV files written by Arkouda include a header denoting the data types of the columns. Unlike other file formats, CSV files store Strings in UTF-8 format instead of storing bytes as uint(8).
- Parameters:
prefix_path (str) – The filename prefix to be used for saving files. Files will have _LOCALE#### appended when they are written to disk.
dataset (str, default="strings_array") – Column name to save the Strings under.
col_delim (str, default=",") – Value used to separate columns within the file. Make sure this value DOES NOT appear in your dataset.
overwrite (bool, default=False) – If True, any existing files matching the provided prefix_path will be overwritten. If False, an error will be returned if existing files are found.
- Returns:
response message
- Return type:
str
- Raises:
ValueError – Raised if all datasets are not present in all parquet files or if one or more of the specified files do not exist
RuntimeError – Raised if one or more of the specified files cannot be opened. If allow_errors is true this may be raised if no values are returned from the server.
TypeError – Raised if we receive an unknown arkouda_type returned from the server
Notes
CSV format is not currently supported by load/load_all operations
The column delimiter is expected to be the same for column names and data
Be sure that column delimiters are not found within your data.
All CSV files must delimit rows using newline (\n) at this time.
- to_hdf(prefix_path: str, dataset: str = 'strings_array', mode: Literal['truncate', 'append'] = 'truncate', save_offsets: bool = True, file_type: Literal['single', 'distribute'] = 'distribute') str[source]¶
Save the Strings object to HDF5. The object can be saved to a collection of files or single file.
- Parameters:
prefix_path (str) – Directory and filename prefix that all output files share
dataset (str, default="strings_array") – The name of the Strings dataset to be written, defaults to strings_array
mode ({"truncate", "append"}, default = "truncate") – By default, truncate (overwrite) output files, if they exist. If ‘append’, create a new Strings dataset within existing files.
save_offsets (bool, default=True) – If True, instructs the server to save the offsets array to HDF5. If False, the offsets array will not be saved and will be derived from the string values upon load/read.
file_type ({"single", "distribute"}, default = "distribute") – With "distribute", the dataset is distributed over one file per locale. With "single", the dataset is saved to one file.
- Returns:
String message indicating result of save operation
- Return type:
str
- Raises:
RuntimeError – Raised if a server-side error is thrown saving the pdarray
Notes
Parquet files do not store the segments, only the values.
Strings state is saved as two datasets within an hdf5 group: one for the string characters and one for the segments corresponding to the start of each string.
The hdf5 group is named via the dataset parameter.
The prefix_path must be visible to the arkouda server and the user must have write permission.
Output files have names of the form <prefix_path>_LOCALE<i>, where <i> ranges from 0 to numLocales for file_type=’distribute’. Otherwise, the file name will be prefix_path.
If any of the output files already exist and the mode is ‘truncate’, they will be overwritten. If the mode is ‘append’ and the number of output files is less than the number of locales or a dataset with the same name already exists, a RuntimeError will result.
Any file extension can be used. The file I/O does not rely on the extension to determine the file format.
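The save_offsets=False option relies on the fact that offsets can be recomputed from the values alone. The sketch below shows one way that derivation could work, assuming each string in the bytes buffer is terminated by a null byte; the function name offsets_from_bytes is hypothetical and this is not the server's actual loading code.

```python
from typing import List


def offsets_from_bytes(values: bytes) -> List[int]:
    """Hypothetical helper: recover string start offsets from a
    null-terminated bytes buffer (assumed layout)."""
    offsets: List[int] = []
    start = 0
    for i, b in enumerate(values):
        if b == 0:
            # A null byte closes the string that began at `start`.
            offsets.append(start)
            start = i + 1
    return offsets


raw = b"hello\x00my\x00world\x00"
offsets = offsets_from_bytes(raw)
# Rebuild the strings from the offsets to confirm they are consistent.
decoded = [raw[o:raw.index(b"\x00", o)].decode("utf-8") for o in offsets]
print(offsets, decoded)
```

Skipping the offsets dataset saves space on disk at the cost of a scan like this when the data is read back.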
- to_ndarray() numpy.ndarray[source]¶
Convert the array to a np.ndarray, transferring array data from the arkouda server to Python. If the array exceeds a built-in size limit, a RuntimeError is raised.
- Returns:
A numpy ndarray with the same strings as this array
- Return type:
np.ndarray
Notes
The number of bytes in the array cannot exceed ak.core.client.maxTransferBytes, otherwise a RuntimeError will be raised. This is to protect the user from overflowing the memory of the system on which the Python client is running, under the assumption that the server is running on a distributed system with much more memory than the client. The user may override this limit by setting ak.core.client.maxTransferBytes to a larger value, but proceed with caution.
See also
array, tolist
Examples
>>> import arkouda as ak
>>> a = ak.array(["hello", "my", "world"])
>>> a.to_ndarray()
array(['hello', 'my', 'world'], dtype='<U5')
>>> type(a.to_ndarray())
<class 'numpy.ndarray'>
- to_parquet(prefix_path: str, dataset: str = 'strings_array', mode: Literal['truncate', 'append'] = 'truncate', compression: Literal['snappy', 'gzip', 'brotli', 'zstd', 'lz4'] | None = None) str[source]¶
Save the Strings object to Parquet. The result is a collection of files, one file per locale of the arkouda server, where each filename starts with prefix_path. Each locale saves its chunk of the array to its corresponding file.
- Parameters:
prefix_path (str) – Directory and filename prefix that all output files share
dataset (str, default="strings_array") – Name of the dataset to create in files (must not already exist)
mode ({"truncate", "append"}, default = "truncate") – By default, truncate (overwrite) output files, if they exist. If ‘append’, attempt to create new dataset in existing files.
compression ({"snappy", "gzip", "brotli", "zstd", "lz4"}, optional) – Sets the compression type used with Parquet files
- Returns:
string message indicating result of save operation
- Return type:
str
- Raises:
RuntimeError – Raised if a server-side error is thrown saving the pdarray
Notes
The prefix_path must be visible to the arkouda server and the user must have write permission.
Output files have names of the form <prefix_path>_LOCALE<i>, where <i> ranges from 0 to numLocales.
‘append’ write mode is supported, but is not efficient.
If any of the output files already exist and the mode is ‘truncate’, they will be overwritten. If the mode is ‘append’ and the number of output files is less than the number of locales or a dataset with the same name already exists, a RuntimeError will result.
Any file extension can be used. The file I/O does not rely on the extension to determine the file format.
- tolist() List[str][source]¶
Convert the Strings to a list, transferring data from the arkouda server to Python. If the Strings exceeds a built-in size limit, a RuntimeError is raised.
- Returns:
A list with the same strings as this Strings object
- Return type:
List[str]
Notes
The number of bytes in the array cannot exceed ak.core.client.maxTransferBytes, otherwise a RuntimeError will be raised. This is to protect the user from overflowing the memory of the system on which the Python client is running, under the assumption that the server is running on a distributed system with much more memory than the client. The user may override this limit by setting ak.core.client.maxTransferBytes to a larger value, but proceed with caution.
Examples
>>> import arkouda as ak
>>> a = ak.array(["hello", "my", "world"])
>>> a.tolist()
['hello', 'my', 'world']
>>> type(a.tolist())
<class 'list'>
- transfer(hostname: str, port: arkouda.numpy.dtypes.int_scalars) str | memoryview[source]¶
Send a Strings object to a different Arkouda server.
- Parameters:
hostname (str) – The hostname where the Arkouda server intended to receive the Strings object is running.
port (int_scalars) – The port to send the array over. This needs to be an open port (i.e., not one that the Arkouda server is running on). The transfer opens numLocales ports in succession, using the range {port..(port+numLocales-1)}. For example, if the Arkouda server runs on 4 nodes and 1234 is passed as port, Arkouda will use ports 1234, 1235, 1236, and 1237 to send the array data. This port must match the port passed to the call to ak.receive_array().
- Returns:
A message indicating a complete transfer
- Return type:
str | memoryview
- Raises:
ValueError – Raised if the op is not within the pdarray.BinOps set
TypeError – Raised if other is not a pdarray or the pdarray.dtype is not a supported dtype
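When planning firewall rules for a transfer, it can help to enumerate the ports the port parameter implies. The helper below is a hypothetical convenience, not an arkouda function; it simply follows the 4-node example given for the port parameter.

```python
from typing import List


def transfer_ports(port: int, num_locales: int) -> List[int]:
    """Hypothetical helper: ports used when sending from a server with
    `num_locales` locales, starting at `port`."""
    if num_locales < 1:
        raise ValueError("num_locales must be >= 1")
    # One consecutive port per locale, beginning at the requested port.
    return list(range(port, port + num_locales))


print(transfer_ports(1234, 4))
```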
- unregister() None[source]¶
Unregister a Strings object in the arkouda server which was previously registered using register() and/or attached to using attach().
- Raises:
RuntimeError – Raised if the server could not find the internal name/symbol to remove
See also
register, attach
Notes
Registered names/Strings objects in the server are immune to deletion until they are unregistered.
- update_hdf(prefix_path: str, dataset: str = 'strings_array', save_offsets: bool = True, repack: bool = True) str[source]¶
Overwrite the dataset with the name provided with this Strings object.
If the dataset does not exist it is added.
- Parameters:
prefix_path (str) – Directory and filename prefix that all output files share
dataset (str, default="strings_array") – Name of the dataset to create in files
save_offsets (bool, default=True) – If True, instructs the server to save the offsets array to HDF5. If False, the offsets array will not be saved and will be derived from the string values upon load/read.
repack (bool, default=True) – HDF5 does not release memory on delete. When True, the inaccessible data (that was overwritten) is removed. When False, the data remains but is inaccessible. Setting this to False yields better performance but causes file sizes to grow.
- Returns:
success message if successful
- Return type:
str
- Raises:
RuntimeError – Raised if a server-side error is thrown saving the Strings object
Notes
If the file does not contain a File_Format attribute to indicate how it was saved, the file name is checked for _LOCALE#### to determine if it is distributed.
If the dataset provided does not exist, it will be added.
- upper() Strings[source]¶
Return a new Strings with all lowercase characters from the original replaced with their uppercase equivalent.
- Returns:
Strings with all lowercase characters from the original replaced with their uppercase equivalent
- Return type:
Strings
- Raises:
RuntimeError – Raised if there is a server-side error thrown
See also
Examples
>>> import arkouda as ak
>>> strings = ak.array([f'StrINgS {i}' for i in range(5)])
>>> strings
array(['StrINgS 0', 'StrINgS 1', 'StrINgS 2', 'StrINgS 3', 'StrINgS 4'])
>>> strings.upper()
array(['STRINGS 0', 'STRINGS 1', 'STRINGS 2', 'STRINGS 3', 'STRINGS 4'])