arkouda.match
Classes
Match – Encapsulates regular expression match results on Arkouda segmented string arrays.
Package Contents
- class arkouda.match.Match(matched: arkouda.numpy.pdarrayclass.pdarray, starts: arkouda.numpy.pdarrayclass.pdarray, lengths: arkouda.numpy.pdarrayclass.pdarray, indices: arkouda.numpy.pdarrayclass.pdarray, parent_entry_name: str, match_type: MatchType, pattern: str)
Encapsulates regular expression match results on Arkouda segmented string arrays.
Created by calling search(), match(), or fullmatch() on a Strings object. Provides access to per-element match booleans, span information (start and end positions), capture groups, and the indices of the original strings that produced each match.
- re
Regex pattern used.
- Type:
str
Examples
>>> import arkouda as ak
>>> strings = ak.array(['1_2___', '____', '3', '__4___5____6___7', ''])
>>> m = strings.search('_+')
>>> m
<ak.Match object: matched=True, span=(1, 2); matched=True, span=(0, 4); matched=False; matched=True, span=(0, 2); matched=False>
>>> type(m)
<class 'arkouda.pandas.match.Match'>
>>> m.matched()
array([True True False True False])
>>> m.start()
array([1 0 0])
>>> m.end()
array([2 4 2])
>>> m.match_type()
'SEARCH'
>>> m.re
'_+'
>>> m[1]
'matched=True, span=(0, 4)'
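Note that start() and end() return one entry per matching string, so they are shorter than matched(). The sketch below reproduces the values above with Python's standard re module on a plain list. It is only an analogy for reading the output, not Arkouda's server-side implementation, and the variable names are illustrative.

```python
import re

strings = ['1_2___', '____', '3', '__4___5____6___7', '']
pattern = re.compile('_+')

# One result per input string; None where the pattern is not found.
results = [pattern.search(s) for s in strings]

# One boolean per input string.
matched = [r is not None for r in results]

# One entry per *matching* string only, so these lists are shorter than `matched`.
starts = [r.start() for r in results if r is not None]
ends = [r.end() for r in results if r is not None]

print(matched)  # [True, True, False, True, False]
print(starts)   # [1, 0, 0]
print(ends)     # [2, 4, 2]
```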
- end() → arkouda.numpy.pdarrayclass.pdarray
Return the ends of matches.
- Returns:
The end positions of matches
- Return type:
arkouda.numpy.pdarrayclass.pdarray
Examples
>>> import arkouda as ak
>>> strings = ak.array(['1_2___', '____', '3', '__4___5____6___7', ''])
>>> strings.search('_+').end()
array([2 4 2])
- find_matches(return_match_origins: bool = False)
Return all matches as a new Strings object.
- Parameters:
return_match_origins (bool) – If True, also return a pdarray containing, for each match, the index of the original string it came from
- Returns:
Strings – Strings object containing only matches
pdarray, int64 (optional) – The index of the original string each pattern match is from
- Raises:
RuntimeError – Raised if the server reports an error
Examples
>>> import arkouda as ak
>>> strings = ak.array(['1_2___', '____', '3', '__4___5____6___7', ''])
>>> strings.search('_+').find_matches(return_match_origins=True)
(array(['_', '____', '__']), array([0 1 3]))
- group(group_num: int = 0, return_group_origins: bool = False)
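The origins array lines up element by element with the returned matches: origins[i] is the position in the original array of the string that produced matches[i]. A minimal pure-Python sketch of that pairing, for a search-type match, follows. The helper find_first_matches is hypothetical and only mirrors the output shown above; it is not part of Arkouda.

```python
import re

def find_first_matches(strings, pattern):
    """Hypothetical helper: first match per string, plus the index it came from."""
    matches, origins = [], []
    for i, s in enumerate(strings):
        m = re.search(pattern, s)
        if m is not None:
            matches.append(m.group(0))  # the matched substring
            origins.append(i)           # index of the originating string
    return matches, origins

matches, origins = find_first_matches(
    ['1_2___', '____', '3', '__4___5____6___7', ''], '_+'
)
print(matches)  # ['_', '____', '__']
print(origins)  # [0, 1, 3]
```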
Return a new Strings containing the capture group corresponding to group_num.
For the default, group_num=0, return the full match.
- Parameters:
group_num (int) – The index of the capture group to be returned
return_group_origins (bool) – If True, return a pdarray containing the index of the original string each capture group is from
- Returns:
Strings – Strings object containing only the capture groups corresponding to group_num
pdarray, int64 (optional) – The index of the original string each group is from
Examples
>>> import arkouda as ak
>>> strings = ak.array(["Isaac Newton, physics", '<-calculus->', 'Gottfried Leibniz, math'])
>>> m = strings.search("(\\w+) (\\w+)")
>>> m.group()
array(['Isaac Newton', 'Gottfried Leibniz'])
>>> m.group(1)
array(['Isaac', 'Gottfried'])
>>> m.group(2, return_group_origins=True)
(array(['Newton', 'Leibniz']), array([0 2]))
- match_type() → str
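Group numbering follows the usual regex convention: group 0 is the whole match and groups 1, 2, ... are the parenthesized subpatterns, counted left to right. The sketch below shows the same selection with Python's re module on a plain list. The helper capture_group is hypothetical and illustrates the semantics only, not Arkouda's implementation.

```python
import re

def capture_group(strings, pattern, group_num=0):
    """Hypothetical helper: one capture group per matching string, plus origins."""
    groups, origins = [], []
    for i, s in enumerate(strings):
        m = re.search(pattern, s)
        if m is not None:
            groups.append(m.group(group_num))
            origins.append(i)
    return groups, origins

names = ["Isaac Newton, physics", '<-calculus->', 'Gottfried Leibniz, math']
pattern = r"(\w+) (\w+)"

full, _ = capture_group(names, pattern)             # group 0: the whole match
first, _ = capture_group(names, pattern, 1)         # first parenthesized group
second, origins = capture_group(names, pattern, 2)  # second group, with origins

print(full)             # ['Isaac Newton', 'Gottfried Leibniz']
print(first)            # ['Isaac', 'Gottfried']
print(second, origins)  # ['Newton', 'Leibniz'] [0, 2]
```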
Return the type of the Match object.
- Returns:
MatchType of the Match object
- Return type:
str
Examples
>>> import arkouda as ak
>>> strings = ak.array(['1_2___', '____', '3', '__4___5____6___7', ''])
>>> strings.search('_+').match_type()
'SEARCH'
- matched() → arkouda.numpy.pdarrayclass.pdarray
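The match type records which of search(), match(), or fullmatch() produced the object. Assuming these follow the same semantics as the identically named functions in Python's re module (an assumption, not something this page states), the three differ in where the pattern is allowed to match. A small comparison on the same input, using re on a plain list:

```python
import re

strings = ['1_2___', '____', '3', '__4___5____6___7', '']
pattern = re.compile('_+')

# search: the pattern may occur anywhere in the string.
by_search = [pattern.search(s) is not None for s in strings]
# match: the pattern must occur at the start of the string.
by_match = [pattern.match(s) is not None for s in strings]
# fullmatch: the pattern must cover the entire string.
by_fullmatch = [pattern.fullmatch(s) is not None for s in strings]

print(by_search)     # [True, True, False, True, False]
print(by_match)      # [False, True, False, True, False]
print(by_fullmatch)  # [False, True, False, False, False]
```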
Return a boolean array indicating whether each element matched.
- Returns:
True for elements that match, False otherwise
- Return type:
arkouda.numpy.pdarrayclass.pdarray
Examples
>>> import arkouda as ak
>>> strings = ak.array(['1_2___', '____', '3', '__4___5____6___7', ''])
>>> strings.search('_+').matched()
array([True True False True False])
- re: str
- start() → arkouda.numpy.pdarrayclass.pdarray
Return the starts of matches.
- Returns:
The start positions of matches
- Return type:
arkouda.numpy.pdarrayclass.pdarray
Examples
>>> import arkouda as ak
>>> strings = ak.array(['1_2___', '____', '3', '__4___5____6___7', ''])
>>> strings.search('_+').start()
array([1 0 0])