arkouda.pandas.index

Index and MultiIndex classes for Arkouda Series and DataFrames.

This module defines the foundational indexing structures used in Arkouda’s pandas-like API, supporting labeled indexing, alignment, and grouping operations. Indexes provide the mechanism to assign meaningful labels to rows and columns.

Classes

class Index

One-dimensional immutable sequence used to label and align axis data. Accepts various types of inputs including pdarray, Strings, Categorical, Python lists, or pandas Index/Categorical objects. Supports optional name and lightweight list-based indexing for small inputs.

class MultiIndex

A multi-level index for complex datasets, composed of multiple Index-like arrays (“levels”). Each level may contain categorical, string, or numeric values. Supports construction from a list of arrays or a pandas.MultiIndex.

Features

  • Flexible input types for index construction

  • Support for named and multi-level indexing

  • Efficient size and shape inference

  • Alignment and equality comparison logic

  • Integration with Arkouda Series and DataFrames

Notes

  • MultiIndex currently does not support construction from tuples; it must be created from lists of values or pandas MultiIndex objects.

  • Only one-dimensional (1D) indexing is supported at this time.

  • All level arrays in a MultiIndex must have the same length.

Examples

>>> import arkouda as ak
>>> from arkouda.pandas.index import Index, MultiIndex
>>> idx = Index([10, 20, 30], name="id")
>>> idx
Index(array([10 20 30]), dtype='int64')
>>> midx = MultiIndex([ak.array([1, 2]), ak.array(["a", "b"])], names=["num", "char"])
>>> midx.nlevels
2
>>> midx.get_level_values("char")
Index(array(['a', 'b']), dtype='<U0')

Classes

Index

Sequence used for indexing and alignment.

MultiIndex

A multi-level, or hierarchical, index object for Arkouda DataFrames and Series.

Module Contents

class arkouda.pandas.index.Index(values: List | arkouda.numpy.pdarrayclass.pdarray | arkouda.numpy.strings.Strings | arkouda.pandas.categorical.Categorical | pandas.Index | Index | pandas.Categorical, name: str | None = None, allow_list=False, max_list_size=1000)[source]

Sequence used for indexing and alignment.

The basic object storing axis labels for all DataFrame objects.

Parameters:
  • values (List, pdarray, Strings, Categorical, pandas.Categorical, pandas.Index, or Index)

  • name (str, default=None) – Name to be stored in the index.

  • allow_list (bool, default=False) – If False, list values will be converted to a pdarray. If True, list values will remain as a list, provided the data length is less than max_list_size.

  • max_list_size (int, default=1000) – The maximum allowed data length for the values to be stored as a list object.

Raises:

ValueError – Raised if allow_list=True and the size of values is > max_list_size.
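
The interaction between allow_list and max_list_size can be sketched in plain Python. This is a minimal model of the documented rule, not Arkouda's implementation: the helper name choose_storage is hypothetical, and it assumes (following the Raises entry) that a list whose length equals max_list_size is still accepted.

```python
def choose_storage(values, allow_list=False, max_list_size=1000):
    """Return 'list' or 'pdarray' for list input (hypothetical helper).

    Models only the documented rule; no Arkouda objects are created.
    """
    if not isinstance(values, list):
        raise TypeError("this sketch only models list input")
    if not allow_list:
        return "pdarray"  # list values are converted to a pdarray
    if len(values) > max_list_size:
        # Mirrors the documented ValueError for oversized lists.
        raise ValueError("list is longer than max_list_size")
    return "list"  # small list is kept as a list


print(choose_storage([1, 2, 3]))                   # pdarray
print(choose_storage([1, 2, 3], allow_list=True))  # list
```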

See also

MultiIndex

Examples

>>> import arkouda as ak
>>> ak.Index([1, 2, 3])
Index(array([1 2 3]), dtype='int64')
>>> ak.Index(list('abc'))
Index(array(['a', 'b', 'c']), dtype='<U0')
>>> ak.Index([1, 2, 3], allow_list=True)
Index([1, 2, 3], dtype='int64')
argsort(ascending: bool = True) list | arkouda.numpy.pdarrayclass.pdarray[source]

Return the permutation that sorts the Index.

Parameters:

ascending (bool, optional) – If True (default), sort in ascending order. If False, sort in descending order.

Returns:

Indices that would sort the Index.

Return type:

list or pdarray

Examples

>>> import arkouda as ak
>>> idx = ak.Index([10, 3, 5])
>>> idx.argsort()
array([1 2 0])
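
To make the meaning of the returned permutation concrete, the following plain-Python sketch reproduces the result above without a server. The helper argsort_model is hypothetical and only models distinct values; how Arkouda orders ties is not stated here.

```python
def argsort_model(values, ascending=True):
    """Positions that would sort `values` (plain-Python model)."""
    return sorted(range(len(values)), key=values.__getitem__,
                  reverse=not ascending)


values = [10, 3, 5]
perm = argsort_model(values)
print(perm)                       # [1, 2, 0]
print([values[i] for i in perm])  # [3, 5, 10]
```

Applying the permutation to the original values yields them in sorted order, which is the defining property of an argsort result.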
concat(other)[source]

Concatenate this Index with another Index.

Parameters:

other (Index) – The Index to concatenate with this one.

Returns:

A new Index with values from both indices.

Return type:

Index

Raises:

TypeError – If the types of the two Index objects do not match.

equals(other: Index) arkouda.numpy.dtypes.bool_scalars[source]

Whether Indexes are the same size, and all entries are equal.

Parameters:

other (Index) – object to compare.

Returns:

True if the Indexes are the same, otherwise False.

Return type:

bool_scalars

Examples

>>> import arkouda as ak
>>> i = ak.Index([1, 2, 3])
>>> i_cpy = ak.Index([1, 2, 3])
>>> i.equals(i_cpy)
np.True_
>>> i2 = ak.Index([1, 2, 4])
>>> i.equals(i2)
np.False_

MultiIndex case:

>>> arrays = [ak.array([1, 1, 2, 2]), ak.array(["red", "blue", "red", "blue"])]
>>> m = ak.MultiIndex(arrays, names=["numbers2", "colors2"])
>>> m.equals(m)
True
>>> arrays2 = [ak.array([1, 1, 2, 2]), ak.array(["red", "blue", "red", "green"])]
>>> m2 = ak.MultiIndex(arrays2, names=["numbers2", "colors2"])
>>> m.equals(m2)
False
static factory(index)[source]

Construct an Index or MultiIndex based on the input.

Parameters:

index (array-like or tuple of array-like) – If a single array-like, returns an Index. If a tuple of array-like objects, returns a MultiIndex.

Returns:

An Index if input is a single array-like, or a MultiIndex otherwise.

Return type:

Index or MultiIndex

classmethod from_return_msg(rep_msg)[source]

Reconstruct an Index or MultiIndex from a return message.

Parameters:

rep_msg (str) – A string return message containing encoded index information.

Returns:

The reconstructed Index or MultiIndex instance.

Return type:

Index or MultiIndex

property inferred_type: str

Return a string of the type inferred from the values.

is_registered()[source]

Return whether the object is registered.

Return True iff the object is contained in the registry or is a component of a registered object.

Returns:

Indicates if the object is contained in the registry

Return type:

numpy.bool

Raises:

RegistrationError – Raised if there’s a server-side error or a mis-match of registered components

See also

register, attach, unregister

Notes

Objects registered with the server are immune to deletion until they are unregistered.

property is_unique

Property indicating if all values in the index are unique.

Return type:

bool - True if all values are unique, False otherwise.

lookup(key)[source]

Check for presence of key(s) in the Index.

Parameters:

key (pdarray or scalar) – The value(s) to look up in the Index. If a scalar is provided, it will be converted to a one-element array.

Returns:

A boolean array of length len(self), indicating which entries of the Index are present in key.

Return type:

pdarray

Raises:

TypeError – If key cannot be converted to an arkouda array.
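
The direction of the membership test is easy to misread: the mask has one entry per Index value, not one per key. A minimal plain-Python model, assuming the semantics described above (lookup_model is a hypothetical name, and Python lists stand in for pdarrays):

```python
def lookup_model(index_values, key):
    """Boolean mask over index_values marking entries that appear in key."""
    if not isinstance(key, (list, tuple, set)):
        key = [key]  # a scalar becomes a one-element collection
    wanted = set(key)
    return [v in wanted for v in index_values]


print(lookup_model([10, 20, 30, 20], 20))        # [False, True, False, True]
print(lookup_model([10, 20, 30, 20], [30, 99]))  # [False, False, True, False]
```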

map(arg: dict | arkouda.pandas.series.Series) Index[source]

Map values of Index according to an input mapping.

Parameters:

arg (dict or Series) – The mapping correspondence.

Returns:

A new index with the values transformed by the mapping correspondence.

Return type:

arkouda.pandas.index.Index

Raises:

TypeError – Raised if arg is not of type dict or arkouda.pandas.Series. Raised if index values not of type pdarray, Categorical, or Strings.

Examples

>>> import arkouda as ak
>>> idx = ak.Index(ak.array([2, 3, 2, 3, 4]))
>>> idx
Index(array([2 3 2 3 4]), dtype='int64')
>>> idx.map({4: 25.0, 2: 30.0, 1: 7.0, 3: 5.0})
Index(array([30.00000000000000000 5.00000000000000000 30.00000000000000000
5.00000000000000000 25.00000000000000000]), dtype='float64')
>>> s2 = ak.Series(ak.array(["a","b","c","d"]), index = ak.array([4,2,1,3]))
>>> idx.map(s2)
Index(array(['b', 'd', 'b', 'd', 'a']), dtype='<U0')
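
Both calls above behave like a per-element dictionary lookup; in the Series case, the Series index supplies the keys and the Series values supply the results. The sketch below models that in plain Python. The name map_model is hypothetical, and the handling of values missing from the mapping is not covered here (this model simply raises KeyError).

```python
def map_model(index_values, mapping):
    """Replace each index value by its image under `mapping`."""
    return [mapping[v] for v in index_values]


idx_values = [2, 3, 2, 3, 4]

# dict case
print(map_model(idx_values, {4: 25.0, 2: 30.0, 1: 7.0, 3: 5.0}))
# [30.0, 5.0, 30.0, 5.0, 25.0]

# Series case: index labels act as keys, values as results
series_index = [4, 2, 1, 3]
series_values = ["a", "b", "c", "d"]
series_as_dict = dict(zip(series_index, series_values))
print(map_model(idx_values, series_as_dict))
# ['b', 'd', 'b', 'd', 'a']
```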
max_list_size = 1000
memory_usage(unit='B')[source]

Return the memory usage of the Index values.

Parameters:

unit (str, default = "B") – Unit to return. One of {‘B’, ‘KB’, ‘MB’, ‘GB’}.

Returns:

Bytes of memory consumed.

Return type:

int

Examples

>>> import arkouda as ak
>>> idx = ak.Index(ak.array([1, 2, 3]))
>>> idx.memory_usage()
24
property names

Return Index or MultiIndex names.

property ndim

Number of dimensions of the underlying data, by definition 1.

See also

MultiIndex.ndim

property nlevels

Integer number of levels in this Index.

An Index will always have 1 level.

objType = 'Index'
register(user_defined_name)[source]

Register this Index object and underlying components with the Arkouda server.

Parameters:

user_defined_name (str) – user defined name the Index is to be registered under, this will be the root name for underlying components

Returns:

The same Index which is now registered with the arkouda server and has an updated name. This is an in-place modification, the original is returned to support a fluid programming style. Please note you cannot register two different Indexes with the same name.

Return type:

Index

Raises:
  • TypeError – Raised if user_defined_name is not a str

  • RegistrationError – If the server was unable to register the Index with the user_defined_name

See also

unregister, attach, is_registered

Notes

Objects registered with the server are immune to deletion until they are unregistered.

registered_name: str | None = None
set_dtype(dtype)[source]

Change the data type of the index.

Currently only aku.ip_address and ak.array are supported.

property shape

Return the shape of the Index.

Returns:

A tuple representing the shape of the Index (size,).

Return type:

tuple

sort_values(return_indexer: bool = False, ascending: bool = True, na_position: str = 'last') Index | Tuple[Index, arkouda.numpy.pdarrayclass.pdarray | list][source]

Return a sorted copy of the index.

Parameters:
  • return_indexer (bool, default False) – If True, also return the integer positions that sort the index.

  • ascending (bool, default True) – Sort in ascending order. Use False for descending.

  • na_position ({'first', 'last'}, default 'last') – Where to position NaNs. ‘first’ puts NaNs at the beginning, ‘last’ at the end.

Returns:

  • sorted_index (arkouda.Index) – A new Index whose values are sorted.

  • indexer (Union[arkouda.pdarray, list], optional) – The indices that would sort the original index. Only returned when return_indexer=True.

Return type:

Union[Index, Tuple[Index, Union[pdarray, list]]]

Examples

>>> import arkouda as ak
>>> idx = ak.Index([10, 100, 1, 1000])
>>> idx
Index(array([10 100 1 1000]), dtype='int64')

Sort in ascending order (default):

>>> idx.sort_values()
Index(array([1 10 100 1000]), dtype='int64')

Sort in descending order and get the sort positions:

>>> idx.sort_values(ascending=False, return_indexer=True)
(Index(array([1000 100 10 1]), dtype='int64'), array([3 1 0 2]))
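
The relationship between the sorted values and the indexer returned with return_indexer=True can be modeled in plain Python. This is a sketch under simplifying assumptions: sort_values_model is a hypothetical name, lists stand in for Arkouda arrays, and na_position is omitted because no NaNs are involved.

```python
def sort_values_model(values, return_indexer=False, ascending=True):
    """Sorted copy of `values`, optionally with the positions used."""
    indexer = sorted(range(len(values)), key=values.__getitem__,
                     reverse=not ascending)
    sorted_values = [values[i] for i in indexer]
    return (sorted_values, indexer) if return_indexer else sorted_values


values = [10, 100, 1, 1000]
print(sort_values_model(values))
# [1, 10, 100, 1000]
print(sort_values_model(values, return_indexer=True, ascending=False))
# ([1000, 100, 10, 1], [3, 1, 0, 2])
```

The indexer always refers to positions in the original, unsorted index, so indexing the original values with it reproduces the sorted result.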

to_csv(prefix_path: str, dataset: str = 'index', col_delim: str = ',', overwrite: bool = False)[source]

Write Index to CSV file(s).

File will contain a single column with the pdarray data. All CSV Files written by Arkouda include a header denoting data types of the columns.

Parameters:
  • prefix_path (str) – The filename prefix to be used for saving files. Files will have _LOCALE#### appended when they are written to disk.

  • dataset (str) – Column name to save the pdarray under. Defaults to “index”.

  • col_delim (str) – Defaults to “,”. Value to be used to separate columns within the file. Please be sure that the value used DOES NOT appear in your dataset.

  • overwrite (bool) – Defaults to False. If True, any existing files matching your provided prefix_path will be overwritten. If False, an error will be returned if existing files are found.

Return type:

str response message

Raises:
  • ValueError – Raised if all datasets are not present in all parquet files or if one or more of the specified files do not exist.

  • RuntimeError – Raised if one or more of the specified files cannot be opened. If allow_errors is true this may be raised if no values are returned from the server.

  • TypeError – Raised if we receive an unknown arkouda_type returned from the server. Raised if the Index values are a list.

Notes

  • CSV format is not currently supported by load/load_all operations

  • The column delimiter is expected to be the same for column names and data

  • Be sure that column delimiters are not found within your data.

  • All CSV files must delimit rows using newline (\n) at this time.

to_dict(label)[source]

Convert the Index to a dictionary with a specified label.

Parameters:

label (str or list of str) – The key to use in the resulting dictionary. If a list is provided, only the first element is used. If None, defaults to “idx”.

Returns:

A dictionary with the label as the key and the Index as the value.

Return type:

dict

to_hdf(prefix_path: str, dataset: str = 'index', mode: Literal['truncate', 'append'] = 'truncate', file_type: Literal['single', 'distribute'] = 'distribute') str[source]

Save the Index to HDF5.

The object can be saved to a collection of files or single file.

Parameters:
  • prefix_path (str) – Directory and filename prefix that all output files share

  • dataset (str) – Name of the dataset to create in files (must not already exist)

  • mode (str {'truncate' | 'append'}) – By default, truncate (overwrite) output files, if they exist. If ‘append’, attempt to create new dataset in existing files.

  • file_type (str ("single" | "distribute")) – Default: “distribute”. When set to single, the dataset is written to a single file. When set to distribute, the dataset is written to one file per locale. This is only supported by HDF5 files and has no impact on Parquet files.

Return type:

string message indicating result of save operation

Raises:
  • RuntimeError – Raised if a server-side error is thrown saving the pdarray

  • TypeError – Raised if the Index values are a list.

Notes

  • The prefix_path must be visible to the arkouda server and the user must have write permission.

  • Output files have names of the form <prefix_path>_LOCALE<i>, where <i> ranges from 0 to numLocales for file_type=’distribute’. Otherwise, the file name will be prefix_path.

  • If any of the output files already exist and the mode is ‘truncate’, they will be overwritten. If the mode is ‘append’ and the number of output files is less than the number of locales or a dataset with the same name already exists, a RuntimeError will result.

  • Any file extension can be used. The file I/O does not rely on the extension to determine the file format.

to_ndarray()[source]

Convert the Index values to a NumPy ndarray.

Returns:

A NumPy array representation of the Index values.

Return type:

numpy.ndarray

to_pandas()[source]

Convert this Arkouda-backed index wrapper to an equivalent pandas Index.

This method materializes the underlying values into a local NumPy array (or pandas Categorical, when applicable) and returns the corresponding pandas Index (or CategoricalIndex).

Returns:

A pandas Index representing the same logical values. For categorical data, a pandas.CategoricalIndex is returned.

Return type:

pandas.Index

Notes

  • If the underlying values are categorical, this returns a pandas.CategoricalIndex.

  • For unicode string-like data (or object arrays inferred as strings), this attempts to return a pandas “string” dtype Index to match pandas’ missing-value behavior (e.g., NA handling).

  • Fixed-width bytes data is preserved as bytes (no implicit decoding).

Examples

>>> import arkouda as ak
>>> import pandas
>>> idx = ak.Index(ak.array([1,2,3]))
>>> pidx = idx.to_pandas()
>>> pidx.dtype
dtype('<i8')
to_parquet(prefix_path: str, dataset: str = 'index', mode: Literal['truncate', 'append'] = 'truncate', compression: str | None = None)[source]

Save the Index to Parquet.

The result is a collection of files, one file per locale of the arkouda server, where each filename starts with prefix_path. Each locale saves its chunk of the array to its corresponding file.

Parameters:
  • prefix_path (str) – Directory and filename prefix that all output files share

  • dataset (str) – Name of the dataset to create in files (must not already exist)

  • mode ({'truncate' | 'append'}) – By default, truncate (overwrite) output files, if they exist. If ‘append’, attempt to create new dataset in existing files.

  • compression (str (Optional)) – (None | “snappy” | “gzip” | “brotli” | “zstd” | “lz4”) Sets the compression type used with Parquet files

Return type:

string message indicating result of save operation

Raises:
  • RuntimeError – Raised if a server-side error is thrown saving the pdarray

  • TypeError – Raised if the Index values are a list.

Notes

  • The prefix_path must be visible to the arkouda server and the user must have write permission.

  • Output files have names of the form <prefix_path>_LOCALE<i>, where <i> ranges from 0 to numLocales for file_type=’distribute’.

  • ‘append’ write mode is supported, but is not efficient.

  • If any of the output files already exist and the mode is ‘truncate’, they will be overwritten. If the mode is ‘append’ and the number of output files is less than the number of locales or a dataset with the same name already exists, a RuntimeError will result.

  • Any file extension can be used. The file I/O does not rely on the extension to determine the file format.

tolist()[source]

Convert the Index values to a Python list.

Returns:

A list containing the Index values.

Return type:

list

unregister()[source]

Unregister this Index object in the arkouda server.

Unregister this Index object in the arkouda server, which was previously registered using register() and/or attached to using attach().

Raises:

RegistrationError – If the object is already unregistered or if there is a server error when attempting to unregister

See also

register, attach, is_registered

Notes

Objects registered with the server are immune to deletion until they are unregistered.

update_hdf(prefix_path: str, dataset: str = 'index', repack: bool = True)[source]

Overwrite the dataset with the name provided with this Index object.

If the dataset does not exist it is added.

Parameters:
  • prefix_path (str) – Directory and filename prefix that all output files share

  • dataset (str) – Name of the dataset to create in files

  • repack (bool) – Default: True. HDF5 does not release memory on delete. When True, the inaccessible data (that was overwritten) is removed. When False, the data remains, but is inaccessible. Setting this to False will yield better performance, but will cause file sizes to expand.

Raises:

RuntimeError – Raised if a server-side error is thrown saving the index

Notes

  • If file does not contain File_Format attribute to indicate how it was saved, the file name is checked for _LOCALE#### to determine if it is distributed.

  • If the dataset provided does not exist, it will be added

  • Because HDF5 deletes do not release memory, this will create a copy of the file with the new data

class arkouda.pandas.index.MultiIndex(data: list | tuple | pandas.MultiIndex | MultiIndex, name: str | None = None, names: Iterable[Hashable | None] | None = None)[source]

Bases: Index

A multi-level, or hierarchical, index object for Arkouda DataFrames and Series.

A MultiIndex allows you to represent multiple dimensions of indexing using a single object, enabling advanced indexing and grouping operations.

This class mirrors the behavior of pandas’ MultiIndex while leveraging Arkouda’s distributed data structures. Internally, it stores a list of Index objects, each representing one level of the hierarchy.

Examples

>>> import arkouda as ak
>>> from arkouda.pandas.index import MultiIndex
>>> a = ak.array([1, 2, 3])
>>> b = ak.array(['a', 'b', 'c'])
>>> mi = MultiIndex([a, b])
>>> mi[1]
MultiIndex([np.int64(2), np.str_('b')])
argsort(ascending=True)[source]

Return the indices that would sort the MultiIndex.

Parameters:

ascending (bool, default True) – If False, the result is in descending order.

Returns:

An array of indices that would sort the MultiIndex.

Return type:

pdarray

concat(other)[source]

Concatenate this MultiIndex with another, preserving duplicates and order.

Parameters:

other (MultiIndex) – The other MultiIndex to concatenate with.

Returns:

A new MultiIndex containing values from both inputs, preserving order.

Return type:

MultiIndex

Raises:

TypeError – If the type of other does not match.

property dtype: numpy.dtype

Return the dtype object of the underlying data.

equal_levels(other: MultiIndex) bool[source]

Return True if the levels of both MultiIndex objects are the same.

get_level_values(level: str | int)[source]

Return the values at a particular level of the MultiIndex.

Parameters:

level (int or str) – The level number or name. If a string is provided, it must match an entry in self.names.

Returns:

An Index object corresponding to the requested level.

Return type:

Index

Raises:
  • RuntimeError – If self.names is None and a string level is provided.

  • ValueError – If the provided string is not in self.names, or if the level index is out of bounds.
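
The error conditions above amount to a small resolution step from a level name or number to a position. A plain-Python sketch of that step follows; resolve_level is a hypothetical helper, and treating negative integers as out of bounds is an assumption of this sketch rather than documented behavior.

```python
def resolve_level(level, names, nlevels):
    """Translate a level name or number into a level position."""
    if isinstance(level, str):
        if names is None:
            raise RuntimeError("string level given but names is None")
        if level not in names:
            raise ValueError(f"{level!r} is not in names")
        return list(names).index(level)
    if not 0 <= level < nlevels:  # assumption: negatives are rejected
        raise ValueError("level index is out of bounds")
    return level


print(resolve_level("char", ["num", "char"], 2))  # 1
print(resolve_level(0, ["num", "char"], 2))       # 0
```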

property index

Return the levels of the MultiIndex.

Returns:

A list of Index objects representing the levels of the MultiIndex.

Return type:

list

property inferred_type: str

Return the inferred type of the MultiIndex.

Returns:

The string “mixed”, indicating the MultiIndex may contain multiple types.

Return type:

str

is_registered()[source]

Check if the MultiIndex is registered with the Arkouda server.

Returns:

True if the MultiIndex has a registered name and is recognized by the server, False otherwise.

Return type:

bool

levels: list[arkouda.numpy.pdarrayclass.pdarray | arkouda.numpy.strings.Strings | arkouda.pandas.categorical.Categorical]
lookup(key: list[Any] | tuple[Any, Ellipsis]) arkouda.pandas.groupbyclass.groupable[source]

Perform element-wise lookup on the MultiIndex.

Parameters:

key (list or tuple) –

A sequence of values, one for each level of the MultiIndex.

  • If the elements are scalars (e.g., (1, "red")), they are treated as a single row key: the result is a boolean mask over rows where all levels match the corresponding scalar.

  • If the elements are arkouda arrays (e.g., list of pdarrays / Strings), they must align one-to-one with the levels, and the lookup is delegated to in1d(self.index, key) for multi-column membership.

Returns:

A boolean array indicating which rows in the MultiIndex match the key.

Return type:

groupable

Raises:
  • TypeError – If key is not a list or tuple.

  • ValueError – If the length of key does not match the number of levels.
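
For the scalar-key case, the result is a row mask in which every level must match its corresponding scalar. The following plain-Python model illustrates this; lookup_row_key is a hypothetical name, lists stand in for the level arrays, and the array-key case (delegated to in1d) is not modeled.

```python
def lookup_row_key(levels, key):
    """Mask over rows where every level equals the matching scalar in key."""
    if not isinstance(key, (list, tuple)):
        raise TypeError("key must be a list or tuple")
    if len(key) != len(levels):
        raise ValueError("key length must equal the number of levels")
    n_rows = len(levels[0])
    return [
        all(level[i] == k for level, k in zip(levels, key))
        for i in range(n_rows)
    ]


levels = [[1, 1, 2, 2], ["red", "blue", "red", "blue"]]
print(lookup_row_key(levels, (1, "red")))   # [True, False, False, False]
print(lookup_row_key(levels, (2, "blue")))  # [False, False, False, True]
```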

memory_usage(unit='B')[source]

Return the memory usage of the MultiIndex levels.

Parameters:

unit (str, default = "B") – Unit to return. One of {‘B’, ‘KB’, ‘MB’, ‘GB’}.

Returns:

Bytes of memory consumed.

Return type:

int

Examples

>>> import arkouda as ak
>>> m = ak.pandas.index.MultiIndex([ak.array([1,2,3]),ak.array([4,5,6])])
>>> m.memory_usage()
48
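
The value 48 is consistent with summing the data bytes of each level. The arithmetic below is a sketch that assumes both levels hold 8-byte int64 values and that no other overhead is counted; it does not call Arkouda.

```python
import struct

ITEM_BYTES = struct.calcsize("<q")  # size of one int64 value: 8 bytes
level_lengths = [3, 3]              # two levels of three values each

total_bytes = sum(n * ITEM_BYTES for n in level_lengths)
print(total_bytes)  # 48
```

The same arithmetic gives 24 bytes for a single three-element int64 Index, matching the Index.memory_usage example.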
property name

Return Index or MultiIndex name.

property names

Return Index or MultiIndex names.

property ndim

Number of dimensions of the underlying data, by definition 1.

See also

Index.ndim

property nlevels: int

Integer number of levels in this MultiIndex.

See also

Index.nlevels

objType = 'MultiIndex'
register(user_defined_name)[source]

Register this Index object and underlying components with the Arkouda server.

Parameters:

user_defined_name (str) – user defined name the Index is to be registered under, this will be the root name for underlying components

Returns:

The same Index which is now registered with the arkouda server and has an updated name. This is an in-place modification, the original is returned to support a fluid programming style. Please note you cannot register two different Indexes with the same name.

Return type:

MultiIndex

Raises:
  • TypeError – Raised if user_defined_name is not a str

  • RegistrationError – If the server was unable to register the Index with the user_defined_name

See also

unregister, attach, is_registered

Notes

Objects registered with the server are immune to deletion until they are unregistered.

registered_name: str | None
set_dtype(dtype)[source]

Change the data type of the index.

Currently only aku.ip_address and ak.array are supported.

size: arkouda.numpy.dtypes.int_scalars
to_dict(labels=None)[source]

Convert the MultiIndex to a dictionary representation.

Parameters:

labels (list of str, optional) – A list of column names for the index levels. If not provided, defaults to [‘idx_0’, ‘idx_1’, …, ‘idx_n’].

Returns:

A dictionary mapping each label to the corresponding Index object.

Return type:

dict

to_hdf(prefix_path: str, dataset: str = 'index', mode: Literal['truncate', 'append'] = 'truncate', file_type: Literal['single', 'distribute'] = 'distribute') str[source]

Save the Index to HDF5.

The object can be saved to a collection of files or single file.

Parameters:
  • prefix_path (str) – Directory and filename prefix that all output files share

  • dataset (str) – Name of the dataset to create in files (must not already exist)

  • mode ({'truncate' | 'append'}) – By default, truncate (overwrite) output files, if they exist. If ‘append’, attempt to create new dataset in existing files.

  • file_type ({"single" | "distribute"}) – Default: “distribute”. When set to single, the dataset is written to a single file. When set to distribute, the dataset is written to one file per locale. This is only supported by HDF5 files and has no impact on Parquet files.

Return type:

string message indicating result of save operation

Raises:

RuntimeError – Raised if a server-side error is thrown saving the pdarray.

Notes

  • The prefix_path must be visible to the arkouda server and the user must have write permission.

  • Output files have names of the form <prefix_path>_LOCALE<i>, where <i> ranges from 0 to numLocales for file_type=’distribute’. Otherwise, the file name will be prefix_path.

  • If any of the output files already exist and the mode is ‘truncate’, they will be overwritten. If the mode is ‘append’ and the number of output files is less than the number of locales or a dataset with the same name already exists, a RuntimeError will result.

  • Any file extension can be used. The file I/O does not rely on the extension to determine the file format.

to_ndarray()[source]

Convert the MultiIndex to a NumPy ndarray of arrays.

Returns:

A NumPy array where each element is an array corresponding to one level of the MultiIndex. Categorical levels are converted to their underlying arrays.

Return type:

numpy.ndarray

to_pandas()[source]

Convert the MultiIndex to a pandas.MultiIndex object.

Returns:

A pandas MultiIndex with the same levels and names.

Return type:

pandas.MultiIndex

Notes

Categorical levels are converted to pandas categorical arrays, while others are converted to NumPy arrays.

tolist()[source]

Convert the MultiIndex to a list of lists.

Returns:

A list of Python lists, where each inner list corresponds to one level of the MultiIndex.

Return type:

list
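
Because the result is organized by level rather than by row, callers who want row tuples need to transpose it. A short sketch, where the literal levels_as_lists stands in for the kind of value the description above implies for a two-level MultiIndex (it is not produced by Arkouda here):

```python
# Stand-in for the level-oriented result: one inner list per level.
levels_as_lists = [[1, 2, 3], ["a", "b", "c"]]

# Transpose to obtain one tuple per row.
rows = list(zip(*levels_as_lists))
print(rows)  # [(1, 'a'), (2, 'b'), (3, 'c')]
```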

unregister()[source]

Unregister this MultiIndex from the Arkouda server.

Raises:

RegistrationError – If the MultiIndex is not currently registered.

update_hdf(prefix_path: str, dataset: str = 'index', repack: bool = True)[source]

Overwrite the dataset with the name provided with this Index object.

If the dataset does not exist it is added.

Parameters:
  • prefix_path (str) – Directory and filename prefix that all output files share

  • dataset (str) – Name of the dataset to create in files

  • repack (bool) – Default: True. HDF5 does not release memory on delete. When True, the inaccessible data (that was overwritten) is removed. When False, the data remains, but is inaccessible. Setting this to False will yield better performance, but will cause file sizes to expand.

Raises:
  • RuntimeError – Raised if a server-side error is thrown saving the index

  • TypeError – Raised if the Index levels are a list.

Notes

  • If file does not contain File_Format attribute to indicate how it was saved, the file name is checked for _LOCALE#### to determine if it is distributed.

  • If the dataset provided does not exist, it will be added

  • Because HDF5 deletes do not release memory, this will create a copy of the file with the new data