docarray.array.storage.opensearch package

Submodules

Module contents

class docarray.array.storage.opensearch.StorageMixins(*args, **kwds)

Bases: FindMixin, BackendMixin, GetSetDelMixin, SequenceLikeMixin, ABC

MAX_OPENSEARCH_RETURNED_DOCS = 10000
TYPE_MAP: Dict[str, TypeMap] = {'bool': TypeMap(type='boolean', converter=<class 'bool'>), 'double': TypeMap(type='double', converter=<class 'float'>), 'float': TypeMap(type='float', converter=<class 'float'>), 'int': TypeMap(type='integer', converter=<function _safe_cast_int>), 'long': TypeMap(type='long', converter=<function _safe_cast_int>), 'str': TypeMap(type='text', converter=<class 'str'>)}
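
The mapping above pairs each Python type name with a backend field type and a converter. The following is a minimal sketch of how such a table could be applied to a single column value. The TypeMap class here is a local stand-in, the body of _safe_cast_int is an assumption (its real behavior is not shown on this page), and map_column is a hypothetical helper, not part of docarray.

```python
from typing import Any, Callable, Dict, NamedTuple


class TypeMap(NamedTuple):
    """Local stand-in: a backend field type name plus a Python converter."""
    type: str
    converter: Callable[[Any], Any]


def _safe_cast_int(value: Any) -> int:
    # Assumed behavior for illustration only: reject floats that have a
    # fractional part instead of silently truncating them.
    if isinstance(value, float) and not value.is_integer():
        raise ValueError(f"cannot cast {value!r} to int without losing precision")
    return int(value)


# Same keys and backend type names as the TYPE_MAP listed above.
TYPE_MAP: Dict[str, TypeMap] = {
    'bool': TypeMap('boolean', bool),
    'double': TypeMap('double', float),
    'float': TypeMap('float', float),
    'int': TypeMap('integer', _safe_cast_int),
    'long': TypeMap('long', _safe_cast_int),
    'str': TypeMap('text', str),
}


def map_column(name: str, py_type: str, value: Any) -> Dict[str, Any]:
    """Hypothetical helper: look up the backend type and convert one value."""
    entry = TYPE_MAP[py_type]
    return {'field': name, 'type': entry.type, 'value': entry.converter(value)}


price = map_column('price', 'float', '9.5')
year = map_column('year', 'int', 2020.0)
```

Keeping the backend type and the converter in one entry means a column declared as 'int' is both mapped to an 'integer' field and converted with the same rule in one lookup.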
append(value, **kwargs)

Append doc to the end of the array.

Parameters:

value (Document) – The doc to append.

clear()

Clear the data of DocumentArray.

count(value) → integer -- return number of occurrences of value
extend(values, **kwargs)

S.extend(iterable) – extend sequence by appending elements from the iterable

Return type:

None

find(query=None, metric='cosine', limit=20, metric_name=None, exclude_self=False, filter=None, only_id=False, index='text', return_root=False, on=None, **kwargs)

Return the Documents that match an input query. The behavior depends on the type of query.

If query is a DocumentArray, Document or ArrayType, exhaustive or approximate nearest neighbor search is performed, depending on whether the storage backend supports ANN. If filter is not None, pre-filtering is applied along with the vector search.

If query is a dict, or if query is None and filter is not None, Documents are filtered and all Documents that match the filter are returned. In this case query (if it is a dict) or filter is used for filtering. The filter object must follow the backend-specific filter format if the backend supports filtering, or DocArray's query language format. In the latter case, filtering is applied on the client side, not in the backend.

If query is a string or a list of strings, a text search is performed if the backend supports indexing and searching text fields. If it does not, a NotImplementedError is raised.

Parameters:
  • query (Union[DocumentArray, Document, ArrayType, Dict, str, List[str], None]) – the input query to search by

  • limit (Union[int, float, None]) – the maximum number of matches; defaults to 20 when not given.

  • metric_name (Optional[str]) – if provided, then match result will be marked with this string.

  • metric (Union[str, Callable[[ArrayType, ArrayType], ndarray]]) – the distance metric.

  • exclude_self (bool) – if set, Documents in results with same id as the query values will not be considered as matches. This is only applied when the input query is Document or DocumentArray.

  • filter (Union[Dict, str, None]) – filter query used for pre-filtering or filtering

  • only_id (bool) – if set, the returned matches contain only the id

  • index (str) – if the query is a string, text search is performed on the field named by index; otherwise this parameter is ignored. By default the Document text attribute is searched; any other value selects the tag field with that name. You can only use this parameter if the storage backend supports searching by text.

  • return_root (Optional[bool]) – if set, then the root-level DocumentArray will be returned

  • on (Optional[str]) – specifies a subindex to search on. If set, the returned DocumentArray will be retrieved from the given subindex.

  • kwargs – other kwargs.

Return type:

Union[DocumentArray, List[DocumentArray]]

Returns:

a list of DocumentArrays containing the closest Document objects for each of the queries in query.
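
The type-based rules in the description of find can be sketched as a small dispatch function. This is an illustration of the documented rules only, not the library's implementation: describe_search_mode is a hypothetical helper, the returned labels are made up for this example, and raising ValueError when neither query nor filter is given is an assumption of the sketch.

```python
from typing import Any, Optional


def describe_search_mode(query: Any = None, filter: Optional[Any] = None) -> str:
    """Name the kind of search the documented rules associate with the inputs.

    Hypothetical helper for illustration; not part of docarray.
    """
    # A dict query, or no query plus a filter, means filtering only.
    if isinstance(query, dict) or (query is None and filter is not None):
        return 'filter only'
    # A string or a non-empty list of strings means text search.
    if isinstance(query, str) or (
        isinstance(query, list) and query and all(isinstance(q, str) for q in query)
    ):
        return 'text search'
    if query is None:
        # Assumption of this sketch: with nothing to search by, fail loudly.
        raise ValueError('either query or filter must be given')
    # Everything else is treated as a Document, DocumentArray or array of embeddings.
    return 'vector search with pre-filtering' if filter is not None else 'vector search'


modes = [
    describe_search_mode({'price': {'$lt': 10}}),
    describe_search_mode(None, filter={'price': {'$lt': 10}}),
    describe_search_mode('hello world'),
    describe_search_mode([0.1, 0.2, 0.3]),
    describe_search_mode([0.1, 0.2, 0.3], filter={'price': {'$lt': 10}}),
]
```

The filter dictionaries above are placeholders; the actual filter syntax depends on the backend, as noted in the description.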

index(value[, start[, stop]]) → integer -- return first index of value.

Raises ValueError if the value is not present.

Supporting start and stop arguments is optional, but recommended.

insert(index, value, **kwargs)

Insert doc at index.

Parameters:
  • index (int) – Position of the insertion.

  • value (Document) – The doc to insert.

  • kwargs – Additional arguments that are passed to the Document Store. This has no effect for an in-memory DocumentArray.

pop([index]) → item -- remove and return item at index (default last).

Raise IndexError if list is empty or index is out of range.

remove(value)

S.remove(value) – remove first occurrence of value. Raise ValueError if the value is not present.

reverse()

S.reverse() – reverse IN PLACE
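
The wording of count, extend, index, pop, remove and reverse above matches the docstrings of Python's collections.abc.MutableSequence, which supplies these methods once five basic operations are defined. The sketch below shows that behavior on a hypothetical list-backed class named TinyStore. It does not use docarray or OpenSearch, and it assumes, without confirmation from this page, that the storage mixin follows the same sequence contract.

```python
from collections.abc import MutableSequence


class TinyStore(MutableSequence):
    """Hypothetical list-backed store; only the five abstract methods are written."""

    def __init__(self):
        self._items = []

    def __getitem__(self, i):
        return self._items[i]

    def __setitem__(self, i, value):
        self._items[i] = value

    def __delitem__(self, i):
        del self._items[i]

    def __len__(self):
        return len(self._items)

    def insert(self, index, value):
        self._items.insert(index, value)


store = TinyStore()
store.extend(['a', 'b', 'c', 'b'])  # extend is inherited and appends each element
first_b = store.index('b')          # first position of 'b'
b_count = store.count('b')          # number of occurrences of 'b'
store.remove('b')                   # removes only the first 'b'
store.reverse()                     # reverses in place
last = store.pop()                  # removes and returns the last item
remaining = list(store)
```

Calling store.index('z') raises ValueError, and calling pop() on an empty store raises IndexError, as the entries above state.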

sync()

class docarray.array.storage.opensearch.OpenSearchConfig(n_dim, distance='cosinesimil', hosts='http://localhost:9900', index_name=None, list_like=True, opensearch_config=<factory>, index_text=False, tag_indices=<factory>, batch_size=64, ef_construction=512, m=16, columns=None, engine='nmslib', ef_search=None, encoder=None, algorithm='hnsw', root_id=True)

Bases: object

n_dim: int
distance: str = 'cosinesimil'
hosts: Optional[Union[str, List[Union[str, Mapping[str, Union[str, int]]]]]] = 'http://localhost:9900'
index_name: Optional[str] = None
list_like: bool = True
opensearch_config: Dict[str, Any]
index_text: bool = False
tag_indices: List[str]
batch_size: int = 64
ef_construction: Optional[int] = 512
m: Optional[int] = 16
columns: Optional[Union[List[Tuple[str, str]], Dict[str, str]]] = None
engine: str = 'nmslib'
encoder: Optional[str] = None
algorithm: str = 'hnsw'
root_id: bool = True
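
Since n_dim is the only field without a default, a configuration can be written as a small dictionary of overrides. The sketch below builds such a dictionary and rejects field names that do not appear in the signature above. build_config, DEFAULTS and FACTORY_FIELDS are hypothetical names introduced for this example, and the index name and column are placeholder values.

```python
from typing import Any, Dict

# Field names and defaults copied from the OpenSearchConfig signature above.
DEFAULTS: Dict[str, Any] = {
    'distance': 'cosinesimil',
    'hosts': 'http://localhost:9900',
    'index_name': None,
    'list_like': True,
    'index_text': False,
    'batch_size': 64,
    'ef_construction': 512,
    'm': 16,
    'columns': None,
    'engine': 'nmslib',
    'ef_search': None,
    'encoder': None,
    'algorithm': 'hnsw',
    'root_id': True,
}
# Fields whose default comes from a factory; the produced value is not shown
# on this page, so the sketch leaves them out unless the caller sets them.
FACTORY_FIELDS = {'opensearch_config', 'tag_indices'}


def build_config(n_dim: int, **overrides: Any) -> Dict[str, Any]:
    """Hypothetical helper: merge overrides into the documented defaults."""
    unknown = set(overrides) - set(DEFAULTS) - FACTORY_FIELDS
    if unknown:
        raise TypeError(f'unknown config fields: {sorted(unknown)}')
    return {'n_dim': n_dim, **DEFAULTS, **overrides}


config = build_config(128, index_name='articles', columns={'price': 'float'})

# With docarray installed and an OpenSearch node reachable at `hosts`, a dict
# like this is typically passed as:
#   DocumentArray(storage='opensearch', config=config)
```

The final call is left as a comment because it needs a running OpenSearch node; the rest of the block runs with the standard library alone.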