docarray.array.mixins.embed module
- class docarray.array.mixins.embed.EmbedMixin
Bases: object
Helper functions for embedding with a model.
- embed(embed_model, device='cpu', batch_size=256, to_numpy=False, collate_fn=None)
Fill the embedding of each Document in place by using embed_model. To evaluate a model, one can use the embed_and_evaluate() function directly.
- Parameters:
embed_model (AnyDNN) – The embedding model, written in Keras, PyTorch, or Paddle.
device (str) – The computational device for embed_model, either cpu or cuda.
batch_size (int) – Number of Documents in a batch for embedding.
to_numpy (bool) – Whether to store embeddings back to the Documents as numpy.ndarray or in the original framework format.
collate_fn (Optional[CollateFnType]) – Creates a mini-batch of input(s) from the given DocumentArray. The default built-in collate_fn uses the tensors of the Documents.
- Return type:
T
- Returns:
The object itself, after modification.
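The batching and in-place behaviour described for embed() can be sketched in plain Python. This is a minimal illustration and not docarray's implementation: embed_in_batches, default_collate, the dict-based documents, and the toy mean_model are all hypothetical stand-ins, and the device and to_numpy handling is left out.

```python
from typing import Callable, Dict, List, Optional, Sequence

Doc = Dict[str, object]  # hypothetical stand-in for a Document


def default_collate(batch: Sequence[Doc]) -> List[object]:
    """Mirror the documented default: use the tensors of the documents."""
    return [doc["tensor"] for doc in batch]


def embed_in_batches(
    docs: List[Doc],
    embed_model: Callable[[List[object]], List[object]],
    batch_size: int = 256,
    collate_fn: Optional[Callable[[Sequence[Doc]], List[object]]] = None,
) -> List[Doc]:
    """Fill doc["embedding"] in place, one mini-batch at a time."""
    collate = collate_fn or default_collate
    for start in range(0, len(docs), batch_size):
        batch = docs[start:start + batch_size]
        outputs = embed_model(collate(batch))
        for doc, embedding in zip(batch, outputs):
            doc["embedding"] = embedding
    return docs  # the same list object, now modified


def mean_model(tensors):
    """Toy model: embed each tensor as a one-element list holding its mean."""
    return [[sum(t) / len(t)] for t in tensors]


docs = [{"tensor": [float(i), float(i + 2)]} for i in range(5)]
result = embed_in_batches(docs, mean_model, batch_size=2)
```

Returning the same object that was passed in mirrors the documented return value and allows call chaining, while the embeddings remain visible through the original reference.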
- docarray.array.mixins.embed.get_framework(dnn_model)
Return the framework that powers a DNN model.
Note
This is not a solid implementation. It is based on the __module__ name; the key idea is to identify the framework of dnn_model without actually importing the framework.
- Parameters:
dnn_model – a DNN model
- Return type:
str
- Returns:
keras, torch, or paddle; a ValueError is raised if the framework cannot be determined.
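The __module__-based idea mentioned in the note can be sketched as follows. This is an illustrative approximation under stated assumptions, not the library's code: guess_framework, the prefix table (including the mapping of tensorflow to keras), and the FakeModule stand-in class are hypothetical.

```python
def guess_framework(dnn_model: object) -> str:
    """Guess the framework from class __module__ names, importing nothing."""
    # Hypothetical prefix table; "tensorflow" -> "keras" is an assumption.
    prefixes = {
        "torch": "torch",
        "paddle": "paddle",
        "keras": "keras",
        "tensorflow": "keras",
    }
    # Walk the class hierarchy so user-defined subclasses are recognised too.
    for cls in type(dnn_model).__mro__:
        top_level = cls.__module__.split(".")[0]
        if top_level in prefixes:
            return prefixes[top_level]
    raise ValueError(f"cannot determine the framework of {dnn_model!r}")


# Stand-in for a framework base class: only its __module__ matters here.
FakeModule = type("Module", (), {"__module__": "torch.nn.modules.module"})


class MyEncoder(FakeModule):
    """User model defined outside the framework package."""


framework = guess_framework(MyEncoder())
```

Because only strings are compared, the check works even when the framework is not installed, which is also why it is fragile: any class whose module name happens to start with a listed prefix would be matched.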