Query by Conditions

You can use find() to select Documents from a DocumentArray based on the conditions specified in a query object. Depending on the type of query, da.find(query) either filters Documents or retrieves nearest neighbors from da:

  • To filter Documents, the query object is a Python dictionary object that defines the filtering conditions using a MongoDB-like query language.

  • To find nearest neighbors, the query object must be an ndarray-like object, a Document, or a DocumentArray that defines an embedding. You can also use the .match() function for this purpose; the minor interface difference between the two functions is described in the next chapter.

filter query syntax

The filter query syntax depends on which document store you use. Some may have their own query language.

Let’s see some examples in action. First, let’s prepare a DocumentArray:

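from jina import Document, DocumentArray

# Five Documents with different text, weight, tags and modality
da = DocumentArray(
    [
        Document(text='journal', weight=25, tags={'h': 14, 'w': 21, 'uom': 'cm'}, modality='A'),
        Document(text='notebook', weight=50, tags={'h': 8.5, 'w': 11, 'uom': 'in'}, modality='A'),
        Document(text='paper', weight=100, tags={'h': 8.5, 'w': 11, 'uom': 'in'}, modality='D'),
        Document(text='planner', weight=75, tags={'h': 22.85, 'w': 30, 'uom': 'cm'}, modality='D'),
        Document(text='postcard', weight=45, tags={'h': 10, 'w': 15.25, 'uom': 'cm'}, modality='A'),
    ]
)

da.summary()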

                            Documents Summary                            
  Length                 5                                               
  Homogenous Documents   True                                            
  Common Attributes      ('id', 'text', 'tags', 'weight', 'modality')  
                     Attributes Summary                     
  Attribute   Data type   #Unique values   Has empty value  
  id          ('str',)    5                False            
  weight      ('int',)    5                False            
  modality    ('str',)    2                False            
  tags        ('dict',)   5                False            
  text        ('str',)    5                False            

Filter with query operators

A query filter document uses query operators to specify conditions:

{ <field1>: { <operator1>: <value1> }, ... }

Here field1 is any field name of a Document object. To access nested fields, you can use the dunder expression. For example, tags__timestamp accesses the doc.tags['timestamp'] field.
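To make the dunder convention concrete, here is a minimal sketch in plain Python. It works on an ordinary dictionary standing in for a Document, and resolve_field is a hypothetical helper written for this illustration, not a function of the library.

```python
from typing import Any


def resolve_field(doc: dict, path: str) -> Any:
    """Follow a dunder-separated path such as 'tags__h' through nested dicts.

    Hypothetical helper for illustration only; returns None if a key is missing.
    """
    current: Any = doc
    for key in path.split('__'):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


# A plain dict standing in for a Document
doc = {'text': 'journal', 'tags': {'h': 14, 'w': 21, 'uom': 'cm'}}

print(resolve_field(doc, 'text'))     # top-level field
print(resolve_field(doc, 'tags__h'))  # nested field, same as doc['tags']['h']
print(resolve_field(doc, 'tags__x'))  # missing key gives None
```

Each double underscore in the path corresponds to one level of nesting, so tags__h reads the same value as doc['tags']['h'].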

value1 can be either a Python object that you provide or a substitution field written in curly brackets, {field}.

Finally, operator1 can be one of the following:

  Query Operator   Description
  $eq              Equal to (number, string)
  $ne              Not equal to (number, string)
  $gt              Greater than (number)
  $gte             Greater than or equal to (number)
  $lt              Less than (number)
  $lte             Less than or equal to (number)
  $in              Is in an array
  $nin             Not in an array
  $regex           Matches the specified regular expression
  $size            Matches an array/dict field that has the specified size. $size does not accept ranges of values.
  $exists          Matches Documents that have the specified field. Predefined fields that hold a default value (for example an empty string, or 0) are considered as not existing. If the expression specifies a field x in tags (tags__x), the operator tests that x is not None.
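As a rough mental model of these operators, the sketch below maps each one to a Python predicate and applies it to plain dictionaries. It is an illustration only, not the library's implementation: the operator names are assumed to follow the MongoDB convention, OPERATORS and matches are hypothetical names, and $exists is left out because its handling of default values depends on the Document class.

```python
import re
from typing import Any, Callable, Dict

# Illustrative mapping from operator name to a Python predicate.
# Assumption: MongoDB-style operator names; this is not the library's own code.
OPERATORS: Dict[str, Callable[[Any, Any], bool]] = {
    '$eq': lambda field, value: field == value,
    '$ne': lambda field, value: field != value,
    '$gt': lambda field, value: field > value,
    '$gte': lambda field, value: field >= value,
    '$lt': lambda field, value: field < value,
    '$lte': lambda field, value: field <= value,
    '$in': lambda field, value: field in value,
    '$nin': lambda field, value: field not in value,
    '$regex': lambda field, value: re.search(value, field) is not None,
    '$size': lambda field, value: len(field) == value,
}


def matches(doc: dict, condition: dict) -> bool:
    """Return True if the flat dict `doc` satisfies every {field: {op: value}} pair."""
    for field, ops in condition.items():
        for op, value in ops.items():
            if not OPERATORS[op](doc[field], value):
                return False
    return True


# Plain dicts standing in for Documents
docs = [
    {'text': 'journal', 'weight': 25, 'modality': 'A'},
    {'text': 'paper', 'weight': 100, 'modality': 'D'},
    {'text': 'planner', 'weight': 75, 'modality': 'D'},
]

heavy = [d['text'] for d in docs if matches(d, {'weight': {'$gte': 75}})]
p_words = [d['text'] for d in docs if matches(d, {'text': {'$regex': '^p'}})]
print(heavy)    # ['paper', 'planner']
print(p_words)  # ['paper', 'planner']
```

Several operators given for the same field are treated here as a logical AND, which is one reasonable reading of the { <field1>: { <operator1>: <value1> }, ... } form.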

To select all modality='D' Documents:

from pprint import pprint

r = da.find({'modality': {'$eq': 'D'}})

pprint(r.to_dict(exclude_none=True))  # just for pretty print

[
  {
    "id": "92aee5d665d0c4dd34db10d83642aded",
    "modality": "D",
    "tags": {
      "h": 8.5,
      "uom": "in",
      "w": 11.0
    },
    "text": "paper",
    "weight": 100.0
  },
  {
    "id": "1a9d2139b02bc1c7842ecda94b347889",
    "modality": "D",
    "tags": {
      "h": 22.85,
      "uom": "cm",
      "w": 30.0
    },
    "text": "planner",
    "weight": 75.0
  }
]

To select all Documents whose .tags['h'] > 10:

r = da.find({'tags__h': {'$gt': 10}})

[
  {
    "id": "4045a9659875fd1299e482d710753de3",
    "modality": "A",
    "tags": {
      "h": 14.0,
      "uom": "cm",
      "w": 21.0
    },
    "text": "journal",
    "weight": 25.0
  },
  {
    "id": "cf7691c445220b94b88ff116911bad24",
    "modality": "D",
    "tags": {
      "h": 22.85,
      "uom": "cm",
      "w": 30.0
    },
    "text": "planner",
    "weight": 75.0
  }
]

Besides using a predefined value, you can also use a substitution with {field}; note the curly braces. For example:

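r = da.find({'tags__h': {'$gt': '{tags__w}'}})

[
  {
    "id": "44c6a4b18eaa005c6dbe15a28a32ebce",
    "modality": "A",
    "tags": {
      "h": 14.0,
      "uom": "cm",
      "w": 10.0
    },
    "text": "journal",
    "weight": 25.0
  }
]

Note that this output shows a journal Document whose tags['w'] is 10. In the DocumentArray prepared earlier the journal has w = 21, and no Document there satisfies tags['h'] > tags['w'], so running the same query on that data returns an empty result.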

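One way to picture the substitution: a string value of the form {field} is replaced, for each Document, by that Document's own value for the field before the comparison runs. The sketch below shows the idea on plain dictionaries. resolve_value is a hypothetical helper, the sample values are chosen for illustration, and the library may resolve substitutions differently.

```python
from typing import Any


def resolve_value(doc: dict, value: Any) -> Any:
    """If `value` looks like '{a__b}', look it up in `doc`; otherwise return it unchanged.

    Hypothetical helper for illustration only.
    """
    if isinstance(value, str) and value.startswith('{') and value.endswith('}'):
        current: Any = doc
        for key in value[1:-1].split('__'):
            current = current[key]
        return current
    return value


# Plain dicts standing in for Documents (sample values for illustration)
docs = [
    {'text': 'journal', 'tags': {'h': 14, 'w': 10}},
    {'text': 'paper', 'tags': {'h': 8.5, 'w': 11}},
]

# Equivalent in spirit to {'tags__h': {'$gt': '{tags__w}'}}:
# each Document's height is compared with its own width.
taller_than_wide = [
    d['text'] for d in docs if d['tags']['h'] > resolve_value(d, '{tags__w}')
]
print(taller_than_wide)  # ['journal']
```

A literal such as 10 passes through resolve_value unchanged, so the same comparison code can serve both predefined values and substitutions.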
Combine multiple conditions

You can combine multiple conditions using the following operators:

  Boolean Operator   Description
  $and               Join query clauses with a logical AND
  $or                Join query clauses with a logical OR
  $not               Inverts the effect of a query expression
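The way such operators nest can be sketched with a small recursive evaluator over plain dictionaries. This is an illustration under assumptions rather than the library's implementation: the names $and, $or and $not follow the MongoDB convention, $not is assumed to wrap a single sub-query, only $eq leaves are handled, and evaluate is a hypothetical function.

```python
from typing import List


def evaluate(doc: dict, query: dict) -> bool:
    """Recursively evaluate a query built from $and / $or / $not and $eq leaves."""
    for key, value in query.items():
        if key == '$and':
            ok = all(evaluate(doc, sub) for sub in value)
        elif key == '$or':
            ok = any(evaluate(doc, sub) for sub in value)
        elif key == '$not':
            ok = not evaluate(doc, value)
        else:
            # Leaf condition; only '$eq' is handled to keep the sketch short
            ok = doc[key] == value['$eq']
        if not ok:
            return False
    return True


def select(docs: List[dict], query: dict) -> List[str]:
    """Return the text of every dict in `docs` that satisfies `query`."""
    return [d['text'] for d in docs if evaluate(d, query)]


# Plain dicts standing in for Documents
docs = [
    {'text': 'journal', 'weight': 25, 'modality': 'A'},
    {'text': 'paper', 'weight': 100, 'modality': 'D'},
    {'text': 'planner', 'weight': 75, 'modality': 'D'},
    {'text': 'postcard', 'weight': 45, 'modality': 'A'},
]

either = select(docs, {'$or': [{'weight': {'$eq': 45}}, {'modality': {'$eq': 'D'}}]})
both = select(docs, {'$and': [{'weight': {'$eq': 75}}, {'modality': {'$eq': 'D'}}]})
negated = select(docs, {'$not': {'modality': {'$eq': 'D'}}})
print(either)   # ['paper', 'planner', 'postcard']
print(both)     # ['planner']
print(negated)  # ['journal', 'postcard']
```

Because each clause is itself a query, clauses can be nested to any depth, for example an $or inside an $and.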

For example, to select all Documents whose weight is 45 or whose modality is 'D':

r = da.find({'$or': [{'weight': {'$eq': 45}}, {'modality': {'$eq': 'D'}}]})

[
  {
    "id": "22985b71b6d483c31cbe507ed4d02bd1",
    "modality": "D",
    "tags": {
      "h": 8.5,
      "uom": "in",
      "w": 11.0
    },
    "text": "paper",
    "weight": 100.0
  },
  {
    "id": "a071faf19feac5809642e3afcd3a5878",
    "modality": "D",
    "tags": {
      "h": 22.85,
      "uom": "cm",
      "w": 30.0
    },
    "text": "planner",
    "weight": 75.0
  },
  {
    "id": "411ecc70a71a3f00fc3259bf08c239d1",
    "modality": "A",
    "tags": {
      "h": 10.0,
      "uom": "cm",
      "w": 15.25
    },
    "text": "postcard",
    "weight": 45.0
  }
]