GraphQL#
DocArray supports GraphQL for querying a DocumentArray and getting exactly the fields you need. For example, if .embedding is too big and verbose, you don’t need to query it. Compared to a REST API, GraphQL clients can be faster and more stable because the client, not the server, controls which data is returned.
When integrating DocArray into a GraphQL app, you only need to implement the query (in GraphQL idiom, this is like the API endpoint that your server allows). The schema part is provided by DocArray and can be used out of the box.
Tip
This feature requires strawberry. You can install it via pip install "docarray[full]" or pip install "strawberry-graphql[debug-server]".
See also
This article is not an introduction to GraphQL. If you don’t have a GraphQL background, we strongly recommend reading the official GraphQL documentation first. You may also want to learn more about Strawberry. Otherwise, GraphQL idioms such as query and schema may be confusing.
Basic example#
Let’s create dummy matches in a DocumentArray:
from docarray import DocumentArray
import numpy as np
da = DocumentArray.empty(3)
da.embeddings = np.random.random([3, 15])
db = DocumentArray.empty(4)
db.embeddings = np.random.random([4, 15])
da.match(db)
da.summary()
                  Documents Summary

  Length                     3
  Homogenous Documents       True
  Has nested Documents in    ('matches',)
  Common Attributes          ('id', 'embedding', 'matches')

                      Attributes Summary

  Attribute   Data type         #Unique values   Has empty value
 ────────────────────────────────────────────────────────────────
  embedding   ('ndarray',)      3                False
  id          ('str',)          3                False
  matches     ('MatchArray',)   3                False

           Storage Summary

  Class      DocumentArrayInMemory
  Backend    In Memory
Now let’s build a query (remember in GraphQL this means an endpoint) that allows users to fetch this DocumentArray:
from typing import List

from docarray.document.strawberry_type import StrawberryDocument
import strawberry


@strawberry.type
class Query:
    docs: List[StrawberryDocument] = strawberry.field(
        resolver=lambda: da.to_strawberry_type()
    )


schema = strawberry.Schema(query=Query)
Notice how we use StrawberryDocument as the field type and call to_strawberry_type() in the resolver to convert the DocumentArray before returning the result.
In practice, da could be your final search results, or a DocumentArray after embedding or preprocessing. Here we just use the dummy matches we created earlier as the results.
Finally, save all the code snippets above into toy.py and run it from the terminal:
strawberry server toy
This outputs:
Running strawberry on http://0.0.0.0:8000/graphql 🍓
Now open http://0.0.0.0:8000/graphql in your browser. You should see a GraphQL playground at this URL.
Try the following query:
{
  docs {
    id
    matches {
      id
    }
  }
}
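Outside the playground, the same query can be sent as a JSON POST request, which is the usual way GraphQL is served over HTTP. The following is a minimal sketch using only the Python standard library. It assumes the server from toy.py is listening at the address printed above; GRAPHQL_URL, build_request and run_query are hypothetical names chosen for this illustration, not part of DocArray or Strawberry.

```python
import json
import urllib.request

# Assumed endpoint: the address printed by `strawberry server toy`.
GRAPHQL_URL = "http://0.0.0.0:8000/graphql"

# The same query as in the playground example.
QUERY = """
{
  docs {
    id
    matches {
      id
    }
  }
}
"""


def build_request(query: str, url: str = GRAPHQL_URL) -> urllib.request.Request:
    """Wrap a GraphQL query string in a JSON POST request (hypothetical helper)."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(query: str, url: str = GRAPHQL_URL) -> dict:
    """Send the query and return the decoded JSON response (needs a running server)."""
    with urllib.request.urlopen(build_request(query, url)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the server running, run_query(QUERY)["data"]["docs"] should hold one entry per Document, each carrying only the id and matches fields that the query asked for. If 0.0.0.0 is not reachable from your client, try localhost instead.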
Now we have one endpoint that allows users to selectively read fields from a DocumentArray. We can add more endpoints to the Query class to support advanced filtering and selecting, but this is beyond the scope of this tutorial. It is also your responsibility as the app/service provider to decide what API you want to expose to users.
Integrate with FastAPI#
Strawberry’s built-in server is perfect for prototyping an API. When it comes to production, you can use FastAPI. Here’s a short example to show how to wrap the above snippet in a FastAPI app:
from strawberry.asgi import GraphQL
from fastapi import FastAPI
graphql_app = GraphQL(schema)
app = FastAPI()
app.add_route('/graphql', graphql_app)
app.add_websocket_route('/graphql', graphql_app)
Learn more about FastAPI GraphQL support.