Partial GraphQL models with pydantic

I was recently building a client for a GraphQL service in Python and was faced with the problem of modeling the message types.

Since those are essentially JSON models, I turned to pydantic, with which I'd had good experiences in the past.

Although in theory this should be relatively simple, I encountered some annoying problems right off the bat.

Let's take trevorblades’ country info GraphQL API for example.

Its schema type for Country looks like this:

type Country {
    code: ID!
    name: String!
    native: String!
    phone: String!
    capital: String
    currency: String
    emoji: String!
}

I’ve omitted attributes leading to other types to keep this example brief but know that this API also has info on languages, continents and states!

As an example, this simple query returns the following json:

from requests import post

url = "https://countries.trevorblades.com"

query = """
{
  country(code: "LV") {
    code
    name
    native
    phone
    capital
    currency
    emoji
  }
}
"""

post(url, json={"query": query}).json()

# Returns:
# {'data': {'country': {'code': 'LV', 'name': 'Latvia', 'native': 'Latvija',
# 'phone': '371', 'capital': 'Riga', 'currency': 'EUR', 'emoji': '🇱🇻'}}}

If we follow the schema type our pydantic class will look something like this:

from typing import Optional
from pydantic import BaseModel

class Country(BaseModel):
    code: str
    name: str
    native: str
    phone: str
    capital: Optional[str]
    currency: Optional[str]
    emoji: str

This works well when parsing the full response:

res = post(url, json={"query": query}).json()

Country.parse_obj(res["data"]["country"])

# Returns:
# code='LV' name='Latvia' native='Latvija' phone='371' capital='Riga'
# currency='EUR' emoji='🇱🇻'

But what if we want to query only a small subset of Country’s attributes?

query = """
{
  country(code: "LV") {
    code
    name
  }
}
"""

res = post(url, json={"query": query}).json()

Country.parse_obj(res["data"]["country"])

# Throws:
# pydantic.error_wrappers.ValidationError: 3 validation errors for Country
# native
#   field required (type=value_error.missing)
# phone
#   field required (type=value_error.missing)
# emoji
#   field required (type=value_error.missing)

pydantic doesn’t like the fact that some required attributes are missing.

This leaves us with some depressing “solutions” going forward:

1. Force querying all attributes whenever a Country is accessed, resulting in needless data being fetched.
2. Create a model for each query, omitting unused required attributes, resulting in an endless number of pydantic model classes that is impossible to keep track of.
3. Make all attributes Optional to avoid parsing errors, which results in a misleading model claiming all attributes are optional.

On my project we started working with option 3 and marked all attributes as optional.

That proved to be hazardous very quickly because:

1. The IDE constantly warned that required attributes being accessed might be None, which led to:
2. Developers ignoring legitimate None warnings, assuming they were caused by the loose models.
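To make the hazard concrete, here is a minimal sketch using a plain dataclass as a stand-in for an all-Optional pydantic model. The names LooseCountry and describe are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LooseCountry:
    """Hypothetical stand-in for a model with every field marked Optional."""

    name: Optional[str] = None     # String! in the schema: never null when queried
    capital: Optional[str] = None  # String in the schema: can genuinely be null


def describe(country: LooseCountry) -> str:
    # A type checker reports "name may be None" on the next line. With the
    # loose model this is noise, so it tends to get silenced or ignored.
    name = country.name.upper()  # type: ignore[union-attr]
    # The same kind of warning on capital is legitimate and needs a real guard.
    capital = country.capital.upper() if country.capital is not None else "N/A"
    return f"{name} ({capital})"
```

Both warnings look identical in the editor, which is exactly why the legitimate one is easy to dismiss.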

Troubled by these hazards I started researching a better option and landed on a good compromise.

The solution

Since the big issue around the 3rd option (all optional) is the misleading model affecting the IDE, I started looking for a way to implicitly allow for partial creation of the model.

I ended up with this:

# Note: these imports target pydantic v1.
from typing import Any, Dict, Optional, Tuple, Union, get_args, get_origin
from pydantic import BaseModel
from pydantic.fields import DeferredType
from pydantic.main import ModelMetaclass
from pydantic.generics import GenericModel

class ImplicitOptional(ModelMetaclass):
    def __new__(cls, name: str, bases: Tuple[type, ...], namespaces: Dict[str, Any], **kwargs):
        annotations: dict = namespaces.get("__annotations__", {})

        # Collect annotations inherited from parent models,
        # stopping at pydantic's own base classes.
        for base in bases:
            for base_ in base.__mro__:
                if base_ is BaseModel or base_ is GenericModel:
                    break

                # Annotations collected so far take precedence over these.
                annotations = {**getattr(base_, "__annotations__", {}), **annotations}

        for field, annotation in annotations.items():

            # Skip dunder attributes.
            if field.startswith("__"):
                continue

            # Skip fields that are already Optional.
            if get_origin(annotation) is Union and type(None) in get_args(annotation):
                continue

            # Skip annotations that pydantic has deferred.
            if isinstance(annotation, DeferredType):
                continue

            annotations[field] = Optional[annotation]

        namespaces["__annotations__"] = annotations

        return super().__new__(cls, name, bases, namespaces, **kwargs)

class GraphQLBaseModel(BaseModel, metaclass=ImplicitOptional):
    pass

This solution is based on the solution proposed here with some extra considerations for generic classes and other complex cases that were not covered.

Essentially, this voodoo Python magic class converts all non-optional fields declared in your models to optional at runtime. All you have to do is inherit from GraphQLBaseModel instead of BaseModel.

This allows the IDE to continue type checking a more accurate model, displaying relevant warnings, while parsing a partial model without errors at runtime.
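The annotation rewrite at the heart of the metaclass can be sketched in isolation with nothing but the standard library. The helper name make_optional is hypothetical; it only mirrors the per-field loop and leaves out the inheritance handling and the DeferredType check.

```python
from typing import Any, Dict, Optional, Union, get_args, get_origin


def make_optional(annotations: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: wrap every non-Optional annotation in Optional."""
    result: Dict[str, Any] = {}
    for field, annotation in annotations.items():
        already_optional = (
            get_origin(annotation) is Union and type(None) in get_args(annotation)
        )
        if field.startswith("__") or already_optional:
            # Dunder names and already-Optional fields are left untouched.
            result[field] = annotation
        else:
            result[field] = Optional[annotation]
    return result


# "code" is declared as required, "capital" as Optional.
rewritten = make_optional({"code": str, "capital": Optional[str]})
```

After the call, both entries are Optional[str]. Because this rewrite happens only when the class is created at runtime, a static type checker still sees the annotations exactly as written in the source.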

So when implementing for our example:

class Country(GraphQLBaseModel):
    code: str
    name: str
    native: str
    phone: str
    capital: Optional[str]
    currency: Optional[str]
    emoji: str

query = """
{
  country(code: "LV") {
    code
    name
  }
}
"""

res = post(url, json={"query": query}).json()

Country.parse_obj(res["data"]["country"])

# Returns
# code='LV' name='Latvia' native=None phone=None capital=None currency=None
# emoji=None

It is important to mention that, although preferable to option 3, this solution still comes with a sharp edge.

When encountering a None value in a required field, you can safely deduce that it wasn’t mentioned in the query. On the other hand, a None value in an optional field is ambiguous: it could mean either that the data was None or that the field was not mentioned in the query.
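One way to resolve the ambiguity is to consult the raw response dict, where a field that was not queried is simply absent while a null value is present as None. Below is a minimal sketch; the helper field_state and the payload are made up for illustration and are not real API output.

```python
from typing import Any, Dict


def field_state(payload: Dict[str, Any], field: str) -> str:
    """Hypothetical helper: classify a field in a raw GraphQL response dict."""
    if field not in payload:
        return "not queried"
    if payload[field] is None:
        return "null"
    return "present"


# Made-up payload: capital was queried and came back null,
# while currency was never part of the query.
payload = {"code": "XX", "capital": None}

capital_state = field_state(payload, "capital")    # "null"
currency_state = field_state(payload, "currency")  # "not queried"
```

pydantic v1 also records the names of explicitly provided fields in the model's `__fields_set__` attribute, which may serve the same purpose without keeping the raw dict around. Treat that as a pointer to verify rather than a tested recipe.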
