Sep 18th, 2020 - written by Kimserey.
A few months ago we looked into Marshmallow, a Python serialisation and validation framework which can be used to translate Flask request data to SQLAlchemy models and vice versa. In today’s post we will look at how we can serialise an array containing polymorphic data.
Thanks to the flexibility of Python, it is common practice to hold objects of different classes in the same array. Those objects could be instances of derived classes; for example, given a base class Notification, the array could contain instances of NotificationX and NotificationY.
class Notification:
    def __init__(self, id: int):
        self.id = id


class NotifcationUserCreated(Notification):
    def __init__(self, id: int, username: str):
        super().__init__(id)
        self.username = username

    def __repr__(self):
        return "<NotifcationUserCreated id={}, username={}".format(
            self.id, self.username
        )


class NotifcationQuantityUpdated(Notification):
    def __init__(self, id: int, quantity: int):
        super().__init__(id)
        self.quantity = quantity

    def __repr__(self):
        return "<NotifcationQuantityUpdated id={}, quantity={}".format(
            self.id, self.quantity
        )
We can then have an array composed of any notification:
from faker import Faker

fake = Faker()

notifications = [
    NotifcationUserCreated(1, fake.name()),
    NotifcationQuantityUpdated(2, 10),
    NotifcationUserCreated(3, fake.name()),
]
This results in the following notifications:
[<NotifcationUserCreated id=1, username=Lauren Lambert,
<NotifcationQuantityUpdated id=2, quantity=10,
<NotifcationUserCreated id=3, username=David Woods]
As we can see, we are able to mix multiple classes into the array.
In order to serialize the notifications, we can create their Marshmallow schemas:
from marshmallow import Schema
from marshmallow.fields import Int, Str


class NotifcationUserCreatedSchema(Schema):
    id = Int()
    username = Str()


class NotifcationQuantityUpdatedSchema(Schema):
    id = Int()
    quantity = Int()
But if we try to dump the notifications using only one schema, we lose the information for notifications of the other type:
schema = NotifcationUserCreatedSchema(many=True)
schema.dump(notifications)
This would result in:
[{'username': 'Lori Jackson', 'id': 1},
{'id': 2},
{'username': 'Samantha Clark', 'id': 3}]
Since the array contains polymorphic data, we need a way to use NotifcationUserCreatedSchema when the object is a user created notification, and the other schema when the object is of the other type.
In order to handle the selection of the right schema, we’ll use a type_map attribute which maps from the notification type to the schema.
We first start by creating notification_type attributes on the classes:
class NotifcationUserCreated(Notification):
    notification_type = "user_created"

    def __init__(self, id: int, username: str):
        super().__init__(id)
        self.username = username

    def __repr__(self):
        return "<NotifcationUserCreated id={}, username={}".format(
            self.id, self.username
        )


class NotifcationQuantityUpdated(Notification):
    notification_type = "quantity_updated"

    def __init__(self, id: int, quantity: int):
        super().__init__(id)
        self.quantity = quantity

    def __repr__(self):
        return "<NotifcationQuantityUpdated id={}, quantity={}".format(
            self.id, self.quantity
        )
Then we create a NotificationSchema which holds a type_map mapping from the notification_type to the class schemas NotifcationUserCreatedSchema and NotifcationQuantityUpdatedSchema.
import typing

from marshmallow import Schema, ValidationError


class NotificationSchema(Schema):
    """Notification schema."""

    # Maps each notification_type to the schema that serializes it.
    type_map = {
        "user_created": NotifcationUserCreatedSchema,
        "quantity_updated": NotifcationQuantityUpdatedSchema,
    }

    def dump(self, obj: typing.Any, *, many: bool = None):
        result = []
        errors = {}
        many = self.many if many is None else bool(many)

        if not many:
            return self._dump(obj)

        for idx, value in enumerate(obj):
            try:
                res = self._dump(value)
                result.append(res)
            except ValidationError as error:
                # Record the error under the item's index and keep going.
                errors[idx] = error.normalized_messages()
                result.append(error.valid_data)

        if errors:
            raise ValidationError(errors, data=obj, valid_data=result)

        return result

    def _dump(self, obj: typing.Any):
        # Select the schema matching the object's notification_type.
        notification_type = getattr(obj, "notification_type")
        inner_schema = NotificationSchema.type_map.get(notification_type)

        if inner_schema is None:
            raise ValidationError(f"Missing schema for '{notification_type}'")

        return inner_schema().dump(obj)
The NotificationSchema acts as the parent schema, which uses the proper schema to dump each object. We override the original dump function, def dump(self, obj: typing.Any, *, many: bool = None), and within it we use the type map to instantiate the right schema and call dump on that schema.
A special scenario to handle is many=True: the object is then expected to be an array, which we enumerate while consolidating the validation errors by index. These should only be missing schema validation errors, as dump doesn’t run validation; only load does.
And using this schema we can now serialize the notifications:
schema = NotificationSchema(many=True)
schema.dump(notifications)
This results in:
[{'username': 'Anthony Montgomery', 'id': 1},
{'id': 2, 'quantity': 10},
{'username': 'Mr. Andrew Carter', 'id': 3}]
And we can see that we are able to serialize each notification properly!
Today we looked into serializing a polymorphic array. We started by creating an example polymorphic structure with notifications, then saw how to create the associated Marshmallow schemas for it. Finally, we looked at how we could override the Marshmallow schema dump function to serialize the array properly. I hope you liked this post, and I'll see you in the next one!