The manage.py dumpdata and manage.py loaddata commands can translate the
contents of your database to or from YAML, XML, or some other format using a
serializer. There may come a time when the built-in serializers do not meet
your needs, and you find yourself writing a custom serializer (and probably a
custom deserializer too). How can you go about testing these?
There are two basic scenarios:
- Serialization
  - Dump database objects to whatever format you want dumpdata to output
- Deserialization
  - Load database objects from your custom format, like loaddata does
You could test each of these separately, or you could do a full round-trip test
that mimics dumpdata followed by loaddata. We'll test them separately in this
example. We won't be testing the manage.py command itself; this is strictly a
unit test on the custom serializer and/or deserializer.
There are many different levels of detail you could get into. This example covers only a very simple scenario: a single model with a single string field. A more comprehensive test would include multiple models, relationships between and within models, and a wide variety of model field types and data. Treat this as a basic starting point; it's up to you how rigorously you want to test your serializer and deserializer.
If you've written your own serializer, you've probably added something like this
to your settings.py:

    SERIALIZATION_MODULES = {
        'yaml': 'myapp.custom_yaml',
    }

In this example, we're using a customized YAML serializer. Anytime Django needs
to dump or load the yaml format, it'll use myapp/custom_yaml.py to do it.
This example serializer uses a slightly different structure than Django's built-in YAML serializer. It's designed to be significantly less verbose, while still capturing all the relevant data. You may be familiar with Django's usual YAML output, which looks something like this:

    - fields: {name: green}
      model: myapp.color
      pk: 1
    - fields: {name: blue}
      model: myapp.color
      pk: 2
    - fields: {name: red}
      model: myapp.color
      pk: 3
    - fields: {name: squishy}
      model: myapp.texture
      pk: 1
    - fields: {name: crumbly}
      model: myapp.texture
      pk: 2

This serializer uses a more condensed format:

    myapp.color:
     1: {name: green}
     2: {name: blue}
     3: {name: red}
    myapp.texture:
     1: {name: squishy}
     2: {name: crumbly}

We won't go into the details of the serializer itself; all we're concerned with here is how to make sure it produces correct output, and that the corresponding deserializer loads it correctly afterwards.
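Although the serializer itself is out of scope, the relationship between the two layouts can be sketched in plain Python. The helper below is hypothetical (the name condense and the sample records are assumptions for illustration, not part of the actual serializer); it only regroups Django-style records by model and primary key, using a subset of the data shown above.

```python
def condense(records):
    """Regroup Django-style records into {model: {pk: fields}}."""
    condensed = {}
    for record in records:
        # One inner mapping per model, keyed by primary key.
        by_pk = condensed.setdefault(record["model"], {})
        by_pk[record["pk"]] = record["fields"]
    return condensed


# Plain-Python equivalent of the verbose layout (illustrative data only).
records = [
    {"model": "myapp.color", "pk": 1, "fields": {"name": "green"}},
    {"model": "myapp.color", "pk": 2, "fields": {"name": "blue"}},
    {"model": "myapp.texture", "pk": 1, "fields": {"name": "squishy"}},
]

condensed = condense(records)
```

The resulting nested dict has the same shape as the condensed YAML: one top-level key per model, then one entry per primary key.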
To test the serializer, we will create some model instances, then serialize them
and make sure the output is correct. The django.core.serializers module defines
a serialize function that takes the name of the format you want to serialize
to, along with a QuerySet of objects to serialize, and returns a string of
serialized data. That's what we'll use in our test:

    from django.test import TestCase
    from django.core import serializers
    from myapp.models import Color

    class YamlSerializerTest(TestCase):
        def test_serializer(self):
            # Stuff to serialize
            Color(name='green').save()
            Color(name='blue').save()
            Color(name='red').save()

            # Expected output
            expect_yaml = (
                'myapp.color:\n'
                ' 1: {name: green}\n'
                ' 2: {name: blue}\n'
                ' 3: {name: red}\n'
            )

            # Do the serialization
            actual_yaml = serializers.serialize('yaml', Color.objects.all())

            # Did it work?
            self.assertEqual(actual_yaml, expect_yaml)

Notice that we pass 'yaml' as the first argument to serialize; ordinarily this
would use the default YAML serializer, but since we've overridden that in
SERIALIZATION_MODULES, it'll use our custom one instead.
Since we're working with standard YAML, another way to verify the result is to
parse it using PyYAML's yaml.safe_load, and check that the resulting Python data
structure (in this case, a dict) matches expectations:

    import yaml

    class YamlSerializerTest(TestCase):
        def test_serializer(self):
            # ...
            # safe_load builds only plain Python types (dict, int, str, ...)
            self.assertEqual(
                yaml.safe_load(actual_yaml),
                {
                    'myapp.color': {
                        1: {'name': 'green'},
                        2: {'name': 'blue'},
                        3: {'name': 'red'},
                    }
                }
            )

Of course, if you're serializing to your own made-up custom format, you may not have a standalone parser for that format so readily available. In that case, you may simply choose to rely on your deserializer tests to ensure that the output is parsed correctly.
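When a parser is available, comparing parsed data is more forgiving than comparing raw text, because it ignores formatting details such as whitespace and key order. The sketch below illustrates this using the standard library's json module purely as a stand-in for a YAML parser (an assumption made so the example runs without PyYAML; note that JSON object keys are strings, unlike the integer keys in the YAML above).

```python
import json

# Two renderings of the same data: different whitespace and key order.
compact = '{"myapp.color": {"1": {"name": "green"}, "2": {"name": "blue"}}}'
pretty = """
{
    "myapp.color": {
        "2": {"name": "blue"},
        "1": {"name": "green"}
    }
}
"""

# A raw string comparison is sensitive to formatting...
texts_match = compact == pretty

# ...while comparing the parsed structures checks only the data.
parsed_match = json.loads(compact) == json.loads(pretty)
```

Which comparison you want depends on the goal: an exact string match pins down the output format itself, while a parsed comparison only checks that the right data came out.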
Unless your serializer is designed for one-way conversion, you'll want to include some tests for your deserializer as well. Starting with the serialized text output, we'll make sure that it is parsed correctly and rebuilds the original model instances.
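The django.core.serializers module includes a counterpart to the serialize
function called (you guessed it) deserialize. This function accepts a format
(like yaml), along with a chunk of text to deserialize. It returns a generator
that yields each of the objects as they are parsed. Following Django's built-in
deserializers, each yielded item is a DeserializedObject that wraps the model
instance in its object attribute; nothing is written to the database until you
call its save() method. We'll convert these into a list, then verify that the
deserialized objects have the correct values in their fields: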

    class YamlSerializerTest(TestCase):
        def test_deserializer(self):
            # Input text
            input_yaml = (
                'myapp.color:\n'
                ' 1: {name: green}\n'
                ' 2: {name: blue}\n'
                ' 3: {name: red}\n'
            )

            # Deserialize into a list of objects
            objects = list(serializers.deserialize('yaml', input_yaml))

            # Were three objects deserialized?
            self.assertEqual(len(objects), 3)

            # Did the objects deserialize correctly?
            self.assertEqual(objects[0].object.name, 'green')
            self.assertEqual(objects[1].object.name, 'blue')
            self.assertEqual(objects[2].object.name, 'red')

Perhaps this isn't the most elegant way to do it, but it gets the job done.
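The round-trip approach mentioned at the start can be sketched without Django at all. In the example below, dump_text and load_text are hypothetical stand-ins for a custom serializer and deserializer; the parser is a toy that handles only flat string fields, not real YAML. In an actual Django test you would presumably call serializers.serialize('yaml', ...) and serializers.deserialize('yaml', ...) in their place.

```python
import unittest


def dump_text(data):
    """Render {model: {pk: {field: value}}} in the condensed layout."""
    lines = []
    for model, rows in data.items():
        lines.append(f"{model}:")
        for pk, fields in rows.items():
            body = ", ".join(f"{key}: {value}" for key, value in fields.items())
            lines.append(f" {pk}: {{{body}}}")
    return "\n".join(lines) + "\n"


def load_text(text):
    """Parse the layout written by dump_text (simple string values only)."""
    data = {}
    rows = None
    for line in text.splitlines():
        if not line.startswith(" "):
            # A model header such as "myapp.color:"
            rows = data.setdefault(line.rstrip(":"), {})
            continue
        # An indented row such as " 1: {name: green}"
        pk, _, body = line.strip().partition(": ")
        fields = {}
        for pair in body.strip("{}").split(", "):
            key, _, value = pair.partition(": ")
            fields[key] = value
        rows[int(pk)] = fields
    return data


class RoundTripTest(unittest.TestCase):
    def test_round_trip(self):
        original = {
            "myapp.color": {
                1: {"name": "green"},
                2: {"name": "blue"},
                3: {"name": "red"},
            }
        }
        text = dump_text(original)
        # Loading what was just dumped should give back the same data.
        self.assertEqual(load_text(text), original)
```

A round-trip test like this catches mismatches between the two halves, but it can pass even when both halves share the same mistake, which is one reason to keep the separate serializer and deserializer tests as well.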
Several custom serializers are available on djangosnippets.org, including JSON and CSV serializers. The "slightly better YAML serializer" snippet was used as the basis for the examples above. For a much more thorough serializer test suite, please consult Django's regression tests.