Sometimes, old design decisions come back to bite you. This is one of those tales.
A project I’m working on had a Django model similar to this:
    class Municipality(models.Model):
        code = models.CharField(max_length=2, primary_key=True)
        name = models.CharField(max_length=100)
and it was used by other models such as:
    class ZipCode(models.Model):
        code = models.CharField(max_length=4, primary_key=True)
        municipality = models.ForeignKey(Municipality)
And all was good, until we needed to expand the municipality model to support different countries. A single unique field holding only the code, which may collide across countries, was no longer enough.
For all the modern parts of the code base we use UUIDs as the primary key, so we wanted to migrate the primary key of Municipality to a UUID while retaining all the relations.
As of September 2017, Django does not support migrating primary keys in any nice manner, so we’re on our own here.
We tried a lot of magic solutions, but we always got stuck with the migrations system not being able to detect and handle the changes nicely.
After some research and quite a bit of trial and error, we settled on the following approach. It has some drawbacks I’ll get back to, but works reasonably well.
A quick reminder on how the world looks from a database perspective. When you define a
ForeignKey field in Django, it creates a database column of the same type as the primary key of the referenced model on the referring model, and adds the foreign key constraints. So in the example above, we have two tables (in pseudo SQL):
    CREATE TABLE municipality (
        code varchar(2) PRIMARY KEY NOT NULL,
        name varchar(100)
    );

    CREATE TABLE zipcode (
        code varchar(4) PRIMARY KEY NOT NULL,
        municipality_id varchar(2) NOT NULL REFERENCES municipality (code)
    );
So, we need to:
- Break the foreign key constraints.
- Alter the root model.
- Map the new primary key ids to the old ones.
- Re-apply the foreign keys to it.
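To make the four steps concrete at the database level, here is a minimal sketch using Python's built-in sqlite3 module. It is not what Django generates: SQLite cannot drop a constraint in place, so each step rebuilds a table, and the helper table names (`zipcode_plain`, `municipality_new`) and the sample rows are made up for illustration.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked to

# Starting point: the two tables from the example, with made-up sample rows.
conn.execute(
    "CREATE TABLE municipality (code varchar(2) PRIMARY KEY NOT NULL, name varchar(100))"
)
conn.execute(
    "CREATE TABLE zipcode (code varchar(4) PRIMARY KEY NOT NULL, "
    "municipality_id varchar(2) NOT NULL REFERENCES municipality (code))"
)
conn.executemany("INSERT INTO municipality VALUES (?, ?)", [("01", "Alpha"), ("02", "Beta")])
conn.executemany(
    "INSERT INTO zipcode VALUES (?, ?)", [("1000", "01"), ("1001", "01"), ("2000", "02")]
)

# 1. Break the foreign key: keep the old code as a plain column.
conn.execute(
    "CREATE TABLE zipcode_plain (code varchar(4) PRIMARY KEY NOT NULL, "
    "municipality varchar(2) NOT NULL)"
)
conn.execute("INSERT INTO zipcode_plain SELECT code, municipality_id FROM zipcode")
conn.execute("DROP TABLE zipcode")

# 2. Alter the root model: a UUID becomes the primary key, code stays unique.
conn.execute(
    "CREATE TABLE municipality_new (id char(32) PRIMARY KEY NOT NULL, "
    "code varchar(2) UNIQUE NOT NULL, name varchar(100))"
)
for code, name in conn.execute("SELECT code, name FROM municipality").fetchall():
    conn.execute(
        "INSERT INTO municipality_new VALUES (?, ?, ?)", (uuid.uuid4().hex, code, name)
    )
conn.execute("DROP TABLE municipality")
conn.execute("ALTER TABLE municipality_new RENAME TO municipality")

# 3 + 4. Map old codes to new ids while re-creating the constrained table.
conn.execute(
    "CREATE TABLE zipcode (code varchar(4) PRIMARY KEY NOT NULL, "
    "municipality_id char(32) NOT NULL REFERENCES municipality (id))"
)
conn.execute(
    "INSERT INTO zipcode SELECT z.code, m.id FROM zipcode_plain AS z "
    "JOIN municipality AS m ON m.code = z.municipality"
)
conn.execute("DROP TABLE zipcode_plain")
conn.commit()

# Every zip code should still point at the municipality it started with.
links = conn.execute(
    "SELECT z.code, m.code, m.name FROM zipcode AS z "
    "JOIN municipality AS m ON m.id = z.municipality_id ORDER BY z.code"
).fetchall()
```

The rest of the post does the same thing through Django migrations, where the column changes are generated for you but the ordering is still your responsibility.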
We start by breaking the foreign keys in the referring models.
    class ZipCode(models.Model):
        code = ...  # Same as before
        municipality = models.CharField(max_length=2)  # Foreign key removed here
python manage.py makemigrations -n break_zipcode_muni_foreignkey
Now that the
Municipality model is free from any external referring models,
we can go to work on it.
Start by adding the new id field:

    class Municipality(models.Model):
        # Existing fields stay as they are
        id = models.UUIDField(default=uuid.uuid4)
python manage.py makemigrations -n add_id_field_to_muni
For some reason, using
uuid.uuid4() as a default function in the migration didn’t work in my case, so I added a step in the created migration to create new unique ids for all rows:
    import uuid

    from django.db import migrations


    def create_ids(apps, schema_editor):
        # Use the historical model from the migration state, not a direct import.
        Municipality = apps.get_model('myapp', 'Municipality')
        for m in Municipality.objects.all():
            m.id = uuid.uuid4()
            m.save()

    # ...

    operations = [
        migrations.AddField(...),
        migrations.RunPython(
            code=create_ids,
            reverse_code=migrations.RunPython.noop,
        ),
    ]
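A likely explanation for the default not working: when a column is added to a table that already has rows, the value for those rows is generated only once, so every row ends up with the same UUID. The sketch below only simulates that behaviour with plain dicts. Both helper functions are hypothetical and are not part of Django.

```python
import uuid


def add_column_with_default(rows, name, default):
    """Simulate adding a column: resolve the default once, reuse it for every existing row."""
    value = default() if callable(default) else default
    for row in rows:
        row[name] = value


def fill_per_row(rows, name, factory):
    """What the RunPython step does instead: call the factory once per row."""
    for row in rows:
        row[name] = factory()


rows = [{"code": "01"}, {"code": "02"}, {"code": "03"}]

add_column_with_default(rows, "id", uuid.uuid4)
shared_ids = {row["id"] for row in rows}  # one value shared by all rows

fill_per_row(rows, "id", uuid.uuid4)
per_row_ids = {row["id"] for row in rows}  # one distinct value per row
```

A shared value is harmless while the column is a plain field, but it makes the later switch to a primary key fail, which is why the per-row step is needed.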
Now we have an id field on Municipality, and we should be able to switch the primary key around:
    class Municipality(models.Model):
        id = models.UUIDField(default=uuid.uuid4, primary_key=True)  # primary_key added
        code = models.CharField(max_length=2, unique=True)  # primary_key replaced with unique
Create the migration, and make sure that the AlterField operation for code runs before the one for id, so that we never have two primary keys at the same time. We’ve added primary_key=True to the id field and unique=True to the code field, since we still want to enforce that constraint for now, and we lost it when we removed the primary_key attribute from it.
Congratulations, we now have a new UUID primary key. But we still need to repair every model whose foreign key we broke.
Let’s start by creating an empty migration:
python manage.py makemigrations --empty -n fix_zipcode_fk_to_muni_uuid myapp
Open the file, and let us begin:
    import django.db.models.deletion
    from django.db import migrations, models


    def match(apps, schema_editor):
        ZipCode = apps.get_model('myapp', 'ZipCode')
        Municipality = apps.get_model('myapp', 'Municipality')
        for zip_code in ZipCode.objects.all():
            # Look up the new UUID via the old code still stored on the zip code.
            zip_code.temp_muni = Municipality.objects.get(code=zip_code.municipality).id
            zip_code.save()

    # ...

    operations = [
        migrations.AddField(
            model_name='zipcode',
            name='temp_muni',
            field=models.UUIDField(null=True),
        ),
        migrations.RunPython(
            code=match,
            reverse_code=migrations.RunPython.noop,
        ),
        migrations.RemoveField(model_name='zipcode', name='municipality'),
        migrations.RenameField(
            model_name='zipcode',
            old_name='temp_muni',
            new_name='municipality',
        ),
        migrations.AlterField(
            model_name='zipcode',
            name='municipality',
            field=models.ForeignKey(
                on_delete=django.db.models.deletion.PROTECT,
                to='myapp.Municipality',
            ),
        ),
    ]
Let us go through the steps here.
- Add a temporary field for storing the id of the Municipality that we want to connect to. We don’t make it a ForeignKey field just yet, as Django gets confused about the naming later on.
- Run the match function to look up the new ids by the old lookup key, and store them in the temporary field.
- Remove the old municipality field.
- Rename the temporary field to municipality.
- Finally, migrate the type of municipality to a foreign key to create all the database constraints we need.
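The lookup step does one query per zip code, and a .get() call raises as soon as a code has no counterpart. If the table is large, or rows may have been written while the foreign key was broken, it can help to build the mapping up front and collect the misses. A minimal sketch of that idea, with plain dicts standing in for model rows; the helper `build_fk_values` and the sample data are hypothetical:

```python
import uuid


def build_fk_values(zip_rows, municipalities):
    """Return ({zip code: new municipality id}, [zip codes whose old code has no match])."""
    id_by_code = {m["code"]: m["id"] for m in municipalities}
    matched, unmatched = {}, []
    for z in zip_rows:
        new_id = id_by_code.get(z["municipality"])
        if new_id is None:
            unmatched.append(z["code"])
        else:
            matched[z["code"]] = new_id
    return matched, unmatched


# Made-up rows; fixed UUIDs keep the example deterministic.
municipalities = [
    {"id": uuid.UUID(int=1), "code": "01"},
    {"id": uuid.UUID(int=2), "code": "02"},
]
zip_rows = [
    {"code": "1000", "municipality": "01"},
    {"code": "2000", "municipality": "02"},
    {"code": "3000", "municipality": "99"},  # orphan: no municipality with code 99
]

matched, unmatched = build_fk_values(zip_rows, municipalities)
```

Inside a real migration the two lists would come from the historical models, and a non-empty `unmatched` list is a good reason to stop before the foreign key constraint is re-created.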
And there you go. All done.
There are some downsides here. Since we split the work into several migrations, we leave ourselves vulnerable if any of the later migrations fail. This will probably leave the application in a pretty unworkable state, so make sure to test the migrations quite a bit. You can reduce the risk by hand editing all the steps into one migration, but if you have references from multiple different apps, then you need to do the breaking and restoring in separate steps anyway.
Logging / Debugging
You’ll most likely end up with some SQL errors during the process of creating these, so a nice trick I like to do is to create a simple logging migration operation.
    def log(message):
        def fake_op(apps, schema_editor):
            print(message)
        return fake_op

    # ...

    operations = [
        migrations.RunPython(log('Step 1')),
        migrations.AlterField(...),
        migrations.RunPython(log('Step 2')),
        # ...
    ]
This allows you to see where in the process the migration fails.
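The same idea can be taken one step further by wrapping the data functions themselves, so that a failing step names itself. This is only a sketch: `logged` and its `emit` parameter are hypothetical names, and the wrapper relies on nothing more than the (apps, schema_editor) signature that RunPython functions use.

```python
def logged(name, func, emit=print):
    """Wrap a RunPython-style function so start, success, and failure are reported."""
    def wrapper(apps, schema_editor):
        emit(f"[{name}] start")
        try:
            func(apps, schema_editor)
        except Exception as exc:
            emit(f"[{name}] failed: {exc!r}")
            raise  # re-raise so the migration still aborts
        emit(f"[{name}] done")
    return wrapper


# Stand-alone demonstration with dummy arguments instead of a real migration run.
messages = []
logged("ok step", lambda apps, schema_editor: None, emit=messages.append)(None, None)


def failing(apps, schema_editor):
    raise ValueError("boom")


try:
    logged("bad step", failing, emit=messages.append)(None, None)
except ValueError:
    pass  # expected: the wrapper reports the failure and re-raises
```

In a migration this would be used as migrations.RunPython(logged('match', match), ...), leaving `emit` at its default so the messages are printed.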
To see what SQL Django creates for a given migration, run
python manage.py sqlmigrate <appname> <migration_number>. This is super useful for checking whether operations are run in the order that you expect.