Local Development for Large Upgrades

Using copy-on-write (COW) mounts to get a usable local setup when developing upgrade procedures for messy sites.

I’ve had to do way too many Plone upgrades where the sites are just a horrible nest of bad administration and maintenance, worse customizations, and the worst add-ons. A site in such a state takes a lot of small adjustments and additions to the migration procedure to get it back into a healthy enough state for the Plone upgrades to run. The problems you run into on such a site are often very specific to corners of functionality and to deep, dark corners of the content hierarchy. I’ve found no better way to work on this than to implement upgrade steps, run the upgrade, fix any errors, QA the completed upgrade, write new upgrade steps, refine the existing ones, and repeat ad nauseam.

Since running the upgrade procedure can take a long time, the iterations take a long time. This is compounded by the time it takes to transfer the DB, no matter how you do it. Connecting to ZEO remotely is very slow; judging from iftop output, I assume it’s the latency. Syncing the BLOBs, even with a well-tuned rsync script, can take forever.

I finally found a set of approaches that brings my development of upgrade steps up to a tolerable speed. It’s still slower than “normal” development, but it no longer feels maddening. Firstly, I developed an upgrade runner that commits after each profile version increment so that I don’t have to start the whole procedure over. It’s called collective.upgrade and I plan to cut a release, along with a more detailed blog post, once I’ve deployed this current upgrade.

Secondly, I used a union mount (UnionFS, AUFS, etc.) to get copy-on-write behavior for my var/filestorage/Data.fs and var/blobstorage/. In other words, whenever the upgrade procedure reads a BLOB, it gets it from the production blobstorage directory via a network filesystem (SSHFS, CIFS/SMB, NFS, etc.). When it writes to a BLOB, however, it writes to a local directory and uses that local version of the BLOB from then on. Since BLOBs are very often compressed images and files, I find no penalty in letting the network FS do a dumb transfer of BLOBs as opposed to compressing with rsync -z.

I also use the same setup for Data.fs, but since it’s much smaller than the BLOBs and much more heavily used, I’ve found it best to just rsync the Data.fs and Data.fs.index. With this setup I can test the upgrade at nearly local speeds, and my upgrade step development is much faster for having all my favorite tools at hand.

Here are the commands I’m using to do this. It’s very important that you mount the prod network FS as read-only with the -o ro mount option:

# Mount prod read-only over the network
sudo mount -v -t <network fs type> <remote network fs URL> var/prod -o ro,<fs options>
# Union mounts: the first dirs= branch is the local writable overlay,
# the second (=rr) is the read-only prod copy
sudo mount -v -t aufs none var/filestorage -o dirs=var/filestorage.prod:var/prod/var/filestorage=rr
sudo mount -v -t aufs none var/blobstorage -o dirs=var/blobstorage.prod:var/prod/var/blobstorage=rr
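
These mount commands assume the prod mount point and the local copy-on-write overlay directories already exist. A minimal prep sketch, with the directory names taken from the commands above (adjust to your buildout layout):

# Read-only prod mount point, local overlays, and the union mount points
mkdir -p var/prod var/filestorage.prod var/blobstorage.prod var/filestorage var/blobstorage

A quick way to confirm the copy-on-write behavior is working as expected is to write a scratch file through the union and check where it lands:

touch var/blobstorage/cow-test
ls var/blobstorage.prod/cow-test        # appears in the local overlay
ls var/prod/var/blobstorage/cow-test    # should fail: nothing written to prod
rm var/blobstorage/cow-test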

When it’s time to refresh to the latest prod, shut down Zope and ZEO and run:

rsync -Paz <remote rsync URL>/var/filestorage/Data.fs <remote rsync URL>/var/filestorage/Data.fs.index var/filestorage.prod/
rsync -Paz --existing <remote rsync URL>/var/blobstorage/ var/blobstorage.prod/

Because of --existing, the second command only touches BLOBs that have already been copied into the local overlay, reverting them back to the prod version.
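
If you repeat this often, a small wrapper script can tie the refresh together. This is just a sketch; the bin/instance and bin/zeoserver control script names are assumptions about a typical buildout layout:

#!/bin/sh
set -e
# Refresh the local copy-on-write overlays from prod (run from the buildout root)
bin/instance stop
bin/zeoserver stop
# Fresh Data.fs and index from prod
rsync -Paz <remote rsync URL>/var/filestorage/Data.fs <remote rsync URL>/var/filestorage/Data.fs.index var/filestorage.prod/
# Revert only the BLOBs the previous upgrade run modified locally
rsync -Paz --existing <remote rsync URL>/var/blobstorage/ var/blobstorage.prod/
bin/zeoserver start
bin/instance start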

Enjoy, but be very careful that you don’t accidentally apply changes to prod. Back up prod and make sure your prod network mount is read-only.
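
One cheap safeguard is to check the mount options before each run. A minimal sketch using findmnt from util-linux, with the var/prod path taken from the mount commands above:

# Abort if the prod network mount is not mounted read-only
findmnt -T "$PWD/var/prod" -n -o OPTIONS | grep -qE '(^|,)ro(,|$)' || {
    echo 'var/prod is not mounted read-only; aborting' >&2
    exit 1
}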

Updated on 29 February 2012

Imported from Plone on Mar 15, 2021. The date for this update is the last modified date in Plone.
