.. Ross Patterson's Blog imported post, created by `$ ./bin/rfc822-to-post` on Mar 15, 2021. .. meta:: :description: A terrifying tale of ZODB errors that only showed during the hardest time to debug. :keywords: Python, Plone, Zope .. post:: Nov 21, 2012 :tags: Python, Plone, Zope :author: Ross Patterson :redirect: @@redirect-to-uuid/da91172a9d114da696940feb76408a8c ######################### POSKeyError During Commit ######################### A terrifying tale of ZODB errors that only showed during the hardest time to debug. Believe it or not, I'm a `POSKeyError`_ virgin, I've never run into one... Until now! I was working on a very small `Plone`_ 3.1.1 to 4.2.2 upgrade. Not too many add-ons, simple skin/theme customizations, 1,810 objects in ~800MB. No problem, right? Wrong. I got a ``POSKeyError`` after the upgrade step that converts files and images to `BLOBs`_. The wierd part is that it only happened during the `transaction.commit()`_, not during `ZODB`_ object access:: File "/opt/src/eggs/Zope2-2.13.18-py2.7.egg/Zope2/App/startup.py", line 301, in commit transaction.commit() File "/opt/src/eggs/transaction-1.1.1-py2.7.egg/transaction/_manager.py", line 89, in commit return self.get().commit() File "/opt/src/eggs/transaction-1.1.1-py2.7.egg/transaction/_transaction.py", line 329, in commit self._commitResources() File "/opt/src/eggs/transaction-1.1.1-py2.7.egg/transaction/_transaction.py", line 443, in _commitResources rm.commit(self) File "/opt/src/eggs/ZODB3-3.10.5-py2.7-linux-x86_64.egg/ZODB/Connection.py", line 572, in commit oid, serial, transaction) File "/opt/src/eggs/ZODB3-3.10.5-py2.7-linux-x86_64.egg/ZODB/BaseStorage.py", line 416, in checkCurrentSerialInTransaction committed_tid = self.getTid(oid) File "/opt/src/eggs/ZODB3-3.10.5-py2.7-linux-x86_64.egg/ZODB/FileStorage/FileStorage.py", line 770, in getTid pos = self._lookup_pos(oid) File "/opt/src/eggs/ZODB3-3.10.5-py2.7-linux-x86_64.egg/ZODB/FileStorage/FileStorage.py", line 403, in _lookup_pos raise POSKeyError(oid) POSKeyError: 0x055350 I tried all the tools in `ZODB.scripts`_ and `zc.zodbdgc` but they couldn't tell me anything about the ``oid`` in question. That's probably because it was a different ``oid`` every time I attempted the upgrade step *on the same ZODB*. Without being able to use the typical tools to find out what ZODB objects were referring to those ``oids`` I was really stuck. Then `David Glick`_ came to my rescue. Actually, he was staying at my apartment so I drafted him. :-) He suggested that I might have to create a mapping of ``oids`` to stack frames at the time when objects were added to the transaction so that I could figure out what *had been* going on once I knew *which* ``oid`` would be the problem. This worked very well and much thanks to David. I added the following to the place where the objects were being added to the transaction, ``ZODB.Connection.Connection.readCurrent()``:: if not hasattr(self, 'XXX'): self.XXX = {} self.XXX[ob._p_oid] = sys._getframe() Then once I had my ``POSKeyError`` in ``pdb.post_mortem()`` I could invoke ``pdb`` at the place where the offending ``oid`` had been added:: import pdb; pdbr = pdb.Pdb() ; pdbr.reset() ; pdbr.interaction(self.XXX[oid], None) Using that I was able to find out what objects were involved. They were all ``portal_catalog/path`` index ``Trees.IIBTree.IITreeSet`` objects. What's odd is that the error happens even if I do a catalog clear and rebuild first. After much more debugging and discussion, we determined that the `ZODB.Connection.Connection.readCurrent()`_ support was involved somehow. Also, as I deleted the images that were involved I noticed another migration error earlier on went away for each image I removed. That's the key, when the BLOB migration fails for one file or image, it uses a savepoint to rollback *just* that image's migration and proceed with the rest. It sure seems like there's a bug in the ZODB where when a savepoint is rolled back the objects registered with the transaction are cleaned up but the objects registered for ``readCurrent`` are not. Gee, someone should write a test case. :-) In the meantime, it turns out the `reason the image migrations were failing`_ was something else all together but I thought I should share this debugging technique for capturing stack information when you can't know which stack you need to inspect until later. I also hope this ZODB bug gets squashed. .. _`David Glick`: http://glicksoftware.com/ .. _`POSKeyError`: http://plonechix.blogspot.com/2009/12/definitive-guide-to-poskeyerror.html .. _`Plone`: http://plone.org .. _`BLOBs`: http://svn.zope.org/ZODB/trunk/src/ZODB/blob.py?view=markup .. _`transaction.commit()`: http://www.zodb.org/zodbbook/transactions.html .. _`ZODB`: http://zodb.org .. _`ZODB.scripts`: http://svn.zope.org/ZODB/trunk/src/ZODB/scripts/README.txt?view=markup .. _`zc.zodbdgc`: http://zodb.org/documentation/articles/multi-zodb-gc.html .. _`ZODB.Connection.Connection.readCurrent()`: http://zope3.pov.lt/trac/browser/ZODB/trunk/src/ZODB/interfaces.py#L289 .. _`reason the image migrations were failing`: https://github.com/plone/plone.app.blob/commit/d06895d4a026539e5eb8717e7df9832104bf350c .. update:: Nov 21, 2012 Imported from Plone on Mar 15, 2021. The date for this update is the last modified date in Plone.