Using Functional Tests for Profiling

Use your functional tests with zope.testing/zope.testrunner to do easier profiling.

I’ve always enjoyed Zope2’s support for profiling requests, but it’s also a bit of a PITA. When profiling, even with cProfile or hotshot, Python can be a bit of a dog and with a featureful application like Plone, it makes even a simple page load a painful experience. So I have to get the page load I want to profile all set up in a browser, stop the instance, enable profiling, fire the instance back up, and then load the page I want to profile. Then I have to be really careful not to load any pages other than the profiling view because not will they be dog slow but they will pollute the profiling data. It’s also quite laborious to profile a bunch of different page loads to try and get a more representative sense of where your hot spots are. Finally, I have to use the Zope2 profiling view, I can’t use other tools.

I’ve known about zope.testing/zope.testrunner’s “–profile=cProfile” command line option for sometime. When you use it, it spits out the “top 50 stats, sorted by cumulative time and number of calls”. Under a featureful application like Plone, however, the top 50 stats are usually consumed with the top publisher and template rendering calls and don’t actually tell you anything useful.

Recently I needed to do some profiling of the baseline performance of an application. Since I use functional zope.testbrowser tests a lot (some would say too much), my functional tests are actually a pretty good representation of the real world application usage. So I was thrilled to find that the testrunner actually leaves the raw profiling data behind after a test run! It leaves the data in the working directory in which the test run was invoked. If you’re doing this in buildout using a zc.recipe.testrunner part named “test”, as is the convention, the file will be in parts/test/tests_profile.??????.prof.

In order to avoid the test fixture setup from polluting the profiling data, I use the “–repeat” command line option for the test runner to make sure that profiling data from the actual tests are much louder in the profiling data than the test setup. For a large functional test suite I tend to use “–repeat=3” since the suite is long enough as it is and any more repetitions just take too long:

$ bin/test --profile=cProfile --repeat=3

If I’m profiling one test or a small number of tests, however, I like to use “–repeat=10” to be very sure that my profiling data isn’t skewed:

$ bin/test -t my/functional/test.txt --profile=cProfile --repeat=10

Once the tests are finished I can fire up python and use the pstats module to browse my stats. Here I discovered another surprising convenience. Since I use mr.developer everywhere right now, all the code I generally have an interest in optimizing is in “src”. I can use that fact to get much more targeted stats reporting by limiting the stats to those for code in “src”:

>>> import os
>>> import pstats
>>> stats = pstats.Stats('parts/test/tests_profile.??????.prof').sort_stats('cumulative')
>>> stats.print_stats(os.getcwd()+'/src', 50)
...    parts/test/tests_profile.??????.prof
         ... function calls (... primitive calls) in ... CPU seconds
   Ordered by: cumulative time
   List reduced from ... to ... due to restriction <'.../src'>
   List reduced from ... to 50 due to restriction <50>
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)

I’m actually wondering if this might be a better default for the testrunner output when “–profile” is used. If the ZTK people think I should do that, just let me know and I’d be happy to contribute that.

You should also, however, be careful when just looking at the code in “src”. For example, if your code is rendering templates when it doesn’t need to, then limiting the stats to “src” will hide that from you. Filtering by “src” is a good start, but be sure to also check out the top 50 overall and to look into single tests in detail so you don’t miss anything. See the print_stats docs to see everything you can do.

Finally, one should never just assume that even functional tests represent real-world load profiles. I would love to see a tool for capturing and sanitizing request data from production along with the response times for those requests which can then turn that sanitized real-world data into functional load test suites. This would be useful for doing continuous integration that can alert developers when they introduce performance regressions but it would also be very useful for targeted profiling where you could be nearly certain that your optimizations would get you the biggest bang for your users’ bucks. This is something I’d love to work on in a sprint sometime.

Updated on 25 July 2011

Imported from Plone on Mar 15, 2021. The date for this update is the last modified date in Plone.


comments powered by Disqus