spell-checking web spiders
While preparing this site for lunch, I tired as much as possible to pay attention to FF's spell checking, but I became quiet nervous about all the typos I'm sure I missed. Ok, now that we've gotten irony out of the way... :)
Some googling revealed several web services that will crawl your website and report spelling errors, but all of them charge money and I'd like a little more control anyways.
I briefly considered making such a thing Zope- or Plone-specific, but my real purpose would be a validation tool for a whole site. I'd want to check "alt" attributes and various other data that might be missed by a solution based on zope.schema, zope.formlib, or Archetypes. A Zope- or Plone-specific solution would, however, have the advantage of supporting truly interactive spell checking that would allow you to correct spelling as you went.
Leaving Zope- or Plone-specific support aside, all the support you'd need for spidering is already provided by wget and other tools, and aspell is all you could ask for from a spell checker. All you'd really need is something to parse the pages wget downloads, pull out all the strings visible in the UI or to search engines, and pipe them through aspell, reporting the results. The part that seems a little daunting is specifying all the XML elements and attributes to check. Maybe there's a listing already somewhere? Something to do with accessibility standards maybe?
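To make the idea a bit more concrete, here is a rough Python sketch of the glue described above. It assumes the pages have already been mirrored to disk (for example with wget's `--recursive` option) and that the `aspell` command is installed. The names `TEXT_ATTRIBUTES`, `extract_strings`, and `misspelled_words` are made up for illustration, and the attribute list is just my guess, not the complete or standard listing I was wishing for.

```python
import subprocess
from html.parser import HTMLParser

# Attributes whose values end up in front of users or search engines.
# This set is an assumption for illustration, not an authoritative list.
TEXT_ATTRIBUTES = {"alt", "title", "summary", "placeholder"}
# Elements whose contents are code or styling, not prose.
SKIPPED_ELEMENTS = {"script", "style"}


class VisibleTextExtractor(HTMLParser):
    """Collect text nodes and selected attribute values from one page."""

    def __init__(self):
        super().__init__()
        self.strings = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_ELEMENTS:
            self._skip_depth += 1
        for name, value in attrs:
            if name in TEXT_ATTRIBUTES and value:
                self.strings.append(value)
        # <meta name="description" content="..."> is read by search engines.
        attr_map = dict(attrs)
        if tag == "meta" and attr_map.get("name") in ("description", "keywords"):
            if attr_map.get("content"):
                self.strings.append(attr_map["content"])

    def handle_endtag(self, tag):
        if tag in SKIPPED_ELEMENTS and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip_depth:
            self.strings.append(text)


def extract_strings(html):
    """Return the strings from an HTML document worth spell checking."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.strings


def misspelled_words(strings, lang="en"):
    """Pipe strings through `aspell list`, which prints unknown words."""
    result = subprocess.run(
        ["aspell", "--lang=" + lang, "list"],
        input="\n".join(strings),
        capture_output=True,
        text=True,
        check=True,
    )
    return sorted(set(result.stdout.split()))
```

Running `extract_strings` over each mirrored file and handing the result to `misspelled_words` would give a per-page report. Using a real HTML parser rather than regular expressions keeps script and style contents out of the word list, though a stricter XML parser might suit well-formed XHTML better.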
Well, yet another fun project to work on if I ever have any free time or if anyone wants to sponsor it.
Updated on 10 February 2008
Imported from Plone on Mar 15, 2021. The date for this update is the last modified date in Plone.