Stripping MS Word Proprietary Tags Out of HTML

Anyone who has saved Word documents as web pages and then tried to edit them knows one thing: Word adds a huge set of proprietary tags to the HTML. Here are some tools to strip the code down to pristine, compliant HTML. (via AskMeFi)

  • HTML Tidy is the classic tool that cleans up errant tags and other gremlins. FAQ.
  • Demoroniser is a Perl command-line program that removes ‘moronic Microsoft HTML’.
  • Word HTML Cleaner is an online tool that lets you upload a file and then spits it back cleaned up.
  • TextRep can search and replace across multiple files and directories in Windows.
  • Dreamweaver owners – don’t forget Dreamweaver does this in spades.
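
If you just want to see the kind of cleanup these tools perform, here is a rough Python sketch. It is not how any of the tools above actually work; the function name strip_word_markup and the patterns are my own assumptions about what typical Word output looks like (conditional comments, o:/w:/v: namespaced tags, Mso* classes, mso- styles). Regular expressions are a fragile way to edit HTML, so treat this as an illustration only.

```python
import re

def strip_word_markup(html: str) -> str:
    """Hypothetical helper: remove common Word-specific markup from HTML."""
    # Drop conditional comments such as <!--[if gte mso 9]> ... <![endif]-->
    html = re.sub(r"<!--\[if.*?<!\[endif\]-->", "", html,
                  flags=re.DOTALL | re.IGNORECASE)
    # Drop Office-namespaced tags such as <o:p> and </o:p>, keeping inner text
    html = re.sub(r"</?(?:o|w|v|m|st1):[^>]*>", "", html, flags=re.IGNORECASE)
    # Drop class attributes that start with Mso (quoted or unquoted)
    html = re.sub(r'\s+class="?Mso[^"\s>]*"?', "", html, flags=re.IGNORECASE)
    # Drop quoted style attributes that contain any mso- declaration
    html = re.sub(r'\s+style="[^"]*mso-[^"]*"', "", html, flags=re.IGNORECASE)
    return html

sample = '<p class=MsoNormal style="mso-margin-top-alt:auto">Hello<o:p></o:p></p>'
print(strip_word_markup(sample))
```

Note that the last pattern throws away the whole style attribute, including any ordinary CSS that sits next to the mso- declarations, so a dedicated tool is still the safer choice for real documents.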
