HTML Character Sanitization Solution


9 points

Hello, all.

I've been scouring the web for a fix for a longstanding problem of mine, and I wonder if anyone here knows of one. Since long before I came to Drupal, I've dealt with clients copying and pasting text into their sites from other web pages or office products and introducing non-ASCII ("high ASCII") characters like curly quotes, ellipses, em dashes, etc. I'm not concerned with invalid HTML; there are plenty of great solutions for that. HTML Purifier might do some character cleaning, but my current client needs to put iframes in body content, and HTML Purifier strips those out. Does anybody know of anything to help out here? Even if there's a library or project outside of Drupal that I could write a module for (perhaps create an input filter for it or something), I'd be all over it.

Thanks all.



7 points

Hey, let me know what you come up with. I have no problem creating sites with XHTML, but I have had trouble with clients inserting their own data that causes validation errors (the current issue is the Facebook badge).

Good luck,

Holly Ferree

Anonymous's picture
Created by Anonymous
19 points

I've been working on cleaning up Word crap, but there has to be something better...
Nancy

-3 points

Configure your input formats to allow the iframe tag, or create your own custom input format that allows it.

RA Smith

0 points

I trust you're familiar with the "paste from word" and "paste text" buttons in TinyMCE (which is very easy to set up with TinyTinyMCE, or takes a little more effort with WYSIWYG).

The first cleans up most of the junk (but not all); the second cleans up everything, including formatting.

-Bram

5 points

If you want to write an output filter yourself, you can copy what this Perl script does:
http://www.fourmilab.ch/webtools/demoroniser/

It also happens to be the best-named script EVER.
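
For anyone taking that route, here is a minimal sketch of the kind of substitution such a script performs, written in Python rather than Perl. The table and the names (`REPLACEMENTS`, `fix_cp1252_controls`, `sanitize`) are made up for illustration and are not taken from the demoroniser itself; a real filter would probably need a longer table. It only touches the listed characters, so markup such as iframes passes through unchanged.

```python
# Illustrative "demoroniser"-style cleanup: map common typographic
# characters to plain ASCII. The table is an assumption, not exhaustive.
REPLACEMENTS = {
    "\u2018": "'",    # left single quote
    "\u2019": "'",    # right single quote / apostrophe
    "\u201c": '"',    # left double quote
    "\u201d": '"',    # right double quote
    "\u2013": "-",    # en dash
    "\u2014": "--",   # em dash
    "\u2026": "...",  # horizontal ellipsis
    "\u00a0": " ",    # non-breaking space
}

_TABLE = str.maketrans(REPLACEMENTS)


def fix_cp1252_controls(text: str) -> str:
    """Repair characters in U+0080..U+009F.

    These usually appear when Windows-1252 bytes were decoded as Latin-1,
    so each one is re-read as a single Windows-1252 byte.
    """
    out = []
    for ch in text:
        if "\x80" <= ch <= "\x9f":
            try:
                ch = bytes([ord(ch)]).decode("cp1252")
            except UnicodeDecodeError:
                pass  # byte is undefined in Windows-1252; leave it alone
        out.append(ch)
    return "".join(out)


def sanitize(text: str) -> str:
    """Repair mis-decoded characters, then replace typographic ones."""
    return fix_cp1252_controls(text).translate(_TABLE)


if __name__ == "__main__":
    print(sanitize("\u201cHi\u201d \u2014 it\u2019s\u2026"))
```

The same lookup-table idea should carry over to a Drupal input filter, for example with PHP's `strtr()` and an array of replacement pairs, though that port is untested here.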

Justin
