The Content Handler Plugin
After reading this guide, you will be able to:
- Understand what Content Handler are and how to use them
- Use the Content Handler API to create, modify and
- Extend Content Handler with custom implementations
This guide is currently work-in-progress.
1 What are Content Handlers?
Aloha Content Handlers (Content Handler) are used to sanitize content loaded or pasted in Aloha Editor.
There are some Content Handler available:
- Generic
- Word
- oEmbed
- Sanitize
Plugins also provide Content Handler:
- Block common/block plugin
There are three hooks available:
- insertHtml in common/commands plugin
- initEditable in core/editable.js
2 Enabling the Content Handler Plugin
2.1 Using Configuration
Aloha.settings.contentHandler: { insertHtml: [ 'word', 'generic', 'oembed', 'sanitize' ], initEditable: [ 'sanitize' ], getContents: [ 'blockelement', 'sanitize', 'basic' ], sanitize: 'relaxed' // relaxed, restricted, basic, allows: { elements: [ 'strong', 'em', 'i', 'b', 'blockquote', 'br', 'cite', 'code', 'dd', 'div', 'dl', 'dt', 'em', 'i', 'li', 'ol', 'p', 'pre', 'q', 'small', 'strike', 'sub', 'sup', 'u', 'ul'], attributes: { 'a' : ['href'], 'blockquote': ['cite'], 'q' : ['cite'] }, protocols: { 'a' : {'href': ['ftp', 'http', 'https', 'mailto', '__relative__']}, // Sanitize.RELATIVE 'blockquote': {'cite': ['http', 'https', '__relative__']}, 'q' : {'cite': ['http', 'https', '__relative__']} } }, handler: { generic: { transformFormattings: false // this will disable the transformFormattings method in the generic content handler }, sanitize: { '.aloha-captioned-image-caption': { elements: [ 'em', 'strong' ] } } } }
For the sanitize Content Handler you can use a predefined set of allowed HTML-tags and attributes (“sanitize” option) or your own set defined as “allows” option. The configuration option for “insertHtml” and “initEditable” defines which Content Handler should be used at that hook points. “initEditable” uses the “sanitize” Content Handler by default. “insertHtml” uses all registered Content Handler by default (including the Content Handler from other Plugins)
The order how the handlers are loaded and executed is important. If you first load the “sanitize” or “generic” handler the “word” handler will not be able to detect a MS Word document.
3 APIs and Extension Points
There is the Content Handler Manager in Aloha Core available where Content Handler can register themself.
3.1 Writing your own Content Handler
define( ['aloha', 'jquery', 'aloha/contenthandlermanager'], function(Aloha, jQuery, ContentHandlerManager) { "use strict"; var MyContentHandler = ContentHandlerManager.createHandler({ handleContent: function( content ) { // do something with the content return content; // return as HTML text not jQuery/DOM object } }); return MyContentHandler; });
All Content Handler needs to support the “handleContent” method and will receive and return the content as HTML text.
4 Internals
5 Future Work
6 Changelog
- October 06, 2011: Initial version by Rene Kapusta
From plugins.texile
7 Contenthandler Plugin
The Contenthandler Plugin has no user interface and is used in conjunction with the paste plugin to be able to handle pasted content from Microsoft Word and the like. Currently it provides three so called “Content Handlers” which will be used to cleanup html content on various occasions like pasting or when initializing an editable. Those contenthandlers are:
- word
- generic
- sanitize
7.1 Word Content Handler
The Word Content Handler will detect content pasted from Microsoft Word 2003 and newer versions. It will:
- remove all “mso-*” classes
- remove all divs and spans from the pasted content
- remove font tags
- convert bullet points to unordered lists
- transform titles to h1 and subtitles to h2 tags
- keep bold and italic formatting, headlines, tables and lists but get rid of all other formattings
7.2 Generic Content Handler
The Generic Content Handler is a bit less generic than his name might suggest as he will apply the following cleaning actions:
- clean lists by removing any non-ul or li children
- cleaning tables by removing the following attributes
- border
- cellpadding
- cellspacing
- width
- height
- valign
- remove comments
- unwrap contents of div, font and span tags (unless the span’s are language annotations made by the wai-lang plugin)
- remove styles
- remove namespaced elements
- transform formattings by turning
- <strong> tags into <b> tags
- <em> to <i>
- <s> and <strike> to <del>
- and by removing <u> tags (as underline text decoration should solely be used for links)
7.3 Sanitize Content Handler
The Sanitize Content Handler does not work reliably on IE7, and will therefore just do nothing, if this browser is detected.
The Sanitize Content Handler will remove all dom elements and attributes not covered by it’s configuration. You may specify your own configuration based on these default settings:
Aloha.settings.contentHandler.sanitize = { // elements allowed in the content elements: [ 'a', 'abbr', 'b', 'blockquote', 'br', 'cite', 'code', 'dd', 'del', 'dl', 'dt', 'em', 'i', 'li', 'ol', 'p', 'pre', 'q', 'small', 'strike', 'strong', 'sub', 'sup', 'u', 'ul' ], // attributes allowed for specific elements attributes: { 'a' : ['href'], 'blockquote' : ['cite'], 'q' : ['cite'], 'abbr': ['title'] }, // protocols allowed for certain attributes of elements protocols: { 'a' : {'href': ['ftp', 'http', 'https', 'mailto', '__relative__']}, 'blockquote' : {'cite': ['http', 'https', '__relative__']}, 'q' : {'cite': ['http', 'https', '__relative__']} } }