Archive

Archive for February, 2013

Knockout.js: HtmlNoScript binding

February 14th, 2013 No comments

    The Knockout.js is one of the most popular and fast-developing libraries that bring the MVVM pattern into the world of JavaScript development. Knockout.js allows to declaratively bind UI (Document Object Model elements) to an underlying data model. One of the built-in bindings is the html-binding allowing to display an arbitrary Html provided by an associated property/field of the data model. The typical use of this binding may look like this:

<div data-bind="html: comment" />

where the comment property may be specified as shown in the following data model:

...
// include the knockout library
<script src="js/knockout/knockout-2.2.0.debug.js" type="text/javascript"></script>
...
$(function () {
	// define a data model
	var viewModel = function () {
        var self = this;		
		self.comment = ko.observable("some html comment <span>Hello<script type='text/javascript'>alert('Hello!');<\/script></span>");
		...
	}
	...
	// create instance of the data model
	var viewModelInstance = new viewModel();
	// bind the instance to UI (since that moment all of the declarative bindings have worked off and the data are displayed)
	ko.applyBindings(viewModelInstance);
});

Applying a value to a DOM element, the html-binding exploits the jQuery‘s html() method, of course, if jQuery is available on the page; otherwise knockout.js relies on its own logic that ends up with call of the DOM‘s appendChild function. In both cases if the html being applied contains inclusions of JavaScripts, the scripts execute once the html has been added to the DOM. Preventing all unwanted scripts from running is the best practice when displaying random htmls on the page. Using the approach described in my post How to prevent execution of JavaScript within a html being added to the DOM we are able to develop our own knockout binding disabling the scripts. It could look like the following:

 ko.bindingHandlers.htmlNoScripts = {
    init: function () {
        // Prevent binding on the dynamically-injected HTML
        return { 'controlsDescendantBindings': true };
    },
    update: function (element, valueAccessor, allBindingsAccessor) {
        // First get the latest data that we're bound to
        var value = valueAccessor();
        // Next, whether or not the supplied model property is observable, get its current value
        var valueUnwrapped = ko.utils.unwrapObservable(value);
		// disable scripts
        var disarmedHtml = valueUnwrapped.replace(/<script(?=(\s|>))/i, '<script type="text/xml" ');
        ko.utils.setHtml(element, disarmedHtml);
    }
};
 

An alternative way to prevent scripts from running is to use the innerHTML property of a DOM element as follows:

 var html = "<script type='text/javascript'>alert('Hello!');<\/script>";
 var newSpan = document.createElement('span');
 newSpan.innerHTML = str;
 document.getElementById('content').appendChild(newSpan);
 

Some people, however, report that this doesn’t work in Internet Explorer 7 for unknown reasons. So, let’s combine these two methods in one to minimize the risk of unwanted scripts running. My tests show the combined approach successfully prevent JavaScript from being executed in all popular browsers. Even in the worst case, at least one of the methods works off. The resultant knockout binding may look like the following:

ko.bindingHandlers.htmlNoScripts = {
    init: function () {
        // Prevent binding on the dynamically-injected HTML
        return { 'controlsDescendantBindings': true };
    },
    update: function (element, valueAccessor, allBindingsAccessor) {
        // First get the latest data that we're bound to
        var value = valueAccessor();
        // Next, whether or not the supplied model property is observable, get its current value
        var valueUnwrapped = ko.utils.unwrapObservable(value);
        // disable scripts
        var disarmedHtml = valueUnwrapped.replace(/<script(?=(\s|>))/i, '<script type="text/xml" ');        
        // create a wrapping element
        var newSpan = document.createElement('span');
        // safely set internal html of the wrapping element
        newSpan.innerHTML = disarmedHtml;
        // clear the associated node from the previous content
        ko.utils.emptyDomNode(element);
        // add the sanitized html to the DOM
        element.appendChild(newSpan);
    }
};

This htmlNoScripts binding can be used as follows:

<div data-bind="htmlNoScripts: comment" />

Related posts:

JavaScript: How to prevent execution of JavaScript within a html being added to the DOM

February 5th, 2013 No comments

    Getting pieces of html from an external web service, I display them “as is” on a page. To insert the htmls into the document I use handy jQuery methods: append() and html(). The only problem is the htmls contain inclusions of JavaScript that is executed whenever I add the htmls to the DOM (Document Object Model). Having looked for an acceptable solution to prevent untrusted scripts from running I found out that almost all solutions come to removing script-tags from html using the Regular Expressions. I tried a couple of regexes, and they really worked. However, every time I could find such a combination of script-tag, JavaScript inside it and wrapping html when the regexes either didn’t recognize a script block or cut out more than it was needed. And I don’t even mention that parsing an arbitrary HTML with the Regular Expressions is a bad idea in all respects :). Trying to find another solution, I recalled that if we replace the type=”text/javascript” of a script-tag with the type=”text/xml”, the JavaScript inside will not execute. This is quite known fact, and a few JavaScript libraries use this behavior for their purposes. However, two things impeded me to implement the direct replacing of one type with another: the type=”text/javascript” sub-string may encounter outside the <script>, somewhere in the content (like in this article 🙂 ); the script-tag may be without the type attribute at all, and, in this case, the code inside will be run as if the type=”text/javascript” is specified. Taking into account these conditions, I developed a few tricky regexes, that seemed to be fairly complicated and not reliable enough though. After all, this led me to a thought to examine what happens if the script-tag has two type attributes. Something like this:

<script type="text/xml" type="text/javascript">
    alert('Hello!');
</script>

All browsers I tested this in didn’t execute JavaScript. On the contrary, if I change the type attributes over, the script is executed. Only the first type attribute seems to be efficient. So, my resultant solution is based on the following assumptions:

  • JavaScript inside the <script type=”text/xml”> doesn’t execute;
  • JavaScript inside the <script type=”text/xml” type=”text/javascript”> doesn’t execute as well because only the first type is considered;

So, below is a very simple JavaScript function, which prevents scripts from running and seems to avoid most of drawbacks:

function preventJS(html) {
    return html.replace(/<script(?=(\s|>))/i, '<script type="text/xml" ');
}

The solution still relies on the Regular Expressions, but the regex itself is quite simple and unambiguous. The script-tags are preserved in the html, so you can treat them later in a manner you want.

Below is an example of use

<html>
  <head>
    <title>Prevent scripts from running</title>
    <script src="jquery-1.8.3.js" type="text/javascript"></script>
  </head>
  <body>
    <div id="content"></div>
    <script type="text/javascript">
      $(function () {
        var html1 = "<span>Hello 1<script type='text/javascript'>alert('Hello 1!');<\/script></span>"
        var html2 = "<span>Hello 2 &lt;script&gt; alert('Hi') &lt;/script&gt; <script   type='text/javascript'>alert('Hello 2!');<\/script></span>";
        var html3 = "<script src=\"someJs.js\" type='text/javascript'><\/script>";
        var html4 = "<script>alert('Hello 4!');<\/script>";
        var html5 = "<scriptsomeAttr >alert('Hello 5!');";

        $("#content").html(preventJS(html1));
        $("#content").append(preventJS(html2));
        $("#content").append(preventJS(html3));
        $("#content").append(preventJS(html4));
        $("#content").append(preventJS(html5));
      });

      function preventJS(html) {
        return html.replace(/<script(?=(\s|>))/i, '<script type="text/xml" ');
      }
    </script>        
  </body>
</html>

I tested this solution in Google Chrome v. 24.0.1312.56, FireFox v. 18.0.1 and Internet Explorer 9 v.9.0.8112.16421, and it works as directed. However, I still have doubts whether it’s applicable for all browsers and their versions. So, if you have tested it, please don’t hesitate to post a comment here with the result, browser’s name and version you use.

For test purposes you can download a html page and a couple of js-files here. If the preventing works correctly, having opened the page in a browser, you shouldn’t get any alerts.

Related posts:
Categories: JavaScript, jQuery, Regex Tags: , ,