
What Is The Maximum Depth Of Html Documents In Practice?

I want to allow embedding of HTML but avoid DoS due to deeply nested HTML documents that crash some browsers. I'd like to be able to accommodate 99.9% of documents, but reject tho

Solution 1:

It may be worth asking coderesearch@google.com. Their study from 2005 (http://code.google.com/webstats/) doesn't cover your particular question. They sampled more than a billion documents though, and are interested in hearing about anything you feel is worth examining.

--[Update]--

Here's a crude script I wrote to test the browsers I have (putting the number of elements to nest into the query string):

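// Number of elements to nest, read from the query string (e.g. ...?3000).
var n = Number(window.location.search.substring(1));

var outboundHtml = ''; // opening tags, each followed by its nesting level
var inboundHtml = '';  // matching closing tags

for (var i = 0; i < n; i++)
{
    outboundHtml += '<div>' + (i + 1);
    inboundHtml += '</div>';
}

// Write the nested markup into a new window.
var testWindow = window.open();
testWindow.document.open();
testWindow.document.write(outboundHtml + inboundHtml);
testWindow.document.close();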

And here are my findings (they may be specific to my machine: Win XP, 3 GB RAM):

  • Chrome 9: 3218 nested elements will render, 3219 crashes the tab. (Chrome 9 is old, I know; the updater fails on my corporate LAN.)
  • Safari 5: 3477 will render, 3478 browser closes completely.
  • IE8: 1000000+ will render (memory permitting), although performance degrades significantly once you get into high four-figure numbers, due to event bubbling when scrolling, moving the mouse, etc. Anything over 10000 appears to lock up; I think it is just taking a very long time, but that is effectively a DoS.
  • Opera 11: Just limited by memory as far as I can tell, i.e. my script runs out of memory for 10000000. For large documents that do render though, there doesn't seem to be any performance degradation like in IE.
  • Firefox 3.6: ~1500000 will render, but testing above this range resulted in the browser crashing with Mozilla Crash Reporter or just hanging. Sometimes a number that had worked would fail on a subsequent run, while larger numbers (~1700000) would crash Firefox straight from a restart.

More on Chrome:

Changing the DIV to a SPAN let Chrome nest 9202 elements before crashing. So the raw size of the HTML is not what triggers the crash (although SPAN elements may simply be more lightweight).

Nesting 2077 table cells (<table><tr><td>, i.e. 6231 elements) rendered, but the tab crashed once you scrolled down to cell 445. So in practice you can't nest 445 table cells (1335 elements).

Testing with files generated from the script (as opposed to writing to new windows) gives slightly higher tolerances, but Chrome still crashed.
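To repeat these tests with different element types, the generator from the original script can be generalised. The following is only a sketch: buildNestedHtml and its tagChain parameter are hypothetical names, not part of the script above. It returns the markup as a string, which could then be written to a new window or saved to a file.

```javascript
// Hypothetical helper (not from the original script): nests `repeats`
// copies of a tag chain, e.g. ['table', 'tr', 'td'], inside each other.
function buildNestedHtml(repeats, tagChain) {
    var open = '';
    var close = '';

    // Build one unit, e.g. '<table><tr><td>' and '</td></tr></table>'.
    for (var i = 0; i < tagChain.length; i++) {
        open += '<' + tagChain[i] + '>';
        close = '</' + tagChain[i] + '>' + close;
    }

    var outboundHtml = '';
    var inboundHtml = '';

    // Same idea as the original loop: open everything, then close everything.
    for (var j = 0; j < repeats; j++) {
        outboundHtml += open + (j + 1);
        inboundHtml += close;
    }

    return outboundHtml + inboundHtml;
}
```

For example, buildNestedHtml(2077, ['table', 'tr', 'td']) produces the 2077 nested table cells (6231 elements) described above, and buildNestedHtml(n, ['ul', 'li']) produces nested list items.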

You can nest 1409 list items (<ul><li>) before it crashes, which is interesting because:

  • Firefox stops indenting list items after 99, possibly a hard-coded limit.
  • Opera keeps indenting, with glitches at 250, 376, 502, 628, 754, 880...

Setting a DOCTYPE makes a difference in IE8 (it puts it into standards mode, i.e. var outboundHtml = '<!DOCTYPE html>';): it then fails to nest 792 list items (the tab crashes/closes) or 1593 DIVs. It made no difference in IE8 whether the test was generated from the script or loaded from a file.

So the nesting limit of a browser apparently depends on the type of HTML elements the attacker is injecting, and on the layout engine. Some other HTML could well trigger a crash with an even smaller payload than the ones I tried. Either way, we have a plain-HTML DoS for IE8, Chrome and Safari users with a fairly small payload.

It seems that if you are going to allow users to post HTML that gets rendered on one of your pages, it is worth limiting the depth of nested elements, particularly if the size limit on posts is generous.
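One way such a limit might be checked before accepting a post is sketched below. This is a rough, deliberately conservative approximation and not a real HTML parser: maxNestingDepth and exceedsDepthLimit are hypothetical names, tags are found with a regular expression, and unclosed or implicitly closed tags are simply counted as still open, so it can overestimate the depth. Markup inside comments or scripts is counted too, and attribute values containing '>' will confuse it.

```javascript
// Void elements never have children, so they do not add depth.
var VOID_ELEMENTS = {
    area: 1, base: 1, br: 1, col: 1, embed: 1, hr: 1, img: 1,
    input: 1, link: 1, meta: 1, source: 1, track: 1, wbr: 1
};

// Hypothetical helper: estimates the deepest nesting level in an HTML string.
function maxNestingDepth(html) {
    // Group 1: '/' for a closing tag. Group 2: the tag name.
    var tagPattern = /<(\/?)([a-zA-Z][^\s\/>]*)[^>]*>/g;
    var depth = 0;
    var maxDepth = 0;
    var match;

    while ((match = tagPattern.exec(html)) !== null) {
        var name = match[2].toLowerCase();

        if (VOID_ELEMENTS.hasOwnProperty(name)) {
            continue; // <br>, <img>, ... cannot contain other elements
        }

        if (match[1] === '/') {
            if (depth > 0) {
                depth--;
            }
        } else {
            // '<div/>' is counted as an opening tag on purpose: HTML parsers
            // ignore the trailing slash on non-void elements.
            depth++;
            if (depth > maxDepth) {
                maxDepth = depth;
            }
        }
    }

    return maxDepth;
}

// Hypothetical helper: true if the markup nests deeper than `limit`.
function exceedsDepthLimit(html, limit) {
    return maxNestingDepth(html) > limit;
}
```

The limit itself is a judgment call. The figures above suggest it should sit well below the low thousands, and the WebKit parser default of 512 quoted in Solution 2 is one possible reference point. For anything security-sensitive, measuring depth on the output of a real HTML parser or sanitizer would be more reliable than this kind of tag scan.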

Solution 2:

For WebKit, the maximum document depth is configurable, but by default it is 512:

http://trac.webkit.org/browser/trunk/Source/WebCore/page/Settings.h#L408

static const unsigned defaultMaximumHTMLParserDOMTreeDepth = 512;
