68184764 an Attack Resistant and Rapid Recovery Desktop System

  • 7/31/2019 68184764 an Attack Resistant and Rapid Recovery Desktop System

    1/160

    CLARKSON UNIVERSITY

    An Attack-Resistant and Rapid Recovery Desktop System

    A Dissertation by

    Todd Deshane

    Coulter School of Engineering

    Submitted in partial fulfillment of the requirements

    for the degree of

    Doctor of Philosophy

    Engineering Science

    August 2010

    © Todd Deshane 2010

    Accepted by the Graduate School

    Date DEAN

    UMI Number: 3428987

    All rights reserved

    INFORMATION TO ALL USERS

    The quality of this reproduction is dependent upon the quality of the copy submitted.

    In the unlikely event that the author did not send a complete manuscript

    and there are missing pages, these will be noted. Also, if material had to be removed,

    a note will indicate the deletion.

    UMI 3428987

    Copyright 2010 by ProQuest LLC.

    All rights reserved. This edition of the work is protected against

    unauthorized copying under Title 17, United States Code.

    ProQuest LLC

    789 East Eisenhower Parkway

    P.O. Box 1346

    Ann Arbor, MI 48106-1346

    The undersigned have examined the dissertation entitled "An Attack-Resistant and

    Rapid Recovery Desktop System" presented by Todd Deshane, a candidate for the

    degree of Doctor of Philosophy, Engineering Science and hereby certify that it is worthy of

    acceptance.

    Date

    EXAMINING COMMITTEE

    Dr. Susan Conry

    Dr. Daqing Hou

    Dr. Robert Meyer

    Dr. Joachim Stahl

    ADVISOR Dr. Jeanna Matthews

    CLARKSON UNIVERSITY

    An Attack-Resistant and Rapid Recovery Desktop System

    By: Todd Deshane

    Advisor: Jeanna Matthews

    Abstract

    General-purpose computing devices, such as personal computers (PCs), and the operating

    systems that run on them provide more functionality and capabilities than most users will

    ever want or need. Too much of the burden of keeping these computer systems secure is

    placed on the end users. Users are often required to keep the operating system, applications,

    security software, and anti-virus definitions up-to-date. Even with the latest security

    updates, users are still susceptible to the newest exploits. When a system does become

    compromised, the process of restoring it to a usable state can frequently result in the loss

    of personal data stored on the system. Personal data can often only be recovered through

    repeated effort and in some cases can never be recovered. Malicious software (malware)

    is not the only source of problems on a computer system. Software bugs and conflicting

    software packages can also cause system instability as well as data corruption.

    In this dissertation, we present a unique desktop system architecture solution to the

    pervasive problem of recovering from malware attacks. We demonstrate our architecture

    with an open source implementation of our Rapid Recovery Desktop system that provides

    resistance against attack and rapid recovery from broken system state and malware

    infestation. Our system combines a file server virtual machine (FS-VM), a network virtual

    machine (NET-VM), a virtual machine contract system, and a virtualization security

    framework (OSCKAR) to isolate applications, control their access, and limit their privileges.

    We measured the system's performance overhead and evaluated the security and recovery

    benefits.

    Acknowledgements

    I'd like to thank God for giving me the opportunity to get a PhD. His plans are always so

    much greater than even my wildest dreams. Randy Pausch, in his Last Lecture, said, "luck

    is where preparation meets opportunity." This is a very insightful quote. I like to think

    and live my life by an analogous definition, that of grace. Grace is where faith meets God's

    blessings. Grace is also defined as getting what you don't deserve. I feel that I am given

    tremendous blessings every day and thank God that He is so gracious to me.

    I wouldn't have made it this far in my PhD endeavors if it wasn't for my wife, Patty.

    She is my everything and helps me so much each day. She has been by my side helping me

    with all of the little details of the whole process. It truly was a journey, one that I am so

    grateful that we have been able to take together. Not only is this the end of one chapter of

    our lives, but it is the beginning of the next.

    I have a wonderful family and great friends. They have been so supportive of me in

    everything that I do. I know that they are so proud of me and my accomplishments, but

    I'm glad that I have their love and support. They make changing the world worth all of the

    work.

    My advisor, Jeanna Matthews, has been a true inspiration to me. Her passion for

    making a difference, helping people, and her philosophy of leaving things better than you

    found them have made a significant impact on my life in many ways. I am thankful for

    her endless patience, guidance, and wisdom. I also thank her for helping me develop such

    a strong interest in networking, systems, and open source.

    The Applied Computer Science Labs at Clarkson, specifically the Clarkson Open Source

    Institute (COSI) and the Internet Teaching Laboratory (ITL), have also been another

    wonderful inspiration and help to me. I have met so many people in the labs and developed so

    many great relationships over the years. I appreciate the many conversations and feedback

    that I have received from them over the years on my research, life, and the world in general.

    I would like to thank my PhD committee for their thorough and insightful feedback

    on my dissertation. Also thanks to the many reviewers that read early drafts of chapters,

    paragraphs, and ideas. The conversations and feedback that I have received have turned

    this work into something that I could not have come up with all by myself.

    I'd like to thank IBM for their generosity to higher education and specifically for funding

    me for two years with consecutive IBM PhD Fellowships. My mentors during the program,

    Sean Dague and Rick Harper, are still influential and impressive to me today. I'm so

    fortunate to have met both of them.

    I know that by listing specific people I will inevitably leave someone out. However, I

    would like to acknowledge by name several close friends and colleagues who I have met

    during my PhD career. These guys have worked with me in one way or another over the

    years and they deserve recognition. A big thanks to Eli M. Dow, Wenjin Hu, Patrick F.

    Wilbur, Jim Owens, and Tao Yang. You guys are next for getting your PhDs. I pass that

    torch onto you. Best of luck.

    Last, but certainly not least, I would be remiss to forget to thank the open source

    community for producing great free software, helping me understand things, and having

    patience to help answer even the most trivial of questions. I hope that I can somehow give

    back to a free and open source software (FOSS) community that is very deserving of my

    efforts and thanks.

    Contents

    1 Introduction

    1.1 Motivation

    1.1.1 The Current State of Malware

    1.1.2 Challenges to Change

    1.1.3 An Overview of Our Approach

    1.2 Contribution

    1.3 Organization of the Dissertation

    2 Related Work

    2.1 Virtualization

    2.1.1 Virtualization Types

    2.1.2 History and Evolution

    2.1.3 Virtual Appliances

    2.1.4 Virtual Machine Contracts

    2.2 Security

    2.2.1 The Principle of Least Privilege

    2.2.2 Isolation

    2.2.3 Access Control

    2.3 Virtualization and Security

    2.3.1 Virtualization and Isolation

    2.3.2 Virtualization and Access Control

    2.4 Backup and Recovery

    2.5 Network Security and Intrusion Detection

    2.6 Anti-virus Software and Host-based Intrusion Detection Systems

    3 Architecture

    3.1 Design Principles

    3.1.1 Virtualization Scope

    3.1.2 Security Scope

    3.1.3 Environment Scope

    3.1.4 Open Source and Open Standards

    3.1.5 Threat Model and Assumptions

    3.2 Virtualization

    3.3 Virtual Appliances

    3.3.1 Whole Desktop in a Single Appliance

    3.3.2 Grouping Applications Based on Access Needs

    3.3.3 One Application Per Virtual Appliance

    3.4 Virtual Machine Contracts

    3.5 Virtualization Security Framework

    3.5.1 Virtualization Security Framework and Virtual Machine Contracts

    3.6 File System Level Architecture

    3.7 Network Level Architecture

    3.7.1 Hardening the Overall System

    4 Implementation

    4.1 Virtualization Components

    4.1.1 Hypervisor Component

    4.1.2 VMM Interface

    4.1.3 Builder Interface

    4.2 OSCKAR

    4.3 Virtual Machine Contracts

    4.4 FS-VM

    4.5 NET-VM

    4.6 Example Virtual Appliance

    4.6.1 Browser Virtual Appliance

    5 Evaluation

    5.1 Performance

    5.1.1 Virtualization Overhead

    5.1.2 Enforcement Overhead

    5.2 Effectiveness

    5.2.1 Malware Classification Analysis

    5.2.2 Evaluation of Recovery Properties

    6 Conclusion

    6.1 Future Work

    6.1.1 HCI-SEC

    6.1.2 Malware Collection and Analysis

    6.1.3 Implementation-related Improvements

    6.1.4 Application to Other Environments

    A Performance Results

    A.1 Details of Performance Results for this Dissertation

    A.2 Other Related Performance Evaluation

    List of Figures

    3.1 The architecture of our Rapid Recovery Desktop from a simplified network view

    3.2 The Progression of Virtual Appliance Decomposition

    3.3 The Architecture of our OSCKAR Virtualization Security Framework

    3.4 The architecture of our Rapid Recovery Desktop from a file system view

    3.5 The architecture of our Rapid Recovery Desktop from a network view

    4.1 Our Architecture on an Integrated Hypervisor (such as KVM)

    4.2 Our Architecture on a Stand-alone Hypervisor (such as Xen)

    4.3 VMM rule set that uses the generic vmm backend chosen by the VMM interface

    4.4 VMM rule set that uses the specific qemu-spice backend

    4.5 Builder rule set

    4.6 A Sample Policy Manager Global Contract. Note that the $ARG is simply

    the argument to the event. $ARG is replaced with the argument passed

    during the event at runtime

    4.7 Process of importing virtual machine contract (VMC) and starting VM

    4.8 Overview of contract types, rule sets, and events supported by our contract system

    4.9 General Virtual Machine Contract (VMC) Format

    4.10 FS-VM example rule set

    4.11 NET-VM example rule set

    4.12 Rate-limiting NET-VM example rule set

    4.13 Browser Appliance Virtual Machine Contract (VMC) (1 of 3)

    4.14 Browser Appliance Virtual Machine Contract (VMC) Continued (2 of 3)

    4.15 Browser Appliance Virtual Machine Contract (VMC) Continued (3 of 3)

    5.1 Linux guest read performance

    5.2 Linux guest write performance

    5.3 Windows guest read performance

    5.4 Windows guest write performance

    5.5 Windows guest read performance using a variety of virtual disk backends

    5.6 Windows guest write performance using a variety of virtual disk backends

    5.7 Linux guest networking performance

    5.8 Windows guest networking performance

    5.9 Linux guest to FS-VM read performance

    5.10 Linux guest to FS-VM write performance

    5.11 Linux guest networking performance

    Chapter 1

    Introduction

    1.1 Motivation

    General purpose computing devices, such as personal computers (PCs), and the operating

    systems that run on them provide more functionality and capabilities than most users will

    ever want or need. For example, these computing devices can send large quantities of emails

    in seconds (on a scale proportional to the network bandwidth and computer power). A user

    is unlikely to ever need to send as many emails in a lifetime as their computing device could

    send in a day. However, malicious software (malware) seeks to take advantage of any spare

    computing power that it can control, making full use of the frequently spare functionality

    that general purpose computing devices and operating systems provide. One clear example

    of this phenomenon is the commonly accepted and reported fact that over 90% of all email

    is spam [78]. Although it is difficult to determine how much of this spam is sent by home

    user PCs, it is estimated that 95% of all spam is sent by botnets [30], which are composed

    of a variety of zombie computers including many home user PCs.

    Too much of the burden of keeping a computer system secure is placed on the end users.

    Users are often required to keep the operating system, applications, security software, and

    anti-virus definitions up-to-date. Non-malicious or accidental incidents, such as system or

    software updates, can cause more noticeable problems to users, since, unlike malware, these

    incidents are not aiming to hide undetected in a user's computer. These incidents can cause

    system instability and, in the worst case, make the system unusable [33, 77, 108, 160]. As

    a result, users often disable or refuse to perform updates [46, 56, 158]. Even with all the

    latest security updates, users are still susceptible to zero-day exploits, which are exploits

    that have not been seen before and thus are not detected by traditional signature-based

    security software.

    When an end user falls victim to any sort of malware, such as a virus, a commonly

    recommended course of action is to make backups of any critical data and then to wipe

    the system completely and re-install. Throwing the computer away and buying a new one

    is considered by some to be easier than getting rid of the malware through conventional

    means [137,147].

    Not only can malware take down the system, but it can cause the user to lose personal

    data, such as pictures or documents. The most diligent of users will make sure the latest

    updates are installed, keep backups of their personal data, and be careful not to click on

    anything suspicious. Taking these defensive measures can reduce the chance of system

    downtime and data loss, but requires significant effort on the part of the user. Several

    recent studies indicate that most users are unwilling to perform updates or back up their

    systems [2, 56, 158]. Other studies indicate that many users are unable to adequately assess

    risk and will make poor security decisions to attain their goals [107, 153].

    Fully restoring a compromised system can be an agonizing process often involving

    re-installing the operating system and user applications. This can take hours or days even with

    all the proper materials readily on hand. For average users, even assembling the installation

    materials (for example, CDs, manuals, and configuration settings) may be an overwhelming

    task, not to mention correctly installing and configuring each piece of software. Hiring

    a professional to restore the system and applications can be expensive and may require

    purchasing new software licenses.

    To make matters worse, the process of restoring a compromised system to a usable

    state can frequently result in the loss of any personal data stored on the system. From

    the user's perspective, this is often the worst outcome of an attack. System data may be

    challenging to restore, but it can be restored from public sources. Personal data, however,

    can only be restored from private backups and the vast majority of personal computer users

    do not routinely back up their data. Once lost, personal data can only be recovered through

    repeated effort (for example, rewriting a report) and in some cases can never be recovered

    (for example, digital photos of a one-time event).

    1.1.1 The Current State of Malware

    A trusting, naive design of the Internet and powerful, general purpose, commodity computer

    systems have led to widespread security problems. The Internet was originally developed

    by and for the government and universities and it was used in a trusting manner to share

    information. The explosive growth of the World Wide Web, starting in the 1990s [118],

    brought with it millions of Internet users, not all of whom had benign intentions. Malicious

    hackers¹ exploited an Internet that wasn't built with security in mind. To make matters

    worse, the default configuration of the most popular operating system of the time was for

    users to run with full administrative privileges. Thus, a virus that ran as the user had

    full access to the system. Since that time, many security measures, such as public-key

    cryptography, firewalls, and intrusion detection systems, have been added to the Internet

    infrastructure. Commodity operating systems have also added security features, such as

    built-in firewalls, user access control, and system restore.

    Despite these efforts, global scale security problems, such as widespread malware and

    botnet activity [13,62,71,90, 102,110], still exist. The first well-documented computer worm

    of 1988 [149] exploited several widely used programs. Due to limited system security and

    an overall trusting Internet infrastructure, it was able to spread quickly across much of

    the Internet causing much disruption. Computer malware in the early days was often the

    work of curious or playful individuals seeking to exploit for experimentation, exploration,

    prank, or vandalism. This early malware generally led to minor disruption and annoyance,

    but rarely led to much damage or loss to individuals or organizations. However, modern

    malware, particularly over the past decade or so, has been primarily used by organized

    crime to exploit and profit from users and all kinds of organizations [79,120].

    ¹ Malicious hackers are more accurately defined as crackers (http://catb.org/jargon/html/C/cracker.html),

    but the term hacker is commonly (mis)-used.

    Organized crime has set up shop all across the Internet, often in the form of botnets.

    A botnet is a distributed network of computers, controlled remotely by malicious hackers,

    that can be put into action on demand to perform distributed denial-of-service (DDoS)

    attacks on targeted websites, run mass e-mail spam campaigns to sell pharmaceuticals,

    or promote the page rank of other hijacked sites. All of these actions can be taken by

    malicious hackers in an effort to exploit more users and systems in order to increase profits

    and the size of their botnets.

    There is a growing black market of professional malware and exploit kits, which may even

    come with tech support [120]! These exploit kits consist of various automated tools that can

    be used to trick and exploit users. An example exploit kit might include a tool that automates

    account creation and simulates genuine activity on popular social networking websites,

    such as Facebook or Twitter. The tool could also support controlling fake or

    stolen accounts by a botnet. These accounts could then be used carefully and deceitfully to

    gain trust among real social networking users in order to serve them targeted spam links or

    hijack their personal information with phishing techniques. Profit can then be made by, for

    example, using the personal information for identity theft, link-referral affiliate programs,

    and tricking users to click on the various spam links [120].

    Another common method used to spread malware is the drive-by-download. A drive-by-download

    attack tricks users into visiting sites by using spam

    links or typo-squatting (registering commonly misspelled domains) and then automatically

    installs malicious binaries. A related attack method is tricking users into clicking to

    install fake plugins or fake anti-virus software that is actually malware [131]. These

    malware-infected computers are then used in the botnets to exploit more users and systems.

    Over time, security of software improves and users are trained to be on the lookout, thus

    forcing malicious hackers to find new ways to spread their malware. For instance, a more

    recent trend is to use search engine optimization (SEO) tricks to promote malware sites to

    the top of the search results for trendy and popular search terms. A user searching for "Bill

    Clinton" to find out about his recent heart operation would likely have found themselves

    downloading fake anti-virus that is actually malware [63]. In a recent study [133], fake

    anti-virus was found to account for 15% of all malware detected on the web using Google's

    malware detection infrastructure. The specific details of these various types of attacks have

    changed, but the general nature of the attacks has not. Attackers rely on exploiting systems

    or tricking users to spread their malware. The attacks will target whatever is popular, which

    might mean Facebook or Twitter today, but could mean other popular technologies, such

    as smart phones or new web-based applications, in the near future. These attacks are not

    slowing down and are likely to only get more sophisticated [66,114,179].

    1.1.2 Challenges to Change

    The same basic exploit techniques have been used by malicious hackers for quite some time.

    One of the main reasons that these techniques still work in practice is because general

    purpose operating systems are designed to allow applications to run with the full privileges

    of the user. For instance, if a user has access to read, write, or delete a file, then any

    application run by that user has access to read, write, or delete that file. This is not a new

    problem. Researchers recognized it over 20 years ago [79], yet the vast majority of

    users still don't have their applications restricted in an effective and usable way.
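
    The difference between an application inheriting all of its user's privileges and an application holding only its own explicit grants can be illustrated with a toy model. This is only a sketch: real operating systems enforce permissions in the kernel, and every name below (User, ambient_allows, restricted_allows, the file paths) is hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    """Toy user record (hypothetical, for illustration only)."""
    name: str
    # Maps a file path to the set of operations the user may perform on it.
    file_rights: dict = field(default_factory=dict)


def ambient_allows(user: User, path: str, op: str) -> bool:
    """Conventional model: an application inherits every right of its user."""
    return op in user.file_rights.get(path, set())


def restricted_allows(user: User, app_grants: dict, path: str, op: str) -> bool:
    """Least-privilege model: the application also needs its own explicit
    grant, and it can never exceed what the user is allowed to do."""
    return ambient_allows(user, path, op) and op in app_grants.get(path, set())


alice = User("alice", {
    "/home/alice/taxes.pdf": {"read", "write", "delete"},
    "/home/alice/music/song.ogg": {"read", "write", "delete"},
})

# A media player that was only ever meant to read music files.
player_grants = {"/home/alice/music/song.ogg": {"read"}}

# Under the conventional model a compromised player may delete the tax file.
print(ambient_allows(alice, "/home/alice/taxes.pdf", "delete"))
# Under the restricted model the same request is refused.
print(restricted_allows(alice, player_grants, "/home/alice/taxes.pdf", "delete"))
```

    In the conventional model the only question asked is what the user may do, so any program the user runs can act with the user's full authority.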

    One of the main reasons that no solution to this problem is used in practice

    is that solving it does not seem to fit into anybody's business model. Fixing

    the problem does not sell new computers, it does not sell new versions of operating systems,

    and it definitely does not sell new versions of security software, such as anti-virus and

    anti-malware. Software companies, such as Microsoft, publish studies that recommend the use

    of automatic updates as one of the most effective things that an organization can use to

    help prevent the spread of malware [46]. Some companies in the malware defense business

    recommend that users follow the same old security best practices [35] even in the face of

    new and more subtle threats. However, it is well known that threats can affect even people

    visiting legitimate websites [163]. Other anti-virus companies tend to take a band-aid

    type approach, promoting their products as an effective way to keep ahead of the attackers

    with automated security updates and by employing new technologies [55].

    Another reason that there is still a security problem on desktop computers is that even

    when reasonable solutions exist, they are rarely used in common practice. For example,

    mandatory access control (MAC) systems, such as SELinux [87], AppArmor [12], and

    Windows Mandatory Integrity Control (MIC) [97], go a long way toward solving the problem,

    but are not commonly used in practice. These protections are hard to use [82] and tend

    to produce too many false positives, which often leads to them being disabled. A study

    by Sunshine et al. showed that users often misunderstand the risk involved

    with SSL warnings in the browser. Another study, by Motiee et al., showed that users made

    incorrect security decisions when using Windows access control protections [107]. These two

    studies demonstrate users' tendencies to do whatever it takes (despite

    the security risk) to complete their task [153].

    Another challenge to adoption of viable solutions is that the solution must reach a

    critical mass of users to be effective. As we will show in Chapter 2 on related work, there

    are many proposed solutions in research that are unlikely to ever be used in practice. The

    critical issue is that the software needs to be usable for a wide variety of users. Even if

    the software both solves the problem and is usable, that does not imply that it is easily

    distributed. An effective means of distribution may require original equipment manufacturer

    (OEM) agreements or resource and time investment in infrastructure and staff to develop

    and foster user and developer communities. In any case, implementing ideas that

    fundamentally change and improve how a large majority of computer users work is a significant

    undertaking.

    Despite these challenges to change, we hope that the approach described in this dissertation

    changes the way users, developers, and security professionals think about the security

    of computer systems. The solution proposed in this dissertation uses well-understood security

    practices and makes use of some of the latest innovations in virtualization technology

    on commodity desktop systems. We demonstrate a desktop system that resists

    attack, recovers quickly from exploits, and minimizes the impact that any single

    exploited application can have on the system and user-specific data.

    1.1.3 An Overview of Our Approach

    Our solution is based on separating user data into a file server virtual machine (FS-VM)

    and accessing that data with virtual machine appliances, or simply virtual appliances, which

    encapsulate one or more applications. Furthermore, we associate a contract with each

    virtual appliance that describes its specific behavior in terms of basic resource requirements,

    user data access needs, and network access specifications. Contracts restrict the virtual

    appliances to the task that they were designed to do and all other access is denied by

    default.
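
As a rough illustration of the default-deny contract idea described above, the following Python sketch models a contract as three explicit allowances (resources, data access, and network access) and denies anything that is not listed. The class name, field names, paths, and ports are hypothetical and chosen only for illustration; this is not the contract format of our implementation.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Contract:
    """Hypothetical contract for one virtual appliance (illustrative only)."""
    appliance: str
    memory_mb: int                                   # basic resource requirement
    data_access: dict = field(default_factory=dict)  # path -> allowed modes
    network_access: frozenset = frozenset()          # allowed (protocol, port) pairs

    def allows_data(self, path: str, mode: str) -> bool:
        # Default deny: a path that is not listed has no allowed modes.
        return mode in self.data_access.get(path, frozenset())

    def allows_network(self, protocol: str, port: int) -> bool:
        # Default deny: only explicitly listed (protocol, port) pairs pass.
        return (protocol, port) in self.network_access


# Example contract for a hypothetical browser appliance.
browser = Contract(
    appliance="browser",
    memory_mb=512,
    data_access={"/home/user/Downloads": frozenset({"read", "write"})},
    network_access=frozenset({("tcp", 80), ("tcp", 443)}),
)
```

With this sketch, a request such as `browser.allows_data("/home/user/Documents", "read")` is refused simply because nothing in the contract grants it.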

    This architecture creates a situation in which a virtual appliance that is infected with

    malware is not able to take over the whole system. The malware will only be able to access

    a very limited set of the user's personal data and only in the manner specified by the

    virtual appliance's contract. Placing applications in virtual appliances makes recovery from

    various system problems, such as malware or malfunctioning applications, a straightforward

    process. For example, it is safe to roll back the disk image of a virtual appliance

    without affecting the user's data, since user data is stored in the FS-VM.

    From a network perspective, our architecture has a set of virtual switches that isolate

    virtual appliances and the FS-VM from internal and external attacks. A network virtual

    machine (NET-VM) component manages the virtual switches to enforce network policy.

    Just as the FS-VM only allows access to particular data, the NET-VM only allows access

    to specific network segments and only allows traffic flows that are explicitly specified in the

    virtual appliances' contracts. All other traffic is denied by default at the virtual switch level,

    which reduces the amount of network processing done on the individual virtual appliances.

    The real benefit with this type of network architecture is that any incoming connection

    attempts or outgoing connections that are not explicitly allowed by contract rules are denied

    by the NET-VM at the virtual switch level. This means that even if malware compromises

    a virtual appliance and opens up a port not specified in the virtual appliance contract,

    the NET-VM will not allow any incoming traffic to flow to that port. This is a significant

    improvement compared to firewall-based protection, since the firewall inside the virtual

    appliance can be disabled, and yet the virtual appliance's networking remains protected by

    the NET-VM that is controlling the virtual switch or switches that the virtual appliance is

    connected to. Having this NET-VM enforcement outside of the virtual appliance presents

    a significant deterrent to traditional attacks.
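
The enforcement point described above can be sketched in a few lines of Python. This is a minimal model, assuming that each permitted flow is recorded as an (appliance, direction, protocol, port) tuple held outside the appliance; the names `Flow` and `SwitchPolicy`, the appliance name, and the port numbers are hypothetical and do not correspond to a real virtual switch API.

```python
from typing import NamedTuple


class Flow(NamedTuple):
    appliance: str
    direction: str   # "in" or "out", relative to the appliance
    protocol: str
    port: int


class SwitchPolicy:
    """Hypothetical stand-in for the flow rules held outside the appliance."""

    def __init__(self) -> None:
        self._allowed: set = set()

    def allow(self, flow: Flow) -> None:
        self._allowed.add(flow)

    def permits(self, flow: Flow) -> bool:
        # Anything that was never explicitly allowed is dropped.
        return flow in self._allowed


policy = SwitchPolicy()
policy.allow(Flow("web-server", "in", "tcp", 443))

# Malware inside the appliance opens port 4444 and disables the guest firewall.
# The guest's state does not matter: the decision is made at the switch.
backdoor = Flow("web-server", "in", "tcp", 4444)
```

In this model `policy.permits(backdoor)` is false regardless of what happens inside the guest, which is the property the paragraph above relies on.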

    We tie together our architecture with a virtualization security framework that we

    developed, called OSCKAR. OSCKAR is used to manage the interactions between virtual

    appliances and the enforcement elements (the FS-VM and NET-VM) based on virtual ma-

    chine contracts and global policy. Even if a malicious entity is able to gain control of a

    virtual appliance, OSCKAR policy enforcement, from outside of the virtual appliance, can

    protect the rest of the system. A contract violation can trigger a restart of the

    virtual appliance or its restoration to a known good state. OSCKAR provides the framework

    for effective interaction between the underlying virtualization technologies and the

    higher-level file server and network components of our architecture.

    1.2 Contribution

    This dissertation presents a unique desktop system architecture solution to the pervasive

    problem of recovering from malware attacks. We borrow concepts from local area network

    (LAN) and data center environments and apply them in a novel way to a single desktop

    system. We contribute the design and implementation of several techniques that are not

    available in common practice. These techniques include: a novel way of associating meta-

    data with virtual machines, separating system and user data for the purposes of recovery,

    supporting the rapid rollback of system state for a system under attack, and preserving user

    data during the recovery process.

    In this dissertation, we show that the current best practices do not solve the problem

    addressed by our solution. We demonstrate the feasibility of our approach with the design,

    implementation, and evaluation of our open source Rapid Recovery Desktop system. As

    a consequence of its design, our system is also effective for recovering from non-malicious

    incidents (such as system updates) that cause system instability or otherwise make the

    software system unusable.

    We restructure the desktop as a set of virtual machine appliances (virtual appliances)

    and associate contracts with each. At the heart of our system is a virtual machine contract

    (VMC) system and a virtualization security framework (OSCKAR). We construct and

    integrate a file server virtual machine (FS-VM) to store and protect the user's personal

    data. We also construct and integrate a network virtual machine (NET-VM) to create

    internal private network segments and to protect the system from external and internal

    network attacks. This architecture brings many of the advantages of a well-managed local

    area network (LAN) to end user desktop systems.

    Using an architecture based on virtualization extends the capabilities of a LAN, since

    it allows us to attach metadata (contracts) to virtual machines to manage and administer

    them in ways that are not possible with an isolated desktop system. For example, virtual

    appliances can be rolled back to a known good state nearly instantaneously, while at the

    same time preserving the user's personal data in the FS-VM. On a physical desktop, rolling

    back a system would require the operating system and applications to be re-installed or the

    hard drive to be re-imaged. Also, with a physical desktop, the process of protecting the

    user's personal data requires storing the data in a distinct physical location (for example, a

    spare internal or external drive). Virtual machines, as digital objects, are particularly

    well-suited for managing, segregating, and protecting a desktop system. We demonstrate the

    feasibility of our approach with an open source implementation and evaluate our prototype

    in terms of performance and effectiveness.
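
The separation of system state from user data described above can be illustrated with a small Python model. This is only a sketch under simplifying assumptions: system state and user data are plain dictionaries, and the class names, file names, and the "known good" snapshot mechanism are hypothetical stand-ins rather than our implementation.

```python
import copy


class FileServerVM:
    """Holds user data outside of any appliance (illustrative stand-in)."""

    def __init__(self) -> None:
        self.user_files: dict = {}


class VirtualAppliance:
    """Holds only system state; user data lives in the FileServerVM."""

    def __init__(self, name: str, system_state: dict) -> None:
        self.name = name
        self.system_state = dict(system_state)
        self._known_good = copy.deepcopy(self.system_state)

    def roll_back(self) -> None:
        # Discard the current system state and return to the known good copy.
        self.system_state = copy.deepcopy(self._known_good)


fs_vm = FileServerVM()
mail = VirtualAppliance("mail", {"/usr/bin/mail-client": "clean"})

fs_vm.user_files["inbox/report.txt"] = "draft 2"         # user keeps working
mail.system_state["/usr/bin/mail-client"] = "infected"   # appliance is compromised
mail.roll_back()
```

After the rollback the appliance's system state is clean again, while the file written to the FS-VM stand-in is untouched because it was never part of the appliance's image.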

    An initial prototype implementation of our Rapid Recovery Desktop system and an

    initial FS-VM can be found in our previous work [93]. In this dissertation, we do not

    seek to innovate on the implementation of the FS-VM component itself, but do contribute

    some alternative deployment strategies and provide an effective integration of an FS-VM

    component into our Rapid Recovery Desktop system. In section 6.1, we describe some

    potential directions that a more advanced FS-VM component could take. The NET-VM

    component is a new Rapid Recovery Desktop component contributed by this dissertation.

    It makes use of some of the latest advances in open source virtual switch technology. Also,

    we show how the NET-VM can be integrated into our Rapid Recovery Desktop system.

    Further, we have developed a prototype implementation of a generic virtualization security

    framework, called OSCKAR, that supports our virtual machine appliance contract system.

    We describe the generic design of the OSCKAR framework and show its application to the

    specific application of our Rapid Recovery Desktop system. Finally, we evaluate our current

    Rapid Recovery Desktop system prototype in terms of performance and effectiveness against

    attack.

    1.3 Organization of the Dissertation

    The rest of this dissertation is organized as follows. In Chapter 2, we discuss related work

    in the areas of virtualization and security. Then, we discuss the design of our system in

    Chapter 3, followed by the implementation details in Chapter 4. Next, in Chapter 5, we

    present our evaluation in terms of performance and effectiveness. Finally, in Chapter 6, we

    conclude and present future work.

    Chapter 2

    Related Work

    In this chapter, we build upon and complement a rich array of related work. First, we

    consider the long history of virtualization, both on server-class systems, such as the IBM

    mainframe, and on commodity desktop systems. Lessons learned and concepts applied over

    the years have provided a great foundation on which this work lies. Next, we consider the

    long history of the field of computer security, primarily from a network and information

    security standpoint, and further limiting our scope to focus primarily on the factors that

    have a direct impact on commodity desktop computing. Then, we describe the broad spec-

    trum of work that has combined virtualization and security techniques to address various

    security issues.

    There is also a substantial amount of related work that shares some of the goals of

    our work. First, there has been a lot of effort in the network security and network intrusion

    detection system (NIDS) space that is often complementary to our NET-VM. Second, there

    is a body of work in the backup and recovery space that is often complementary to our FS-

    VM. Finally, there is a set of related work that deals with malware and host-based intrusion

    detection systems (HIDS) that is often complementary to our system design as a whole.

    The related work and concepts with respect to malware classification will be covered

    in section 5.2, since it will be more helpful to have that discussion available to evaluate

    effectiveness against attacks. Similarly, some of the more advanced concepts in the FS-VM

    related work will be included in section 6.1. Finally, there is a growing body of work in the

    area of human computer interaction (HCI) as it relates to security (SEC) in the specific

    field of HCI-SEC that will also be addressed in section 6.1.

    2.1 Virtualization

    2.1.1 Virtualization Types

    There are many different types of virtualization options. A common breakdown is into

    the categories of emulation, full virtualization, paravirtualization, operating system level

    virtualization, library virtualization, and application virtualization. Emulation creates

    (virtualizes) a different architecture, often in order to simulate hardware that is not

    available (for example, some legacy applications require old hardware) or to support

    development for new platforms whose hardware is still being developed (for example, mobile

    platform emulators). Emulators typically run much slower than other types of virtualization,

    since all of the virtualization/emulation is done in software.

    Full virtualization creates a virtualized version of the same platform (for example,

    x86 on x86). Full virtualization is one of the most common types of virtualization. It

    performs relatively well, since most of (or sometimes all of) the operations can be performed

    on the platform being virtualized. We rely on hardware support for virtualization in order to

    provide full virtualization support on Xen and KVM.

    Paravirtualization is when the guest operating system kernel is modified in order to

    support virtualization. The virtualized platform is the same and this type of

    virtualization performs well, since the guest can be made virtualization-aware. The

    limitation to this type of virtualization is that the operating system kernel source must be

    available. We use this type of virtualization for open source operating systems running on

    Xen.

    Operating system level virtualization (also commonly referred to as container-based

    virtualization) is when the virtual guests share the kernel of the host system. This is a very fast option,

    but limits the guest type to be the same as the base system (for example, Linux guests

    must run on a Linux base). Also, providing performance isolation (preventing one guest that

    consumes lots of resources from affecting other guests) has traditionally been difficult to implement with

    operating system level virtualization.

    The last two types of virtualization that we mention, library and application, are not used

    to virtualize operating system instances, but instead run at the application layer. Library

    virtualization is typically done to emulate an operating system or subsystem (for example,

    Wine provides a subset of the Win32 API to allow Windows applications to run on Linux).

    Finally, application virtualization provides a managed runtime environment in order to have

    cross platform application mobility (for example, the Java runtime environment).

    2.1.2 History and Evolution

    Virtual machine technology, including the virtual machine monitor (VMM) or hypervisor,

    was pioneered by IBM in the 1960s [34]. The original IBM VMM was designed for IBM

    System/370 machines and has co-evolved over the years with the IBM hardware on which

    it runs. The current iteration of the IBM VMM is now known as the z/VM hypervisor

    and runs on IBM System z (zSeries) server hardware. Other VMM software and hardware

    co-evolutions include IBM's pSeries hypervisors for pSeries hardware [4, 5] and Sun

    Microsystems' Logical Domains (now Oracle VM for SPARC) for SPARC hardware [100, 101].

    There also exists other hardware, such as the Alpha processor, that was specifically designed

    to support virtualization [70].

    The history of virtual machine technology for mainstream PC platforms is an interesting

    one. Because x86 hardware was not originally designed to be virtualizable [128], virtualizing

    it introduced additional overhead and complexity. Also, early personal computers were

    not typically configured with sufficient memory to support multiple, simultaneously running

    VMs. As PCs increased in power and memory prices fell, virtualization became more

    feasible for commodity platforms and a number of commercial and open source virtualization

    products were introduced. The Disco project [17] was the first to create a VMM that

    ran on experimental commodity ccNUMA hardware. Members of the Disco team later

    founded VMware, which is the commercial pioneer of virtual machine technology on x86

    hardware [3, 168].

    In this dissertation, we focus on two types of virtual machine technology: (1) Paravirtu-

    alization, which requires minor modifications to an operating system, making it aware that

    there is an underlying VMM and (2) Hardware-assisted full virtualization, which allows

    running unmodified operating systems on top of the VMM.

    The first approach, paravirtualization, a term first coined by the developers of De-

    nali [175] and then popularized by Xen [10], brought with it evidence that virtualization

    benefits could be achieved with low overhead. With full virtualization, the guest VM is un-

    aware that it is running on simulated hardware because the interface presented is the same

    as the physical hardware. With paravirtualization, however, the guest VM is aware that

    it is being virtualized since it is modified to make system (or hyper) calls directly into the

    hypervisor. The paravirtual modifications are usually small and are intended to improve

    performance by avoiding the use of the non-virtualizable instructions [128] and optimizing

    expensive operations. The paravirtualization approach has the advantage of better perfor-

    mance, but since some modification to the guest is required, it is ill-suited for use with

    closed source operating systems. When the Xen developers released their performance numbers

    at SOSP 2003, a team of us at Clarkson University published an independent verification of

    these results and extended the comparison to Linux running on an IBM zServer. We demonstrated

    that virtualization benefits could be realized on older hardware with low performance

    overhead [26].

    The second approach, hardware-assisted full virtualization, first showed up for commodity

    hardware in 2005 in the form of Intel's VT-x virtualization extensions, which

    were followed shortly after by AMD's AMD-V virtualization extensions [162]. This marked

    the beginning of a new era of virtualization software and hardware co-evolution. This co-

    evolution era, which we are still in the midst of, involves the cooperation of commodity

    market virtualization players such as the developers of software hypervisors (such as Xen,

    KVM, and VMware) and hardware vendors (such as Intel and AMD). These virtualization

    hardware extensions allow for unmodified guest operating systems to run more effectively

    on a wider variety of virtualization platforms, such as Xen and KVM. The hardware

    extensions are required for full virtualization support (for example, Windows guests) on Xen

    and are a requirement for using the Linux Kernel-based Virtual Machine (KVM) [74].

    First generation hardware support for virtualization (VT-x and AMD-V) made proper

    virtualization [128] of the x86 hardware possible, but it did not always achieve performance

    gains compared to existing software approaches (such as binary re-writing) to virtualize the

    x86 architecture [3]. The virtualization software and hardware co-evolution had only just

    begun. In an effort to shed some light on the initial mediocre performance of x86 hardware

    virtualization, Karger [70] described some performance and security lessons learned from

    virtualizing the Alpha processor and compared that architecture to x86 virtualization

    hardware. The Alpha processor, which is based on a reduced instruction set computing

    (RISC) architecture, was specifically designed to support virtualization. This architecture

    had advantages, such as the way it handled sensitive instructions, page tables, and trans-

    lation lookaside buffer (TLB) misses, which made it easier to implement high performance

    support for virtualization. Karger suggests that Intel and AMD should learn from the

    lessons of this and other architectures that were designed to support virtualization. The

    Karger paper and the general history of virtualization suggest that the co-evolution of hard-

    ware and software virtualization is a process that often needs to be refined over time (like,

    for example, the IBM z/VM and z Series) and that it is difficult for the hardware to be

    designed to support high performance virtualization from the beginning.

    The software hypervisors, such as Xen and KVM, are evolving to make better use of the

    x86 hardware virtualization extensions. At the same time, the x86 hardware is evolving and

    vendors, such as Intel and AMD, have released second and third generation virtualization

    hardware extensions to add performance and security benefits. Second generation hardware

    extensions target performance improvements for switching between guest operating systems

    by adding hardware support for handling guest page tables (also referred to as shadow

    page tables). The specific technologies released are Intel Extended Page Tables (EPT)

    and AMD Nested Page Tables (NPT). Third generation hardware extensions, in the form

    of input/output memory management units (IOMMUs), seek to improve the security of

    virtual device direct memory access (DMA) and the performance of virtual I/O devices,

    such as graphics, disk, and network. Specific IOMMU hardware releases include Intel's

    VT-d and AMD's IOMMU. Other hardware virtualization technologies include Intel vPro

    and AMD DASH, which use on-chip management capabilities and trusted platform module

    (TPM) technology to provide various security and manageability opportunities.

    The overall virtualization hardware and software co-evolution process is making virtual-

    ization a ubiquitous part of commodity computing both in the server and desktop markets.

    Further evidence that virtualization is making an impact on a wider audience is the XP

    mode feature that was added to Windows 7. This feature uses virtual machine technology

    to run Windows XP applications or a full Windows XP environment inside of Windows

    7 [176].

    2.1.3 Virtual Appliances

    A more recent trend in the virtualization space is toward virtual machine appliances (or

    simply virtual appliances). Virtual appliances are pre-configured virtual machine instances

    that are designed for specific tasks. For example, appliances exist for user-level software,

    such as browsers, and server software, such as web and database servers. The ability to

    quickly deploy a pre-configured virtual appliance is a clear and compelling advantage of

    virtualization and is becoming an increasingly popular method for software distribution.

    Virtual appliances, at a high level, are analogous to household appliances that are used

    for one particular task. An even better analogy for virtual appliances is a comparison

    to information appliances. The term information appliance was coined by Jef Raskin

    and was further described in the book The Invisible Computer by Don Norman [112].

    The basic idea of information appliances is that the Personal Computer (PC) is a general

    purpose device and since it tries to be everything to everyone, it fails at being usable.

    Norman's proposed solution is to replace the PC with information appliances, or

    single-purpose devices, such as digital cameras, printers, document writers, etc. that each do

    one job and do it well. He argues that special-purpose devices can be made more usable.

    Information appliances together would then make up all the functions of the PC and the

    computer itself would become invisible (behind the scenes). Computers already play this

    role in part, but getting the computer industry to make the last big leap to a world of

    information appliances is a challenge. The Invisible Computer goes into many

    aspects of the problem, from the market and business side of things to the complexities of

    large programming projects and operating systems.

    Sapuntzakis, et al., first introduced the concept of a virtual appliance [145], which

    they described as a virtual machine that replaces a physical computer appliance (such as a

    firewall). Their vision for virtual appliances was in the Collective architecture, which they

    described as a compute utility that provides virtual appliances as a service. Further, they

    explained that the virtual appliances would send their displays to a remote display on a thin

    client. Their concept is basically what we know of today as a cloud service that provides

    load balancing of infrastructure as a service (IaaS) or perhaps more closely analogous to

    desktop as a service (DaaS).

    Virtual appliances have been a key component of architectures developed at Clarkson.

    For example, virtual computer appliances were mentioned as a component of the architec-

    ture proposed by Evanchik [47]. This was followed shortly by work in which we described

    how virtual machine appliances fit into a Rapid Recovery Desktop System [93]. The virtual

    machine appliance concept as proposed in our paper described creating virtual appliances

    by placing one or more applications that have similar data and network access needs into

    a virtual machine. Further we recommended that appliances come with a virtual machine

    contract that explicitly specifies those needs.

    VMware further popularized the virtual appliance concept with marketing and virtual

    appliance development contests with large cash prizes [61, 169]. Many other open source

    and commercial vendors are also distributing virtual appliances [69, 140, 151, 165]. The

    associated virtual machine contracts have not gained as much traction however. Virtual

    machines, and therefore also virtual appliances, often come with configuration files that

    specify the basic hardware needs (CPU, memory, disk, etc.) of the virtual machine, but

    there is still a need for contracts that allow for the configuration of more fine-grained data

    and network resource access needs. This dissertation presents a basic, extensible contract

    system implementation in order to address this need.

    2.1.4 Virtual Machine Contracts

    To the best of our knowledge, the concept of virtual machine contracts (VMCs) in the

    context of virtual appliances was first developed at Clarkson. The VMCs put forth in

    Evanchik's master's thesis [47] were based on the concept of having a virtual appliance

    specify a set of very specific system calls that it would be allowed to make. For example,

    any read or write system calls to files or directories would need to be specified. Further,

    network-based system calls, such as bind and listen, would need to be specified in the

    virtual appliance contract. The contract methodology of explicit allow and default deny is

    an approach that this dissertation builds upon. The contract enforcement proposed in that

    thesis was based on modifying the kernel of the virtual appliance and replacing system calls

    with hypercalls (system calls into the hypervisor) that are intercepted and validated by a

    contract enforcement element running in the hypervisor.
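
A system call-based contract of the kind described above can be pictured as a table that maps each permitted call to the targets it may touch, with everything else denied. The Python sketch below is a simplified model under that assumption; the contract contents, the `tcp:25` notation, and the `validate` function are hypothetical and are not taken from the thesis being discussed.

```python
# Hypothetical contract: system call name -> set of permitted targets.
MAIL_CONTRACT = {
    "read": {"/home/user/mail"},
    "write": {"/home/user/mail"},
    "bind": {"tcp:25"},
    "listen": {"tcp:25"},
}


def validate(contract: dict, syscall: str, target: str) -> bool:
    """Return True only if the contract explicitly lists this call and target."""
    # Explicit allow, default deny: unknown calls map to an empty set.
    return target in contract.get(syscall, set())
```

For example, `validate(MAIL_CONTRACT, "bind", "tcp:4444")` is refused because only `tcp:25` is listed for `bind`, and any call that is absent from the table is refused outright.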

    That work and this dissertation differ in contract specification and

    enforcement. In this dissertation, we implement the contract system in a much

    more general and effective way, such that enforcement can occur outside of the virtual

    appliance where it is harder for attackers to subvert. We are not limited to relying on

    enforcement within the kernel of the guest VM, as in the architecture proposed

    by Evanchik [47]. We also developed OSCKAR to give more flexibility and control to

    the virtual appliance and enforcement element designers. For example, any type of virtual

    appliance contract rules can be specified as long as there is an enforcement element that can

    respond to them. System call-based contracts could be employed with our OSCKAR system

    (provided that appropriate enforcement elements are implemented), but that contract style

    is not used in our current Rapid Recovery Desktop implementation. Chapters 3 and 4

    present the design and implementation details of OSCKAR.
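
The idea that any kind of contract rule is acceptable as long as some enforcement element can respond to it can be sketched as a registry keyed by rule type. The Python below is a minimal illustration under that assumption; the `Framework` class, the rule dictionaries, and the element callbacks are hypothetical and should not be read as the OSCKAR interface, which is described in Chapters 3 and 4.

```python
from typing import Callable, Dict, List


class Framework:
    """Hypothetical dispatcher; not the OSCKAR API."""

    def __init__(self) -> None:
        self._elements: Dict[str, Callable[[str, dict], None]] = {}

    def register(self, rule_type: str, element: Callable[[str, dict], None]) -> None:
        self._elements[rule_type] = element

    def load_contract(self, appliance: str, rules: List[dict]) -> None:
        # Refuse the whole contract if any rule has no enforcement element.
        unsupported = sorted({r["type"] for r in rules if r["type"] not in self._elements})
        if unsupported:
            raise ValueError(f"no enforcement element for: {', '.join(unsupported)}")
        for rule in rules:
            self._elements[rule["type"]](appliance, rule)


applied = []
framework = Framework()
framework.register("network", lambda vm, rule: applied.append((vm, "NET-VM", rule["allow"])))
framework.register("file", lambda vm, rule: applied.append((vm, "FS-VM", rule["mount"])))

framework.load_contract("browser", [
    {"type": "network", "allow": "tcp:443"},
    {"type": "file", "mount": "/home/user/Downloads"},
])
```

Under this design, supporting system call-based rules would amount to registering one more element for a new rule type; a contract that uses a rule type with no registered element is rejected rather than silently ignored.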

    In Data Protection and Rapid Recovery From Attack With A Virtual Private File

    Server and Virtual Machine Appliances [93], our focus for the virtual machine contracts

    was on file system contract rules in which a dedicated file server virtual machine (FS-VM)

    stored user data and allowed virtual appliances to mount specific portions of the data in read,

    write, or append-only fashion. The FS-VM in that paper supported read and write rate-

    limiting with a modified NFS server. Here, we extend the work done in that paper to add

    two new components, the OSCKAR virtualization security framework and the NET-VM.
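
To make the file system contract rules above more concrete, the following Python sketch combines a mount mode check with a simple per-second byte budget. It is an illustration only: the mapping from each mount mode to permitted operations is our assumption for this example (the text names the modes but not their exact semantics), the fixed one-second window is a simplification, and none of the names correspond to the modified NFS server.

```python
# Assumed semantics for each mount mode; the text names the modes only.
PERMITTED_OPS = {
    "read": {"read"},
    "write": {"read", "write", "append"},
    "append-only": {"append"},
}


class Export:
    """Hypothetical FS-VM export with a mount mode and a byte budget."""

    def __init__(self, path: str, mode: str, bytes_per_second: int) -> None:
        if mode not in PERMITTED_OPS:
            raise ValueError(f"unknown mount mode: {mode}")
        self.path = path
        self.mode = mode
        self.bytes_per_second = bytes_per_second
        self._window_start = 0.0
        self._used = 0

    def request(self, op: str, nbytes: int, now: float) -> bool:
        """Return True if the operation fits the mode and the current budget."""
        if op not in PERMITTED_OPS[self.mode]:
            return False
        if now - self._window_start >= 1.0:
            # A new one-second window begins; reset the byte counter.
            self._window_start = now
            self._used = 0
        if self._used + nbytes > self.bytes_per_second:
            return False
        self._used += nbytes
        return True


logs = Export("/data/logs", "append-only", bytes_per_second=1000)
```

The caller supplies the timestamp `now` (in seconds) so that the sketch stays deterministic; a real rate limiter would read a clock and would more likely use a token bucket than a fixed window.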

    In [92], Matthews, et al., proposed a contract system and architecture, including the

    concept of enforcement elements, very similar to the architecture presented in this dissertation.

    In that paper, they demonstrated the feasibility and approach of such a system in a data

    center environment and described extending the Open Virtualization Format (OVF) [11],

    which is an open standard for packaging and distributing virtual appliances, to support

    more advanced data and network access rules. We extend that work to a Rapid Recovery

    Desktop system and implement an open source virtualization security framework that sup-

    ports custom contract rules and enforcement elements, which could include support for the

    OVF standard in the future.

    2.2 Security

    The security principles employed in this dissertation have been well-studied and applied

    in general, but we suggest using them, in combination with other technologies, such as

    virtualization, in a way that is not commonly seen in practice today. Specifically, we apply

    the principle of least privilege, isolation, and access control to virtual appliances.

    2.2.1 The Principle of Least Privilege

    A number of security principles were first formally described by Saltzer and Schroeder in [143].

    Among those principles was the principle of least privilege, which states that "Every

    program and every user of the system should operate using the least set of privileges necessary

    to complete the job." Saltzer and Schroeder explain that the rationale behind this principle

    is to limit the damage that can occur from an accident or error, to limit the interaction

    among privileged programs, and to provide a rationale for where to place protection mecha-

    nisms. The goal of the system described in this dissertation is that virtual appliances adhere

    to the principle of least privilege. Virtual appliances are an effective and practical way to

    implement the principle of least privilege. We are not arguing that our design perfectly

    achieves least privilege. Doing so would require perfect virtual appliance contracts and very

    detailed processing of virtual appliance operations, and would likely make the system intolerably

    hard to use for any user. However, as we will describe in the sections that follow, access control

    methods that attempt to apply the principle of least privilege to various degrees are often

    disabled by users. The fact that perfect adherence may not be possible is no excuse not

    to apply reasonable constraints on virtual appliances. Some examples of mechanisms that

    help to apply the principle of least privilege include isolation and access control, which we

    discuss in the next sections.

    2.2.2 Isolation

    Complete isolation, as described in [143], is "a protection system that separates principals

    into compartments between which no flow of information or control is possible." Two

    approaches for achieving complete isolation described by Saltzer and Schroeder include

    isolated virtual machines and authentication mechanisms. Virtual machines, as described

    earlier in this chapter, have been used for many years on mainframe hardware, but until

    the Disco project [17] in 1997 were considered computationally prohibitive on commodity

    systems. So, traditional isolation has relied on authentication mechanisms, such as username

    and password system login. This dissertation makes use of virtual machines to provide

    isolation between applications stored in virtual appliances. We are certainly not the first to

    make use of virtualization for this purpose, but are among a growing list of systems using

    virtualization for security purposes. Examples of other research systems will be described

    later in this chapter.

    Although using virtual machines to provide isolation is very common in research, it is

    more challenging to apply virtualization to real production systems, especially on the desk-

    top. As part of this dissertation we hope to encourage taking research ideas and converting

    them into real systems. This concept is exemplified by a recent alpha release of the Qubes

    operating system [142], which is a new operating system based on the Xen hypervisor. We

    will describe Qubes in more detail in the Virtualization and Security section of this chapter,

    but for now we note that Qubes uses virtual machines and various virtualization hardware

    extensions to provide isolation for applications.

    2.2.3 Access Control

    Discretionary Access Control

    Early mechanisms for access control were described by Lampson in [80]. The principles

    and mechanisms described in that paper provide the foundation for the discretionary access

    control (DAC) [144] that is in common use today. The basic idea of DAC is an access control

    matrix with domains (users, groups, etc.) labeling the rows, objects (files, directories,

    processes, etc.) labeling the columns, and capabilities or access permissions (read, write,

    execute, etc.) as the entries within the matrix. The typical implementation is done with

    access control lists. One major weakness with DAC is that the granularity of access is too

    coarse. More specifically, if a user has access to a file, then any program running as that user

    has access to that file. Our approach to mitigating this problem is to apply access control

    at the virtual appliance level, specifically with the explicit allow, default deny policy (as

    was described in [47]).
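
    The coarse granularity of DAC can be illustrated with a minimal sketch. The matrix contents, user names, and program names below are hypothetical and chosen only for illustration; the point is that the access decision depends on the user, not on which program is asking.

```python
# Hypothetical access control matrix: rows are domains (users), columns are
# objects (files), and entries are sets of access permissions.
ACCESS_MATRIX = {
    "alice": {"report.txt": {"read", "write"}, "notes.txt": {"read"}},
    "bob": {"notes.txt": {"read", "write"}},
}


def dac_allows(user, obj, permission):
    """Return True if the matrix grants `permission` on `obj` to `user`."""
    return permission in ACCESS_MATRIX.get(user, {}).get(obj, set())


def program_can_access(program, running_as, obj, permission):
    """Under DAC the program's identity is ignored; only the user matters."""
    return dac_allows(running_as, obj, permission)


# A trusted editor and an untrusted download receive identical access,
# because both run as the same user.
editor_ok = program_can_access("text_editor", "alice", "report.txt", "write")
malware_ok = program_can_access("untrusted_download", "alice", "report.txt", "write")
```

    In this sketch both checks succeed, which is exactly the weakness described above: any program running as the user inherits all of that user's access.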

    Mandatory Access Control

    Recognition of the limitations of DAC is not new. One common alternative

    to DAC is mandatory access control (MAC), which is sometimes referred to as rule-based

    access control [85] or lattice-based access control [41]. A traditional view of MAC associates

    it with multi-level security (MLS), but it has been recognized that the MLS-based approach

    is too limiting to meet many security requirements [87]. The basic idea behind MAC is that

    interactions between subjects (users, programs, etc.) and objects (files, programs, etc.) are

    handled by a set of system-wide security policies. The basic implementation is usually

    that all subjects and objects are labeled and policy logic is separated from the enforcement

    mechanism. MAC is significantly more sophisticated than DAC, but at a higher cost of

    complexity.
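
    The separation of policy from enforcement can be sketched as follows. This is a simplified illustration, not the policy language of any real MAC system; the labels, paths, and rule set are assumptions made for the example.

```python
# Hypothetical labels attached to subjects and objects.
SUBJECT_LABELS = {"browser": "untrusted_app", "backup_tool": "system_tool"}
OBJECT_LABELS = {"/home/user/taxes.pdf": "user_data", "/tmp/cache": "scratch"}

# Policy: a system-wide set of (subject label, object label, operation) rules.
# It is plain data, kept separate from the code that enforces it.
POLICY = {
    ("untrusted_app", "scratch", "read"),
    ("untrusted_app", "scratch", "write"),
    ("system_tool", "user_data", "read"),
}


def enforce(subject, obj, operation, policy=POLICY):
    """Enforcement mechanism: look up both labels and consult the policy.

    Unlabeled subjects or objects are denied.
    """
    subject_label = SUBJECT_LABELS.get(subject)
    object_label = OBJECT_LABELS.get(obj)
    if subject_label is None or object_label is None:
        return False
    return (subject_label, object_label, operation) in policy
```

    Because the enforcement function never changes when rules are added or removed, the policy can be audited and replaced independently of the mechanism.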

    The contract system presented in this dissertation shares many of the goals of MAC

    (for example, limiting user and application access, and separating mechanism from policy),

    but since our system is designed and implemented at the virtualization layer and applied

    to virtual appliances, we are able to specify resource restrictions at a relatively high level

    of abstraction. For example, we are able to write contract rules in terms of the virtual

    CPUs, memory, disks, and network resources of the virtual appliance. MAC policy, on the

    other hand, is typically specified in terms of lower level constructs, such as system calls and

    operating system objects (e.g., files and processes). Further, virtual appliances are likely

    to provide more isolation than MAC, since malware could potentially disable MAC that is

    running on a traditional operating system. However, malware within a virtual appliance

    would not be able to turn off our contract system unless it was somehow able to subvert

    the virtual machine by, for example, breaking out of the VM and into the hypervisor or

    breaking into a component of our trusted computing base.
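
    To make the level of abstraction concrete, the following sketch shows what a contract expressed in terms of virtual resources might look like. It is not the contract format used by our implementation; the class name, field names, and example values are hypothetical. Anything not explicitly listed is denied.

```python
from dataclasses import dataclass, field


@dataclass
class Contract:
    """Hypothetical contract for one virtual appliance."""
    max_vcpus: int
    max_memory_mb: int
    allowed_mounts: set = field(default_factory=set)  # (path, mode) pairs
    allowed_ports: set = field(default_factory=set)   # outbound TCP ports


def check_resources(contract, vcpus, memory_mb):
    """Allow a CPU/memory request only if it stays within the contract."""
    return vcpus <= contract.max_vcpus and memory_mb <= contract.max_memory_mb


def check_mount(contract, path, mode):
    """Explicit allow, default deny for file system mounts."""
    return (path, mode) in contract.allowed_mounts


def check_outbound(contract, port):
    """Explicit allow, default deny for outbound network connections."""
    return port in contract.allowed_ports


# Example contract for a hypothetical web browser appliance.
browser_contract = Contract(
    max_vcpus=1,
    max_memory_mb=512,
    allowed_mounts={("/downloads", "rw")},
    allowed_ports={80, 443},
)
```

    A rule set of this kind mentions only virtual CPUs, memory, mounts, and ports, so it can be checked entirely outside the appliance without knowledge of the guest's system calls.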

    It is also worth mentioning the various implementations of MAC found in practice.

    These include SELinux [87, 148] and AppArmor [12]. Microsoft has also added a MAC

    model into its operating systems with the addition of Mandatory Integrity Control starting

    in Windows Vista [97, 98]. Another interesting policy enforcement tool, called Systrace

    [130], is touted as a lightweight replacement for MAC. This tool generates system call

    signatures in a learning mode and then enforces those policies in real time. A tool such

    as this could be used to generate system call-based contracts for applications. The output

    of such a tool could have been used directly with the implementation proposed in [47].

    Finally, SELinux Sandboxes [106, 173], based on SELinux, are an attempt to further limit

    applications by making them run in a temporary sandbox directory that is cleaned after

    the application exits. SELinux-sandboxed applications can also be run within their own X

    server environment.
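
    The learn-then-enforce approach described above for Systrace can be sketched in a few lines. This is only an illustration of the idea, not Systrace's actual policy format or interface; the trace contents and function names are hypothetical.

```python
def learn_policy(observed_calls):
    """Learning mode: record every (system call, argument) pair in a trace."""
    return set(observed_calls)


def enforce_policy(policy, call):
    """Enforcement mode: permit only calls that were seen during learning."""
    return call in policy


# Hypothetical trace collected while the application behaves normally.
training_trace = [
    ("open", "/etc/hosts"),
    ("read", "/etc/hosts"),
    ("connect", "tcp:443"),
]
policy = learn_policy(training_trace)
```

    The quality of such a policy depends on how completely the training run exercises the application, which is one reason generated signatures are best treated as a starting point for a contract rather than a finished one.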

    Although MAC systems are becoming more powerful and easier to use, the most ad-

    vanced features that they provide, such as SELinux Sandboxes, are generally only available

    for Linux applications. Using virtualization, as is described in this dissertation, allows ap-

    plications from other operating systems, such as Windows, to be supported. MAC concepts

    and policies should also be considered complementary to our system for two reasons. First,

    existing MAC application policy rules (for example, the application-specific confinement

    rules that are written for existing MAC systems) could be used to help virtual appliance de-

    signers build better virtual appliance contracts. Second, MAC support could also be added

    to the virtualization layer of our system; the technologies that enable this, sVirt [155] and

    XSM [28], will be discussed in the next section.

    2.3 Virtualization and Security

    There is a vast amount of related work that attempts to apply virtualization techniques to

    solve security problems. The VMM layer can be used to monitor the guest from below and,

    often times, without the guest OS knowing it is being watched1. Some popular applications

    that make use of this unique perspective are intrusion detection systems [48, 54, 68, 76, 81,

    178,180], fault tolerance systems [15], virtual machine record and playback systems [44,73],

    malware analysis tools [13], honeypots [6], secure desktop systems [96, 142, 181], trusted

    computing platforms [53], and sandboxes [159].

    1 The red pill program can be used to detect if you are running in a virtual machine; see: http://invisiblethings.org/papers/redpill.html

    Some of the early work that applied virtualization to security includes the following:

    Bressoud and Schneider developed fault-tolerant systems using virtual machine technology

    to replicate the state of a primary system to a backup system [15]. Dunlap et al. used

    virtual machines to provide secure logging and replay [44]. King and Chen used virtual

    machine technology and secure logging to determine the cause of an attack after it had

    occurred [73]. Reed et al. used virtual machine technology to bring untrusted code safely

    into a shared computing environment [134]. Zhao et al. used virtual machines to provide

    protection against rootkits [181].

    2.3.1 Virtualization and Isolation

    In this section we highlight two systems that use virtualization for isolation of applications.

    The first system, called Isolated Execution [159], has been released by Intel in alpha

    form as an open source sandbox system that allows a user to right-click on a binary executable

    file and run it in a sandbox VM. Although the Isolated Execution system is in an

    early development stage, it does demonstrate useful concepts that could be applied to the

    system described in this dissertation.

    The second system is an operating system, called Qubes [142], that is built on top of

    the Xen hypervisor. At a high level, Qubes shares many of the components of our Rapid

    Recovery Desktop system. Specifically, they include a network domain, which is similar to

    our NET-VM component, and a storage domain, which is similar to our FS-VM component.

    The overall goal of their system, like ours, is to isolate desktop applications from each other

    using virtual appliances. However, their approach differs from ours in several interesting

    ways. First, their virtual appliances, which they refer to as AppVMs, are assumed to be

    based on a common base file system so as to be able to make use of a set of read-only core

    system files, which has the benefit of being able to do updates once to that shared system

    core. Due to this architectural choice, the current release only supports Linux as the base,

    but they are investigating ways to support other operating systems, such as Windows, in

    the future. Architecturally, we make a different choice for our Rapid Recovery

    Desktop system. By creating virtual appliances that store their own system state, we are

    able to more easily support a variety of base operating systems.

    Another difference in the Qubes architecture is that AppVMs store user data within the

    AppVMs themselves, which is in contrast to our Rapid Recovery Desktop system that stores

    user data in a dedicated FS-VM. This design choice exemplifies the different approaches

    in terms of recovery and threat model between the Qubes system and our Rapid Recovery

    Desktop system. The Qubes system uses the storage domain to store and back up user

    and application data in an encrypted file system, thus treating the storage domain as an

    untrusted entity and not part of the trusted computing base. Our system, on the other

    hand, uses the FS-VM to store user data and allows virtual appliances to mount specific

    parts of it. In this way, we treat our FS-VM as a part of our trusted computing base and do

    file system enforcement and protection outside of the virtual appliance in order to provide

    an easy way to roll back virtual appliances without affecting user data. Their threat model

    is specifically based around reducing the trusted computing base (they apply the concepts

    of disaggregation of the Xen management domain as described in [109]), so that malware

    that compromises a particular component is not able to affect other parts of the system.

    In contrast, our threat model is based on distrusting virtual appliances, so that malware

    that compromises a virtual appliance is not able to compromise other appliances or user

    data that is stored in an isolated, hardened, and carefully protected FS-VM. In Chapter 3,

    we will describe the methods and architectural decisions we use to protect our FS-VM.

    Similar to the storage domain in the Qubes architecture, the network domain is removed

    from the trusted computing base of their architecture and network policy enforcement is

    done within each of the AppVMs. Their reasoning for this goes back to their overall threat

    model concept of reducing the size of the trusted computing base and the assumption that

    having an external network component cannot provide additional security to a compromised

    AppVM. As before, our Rapid Recovery Desktop system architecture is in direct contrast

    to theirs in that we treat our NET-VM component as part of the trusted computing base

    and by placing it outside of the virtual appliances we use it to protect against malicious

    network activity, even in the case that a virtual appliance is compromised. We believe that

    by distrusting the virtual appliances, we can limit their ability to do harm to the rest of

    the system and the rest of the world.

    A final difference between the Qubes architecture and ours is that theirs relies on hard-

    ware support for virtualization. Specifically, they make use of the IOMMU support to give

    direct access to the network card to their network domain and the storage controller to

    their storage domain. Further, they make use of the Trusted eXecution Technology (TXT)

    and the Trusted Platform Module (TPM) included in Intel's vPro to do cryptographic signing

    of boot and disk images. We plan to make use of the IOMMU capabilities for our

    NET-VM and FS-VM to improve performance and security, but we do not strictly rely on

    them to complete our threat model as Qubes does. This difference allows our system to

    be deployable on more hardware than Qubes.

    Despite the differences in architecture between Qubes and our Rapid Recovery Desktop,

    there are still ways that we could make use of some of their techniques for specific use cases.

    We will describe aspects of the Qubes architecture that we would like to integrate in section

    6.1.

    2.3.2 Virtualization and Access Control

    In the Mandatory Access Control (MAC) section earlier in this chapter, we mentioned that

    MAC could be added to the virtualization layer. One interesting approach taken by Quynh

    et al. [132] in their VMAC system was to add a special service VM that provides central

    management of MAC policies for other virtual machines. A VM like the one in that paper

    might be integrated, as future work, into our Rapid Recovery Desktop system.

    MAC was added to the Xen hypervisor in the form of Xen Security Modules (XSM) [28,

    29], which were implemented by the National Information Assurance Research Lab within

    the National Security Agency (NSA). XSM allows various MAC policies to be enforced

    at the Xen hypervisor level. MAC has also been integrated into the libvirt virtualization

    toolkit [83] in the form of sVirt [155]. As will be described in Chapter 4 on implementation,

    libvirt is used to interact with the various virtualization capabilities on Linux (and other

    OSes). sVirt allows for MAC policy enforcement for the various virtualization systems that

    run on Linux, which does not yet (and may not necessarily ever completely) include support

    for Xen, since Xen is a standalone hypervisor that is not intended to be integrated into

    Linux itself.

    MAC policies at the hypervisor level could allow for much of the basic enforcement

    that our Rapid Recovery Desktop system needs along with various other more complicated

    scenarios. For example, it could be used to assign labels to VMs and enforce various policies,

    such as "VM A is only allowed to run if VM B is not running," at the hypervisor level. Adding

    MAC support at the hypervisor level of the Rapid Recovery Desktop could be an interesting

    area of future work.
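
    The label-based example above can be sketched as a simple conflict check performed before a VM is started. This is not the XSM or sVirt interface; the labels, the conflict table, and the function name are hypothetical and serve only to illustrate the kind of rule a hypervisor-level policy could express.

```python
# Hypothetical conflict sets: pairs of VM labels that must never run together.
CONFLICTS = {frozenset({"banking", "untrusted_browsing"})}


def may_start(label, running_labels, conflicts=CONFLICTS):
    """Return True if a VM with `label` may start given the running VMs."""
    for other in running_labels:
        if frozenset({label, other}) in conflicts:
            return False
    return True
```

    For instance, a VM labeled "banking" would be refused while a VM labeled "untrusted_browsing" is running, and allowed once that VM has been shut down.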

    2.4 Backup and Recovery

    Our Rapid Recovery Desktop system is not intended to be a replacement for making back-

    ups, but instead it should be considered complementary. Having backups is still required

    in the case of hardware failure, for example. In this section, we consider the relationship

    of backup and recovery systems to our Rapid Recovery Desktop system. With our Rapid

    Recovery Desktop system, we focus on the problems of rapid system restoration and pro-

    tection of user data. We are unaware of another system that has separated user data and

    system data in the way that we are proposing. We optimize the handling of each to provide

    rapid system restoration after an attack.

    Our system also helps streamline the backup process by allowing efforts to focus on

    the irreplaceable personal data rather than on the recoverable system data. This allows

    backup efforts to be customized to the differing needs of system data and personal data.

    The differing rates of change for system data and user data imply different backup needs

    for each of these data types. Specifically, there is a mismatch between the overall rate of

    change in system data and the user-visible rate of change.

    System data changes at clearly predictable points (for example, when a new application

    is installed or a patch is applied). Between these points, new system data may be written

    (such as system logs), but often this activity is of little interest to users as long as the

    system continues to function. For example, if a month's worth of system logs were lost,

    most users would be perfectly happy as long as the system was returned to an internally

    consistent and functioning state. Therefore, there is little need to protect this new system

    data between change points.

    With user data, however, even small changes are important. For example, a user may

    only add one page of text to a report in an 8-hour workday, but the loss of that one day of

    data would be immediately visible. This means that efforts to protect user data can be

    effective even if targeted at a small percentage of overall data. Users also tend to retain a

    large body of personal data that is not actively being changed. Incremental backups can

    be kept much smaller when focused on changes to user data rather than system data.

    One common approach to providing data protection and recovery from attack is making

    full backups of all data on the physical machine, both personal and system data. There are

    several ways to back up a system, including copying all files to alternate media that can be

    mounted as a separate file system (for example, a data DVD) or making an exact bootable

    image of the drive with a utility such as Clonezilla [27].

    Burning data to DVD or other removable media creates a portable backup that is well-

    suited to restoring personal data and transporting it to other systems. Mounting the backup

    is also an easy way to verify its correctness and completeness. However, backups of this

    type are rarely bootable and typically require system state to be restored via re-installation

    of the operating system and applications. For example, even if all of the files associated

    with a program are backed up, the program may still not run correctly from the backup

    (for example, if it requires registry changes, specific shared libraries or kernel support).

    Making an exact image of the drive with a utility such as Clonezilla is a better way to

    back up system data. It maintains all dependencies between executables and the operating

    system. Images such as this can typically be either booted directly or used to re-image the

    damaged system to a bootable state. However, images such as this are not always portable

    to other systems as they may contain dependencies on the hardware configuration (such

    as CPU architecture). They are also not as convenient for mounting on other systems to

    extract individual files or to verify the completeness of the backup. In contrast, backing up

    virtual appliances makes it easier to test backups without disturbing the system state.

    Despite the limitations of backup facilities, our system is designed to complement rather

    than replace backup. One goal of our system is to avoid the need for restoration from backup

    by preventing damage to personal data and providing rapid recovery of system data from

    known-good checkpoints. While it is still important to make backups, in many cases using

    our system's built-in features can mean that users do not need to make use of their backups

    as often. Restoring a system from backups is often a cumbersome and manual process, not

    to mention an error-prone one. Given the small percentage of users who regularly back up

    their system (and the even smaller percentage that test the correctness of their backups), it

    is important to reduce the number of situations in which restoring from backup is required.

    Our virtual machine appliances also make backups of system data that are portable to

    other machines. System data is made portable by the checkpoints of the virtual machine ap-

    pliances. The virtualization system handles abstracting details of the underlying hardware

    platform so that guests will run on any machine.

    When restoring a traditional system from a backup, users are typically forced to choose

    between returning their system to a usable state immediately or preserving the corrupted

    system for analysis of the failure or attack and to possibly recover data. With our ar-

    chitecture, users can save the corrupted system image while still immediately restoring

    a functional image. These images are also much smaller than full backups because they

    contain only system data, not personal data, such as a user's music collection.
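
    The idea of preserving the corrupted image while immediately restoring a functional one can be sketched as below. This is a minimal illustration that treats appliance images as ordinary files; the function name and the file layout are assumptions, and a real system would operate on virtual disk images and checkpoints managed by the virtualization layer.

```python
import shutil
from pathlib import Path


def restore_appliance(active: Path, checkpoint: Path, quarantine: Path) -> None:
    """Preserve the current image for analysis, then restore a known-good one."""
    # Keep the possibly corrupted image so the failure or attack can be studied.
    shutil.move(str(active), str(quarantine))
    # Put a copy of the known-good checkpoint back into service.
    shutil.copy2(str(checkpoint), str(active))
```

    Because user data lives outside the appliance image in this architecture, neither step touches personal files, which is why the two goals no longer conflict.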

    A key advantage of our system relative to backups is that our architecture allows com-

    promis