Project on Enabling Epidemiology of Computer Viruses in the Developing World - by Michael Paik


  • 7/29/2019 project on Enabling Epidemiology of Computer Viruses in the Developing World - by Michael Paik


    Gotta Catch 'Em All!

    Innoculous: Enabling Epidemiology of Computer Viruses in the Developing World - by Michael Paik

    Project Report By: Mansi Gupta (3013021) and Malavikka Sharma (3013020)

    B.Sc. (H) Computer Science (VIth Semester), Hansraj College

    University of Delhi, 2012


    INTRODUCTION


    What is a computer virus?

    A virus is a small piece of software that piggybacks on real programs in order to get executed. Once it is running, it spreads by inserting copies of itself into other executable code or documents.


    The Problem

    Among all the problems that a computer user in the developing world faces today, the most pernicious is the prevalence of computer viruses, which impose immediate and unexpected costs.

    However, it is difficult to pin down reliable figures about the rates and types of infections, or the scale of the damage done. Reported rates reflect only legally purchased copies of antivirus software running on internet-connected machines. They do not reflect the preponderance of software in the developing world, which is illegally obtained, out of its license period, or operated offline and therefore not updated.


    The Global Infection Rate map by McAfee Labs.

    Virus Infections per million citizens from all viruses:


    Virus Infections per million citizens from top 10 viruses:


    While data aggregated at this level is inconclusive, the difference between North America and the developing regions in this regard is remarkable. It strongly suggests that the specific virus types present in the developing world, while high in absolute infection rate, display a different ecology than those in the developed world.

    Anecdotal accounts by experts on the ground put infection rates in the developing world at up to 80%, indicating a well and truly endemic problem. This figure is corroborated by recent surveys by Bhattacharya et al. conducted in Bangalore, India.


    The prevalence and impact of viruses are summarized in the figure:

    As evident in the figure, 80% of centers experience moderate to high prevalence of computer viruses, where moderate indicates regular infections that cause considerable problems and high corresponds to continuous, highly detrimental infections.


    The figure also summarizes the average expense on antivirus software, grouped according to the severity of the virus problem in a given location.

    While the expenses are highly variable, it is evident that investment in antivirus software is not sufficient to spare a shop owner from the problems.


    In addition, malware authors distribute their software in infected versions of popular pirated software.

    In 2009, Internet security firm Intego discovered a new Trojan horse in pirated copies of Apple's iWork '09 productivity software that could allow a hacker to take control of an infected computer.

    Research attributes viruses as originating from USB sticks in addition to Internet websites. It also cites SD cards as a frequent vector of virus infections.


    The author of the research paper thus presented and described

    INNOCULOUS: a system consisting of a specially crafted USB key, software, and an incentivization strategy aimed at disinfecting machines, creating revenue streams for small businesses and individuals in the developing world, and obtaining rich information about computer virus infections.

    The paper appeared in the Proceedings of the 5th ACM Workshop on Networked Systems for Developing Regions (NSDR) 2011, Washington DC, June 2011.


    DESIGN


    Inspiration

    Innoculous was inspired by Disk Knight, security software developed by a Bangladeshi student to protect computers against malicious programs that use USB memory sticks to spread.

    Its idea was simple: if a USB key is protected by Disk Knight, the program will prevent the launch of any other process on the computer and display a message prompting the user to block or allow the starting process.


    However, there was a problem in its implementation.

    Disk Knight, once installed, starts copying itself onto every unprotected USB key, making it protected. Furthermore, when this newly protected USB key is inserted into another system, Disk Knight runs and installs itself onto that system without the user's consent.

    This makes it a computer virus in itself.

    Disk Knight has been classified as a PUA (potentially unwanted application).


    Environment

    Innoculous was designed specifically to address infections on the Windows platform, particularly the XP variant, because the vast majority of virus infections in the wild are on this platform due to its popularity. (The Windows family accounts for over 80% of the total market.)

    2012 Win7 Vista Win2003 WinXP Linux Mac Mobile

    February 48.7% 4.5% 0.7% 30.0% 5.0% 9.1% 1.3%

    January 47.1% 4.7% 0.7% 31.4% 4.9% 9.0% 1.3%

    http://www.w3schools.com/browsers/browsers_mobile.asp

    2011 Win7 Vista Win2003 WinXP Linux Mac Mobile

    December 46.1% 5.0% 0.7% 32.6% 4.9% 8.5% 1.2%

    November 45.5% 5.2% 0.7% 32.8% 5.1% 8.8% 1.0%

    October 44.7% 5.5% 0.7% 33.4% 5.0% 8.9% 1.0%

    September 42.2% 5.6% 0.8% 36.2% 5.1% 8.6% 0.9%

    August 40.4% 5.9% 0.8% 38.0% 5.2% 8.2% 0.9%

    July 39.1% 6.3% 0.9% 39.1% 5.3% 7.8% 1.0%

    June 37.8% 6.7% 0.9% 39.7% 5.2% 8.1% 0.9%

    May 36.5% 7.1% 0.9% 40.7% 5.1% 8.3% 0.8%

    April 35.9% 7.6% 0.9% 40.9% 5.1% 8.3% 0.8%

    March 34.1% 7.9% 0.9% 42.9% 5.1% 8.0% 0.7%

    February 32.2% 8.3% 1.0% 44.2% 5.1% 8.1% 0.7%

    January 31.1% 8.6% 1.0% 45.3% 5.0% 7.8% 0.7%

    http://www.w3schools.com/browsers/browsers_mobile.asp

    2010 Win7 Vista Win2003 WinXP W2000 Linux Mac

    December 29.1% 8.9% 1.1% 47.2% 0.2% 5.0% 7.3%

    November 28.5% 9.5% 1.1% 47.0% 0.2% 5.0% 7.7%

    October 26.8% 9.9% 1.1% 48.9% 0.3% 4.7% 7.6%

    September 24.3% 10.0% 1.1% 51.7% 0.3% 4.6% 7.2%

    August 22.3% 10.5% 1.3% 53.1% 0.4% 4.9% 6.7%

    July 20.6% 10.9% 1.3% 54.6% 0.4% 4.8% 6.5%

    June 19.8% 11.7% 1.3% 54.6% 0.4% 4.8% 6.8%

    May 18.9% 12.4% 1.3% 55.3% 0.4% 4.5% 6.7%

    April 16.7% 13.2% 1.3% 56.1% 0.5% 4.5% 7.1%

    March 14.7% 13.7% 1.4% 57.8% 0.5% 4.5% 6.9%

    February 13.0% 14.4% 1.4% 58.4% 0.6% 4.6% 7.1%

    January 11.3% 15.4% 1.4% 59.4% 0.6% 4.6% 6.8%


    2009 Win7 Vista Win2003 WinXP W2000 Linux Mac

    December 9.0% 16.0% 1.4% 61.6% 0.6% 4.5% 6.5%

    November 6.7% 17.5% 1.4% 62.2% 0.7% 4.3% 6.7%

    October 4.4% 18.6% 1.5% 63.3% 0.7% 4.2% 6.8%

    September 3.2% 18.3% 1.5% 65.2% 0.8% 4.1% 6.5%

    August 2.5% 18.1% 1.6% 66.2% 0.9% 4.2% 6.1%

    July 1.9% 17.7% 1.7% 67.1% 1.0% 4.3% 6.0%

    June 1.6% 18.3% 1.7% 66.9% 1.0% 4.2% 5.9%

    May 1.1% 18.4% 1.7% 67.2% 1.1% 4.1% 6.1%

    April 0.7% 17.9% 1.7% 68.0% 1.2% 4.0% 6.1%

    March 0.5% 17.3% 1.7% 68.9% 1.3% 4.0% 5.9%

    February 0.4% 17.2% 1.6% 69.0% 1.4% 4.0% 6.0%

    January 0.2% 16.5% 1.6% 69.8% 1.6% 3.9% 5.8%
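
    As a quick check of the "over 80%" figure quoted for the Windows family, the Windows columns of the February 2012 row in the statistics above can be summed. The sketch below simply hard-codes that one row; the variable names are illustrative and not part of the original report.

```python
# February 2012 row from the platform statistics above (percent per platform).
feb_2012 = {
    "Win7": 48.7, "Vista": 4.5, "Win2003": 0.7, "WinXP": 30.0,
    "Linux": 5.0, "Mac": 9.1, "Mobile": 1.3,
}

# Columns that belong to the Windows family.
windows_columns = ("Win7", "Vista", "Win2003", "WinXP")

# Round to one decimal place to match the precision of the table.
windows_share = round(sum(feb_2012[c] for c in windows_columns), 1)
print(f"Windows family share, February 2012: {windows_share}%")
```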


    Data Logging

    Computer data logging is the process of recording events, with an automated computer program, in a certain scope in order to provide an audit trail that can be used to understand the activity of the system and to diagnose problems.

    As one stated goal of the Innoculous project was to acquire rich data about virus infections, a writable medium was necessary.

    After considering several alternatives, a single self-contained USB key was selected, with additional effort to ameliorate the infection problem.


    Infection Cleaning

    Viruses target various types of transmission media or hosts:

    Binary executable files.

    Volume Boot Records of floppy disks and hard disk partitions.

    General-purpose script files.

    Application-specific script files.

    System-specific autorun script files.

    Documents that contain macros.

    Arbitrary computer files.


    One of the primary goals of Innoculous was the cleaning of virus infections, which necessitates an antivirus solution.

    This led to two important design considerations:

    Innoculous needed a self-contained and preferably scriptable command-line interface.

    Measures must be taken to prevent disabling of the antivirus engine or corruption of the logs by viruses that might exist on the machine being scanned.

    Panda Antivirus fulfilled these requirements and was thus selected. Moreover, it was explicitly free for use for not-for-profit and research purposes.


    Infection Prevention

    Windows variants from Windows 2000 through Windows 7 recognize only the first partition that exists on any USB memory key, and do not themselves have any capability to create multiple partitions on such devices.

    In observance of this fact, Innoculous was installed on a second partition on a USB stick, after a dummy 1 megabyte NTFS partition (the minimum size), which is presented to Windows.


    In order to partially mitigate USB threats, this 1 megabyte partition has its entire capacity occupied by a dummy file with a known hash, making the partition tamper-evident and too small for many infections with large or advanced payloads.

    In addition, the small size of this partition will discourage users from storing their own personal data on these devices.
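
    The tamper-evidence check described above can be sketched as a comparison between the dummy file's current hash and the hash recorded when the key was prepared. The report does not name a hash algorithm, so SHA-256 is an assumption here, and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def partition_untampered(dummy_file: Path, known_hash: str) -> bool:
    """True if the dummy file exists and still matches the recorded hash.

    SHA-256 is an assumed choice; the source only says "a known hash".
    """
    return dummy_file.is_file() and file_sha256(dummy_file) == known_hash.lower()
```

    A missing, truncated, or modified dummy file makes the check fail, which is the signal that something has written to the first partition.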


    IMPLEMENTATION


    Custom Scripting

    The Innoculous script is written in VBScript.

    It has the following functionality:

    1. Displays the key's hardware ID/serial number.
    2. Presents the user with an option to replicate a child key.
    3. Asks the user for the PIN, ZIP or other postal code of their current location, if available.
    4. Presents the user with an option to start a scan. If a scan is started:
         Records serial numbers of all hard drives in the system.
         Begins scan using Panda Antivirus, storing verbose logs.
         Deactivates Autorun using the command-line registry editor.
         Records salient information about the machine, including Windows serial number, installed patches, etc.
    5. If network connectivity is available:
         Checks for updated virus definitions from a preconfigured IP address.
         Compresses and uploads any existing scan logs.
         Records system time skew against an NTP server.
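
    To make the recorded data concrete, the sketch below shows one possible shape for a per-scan log record and a clock-skew calculation. It is written in Python rather than the VBScript used by Innoculous, the report does not describe the actual log format, and every field and function name here is hypothetical. The reference time would come from an NTP server; here it is simply passed in.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class ScanRecord:
    """One scan's worth of data; all field names are illustrative."""
    key_serial: str
    drive_serials: List[str]
    postal_code: Optional[str] = None
    skew_seconds: Optional[float] = None
    detections: List[str] = field(default_factory=list)


def compute_skew(local: datetime, reference: datetime) -> float:
    """Positive when the machine's clock runs ahead of the reference clock."""
    return (local - reference).total_seconds()


def to_log_line(record: ScanRecord) -> str:
    """Serialize one record as a single JSON line for an append-only log."""
    return json.dumps(asdict(record), sort_keys=True)


reference = datetime(2012, 3, 1, 12, 0, 0, tzinfo=timezone.utc)
local = datetime(2012, 3, 1, 12, 0, 42, tzinfo=timezone.utc)
record = ScanRecord(
    key_serial="KEY-0001",
    drive_serials=["HDD-A"],
    postal_code="110007",
    skew_seconds=compute_skew(local, reference),
)
print(to_log_line(record))
```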


    WinPE

    Innoculous is implemented using Windows PE 3.1 32-bit, which provides a preinstallation environment based on Windows 7 SP1.

    Windows Preinstallation Environment (aka Windows PE or WinPE) is a lightweight version of Windows XP, Windows Server 2003, Windows Vista, Windows 7 or Windows Server 2008 R2 that is used for the deployment of workstations and servers. It is intended as a 32-bit or 64-bit replacement for MS-DOS during the installation phase of Windows, and can be booted via PXE, CD-ROM, USB flash drive or hard disk.


    USB Key Preparation

    A USB key of at least 2GB in size is necessary for Innoculous to run. It was prepared on a Linux machine using the following steps:

    Using parted, an NTFS partition is created from 1023kB to 2MB. This creates a 1 megabyte (1024kB) partition, which is the minimum size supported by any modern filesystem supported by Windows.

    Using mkntfs, the NTFS partition is formatted to NTFS.

    parted is then used to create and format a FAT32 partition comprising the remainder of the device.

    A Windows PE image is imaged onto the FAT32 partition using dd or partimage.

    install-mbr or another Master Boot Record program is used to install the MBR onto the USB key and point it to the second partition, e.g.

    install-mbr -p2 -e2 -v /dev/sdb

    Using the output from fdisk -ul, the start boundary is encoded into hexadecimal using, e.g., printf, and inserted in little-endian format at position 0x1C of the second partition. This can be done using any hex editor, such as hexedit, on the device, e.g.

    hexedit /dev/sdb2
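
    The little-endian encoding in the last step can be illustrated with a short Python sketch. It works on an in-memory buffer rather than on a real device, assumes the value is stored as a 4-byte unsigned integer (the report does not state the field width), and uses 4096 as a made-up start sector. The function names are hypothetical.

```python
import struct

START_FIELD_OFFSET = 0x1C  # position named in the preparation steps


def encode_start_sector(start_sector: int) -> bytes:
    """Encode the start sector as 4 little-endian bytes (assumed field width)."""
    return struct.pack("<I", start_sector)


def patch_boot_sector(boot_sector: bytearray, start_sector: int) -> None:
    """Write the encoded start sector at offset 0x1C of an in-memory copy."""
    encoded = encode_start_sector(start_sector)
    end = START_FIELD_OFFSET + len(encoded)
    boot_sector[START_FIELD_OFFSET:end] = encoded


# Stand-in for the first 512 bytes of the second partition.
sector = bytearray(512)
# 4096 is an illustrative value; the real one comes from fdisk -ul.
patch_boot_sector(sector, 4096)
print(sector[0x1C:0x20].hex())  # 00100000
```

    The start sector 4096 is 0x00001000, so the least significant byte comes first and the four bytes written are 00 10 00 00.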


    Deep Forensics

    As the Innoculous installation, when run, has access to all files resident on the host machine's drives, it is possible to copy various files from the computer for forensic analysis regarding behavior. Access to these data, properly redacted, could prove to be a significant source of insight into infection vectors and browsing habits in the developing world.

    This functionality, however, is not currently implemented, given the murky ethics surrounding the issue of privacy.


    DISTRIBUTION


    Replication

    The script that serves as the core of Innoculous also contains the ability to replicate the entire system to another USB key. It does this using the Windows AIK (Automated Installation Kit) builder binaries as well as Windows versions of partitioning tools to create a direct copy of itself.

    In the process of replication:

    The parent key records the serial number of the USB device it is replicating itself to.

    In addition, the replicated key is initialized with the hardware value of its parent, creating a bidirectional link that, as the keys are replicated, creates a graph of keys.
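
    The bidirectional parent/child link can be sketched as a small in-memory graph. This is only an illustration of the data structure implied by the description; the class and method names are hypothetical, and the real system stores these links on the keys themselves.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Key:
    serial: str
    parent: Optional[str] = None                        # child -> parent link
    children: List[str] = field(default_factory=list)   # parent -> child links


class KeyGraph:
    def __init__(self) -> None:
        self.keys: Dict[str, Key] = {}

    def add_root(self, serial: str) -> Key:
        """Register a first-tier key that has no parent."""
        key = Key(serial)
        self.keys[serial] = key
        return key

    def replicate(self, parent_serial: str, child_serial: str) -> Key:
        """Record both directions of the link created by a replication."""
        parent = self.keys[parent_serial]
        child = Key(child_serial, parent=parent_serial)
        parent.children.append(child_serial)
        self.keys[child_serial] = child
        return child

    def ancestors(self, serial: str) -> List[str]:
        """Follow parent links upward, nearest ancestor first."""
        chain: List[str] = []
        current = self.keys[serial].parent
        while current is not None:
            chain.append(current)
            current = self.keys[current].parent
        return chain
```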


    Incentivization

    The graph of keys is critical to the incentivization model, which is essentially a bounty on new virus types encountered and on the number of machines scanned.

    In order to encourage users of the system to replicate their keys and give them to others, a system analogous to that of the MIT Red Balloon Challenge Team, which was used during the DARPA Network Challenge, was adopted.

    The challenge was to be the first to submit the locations of 10 moored, 8-foot, red weather balloons at 10 fixed locations in the continental United States.


    In this model, bounties would be paid out starting with the finder and then in geometrically smaller proportions to the finder's parent, grandparent, etc. Explicitly, 1/2 would be paid out to the finder, 1/4 to the parent, 1/8 to the grandparent, etc.: Rs. 4, Rs. 2, and Rs. 1.
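
    The geometric split can be written out as a short calculation. The sketch below assumes a total bounty of Rs. 8, which is the amount consistent with the Rs. 4, Rs. 2, Rs. 1 example; the function name is hypothetical.

```python
from fractions import Fraction
from typing import List


def bounty_shares(total: Fraction, chain_length: int) -> List[Fraction]:
    """Shares for finder, parent, grandparent, ...: total/2, total/4, total/8, ..."""
    return [total / 2 ** (i + 1) for i in range(chain_length)]


# Rs. 8 total is an assumption chosen to match the example amounts.
shares = bounty_shares(Fraction(8), 3)
print([str(s) for s in shares])  # ['4', '2', '1']
```

    Because the shares form a geometric series, the amount paid out never exceeds the total bounty, however long the chain of ancestors is. With three levels, Rs. 7 of the assumed Rs. 8 is paid out.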


    While these amounts are small, given the large numbers of infected machines and potentially multiple infections per machine, this could represent a notable revenue stream in the developing world.


    Controls

    In order to maintain reins on the system, several controls may optionally be implemented on the keys:

    A usage-based suicide gene that would wipe the key once n scans had been completed and uploaded at some internet-connected machine.

    A time-based suicide gene that would wipe the key at a given date, verified against a known NTP server on some internet-connected machine.

    Generational limits on how many generations from the first tier of keys distributed may be replicated.

    Invoked self-destruct that, when triggered by the server, will cause the key to delete itself upon its next check for virus signature updates.

    Invoked disabling of self-replication, forcing any given key to be a leaf node in the graph.
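
    The controls listed above amount to a small policy check that a key could run whenever it reaches an internet-connected machine. The sketch below is one possible reading, not the report's implementation: the field names are hypothetical, first-tier keys are assumed to be generation 0, and the date is assumed to have been verified against an NTP server before being passed in.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class KeyPolicy:
    max_scans: Optional[int] = None        # usage-based limit (n scans)
    expires_on: Optional[date] = None      # time-based limit
    max_generation: Optional[int] = None   # generational limit
    server_wipe: bool = False              # invoked self-destruct
    replication_disabled: bool = False     # invoked leaf-node flag


def should_wipe(policy: KeyPolicy, uploaded_scans: int, verified_today: date) -> bool:
    """Decide whether the key should wipe itself."""
    if policy.server_wipe:
        return True
    if policy.max_scans is not None and uploaded_scans >= policy.max_scans:
        return True
    if policy.expires_on is not None and verified_today >= policy.expires_on:
        return True
    return False


def may_replicate(policy: KeyPolicy, generation: int) -> bool:
    """A key may replicate unless disabled or already at the generational limit."""
    if policy.replication_disabled:
        return False
    return policy.max_generation is None or generation < policy.max_generation
```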


    ANALYSIS


    The data Innoculous provides can be used for various analyses:

    GEOGRAPHIC SPREAD ANALYSIS

    This could illustrate the spread levels and densities of particular strains of viruses over a region.

    Differences in geographic distribution of viruses before 2003

    Before viruses turned into money-making machines, they were mostly written in developed western nations, such as those in Europe, the USA, Canada, Japan, and Australia.


    Today the biggest hotspots are Russia, Ukraine, Kazakhstan, Romania, Moldova, China, and South America, especially Brazil, which is the biggest source of banking trojans that steal money during online banking.

    Differences in geographic distribution of viruses after 2003

    By 2009, there were even more advanced viruses, and now the number of infected machines around the world is in the millions.


    STRAIN ANALYSIS

    Based on the birthday of each strain of virus, worm, or other malware, it is possible to determine certain data regarding the age, spread rate and infection vector of observed viruses.

    REINFECTION

    As some machines will likely be scanned more than once given a sufficiently large network of Innoculous keys, data will emerge regarding subsequent re-infection of machines that have been cleaned before.
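
    A re-infection analysis of this kind could be sketched as follows, assuming each uploaded scan can be reduced to a machine identifier (for example a hard drive serial number), an ordering, and the set of strains detected. That record layout is hypothetical, as is the simplifying assumption that every scan cleans what it detects, so a strain seen again on a later scan counts as a re-infection.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

# (machine id, scan order, detected strains); this layout is hypothetical.
Scan = Tuple[str, int, Set[str]]


def reinfections(scans: Iterable[Scan]) -> Dict[str, Set[str]]:
    """Map each machine to strains detected again on a later scan."""
    by_machine: Dict[str, List[Tuple[int, Set[str]]]] = defaultdict(list)
    for machine, order, strains in scans:
        by_machine[machine].append((order, strains))

    result: Dict[str, Set[str]] = {}
    for machine, history in by_machine.items():
        history.sort(key=lambda item: item[0])  # oldest scan first
        seen: Set[str] = set()
        repeated: Set[str] = set()
        for _, strains in history:
            repeated |= strains & seen  # strains already cleaned once before
            seen |= strains
        if repeated:
            result[machine] = repeated
    return result
```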

    PIRACY ANALYSIS

    Determining what proportion of Windows installations are genuine and which may have come infected with viruses.


    Nearly ten times as many Windows XP SP3 systems get infected as Windows 7 SP1 64-bit systems. Even Windows Vista with its latest service pack installed reports only half the infection rate that Windows XP reports.


    CONCLUSION


    The use of the Innoculous system, if widespread, will provide the research community with a detailed corpus of data regarding virus infection rates and types at low cost, while simultaneously providing revenue streams for small business owners and individuals in the developing world and raising awareness of the problems presented by virus infection.

    As a bonus, it will also provide a social network graph of people in the region(s) in question who are likely to be considered local computer power users, information that could help establish a valuable social network for deploying future projects.