Android App Clones

  • 8/12/2019 Android App Clones

    Attack of the Clones: Detecting Cloned Applications on Android Markets

    Jonathan Crussell (1,2), Clint Gibler (1), and Hao Chen (1)

    (1) University of California, Davis; (2) Sandia National Labs

    Source: ESORICS 2012

    http://clintgibler.com/
    http://www.cs.ucdavis.edu/~hchen/
    http://www.iit.cnr.it/esorics2012/

    Outline

    Introduction

    Background

    Threat Model

    Clone Detection Approaches and Related Work

    Methodology

    Evaluation

    Case Studies

    Discussion

    Conclusion

    Introduction

    Much of the user experience of Android relies on third-party apps.

    Android has numerous marketplaces.

    Protect users from malicious apps.

    Protect developers from plagiarists.

    Introduction

    Developers can charge directly for their apps, or offer free apps that are ad-supported or contain in-game billing.

    Some apps have two versions.

    Paid app: cracked and released for free.

    Free app: cloned, with the ad libraries changed.

    Background

    Android Markets

    Android Application Structure

    Threat Model: Definition of Clone

    Clones occur when two applications have similar code but have different ownership.

    Ignore:

    Third-party libraries

    Multiple versions of the same application if they have the same ownership.

    Resistance to Evasion Techniques.

    High-level modifications

    Method restructurings

    Control Flow Alterations

    Addition/Deletion

    Reordering

    Non-Goals

    Find cloning in native code.

    Determine which applications are the victims and which are the clones.

    Clone Detection Approaches: Feature-Based

    Feature-based approaches analyze a program and extract a set of features.

    Features range from the number or size of classes, methods, loops, or variables to included libraries.

    Low detection rate or high false positive rate.

    Clone Detection Approaches: Structure-Based

    Structure-based systems convert programs into a stream of tokens and then compare the streams between two programs.

    More robust than feature-based systems.

    Examples: JPLAG, Winnowing, and MOSS.

    Comparing DEX bytecode streams could be a quick method to find exactly or near-exactly copied code.

    But bytecode streams contain no higher-level semantic knowledge about the code.
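
    A minimal sketch of stream comparison in the spirit of winnowing, not the exact algorithm used by MOSS or any of the tools named here: hash every k-gram of the token stream, keep the minimum hash of each sliding window as a fingerprint, and compare fingerprint sets. The function names, the parameters `k` and `window`, and the Jaccard comparison are assumptions made for illustration; the tokens are ordinary Dalvik opcode names.

```python
import hashlib


def kgram_hashes(tokens, k):
    """Hash every run of k consecutive tokens (k-gram) in the stream."""
    hashes = []
    for i in range(len(tokens) - k + 1):
        gram = " ".join(tokens[i:i + k]).encode()
        # Truncate SHA-1 to 32 bits; any stable hash would serve here.
        hashes.append(int.from_bytes(hashlib.sha1(gram).digest()[:4], "big"))
    return hashes


def winnow(hashes, window):
    """Keep the minimum hash of each sliding window as a fingerprint."""
    return {min(hashes[i:i + window])
            for i in range(len(hashes) - window + 1)}


def stream_similarity(a, b, k=3, window=4):
    """Jaccard similarity of the two fingerprint sets (0.0 if both are empty)."""
    fa = winnow(kgram_hashes(a, k), window)
    fb = winnow(kgram_hashes(b, k), window)
    union = fa | fb
    return len(fa & fb) / len(union) if union else 0.0


original = ["const/4", "iget", "invoke-virtual", "move-result",
            "if-eqz", "iput", "invoke-static", "return-void"]
print(stream_similarity(original, original))  # 1.0 for identical streams
```

    Because the fingerprints depend only on token order, an attacker who reorders or pads instructions changes the k-grams without changing behavior, which is the weakness the slide points at.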

    Clone Detection Approaches: PDG-Based

    Program Dependence Graph (PDG):

    Each node is a statement.

    Each edge shows a dependency between statements.

    Two types of dependencies: data and control.

    A data dependency edge between statements $s_1$ and $s_2$ exists if there is a variable in $s_2$ whose value depends on $s_1$.

    A control dependency between two statements exists if the value of the first statement controls whether the second statement executes.
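
    To make the data dependency definition concrete, here is a small sketch for straight-line code only. The statement encoding (a defined variable plus a list of used variables) and the function name are assumptions for illustration; control dependencies are not modeled, and a real PDG builder works on bytecode rather than on tuples like these.

```python
def data_dependency_edges(statements):
    """Build data dependency edges for straight-line code.

    statements: list of (defined_variable_or_None, [used_variables]) in
    program order. Returns a set of (i, j) pairs meaning statement j reads
    a value that statement i most recently wrote.
    """
    last_writer = {}
    edges = set()
    for j, (target, uses) in enumerate(statements):
        for var in uses:
            if var in last_writer:
                edges.add((last_writer[var], j))
        if target is not None:
            last_writer[target] = j
    return edges


# Hypothetical four-statement method.
program = [
    ("a", []),          # 0: a = read()
    ("b", []),          # 1: b = read()
    ("c", ["a", "b"]),  # 2: c = a + b
    (None, ["c"]),      # 3: send(c)
]
print(sorted(data_dependency_edges(program)))  # [(0, 2), (1, 2), (2, 3)]
```

    Swapping the two independent reads (statements 0 and 1) relabels the nodes but leaves the shape of the graph unchanged, which is why a graph of data dependencies is less sensitive to statement reordering than a token stream.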

    Related Work

    Androguard, DEXCD and DroidMOSS.

    All these approaches are structure-based or structure-based approximations.

    None of these tools use any semantic information to aid in detecting plagiarism.

    Methodology

    Selecting Potentially Cloned Applications

    The goal of an application plagiarist is to entice unwary users to choose her cloned application instead of the original.

    Name and description.

    Determining Application Similarity Based on Attributes

    We use Solr to mimic the search engines on Android markets.

    Attributes of the apps:

    name, package, market, owner, and description

    http://lucene.apache.org/solr/
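
    The slides use Solr for this step. As a rough stand-in that needs no search server, the sketch below pairs apps whose name and description share many words while their owners differ. The Jaccard measure, the threshold, the field layout, and the sample apps are all assumptions for illustration and do not reproduce Solr's scoring.

```python
import re
from itertools import combinations


def words(text):
    """Lowercased set of alphanumeric words in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def candidate_pairs(apps, threshold=0.5):
    """Return (package, package, score) for similar apps with different owners."""
    pairs = []
    for x, y in combinations(apps, 2):
        if x["owner"] == y["owner"]:
            continue  # same owner: not a clone under the threat model
        score = jaccard(words(x["name"] + " " + x["description"]),
                        words(y["name"] + " " + y["description"]))
        if score >= threshold:
            pairs.append((x["package"], y["package"], score))
    return pairs


# Hypothetical market listings.
apps = [
    {"name": "Fast Downloader", "package": "com.example.fast",
     "owner": "alice", "description": "download files quickly"},
    {"name": "Fast Downloader Pro", "package": "com.example.fastpro",
     "owner": "bob", "description": "download files quickly"},
    {"name": "Chess Master", "package": "com.example.chess",
     "owner": "carol", "description": "play chess offline"},
]
print(candidate_pairs(apps))
```

    Only the pairs selected this way would go on to the more expensive PDG comparison.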

    Constructing PDGs

    dex2jar: Convert both apps' code from the DEX format to a JAR.

    WALA: Construct PDGs for each method in every class of the applications.

    Only data dependency edges: More robust against statement reordering, insertion and deletion.

    https://code.google.com/p/dex2jar/
    http://wala.sourceforge.net/wiki/index.php/Main_Page

    Comparing PDGs: Excluding Common Libraries

    Examples: the AdMob ad library, the Facebook API, etc.

    Dumped both the package name and SHA-1 hash of known library files and recorded the most frequent SHA-1 hashes for each library.
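
    One way the hash-based exclusion could look, as a sketch under assumptions: the index keeps the `top_n` most frequent SHA-1 hashes per library package, and any app file whose hash is in the index is dropped before comparison. The function names, the `top_n` cutoff, and the `com.example.ads` package are hypothetical.

```python
import hashlib
from collections import Counter, defaultdict


def sha1_hex(data):
    return hashlib.sha1(data).hexdigest()


def build_library_index(observations, top_n=2):
    """observations: iterable of (package_name, file_bytes) from known libraries.

    Returns the set of the top_n most frequent SHA-1 hashes per package.
    """
    per_package = defaultdict(Counter)
    for package, data in observations:
        per_package[package][sha1_hex(data)] += 1
    known = set()
    for counter in per_package.values():
        known.update(digest for digest, _ in counter.most_common(top_n))
    return known


def strip_known_libraries(app_files, known_hashes):
    """app_files: dict of path -> bytes. Keep files that are not known library code."""
    return {path: data for path, data in app_files.items()
            if sha1_hex(data) not in known_hashes}


# Hypothetical library sightings: one version seen three times, another once.
seen = [("com.example.ads", b"ad-lib v1")] * 3 + [("com.example.ads", b"ad-lib v2")]
index = build_library_index(seen, top_n=1)
app = {"com/example/ads/Ad.class": b"ad-lib v1",
       "com/example/app/Main.class": b"app code"}
print(sorted(strip_known_libraries(app, index)))  # ['com/example/app/Main.class']
```

    Matching on content hashes rather than on package names alone means a library copied under a renamed package is still excluded, while app code placed in a library's package is not.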

    Lossless and Lossy Filters

    Lossless filter: Removes PDGs from consideration that are smaller than a specified size (< 10 nodes).

    Lossy filter: Calculate a frequency vector for each of the methods in the pair.

    This vector counts how many times a specific node type occurs in the PDG.

    Compare these two vectors using hypothesis testing (G-test).
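
    A sketch of both filters, with several assumptions stated up front: the slides do not say how the G-test is set up, so this version treats the two frequency vectors as rows of a 2 x k contingency table and rejects the pair when G exceeds the chi-square critical value at a 0.05 significance level. The node type labels are hypothetical, and the critical-value table only covers vocabularies of up to six node types.

```python
import math

MIN_NODES = 10  # lossless threshold quoted on the slide

# Chi-square critical values at alpha = 0.05, indexed by degrees of freedom.
# The significance level is an assumption of this sketch.
CHI2_CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}


def passes_lossless(node_types):
    """Keep a PDG only if it has at least MIN_NODES nodes."""
    return len(node_types) >= MIN_NODES


def frequency_vector(node_types, vocabulary):
    """Count how often each node type in `vocabulary` occurs in the PDG."""
    return [node_types.count(t) for t in vocabulary]


def g_statistic(obs_a, obs_b):
    """G-test of homogeneity on a 2 x k table; returns (G, degrees of freedom)."""
    total_a, total_b = sum(obs_a), sum(obs_b)
    grand = total_a + total_b
    g, used_columns = 0.0, 0
    for a, b in zip(obs_a, obs_b):
        column = a + b
        if column == 0:
            continue  # node type absent from both PDGs
        used_columns += 1
        for observed, row_total in ((a, total_a), (b, total_b)):
            if observed > 0:
                expected = row_total * column / grand
                g += 2.0 * observed * math.log(observed / expected)
    return g, max(used_columns - 1, 0)


def passes_lossy(types_a, types_b, vocabulary):
    """Keep the pair unless the node type mixes differ significantly."""
    g, dof = g_statistic(frequency_vector(types_a, vocabulary),
                         frequency_vector(types_b, vocabulary))
    if dof == 0:
        return True
    return g <= CHI2_CRITICAL_05[dof]


vocabulary = ["invoke", "assign", "branch", "return"]
method_a = ["invoke"] * 6 + ["assign"] * 5 + ["return"]
method_b = ["branch"] * 12
print(passes_lossless(method_a), passes_lossy(method_a, method_b, vocabulary))
# True False
```

    The lossless filter can never discard a true match of meaningful size, whereas the lossy filter trades a small risk of missing a match for skipping most of the expensive graph comparisons.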

    Subgraph Isomorphism

    Find a mapping between the nodes in one PDG and the nodes in the other.

    Subgraph isomorphism is NP-complete.

    VF2 algorithm.
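
    For intuition only, the following is a naive backtracking search, not VF2: it looks for an injective mapping from the nodes of a small pattern graph into a target graph such that every pattern edge lands on a target edge. VF2 explores the same kind of search space but prunes it far more aggressively. The graph encoding (a dict from each node to its set of successors, with every node present as a key) and the function name are assumptions.

```python
def find_subgraph_mapping(pattern, target):
    """Map pattern nodes to distinct target nodes, preserving every pattern edge.

    pattern, target: dict of node -> set of successor nodes (directed graphs).
    Returns a dict mapping pattern nodes to target nodes, or None.
    """
    pattern_nodes = list(pattern)

    def consistent(node, candidate, mapping):
        # A self-loop in the pattern needs a self-loop in the target.
        if node in pattern[node] and candidate not in target[candidate]:
            return False
        for other, image in mapping.items():
            if other in pattern[node] and image not in target[candidate]:
                return False  # edge node -> other has no counterpart
            if node in pattern[other] and candidate not in target[image]:
                return False  # edge other -> node has no counterpart
        return True

    def extend(index, mapping, used):
        if index == len(pattern_nodes):
            return dict(mapping)
        node = pattern_nodes[index]
        for candidate in target:
            if candidate in used or not consistent(node, candidate, mapping):
                continue
            mapping[node] = candidate
            used.add(candidate)
            result = extend(index + 1, mapping, used)
            if result is not None:
                return result
            del mapping[node]
            used.discard(candidate)
        return None

    return extend(0, {}, set())


chain = {1: {2}, 2: {3}, 3: set()}
edge = {"x": {"y"}, "y": set()}
cycle = {"a": {"b"}, "b": {"c"}, "c": {"a"}}
print(find_subgraph_mapping(edge, chain))   # {'x': 1, 'y': 2}
print(find_subgraph_mapping(cycle, chain))  # None
```

    The worst case is exponential in the size of the pattern, which is why the cheap filters run first and only small method-level graphs are compared.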

    Computing Similarity Scores

    For each method $f$ (excluding the methods in known libraries) in application $A$, let $|f|$ be the number of nodes in this method's PDG.

    Find the best match of this PDG in $B$'s PDGs and denote it as $m(f)$.

    Similarity score: $\mathrm{sim}_A(B) = \dfrac{\sum_{f \in A} |m(f)|}{\sum_{f \in A} |f|}$
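
    As a worked illustration of the score, the sketch below divides the total number of matched PDG nodes by the total number of PDG nodes across one app's non-library methods. The input format (one pair of node counts per method) and the sample numbers are hypothetical.

```python
def similarity_score(methods):
    """methods: list of (pdg_nodes, matched_nodes), one per non-library method.

    matched_nodes is the size of the best match found in the other app
    (0 if no match). Returns the fraction of PDG nodes that were matched.
    """
    total = sum(nodes for nodes, _ in methods)
    matched = sum(min(match, nodes) for nodes, match in methods)
    return matched / total if total else 0.0


# Three hypothetical methods: fully matched, partly matched, unmatched.
methods_of_a = [(20, 20), (15, 9), (12, 0)]
print(similarity_score(methods_of_a))  # 29 / 47, roughly 0.617
```

    The score is directional: it is computed over the methods of one app, so swapping the roles of the two apps can give a different value.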

    Evaluation

    75,000 free apps from 13 Android markets.

    Randomly selected 9,400 pairs from the potential clones.

    Hadoop: parallelize DNADroid.

    HDFS: share data across a small cluster.

    The average throughput of DNADroid on this small cluster is […] application pairs per minute.

    Similarity between Applications

    Clustering Cloned Applications

    Filter Performance

    Visual and Behavioral Verification

    Case Studies

    Benign Cloning

    DNADroid found 30 pairs that both have a 100% similarity score.

    Translation.

    Changes to Advertising Libraries

    We can see when an application has most likely been cloned for monetary gain.

    Ex: XWind Downloader

    For the 141 apps, we found that 91 (65%) of these pairs had different libraries, all of which included changes to advertising libraries.

    Malware Added to an Application

    HippoSMS is a malicious application that requires 10 permissions.

    It shares the same package name as a Chinese video player that requires 11 permissions.

    6 of HippoSMS's permissions are ones that the video player doesn't use.

    Two Variants of the Same Malware

    Two malicious apps that are identified by VirusTotal as being part of the BaseBridge malware family.

    Both applications have been stripped of meaningful class and method names.

    DNADroid found coverages of 35% and 28% between the two applications.

    Use of Freeware Cracking Tool in the Wild: AntiLVL

    Decompiles an app with baksmali.

    Inserts a new file: SmaliHook.class.

    Hides AntiLVL's modifications from the app itself by returning the original file size, MD5, and signatures.

    Android License Verification Library (LVL), Amazon Appstore DRM, and Verizon DRM.

    189 of 310 applications contain SmaliHook.class.

    235 of 310 contain references to AntiLVL in their signature file.

    Only 8% of our total apps were acquired from Chinese markets, yet […] of the apps including AntiLVL traces were from Chinese markets.

    Discussion

    False Positive

    Since it is a serious allegation to claim an application is a clone, we design DNADroid to have a very low false positive rate.

    False Negative

    Cloned applications often have similar attributes to the original.

    There exist advanced program transformations that can evade PDG-based clone detection.

    Comparison to Other Approaches

    Androguard: misses 18%.

    DEXCD had problems running on the pairs DNADroid identified.

    DroidMOSS is not currently publicly available.

    Performance

    DNADroid is more expensive but results in fewer false positives and false negatives.

    Conclusion

    DNADroid is a tool for finding clones on a large scale.

    We evaluated DNADroid on applications crawled from 13 Android markets.

    Identified at least 141 apps that have been cloned.

    An additional 310 apps that were cracked with AntiLVL.

    We describe five case studies

    DNADroid has a very low false positive rate

    DNADroid is an effective tool.