What Are the Characteristics of High-rated Apps? A Case Study on Free Android Applications
David Lo, Yuan Tian, Meiyappan Nagappan, Ahmed E. Hassan


Page 1: What are the Characteristics of High-rated Apps

What Are the Characteristics of High-rated Apps?

A Case Study on Free Android Applications

David Lo, Yuan Tian, Meiyappan Nagappan, Ahmed E. Hassan

Page 2: What are the Characteristics of High-rated Apps

The mobile app market and mobile app usage have grown dramatically in recent years.

In 2012 the app economy was worth $53Bn and is expected to expand at a 28% CAGR up to 2016, reaching $143Bn.

Page 3: What are the Characteristics of High-rated Apps

The mobile app market is attractive; however, building a successful app is hard.

Highly Competitive: an app competes with 50,000 other apps in the app store.

Limited Chance: smartphone users stick to a limited number of mobile apps.

Page 4: What are the Characteristics of High-rated Apps

How to make a successful app? - Let’s contrast high- and low-rated apps!

Page 5: What are the Characteristics of High-rated Apps

Prior Studies

• Bavota et al. find that low-rated apps have method calls to APIs that are more change- or fault-prone. (Bavota et al., TSE 2015)

• Taba et al. find that low-rated apps have more complex user interfaces. (Taba et al., ICWE 2014)

• Chia et al. find that the number of required permissions impacts app ratings. (Chia et al., WWW 2012)

• Many other studies.

Page 6: What are the Characteristics of High-rated Apps

This Study

• Consider a comprehensive set of factors that may impact app ratings instead of only a few.
‒ Consider existing factors
‒ Plus additional factors not considered before

• Compare the relative importance of each of the factors on app ratings
‒ In predicting high-rated and low-rated apps
‒ On a reasonably large dataset of more than 1000 apps

Page 7: What are the Characteristics of High-rated Apps

Agenda

• Motivation
• Factors
• Case study
• Discussion
• Conclusion

Page 8: What are the Characteristics of High-rated Apps

We analyze 28 factors in 8 dimensions

• Category
• App Size
• Code Complexity
• Library Dependence
• Library Quality
• UI Complexity
• User Requirements
• Marketing

Page 9: What are the Characteristics of High-rated Apps

App Size Dimension

Why App Size?

+ Larger code => richer functionality

- Larger code => higher chance of bugs (Zimmermann, 2007)

Page 10: What are the Characteristics of High-rated Apps

Code Complexity Dimension

Why Code Complexity?

+ More complex code => more advanced functionality

- More complex code => more bugs (Subramanyam, 2003)

Page 11: What are the Characteristics of High-rated Apps

Library Dependence Dimension

Why Library Dependence?

+ Higher dependence => richer functionality built upon third party code

- Higher dependence => difficulty keeping up with library evolution, which results in bugs. (Syer, 2014)

Page 12: What are the Characteristics of High-rated Apps

Library Quality Dimension

Why Library Quality?

- Buggy library code => buggy apps built on top of it.

- Frequently changed libraries => bugs if apps are not properly maintained

Page 13: What are the Characteristics of High-rated Apps

UI Complexity Dimension

Why UI Complexity?

- More complex UI => app is harder to use.

+ More complex UI => more functionality.

Page 14: What are the Characteristics of High-rated Apps

User Requirement Dimension

Why User Requirements?

+ Higher target SDK version => incorporation of the latest features, active maintenance effort.

+ More permission requests => more features.

- More permission requests => privacy risk.

Page 15: What are the Characteristics of High-rated Apps

Marketing Dimension

Why Marketing Effort?

+ More marketing effort => better first impression, more functionality.

Page 16: What are the Characteristics of High-rated Apps

Category Dimension

Why Category?

Different category => different user expectations.

Page 17: What are the Characteristics of High-rated Apps

Factors in App Size Dimension

• Install size: binary size of the APK file (measured in KB).
• Total classes: total number of classes (including library code).
• App classes: total number of app-specific classes.
• # Activities: total number of activities defined in the AndroidManifest.xml file.
• # Services: total number of services defined in the AndroidManifest.xml file.
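As a rough illustration of the last two factors, the sketch below counts activity and service declarations in a decoded manifest. It assumes the manifest is already available as plain XML text; the sample manifest, the package name, and the function name count_components are hypothetical and not taken from the study.

```python
import xml.etree.ElementTree as ET

# Hypothetical decoded manifest (illustrative only, not from the study's data).
MANIFEST = """<manifest xmlns:android="http://schemas.android.com/apk/res/android"
    package="com.example.demo">
  <application>
    <activity android:name=".MainActivity"/>
    <activity android:name=".SettingsActivity"/>
    <service android:name=".SyncService"/>
  </application>
</manifest>"""


def count_components(manifest_xml):
    """Count <activity> and <service> declarations under <application>."""
    root = ET.fromstring(manifest_xml)
    app = root.find("application")
    if app is None:
        return {"activities": 0, "services": 0}
    return {
        "activities": len(app.findall("activity")),
        "services": len(app.findall("service")),
    }


counts = count_components(MANIFEST)
print(counts)
```

Only direct children of the application element are counted here, which matches where Android expects these declarations to appear.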

Page 18: What are the Characteristics of High-rated Apps

Factors in Code Complexity Dimension

• Six CK metrics: Chidamber and Kemerer’s object-oriented complexity metrics, e.g., the number of methods in each class. Note that we compute the mean over all classes in each app.
• Afferent coupling: mean of the number of other classes that depend upon each class.
• Number of public methods: mean of the number of public methods in each class.
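To make the afferent coupling factor concrete, here is a minimal sketch that derives it from a class dependency map and then averages over all classes, as the slide describes. The dependency map and the function names are hypothetical; a real analysis would obtain the dependencies from the app's bytecode.

```python
from statistics import mean

# Hypothetical dependency map: class name -> classes it references.
DEPENDS_ON = {
    "A": {"B", "C"},
    "B": {"C"},
    "C": set(),
    "D": {"C", "A"},
}


def afferent_coupling(depends_on):
    """For each class, count how many other classes depend on it."""
    ca = {cls: 0 for cls in depends_on}
    for src, targets in depends_on.items():
        for dst in targets:
            # Ignore self-references and classes outside the analyzed set.
            if dst != src and dst in ca:
                ca[dst] += 1
    return ca


def mean_afferent_coupling(depends_on):
    """App-level factor: mean afferent coupling over all classes."""
    return mean(afferent_coupling(depends_on).values())


ca = afferent_coupling(DEPENDS_ON)
app_level = mean_afferent_coupling(DEPENDS_ON)
print(ca, app_level)
```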

Page 19: What are the Characteristics of High-rated Apps

Factors in Library Dependence Dimension

• Absolute (percentage) dependence on Android: total number of (percentage of) calls to libraries that start with “android.”.
• Absolute (percentage) dependence on third party libraries: total number of (percentage of) calls to third party libraries.
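The absolute and percentage variants can be sketched as below. The slides do not say how third party calls are identified, so this sketch assumes a call is third party when its target class is outside the "android.", "java.", and "javax." namespaces and outside the app's own package. That rule, the sample call list, and the function name are illustrative assumptions.

```python
def library_dependence(callee_classes, app_package):
    """Compute absolute and percentage dependence from a list of call targets.

    callee_classes: fully qualified class names, one entry per call site.
    app_package: the app's own package prefix (calls into it are not counted).
    """
    total = len(callee_classes)
    android = sum(1 for c in callee_classes if c.startswith("android."))
    # Assumption: everything that is not platform, Java, or app code is third party.
    not_third_party = ("android.", "java.", "javax.", app_package + ".")
    third_party = sum(1 for c in callee_classes if not c.startswith(not_third_party))

    def pct(n):
        return 100.0 * n / total if total else 0.0

    return {
        "android_abs": android,
        "android_pct": pct(android),
        "third_party_abs": third_party,
        "third_party_pct": pct(third_party),
    }


# Hypothetical call targets for one app.
CALLS = [
    "android.app.Activity",
    "android.widget.TextView",
    "com.example.demo.Util",
    "com.google.gson.Gson",
    "java.util.ArrayList",
]
dep = library_dependence(CALLS, "com.example.demo")
print(dep)
```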

Page 20: What are the Characteristics of High-rated Apps

Factors in Library Quality Dimension

• Change of used Android APIs: mean number of methods changed in the used Android APIs (Bavota et al. 2015).
• Faultiness of used Android APIs: mean number of bugs in the used Android APIs (Bavota et al. 2015).

Page 21: What are the Characteristics of High-rated Apps

Factors in UI Complexity Dimension

• Input elements per layout: mean number of input elements per layout.
• Output elements per layout: mean number of output elements per layout.

Page 22: What are the Characteristics of High-rated Apps

Factors in User Requirements Dimension

• Minimum SDK version: the minimum SDK version required for the app to run.
• Target SDK version: the SDK version that the app targets. If not set, it defaults to the minimum SDK version.
• Required device features: number of features required from the user’s device (e.g., camera).
• Required user permissions: number of permissions needed from the user.
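The four factors above map onto standard manifest elements (uses-sdk, uses-feature, uses-permission). The sketch below reads them from a decoded manifest, including the fallback of the target SDK to the minimum SDK. The sample manifest and the function name user_requirements are hypothetical, and the sketch assumes SDK versions are written as integers in the manifest.

```python
import xml.etree.ElementTree as ET

ANDROID_NS = "http://schemas.android.com/apk/res/android"

# Hypothetical decoded manifest with no explicit targetSdkVersion.
MANIFEST = """<manifest xmlns:android="http://schemas.android.com/apk/res/android"
    package="com.example.demo">
  <uses-sdk android:minSdkVersion="14"/>
  <uses-feature android:name="android.hardware.camera"/>
  <uses-permission android:name="android.permission.CAMERA"/>
  <uses-permission android:name="android.permission.INTERNET"/>
</manifest>"""


def user_requirements(manifest_xml):
    """Extract the four user-requirement factors from manifest XML text."""
    root = ET.fromstring(manifest_xml)
    sdk = root.find("uses-sdk")

    def sdk_attr(name):
        # android:* attributes live in the Android XML namespace.
        return None if sdk is None else sdk.get("{%s}%s" % (ANDROID_NS, name))

    # Android treats a missing minSdkVersion as 1.
    min_sdk = int(sdk_attr("minSdkVersion") or 1)
    target = sdk_attr("targetSdkVersion")
    return {
        "min_sdk": min_sdk,
        # Fall back to the minimum SDK when no target is declared.
        "target_sdk": int(target) if target is not None else min_sdk,
        "required_features": len(root.findall("uses-feature")),
        "required_permissions": len(root.findall("uses-permission")),
    }


reqs = user_requirements(MANIFEST)
print(reqs)
```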

Page 23: What are the Characteristics of High-rated Apps

Factors in Marketing Effort Dimension

• Length of description: number of words appearing in the description of the app on its Play Store page.
• Promotional images: number of images shown on the app’s store page.

Page 24: What are the Characteristics of High-rated Apps

All the factors can be computed from an app's APK and its app store information, using tools such as ApkTool, dex2jar, and BCEL.

Page 25: What are the Characteristics of High-rated Apps

[Diagram: data extraction process]

Step 1: Google Play => Meta Data
  Extract: Category, Size, Rating, Rating Count
  Extract: Marketing

Step 2: APKs => ApkTool => AndroidManifest Files, Resource
  Extract: Size
  Extract: Requirements on Users, UI

Step 3: APKs => dex2jar => Jars => BCEL, together with Android API Change and Bug Logs
  Extract: Code Complexity, Library Dependence, Quality of Library Code

Page 26: What are the Characteristics of High-rated Apps

Step 1: Google Play => Meta Data
  Extract: Category, Size, Rating, Rating Count
  Extract: Marketing

Page 27: What are the Characteristics of High-rated Apps

Step 2: APKs => ApkTool => AndroidManifest Files, Resource
  Extract: Size
  Extract: Requirements on Users, UI

Page 28: What are the Characteristics of High-rated Apps

Step 3: APKs => dex2jar => Jars => BCEL, together with Android API Change and Bug Logs
  Extract: Size
  Extract: Code Complexity, Library Dependence, Quality of Library Code

Page 29: What are the Characteristics of High-rated Apps

Our case study is done on 1,492 Android apps:

Step 1: Randomly selected and crawled 10,000 apps.

Step 2: Filtered out apps that:
‒ Have fewer than 10 ratings
‒ Could not be processed by our tools

Step 3: Sorted the apps in each category by their ratings and selected the top 10% (high-rated) and bottom 10% (low-rated) apps.
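The filtering and selection steps can be sketched as follows. The record layout (dicts with name, category, rating, rating_count), the sample data, and the choice to round the 10% cut down to a whole number of apps are assumptions made for illustration; the slides do not specify them.

```python
from collections import defaultdict


def select_groups(apps, min_ratings=10, percent=10):
    """Return (high_rated, low_rated) lists chosen per category."""
    by_category = defaultdict(list)
    for app in apps:
        # Step 2: drop apps with too few ratings.
        if app["rating_count"] >= min_ratings:
            by_category[app["category"]].append(app)

    high, low = [], []
    for cat_apps in by_category.values():
        # Step 3: rank within the category, best rating first.
        ranked = sorted(cat_apps, key=lambda a: a["rating"], reverse=True)
        k = len(ranked) * percent // 100  # assumption: round down
        if k == 0:
            continue
        high.extend(ranked[:k])
        low.extend(ranked[-k:])
    return high, low


# Hypothetical data: 20 well-rated-count apps plus one with too few ratings.
APPS = [
    {"name": "app%d" % i, "category": "tools", "rating": 1.0 + 0.2 * i, "rating_count": 50}
    for i in range(20)
]
APPS.append({"name": "tiny", "category": "tools", "rating": 5.0, "rating_count": 3})

high_rated, low_rated = select_groups(APPS)
print([a["name"] for a in high_rated], [a["name"] for a in low_rated])
```

Selecting within each category rather than globally keeps categories with systematically higher or lower ratings from dominating either group.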

Page 30: What are the Characteristics of High-rated Apps

Research Questions

RQ1: Is there a relationship between each factor and app rating?

RQ2: What are the important factors that could be used to predict app ratings?

Page 31: What are the Characteristics of High-rated Apps

RQ1: Relation between factors and rating

• Compare the values of each factor between high-rated and low-rated apps.

• Analyze the statistical significance and effect size of the difference between the two groups of apps.
‒ Use the Mann-Whitney U test at a significance level of 0.01.
‒ Compute Cliff’s Delta (or d).
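Both statistics can be computed from the same pairwise comparison of the two groups, which the sketch below shows for small samples. It computes only the U statistic and Cliff's delta, not the p-value (in practice that would come from a statistics package). The sample values and function names are hypothetical.

```python
def pair_counts(xs, ys):
    """Count pairs (x, y) where x > y, x < y, and x == y."""
    greater = sum(1 for x in xs for y in ys if x > y)
    less = sum(1 for x in xs for y in ys if x < y)
    ties = len(xs) * len(ys) - greater - less
    return greater, less, ties


def mann_whitney_u(xs, ys):
    """U statistic for xs: wins plus half of the ties."""
    greater, _, ties = pair_counts(xs, ys)
    return greater + 0.5 * ties


def cliffs_delta(xs, ys):
    """Effect size in [-1, 1]: (wins - losses) / number of pairs."""
    greater, less, _ = pair_counts(xs, ys)
    return (greater - less) / (len(xs) * len(ys))


# Hypothetical values of one factor for the two groups.
high = [5, 7, 8, 9]
low = [1, 2, 5, 6]

u = mann_whitney_u(high, low)
delta = cliffs_delta(high, low)
print(u, delta)
```

The two quantities are linked by delta = 2U / (n * m) - 1, where n and m are the group sizes, so a positive delta means the factor tends to be larger in the first group.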

Page 32: What are the Characteristics of High-rated Apps

RQ1: Relation between factors and rating

[Figure not recoverable; surviving annotations: -ve, +ve, +ve]

Page 33: What are the Characteristics of High-rated Apps


RQ1: Summary of Findings

High-rated apps are statistically significantly different from low-rated apps in 17 out of the 28 factors.

Generally, high-rated apps are larger with more complex code, more preconditions, more marketing efforts, more dependence on libraries, and they make use of higher quality Android libraries.

Page 34: What are the Characteristics of High-rated Apps

RQ2: Important factors for prediction

• Remove highly correlated factors
• Remove redundant factors
• Use random forest
‒ Ten-fold cross validation
‒ Measure performance in terms of F1 and AUC
‒ Repeat the process with different factors omitted
• Employ Scott-Knott test
‒ Identify groups of factors that are statistically significantly different from one another
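The first step, removing highly correlated factors, might look like the greedy filter below. The slides do not state which correlation measure or cutoff was used, so the Pearson coefficient, the 0.8 threshold, the sample values, and the function names are all assumptions for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant factor carries no correlation signal
    return cov / sqrt(vx * vy)


def drop_correlated(factors, threshold=0.8):
    """Keep a factor only if it is not strongly correlated with one already kept."""
    kept = []
    for name, values in factors.items():
        if all(abs(pearson(values, factors[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical factor values for five apps.
FACTORS = {
    "install_size": [100, 200, 300, 400, 500],
    "total_classes": [11, 19, 32, 41, 50],
    "promo_images": [3, 8, 2, 9, 4],
}
kept_factors = drop_correlated(FACTORS)
print(kept_factors)
```

Because the filter is greedy, the result depends on the order in which factors are visited; the earlier of two correlated factors is the one that survives.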

Page 35: What are the Characteristics of High-rated Apps

RQ2: Important factors for prediction

The RF model could achieve an F-measure of 0.74 and an AUC of 0.81.

Top-3 factors:
• Size of the app
• # Promotional Images
• Target SDK
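For readers unfamiliar with the F-measure reported for the model, it is the harmonic mean of precision and recall. The sketch below computes it from confusion-matrix counts; the counts are made up for illustration and are not the study's results.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and F1 from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts for the "high-rated" class.
p, r, f1 = precision_recall_f1(tp=70, fp=30, fn=20)
print(round(p, 3), round(r, 3), round(f1, 3))
```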

Page 36: What are the Characteristics of High-rated Apps

RQ2: Summary of Findings

The size of an app, the number of promotional images on its store page, and the target SDK are the three most influential factors in determining the likelihood of an app being a high-rated app.

Page 37: What are the Characteristics of High-rated Apps

Discussion: Comparison with Past Findings

• Reinforce findings by Bavota et al. (TSE’15)
‒ API quality influences rating
‒ However, it is not the most important factor (#5)
• Refute findings by Taba et al. (ICWE’14)
‒ UI complexity is not statistically significantly related to app ratings.
• Reinforce findings by Chia et al. (WWW’12)
‒ # required permissions is weakly associated with rating
• Highlight additional factors: app size, promotion effort, target SDK.

Page 38: What are the Characteristics of High-rated Apps

Discussion: Power of Multi-Factor Analysis

[Bar chart: Precision, Recall, and F-measure (axis from 0 to 0.8) for models built on All factors and on each single dimension: Size, Code Complexity, Dependence, Library Quality, UI Complexity, Requirement on User, Marketing, Category. Annotated value: 0.7.]

Single-dimension factors are not enough to successfully differentiate high-rated from low-rated apps.


Page 40: What are the Characteristics of High-rated Apps

Future Work

• Explore additional factors
• Do a more fine-grained study (individual category)
• Employ a causality analysis to get a deeper understanding
• Interview Android app developers to get their insight

Page 41: What are the Characteristics of High-rated Apps


Questions? Comments? Advice?

{yuan.tian.2012,davidlo}@smu.edu.sg


Thank You !