Performance #2: GPU

Idan Felix, @idanfelix, 30/05/2016

https://goo.gl/xekmHy

GPU Performance 2

First,

Idan Felix

I’m 33 years old

Varonis, Android Academy TLV

Android Academy Staff

- Yonatan Levin: Google Developer Expert & Android @ Gett
- Idan Felix: Senior Android & Redhead, Varonis
- Jonathan Yarkoni: Android Developer & Advocate, Ironsource
- Britt Barak: Android Lead, Real
- Muiriel Felix: Android Design

Logistics

https://www.facebook.com/groups/android.academy.ils/

What’s next?

- 13/6 - Britt - View, Animations
- 4/7 - Yonatan - Networking, JSON, Batching, Location
- 10/8 - Felix - Battery & CPU
- 14/9 - Britt - Threading

30/10/2016: New course coming

Register to Meetup,

Join our Facebook,

Learn Android,

Be Awesome

#Recursion

What the hell did you do @ San Francisco?

Google I/O 2016+

Firebase 2.0

Analytics

Cloud Messaging

Seamless Update

No need to do anything.

Prompt only when download & install are ready. Just reboot the device.

Multi Window Mode and Picture in Picture

Notifications

Quick Settings

Doze On the Go

Direct boot

- Until the user unlocks the device
- Only apps that were configured for it are allowed to run
- Different storage

Java 8

android {
    ...
    defaultConfig {
        ...
        jackOptions {
            enabled true
        }
    }
    compileOptions {
        sourceCompatibility JavaVersion.VERSION_1_8
        targetCompatibility JavaVersion.VERSION_1_8
    }
}

Fragments

- Bug fixes
- commitNow()

Android Studio 2.2

ConstraintLayout

APK Analyzer

Android Wear 2.0

Instant Apps

“With Instant Apps, tapping a link can take you with Deep Links into an Android app in just a few seconds without having to install the app,”

Michael Siliski

#PerfMatters

Colt McAnlis

Layers of our talk(s)

- Understanding The Theory

- What Can Go Wrong?

- Weapons for the hunt

Warning

Sorry, (almost) no code today

But a lot of links!

A: Theory

“Understanding is the first step to acceptance, and only with acceptance can there be recovery.”

J. K. Rowling, Harry Potter and the Goblet of Fire

Methods of Systematic Performance Improvement

- Assess the problem and establish acceptable behavior.
- Measure performance before modification.
- Identify the bottleneck.
- Remove the bottleneck.
- Measure performance after modification.
- If better, adopt. If worse, put it back.

https://en.wikipedia.org/wiki/Performance_tuning

- Figure Out Where You Need to Be
- Determine Where You Are Now
- Decide Whether You Can Achieve Your Objectives
- Develop a Plan for Achieving Your Objectives, and Execute
- Conduct an Economic Analysis

http://www.perfeng.com/papers/step5.pdf

A Word on Premature Optimizations

Meet Prof. Donald Knuth

Professor at Stanford, wrote “The Art of Computer Programming”. A few quotes:

“Beware of bugs in the above code; I have only proved it correct, not tried it.”

https://en.wikipedia.org/wiki/Donald_Knuth

The psychological profiling [of a programmer] is mostly the ability to shift levels of abstraction, from low level to high level.

Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.

However...

Knowing these kinds of things:
- Helps you avoid mistakes and bugs
- Makes you a better developer
- Teaches you the internal workings of the system
- Cost-effectiveness-wise, it’s just being smart: “...Yet we should not pass up our opportunities in that critical 3%”

Colt McAnlis: https://medium.com/google-developers/the-truth-about-preventative-optimizations-ccebadfd3eb5

Target Audience

This lecture is for you if:
- You have a custom view in your app
- You have a lot of images in your app
- You have an app, or are developing an app, and want to do a better job
- You want to become a better developer

Memory

Battery

Radio

GPU

APK Size

CPU

Layout

The Magic

<!-- Layout for weather forecast list item for today -->
<LinearLayout xmlns:android="http://schemas.android.com/apk/res/android"
    android:layout_width="match_parent"
    android:layout_height="wrap_content"
    android:gravity="center_vertical"
    android:minHeight="?android:attr/listPreferredItemHeight"
    android:orientation="horizontal"
    android:background="@drawable/today_touch_selector">

    <LinearLayout
        android:layout_height="wrap_content"
        android:layout_width="0dp"
        android:layout_weight="7"
        android:layout_marginTop="16dp"
        android:layout_marginBottom="16dp"
        android:layout_marginLeft="60dp"
        android:orientation="vertical">

        <TextView
            android:id="@+id/list_item_date_textview"
            android:layout_width="match_parent"
            android:layout_height="wrap_content"
            android:textAppearance="?android:textAppearanceLarge"
            android:fontFamily="sans-serif-condensed"
            android:textColor="@color/white" />

SRC: https://github.com/udacity/Sunshine-Version-2/blob/sunshine_master/app/src/main/res/layout/list_item_forecast_today.xml

Magic

The Magic

Inflate → Measure → Layout → Draw

Step 0: Inflate the View Tree
- During onCreate() we call setContentView()
- The XML is parsed and a tree of objects is created
- The tree is traversed during the next steps

Now we have a Data Structure to use

Things get interesting!

Step 1: Measure
- Starts with the root
- Recursively asks the views (and their children) to measure themselves.
- This is done by calling onMeasure(int, int)
- It’s a negotiation, so onMeasure may be called multiple times.
- When it’s done, all the views in the tree know their size.

REF: http://developer.android.com/reference/android/view/View.html#onMeasure(int, int)

Step 2: Layout
- Starts with the root
- Recursively positions each child
- Done by onLayout(boolean, int, int, int, int)
- Stores the position, and sets the position for all children
- When it’s done, all the views in the tree know their position

REF: http://developer.android.com/reference/android/view/View.html#onLayout(boolean, int, int, int, int)
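To make Steps 1 and 2 a bit more concrete, here is a minimal plain-Java sketch of the two passes over a tiny tree. It is not the Android implementation: `Node`, `measure`, and `layout` are hypothetical stand-ins for View, onMeasure(), and onLayout(), the container simply stacks its children vertically, and positions are absolute rather than parent-relative.

```java
import java.util.ArrayList;
import java.util.List;

public class TwoPassLayout {
    /** Hypothetical stand-in for a View; not an Android class. */
    static class Node {
        final String name;
        final int wantedWidth, wantedHeight;   // what a leaf would like to be
        final List<Node> children = new ArrayList<>();
        int measuredWidth, measuredHeight;     // filled in by measure()
        int left, top;                         // filled in by layout()

        Node(String name, int wantedWidth, int wantedHeight) {
            this.name = name;
            this.wantedWidth = wantedWidth;
            this.wantedHeight = wantedHeight;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }

        /** Pass 1: decide a size, never exceeding what the parent offers. */
        void measure(int maxWidth, int maxHeight) {
            if (children.isEmpty()) {
                measuredWidth = Math.min(wantedWidth, maxWidth);
                measuredHeight = Math.min(wantedHeight, maxHeight);
                return;
            }
            int width = 0;
            int height = 0;
            for (Node child : children) {
                // Each child is offered whatever height is still left.
                child.measure(maxWidth, maxHeight - height);
                width = Math.max(width, child.measuredWidth);
                height += child.measuredHeight;
            }
            measuredWidth = width;
            measuredHeight = height;
        }

        /** Pass 2: store our own position, then stack the children vertically. */
        void layout(int left, int top) {
            this.left = left;
            this.top = top;
            int childTop = top;
            for (Node child : children) {
                child.layout(left, childTop);
                childTop += child.measuredHeight;
            }
        }
    }

    public static void main(String[] args) {
        Node date = new Node("date", 200, 40);
        Node forecast = new Node("forecast", 300, 60);
        Node root = new Node("root", 0, 0).add(date).add(forecast);

        root.measure(250, 1000);  // the "screen" offers 250 x 1000
        root.layout(0, 0);

        System.out.println(root.measuredWidth + "x" + root.measuredHeight); // 250x100
        System.out.println(forecast.top);                                   // 40
    }
}
```

Only after the measure pass has finished does the layout pass have the sizes it needs, which is why the two steps are separate traversals.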

Step 3: Draw (AKA Update)
- Now that a view knows its position, the onDraw(Canvas) method is called.
- That’s where a view draws itself before asking its children to draw.
- The Canvas object generates (or updates) a list of OpenGL ES commands (draw-list) to send to the GPU.

REF: http://developer.android.com/reference/android/view/View.html#onDraw(android.graphics.Canvas)
Guide: http://developer.android.com/training/custom-views/custom-drawing.html

Canvas Methods (and responsibility)

drawARGB

drawArc

drawBitmap

drawColor

drawLine

drawPicture

drawText

clipPath

clipRect

quickReject

rotate

scale

skew

translate

Ref: https://developer.android.com/reference/android/graphics/Canvas.html

Plus ça change, plus c'est la même chose (the more things change, the more they stay the same)

When things change (text, color, size, padding, margin, etc.), a view notifies the system by calling:
- invalidate() - which will call onDraw() again, or
- requestLayout() - which will run the entire process again.

It doesn’t end there

Step 4: Execute

The GPU executes the command list that was generated in onDraw() and was cached.

But what if this takes too long?

Solution: Use a double buffer

http://openglbook.com/chapter-1-getting-started.html

Step 5: VSync

Old CRT screens had to be “synced”, so that the monitor starts a frame at the correct time.

Similarly, Android holds off copying from the back buffer while the screen is still being drawn.

https://en.wikipedia.org/wiki/Analog_television#Vertical_synchronization

Britt’s Lecture (13/6)

The Magic

Animate

Measure Layout

Today

Inflate

Draw

Execute Sync

Let’s talk about timing

But How Does It Work?

Frames Per Second:
- 12: Flip Book
- 24 + effects: Fluid Motion (Movies)
- 60: Smooth Motion
- 60+: No Difference

We Have A Winner!

Smooth Motion: 60 frames per second

Colt McAnlis: https://youtu.be/CaMTIgxCSqU

Things to remember
- Britt will give a similar explanation in her lecture on 13/6
- Know the process, appreciate it, and know that the entire process should take less than 16ms
- Things on the GPU:
  - Should get there as soon as possible (early)
  - Should get there as quickly as possible (small)
  - Should stay there for as long as possible (late)
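Where does the 16ms come from? A rough sketch of the arithmetic, assuming a display that refreshes 60 times per second. `FrameBudget` and the per-stage timings are made-up names and numbers, for illustration only.

```java
public class FrameBudget {
    /** Time available for one frame, in milliseconds, at the given refresh rate. */
    public static double budgetMillis(int framesPerSecond) {
        return 1000.0 / framesPerSecond;
    }

    /** True if measure/layout, draw, and execute together fit into a single frame. */
    public static boolean fitsInFrame(double measureLayoutMs, double drawMs,
                                      double executeMs, int framesPerSecond) {
        return measureLayoutMs + drawMs + executeMs <= budgetMillis(framesPerSecond);
    }

    public static void main(String[] args) {
        System.out.println(budgetMillis(60));                 // about 16.67
        System.out.println(fitsInFrame(4.0, 6.0, 5.0, 60));   // true: 15 ms in total
        System.out.println(fitsInFrame(4.0, 9.0, 5.0, 60));   // false: 18 ms, a frame is dropped
    }
}
```

The budget is shared by every step of the process, so a slow draw step leaves less room for everything else.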

Any Questions?

Part 2: What can possibly go wrong

A few things in this process can go wrong:
- Allocations in onDraw()
  - A reminder from last week
- Avoiding Redundant Work
  - Overdraw
  - ClipRect
  - QuickReject
  - Invalidations
- Bitmap Abuse
  - So many things can go wrong here, OMG...

Allocations in onDraw()

Allocating objects (new XXX()) might cause a (blocking) GC, so you might drop a frame.

But it goes much deeper. See Ian Ni-Lewis: https://youtu.be/HAK5acHQ53E

Solution and Avoidance

Solution: DON’T ALLOCATE OBJECTS IN onDraw() METHOD.

Avoidance: USE LINT (built into Android Studio)

https://developer.android.com/studio/write/lint.html || Colt McAnlis: https://youtu.be/Z_huaXCsYyw

But wait -

How can I know if frames get dropped?

The almighty Logcat

I/Choreographer(1378): Skipped 55 frames! The application may be doing too much work on its main thread.
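As a rough illustration of what that log line means, assuming a 60 frames-per-second display: the number of skipped frames is how late the frame callback ran, divided by the length of one frame. `SkippedFrames` is a hypothetical helper written for this sketch, not the Choreographer source.

```java
public class SkippedFrames {
    /** Length of one frame at 60 frames per second, in nanoseconds. */
    public static final long FRAME_INTERVAL_NANOS = 1_000_000_000L / 60;

    /** How many whole frames fit into the time the main thread was late. */
    public static long skippedFrames(long lateByNanos) {
        return lateByNanos / FRAME_INTERVAL_NANOS;
    }

    /** Roughly how long the main thread was busy if this many frames were skipped. */
    public static long stallMillis(long skippedFrames) {
        return skippedFrames * FRAME_INTERVAL_NANOS / 1_000_000L;
    }

    public static void main(String[] args) {
        System.out.println(skippedFrames(920_000_000L)); // 55
        System.out.println(stallMillis(55));             // 916
    }
}
```

So "Skipped 55 frames" means the main thread was blocked for close to a full second.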

But there’s a much cooler tool

GPU Profiling

Displays a graph for each visible app, showing how much time each frame took:
- The taller the bar, the longer it took to render.
- The green line marks the 16 millisecond target. Every time a bar crosses it, your app is dropping a frame.

https://developer.android.com/studio/profile/dev-options-rendering.html

16ms line

Dropped Frames :(

14

https://developer.android.com/studio/profile/dev-options-rendering.html || Colt McAnlis https://youtu.be/VzYkVL1n4M8

OnDraw work

Copy the list

Execute

glSwapBuffers

Any Questions?

Accented if >16ms

Shiny colors!

https://youtu.be/erGJw8WDV74

This is where you tell them the biggest lie of the GPU profiler,

Before moving to the next part

BTW - Now for reals, Any questions?

B: Overdraw

The number of times that a given pixel is drawn in a frame.

Colt McAnlis: https://youtu.be/T52v50r-JfE

What is Overdraw

When a pixel is drawn multiple times, that’s overdraw. It might waste time and energy.

When the GPU executes the display list, it can count how many times each pixel is drawn.

Not like this

Detecting Overdraw

https://developer.android.com/studio/profile/dev-options-overdraw.html
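What the overdraw overlay shows can be sketched in plain Java: paint a few rectangles in order, count how many times each pixel was painted, and map the count to a tint. The tint names follow the developer-options page linked above as I understand it (true color, blue, green, pink, red); `OverdrawCounter` and the 4x4 "screen" are assumptions for illustration, not how the GPU actually does the counting.

```java
public class OverdrawCounter {
    /**
     * Counts how many times each pixel of a width x height screen is painted.
     * Each rect is {left, top, right, bottom}; right and bottom are exclusive.
     */
    public static int[][] countDraws(int width, int height, int[][] rects) {
        int[][] counts = new int[height][width];
        for (int[] r : rects) {
            for (int y = Math.max(0, r[1]); y < Math.min(height, r[3]); y++) {
                for (int x = Math.max(0, r[0]); x < Math.min(width, r[2]); x++) {
                    counts[y][x]++;
                }
            }
        }
        return counts;
    }

    /** Overlay tint for a pixel that was painted the given number of times. */
    public static String overlayColor(int draws) {
        int overdraw = draws - 1;   // the first paint is not overdraw
        if (overdraw <= 0) return "true color";
        if (overdraw == 1) return "blue";
        if (overdraw == 2) return "green";
        if (overdraw == 3) return "pink";
        return "red";
    }

    public static void main(String[] args) {
        int[][] rects = {
            {0, 0, 4, 4},   // window background
            {0, 0, 4, 2},   // card background on the top half
            {1, 0, 3, 1},   // text on top of the card
        };
        int[][] counts = countDraws(4, 4, rects);
        System.out.println(overlayColor(counts[3][3])); // true color
        System.out.println(overlayColor(counts[0][0])); // blue
        System.out.println(overlayColor(counts[0][1])); // green
    }
}
```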

There’s even a Code-Lab online: https://io2015codelabs.appspot.com/codelabs/android-performance-debug-gpu-overdraw

Fixing Overdraw

There are 2 common reasons for overdraw:
- Redundant backgrounds / redundant transparency
- Wasteful onDraw():
  - Things that aren’t visible at all get drawn (not using quickReject)
  - Things that will be overdrawn get drawn (not using clipRect)

Colt McAnlis: https://youtu.be/vkTn3Ule4Ps

QuickReject

A method that tells whether something can be skipped entirely. Call quickReject() to see if you can skip drawing things that will be off screen.

REF: https://developer.android.com/reference/android/graphics/Canvas.html

ClipRect

clipRect() is a way to avoid overdraw: it keeps the GPU from drawing pixels that you know will be obscured by other content.

Step 1: quickReject

Step 2: clipRect

ClipRect vs. QuickReject

| Method | Return type | Detects... | Helps to... |
|---|---|---|---|
| QuickReject | boolean | Fully invisible stuff | Avoid redundant calls to drawXXX(), keeping the draw-list short. |
| ClipRect | boolean (but is usually used as void) | Fully and partially invisible stuff | Avoid drawing pixels, but still executing the draw-list! |
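The difference in the comparison above can be sketched without the Android SDK. `FakeCanvas` below is a hypothetical stand-in that only records commands and counts pixels inside its clip; the real Canvas.quickReject() and Canvas.clipRect() have richer signatures, so treat this as a model of the idea rather than of the API.

```java
import java.util.ArrayList;
import java.util.List;

public class ClipSketch {
    /** Axis-aligned rectangle; right and bottom are exclusive. */
    static final class Rect {
        final int left, top, right, bottom;

        Rect(int left, int top, int right, int bottom) {
            this.left = left;
            this.top = top;
            this.right = right;
            this.bottom = bottom;
        }

        boolean isEmpty() {
            return left >= right || top >= bottom;
        }

        int area() {
            return isEmpty() ? 0 : (right - left) * (bottom - top);
        }

        Rect intersect(Rect other) {
            return new Rect(Math.max(left, other.left), Math.max(top, other.top),
                    Math.min(right, other.right), Math.min(bottom, other.bottom));
        }
    }

    /** Hypothetical canvas: records draw commands and counts pixels that pass the clip. */
    static final class FakeCanvas {
        private final Rect clip;
        final List<Rect> drawList = new ArrayList<>();
        long pixelsTouched = 0;

        FakeCanvas(Rect clip) {
            this.clip = clip;
        }

        /** True when the rectangle lies entirely outside the clip. */
        boolean quickReject(Rect r) {
            return r.intersect(clip).isEmpty();
        }

        /** The command is always recorded; only pixels inside the clip are painted. */
        void drawRect(Rect r) {
            drawList.add(r);
            pixelsTouched += r.intersect(clip).area();
        }
    }

    static FakeCanvas render(List<Rect> items, Rect clip, boolean useQuickReject) {
        FakeCanvas canvas = new FakeCanvas(clip);
        for (Rect item : items) {
            if (useQuickReject && canvas.quickReject(item)) {
                continue;   // nothing visible: do not even issue the command
            }
            canvas.drawRect(item);
        }
        return canvas;
    }

    public static void main(String[] args) {
        List<Rect> items = new ArrayList<>();
        items.add(new Rect(10, 10, 20, 20));      // fully visible
        items.add(new Rect(90, 90, 120, 120));    // partially visible
        items.add(new Rect(200, 200, 300, 300));  // fully off screen
        Rect clip = new Rect(0, 0, 100, 100);

        FakeCanvas clipOnly = render(items, clip, false);
        FakeCanvas withReject = render(items, clip, true);
        System.out.println(clipOnly.drawList.size() + " commands, "
                + clipOnly.pixelsTouched + " pixels");     // 3 commands, 200 pixels
        System.out.println(withReject.drawList.size() + " commands, "
                + withReject.pixelsTouched + " pixels");   // 2 commands, 200 pixels
    }
}
```

The clip alone already saves the pixels, but the off-screen item still costs a command; the quick-reject check is what keeps the draw-list short.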

Any Questions?

C: Invalidations

Plus ça change, plus c'est la même chose (the more things change, the more they stay the same)

When things change (text, color, size, padding, margin, etc.), a view notifies the system by calling:
- invalidate() - which will call onDraw() again, or
- requestLayout() - which will run the entire process again.

Simply avoid calling invalidate() unless you have to.

And watch this: https://youtu.be/we6poP0kw6E

D: Bitmap Abuse

The horrible things that we do with bitmaps are incredible, but there are ways to fix them!

Bitmaps flow

Nothing here is accurate, but as an overview, it’s OK.

Image stored in the APK, or in media pack

Image downloaded from internet, kept in device storage

Image loaded into memory (to the Heap)

Image loaded into GPU

Content is drawn on screen

Bitmaps flow - formats

- Stored images use PNG or JPEG formats; files are stored compressed.
- Images in the APK are optimized by AAPT.
- Images in memory use one of 4 pixel formats:
  - ALPHA_8: 1 byte per pixel, only alpha
  - ARGB_4444: deprecated
  - ARGB_8888: 4 bytes per pixel, with alpha
  - RGB_565: 2 bytes per pixel, no alpha

Bitmaps flow - Downloading

Image downloaded from internet, kept in device storage:
- Which image format is downloaded?
- Which quality?
- Which size?
- Which network? Wi-Fi? Metered?
- For how long are these images kept? Any size limits?

Bitmaps flow - In Package

Image stored in the APK, or in media pack:
- Which image format is stored?
- Which quality?
- Which size(s)?
- Which network? Wi-Fi? Metered? → APK size
- For how long are these images kept? → Forever
- Any size limits? → 100 MB, but users will hate you!

Bitmaps flow - In Memory

Image loaded into memory (to the heap):
- What pixel format is used?
- Does the image need to be rescaled?
- How to scale the image efficiently?
- Does the image have enough room?

Bitmaps flow - In GPU

Image loaded into GPU:
- How much time does it take to copy the image?
- Are you causing rendering to another buffer?

Why is it important?

java.lang.OutOfMemoryError: bitmap size exceeds VM budget.

Nexus 5X takes pictures at 3840x2160 resolution, at 24-bit.

Decoded into memory as ARGB_8888 (4 bytes per pixel), that’s 33,177,600 bytes of data.

Well, good luck!

https://developer.android.com/training/displaying-bitmaps/index.html
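The arithmetic behind that number, as a small plain-Java sketch. The bytes-per-pixel values are the ones listed for the pixel formats earlier in the deck (ARGB_4444 also takes 2 bytes per pixel); `BitmapMemory` is a made-up helper for illustration, not an Android class.

```java
public class BitmapMemory {
    /** Pixel formats and their size in memory. */
    public enum Config {
        ALPHA_8(1), RGB_565(2), ARGB_4444(2), ARGB_8888(4);

        final int bytesPerPixel;

        Config(int bytesPerPixel) {
            this.bytesPerPixel = bytesPerPixel;
        }
    }

    /** Memory needed for the decoded (uncompressed) pixels of an image. */
    public static long byteCount(int width, int height, Config config) {
        return (long) width * height * config.bytesPerPixel;
    }

    public static void main(String[] args) {
        System.out.println(byteCount(3840, 2160, Config.ARGB_8888)); // 33177600
        System.out.println(byteCount(3840, 2160, Config.RGB_565));   // 16588800
        System.out.println(byteCount(3840, 2160, Config.ALPHA_8));   // 8294400
    }
}
```

The size of the compressed file on disk says very little here: what counts is width times height times bytes per pixel.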

Lucky us, we can optimize almost everything,

and there’s A LOT of information out there

Bitmaps flow - Optimize download

Image downloaded from internet, kept in device storage:
- Work with your server folks to provide you with smaller files and better formats
- Use a cache (an LRU cache is great!)
- Use an image handling library, like Glide or Picasso

Bitmaps flow - Optimize APK size

Image stored in the APK, or in media pack:
- Choose the right format
- Choose a good-enough quality
- Add PNG/JPEG compression tools to your build process
- Remove unused resources

Bitmaps flow - Optimize loading & memory

Image loaded into memory (to the heap):
- Choose the right pixel format
- Resize
- Load on a non-UI thread
- Reuse memory
- Use object pools if needed

Most Important Tip: Use a good image handling library

and let’s check out *some* of the other things that we can do

Trick #1: Load images efficiently

Step 1: Don’t load the image at all, only decode its size.

BitmapFactory.Options options = new BitmapFactory.Options();
// Decode only the bounds and MIME type; no pixel memory is allocated.
options.inJustDecodeBounds = true;
BitmapFactory.decodeResource(getResources(), R.drawable.myimage, options);
int imageHeight = options.outHeight;
int imageWidth = options.outWidth;
String imageType = options.outMimeType;

https://developer.android.com/training/displaying-bitmaps/load-bitmap.html

Step 2: Don’t load the entire image, sub-sample it.

https://developer.android.com/training/displaying-bitmaps/load-bitmap.html
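A sketch of how the sub-sampling factor can be picked, following the approach described on the page linked above, written as plain Java so it runs without the Android SDK. On a device the result would be assigned to BitmapFactory.Options.inSampleSize before decoding again with inJustDecodeBounds set back to false. The class name `SampleSize` is an assumption for illustration.

```java
public class SampleSize {
    /**
     * Largest power-of-two factor that keeps the decoded image at least as big
     * as the requested size in both dimensions.
     */
    public static int calculateInSampleSize(int width, int height,
                                            int reqWidth, int reqHeight) {
        int inSampleSize = 1;
        if (height > reqHeight || width > reqWidth) {
            final int halfHeight = height / 2;
            final int halfWidth = width / 2;
            // Keep doubling while the sub-sampled image would still cover the request.
            while ((halfHeight / inSampleSize) >= reqHeight
                    && (halfWidth / inSampleSize) >= reqWidth) {
                inSampleSize *= 2;
            }
        }
        return inSampleSize;
    }

    public static void main(String[] args) {
        // A 3840x2160 photo shown in a 960x540 view: every 4th pixel per axis is enough.
        System.out.println(calculateInSampleSize(3840, 2160, 960, 540)); // 4
        // An image that is already small is loaded as-is.
        System.out.println(calculateInSampleSize(50, 50, 100, 100));     // 1
    }
}
```

A factor of 4 means 1/16 of the pixels, so the 33,177,600-byte photo from the earlier slide would need about 2 MB instead.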

Trick #2: Cache and off-load

Use LRU-Cache to load bitmaps into memory and re-use them. This will help with list-views and similar, and help to limit the amount of memory used.

Move things off the UI thread. Use AsyncTask, but with caution, and handle concurrency, recycling, and lifecycle.

https://developer.android.com/training/displaying-bitmaps/process-bitmap.html

https://developer.android.com/training/displaying-bitmaps/cache-bitmap.html
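To show what "LRU" buys you, here is a minimal sketch of a cache with a byte budget. Android ships its own android.util.LruCache; the class below is only a stand-in built on java.util.LinkedHashMap so that it runs anywhere, and `ByteBudgetLruCache` with byte[] standing in for a bitmap's pixels is an assumption for illustration.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

public class ByteBudgetLruCache<K> {
    private final long maxBytes;
    private long currentBytes = 0;
    // accessOrder = true: iteration goes from least to most recently used.
    private final LinkedHashMap<K, byte[]> entries = new LinkedHashMap<>(16, 0.75f, true);

    public ByteBudgetLruCache(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    /** Returns the cached pixels, or null; a hit marks the entry as recently used. */
    public synchronized byte[] get(K key) {
        return entries.get(key);
    }

    /** Stores the pixels and evicts least recently used entries if over budget. */
    public synchronized void put(K key, byte[] pixels) {
        byte[] previous = entries.put(key, pixels);
        if (previous != null) {
            currentBytes -= previous.length;
        }
        currentBytes += pixels.length;
        trimToBudget();
    }

    private void trimToBudget() {
        Iterator<Map.Entry<K, byte[]>> it = entries.entrySet().iterator();
        while (currentBytes > maxBytes && it.hasNext()) {
            Map.Entry<K, byte[]> eldest = it.next();
            currentBytes -= eldest.getValue().length;
            it.remove();
        }
    }

    /** Checks for a key without changing the usage order. */
    public synchronized boolean contains(K key) {
        return entries.containsKey(key);
    }

    public synchronized long sizeBytes() {
        return currentBytes;
    }

    public static void main(String[] args) {
        ByteBudgetLruCache<String> cache = new ByteBudgetLruCache<>(100);
        cache.put("a", new byte[40]);
        cache.put("b", new byte[40]);
        cache.get("a");                 // "a" is now the most recently used
        cache.put("c", new byte[40]);   // over budget: "b" is evicted
        System.out.println(cache.contains("b")); // false
        System.out.println(cache.sizeBytes());   // 80
    }
}
```

Sizing the cache in bytes rather than in entries is what keeps a few huge bitmaps from blowing the memory budget.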

Trick #3: Re-use bitmaps, in pools
- Supported from 3.0 (API 11)
- If possible, use the same memory.
- Works only on mutable bitmaps (BitmapFactory.Options)
- Since KitKat (4.4): works only if the image size is smaller than the buffer
- Before KitKat: works only if image sizes are exactly the same
- Not easy work

Colt McAnlis: https://youtu.be/_ioFW3cyRV0 , and https://developer.android.com/training/displaying-bitmaps/manage-memory.html
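The two reuse rules above can be written down as one check. This is a plain-Java sketch under stated assumptions: `BitmapReuse` and its parameters are hypothetical, the candidate buffer is described only by its dimensions and allocated size, and an image that exactly fills the buffer is treated as fitting. On a device, the bitmap that passes the check would be handed to the decoder through BitmapFactory.Options.inBitmap.

```java
public class BitmapReuse {
    /** API level of Android 4.4 (KitKat). */
    public static final int KITKAT = 19;

    /**
     * Can an existing bitmap's memory hold a new image?
     *
     * @param sdkInt         API level of the device
     * @param candWidth      width of the existing bitmap
     * @param candHeight     height of the existing bitmap
     * @param candAllocBytes bytes allocated for the existing bitmap
     * @param newWidth       width of the image about to be decoded
     * @param newHeight      height of the image about to be decoded
     * @param bytesPerPixel  bytes per pixel of the new image's pixel format
     */
    public static boolean canReuse(int sdkInt, int candWidth, int candHeight,
                                   long candAllocBytes, int newWidth, int newHeight,
                                   int bytesPerPixel) {
        if (sdkInt >= KITKAT) {
            // Since 4.4: the new image only has to fit into the existing buffer.
            return (long) newWidth * newHeight * bytesPerPixel <= candAllocBytes;
        }
        // Before 4.4: the dimensions have to match exactly.
        return candWidth == newWidth && candHeight == newHeight;
    }

    public static void main(String[] args) {
        long buffer = 100L * 100 * 4;   // a 100x100 ARGB_8888 bitmap
        System.out.println(canReuse(19, 100, 100, buffer, 50, 50, 4));   // true
        System.out.println(canReuse(18, 100, 100, buffer, 50, 50, 4));   // false
        System.out.println(canReuse(18, 100, 100, buffer, 100, 100, 4)); // true
    }
}
```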

Trick #4: Be smarter than AAPT with PNGs

AAPT optimizes PNG files, but uses only these 3 optimizations:
- Is the image grayscale?
- Is the image transparent?
- Is the image cheap to index?

All of these are lossless optimizations. Apply a lossy optimization tool as well.

You will probably want some more info.

PNG:
- How it works: https://medium.com/@duhroach/how-png-works-f1174e3cc7b7
- Reducing size: https://medium.com/@duhroach/reducing-png-file-size-8473480d0476
- AAPT: https://medium.com/@duhroach/smaller-pngs-and-android-s-aapt-tool-4ce38a24019d

JPG:
- How it works: https://medium.freecodecamp.com/how-jpg-works-a4dbd2316f35
- Reducing size: https://medium.com/@duhroach/reducing-jpg-file-size-e5b27df3257c

Both:
- Watch this at home: https://youtu.be/r_LpCi6DQME (Google I/O 2016, Colt McAnlis)

Σ: Summary

From everything we learned, what you should remember

Theory

1. Systematic Performance Improvement
2. Across-the-board impact
3. Preventative / Premature optimizations
4. The Getting-XML-to-Screen scheme

Inflate → Animate → Measure → Layout → Draw → Execute → Sync

Profiling tools

Bitmap Abuse
- Use a good image handling library
- Know the theory
- Pick the right formats
- Try to reduce image size
- Try to reuse bitmaps if possible

Any Questions?

Thank you,
Drive home safely