Coprocessors – Uses, Abuses and Solutions. 26 September 2016. COPYRIGHT 2016 BLOOMBERG FINANCE L.P. ALL RIGHTS RESERVED. Esther Kundin (with guest appearance by Clay Baenziger)

HBaseConEast2016: Coprocessors – Uses, Abuses and Solutions


Page 1: HBaseConEast2016: Coprocessors – Uses, Abuses and Solutions

Coprocessors – Uses, Abuses and Solutions

26 September 2016

COPYRIGHT 2016 BLOOMBERG FINANCE L.P. ALL RIGHTS RESERVED.

Esther Kundin (with guest appearance by Clay Baenziger)

Page 2: HBaseConEast2016: Coprocessors – Uses, Abuses and Solutions

Coprocessors

Page 3: HBaseConEast2016: Coprocessors – Uses, Abuses and Solutions

What is a coprocessor?
– Custom jar loaded into the HBase daemon process
– Endpoint – like a stored procedure
– Observer – like a trigger

Page 4: HBaseConEast2016: Coprocessors – Uses, Abuses and Solutions

Observers
– Region Observer
  • preGet, postGet
  • prePut, postPut
– WAL Observer
– Master Observer
  • runs in HBase master
  • Create, Delete, Modify table

Page 5: HBaseConEast2016: Coprocessors – Uses, Abuses and Solutions

Why use a coprocessor?
– Simple filter or aggregation run on your data
– Reduces amount of data being sent to the client
– NOT for complex data analysis
– Ex: Apache Phoenix (“We put the SQL back in NoSQL”)

Page 6: HBaseConEast2016: Coprocessors – Uses, Abuses and Solutions

PORT – A sample use case

Page 7: HBaseConEast2016: Coprocessors – Uses, Abuses and Solutions

Post-Get example

RegionServer

postGet

Table Representation:

Key Col1 Col2 Col3 Col4 Col5
Key1Abc 1 4 5
Key1Def 2 2 2
Key1Xyz 10 11 12

Coprocessor Result:

Key1 Abc-col1 Def-col2 Abc-col3 Abc-col4 Xyz-col5
Key1 1 2 4 5 12
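The merge on this slide can be sketched in plain Java, without any HBase types. This is only an illustration: the class and method names are hypothetical, the column placement of the Key1Def and Key1Xyz values is assumed (the slide does not show it), and the "first row in key order wins" rule is an assumption chosen because it reproduces the result shown.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Stand-alone sketch of the postGet merge; no HBase classes are used. */
final class PostGetMergeSketch {

    /**
     * Merges rows that share a key prefix into one row. Rows are visited in
     * sorted key order and the first value seen for a column is kept
     * (an assumed rule, not taken from the slides).
     */
    static Map<String, Integer> merge(TreeMap<String, Map<String, Integer>> rows) {
        Map<String, Integer> merged = new TreeMap<>();
        for (Map<String, Integer> columns : rows.values()) {
            for (Map.Entry<String, Integer> cell : columns.entrySet()) {
                merged.putIfAbsent(cell.getKey(), cell.getValue());
            }
        }
        return merged;
    }

    /** Builds one row from parallel arrays of column names and values. */
    private static Map<String, Integer> row(String[] columns, int[] values) {
        Map<String, Integer> cells = new HashMap<>();
        for (int i = 0; i < columns.length; i++) {
            cells.put(columns[i], values[i]);
        }
        return cells;
    }

    /** Sample rows; the placements for Key1Def and Key1Xyz are assumed. */
    static TreeMap<String, Map<String, Integer>> sampleRows() {
        TreeMap<String, Map<String, Integer>> rows = new TreeMap<>();
        rows.put("Key1Abc", row(new String[] {"Col1", "Col3", "Col4"}, new int[] {1, 4, 5}));
        rows.put("Key1Def", row(new String[] {"Col1", "Col2", "Col3"}, new int[] {2, 2, 2}));
        rows.put("Key1Xyz", row(new String[] {"Col3", "Col4", "Col5"}, new int[] {10, 11, 12}));
        return rows;
    }
}
```

With these assumed sample rows, `merge` yields Col1=1, Col2=2, Col3=4, Col4=5, Col5=12, which matches the coprocessor result on the slide. In a real observer the same logic would operate on the cells returned by the get rather than on plain maps.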

Page 8: HBaseConEast2016: Coprocessors – Uses, Abuses and Solutions

Problems and Solutions

Page 9: HBaseConEast2016: Coprocessors – Uses, Abuses and Solutions

Coprocessors crash regionservers
– Exceptions (other than IOExceptions) in the coprocessor bring down the RegionServer
– In other cases, the coprocessor silently unloads

Page 10: HBaseConEast2016: Coprocessors – Uses, Abuses and Solutions

Solution – catch all exceptions

    public final void prePut(...) throws IOException {
        try {
            prePutImpl(…);
        } catch (IOException ex) {
            // Allow IOExceptions to propagate
            // They won't cause an unload
            throw ex;
        } catch (Throwable ex) {
            // Wrap other exceptions as IOException
            LOG.error("prePut: caught ", ex);
            throw new IOException(ex);
        }
    }
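Repeating the same try/catch in every hook gets tedious, so the pattern can be factored into a helper. The following is a minimal stand-alone sketch that uses only the JDK; `SafeHookSketch`, `HookBody`, `runSafely` and `escaped` are hypothetical names, not part of HBase.

```java
import java.io.IOException;

/** Stand-alone sketch of the catch-all pattern; no HBase classes are used. */
final class SafeHookSketch {

    /** A hook body that may throw anything (hypothetical interface). */
    interface HookBody {
        void run() throws Exception;
    }

    /** Runs the body, letting IOExceptions pass and wrapping everything else. */
    static void runSafely(String hookName, HookBody body) throws IOException {
        try {
            body.run();
        } catch (IOException ex) {
            // Already the type the caller expects, so rethrow unchanged.
            throw ex;
        } catch (Throwable ex) {
            // Anything else is wrapped so that only IOException escapes.
            throw new IOException(hookName + " failed", ex);
        }
    }

    /** Demo helper: returns whatever escaped from runSafely, or null. */
    static Throwable escaped(HookBody body) {
        try {
            runSafely("prePut", body);
            return null;
        } catch (Throwable t) {
            return t;
        }
    }
}
```

A hook would then delegate with something like `runSafely("prePut", () -> prePutImpl(...))`, keeping the real logic in the `Impl` method as on the slide.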

Page 11: HBaseConEast2016: Coprocessors – Uses, Abuses and Solutions

Coprocessors can hog memory
– Memory is shared between the RegionServer and the coprocessor
– Memory hogging slows RegionServer performance

Page 12: HBaseConEast2016: Coprocessors – Uses, Abuses and Solutions

Solutions - defensive Java code
– Profile all coprocessor code for memory usage
  • Use a generic profiler with a driver for your coprocessor
– Use common Java tricks for limiting memory usage
  • Use primitive types and underlying arrays where possible
  • Use immutable objects
  • StringBuilder vs String concatenation
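Two of the tricks listed above can be shown in a few lines of plain Java. This is a small sketch with hypothetical names, meant only to illustrate the habit, not to measure its effect.

```java
/** Stand-alone sketch of two memory-saving habits; names are hypothetical. */
final class MemoryTricksSketch {

    /** Sums values held in a primitive array: no boxing, no per-element objects. */
    static long sum(long[] values) {
        long total = 0;
        for (long v : values) {
            total += v;
        }
        return total;
    }

    /** Joins parts with one StringBuilder instead of repeated String concatenation. */
    static String join(String[] parts, char separator) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                sb.append(separator);
            }
            sb.append(parts[i]);
        }
        return sb.toString();
    }
}
```

A `long[]` stores its values inline, whereas a `List<Long>` holds a reference to a separate object per element. Likewise, concatenating with `+` inside a loop creates a new intermediate String on each pass, while the builder appends into one buffer.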

Page 13: HBaseConEast2016: Coprocessors – Uses, Abuses and Solutions

Problems with deployment
– Manual Deployment
  • disable table
  • assign new coprocessor
  • enable table
– Rollout of non-backward-compatible coprocessor difficult

Page 14: HBaseConEast2016: Coprocessors – Uses, Abuses and Solutions

Solutions
– HBASE-7639 – online schema update is enabled, perhaps it will work
– Hard-code jar path in hbase-site.xml
  • Used by Apache Phoenix
  • Not the best approach for user-defined coprocessors

Page 15: HBaseConEast2016: Coprocessors – Uses, Abuses and Solutions

Logging and metrics tips
– Update log4j.properties file with a separate log parameter for coprocessors
– Use MDC context to pass parameters to all parts of the coprocessor (http://www.slf4j.org/api/org/slf4j/MDC.html)
– Create an extra column in a Result to pass back an object populated with metrics

Page 16: HBaseConEast2016: Coprocessors – Uses, Abuses and Solutions

Unsolved issues
– Bad request can bring down the whole cluster
– Missing jar will bring down the RegionServer

    ERROR org.apache.hadoop.hbase.coprocessor.CoprocessorHost: The coprocessor fooCoprocessor threw java.io.FileNotFoundException: File does not exist: /path/to/coprocessor.jar
    java.io.FileNotFoundException: File does not exist: /path/to/coprocessor.jar

Page 17: HBaseConEast2016: Coprocessors – Uses, Abuses and Solutions

(Preventing) Abuses

Page 18: HBaseConEast2016: Coprocessors – Uses, Abuses and Solutions

Load Failures
– Affects all region servers – one at a time
– HTable descriptors contain coprocessor class: clean-up can be messy
  • HBASE-14190 - Assign system tables ahead of user region assignment
– Set table property: hbase.coprocessor.abortonerror to false

    2016-09-24 02:32:07,366 ERROR org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost: Failed to load coprocessor net.clayb.hbase.coprocessor.RegionObserver
    java.io.FileNotFoundException: File does not exist: hdfs://Test/user/foo/clayCoprocessor.jar

  (Region server stays alive; only the table stays disabled)

Page 19: HBaseConEast2016: Coprocessors – Uses, Abuses and Solutions

Handler Failure
– Affects some operations and not others (e.g. scan works, get does not)
– RPC starvation is a simple but non-obvious failure:

    public class RegionObserverInfinity extends BaseRegionObserver {
        public void preGetOp(…) throws IOException {
            for (;;) {
                LOG.trace("Off I go…");
            }
        }
    }

– Use jstack to see what is up in a region server:

    clay@hbase-regionserver:~$ sudo jstack 3990
    […]
    net.clayb.RegionObserverInfinity.preGetOp(…) @bci=12, line=28 (Compiled frame; information may be imprecise)
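Why a single never-returning hook starves other requests can be simulated with a fixed thread pool standing in for the RPC handlers. This is a stand-alone sketch that uses only the JDK; the class and method names are hypothetical, and a latch replaces the infinite loop so that the demo can exit.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Stand-alone simulation of handler starvation; no HBase classes are used. */
final class HandlerStarvationSketch {

    /**
     * Occupies stuckRequests handlers with tasks that never finish on their
     * own, then submits one more request. Returns true if that extra request
     * did not complete within waitMillis.
     */
    static boolean extraRequestStarves(int handlers, int stuckRequests, long waitMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(handlers);
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < stuckRequests; i++) {
                // Stands in for a hook that never returns.
                pool.submit(() -> {
                    release.await();
                    return null;
                });
            }
            Future<String> extra = pool.submit(() -> "ok");
            try {
                extra.get(waitMillis, TimeUnit.MILLISECONDS);
                return false;
            } catch (TimeoutException starved) {
                return true;
            } catch (Exception other) {
                throw new IllegalStateException(other);
            }
        } finally {
            // Unblock the stuck tasks and stop the pool so the JVM can exit.
            release.countDown();
            pool.shutdownNow();
        }
    }
}
```

With two handlers and two stuck requests the extra request times out, while with one stuck request it completes. This mirrors the slide's point that the failure stays invisible until every handler is occupied.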

Page 20: HBaseConEast2016: Coprocessors – Uses, Abuses and Solutions

Coprocessor Whitelisting
– Coprocessors are key to HBase operation:
  • AccessController
  • TokenProvider
  • SecureBulkLoadEndpoint
  • MultiRowMutationEndpoint
– hbase.coprocessor.user.enabled – disables all user coprocessors (e.g. Apache Phoenix)
– HBASE-16700 – “Allow for coprocessor whitelisting” or abuse HBASE-15686

Page 21: HBaseConEast2016: Coprocessors – Uses, Abuses and Solutions

Recap
– Coprocessors are dangerous:
  “Coprocessors are an advanced feature of HBase and are intended to be used by system developers only.” – HBase Book
– Write defensive code!
– Needed from the community
  • Story for coprocessor deployment
  • Process isolation
  • JMX metrics

Page 22: HBaseConEast2016: Coprocessors – Uses, Abuses and Solutions

Thank you!