Meetup presentation 06192013



Accumulo presentation from the June 19, 2013 Meetup. Presenter: Joe Cavanaugh


sqrrl Secure.  Scale.  Adapt  

Sqrrl Data, Inc., All Rights Reserved

Security  of  data  within  Hadoop  



<5%  of  Data  


General Data Problems

Source:    Forrester  


What about security?



What is the market saying?

• Security becomes an “enabler” by making it possible to bring together huge stores of data

• You want security to be just as scalable, high-performance and self-organizing as the clusters

• Most big data technologies don’t have any security features built in

• Want fine-grained security and policy control at the database level



A few more risks…

• With every copy of data, there is an increased risk of unintended disclosure

• Every now and then, people with access and privileges look at records without a legitimate business purpose, e.g., an employee of a banking system looking up their neighbor


The Perfect Storm


[Figure: Big Data surrounded by its business uses (Security Analysis, Customer Support, Customer Profiles, Sales & Marketing, Social Media, Business Improvement), with the labels "Increased profits" and "Regulations & Breaches Increased"]


The Perfect Storm

• Big Data is a time bomb based on how things are coming together

• Big Data deployment is growing fast; organizations are rushing into it

• Shortage in Big Data skills

• Big Data security solutions are not effective

• General shortage in security skills



So  what  can  we  do?  



Data-Centric Security

(Def.) A form of security in which data carries with it the elements of provenance that are required to make policy decisions on its visibility:

• Separate data modeling for security and analysis

• Data comes with security attributes governing its visibility… is self-describing

• Reusability of applications across security domains

• Distributed development of ingest and query applications

• Supported by Accumulo’s cell-level security


Data-Centric Security

Within Accumulo, a key is a 5-tuple, consisting of:

• Row: Controls Atomicity
• Column Family: Controls Locality
• Column Qualifier: Controls Uniqueness
• Visibility Label: Controls Access
• Timestamp: Controls Versioning

Row       Col. Fam.     Col. Qual.     Visibility   Timestamp  Value
John Doe  Notes         PCP            PCP_JD       20120912   Patient suffers from an acute …
John Doe  Test Results  Cholesterol    JD|PCP_JD    20120912   183
John Doe  Test Results  Mental Health  JD|PSYCH_JD  20120801   Pass
John Doe  Test Results  X-Ray          JD|PHYS_JD   20120513   1010110110100…

Accumulo  Key/Value  Example  
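The key structure and visibility labels above can be sketched in a few lines of Python. This is an illustrative model only, not Accumulo's Java API: the `Key` class and the `is_visible` function are hypothetical names, and the evaluator covers just the `&` (and), `|` (or) and parenthesis syntax of visibility expressions, assuming labels are made of letters, digits and underscores.

```python
import re
from typing import FrozenSet, NamedTuple


class Key(NamedTuple):
    """Hypothetical stand-in for the 5-tuple key described on the slide."""
    row: str
    column_family: str
    column_qualifier: str
    visibility: str
    timestamp: int


_TOKEN = re.compile(r"[A-Za-z0-9_]+|[&|()]")


def is_visible(expression: str, authorizations: FrozenSet[str]) -> bool:
    """Return True if the authorizations satisfy the visibility expression."""
    if expression == "":
        return True  # an unlabeled cell is visible to every user
    tokens = _TOKEN.findall(expression)
    if "".join(tokens) != expression:
        raise ValueError("unsupported character in expression")
    pos = 0

    def parse_term() -> bool:
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            result = parse_expr()
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("missing closing parenthesis")
            pos += 1
            return result
        if tok in ("&", "|", ")"):
            raise ValueError("unexpected token " + tok)
        return tok in authorizations  # a plain label: does the user hold it?

    def parse_expr() -> bool:
        nonlocal pos
        result = parse_term()
        op = None
        while pos < len(tokens) and tokens[pos] in ("&", "|"):
            if op is None:
                op = tokens[pos]
            elif tokens[pos] != op:
                # mixing the two operators needs explicit parentheses
                raise ValueError("mixing & and | requires parentheses")
            pos += 1
            rhs = parse_term()
            result = (result and rhs) if op == "&" else (result or rhs)
        return result

    result = parse_expr()
    if pos != len(tokens):
        raise ValueError("unbalanced expression")
    return result


cholesterol = Key("John Doe", "Test Results", "Cholesterol", "JD|PCP_JD", 20120912)
print(is_visible(cholesterol.visibility, frozenset({"PCP_JD"})))    # primary care physician
print(is_visible(cholesterol.visibility, frozenset({"PSYCH_JD"})))  # psychiatrist
```

With the cholesterol row from the table, a user holding only `PCP_JD` passes the check and a user holding only `PSYCH_JD` does not, because the label `JD|PCP_JD` requires either `JD` or `PCP_JD`.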


Data-Centric Security

Row  Col    Value
1    Name   Jones
1    Sales  100
1    Age    28
2    Name   Smith
2    Sales  350
2    Age    25
2    Quota  1000

Row  Col    Value
1    Name   Anon1
1    Sales  100
2    Name   Smith
2    Sales  350
2    Quota  1000

[Figure labels: User 1, User 2, Data Store]

A data-centric security approach allows all the data to be stored on a single platform, with only authorized data returned to the user

Pushing security to the data level simplifies application development and enables more powerful queries
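As a rough sketch of the idea, the scan below filters one shared store by each user's authorizations, so the application code never branches on who is asking. The values come from the slide, but the labels `sales` and `hr`, the `Cell` class and the `scan` function are hypothetical, chosen only for illustration. Each cell carries a single label here, and the anonymized name (Anon1) shown on the slide is not modeled.

```python
from typing import List, NamedTuple, Set, Tuple


class Cell(NamedTuple):
    row: int
    column: str
    value: object
    label: str  # hypothetical single-token visibility label


# Values from the slide; the labels are assumptions made for this sketch.
STORE = [
    Cell(1, "Name", "Jones", "hr"),
    Cell(1, "Sales", 100, "sales"),
    Cell(1, "Age", 28, "hr"),
    Cell(2, "Name", "Smith", "sales"),
    Cell(2, "Sales", 350, "sales"),
    Cell(2, "Age", 25, "hr"),
    Cell(2, "Quota", 1000, "sales"),
]


def scan(store: List[Cell], authorizations: Set[str]) -> List[Tuple[int, str, object]]:
    """Return only the cells whose label the caller is authorized to see."""
    return [(c.row, c.column, c.value) for c in store if c.label in authorizations]


# The same query yields a different view for each set of authorizations.
for row in scan(STORE, {"sales"}):
    print(row)
```

Because the filter runs inside the scan, a user with only the hypothetical `sales` authorization never receives the `hr` cells, while a user with both authorizations sees all seven.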


Encryption of Files

We now have user access to the data secured. But what about your HDFS administrators?


By encrypting the files we write into HDFS, we further restrict who can access the data!