
Discover Enterprise Security Features in Hortonworks Data Platform 2.1: Apache Knox


Page 1 © Hortonworks Inc. 2014

Discover HDP 2.1 New Features for Security & Apache Knox

Hortonworks. We do Hadoop.


Speakers

Justin Sears

Hortonworks Product Marketing Manager

Vinay Shukla

Hortonworks Director of Product Management & owner of Hortonworks security roadmap

Kevin Minder

Hortonworks Engineer & Committer for Apache Knox Gateway project


Agenda

•  Security for Hadoop REST/HTTP API – Knox Gateway

•  HDFS Security – ACLs

•  SQL Security – Next Generation Hive Authorization


A Modern Data Architecture

[Diagram. Layers shown:
• SOURCES: OLTP, ERP, CRM systems; documents, emails; web logs, click streams; social networks; machine generated; sensor data; geolocation data
• DATA SYSTEM: REPOSITORIES (RDBMS, EDW, MPP) alongside ENTERPRISE HADOOP (Data Access, Data Management, Governance & Integration, Security, Operations)
• APPLICATIONS: business analytics, custom applications, packaged applications
• OPERATIONS TOOLS: provision, manage & monitor
• DEV & DATA TOOLS: build & test]


HDP 2.1: Enterprise Hadoop

[Diagram: HDP 2.1 Hortonworks Data Platform. Components shown:
• GOVERNANCE & INTEGRATION: Data Workflow, Lifecycle & Governance (Falcon, Sqoop, Flume, NFS, WebHDFS)
• DATA ACCESS: Batch (MapReduce), Script (Pig), SQL (Hive/Tez, HCatalog), NoSQL (HBase, Accumulo), Stream (Storm), Search (Solr), Others (In-Memory Analytics, ISV engines), all on YARN: Data Operating System
• DATA MANAGEMENT: HDFS (Hadoop Distributed File System), nodes 1 to N
• SECURITY: Authentication, Authorization, Accounting, Data Protection. Storage: HDFS; Resources: YARN; Access: Hive, …; Pipeline: Falcon; Cluster: Knox
• OPERATIONS: Provision, Manage & Monitor (Ambari, Zookeeper); Scheduling (Oozie)]



Security: Rings of Defense

Perimeter Level Security
•  Network Security (i.e. Firewalls)
•  Apache Knox (i.e. Gateways)

Authentication
•  Kerberos

OS Security

Authorization
•  MR ACLs
•  HDFS Permissions
•  HDFS ACLs
•  HiveATZ-NG
•  HBase ACLs
•  Accumulo Label Security

Data Protection
•  Core Hadoop
•  Partners


Security for Hadoop REST API – Apache Knox Gateway


Current Hadoop Client Model

• FileSystem and MapReduce Java APIs
• HDFS, Pig, Hive and Oozie clients (that wrap the Java APIs)
• Typical use of APIs is via “Edge Node” that is “inside” cluster
• Users SSH to Edge Node and execute API commands from shell

[Diagram: a Hadoop user connects to the Edge Node over SSH.]


Why Knox?

Simplified Access
• Single Hadoop access point
• Rationalized REST API hierarchy
• Consolidated API calls
• Multi-cluster support
• Client DSL

Centralized Security
• Eliminate SSH “edge node”
• Central API management & audit
• Service-level authorization

Identity Management
• SSO Integration
• LDAP & AD integration

Knox eliminates the client’s need for intimate knowledge of the cluster topology.
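
As a rough illustration of the single access point, the sketch below builds the URL a REST client might use to list an HDFS directory through Knox. It assumes Knox's default URL layout (port 8443, the `gateway` context path, and one topology name per cluster); the host `knox.example.com` and the topology `sandbox` are hypothetical placeholders, not values from this deck.

```python
from urllib.parse import quote, urlencode


def knox_webhdfs_url(gateway_host, topology, hdfs_path, op,
                     port=8443, gateway_path="gateway"):
    """Build a WebHDFS URL that goes through a Knox gateway.

    Assumed layout: https://{host}:{port}/{gateway_path}/{topology}/webhdfs/v1{path}?op=...
    The client names the gateway and the cluster (topology) only; it does not
    need to know which host runs the NameNode.
    """
    query = urlencode({"op": op})
    return (f"https://{gateway_host}:{port}/{gateway_path}/{topology}"
            f"/webhdfs/v1{quote(hdfs_path)}?{query}")


# Hypothetical gateway host and topology name, for illustration only.
url = knox_webhdfs_url("knox.example.com", "sandbox", "/sales-data", "LISTSTATUS")
print(url)
```

Pointing the same client at a second cluster would only mean changing the topology segment of the path, which is one way to read the "multi-cluster support" item above.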


Hadoop REST API Security: Drill-Down

[Diagram: a REST client sends HTTP requests through a firewall to a load balancer (LB) in the DMZ, which passes them to Knox Gateway instances (GW). The gateway talks LDAP to the enterprise identity provider (LDAP/AD) and forwards HTTP calls through a second firewall to the Hadoop clusters. Each cluster shows masters (NN, RM, Oozie, WebHCat, HS2, HBase) and slaves (DN, NM). Edge nodes running the Hadoop CLIs reach the clusters over RPC.]


Knox Summary

• Simplifies Client Interaction with REST Web Services

• Abstracts away complexities of Kerberos

• Integrates with LDAP, SiteMinder & other protocols in future

• Provides Authorization to each Web Service with IP, User, Group policies

• Able to secure multiple clusters through a single endpoint


HDFS Access Control List (ACL)


HDFS Permissions Model Before HDP 2.1

• HDFS permissions at a File & Directory level
• Managed by a set of 3 distinct user classes
  – “owner”, “group” and “others”
• 3 permissions for each user class
  – Read (“r”), Write (“w”), Execute (“x”)
  – For Files, “r” for read, “w” for write
  – For Directories, “r” to list content, “w” to create/delete files + directories, “x” to access a child of the directory

[Diagram: an HDFS directory with rwx permission bits for each of Owner, Group and Others.]


HDFS File Permissions Example

• Authorization requirements:
  – In a sales department, they would like a single user Maya (Department Manager) to control all modifications to sales data
  – Other members of sales department need to view the data, but can’t modify it.
  – Everyone else in the company must not be allowed to view the data.

• Can be implemented via the following:

[Diagram: file with sales data. User: read/write perm for user maya. Group: read perm for group sales.]
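
The owner/group/others decision for this example can be sketched in a few lines. This is a simplified model, not HDFS code: it assumes the file carries mode `rw-r-----` with owner `maya` and group `sales`, and the users `diane` and `erin` are hypothetical names used only to exercise the three classes.

```python
def classic_check(mode, owner, group, user, user_groups, want):
    """Return True if `user` gets permission `want` ('r', 'w' or 'x').

    `mode` is a 9-character string such as 'rw-r-----'. Exactly one class
    applies: owner first, then group, then others.
    """
    if user == owner:
        bits = mode[0:3]
    elif group in user_groups:
        bits = mode[3:6]
    else:
        bits = mode[6:9]
    return want in bits


MODE = "rw-r-----"  # assumed mode for the sales data file

maya_can_write = classic_check(MODE, "maya", "sales", "maya", {"sales"}, "w")
sales_can_read = classic_check(MODE, "maya", "sales", "diane", {"sales"}, "r")
sales_can_write = classic_check(MODE, "maya", "sales", "diane", {"sales"}, "w")
others_can_read = classic_check(MODE, "maya", "sales", "erin", {"finance"}, "r")
print(maya_can_write, sales_can_read, sales_can_write, others_can_read)
```

Because only one user class and one group class exist, there is no way in this model to give a second user write access without widening the group's permissions.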


HDFS Extended ACLs in HDP 2.1

• Problem
  – No longer feasible for Maya to control all modifications to the file
  – New Requirement: Maya, Diane and Clark are allowed to make modifications
  – New Requirement: New group called executives should be able to read the sales data
  – Current permissions model only allows permissions at 1 group and 1 user

• Solution: HDFS Extended ACLs
  – Now assign different permissions to different users and groups

[Diagram: an HDFS directory with rwx permission bits for Owner, Group and Others, plus additional rwx entries for Group D, Group F and User Y.]


HDFS Extended ACLs in HDP 2.1

New Tools for ACL Management (setfacl, getfacl)

– hdfs dfs -setfacl -m group:execs:r-- /sales-data
– hdfs dfs -getfacl /sales-data
    # file: /sales-data
    # owner: maya
    # group: sales
    user::rw-
    group::r--
    group:execs:r--
    mask::r--
    other::---

How do you know if a directory has ACLs set?
– hdfs dfs -ls /sales-data
    Found 1 items
    -rw-r-----+  3 maya sales          0 2014-03-04 16:31 /sales-data
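
To make the role of the `mask` entry concrete, here is a minimal sketch of how an ACL like the one printed by `getfacl` above could be evaluated. It follows the POSIX-style ordering (owner, then named users, then groups, then other), which is my understanding of what HDFS implements; it is a simplified model, and the user names other than `maya` are hypothetical.

```python
def parse_acl(getfacl_lines):
    """Parse 'type:name:perms' entries, skipping '# ...' header lines."""
    acl = {}
    for line in getfacl_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        kind, name, perms = line.split(":")
        acl[(kind, name)] = set(perms) - {"-"}
    return acl


def acl_check(acl, owner, owning_group, user, user_groups, want):
    """Return True if `user` gets permission `want` ('r', 'w' or 'x')."""
    mask = acl.get(("mask", ""), set("rwx"))
    if user == owner:                      # owner entry is not masked
        return want in acl[("user", "")]
    if ("user", user) in acl:              # named user entry, filtered by mask
        return want in (acl[("user", user)] & mask)
    in_some_group = False
    for (kind, name), perms in acl.items():
        if kind != "group":
            continue
        group_name = owning_group if name == "" else name
        if group_name in user_groups:
            in_some_group = True
            if want in (perms & mask):     # group entries are masked too
                return True
    if in_some_group:                      # matched a group, none granted it
        return False
    return want in acl[("other", "")]


SALES_ACL = parse_acl([
    "# file: /sales-data", "# owner: maya", "# group: sales",
    "user::rw-", "group::r--", "group:execs:r--", "mask::r--", "other::---",
])

exec_can_read = acl_check(SALES_ACL, "maya", "sales", "exec_user", {"execs"}, "r")
exec_can_write = acl_check(SALES_ACL, "maya", "sales", "exec_user", {"execs"}, "w")
outsider_can_read = acl_check(SALES_ACL, "maya", "sales", "outsider", {"hr"}, "r")
print(exec_can_read, exec_can_write, outsider_can_read)
```

The named-group entry is what lets members of `execs` read the file without being added to `sales`.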


HDFS Extended ACLs in HDP 2.1

Default ACLs
– hdfs dfs -setfacl -m default:group:execs:r-x /monthly-sales-data
– hdfs dfs -mkdir /monthly-sales-data/JAN
– hdfs dfs -getfacl /monthly-sales-data/JAN
    # file: /monthly-sales-data/JAN
    # owner: maya
    # group: sales
    user::rwx
    group::r-x
    group:execs:r-x
    mask::r-x
    other::---
    default:user::rwx
    default:group::r-x
    default:group:execs:r-x
    default:mask::r-x
    default:other::---
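
The inheritance shown in that output can be approximated as follows. This is a deliberately simplified sketch: it copies the parent's default entries into a new child and ignores the additional filtering by the creation mode that a real NameNode applies, so treat it as an illustration of the idea rather than of the exact HDFS behavior.

```python
def inherit_default_acl(parent_default_entries, child_is_directory):
    """Return (access_entries, default_entries) for a newly created child.

    Assumption: the parent's default ACL becomes the child's access ACL;
    a child directory also keeps it as its own default ACL so that the
    entries keep propagating further down the tree.
    """
    access = list(parent_default_entries)
    default = list(parent_default_entries) if child_is_directory else []
    return access, default


def format_acl(access, default):
    """Render entries in the order getfacl prints them."""
    return list(access) + [f"default:{entry}" for entry in default]


# Default ACL of /monthly-sales-data, as reflected in the getfacl output above.
PARENT_DEFAULT = ["user::rwx", "group::r-x", "group:execs:r-x",
                  "mask::r-x", "other::---"]

jan_acl = format_acl(*inherit_default_acl(PARENT_DEFAULT, child_is_directory=True))
print("\n".join(jan_acl))
```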


SQL-Style Security for Hive – ATZ-NG


Hive Authorization Before HDP 2.1

HiveAuthorizationProvider (HAP) as the base interface

1.  StorageBasedAuthorizationProvider

– Uses HDFS permissions to make authorization decision

– HDFS dir permission = Table Permission

– Coarse grained, no column level security

– Secure://hive.apache.org/docs/hcat_r0.5.0/authorization.pdf

2.  DefaultHiveAuthorizationProvider

– BROKEN. HORTONWORKS RECOMMENDATION: DO NOT USE

– RDBMS style authorization provider

– Does not check all operations

– Does not check policy grants


Hive Authorization in HDP 2.1

• Many paths into Hive
  – Hive CLI, Beeline, Oozie, Hue, Pig, HCatalog, etc.
  – Admin type users use CLI, Pig, HCatalog
  – Business users use O/JDBC, Beeline

• Other security concerns
  – Authentication is enforced. It is a pre-requisite to meaningful authorization
  – No direct access to HDFS – cluster is Kerberized and restricts access
  – Hive Metastore is protected and allows only authorized access
  – Views are used to provide row/column level access with ATZ-NG


Hive ATZ-NG – Architecture

[Diagram. Components: O/JDBC and Beeline CLI clients, HiveServer2, Metastore, HDFS, UDFs, Ambari, and the Hive CLI, Oozie, Hue, Pig and HCat clients. Steps: 0. Enable HiveATZ-NG; 1. Authentication; 2. Authorization (ATZ-NG). Labels: “Protected – fine grained” (ATZ-NG) and “Protected – coarse grained” (Storage Based Authorization). Notes: restrict direct access to Metastore; protect HDFS with Kerberos & HDFS ACL.]

•  ATZ-NG is called for O/JDBC & Beeline CLI
•  Standard SQL GRANT / REVOKE for management
•  Privilege to register UDF restricted to Admin user
•  Policy integrated with Table/View life cycle


Hive ATZ-NG Details

• SQL standard-based authorization
• Hive must be configured manually to enable it; a Hive restart is required
• Grants on tables or views to roles or users: GRANT/REVOKE action ON [table | view] TO role | user
• Policy stored in Hive Metastore
• Table/View lifecycle auto-synced with policy stored in Hive Metastore
• Grant/Revoke does an integrity check and prevents invalid policies
• Show grants on user | table | view | role shows the policy
• Supports delegated administration
• All data needs to be readable/writable by the Hive user; combined with HDFS ACLs, it need not be owned by the Hive user
• Backup of policy is the same as the Hive Metastore backup
• Checks the ability to register UDFs


What about MR/Pig/Hive CLI?

• All these are ETL run by privileged users

• Protect them at coarse grained level with StorageBasedAuthorization


Summary

ATZ-NG is a superior approach for Hive Authorization because it delivers:

1.  Familiar & DBA-friendly approach for defining security policies for Hive Tables. No additional education required to understand how to take advantage of this.

2.  Integrated and error-free policy definition approach which works in lock-step with the lifecycle of tables and views.

3.  Minimal additional operational overhead to take advantage of ATZ-NG; from no required MR/YARN restart through leveraging pre-existing Hive Metastore (and associated handling - back-up, recovery, etc.)


Hive ATZ-NG Example


Scenario

• Objective: Share Product Management Roadmap securely

• Actors:

  – Admin Role
    – Specified in hive-site
    – Admin role controls role memberships

  – Product Management Role
    – Should be able to create, read all road map details.
    – Members: Vinay Shukla, Tim Hall

  – Engineering Role
    – Should be able to read (see) all roadmap details
    – Members: Kevin Minder, Larry McCay


Step 1: Admin role Creates Roles, Adds Users

1.  CREATE ROLE PM;
2.  CREATE ROLE ENG;
3.  GRANT ROLE PM to user timhall with admin option;
4.  GRANT ROLE PM to user vinayshukla;
5.  GRANT ROLE ENG to user kevinminder with admin option;
6.  GRANT ROLE ENG to user larrymccay;


Step 2: Super-user Creates Tables/Views

create table hdp_hadoop_plans ( id int, hadoop_roadmap string, hdp_roadmap string );


Step 3: Users or Roles Assigned To Tables

1.  GRANT ALL ON hdp_hadoop_plans TO ROLE PM;
2.  GRANT SELECT ON hdp_hadoop_plans TO ROLE ENG;
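
The effect of Steps 1 to 3 can be modeled with a small in-memory policy. This is only a sketch of role-based grants, not how Hive stores or evaluates them: the `Policy` class is hypothetical, and expanding ALL to SELECT, INSERT, UPDATE and DELETE is an assumption about the SQL standard-based privilege set.

```python
ALL_PRIVILEGES = {"SELECT", "INSERT", "UPDATE", "DELETE"}  # assumed meaning of ALL


class Policy:
    """Toy model of roles, role memberships and table grants."""

    def __init__(self):
        self.role_members = {}   # role -> set of users
        self.table_grants = {}   # (table, role) -> set of privileges

    def grant_role(self, role, user):
        self.role_members.setdefault(role, set()).add(user)

    def grant(self, privilege, table, role):
        privileges = ALL_PRIVILEGES if privilege == "ALL" else {privilege}
        self.table_grants.setdefault((table, role), set()).update(privileges)

    def is_allowed(self, user, privilege, table):
        """A user is allowed if any role they belong to holds the privilege."""
        return any(
            user in self.role_members.get(role, set()) and privilege in granted
            for (granted_table, role), granted in self.table_grants.items()
            if granted_table == table
        )


policy = Policy()
# Step 1: role memberships
for role, user in [("PM", "timhall"), ("PM", "vinayshukla"),
                   ("ENG", "kevinminder"), ("ENG", "larrymccay")]:
    policy.grant_role(role, user)
# Step 3: table grants
policy.grant("ALL", "hdp_hadoop_plans", "PM")
policy.grant("SELECT", "hdp_hadoop_plans", "ENG")

print(policy.is_allowed("vinayshukla", "INSERT", "hdp_hadoop_plans"))
print(policy.is_allowed("kevinminder", "INSERT", "hdp_hadoop_plans"))
```

Under this model, members of PM can read and modify the roadmap table while members of ENG can only read it, which matches the scenario's objective.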


Learn More

Hortonworks.com/labs/security/

Register for the other six Discover HDP 2.1 Webinars Hortonworks.com/webinars

Next on the Security Roadmap