Site Access Control Grid Middleware 3 David Groep, lecture series 2005-2006

Page 1: Site Access Control

Site Access Control

Grid Middleware 3

David Groep, lecture series 2005-2006

Page 2: Site Access Control

Grid Middleware III 2

Outline

- Authorisation framework
  - policies, policy combination and enforcement
  - PDP, PIP and PEP
  - LCAS, GridSite, JavaTrustManager
- Credential mapping and local policies
  - unix domain for jobs
  - account mapping across clusters (LDAP, or NSS-JR)
  - workspace service and virtualisation
- Policy languages
  - SAML and XACML
- Scheduling and batch systems
  - getting into the mess: preserving priorities?
- Network and firewall issues

Page 3: Site Access Control

Authorization Stakeholders, a reminder

[Figure: authorization policy architecture. The stakeholders (VO, site/resource owner, others) contribute policy and attributes: the VO holds user attributes and policy, the site holds resource attributes and policy. An authorization service/PDP combines policy and attributes into an allow or deny, which the policy enforcement point applies in front of the resource. A user delegates to a process acting on the user's behalf, subject to a delegation policy. A translation service converts between PKI and local-site Kerberos identities. Key material, groups of unique names and organizational roles identify users and servers.]

Page 4: Site Access Control

Site Access Control

[Figure: building blocks of site access control. Authentication (proxy cert, host cert, trust anchors, revocation), authorization (site policy, VO policy from the attribute authority, access control), delegation (delegated cert, credential store), the service container and the resource, logging, sandboxing in user space, transport-level security, the network perimeter and a site proxy.]

Page 5: Site Access Control

What Problems should we solve

- Fit ‘external’ users into existing infrastructure
  - fork jobs, batch queuing systems, disks & mass stores, …
- Enforce (local) security policies
  - e.g. a panic button, blacklisting of users
- Ensure that global (VO) policies are satisfied locally
  - in as far as possible, given local mechanisms

Page 6: Site Access Control

What To Do?

1. collect policies that need to be enforced
2. collect the information on which to base the decision
3. make the decision

Page 7: Site Access Control

Distinct pieces in the process flow

- Authentication
- Authorization
  - results in a yes/no decision
  - possibly with obligations
- Service access
  - For legacy job execution services (Unix, Win32), an execution environment with a system credential is needed: credential mapping (site-local, to preserve autonomy)
  - For hosted services: execute the service request, taking into account service-specific access controls (such as ACLs for file catalogue access)

Page 8: Site Access Control

Architecture for Local Access Control

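[Figure: architecture for local access control. A PEP (service or message interceptor) asks for a decision. PIPs ("User in blacklist?", "Retrieve VO info", "Retrieve local info") fill a context cache. PDPs evaluate the policy chains (chain 1 OR chain 2 ...) and return the decision to the PEP. Marked components expose a PAP-accessible interface.]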

Page 9: Site Access Control

Authorization Framework components

- Policy Decision Point (PDP)
  - makes a binary “yes/no” decision based on a policy, which can be nested
- Policy Information Point (PIP)
  - extracts information from sources, needed to provide the assertions for making the decision
- Policies can be ‘hybrid’
  - “SAML with obligations”, where the policy evaluates to Yes/No, but a Yes carries specific obligations for the PEP to enforce (such as an account mapping)
- Policy Enforcement Point (PEP)
  - enforces the yes/no decision of a PDP
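The interplay of the three components can be sketched in a few lines of Python. This is a minimal illustration, not the interface of any existing framework: the names `Decision`, `blacklist_pip`, `vo_pip`, `pdp` and `pep`, the example DNs and the `pool` obligation are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Context = Dict[str, object]


@dataclass
class Decision:
    """A yes/no answer, where a yes may carry obligations for the PEP."""
    permit: bool
    obligations: Dict[str, str] = field(default_factory=dict)


# Hypothetical information sources that the PIPs consult.
BLACKLIST = {"/O=example/CN=Banned User"}
VO_MEMBERS = {"/O=example/CN=Alice": "atlas"}


def blacklist_pip(ctx: Context) -> None:
    """PIP: record whether the subject is on the site blacklist."""
    ctx["blacklisted"] = ctx["subject"] in BLACKLIST


def vo_pip(ctx: Context) -> None:
    """PIP: record the VO of the subject, or None if unknown."""
    ctx["vo"] = VO_MEMBERS.get(ctx["subject"])


def pdp(ctx: Context) -> Decision:
    """PDP: decide on the collected context; a yes carries an obligation."""
    if ctx["blacklisted"] or ctx["vo"] is None:
        return Decision(permit=False)
    return Decision(permit=True, obligations={"pool": str(ctx["vo"])})


def pep(subject: str, action: str,
        pips: List[Callable[[Context], None]],
        decide: Callable[[Context], Decision]) -> str:
    """PEP: gather information, ask the PDP, and enforce its answer."""
    ctx: Context = {"subject": subject, "action": action}
    for pip in pips:
        pip(ctx)
    decision = decide(ctx)
    if not decision.permit:
        return "deny"
    return "permit, map to pool " + decision.obligations["pool"]
```

Calling `pep("/O=example/CN=Alice", "submit", [blacklist_pip, vo_pip], pdp)` permits the request and hands the account-mapping obligation back to the enforcement point.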

Page 10: Site Access Control

A ‘perfectly’ integrated model

1. Collect policies and assertions through PIPs
   - attributes pushed by the client, e.g. VOMS-enabled proxies
   - attributes retrieved via a pull mode (Shib-style, e.g. via WAYF)
   - policies that constrain usage (restricted delegation)
   - site-local policies
   - resource-local policies
2. Assert their authenticity and validity
   - incoming policies and assertions should be signed (authentic)
   - not expired and recognised by the deciding party (valid)
3. Evaluate all these policies together in a (single) PDP
4. Possibly, return obligations, e.g. for unix system integration

Page 11: Site Access Control

Issues with an integrated model

- Requires harmonisation (or translation via a context cache)
- Should include precedence information (site prevalence)
- In practice, some ‘quick’ decisions should be made earlier for performance reasons (blacklisting, for instance)
- Today a mixture of
  - coarse-grained only (compute)
  - ACL based (storage)
  - simple binary (brokering)

Page 12: Site Access Control

Issues with the model (2)

- AuthZ cannot always be cleanly separated from the service
- Example: data access
  - access to the service ‘read file’ can be allowed, but access to any specific file is restricted by ACLs on the file
  - so the AuthZ front-end should either ‘snoop’ into the service and use the ACLs
  - or the business logic should do the AuthZ (again)
- see the access control to services slide later

Page 13: Site Access Control

Local Policy today: mapping credentials

[Figure: a single grid identity is mapped to a local name at each site, and the local policy of each site is then applied to that local name.]

graphic: Frank Siebenlist, Argonne Natl. Lab, Globus Security with SAML, Shibboleth and GridShib, May 2005

Grid user is mapped to local identities to determine policy

At the site, policies are expressed in these local identities (like unix groups)

Page 14: Site Access Control

Implementations today

- All separate authN and authZ
- Most separate mapping from authZ
  - where necessary, e.g. for databases and legacy execution
  - in the most general sense, the mapping is a handle for local policy
- Faced with two worlds: Java and native (C)
  - need a solution for both, but no unification yet
- Examples:
  - C world: LCAS & LCMAPS (so this talk misses out on Prima, GUMS, gPlazma, …)
  - Java: GT4 Authorization Framework (so this talk misses out on Generic AAA, PERMIS, …)
  - papers on the others are in the bundle!

Page 15: Site Access Control

The C World: job submission and GridFTP

- LCAS
  - authorization based on credentials and job information (RSL)
  - returns a yes/no answer
  - pluggable framework of PDPs, using the dlopen(3) system
- LCMAPS
  - credential mapping based on user credential and additional handle information
  - enforcement within the process space needed
- GUMS: access based on a site-local database of grid-to-local credentials
- gPlazma: gid and uid assignments for ACLs on storage

Page 16: Site Access Control

Job submission components

The simplest case:

1. make access decision
2. figure out a local policy mapping (unix account)
3. log what you did
4. run the job

Page 17: Site Access Control

Job Submission (classic case)

1. user authenticates to a gatekeeper
   - provides proxy with identity & attributes
   - provides job information (RSL)
2. gatekeeper verifies integrity and authN
3. do an authZ callout (in our case LCAS)
   - the AuthZ framework calls PDPs
4. do credential mapping
5. run the job manager

[Figure: classic job submission. The user, holding the proxy C=IT/O=INFN/L=CNAF/CN=Pinco Palla/CN=proxy with a VOMS pseudo-cert, authenticates to the Gatekeeper. The Gatekeeper passes the context and job info to the LCAS authZ callout, whose PDPs include VOMS, banlist and exectbl. On accept, LCMAPS ("open, learn, & run") returns a legacy uid, and the Job Manager does a fork+exec of the arguments or submits to the batch system.]

Page 18: Site Access Control

Policy Decision modules

- Typical PDPs
  - Blacklist and whitelist
  - VOMS attributes (group/role based access)
  - Proxy lifetime constraints
  - Checking quality of authentication tokens (OIDs)
  - Timeslots
- As many of these PDPs are (still) local, the grid is missing the ‘big red button’ …
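A chain of such decision modules can be sketched as a list of plug-in functions that must all say yes. This Python sketch only mirrors the idea of a pluggable chain; it is not the LCAS plug-in interface (LCAS plug-ins are C modules), and the ban list, allowed groups, lifetime limit and opening hours are invented values.

```python
from datetime import datetime, time, timedelta
from typing import Callable, Dict, List, Tuple

Request = Dict[str, object]

# Hypothetical site policy data.
BANNED_DNS = {"/O=example/CN=Mallory"}
ALLOWED_VOMS_GROUPS = {"/atlas", "/dteam"}
MAX_PROXY_LIFETIME = timedelta(hours=24)
OPEN_FROM, OPEN_UNTIL = time(8, 0), time(20, 0)


def not_banned(req: Request) -> bool:
    return req["dn"] not in BANNED_DNS


def voms_group_allowed(req: Request) -> bool:
    return any(g in ALLOWED_VOMS_GROUPS for g in req["voms_groups"])


def proxy_lifetime_ok(req: Request) -> bool:
    return req["proxy_lifetime"] <= MAX_PROXY_LIFETIME


def within_timeslot(req: Request) -> bool:
    return OPEN_FROM <= req["now"].time() <= OPEN_UNTIL


# The cheap ban-list check comes first, as a 'quick' early decision.
PLUGINS: List[Tuple[str, Callable[[Request], bool]]] = [
    ("banlist", not_banned),
    ("voms", voms_group_allowed),
    ("proxy_lifetime", proxy_lifetime_ok),
    ("timeslot", within_timeslot),
]


def authorize(req: Request, plugins=PLUGINS) -> Tuple[bool, str]:
    """Return (True, "") if every plug-in permits, else (False, failing plug-in)."""
    for name, plugin in plugins:
        if not plugin(req):
            return False, name
    return True, ""
```

Returning the name of the refusing plug-in makes the decision easy to log, which matches the "log what you did" step of job submission.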

Page 19: Site Access Control

PDP configuration

- A PDP is driven by a policy language
  - a typical choice is XACML

Page 20: Site Access Control

Policy languages

Driving the PDP

Page 21: Site Access Control

Policy Evaluation (a reminder)

Page 22: Site Access Control

Policy Languages

- Home-grown languages
  - infinite number out there …
- XACML
  - structured “Subject, Resource, Action” triplets
  - supports logical constructs (and, or)
  - also includes a retrieve/push operation**
- SAML
  - assertion mark-up, but can also be used to express policies**
  - not typically used as such (XACML has a preference in the community, since it is more expressive for ACLs)
- GACL (Grid ACL)
  - semantically a subset of XACML
  - can be translated back and forth
  - comprehensible for sysadmins

Page 23: Site Access Control

SAML – conveying assertions

- Security Assertion Mark-up Language
  - XML format for exchanging assertions over the wire
  - a message exchange protocol: how to ask for and get assertions
  - OASIS standard
- In itself, SAML does not define the integrity of the assertion
  - the SAML assertion itself is typically signed using XML Signature when travelling between untrusted end-points

Page 24: Site Access Control

SAML assertion types

- Authentication Assertion
- Attribute Assertion
- Authorisation Decision

Page 25: Site Access Control

All assertions have some common information

- Issuer and issuance timestamp
- Assertion ID
- Subject
  - Name plus the security domain
  - Optional subject confirmation, e.g. public key
- “Conditions” under which the assertion is valid
  - SAML clients must reject assertions containing unsupported conditions
  - Special kind of condition: assertion validity period
- Additional “advice”
  - E.g., to explain how the assertion was made

Page 26: Site Access Control

Attribute Assertion

- An issuing authority asserts that:
  - subject S
  - is associated with attributes A, B, …
  - with values “a”, “b”, “c”…
- Typically this would be obtained from an LDAP repository
  - “john.doe” in “example.com”
  - is associated with attribute “Department”
  - with value “Human Resources”

Page 27: Site Access Control

SAML Attribute Assertion

<saml:Assertion …>
  <saml:Conditions …/>
  <saml:AttributeStatement>
    <saml:Subject>
      <saml:NameIdentifier SecurityDomain="smithco.com" Name="joeuser" />
    </saml:Subject>
    <saml:Attribute AttributeName="PaidStatus"
                    AttributeNamespace="http://smithco.com">
      <saml:AttributeValue>PaidUp</saml:AttributeValue>
    </saml:Attribute>
  </saml:AttributeStatement>
</saml:Assertion>
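A relying party has to pull the subject and attribute values out of such an assertion before it can use them in a decision. The sketch below does so with Python's standard `xml.etree.ElementTree`. The namespace URI `urn:example:saml-sketch` is a placeholder, because the slide elides the real declaration, and the function name `read_attributes` is an assumption. Signature and validity checks, which must come first in practice, are left out.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace: the slide elides the real SAML namespace declaration.
SAML_NS = "urn:example:saml-sketch"

ASSERTION = f"""
<saml:Assertion xmlns:saml="{SAML_NS}">
  <saml:AttributeStatement>
    <saml:Subject>
      <saml:NameIdentifier SecurityDomain="smithco.com" Name="joeuser"/>
    </saml:Subject>
    <saml:Attribute AttributeName="PaidStatus"
                    AttributeNamespace="http://smithco.com">
      <saml:AttributeValue> PaidUp </saml:AttributeValue>
    </saml:Attribute>
  </saml:AttributeStatement>
</saml:Assertion>
"""


def read_attributes(xml_text: str) -> dict:
    """Return {subject: {attribute name: [values]}} for every attribute statement."""
    ns = {"saml": SAML_NS}
    root = ET.fromstring(xml_text.strip())
    result: dict = {}
    for statement in root.findall("saml:AttributeStatement", ns):
        ident = statement.find("saml:Subject/saml:NameIdentifier", ns)
        subject = f'{ident.get("Name")}@{ident.get("SecurityDomain")}'
        for attribute in statement.findall("saml:Attribute", ns):
            values = [(v.text or "").strip()
                      for v in attribute.findall("saml:AttributeValue", ns)]
            result.setdefault(subject, {})[attribute.get("AttributeName")] = values
    return result
```

The resulting dictionary is the kind of information a PIP would place in the decision context.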

Page 28: Site Access Control

AuthZ Decision Assertion

- An issuing authority decides whether to grant the request:
  - by subject S
  - for access type A
  - to resource R
  - given evidence E
- The subject could be a human or a program
- The resource could be a web page or a web service, for example

Page 29: Site Access Control

SAML Authorization Decision Assertion

<saml:Assertion …>
  <saml:Conditions …/>
  <saml:AuthorizationStatement Decision="Permit"
                               Resource="http://jonesco.com/rpt_12345.htm">
    <saml:Subject>
      <saml:NameIdentifier SecurityDomain="smithco.com" Name="joeuser" />
    </saml:Subject>
  </saml:AuthorizationStatement>
</saml:Assertion>

Page 30: Site Access Control

XACML: access control

- XACML is well suited to express local policies
- can interoperate with incoming SAML assertions by translation
- works on a triplet:
  (subject, resource, action) – RULE –> (decision [,obligation])

Page 31: Site Access Control

XACML policy type

Page 32: Site Access Control

XACML rules

a (set of) conditions that evaluate the incoming triples and result in a decision

Page 33: Site Access Control

XACML request example

<AAARequest>
  <Subject Id="subject">
    <Attribute AttributeId="subject:subject-id" Issuer="[email protected]">
      <AttributeValue>[email protected]</AttributeValue>
    </Attribute>
    <Attribute AttributeId="subject:role" Issuer="[email protected]">
      <AttributeValue>Analyst</AttributeValue>
    </Attribute>
  </Subject>
  <Resource>
    <Attribute AttributeId="resource:resource-id" Issuer="[email protected]">
      <AttributeValue>http://xps1.example.org/XPS1</AttributeValue>
    </Attribute>
  </Resource>
  <Action>
    <Attribute AttributeId="action:token" Issuer="[email protected]">
      <AttributeValue>SubmitJob</AttributeValue>
    </Attribute>
  </Action>
  <Environment>
    <Attribute AttributeId="environment:issue-time" Issuer="[email protected]">
      <AttributeValue>2001-12-17T10:30:00</AttributeValue>
    </Attribute>
  </Environment>
</AAARequest>

XACML example from Yuri Demchenko, collaboratory.nl

Page 34: Site Access Control

XACML policy: login from 9 AM to 5 PM

<Policy PolicyId="SamplePolicy"
        RuleCombiningAlgId="urn:oasis:names:tc:xacml:1.0:rule-combining-algorithm:first-applicable">
  <Target>
    <Subjects> <AnySubject/> </Subjects>
    <Resources>
      <Resource>
        <ResourceMatch MatchId="urn:oasis:names:tc:xacml:1.0:function:string-equal">
          <AttributeValue>SampleServer</AttributeValue>
          <ResourceAttributeDesignator AttributeId="urn:oasis:names:tc:xacml:1.0:resource:resource-id"/>
        </ResourceMatch>
      </Resource>
    </Resources>
    <Actions> <AnyAction/> </Actions>
  </Target>
  <Rule RuleId="LoginRule" Effect="Permit">
    <!-- Only use this Rule if the action is login -->
    <Target>
      <Subjects> <AnySubject/> </Subjects>
      <Resources> <AnyResource/> </Resources>
      <Actions>
        <Action>
          <ActionMatch>
            <AttributeValue>login</AttributeValue>
            <ActionAttributeDesignator AttributeId="ServerAction"/>
          </ActionMatch>
        </Action>
      </Actions>
    </Target>
    <Condition FunctionId="urn:oasis:names:tc:xacml:1.0:function:and">
      <Apply FunctionId="urn:oasis:names:tc:xacml:1.0:function:time-greater-than-or-equal">
        <Apply FunctionId="urn:oasis:names:tc:xacml:1.0:function:time-one-and-only">
          <EnvironmentAttributeDesignator AttributeId="urn:oasis:names:tc:xacml:1.0:environment:current-time"/>
        </Apply>
        <AttributeValue>09:00:00</AttributeValue>
      </Apply>
      <Apply FunctionId="urn:oasis:names:tc:xacml:1.0:function:time-less-than-or-equal">
        <Apply FunctionId="urn:oasis:names:tc:xacml:1.0:function:time-one-and-only">
          <EnvironmentAttributeDesignator AttributeId="urn:oasis:names:tc:xacml:1.0:environment:current-time"/>
        </Apply>
        <AttributeValue>17:00:00</AttributeValue>
      </Apply>
    </Condition>
  </Rule>
</Policy>
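What this policy decides can be written out by hand in a few lines. The sketch below is not an XACML engine: it hard-codes the policy target, the rule target and the condition of the example, and the function name `evaluate_login_policy` is an assumption. With a single Permit rule and the first-applicable combining algorithm, a request that misses a target or fails the condition yields NotApplicable rather than Deny, so it is up to the PEP to treat that as a refusal.

```python
from datetime import time


def evaluate_login_policy(resource: str, action: str, now: time) -> str:
    """Hand-written mirror of the example policy: 'Permit' or 'NotApplicable'."""
    # Policy <Target>: the policy only applies to the resource "SampleServer".
    if resource != "SampleServer":
        return "NotApplicable"
    # Rule <Target>: the rule only applies to the action "login".
    if action != "login":
        return "NotApplicable"
    # <Condition>: and(current-time >= 09:00:00, current-time <= 17:00:00).
    if time(9, 0) <= now <= time(17, 0):
        return "Permit"
    # A rule whose condition is false does not apply; no other rule exists.
    return "NotApplicable"
```

For example, `evaluate_login_policy("SampleServer", "login", time(10, 30))` gives `"Permit"`, while the same request at 18:00 gives `"NotApplicable"`.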

Page 35: Site Access Control

Simpler policy languages

- XACML is too expressive to be understood by people
- so there are lots of
  - graphical editors (not so handy for large-scale administration)
  - translators from and to simpler languages
- Example: GACL, the Grid ACL language

Page 36: Site Access Control

GACL example

<?xml version="1.0"?>
<gacl version="0.0.1">
  <entry>
    <person>
      <dn>/O=dutchgrid/O=users/O=nikhef/CN=Willem van Leeuwen</dn>
    </person>
    <allow><read/><write/></allow>
    <deny><admin/></deny>
  </entry>
  <entry>
    <voms-cred>
      <vo>iteam</vo>
      <group>/iteam</group>
    </voms-cred>
    <allow><read/><write/></allow>
    <deny><list/><admin/></deny>
  </entry>
</gacl>

This piece of GACL is for use with the SlashGrid filesystem, i.e. it is geared towards file system operations. See www.gridsite.org
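Evaluating such an ACL for one user is simple enough to sketch. The Python below parses the example with the standard `xml.etree.ElementTree` and assumes that the effective permissions are everything allowed by matching entries minus everything denied by them. That combination rule and the function name `permissions` are assumptions of this sketch, not a statement about the GridSite library.

```python
import xml.etree.ElementTree as ET

GACL = """<?xml version="1.0"?>
<gacl version="0.0.1">
  <entry>
    <person>
      <dn>/O=dutchgrid/O=users/O=nikhef/CN=Willem van Leeuwen</dn>
    </person>
    <allow><read/><write/></allow>
    <deny><admin/></deny>
  </entry>
  <entry>
    <voms-cred>
      <vo>iteam</vo>
      <group>/iteam</group>
    </voms-cred>
    <allow><read/><write/></allow>
    <deny><list/><admin/></deny>
  </entry>
</gacl>"""


def permissions(acl_xml: str, dn: str, voms_groups=()) -> set:
    """Permissions for a user: allowed by matching entries, minus denied ones."""
    allowed, denied = set(), set()
    for entry in ET.fromstring(acl_xml).findall("entry"):
        person_dn = entry.findtext("person/dn")
        group = entry.findtext("voms-cred/group")
        matches = person_dn == dn or (group is not None and group in voms_groups)
        if matches:
            allowed |= {perm.tag for perm in entry.findall("allow/*")}
            denied |= {perm.tag for perm in entry.findall("deny/*")}
    return allowed - denied
```

A user without a matching entry ends up with the empty set, i.e. no access.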

Page 37: Site Access Control

Credential mapping

and virtual workspaces

Page 38: Site Access Control

Credential Mapping

- For legacy jobs, need to provision an environment
- Unix account
  - persistent (grid-mapfile)
  - group account (loses traceability/accountability)
  - dynamic assignment
    - pool-accounts
    - pool-accounts with expiration (**)
- Virtual machines

Page 39: Site Access Control

grid-mapfile

- static mapping between grid user and local uid
- cannot support multiple-VO membership
  - or the user needs a new identity for each VO
- combines authZ and mapping

Page 40: Site Access Control

poolaccounts

- minimal modification to the gridmap-file code
  - same file format
  - uses a state directory where DNs are associated with unix accounts (using hard links, as they are atomic across NFS)
  - account mapping to “.pool” -> pool000, pool001, pool002, …
- when the account mapping is requested
  1. look in a state directory if this DN has an account mapping
  2. if so, return this mapping
  3. otherwise, pick the next unused account from the pool and make a hard link between the DN file and the account name file
  4. return that mapping
- periodic clean-up of the gridmapdir is possible
  - but dangerous, as jobs may still be running or files left
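The four lookup steps above can be sketched with hard links in Python. This is an illustration of the idea, not the actual gridmapdir code: the function names are assumptions, every pool account is assumed to exist as an empty file in the state directory, and the DN file name is assumed to be the lower-cased, URL-encoded DN.

```python
import os
from typing import Optional
from urllib.parse import quote


def dn_filename(dn: str) -> str:
    """Encode a DN as a flat file name: lower-cased and URL-encoded."""
    return quote(dn.lower(), safe="").lower()


def lease_account(gridmapdir: str, dn: str, prefix: str) -> Optional[str]:
    """Return the pool account leased to this DN, leasing a free one if needed."""
    dn_path = os.path.join(gridmapdir, dn_filename(dn))
    accounts = sorted(f for f in os.listdir(gridmapdir) if f.startswith(prefix))

    # Steps 1 and 2: an existing mapping shares its inode with one account file.
    if os.path.exists(dn_path):
        inode = os.stat(dn_path).st_ino
        for name in accounts:
            if os.stat(os.path.join(gridmapdir, name)).st_ino == inode:
                return name
        return None

    # Step 3: an unused account file has a link count of exactly 1.
    for name in accounts:
        account_path = os.path.join(gridmapdir, name)
        if os.stat(account_path).st_nlink != 1:
            continue
        try:
            os.link(account_path, dn_path)  # atomic: fails if dn_path exists
        except FileExistsError:
            # Another process mapped this DN in the meantime; look it up.
            return lease_account(gridmapdir, dn, prefix)
        if os.stat(account_path).st_nlink == 2:
            return name  # step 4
        os.unlink(dn_path)  # a different DN took this account; try the next
    return None  # pool exhausted
```

Because the mapping lives entirely in link counts and inode numbers, `ls -li` on the state directory is enough to inspect it.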

Page 41: Site Access Control

Gridmapdir associations

$ ls -li /share/grid-security/gridmapdir/

...
7635756 -rw-r--r-- 2 Jan 8 19:44 %2fo%3ddutchgrid%2fo%3dusers%2fo%3dnikhef%2fcn%3ddavid%20groep
7635756 -rw-r--r-- 2 Jan 8 19:44 dteam060
...

Page 42: Site Access Control

LCMAPS

- There are many ways to collect credentials in a site
  - /etc/passwd, NIS, NIS+, LDAP, Hesiod, MySQL
  - need an extensible, pluggable framework
- Backward compatible with existing systems (grid-mapfile)
- Support for multiple VOs per user (and thus multiple UNIX groups)
- Minimum system administration
  - Poolaccounts
  - Pool “groups”
- Boundary conditions
  - Has to run in privileged mode
  - Has to run in process space of incoming connection (for fork jobs)

Page 43: Site Access Control

LCMAPS – functionality view

- Unix mapping based on VOMS groups, roles
  - Supports pool groups as well as pool accounts
  - Granularity set by the site administrator (see example)
  - Primary group set to first VOMS group
    - for accounting purposes
    - for use by schedulers in setting priority
- Identity set before job gets to batch system
  - accounting based on batch system data needs a specific USER id for each job submitted
  - traceability of system/network activity inside the job or ~wrapper
  - limit impact of a compromised identity (so as not to hurt production)

Page 44: Site Access Control

LCMAPS – control flow

- User authenticates using (VOMS) proxy
- LCMAPS library invoked
  - Acquire all relevant credentials
  - Enforce “external” credentials
  - Enforce credentials on current process tree at the end
- Run job manager
  - Fork will be OK by default
  - Batch systems may need primary group explicitly
  - Batch systems will need updated (distributed) UNIX account info
- Order and function: policy-based

[Figure: the gatekeeper (GK) hands the credentials (CREDs) to LCMAPS for credential acquisition & enforcement, after which the job manager runs.]

Page 45: Site Access Control

LCMAPS Acquisition and Enforcement

Need two phases

1. Acquisition (while in privileged mode)
   - access to root-only-readable files with, e.g., LDAP passwords
   - access to host certificate/key for remote communications
2. Enforcement
   - do the actual setuid(2) and initgroups(3) calls
   - from that point on, it’s irreversible
   - must be within the running process
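The order of the enforcement calls matters: the group changes still need privilege, so they have to happen before the uid is dropped. A minimal Python sketch of that ordering is shown below. The function name `enforce_identity` and the `sys_api` parameter, which only exists so that the order can be exercised without root, are assumptions of this example. Python's `os.initgroups`, `os.setgid` and `os.setuid` wrap the corresponding C calls on Unix.

```python
import os


def enforce_identity(username: str, uid: int, gid: int, sys_api=os) -> None:
    """Switch the current process to the mapped account, groups first, uid last."""
    # Supplementary groups from the user database; needs privilege.
    sys_api.initgroups(username, gid)
    # Primary group; still needs privilege.
    sys_api.setgid(gid)
    # Dropping the uid comes last: for a root process this cannot be undone,
    # and after it the two calls above would no longer be permitted.
    sys_api.setuid(uid)
```

In a real gatekeeper this runs in the process that will fork and exec the job manager, so that everything started afterwards inherits the mapped identity.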

Page 46: Site Access Control

LCMAPS – modules

Modules represent atomic functionality; (A) marks an acquisition module, (E) an enforcement module.

- VOMS: extract VOMS credentials from the proxy (A)
- PoolAccounts: from username assign unique uid (A)
- PoolGroups: from (VOMS) groupname assign unique gid (A)
- LocalAccount: from username assign local existing unique uid (A)
- LocalGroups: from (VOMS) groupname assign local existing gid (A)
- VOMS PoolAccounts: from (VOMS) username assign unique uid (A)
- AFS/Krb5: get token based on user DN info (A)
- POSIX process: setuid() and setegid() (E)
- POSIX LDAP: update distributed user database (E)
- Krb5: run job via k5cert (E)
- …

Page 47: Site Access Control

LCMAPS – policy evaluation

State machine approach (superset of boolean expressions)

Policy description file (/opt/edg/etc/lcmaps/lcmaps.db):

path = /opt/edg/lib/lcmaps/modules

localaccount = "lcmaps_localaccount.mod \
    -gridmapfile /etc/grid-security/grid-mapfile"
poolaccount = "lcmaps_poolaccount.mod -gridmapfile /etc/grid-security/grid-mapfile"
posix_enf = "lcmaps_posix.mod -maxuid 1 -maxpgid 1 -maxsgid 32"
voms = "lcmaps_voms.mod -vomsdir /etc/grid-security/certificates \
    -certdir /etc/grid-security/certificates"

standard:
voms -> poolaccount | localaccount
localaccount -> posix_enf
poolaccount -> ldap
ldap -> posix_enf

[Figure: state diagram of the policy, with the states VOMS-group, LocalAccount, PoolAccount, LDAP and POSIX connected by TRUE/FALSE transitions.]
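The state machine behind such rules can be sketched in Python. The sketch assumes that `a -> b | c` means "run a, continue with b if it succeeds and with c if it fails", that a missing branch ends the evaluation, and that the policy succeeds exactly when the last module run succeeded. The parser, the function names and the stand-in modules are illustrative and not the LCMAPS implementation.

```python
from typing import Callable, Dict, List, Optional, Tuple

Rules = Dict[str, Tuple[Optional[str], Optional[str]]]


def parse_policy(text: str) -> Tuple[str, Rules]:
    """Parse 'state -> on_true | on_false' lines; the first state is the start."""
    rules: Rules = {}
    first = None
    for line in text.strip().splitlines():
        state, _, targets = line.partition("->")
        on_true, _, on_false = targets.partition("|")
        state = state.strip()
        rules[state] = (on_true.strip() or None, on_false.strip() or None)
        first = first or state
    return first, rules


def run_policy(first: str, rules: Rules,
               modules: Dict[str, Callable[[], bool]]) -> Tuple[bool, List[str]]:
    """Walk the state machine; return (success, list of modules that ran)."""
    trace: List[str] = []
    state: Optional[str] = first
    ok = False
    while state is not None:
        ok = modules[state]()
        trace.append(state)
        on_true, on_false = rules.get(state, (None, None))
        state = on_true if ok else on_false
    return ok, trace


POLICY = """
voms -> poolaccount | localaccount
localaccount -> posix_enf
poolaccount -> ldap
ldap -> posix_enf
"""
```

With modules that all succeed, the walk is voms, poolaccount, ldap, posix_enf. If the voms module fails, the walk falls back to localaccount and then goes straight to posix_enf. The sketch does not guard against cyclic policies.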

Page 48: Site Access Control

VOMS to Unix domain mapping

# groupmapfile

"/VO=atlas/GROUP=/atlas/phys-*/ROLE=production" atlphprd
"/VO=atlas/GROUP=/atlas/det-*/ROLE=production" atldtprd
"/VO=atlas/GROUP=/atlas/simul-*/ROLE=production" atlsiprd
"/VO=atlas/GROUP=/atlas/*" atlgnusr

"/VO=EGEE/GROUP=/EGEE/picard/Role=Manager" iteamsgm

"/VO=Wilma/GROUP=/Wilma/bfys/Role=prod" wilmgr
"/VO=Wilma/GROUP=/Wilma/*" .wilma

example groups

- Groups can be used for setting scheduling priorities, accounting &c.
- If the site supports it, a single uniform pool of UIDs can be used for all GIDs
- For the “fork” JM, and for sites with a modifiable userdb, the job will collect all supplementary gids, as per the VOMS proxy

* Syntax as per VOMS attribute naming in 2.6
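How such a map file resolves a VOMS attribute to a unix group can be sketched in Python. The sketch assumes that entries are tried from top to bottom, that the first matching pattern wins, and that `*` matches any run of characters; the function names are illustrative. A result that starts with a dot, such as `.wilma`, stands for a pool of groups rather than a single group.

```python
import fnmatch
import shlex
from typing import List, Optional, Tuple

GROUPMAP = '''
# groupmapfile
"/VO=atlas/GROUP=/atlas/phys-*/ROLE=production" atlphprd
"/VO=atlas/GROUP=/atlas/det-*/ROLE=production" atldtprd
"/VO=atlas/GROUP=/atlas/*" atlgnusr
"/VO=Wilma/GROUP=/Wilma/bfys/Role=prod" wilmgr
"/VO=Wilma/GROUP=/Wilma/*" .wilma
'''


def parse_groupmap(text: str) -> List[Tuple[str, str]]:
    """Return (pattern, group) pairs in file order, skipping comments and blanks."""
    entries = []
    for line in text.splitlines():
        fields = shlex.split(line, comments=True)
        if len(fields) == 2:
            entries.append((fields[0], fields[1]))
    return entries


def map_attribute(attribute: str, entries: List[Tuple[str, str]]) -> Optional[str]:
    """First match wins, so specific patterns must precede the catch-all ones."""
    for pattern, group in entries:
        if fnmatch.fnmatchcase(attribute, pattern):
            return group
    return None
```

The ordering in the file is therefore significant: if the catch-all `/atlas/*` line came first, the production roles would never be reached.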

Page 49: Site Access Control

VOMS to Unix domain mapping

# gridmapfile

"/VO=atlas/GROUP=/atlas/phys-*/ROLE=production" .aphpd
"/VO=atlas/GROUP=/atlas/det-*/ROLE=production" .adtpd
"/VO=atlas/GROUP=/atlas/simul-*/ROLE=production" .asipd
"/VO=atlas/GROUP=/atlas/*" .agnu

"/VO=EGEE/GROUP=/EGEE/picard/Role=Manager" .isgm

"/VO=Wilma/GROUP=/Wilma/bfys/Role=prod" .wgr
#"/VO=Wilma/GROUP=/Wilma/*" CANNOT BE ENFORCED

example gridmap

… or using poolaccounts with a static group association

- But you need a sufficiently large pool for each group
- Only the primary group and its fixed associated supplementary groups will be set

* Syntax as per VOMS attribute naming in 2.6

Page 50: Site Access Control

LCMAPS – caveats

- Unix mapping based on VOMS groups, roles, and capabilities
  - Possibly pool groups as well as pool accounts
  - Granularity set by the site administrator (see example following)
  - Primary group set to first VOMS group – accounting
- More than one VO/group per grid user allowed [but…]
  - Each unique VOMS FQAN listed translates into 1 Unix group id
  - Each user-FQAN combination translates into 1 Unix user id
- New mechanisms could mitigate issues:
  - groups-on-demand, support granularity at any level
  - Central user directory support (nss_LDAP, pam-ldap)
  - Not ready, and priorities have not been assigned to this yet.

Page 51: Site Access Control

Work Space Service

On the road towards virtualized resources: Work Space Service

- Managed accounts
  - enable life cycle management
  - controlled account management (VO can request/release)
  - “special” QoS requests
  - Use to request credentials (groups) with specific priorities?
- Future: provision a virtual machine
- WS-RF style GT4 service
  - uses LCMAPS as a back-end for account leasing

http://www.mcs.anl.gov/workspace/

Page 52: Site Access Control

Workspaces and accounts

- A Unix account is ‘just’ a specific kind of workspace
  - and ought to be provisioned like one

Page 53: Site Access Control

Work Space Service

Work by Kate Keahey, et al., ANL

- provision workspaces before the job starts
  - static account mappings: grid-mapfile
  - with poolaccounts (using LCMAPS)
  - provisioning of workspace using VM technology
- provides abstraction of resources
  - option for a ‘feel at home’ environment for applications
  - if VM technology can deliver enough performance
- parts of this are now part of a GT4 tech preview

Page 54: Site Access Control

LCMAPS usage with the WSS

Page 55: Site Access Control

Virtualisation

- A VM can serialize all of its state (including RAM)
- A VM image is simply a collection of files
  - Disk partitions, RAM, configuration file
- Such an image can easily be moved (migrated) between hypervisors of the same type
- Such an image can also be saved and used for rollbacks

[Figure: layered stack. Hardware at the bottom, then the Virtual Machine Monitor (VMM) / hypervisor, then three VMs with guest OSs (Linux, NetBSD, Windows), each running applications.]

Page 56: Site Access Control

Types of virtualisation

- Depending on the layer you virtualize, you will end up with a different VM
  - API: language VMs (JVM)
  - ISA: system VMs (VMware)
- Different types of system virtual machines
  - Full virtualization (VMware): run multiple unmodified guest OSs
  - Para-virtualization (Xen, UML, Denali): run multiple guest OSs ported to a special architecture
  - Single OS image (Vserver)
- What is the cost of using VMs?
  - Paper by Kate Keahey et al., “From Sandbox to Playground: Dynamic Virtual Environments in the Grid”, Grid 2004

from: Kate Keahey’s PPAM 2005 talk

Page 57: Site Access Control

The Need for Speed

[Figure: bar chart of relative performance (scale 0.0 to 1.1) for four benchmarks: SPEC INT2000 (score), Linux build time (s), OSDB-OLTP (tup/s) and SPEC WEB99 (score). Each benchmark suite runs on Linux (L), Xen (X), VMware Workstation (V), and UML (U).]

Paper: “Xen and the Art of Virtualization”, SOSP 2003

Page 58: Site Access Control

What Makes VMs Great

- Summary of VM properties:
  - Good isolation properties
    - Generally enhanced security, audit forensics
  - Excellent enforcement potential
    - Details depend on implementation
  - Customizable software configuration
    - Library signature, OS, maybe even 64/32-bit architectures
  - Serialization property
    - VM images (including RAM) can be copied
  - The ability to pause and resume computations
    - Allows migration
- How do we make VMs available over the network and manage them so as to leverage this potential?
  - Challenges: security, enforcement, protocols

from: Kate Keahey’s PPAM 2005 talk

Page 59: Site Access Control

What are Virtual Workspaces?

- Virtual Workspaces:
  - environments that can be made available dynamically in the Grid
  - well-defined properties in terms of environment definition and resource usage enforcement
- Examples:
  - A physical cluster booted to a desired configuration (e.g. Cluster on Demand)
  - A Grid3 node dynamically configured using Pacman
  - A cluster partition configured with a hypervisor
  - A VM representing an OSG configuration, enforcing memory and CPU usage
- Workspaces can be implemented using a variety of technologies
  - VMs are the most promising

from: Kate Keahey’s PPAM 2005 talk

Page 60: Site Access Control

Virtual Workspace

- Environment aspect (workspace meta-data)
  - Information/state that outlives its deployment
    - Generic information (name, time to live)
    - Attested software partition information: OS, “OSG configuration”, “application installation”, etc.
    - Services: ssh, GRAM, pre-configured job
- Resource allocation request (deployment time)
  - Flexibly negotiated within desired constraints
    - See GGF WS-Agreement standard
  - Memory, disk, networking, etc.
    - See GGF JSDL standard
  - On deployment the actual resource allocation information becomes available for inspection
- Atomic workspaces and virtual clusters
  - Clusters are simply aggregate workspaces

from: Kate Keahey’s PPAM 2005 talk

Page 61: Site Access Control

Deploying Workspaces in the Grid

[Figure: deploying workspaces in the Grid, in three stages. Define the workspace environment: a client requests a workspace from the Workspace Wizard (VW Factory) and receives workspace meta-data. Manage the workspace: the Workspace Management Service (VW Repository) manages the workspace environment and its meta-data. Negotiate the workspace deployment characteristics: the Workspace Service (VW Manager) negotiates, manages, monitors, renegotiates and terminates the deployment, and activities are managed within the deployed workspace.]

from: Kate Keahey’s PPAM 2005 talk

Page 62: Site Access Control

Current Implementation

- Current prototype using Globus Toolkit 4
  - Leveraging standard Grid Service features
- Workspace Wizard
  - Returns workspace meta-data
  - Very rudimentary implementation
- Workspace Service
  - Create: takes workspace meta-data and a deployment descriptor
  - Manage:
    - renegotiate resource allocation
    - also traditional Grid Service management: TTL, etc.
  - Destroy
    - Different options: pause, shutdown or destroy
- First tech preview release expected later this month

from: Kate Keahey’s PPAM 2005 talk

Page 63: Site Access Control

How dynamic is the deployment?

- Automatic
- Protocol-based
- Moving towards better articulation of migration
- Renegotiation of resource allocation
- How fast is this deployment?
  - Deployment of workspace for EMBOSS suite:
    - Manual: ~45 minutes
    - Based on pre-configured VMware VMs: ~6 minutes
    - Based on pre-configured Xen VM: < 1 second
- How much overhead does workspace deployment add over what we have today?

from: Kate Keahey’s PPAM 2005 talk

Page 64: Site Access Control

How much deployment overhead are we adding?

Using a paused VM allows us to “save” on initiation time

[Figure: stacked bar chart of time in seconds (axis 0 to 12) per job start-up scenario, split into VM setup, VM boot, job setup and GRAM job.]

a) GRAM job execution
b) GRAM job execution in a paused Xen VM
c) job execution in a booted Xen VM (pre-configured job)

from: Kate Keahey’s PPAM 2005 talk

Page 65: Site Access Control

Virtual Grids?

- In principle, VM technology opens the possibility for virtual grids, built as overlay networks
  - for full virtualisation, a VPN between the VMs is needed as well
- but policy and deployment issues remain
  - regulatory and system sanity requirements for hosting providers to know who is running where

Page 66: Site Access Control

Virtual Playgrounds

[Figure: an application running on top of a virtual grid.]

from: Kate Keahey’s PPAM 2005 talk

Page 67: Site Access Control

Access Control to Services

Fine grained control within the integrated framework

Page 68: Site Access Control

AuthZ in the container

- the authorisation decision is taken in the stub (hosted by the container)
- the service provider’s business logic does not see the authZ
  - today, it does not even get the list of attributes back for perusal
- a challenge for services that themselves are concerned with ACLs …

Page 69: Site Access Control

AuthZ for fine-grained services, pre-WS

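[Figure: the client request reaches the LRC, which makes an authz callout to the policy engine, backed by a policy database. For example, the query (user7, read, PIDA) yields permit.]

LRC entries and their policy identifiers:
  LFN1 -> PIDA
  LFN2 -> PIDB

Policy database:
  PIDA: group1: read; group2: all; group3: none; user7: read
  PIDB: group1: read, write; group2: all; group3: all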

From: Ann Chervenak Authorization for Globus Replica Services, EGEE Design Team Meeting, ANL 2005

Page 70: Site Access Control

Policy Identifiers and Policy Engines

Associate entries in the RLS with one or more policy identifiers E.g., policy identifier could represent an ACL

When a client attempts to create, update or access an RLS entry, the authorization framework invokes a policy engine

Policy engine understands how to enforce the policy associated with each policy ID

Assumption is that there will be a relatively small number of unique policies

Typically, data organized as datasets or collections, with permissions on all objects the same

From: Ann Chervenak Authorization for Globus Replica Services, EGEE Design Team Meeting, ANL 2005

Page 71: Site Access Control

AuthZ in a WS world with AuthZ stubs

[Figure: the LRC (LFN1 -> PIDA, LFN2 -> PIDB), the policy engine and the policy database (PIDA: group1: read; group2: all; group3: none; user7: read. PIDB: group1: read, write; group2: all; group3: all), now fronted by the GT4 Authorization Framework and a custom PDP. Message flow: (1) client request; (2) custom auth callout (includes client request); (3) request PIDs for logical names; (4) PIDs; (5) pass policy ID, subject, object, action; (6) query policies for PIDs; (7) permit or deny; (8) permit or deny; (9) if permitted, pass client request to LRC.]

From: Ann Chervenak Authorization for Globus Replica Services, EGEE Design Team Meeting, ANL 2005

Page 72: Site Access Control

WS-RF Version

- The client request (1) is handled at the container level, where a custom authorization callout is performed.
- The authorization callout passes the entire client request to a custom Policy Decision Point (PDP) (2).
- Then the PDP queries the LRC (3) to obtain the policy identifiers associated with the request.
- The LRC returns a list of unique policy identifiers to the PDP (4).
- The PDP passes the client information, requested operations, policy IDs and objects of the request to the policy engine (5).
- The policy engine queries the associated policy database (6) to obtain the policies associated with the policy identifiers.
- The policy engine makes decisions to permit or deny the request and returns these decisions to the PDP (7).
- The permit/deny decisions are passed to the custom authorization framework (8).
- For permitted operations, the authorization framework passes the client request to the LRC for execution (9).
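The decision part of this flow, steps (3) to (7), can be sketched in Python using the policy data from the earlier diagram. The function names, the rule that any matching principal with the right permits the action, and the rule that an unknown logical name is denied are assumptions of this sketch, not the Globus replica service code.

```python
from typing import Dict, Iterable, Set

# Data from the diagram: logical file names and their policy identifiers.
LFN_TO_PID: Dict[str, str] = {"LFN1": "PIDA", "LFN2": "PIDB"}

# Policy database: rights per principal (group or user) for each policy ID.
POLICIES: Dict[str, Dict[str, Set[str]]] = {
    "PIDA": {"group1": {"read"}, "group2": {"all"}, "group3": set(), "user7": {"read"}},
    "PIDB": {"group1": {"read", "write"}, "group2": {"all"}, "group3": {"all"}},
}


def policy_engine(pid: str, principals: Iterable[str], action: str) -> bool:
    """Steps (5) to (7): permit if any principal holds the action or 'all'."""
    policy = POLICIES[pid]
    for principal in principals:
        rights = policy.get(principal, set())
        if "all" in rights or action in rights:
            return True
    return False


def custom_pdp(principals: Iterable[str], action: str, lfns: Iterable[str]) -> bool:
    """Steps (3) to (7): look up the unique PIDs, then ask the policy engine."""
    principals = list(principals)
    pids = {LFN_TO_PID.get(lfn) for lfn in lfns}   # (3), (4): unique policy IDs
    if None in pids:
        return False                                # unknown logical name
    return all(policy_engine(pid, principals, action) for pid in pids)
```

Because many entries share one policy identifier, the set of PIDs is usually much smaller than the set of logical names in a request, which is what keeps the number of policy evaluations low.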

From: Ann Chervenak Authorization for Globus Replica Services, EGEE Design Team Meeting, ANL 2005

Page 73: Site Access Control

Network Access

Provisioning connectivity

Page 74: Site Access Control

Firewalls and NAT

- Traditional site network protection has been centred around the concept of firewalls
- some sites also adopt NAT
  - wrongly advertised as being a security solution
  - or because of a supposed ‘lack of address space’
  - in fact, it is still used as it is the path of least resistance …
- firewalls and NAT are the most common obstacles to grid computing
  - many protocols are not firewall friendly (like GridFTP);
  - or require inbound connectivity to the farm (service containers)

Page 75: Site Access Control

Firewall connectivity ‘white’ solutions

- explicit application-level solutions
  - outbound connectivity only via external proxy boxes
  - move to transport over port 80 (http) and hope for dumb firewalls that are not packet-inspecting
  - examples: Jabber, R-GMA MON boxes, proprietary solutions
- explicit solutions
  - connectivity provisioning & grid-aware firewalls/routers
  - similar to provisioning network links in a point-to-point setup, like lambda provisioning
  - examples: EGEE DCS, GGF Firewall Issues RG work, …

Page 76: Site Access Control

Firewall avoidance, the ‘black’ methods

- implicit brokering
  - trap program syscalls and route the traffic at the user level
  - dynamically select the systems that allow incoming traffic
  - or interpose a connection broker
  - examples: Condor Generic Connection Broker, Sonny’s work, …
- tcp splicing
  - with two co-operating applications, transmit TCP packet sequence numbers out-of-band, synch them up, and try to fool the firewall
  - may take a long time, but it does not break the protocol
  - it is a deliberate abuse of the protocol that admins may detect and consider a crack
  - example: IBIS Java toolkit

Page 77: Site Access Control

Example: Dynamic Connectivity Service

see DJRA3.2 chapter 6

Page 78: Site Access Control

Next: once you’ve got access …

Compute and brokering services