AWS mass adoption in Higher Education and research.
Pervasive cloud, data science, machine learning, big data and HPC education.
RosettaHUB for AWS
80 higher education institutions including 4 among the top 10 universities in the World
20,000 students, educators and researchers
16 Countries including the UK, Ireland, France and Germany
100% Automation of onboarding, resources and consumption monitoring and users management
Fully automated digital university model
RosettaHUB for Amazon Web Services
RosettaHUB Overview
Governance, Federation and Management for AWS
E-learning and E-research platform
EduOps and ResOps platform
Account Management
• Automated users enrollment and processing
• AWS Accounts full life cycle management
• Full integration with Liferay's accounts, organizations and roles
• Full mapping of organizational hierarchy and responsibilities
• Seamless accounts limits management and traceability
• Integrated ticketing system
Budget Control and optimization
• Costs and resources real-time monitoring and control
• Management of users' budgets and AWS permissions
• Safeguards and cost optimization
• Seamless Spot market management
• AWS grants full life cycle management
Compliance enforcement and cloud access sandboxing
• Automated AWS accounts limits management
• Detailed auditing and reporting
• RosettaHUB management Web Services
Universal Workbench and notebooks at scale
• RosettaHUB collaborative Workbench
• Jupyter servers
• Unlimited RStudio servers, Shiny Apps servers
• Zeppelin servers, DaaS
Ubiquitous sharing and real-time collaboration
• Easy sharing of all RosettaHUB artifacts with users, groups or organizations
• Access to the RosettaHUB publishing and sharing platform for e-Learning and e-research
Big Data, HPC and Deep Learning made easy
• RosettaHUB-managed Elastic MapReduce clusters
• RosettaHUB nvidia-docker-based virtual environments for deep learning
• RosettaHUB-managed CfnCluster and Alces Flight HPC clusters
• RosettaHUB spreadsheets for technical computing
End-to-end reproducible e-learning and e-research
• Management platform for Docker containers
Convergent IaC, containers and technical computing APIs
• Software Development Kits: Python, Java, R, C# and JavaScript SDKs
• Office integration: Word and Excel Add-ins
• RosettaHUB meta-cloud and technical computing APIs
• Programmable hybrid-kernel (R/Python/Java/Scala)
• Reactive programming framework
Full auditability
• Detailed auditing and reporting
• End-to-end traceability
Integration capabilities
• Dedicated RosettaHUB portal
• Dedicated publishing and sharing platform
• SAML/OAuth 2.0/LDAP/Active Directory integration
• SSL certificates management platform
• Programmable emailing platform
• Advanced scheduling
RosettaHUB, state of the art governance and management platform for AWS
RosettaHUB provides every student and every educator with an account on a social collaboration portal. Each portal account is linked to a private AWS account created, managed and monitored by RosettaHUB.
The portal makes advanced AWS capabilities easy to understand and operate by students and educators. It also makes all cloud artifacts easy to share.
RosettaHUB fully automates the onboarding processes and gives institutions flexibility on budget allocation.
The building blocks of AWS democratization
The institution’s Central Point Of Contact (CPOC) and educators can monitor in real time (*) the students’ interaction with AWS and the portal.
The CPOC can manage students: adjust their budgets, their rights on AWS, their resources allowances, etc.
The CPOC can create sub-organizations and assign roles to colleagues for a multi-tenant management of students.
System administrators can generate reports on user activities and cloud usage. They can measure and assess how effectively cloud resources are used.
Repositories of pedagogic cloud artifacts can be prepared and shared with students.
(*) ..
End-to-end monitoring, management and audit
The RosettaHUB student and educator dashboards display an access button to the AWS console as well as access keys for programmatic access to AWS. They provide detailed, aggregated real-time information about the resources being used on AWS, the remaining budget and the estimated overall hourly cost.
Students and educators can request:
1. Limit increase to access higher capacity machine instance types (e.g. p2.*, p3.*, g3.* GPU instances).
2. Access to optional AWS services
3. Budget increase and budget transfer to other users
4. Support
44 AWS Services are accessible by default. Access is available to IAM in a proxied manner to preserve the accounts sandboxing. IAM users and IAM roles can be easily and safely created and managed from the dashboard.(*)
Limits and budget requests are automatically processed by the RosettaHUB pipelines within a predefined scope. RosettaHUB creates and tracks tickets with AWS support.
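Automatic processing "within a predefined scope" could look like the minimal sketch below. The request fields, the pattern list, the budget cap and the function name are all hypothetical and are not taken from the RosettaHUB API; the sketch only illustrates the idea of routing in-scope requests automatically and everything else to a person.

```python
from fnmatch import fnmatch

# Hypothetical scope: instance-type patterns that may be granted without
# manual review, plus a cap on automatic budget increases.
AUTO_APPROVE_INSTANCE_PATTERNS = ["p2.*", "g3.*"]
MAX_AUTO_BUDGET_INCREASE = 50.0  # illustrative value, in credit units


def is_auto_approvable(request: dict) -> bool:
    """Return True if the request falls inside the predefined scope."""
    if request["kind"] == "limit_increase":
        return any(fnmatch(request["instance_type"], pattern)
                   for pattern in AUTO_APPROVE_INSTANCE_PATTERNS)
    if request["kind"] == "budget_increase":
        return 0 < request["amount"] <= MAX_AUTO_BUDGET_INCREASE
    # Optional services and support requests are left to manual handling.
    return False
```

Requests that fall outside the scope would be left for the CPOC or RosettaHUB staff to review.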
Students and educators dashboards
Cost optimization and safeguards
Accounts are automatically disabled and all on-demand EC2 instances are stopped if the user goes above 100% of his/her budget or if the estimated hourly price exceeds the maximum hourly price. Spot instances are snapshotted then terminated. No data is deleted when a user is disabled.
Auto-stop on idle EC2 instances: the user can set the maximum idle time or disable this feature. By default it is set to 6 hours.
Notification emails at 50%, 70%, 90% and 100% of budget consumption.
Use of Spot instances is promoted in the RosettaHUB launch panels: Spot instances are the first choice when launching instances or clusters.
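The safeguard rules above can be expressed as a few small checks. This is a minimal sketch: the function names and signatures are hypothetical, and only the thresholds (50%, 70%, 90%, 100%) and the 6-hour default come from the text.

```python
NOTIFY_THRESHOLDS = (50, 70, 90, 100)  # percent of budget consumed
DEFAULT_MAX_IDLE_HOURS = 6             # default auto-stop delay


def crossed_thresholds(previous_pct: float, current_pct: float) -> list:
    """Thresholds passed since the last check; one email is due for each."""
    return [t for t in NOTIFY_THRESHOLDS if previous_pct < t <= current_pct]


def must_disable(spent: float, budget: float,
                 hourly_cost: float, max_hourly_cost: float) -> bool:
    """Disable when the budget is exceeded or the hourly cost is too high."""
    return spent > budget or hourly_cost > max_hourly_cost


def must_auto_stop(idle_hours: float,
                   max_idle_hours=DEFAULT_MAX_IDLE_HOURS) -> bool:
    """Auto-stop an idle instance; max_idle_hours=None disables the feature."""
    return max_idle_hours is not None and idle_hours >= max_idle_hours
```

For example, a user moving from 45% to 72% of budget between two checks would be due the 50% and 70% notifications.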
Users monitoring panel in the CPOC’s management console
Institutions, educators and students take no financial risks as all AWS accounts are guaranteed by RosettaHUB.
RosettaHUB acts as a procurement adapter: It allows Higher Education institutions and research laboratories to top-up their RosettaHUB institutional account with cloud credits in compliance with their regulatory frameworks and administrative constraints.
A dedicated RosettaHUB infrastructure can be fully integrated with the institution’s Information system.
Full technical and compliance integration
RosettaHUB, Next generation e-research and e-learning platform
The RosettaHUB platform closes the technology gap between clouds, containers, data science software, real-time collaboration frameworks, social portals and people.
The RosettaHUB data science platform makes it easy for educators to compose containers-based virtual e-learning environments and for researchers to compose virtual e-science environments.
Jupyter, RStudio, Spark, Zeppelin, Shiny Apps, virtual desktops, HPC clusters, etc. can be added to the virtual environments and made accessible in a secure and highly scalable manner to thousands of students or collaborating researchers.
Democratic and pervasive data science
Defining the meta-cloud: RosettaHUB Web Services & managed images
Public Cloud
Private Cloud
RosettaHUB delivers:
• A docker-based meta-cloud
• A universal data science workbench
• A meta-kernel for data science
• A man-cloud and man-data interaction design
• A sharing model for cloud artifacts
• A SOAP/RESTful API with ~1000 functions
• SDKs and add-ins
• A cloud and data products marketplace
RosettaHUB fosters usability, reproducibility, shareability and auditability at all layers of interaction between students, educators and researchers and their software tools, infrastructures and peers.
Data scientist
The RosettaHUB dashboard displays the cloud and data science related artifacts as customizable icons structured in categories.
RosettaHUB meta-formations: they enable one-click provisioning and access to fully-managed complex infrastructures for e-learning and e-Research.
RosettaHUB meta-keys: they map AWS access keys and a default VPC, they allow rapid access to AWS services and they can be shared.
RosettaHUB meta-images:
• Managed: they come with agents to orchestrate all service components and expose a composable virtual workbench to the end user
• Semi-managed: they map any EC2 AMI
RosettaHUB meta-storages: they map S3 buckets, EFS or EBS volumes. They can be used as the working or reference volumes for managed instances and clusters.
One-click access to AWS-powered data science
Seamless creation of Hadoop and Spark clusters based on AWS EMR, the RosettaHUB smart proxies and the RosettaHUB workbench.
Support for both on-demand and spot.
Seamless access to clusters with shells and notebooks including RosettaHUB notebooks, Zeppelin, Jupyter, Spark-Notebook, etc.
Real-time collaborative access, cluster sharing, security and access control for Hadoop and Spark.
Seamless data management, seamless mounting of S3 and EFS volumes on master and slave nodes.
Very rapid prototyping of big data applications using the RosettaHUB reactive programming frameworks, web application designers and spreadsheet engines.
User-friendly Spark and Hadoop clusters for research and education
Launching an EMR cluster can be done in one click by choosing an available formation or by creating a custom formation with custom settings
Access the cluster’s master in the browser from the RosettaHUB collaborative workbench
Seamless creation of NVIDIA-docker based virtual environments for deep learning on GPU.
Seamless creation and access to HPC clusters based on Alces Flight or cfnCluster, the RosettaHUB smart proxies and the RosettaHUB workbench.
Real-time eagle-view on resources, billing and hourly cost for HPC clusters.
Seamless data management, seamless mounting of S3 and EFS volumes on master and slave nodes.
Extended support for spot and autoscaling.
Out-of-the-box cluster security and access control.
Notebooks, cluster sharing and real-time collaboration for Alces Flight and cfnCluster.
Seamless scheduling using cron and rate tasks.
Interactive Scientific Web UIs and reactive programming frameworks for HPC clusters.
User-friendly managed HPC for research and education
Launching an HPC cluster can be done in one click by choosing an available formation or by creating a formation with custom settings
RosettaHUB ResOps/EduOps: Virtual-labs-as-code
RosettaHUB meta-Formations
[Diagram: RosettaHUB meta-Formation types: Machine, Spot Machine, Machine Pool, Spot Machine Pool, EMR Cluster, Spot EMR Cluster, HPC Cluster, Spot HPC Cluster]
[Diagram: Spot Machine formation, e.g. deep learning assignments: Instance type: p2.xlarge; Machine Image: Tensorflow GPU; Maximum Bid Price; SSL certificate; Cloud Keys: AWS Keys; Reference and Working Volumes]
[Diagram: EMR Cluster formation, e.g. big data workshop: Master Instance type: m4.large; Slave Instance type: m4.large; Proxy Image: Standard CPU Image; Proxy Instance Type; SSL certificate; Cloud Keys: AWS Keys; Reference and Working Volumes]
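To make the "virtual-labs-as-code" idea concrete, the Spot Machine example above can be written down as a plain data structure. This is only an illustrative sketch: the class, its field names, the bid price and the volume names are hypothetical and do not reflect the actual RosettaHUB SDK; only the instance type, image name and key name come from the diagram.

```python
from dataclasses import dataclass, field, asdict
from typing import List, Optional


@dataclass
class SpotMachineFormation:
    """Hypothetical description of a Spot Machine formation."""
    name: str
    instance_type: str
    machine_image: str
    cloud_keys: str
    max_bid_price: Optional[float] = None   # None: no explicit bid cap set
    ssl_certificate: Optional[str] = None
    reference_volumes: List[str] = field(default_factory=list)
    working_volumes: List[str] = field(default_factory=list)


deep_learning_lab = SpotMachineFormation(
    name="deep-learning-assignment",
    instance_type="p2.xlarge",
    machine_image="Tensorflow GPU",
    cloud_keys="AWS Keys",
    max_bid_price=0.50,                      # illustrative value only
    reference_volumes=["course-datasets"],   # hypothetical volume names
    working_volumes=["student-default-efs"],
)

# A formation expressed as data can be versioned, shared and replayed.
print(asdict(deep_learning_lab))
```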
RosettaHUB creates for each student and educator a default S3 storage and a default EFS storage which map an S3 bucket and an EFS volume
Formations are configured with working volumes and reference volumes which can be mappings of EFS, EBS, S3 or FTP. These are automatically mounted on the EC2 instances including nodes of HPC and EMR clusters
Any public formation that the user launches automatically uses the default user’s EFS as its working volume: Data generated by students and educators is persistent and survives the termination of machine instances
The reference volume can be synced at start-up to the working volume
Students and educators persistent workspaces
EFS, EBS and S3 Volumes can be automatically mounted on the docker container of the RosettaHUB managed instances
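Syncing a reference volume into a working volume at start-up could be sketched as below, treating both volumes as mounted directories. The function name is hypothetical, and the choice to keep files that already exist in the working volume is an assumption made here so that earlier student work is not overwritten; the source does not say how conflicts are handled.

```python
import shutil
from pathlib import Path


def sync_reference_to_working(reference: Path, working: Path) -> None:
    """Copy reference content into the working volume at start-up.

    Files already present in the working volume are left untouched
    (assumption), so work saved in an earlier session is preserved.
    """
    for src in reference.rglob("*"):
        dest = working / src.relative_to(reference)
        if src.is_dir():
            dest.mkdir(parents=True, exist_ok=True)
        elif not dest.exists():
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dest)
```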
The RosettaHUB meta-formations and Images can be used to create RosettaHUB Sessions. Sessions provide access to the universal workbench and they can be shared with a user or a group of users. Users have the same view on the workbench and can collaboratively create and adjust widgets, interact with tools and data.
Composable widgets include:
• Real-time collaborative consoles, notebooks and code editors on the most commonly used tools for data analysis: R, Python, Scala, RStudio etc.
• Applications access (Jupyter, Zeppelin, etc.)
• Real-time collaborative RStudio
• Real-time collaborative remote desktop access in the browser.
• Data visualization and interaction components such as charts, sliders, buttons.
Universal collaborative workbench
The universal workbench allows the remote interactive control of RosettaHUB meta-kernels created and managed by the RosettaHUB docker agents.
The RosettaHUB meta-kernels are processes merging the virtual machines of Java, R and Python. Meta-kernels allow intercommunication and in-memory transfer of variables from one language to the other
Meta-kernels data access is fully managed by RosettaHUB.
Meta-kernels can be shared as well as their working volumes and reference volumes.
Meta compute kernels & seamless data management
Semi-managed images allow users to easily launch a machine from the RosettaHUB web console using their RosettaHUB keys
Launching semi-managed images can be done in one click from the RosettaHUB dashboard
Access to the instances is managed by RosettaHUB, i.e. RosettaHUB generates and saves the private keys associated with the instance as well as the password for Windows instances.
Users can retrieve their private keys and passwords at any time.
Instructions on how to connect to Linux and Windows instances are provided to the user
Semi-managed images
RosettaHUB mass onboarding process
The RosettaHUB automated mass onboarding process for AWS: Oxford University
Students/Educators register individually at https://ox.rosettahub.com
Students/Educators verify their email addresses by clicking on a link on the verification email sent by RosettaHUB
Users with emails ending with the institution’s domain get approved automatically and receive an email with credentials after a few minutes
Users who register with emails not linked to the institution get approved manually by the CPOC
CPOC uploads in Excel format lists of students and educators (first name, last name, email, graduation, bio link etc.)
CPOC selects the valid student and educators registrations and clicks process from the RosettaHUB users panel
After a few minutes users receive their credentials for RosettaHUB
Institution’s CPOC registers at: https://www.rosettahub.com/institutions
Set default limits for institution: budgets, budget limits, EC2 instance perimeters, regions, services etc.
Create CPOC’s RosettaHUB account
Allocate domain name to institution, create dedicated registration website
Create CPOC email linked to institution’s domain ending with @subdomain.rosettahub.com
Create AWS master account and assign it to the CPOC
Enable detailed billing, cost explorer, create Organization
Create support ticket to increase AWS Organizations limit
Configure CPOC’s AWS account for resources/billing monitoring
Configure CPOC’s RosettaHUB account with default keys, S3, EFS
Affiliation of students & educators using Excel files
Initial setup for a new institution
Students/Educators register individually at https://subdomain.rosettahub.com
Students verify their email addresses by clicking on a link on the verification email sent by RosettaHUB
Users with emails ending with the institution’s domain get approved automatically and receive an email with credentials after a few minutes
Users who register with emails not linked to the institution get approved manually by the CPOC
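The approval rule in the registration flow above (automatic for institutional emails, manual otherwise) can be sketched as follows. The function name and return values are hypothetical, and `example.edu` in the usage note is a placeholder domain.

```python
def approval_route(email: str, institution_domain: str) -> str:
    """Return "automatic" or "manual" for a verified registration email."""
    domain = email.rsplit("@", 1)[-1].lower()
    institution_domain = institution_domain.lower()
    # Accept the domain itself and its sub-domains, nothing else.
    if domain == institution_domain or domain.endswith("." + institution_domain):
        return "automatic"
    return "manual"
```

With this check, `alice@cs.example.edu` is routed automatically for the institution domain `example.edu`. Comparing on a dot boundary rather than with a bare suffix match avoids approving look-alike domains such as `notexample.edu`.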
Affiliation via individual registrations
The RosettaHUB automated mass onboarding process for AWS
Add user to the RosettaHUB portal
Create email account on RH email server ending with @subdomain.rosettahub.com
Create AWS Sub-Account linked to RH email using AWS Organizations
Create IAM user with rights based on the institution’s settings (instance types, regions, etc.)
Create roles for EMR and Elastic Beanstalk, and service roles for all allowed services
Add monitoring to each user’s account: Lambda function, CloudTrail
Create RH VPC where all managed RH EC2 instances will be running
Create secondary IAM user for RH keys enabling spot instances access
Create user’s default S3 bucket as well as the RH S3 storage artifact that maps the bucket
Create EFS storage to be used as a default working volume for RH managed instances
Send welcome email with user’s credentials for RosettaHUB
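As an illustration of one step in this pipeline, the sub-account creation could be sketched as below. The function is hypothetical and is not RosettaHUB's implementation. It assumes `org_client` behaves like a boto3 Organizations client, whose `create_account` call is asynchronous and is followed up with `describe_create_account_status`; the client is passed in so the sketch has no dependency of its own.

```python
import time


def create_sub_account(org_client, email: str, name: str,
                       poll_seconds: float = 5.0, max_polls: int = 60) -> str:
    """Create a member account and return its account id.

    `org_client` is assumed to behave like boto3.client("organizations").
    """
    status = org_client.create_account(
        Email=email, AccountName=name)["CreateAccountStatus"]
    for _ in range(max_polls):
        if status["State"] == "SUCCEEDED":
            return status["AccountId"]
        if status["State"] == "FAILED":
            reason = status.get("FailureReason")
            raise RuntimeError(f"account creation failed: {reason}")
        # Still in progress: wait, then ask for the request status again.
        time.sleep(poll_seconds)
        status = org_client.describe_create_account_status(
            CreateAccountRequestId=status["Id"])["CreateAccountStatus"]
    raise TimeoutError("account creation still in progress")
```

The later steps (IAM user, roles, monitoring, VPC, storage) would run against the new account once its id is known.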
Fully automated process for registering students and educators to AWS and RosettaHUB
RosettaHUB governance and management platform, modus operandi
RosettaHUB uses AWS building blocks to harness the AWS platform and make it work seamlessly for research and education.
It leverages:
Organizations to streamline the affiliation of students and faculty members.
IAM to restrict the students and educators’ perimeters of action.
CloudWatch, SNS and Lambda to monitor and control resources and budget consumption in real-time.
STS to federate users access to the AWS console.
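As an example of restricting a user's perimeter with IAM, the sketch below builds a policy document that denies EC2 use outside a set of instance types and regions. It is an assumption about how such a restriction could be written, not RosettaHUB's actual policy. The function name and statement ids are hypothetical, while `ec2:InstanceType` and `aws:RequestedRegion` are standard IAM condition keys.

```python
import json


def perimeter_policy(allowed_instance_types, allowed_regions) -> dict:
    """Build an IAM policy document denying EC2 use outside the perimeter."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyOtherInstanceTypes",
                "Effect": "Deny",
                "Action": "ec2:RunInstances",
                "Resource": "arn:aws:ec2:*:*:instance/*",
                "Condition": {
                    "StringNotLike": {
                        "ec2:InstanceType": list(allowed_instance_types)
                    }
                },
            },
            {
                "Sid": "DenyOtherRegions",
                "Effect": "Deny",
                "Action": "ec2:*",
                "Resource": "*",
                "Condition": {
                    "StringNotEquals": {
                        "aws:RequestedRegion": list(allowed_regions)
                    }
                },
            },
        ],
    }


print(json.dumps(perimeter_policy(["t2.*", "m4.large"], ["eu-west-1"]), indent=2))
```

Explicit denies are used because they take precedence over any allow the user may also hold.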
The AWS building blocks
Amazon CloudWatch
Amazon SNS
AWS Lambda
AWS CloudTrail
Amazon S3
AWS Organizations
IAM
AWS STS
A Lambda function is inserted in each AWS account for real-time monitoring.
The Lambda function on the master account is triggered a few times per day when a new billing report is made available by AWS. This triggers the computation, on RosettaHUB, of the usage of all sub-accounts. Sub-accounts which over-consumed are then disabled.
The Lambda functions on sub-accounts are triggered whenever EC2, RDS or EBS resources are created or updated.
They send information about compute and storage resources to the platform, which estimates consumption in real time and disables sub-accounts which exceed their hourly cost limits.
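A sub-account monitoring function of this kind could be sketched as below. It assumes the function receives events in the "AWS API Call via CloudTrail" format delivered by CloudWatch Events, which is one common way to react to resource changes; the source does not state which trigger RosettaHUB uses. The summary fields are hypothetical and the step that sends data to the platform is left as a placeholder.

```python
# API event sources the sketch forwards. EBS volume calls are issued
# through the EC2 API, so they arrive with the EC2 event source.
WATCHED_SOURCES = {"ec2.amazonaws.com", "rds.amazonaws.com"}


def summarize_event(event: dict):
    """Extract what a monitoring platform would need from one API-call event.

    Returns None for events from services that are not watched.
    """
    detail = event.get("detail", {})
    if detail.get("eventSource") not in WATCHED_SOURCES:
        return None
    return {
        "account": event.get("account"),
        "region": event.get("region"),
        "service": detail["eventSource"].split(".")[0],
        "action": detail.get("eventName"),
        "time": event.get("time"),
    }


def lambda_handler(event, context):
    summary = summarize_event(event)
    if summary is not None:
        # Placeholder: a real function would send `summary` to the platform.
        print(summary)
    return summary
```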
Monitoring and audit at scale
Institution master AWS Account
Students and Educators AWS Accounts
Monitors resources in real time: EC2, EMR, ECS, RDS, EBS, S3, EFS ...
Monitors costs on each sub-account
Users can authenticate through institutional SAML or Active Directory infrastructures.
Registrations’ lifecycle management actions can be triggered programmatically by the Institutional students management system.
Notification emails can be customized for the institution and custom Email servers can be used.
Cloud resources lifecycle management and sharing actions can be scheduled with cron and rate tasks.
A dedicated marketplace can be used as an institutional sharing platform for pedagogic and research artifacts (files and data, virtual labs, machine and container images, etc.)
Dedicated RosettaHUB
Contacts: [email protected]
RosettaHUB Website: https://www.rosettahub.com
To register a new institution: https://www.rosettahub.com/institutions