Advances with respect to the State of the Art
atmosphere-eubrazil.eu
ATMOSPHERE - Adaptive, Trustworthy, Manageable, Orchestrated, Secure, Privacy-assuring, Hybrid Ecosystem for Resilient Cloud Computing (2017-2019) is a Research and Innovation Action funded by the European Commission under the Horizon 2020 Programme, Call identifier: H2020-EUB-2017, grant agreement No. 777154, topic: EUB-1-2017 Cloud Computing, including security aspects, and by the Secretary of Politics of Informatics (SEPIN) of the Brazilian Ministry of Science and Technology (MCTI) under the corresponding matching Brazilian Call for proposals: 4ª Chamada Coordenada de Programa de Cooperação Brasil-União Europeia em Tecnologias da Informação e Comunicação - TIC.
This document contains information on core activities, findings, and outcomes of the ATMOSPHERE project. Any references to content in both website content and documents should clearly indicate the authors, source, organisation and date of publication.
The document has been produced with the co-funding of the European Commission and the Secretary of Politics of Informatics of Brazil. The content of this publication is the sole responsibility of the ATMOSPHERE consortium and cannot be considered to reflect the views of the European Commission nor the Secretary of Politics of Informatics of Brazil.
Table of Contents
ATMOSPHERE Consortium
Glossary
Foreword by Ignacio Blanquer & Francisco Brasileiro
Summary of ATMOSPHERE
ATMOSPHERE Targets
Progress with respect to SOTA by components
  Progress in TMA
  Progress in TDPS
  Progress in TDMS
  Progress in IMS
Progress with respect to SOTA overall
ATMOSPHERE Consortium
ATMOSPHERE is led by Ignacio Blanquer (Full Professor at Universitat Politècnica de València - Spain) and Francisco
Brasileiro (Full Professor at Universidade Federal de Campina Grande - Brazil). ATMOSPHERE brings together 15
institutions from Europe and Brazil to collaborate on designing and implementing a framework and platform relying on
lightweight virtualization, hybrid resources and Europe and Brazil federated infrastructures to develop, build, deploy,
measure and evolve trustworthy, cloud-enabled applications.
Glossary
ATMOSPHERE: Adaptive, Trustworthy, Manageable, Orchestrated, Secure, Privacy-assuring, Hybrid Ecosystem for REsilient Cloud Computing
CTIC: Centro de pesquisa e desenvolvimento em tecnologias digitais para informação e comunicação
CLUES: Cluster Energy Savings
VALLUM: Core component of the ATMOSPHERE TDMS
EC3: Elastic Compute Clusters in the Cloud
EC: European Commission
Fogbow: Framework for federating clouds
GDPR: General Data Protection Regulation
GPU: Graphics Processing Unit
ICT: Information and communications technology
IaaS: Infrastructure as a Service
IMS: Infrastructure Management Services
IM: Infrastructure Manager
INCT: Instituto Nacional de Ciência e Tecnologia
K8S: Kubernetes
LGPD: Lei Geral de Proteção de Dados (Brazilian General Data Protection Law)
LEMONADE: Live Exploration and Mining of massive Amounts of Data coming from Everywhere
MCTIC: Ministério da Ciência, Tecnologia, Inovações e Comunicações
CNPq: Conselho Nacional de Desenvolvimento Científico e Tecnológico
NoSQL: Not Only SQL
ODBC/JDBC: Open DataBase Connectivity / Java Database Connectivity
PAF: Privacy Assessment Framework
RNP: Rede Nacional de Ensino e Pesquisa
SCONE: Secure Linux Containers with SGX
SGX: Software Guard Extensions
SOTA: State of the Art
TOSCA: Topology and Orchestration Specification for Cloud Applications
IM-TOSCA: TOSCA-compliant version of Infrastructure Manager
TEE: Trusted Execution Environment
TDMS: Trustworthy Data Management Services
TDPS: Trustworthy Data Processing Services
TMA: Trustworthy Monitoring and Assessment
VM: Virtual Machine
Foreword by Ignacio Blanquer & Francisco Brasileiro
With the increasing trend towards data processing in the cloud (79% in 2019 [1] and 94% expected in 2021 [2]), the need for
security, privacy, fairness, and transparency in big data applications looms ever larger in the public consciousness. Big
data applications should comply with security and other trustworthiness properties, namely privacy, fairness, and
transparency, to avoid collateral damage. This is where ATMOSPHERE comes into play.
Over a 24 month period, the ATMOSPHERE project (www.atmosphere-eubrazil.eu), funded by the European Commission
and the Brazilian Government, designed and developed a set of toolboxes and federation services to build and evaluate
trustworthy data analytics applications in federated clouds. The ATMOSPHERE platform makes life easier for different
experts dealing with digital data. As a result, data owners, system administrators, application developers & managers,
and data scientists can now develop more trustworthy and secure cloud computing applications while being compliant
with data protection regulations on both sides of the Atlantic.
Under the savvy and experienced management and technical coordination of the Universitat Politècnica de València and the
Universidade Federal de Campina Grande, the fifteen partners of the ATMOSPHERE consortium developed eight services which,
together, support an entirely new spectrum of trustworthy services usable by a diverse set of organisations from
different business sectors: DNAt, Fogbow, IM-TOSCA/EC3, LEMONADE, Capacity Planner, SCONE, TMA and Vallum.
[1] Source: RightScale 2019 - State of the Cloud Report
[2] Source: Cisco Global Cloud Index: Forecast and Methodology, 2016-2021 White Paper
Figure 1: The ATMOSPHERE EU-Brazil international use case improving the trustworthiness of distributed big data
applications
An additional service, the RHD screening Artificial Intelligence tool, focuses on the health sector. It processes a large
set of medical images, along with additional metadata and clinical information, efficiently and securely, leading to better
and quicker disease diagnosis, and it allows trustworthiness to be evaluated in terms of performance, privacy,
availability, robustness, and dependability. ATMOSPHERE assets are already being used by research and business
organisations in both Europe and Brazil. These
include Dell, Vodane, Talkdesk, INDRA, as well as EGI Foundation and the Brazilian National Education and Research
Network (RNP). In addition, funded initiatives such as PRIMAGE, RESECA-CPS, EOSC Synergy and TETRAMAX will apply
the services in their work plans, and others are expected to use the assets, particularly those available in the European
Open Science Cloud (EOSC) marketplace portal.
ATMOSPHERE constitutes the latest step in a long trajectory of Europe-Brazil collaborative projects in cloud computing
that started back in 2010. EUBrazil Cloud Connect (eubrazilcloudconnect.eu) set up the basis for creating federated
infrastructures for scientific collaborations between Europe and Brazil. EUBraBIGSEA (www.eubra-bigsea.eu) created
a data analytics platform for building data science applications on top of cloud resources. SecureCloud
(securecloudproject.eu) built the basic foundations for the secure access and processing of data, which are used in SCONE.
Finally, ATMOSPHERE provided the concept of Trustworthiness and implemented the means to measure, monitor, assess
and improve it for data analysis applications.
Furthermore, ATMOSPHERE delivered a "Final Research & Innovation Research Priorities" report, proposing three
research topics for future joint initiatives between the two regions, with high potential economic impact in both. This
analysis is a new iteration of the one performed by the EUBrasilCloudFORUM project (eubrasilcloudforum.eu), which
produced a Final Research Roadmap between Brazil and Europe in 2017 that was transferred to and updated by
ATMOSPHERE.
The Cloudscape Brazil series, a forum to showcase success stories between Brazil and Europe, was first initiated by
the EUBrazil Cloud Connect project, later transferred to EUBrasilCloudFORUM and then to ATMOSPHERE. Following
up with the legacy of connecting ICT experts from both sides of the Atlantic, the Cloudscape Brazil editions organised
by ATMOSPHERE were the meeting point for business people, public sector representatives, research scientists and
developers behind some of the most exciting developments in cloud technologies from both regions. The event has
been facilitating consensus on actions that matter for European and Brazilian economies and socio-economic aspects,
as well as connecting innovative ICT SMEs to promote international collaboration with potential to have a positive
impact on local citizens.
Figure 2: The EU-Brazil long story of ICT collaboration
The trans-oceanic federated infrastructure built by ATMOSPHERE, which has resulted in technology transfer to new initiatives
and industry players, could not have been set up without a long history of collaboration between Europe and Brazil. This
collaboration has been benefiting both societies at large, and it has been an honour for us to play such an active role
in contributing to society. We look forward to joining the new opportunities that will arise in the future and to continuing
the legacy created over the past decade.
Ignacio Blanquer, Full Professor at Universitat Politècnica de València (Spain) &
Francisco Brasileiro, Full Professor at Universidade Federal de Campina Grande (Brazil)
[Figure 2 labels: high-level Data Analytics Framework (LEMONADE), a TOSCA orchestrator (IM) and a performance modelling system (DAGSIM); federation capabilities (Fogbow) to work in a trans-continental scenario; basic foundations (SCONE) for the secure access and processing of data; Cloudscape Brazil & Research Roadmap]
Summary of ATMOSPHERE
Adaptive, Trustworthy, Manageable, Orchestrated, Secure, Privacy-assuring, Hybrid Ecosystem for REsilient Cloud
Computing (2017-2019) --hereinafter “ATMOSPHERE”-- is a 24-month Research and Innovation Action, funded by the
European Commission under the H2020 Programme, Call identifier: H2020-EUB-2017, Grant Agreement No 777154,
topic: EUB-1-2017 Cloud computing, including security aspects; and the Secretary of Politics of Informatics (SEPIN) of
the Brazilian Ministry of Science and Technology (MCTI) under the corresponding matching Brazilian Call for proposals:
4ª Chamada Coordenada Programa de Cooperação Brasil-União Europeia em Tecnologias da Informação e Comunicação
– TIC.
ATMOSPHERE has designed and developed a framework and a platform to implement trustworthy cloud services on
a federated intercontinental resource pool. Trust in a cloud environment is considered as the reliance of a customer
on a cloud service and, consequently, on its provider. ATMOSPHERE focuses on a broad spectrum of trustworthiness
properties and their measures such as Security, Privacy, Coherence, Isolation, Stability, Fairness, Transparency and
Dependability. Based on the given definition of trust in cloud computing, trustworthiness can be defined as the
worthiness of a service and its provider for being trusted.
ATMOSPHERE supports the development, build, deployment, measurement and adaptation of trustworthy cloud
resources, data management and processing services, demonstrated on a sensitive scenario of distributed telemedicine,
achieving the following three technical results:
• A hybrid federated VM and container platform;
• A development framework with four sets of services:
  • Infrastructure Management Services (Cloud Computing Platform) - IMS;
  • Trustworthy Monitoring and Assessment Framework - TMA;
  • Trustworthy Distributed Data Management Services - TDMS;
  • Trustworthy Data Processing Services - TDPS.
• A pilot use case on Medical Imaging Processing.
ATMOSPHERE Targets
The layers of the ATMOSPHERE platform, described in the previous section, and the roles of the potential target users,
each represented by an avatar, are indicated below.
Application Developers: the technical experts who use the software libraries and services of the ATMOSPHERE platform to build trustworthy cloud applications.
Application Managers: the users who hold infrastructure credentials and deploy a specific application and the ATMOSPHERE services on top of an ATMOSPHERE infrastructure. They also manage the needs of the users with respect to the application.
Data Scientists: the users who make use of both the final applications and the high-level Trustworthy Data Processing Services deployed on top of the ATMOSPHERE platform by Application Managers.
Data Owners: the users who upload and share their data on the Trustworthy Data Management Services of the ATMOSPHERE platform.
Site Admins: the users who install and configure the Infrastructure Management Services on their cloud sites to implement the federation.
[Figure: layers of the ATMOSPHERE platform, from top to bottom: Application; Trustworthy Data Processing Services (TDPS); Trustworthy Data Management Services (TDMS); Infrastructure Management Services (IMS); Federated Infrastructure]
Progress with respect to SOTA by components
1. Progress in TMA
The Trustworthy Monitoring and Assessment (TMA) component provides a quantitative evaluation of the trustworthiness
of a service or application by composing the metrics obtained from different services and components. TMA defines
Quality Models, which are a hierarchical representation of several entities. Multiple Quality Models can be combined to
evaluate aggregated, higher-level metrics, taking into account customisable weights.
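As a rough illustration of how a hierarchical Quality Model with customisable weights could combine metrics, the following minimal Python sketch computes a weighted average over a tree of attributes. The class name QualityNode, the metric names, the weights and the scores are hypothetical and are not taken from the TMA implementation; leaf scores are assumed to be already normalised to the range [0, 1].

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class QualityNode:
    """A node in a hypothetical hierarchical quality model (not the TMA API)."""
    name: str
    weight: float = 1.0                 # relative weight among siblings
    score: Optional[float] = None       # leaf metric, assumed normalised to [0, 1]
    children: List["QualityNode"] = field(default_factory=list)

    def evaluate(self) -> float:
        """Return the leaf score, or the weighted average of the children."""
        if not self.children:
            if self.score is None:
                raise ValueError(f"leaf '{self.name}' has no score")
            return self.score
        total_weight = sum(child.weight for child in self.children)
        if total_weight <= 0:
            raise ValueError(f"children of '{self.name}' have no positive weight")
        weighted = sum(child.weight * child.evaluate() for child in self.children)
        return weighted / total_weight


# Illustrative model: names, weights and scores are invented for this sketch.
trustworthiness = QualityNode("trustworthiness", children=[
    QualityNode("performance", weight=1.0, children=[
        QualityNode("response_time", weight=3.0, score=0.9),
        QualityNode("throughput", weight=1.0, score=0.5),
    ]),
    QualityNode("privacy", weight=2.0, score=0.7),
])

overall = trustworthiness.evaluate()
print(f"overall trustworthiness score: {overall:.3f}")
```

Changing a weight changes how strongly a sub-model influences the aggregated score, which is the kind of customisation the paragraph above refers to.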
The TMA is a service relevant to application managers, who can monitor the application and virtual infrastructure;
application developers, who can develop callbacks to adapt the applications when trustworthiness thresholds are
not met; data scientists, who can gather information on the high-level trustworthiness metrics related to stability and
fairness; and data owners, who can define and evaluate the privacy re-identification risks of the datasets they want
to share.
Despite the existing monitoring solutions, without ATMOSPHERE it is hard to monitor cloud health, as we typically
get separate measurements from different metrics that are not linked to one another. There is a lack of data that represents
the health of the application together with that of the cloud services. Moreover, adaptations in case of a decline in the
trustworthiness of an application are manual and potentially error-prone. Finally, all these tasks are
time-consuming.
Figure 2: Issues (left) and solutions (right) addressed by TMA.
ATMOSPHERE provides an integrated, detailed and configurable measure of the application cloud health through
navigable quality models that aggregate the information from the infrastructure services, platform services and
application services. Moreover, the TMA can trigger adaptation plans without the need for human interaction to mitigate
even complex scenarios such as anomalies in the execution time, lack of accuracy, reduced isolation or unacceptable
reidentification risks. ATMOSPHERE TMA reduces the time for human interaction and provides historical data of the
status of the application.
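The idea of triggering adaptation plans when a trustworthiness threshold is not met, while keeping historical data, can be sketched as follows. This is only an assumption-laden illustration: the AdaptationPlanner class, its methods, the metric name and the threshold are hypothetical and do not reproduce the actual TMA interfaces.

```python
from typing import Callable, Dict, List, Tuple

# An action receives the metric name and the score that triggered it.
AdaptationAction = Callable[[str, float], None]


class AdaptationPlanner:
    """Hypothetical planner: runs registered actions when a score drops below a threshold."""

    def __init__(self) -> None:
        self._rules: Dict[str, List[Tuple[float, AdaptationAction]]] = {}
        self.history: List[Tuple[str, float]] = []   # every (metric, score) reported

    def on_below(self, metric: str, threshold: float, action: AdaptationAction) -> None:
        """Register an action to run whenever `metric` is reported below `threshold`."""
        self._rules.setdefault(metric, []).append((threshold, action))

    def report(self, metric: str, score: float) -> int:
        """Record a score, trigger the matching actions and return how many fired."""
        self.history.append((metric, score))
        fired = 0
        for threshold, action in self._rules.get(metric, []):
            if score < threshold:
                action(metric, score)
                fired += 1
        return fired


triggered: List[str] = []
planner = AdaptationPlanner()
planner.on_below("isolation", 0.8, lambda m, s: triggered.append(f"scale out: {m}={s}"))

planner.report("isolation", 0.95)   # above the threshold: no action runs
planner.report("isolation", 0.60)   # below the threshold: the action runs
print(triggered)
```

In a real deployment the action would call an orchestrator rather than append to a list; the list is used here only to keep the sketch self-contained.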
2. Progress in TDPS
The Trustworthy Data Processing Services (TDPS) of ATMOSPHERE provide a framework to implement trustworthy
data analytic applications in the cloud. The TDPS provides means for the annotation of the legal framework for data and
services and a programming environment that generates executable code for data analytics from graphic workflows
(Live Exploration and Mining of A Non-trivial Amount of Data from Everywhere - LEMONADE). LEMONADE provides
Data Scientists and Application Developers with measures for fairness, explainability, privacy and stability of Machine
Learning Applications integrated in a Quality Model. Application Managers can link application back-ends to LEMONADE
for the execution of data analytics with automatic management of resources through horizontal elasticity.
Without ATMOSPHERE, developers and data scientists have a hard time finding out whether models use sensitive attributes
or discriminate (i.e., are unfair). Models may also embed too much private information from the training data,
increasing the risk of indirectly exposing such data through the models. Additionally, model-based estimations
and predictions are black boxes, and data scientists face a hard time debugging and calibrating them. Therefore,
developers and data scientists have to determine by themselves whether sensitive attributes are the basis of models and
audit their fairness. Developers also have to code their own mechanisms for understanding the trade-off between bias and
variance, as well as the model's stability.
Figure 3: Issues (left) and solutions (right) addressed by TDPS.
With ATMOSPHERE, data scientists can easily develop data processing workflows without requiring programming
skills and incorporate components that implement the evaluation of fairness and stability. Application developers
can leverage several stability, explanation evaluation and fairness evaluation mechanisms and incorporate them into
their applications in a straightforward way. Moreover, data scientists get both model outcomes and the respective
explanations, which can be used to debug and calibrate models more easily.
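To make the notion of a fairness evaluation more concrete, the sketch below computes one widely used measure, the demographic parity difference: the gap between the rates of positive predictions for two groups defined by a sensitive attribute. It is offered as an example of such a measure only; the text does not state which fairness metrics LEMONADE implements, and the function name and data are hypothetical.

```python
from typing import Sequence


def demographic_parity_difference(predictions: Sequence[int],
                                  sensitive: Sequence[int]) -> float:
    """Absolute gap in positive-prediction rates between groups 0 and 1.

    `predictions` holds binary model outputs (0 or 1); `sensitive` holds the
    binary group membership of each sample. A value of 0 means both groups
    receive positive predictions at the same rate.
    """
    if len(predictions) != len(sensitive):
        raise ValueError("inputs must have the same length")
    rates = []
    for group in (0, 1):
        members = [p for p, s in zip(predictions, sensitive) if s == group]
        if not members:
            raise ValueError(f"group {group} has no samples")
        rates.append(sum(members) / len(members))
    return abs(rates[0] - rates[1])


# Invented toy data: eight predictions, four samples per group.
preds = [1, 0, 1, 1, 0, 0, 1, 0]
group = [0, 0, 0, 0, 1, 1, 1, 1]
gap = demographic_parity_difference(preds, group)
print(f"demographic parity difference: {gap:.2f}")
```

A large gap suggests that the model's outcomes depend on the sensitive attribute, which is the kind of signal a data scientist would then investigate.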
3. Progress in TDMS
The Trustworthy Data Management Services (TDMS) in ATMOSPHERE include a set of services and components
to securely store and access sensitive information even in untrusted cloud services (Vallum). The TDMS includes
mechanisms for Data Scientists and Application Managers to store data encrypted on disk, process them encrypted in
memory and guarantee that only authorised processes can access such data, preventing even users with administrative
credentials from accessing data on disk or in memory. Application Developers can leverage TDMS for the secure storage of
and access to sensitive data. The TDMS provides Data Owners and Data Scientists with means for privacy attestation
and annotation of legal grounds.
Without ATMOSPHERE, Application Managers have to trust the infrastructure provider's administrators, who may
be able to access data volumes or dump the memory of a running Virtual Machine. When a data owner shares
access to valuable, but sensitive, data for training AI models, access tracking is limited, as credentials are typically given
at the level of the user, who is then able to do any kind of processing. Data owners and data scientists do not know
whether operations may lead to privacy violations.
Figure 4: Issues (left) and solutions (right) addressed by TDMS.
By using ATMOSPHERE, a data owner can share sensitive data with the data scientist but, as the data is sensitive, the
data owner explicitly limits which applications can use it and where those applications can run (e.g. only on trustworthy
providers). By running in enclaves, the information remains encrypted in memory, so privacy is preserved even in
untrusted cloud offerings. Data owners are aware of the risk of privacy violation associated with a given operation and
may request countermeasures to prevent its execution. Finally, by means of the Privacy Access Forms application, both
the data owner and the data scientist can annotate the legal grounds that require the processing of the data, even in an
international scenario.
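One simple way to quantify a re-identification risk, given here purely as an illustration, is the worst-case risk used in k-anonymity analysis: records are grouped by their quasi-identifiers, and the risk is the inverse of the size of the smallest group. The text does not say which risk model ATMOSPHERE applies, so the function, the column names and the records below are assumptions made for this sketch.

```python
from collections import Counter
from typing import Dict, List, Sequence


def reidentification_risk(records: List[Dict[str, str]],
                          quasi_identifiers: Sequence[str]) -> float:
    """Worst-case risk: 1 / size of the smallest group sharing the same quasi-identifiers."""
    if not records:
        raise ValueError("no records")
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return 1.0 / min(groups.values())


# Invented toy dataset; "diagnosis" is the sensitive value, the rest are quasi-identifiers.
records = [
    {"age_band": "30-39", "zip": "460", "diagnosis": "A"},
    {"age_band": "30-39", "zip": "460", "diagnosis": "B"},
    {"age_band": "40-49", "zip": "460", "diagnosis": "A"},
    {"age_band": "40-49", "zip": "460", "diagnosis": "C"},
    {"age_band": "40-49", "zip": "461", "diagnosis": "B"},
]

risk_fine = reidentification_risk(records, ["age_band", "zip"])
risk_coarse = reidentification_risk(records, ["age_band"])
print(f"risk with age band and zip: {risk_fine:.2f}; with age band only: {risk_coarse:.2f}")
```

Releasing fewer or coarser quasi-identifiers enlarges the groups and lowers the risk, which is the kind of countermeasure a data owner could request before an operation is executed.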
4. Progress in IMS
The Infrastructure Management Services (IMS) of ATMOSPHERE is a framework for cloud federation, cloud orchestration
and performance modelling that provides the upper layers of ATMOSPHERE with a reliable and efficient framework for
running distributed cloud applications in international collaborations. The IMS includes a lightweight cloud federation
framework (Fogbow) that can be used to deploy cloud applications across several cloud sites, using a federated private
network among them, easing the work of site administrators when managing distributed infrastructures. On top of this
federated cloud, the IMS provides a cloud orchestrator (IM-TOSCA) that can be used to deploy virtual infrastructures
described as code. This way, application managers can easily deploy applications and their associated dependencies in
the cloud without strongly binding to the back-ends. The work of application managers in monitoring the applications
is reduced, as performance models characterise their expected behaviour, so deviations can be automatically detected.
Without ATMOSPHERE, site administrators have to manually configure cloud federation resources. This requires close
coordination among site administrators to federate cloud sites and to allow private connectivity between resources of
different sites. Application managers have to manually configure the applications or develop scripts to automate their
configuration, and such scripts are typically bound to platform specifics, reducing repeatability and portability. Finally,
applications run without a priori performance guarantees, which requires extra effort by the application managers to
monitor them and manually fine-tune the allocated resources if applications have strict deadlines for their execution.
Figure 5: Issues (left) and solutions (right) addressed by IMS.
ATMOSPHERE provides templates for the automatic deployment and adaptation of complex and distributed
applications on cloud resources and network federations. The cloud federation frees site administrators from the
burden of managing the credentials of external users, delegating trust to the application managers, and provides the
application managers with a federated private network, which reduces the need for public IPs, which are both a scarce
resource and a vulnerability risk. Finally, performance modelling can determine the right allocation of resources,
minimising waste while still achieving the results within a given deadline.
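How a performance model can drive resource allocation for a deadline can be sketched with a deliberately simple, Amdahl-style model in which the runtime on n nodes is a serial part plus a parallel part divided by n. This is not the project's performance modelling system; the model, the function names and the numbers are assumptions chosen only to illustrate the idea of picking the smallest allocation that meets the deadline.

```python
from typing import Optional


def predicted_runtime(serial_s: float, parallel_s: float, nodes: int) -> float:
    """Assumed model: serial work plus perfectly divisible parallel work (seconds)."""
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    return serial_s + parallel_s / nodes


def min_nodes_for_deadline(serial_s: float, parallel_s: float,
                           deadline_s: float, max_nodes: int) -> Optional[int]:
    """Smallest node count whose predicted runtime meets the deadline, or None."""
    for nodes in range(1, max_nodes + 1):
        if predicted_runtime(serial_s, parallel_s, nodes) <= deadline_s:
            return nodes
    return None


# Invented workload: 60 s of serial work and 1200 s of parallelisable work.
nodes_needed = min_nodes_for_deadline(serial_s=60, parallel_s=1200,
                                      deadline_s=300, max_nodes=16)
print(f"nodes needed for a 300 s deadline: {nodes_needed}")

# A deadline shorter than the serial part cannot be met with any allocation.
impossible = min_nodes_for_deadline(serial_s=60, parallel_s=1200,
                                    deadline_s=50, max_nodes=16)
```

Allocating more nodes than the returned value would still meet the deadline but waste resources, while allocating fewer would miss it under this model.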
Progress with respect to SOTA overall
The whole ATMOSPHERE platform constitutes a novel approach for the complete management of trustworthiness in
cloud applications related to data analytics, providing components for secure storage and processing, user-friendly
building of data analytic pipelines, efficient execution and legal compliance support in a cloud-agnostic and federated
environment. A summary of the benefits for the four user roles is provided in Figure 6.
Figure 6: Issues (left) and solutions (right) addressed by ATMOSPHERE.
ATMOSPHERE is funded by the European Union under the Cooperation Programme, Horizon 2020 grant agreement No. 777154.
This project results from the 4th BR-EU Coordinated Call in Information and Communication Technologies (ICT), announced by the Rede Nacional de Ensino e Pesquisa (RNP) and the Ministério da Ciência, Tecnologia, Inovações e Comunicações (MCTIC), under cooperation agreement No. 51119.