Enterprise Security for Big Data Environments
A Multi-Layered Architecture for Defense-in-Depth Protection
THOUGHT LEADERSHIP FROM ORACLE AND INTEL® | JULY 2016

1 | ENTERPRISE SECURITY FOR BIG DATA ENVIRONMENTS: A WHITE PAPER FROM ORACLE AND INTEL

Introduction: More Data, More Risks

IT professionals have always been tasked with ensuring the safety and provenance of corporate information. Today that responsibility is magnified for two primary reasons: there is more data and there is more risk. At home and at work, our appetite for data is insatiable. From book recommendations to climate research, manufacturing to healthcare, tiny sensors to giant mainframes, the sources and amounts of data continue to expand, along with our ability to store, exchange, and analyze that data for personal and professional use. According to researchers at IDC, the digital universe will be 44 times bigger in 2020 than it was in 2009, a byproduct of the digitization of nearly every activity in personal, public, and commercial life.

Even as data grows and becomes more essential to our livelihood, the risk of its misuse escalates as well. In every industry, organizations are embarking on big data initiatives. More data and more information systems mean more opportunities for intrusion. Practically every day we hear news stories about data breaches involving high-profile banks, vendors, and retailers, not to mention countless other cases of personal attacks on our individual systems and identities. Even big data that has been anonymized can be correlated with other data sets to discern personally identifiable information.

Business and IT executives are learning through harsh experience that big data brings big security headaches. – MIT Technology Review

Unfortunately, most security technologies aren’t foolproof, leaving organizations exposed to malicious code and criminal intentions. Traditional IT security takes a haphazard approach to quelling threats, even as destructive malware is set loose in plain view on the Internet. The majority of IT security budgets are used to protect the network, with less than a third used to directly protect the data and intellectual property that reside inside the organization, according to CSO Market Pulse. Network firewalls and antivirus software packages do little to prevent these security breaches, most of which involve tricking end-users into running malicious programs on their desktops, thus invalidating firewall protection.

This white paper describes a comprehensive big data security strategy from Oracle and Intel that protects big data environments at multiple levels. This three-pillared strategy focuses on essential controls that secure data at the source. The strategy includes:

• Preventive controls to mitigate unauthorized access to sensitive systems and data

• Detective controls that reveal unauthorized system and data changes through auditing, monitoring, and reporting

• Administrative measures that help keep track of sensitive data, so you always know where all your big data resides, and who is authorized to access it.

The Many Uses of Big Data

Architecturally, big data consists of highly distributed systems, linked by inter-node communication technologies. In most cases, that data is online and is shared across many different functional components. It is accessible to authorized users on internal networks. IDC identifies three primary big data use cases:

Operational intelligence focuses on high-velocity data streaming and event processing that facilitates up-to-the-moment decision-making. It is often tied to sense-and-respond processes that entail monitoring a stream for specific events and then queuing up an appropriate response. These systems may involve a feedback loop in which a real-time data stream is monitored for events, and then the raw data from the stream is loaded into a database for additional analysis. For example, sensors on an assembly line can detect when a machine is out of tolerance. After-the-fact analytics can then determine what is causing a recurring problem.

Exploration and discovery is geared towards discovering signals, relationships, and patterns in the data. The goal is to uncover insights that impact decision making as well as to monitor organizational performance to establish best practices, make informed predictions, and deliver actionable insights from a steady stream of information.

Performance management involves strategic decisions about past performance. By supplementing traditional data warehouse analytics with Big Data analytics you can increase the timeliness of business reporting as well as entertain new types of data sources, from IoT sensors to cellular call data to social media streams.
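The sense-and-respond loop described under operational intelligence can be sketched in a few lines of Python. This is a minimal illustration only: the `Reading` and `Monitor` names, the tolerance values, and the in-memory `archive` list (standing in for a database) are all assumptions made for the example, not part of any Oracle or Intel product.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Reading:
    machine_id: str
    value: float  # measured dimension, arbitrary units (hypothetical)


@dataclass
class Monitor:
    nominal: float
    tolerance: float
    alerts: list = field(default_factory=list)
    archive: list = field(default_factory=list)  # stand-in for a database

    def process(self, reading: Reading) -> None:
        # Sense and respond: flag an out-of-tolerance event as it arrives.
        if abs(reading.value - self.nominal) > self.tolerance:
            self.alerts.append(reading.machine_id)
        # Feedback loop: keep every raw reading for after-the-fact analysis.
        self.archive.append(reading)

    def recurring_offenders(self, min_alerts: int = 2) -> list:
        # After-the-fact analytics: which machines alert repeatedly?
        counts = Counter(self.alerts)
        return sorted(m for m, n in counts.items() if n >= min_alerts)


monitor = Monitor(nominal=10.0, tolerance=0.5)
for r in [Reading("press-1", 10.1), Reading("press-2", 10.9),
          Reading("press-2", 9.2), Reading("press-1", 10.6)]:
    monitor.process(r)

print(monitor.recurring_offenders())  # ['press-2']
```

The real-time check and the archived raw data serve different purposes: the first triggers an immediate response, the second supports later root-cause analysis.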

Security Threats and Limitations

Generally speaking, outsiders are prevented from accessing big data environments by traditional perimeter security at the boundaries of a private network. However, with today’s sophisticated break-in strategies, perimeter security is no longer adequate. Hacking has evolved from “crime for kicks,” carried out by mischievous youth, to global espionage, hacktivism, and black market crime carried out by sophisticated crime syndicates and money launderers. These organized criminals exist solely to rob individuals and organizations of their money and intellectual property. Criminals often try to lift health information, credit card numbers, and other vital information in order to sell it on the black market.

No company wants its data to be compromised or its systems to be breached. However, most traditional IT security practices aren’t strong enough to resist the new types of malware, phishing schemes, botnets, and SQL injection attacks unleashed by cybercriminal organizations. When it comes to detrimental security breaches, it is no longer a question of if, but when.

Perimeter-based approaches to security are no longer sufficient. A CSO Market Pulse survey found that two-thirds of security budgets are used to protect the network, with less than a third used to directly protect the data and intellectual property that reside inside the organization.

Today’s big data environments often include both sensitive and nonsensitive data (including anonymized data). Attackers can correlate anonymized data sets with other sources to identify people and their preferences. For example, one high-profile test case involved an anonymized data set released by Netflix. Security researchers correlated this data with Internet Movie Database (IMDb) data to identify members of both services, and then compared the two data sets to show how they could discover political leanings, sexual preferences, and other personal information, all based on the movies people watched. Another company looked at a data set about New York City taxicab services (pickups, drop-offs, fare amounts) and then correlated it with people in the area to figure out where certain people, including celebrities, tended to go. The same method could potentially be extended to tracking political figures as well.
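The correlation technique behind such cases is a simple join on shared attributes. The following Python sketch uses invented toy data and a deliberately naive matching rule (two or more identical title-and-date pairs); the names `link_records`, `anonymized`, and `public` are hypothetical, and the real studies used more careful statistical matching.

```python
def link_records(anonymized, public, min_overlap=2):
    """Match anonymized IDs to named profiles that share rating events."""
    matches = {}
    for anon_id, anon_events in anonymized.items():
        for name, public_events in public.items():
            # Several identical (title, date) pairs are unlikely by chance.
            if len(anon_events & public_events) >= min_overlap:
                matches[anon_id] = name
    return matches


# Toy data: sets of (title, rating date) pairs. Entirely made up.
anonymized = {
    "user_17": {("Film A", "2005-03-01"), ("Film B", "2005-03-04"),
                ("Film C", "2005-04-11")},
    "user_42": {("Film A", "2005-03-02"), ("Film D", "2005-05-20")},
}
public = {
    "alice": {("Film A", "2005-03-01"), ("Film C", "2005-04-11")},
    "bob": {("Film D", "2005-06-01")},
}

print(link_records(anonymized, public))  # {'user_17': 'alice'}
```

Removing names from a data set does not help if the remaining attributes are distinctive enough to act as a fingerprint.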

Security Issues With Hadoop

Many of today’s big data projects incorporate Apache Hadoop, an open-source framework for storing and processing big data in a distributed fashion. Business analysts load data into Hadoop to detect patterns and extract insights from structured, semi-structured, and unstructured data. Unfortunately, not all organizations have strong data security in place for these activities. There may be personally identifiable information and intellectual property loaded into these data sets.

Initially developed as a way to distribute big data processing jobs among many clustered servers, the Hadoop architecture wasn’t built with security in mind. Namely, it lacks access controls on the data, including password controls, file and database authorization, and auditing. As such, it doesn’t by itself meet the requirements of important regulations and industry standards such as the Health Insurance Portability and Accountability Act (HIPAA) and the Payment Card Industry Data Security Standard (PCI DSS). In the European Union, the General Data Protection Regulation (GDPR) introduces many additional obligations for companies, with fines of up to 4 percent of annual worldwide turnover or €20 million, whichever is higher, for companies that don’t comply. According to the regulation, both data controllers and data processors may be subject to court proceedings and have to pay compensation to victims for infringements.
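To make the scale of the GDPR penalty concrete: under the regulation, the upper fine tier is the greater of €20 million and 4 percent of annual worldwide turnover. The short Python sketch below works through that arithmetic; the function name and the sample turnover figures are illustrative assumptions.

```python
def gdpr_fine_ceiling(annual_turnover_eur: float) -> float:
    """Upper bound of the higher GDPR fine tier, in euros."""
    # The ceiling is whichever amount is higher.
    return max(20_000_000, annual_turnover_eur * 4 / 100)


# A company with EUR 100 million turnover: 4% is 4 million, so the
# EUR 20 million figure applies.
print(gdpr_fine_ceiling(100_000_000))    # 20000000
# A company with EUR 1 billion turnover: 4% is 40 million.
print(gdpr_fine_ceiling(1_000_000_000))  # 40000000.0
```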

A Hybrid Approach to Big Data Security from Oracle and Intel

Many big data projects begin with a small test group in an isolated sandbox environment and then steadily grow into large-scale production implementations. At some point they go online, often before proper security controls have been implemented. This progression endangers not only the data environment but other production systems as well.

Whether moving data to the cloud or storing data on premises, customers want to know how to secure all of their structured and unstructured data. Oracle and Intel offer a hybrid approach that preserves investments in existing databases while allowing you to leverage data coming in from other sources.

The Oracle big data environment consists of several different technologies including Oracle Big Data Appliance with Hadoop Distributed File System along with Oracle Database and Oracle NoSQL Database. Intel adds industry-leading encryption technology within its Xeon® processor family. Oracle also offers a multi-layered defense-in-depth security architecture, which will be covered in more detail below.

Cloudera Enterprise software enables real-time analytics on massive data sets with enterprise-class data protection. These innovative capabilities enhance open-source Apache Hadoop solutions. Oracle and Intel enhance these implementations as follows:

• Cloudera Enterprise can be run on Oracle Big Data Appliance to achieve industry-leading performance

• Oracle supplies integrated security, with access and data protection at each layer via accelerated encryption capabilities

• Cloudera Enterprise software includes built-in support for enterprise-class access controls. It is also optimized for Intel® Advanced Encryption Standard New Instructions (Intel® AES-NI), a technology that is built into Intel® Xeon® processors.

In the remainder of this paper, we will explain how these unique and complementary technologies enable a complete security strategy for big data implementations that addresses all the crucial aspects of infrastructure security, data privacy, data management, data integrity, and reactive security.

Introduction to Oracle Defense-in-Depth Security

What are the main causes of most security breaches? Inadequate security controls, excessive privileges granted to internal users, and an over-reliance on network and perimeter security. Defense-in-depth is an information assurance strategy in which multiple layers of security are established throughout the IT infrastructure. Oracle and Intel use this proven approach to extend security and encryption technology all the way down to the silicon layer.

Having redundant controls provides exceptional resiliency in the event of a security breach. If a vulnerability is discovered and exploited in one layer, the attacker will invariably be stopped in another layer—much like a medieval castle with a moat, iron doors, heavily guarded ramparts, and so forth. When properly implemented, a defense-in-depth security strategy not only prevents breaches, but also buys an organization time to detect and respond to attacks, reducing or mitigating the consequences of the breach.

Oracle’s multilayered, defense-in-depth security strategy utilizes three sets of controls: preventive, detective, and administrative.

In a layered, defense-in-depth security architecture, everything on top inherits the security from below, from the silicon to the firmware to the operating system to the applications to the middleware to the data. This is the most secure and efficient way to set up a big data environment since it maximizes the security controls at each layer. For example, database security is more efficient than application security since the database underlies the applications. You might have hundreds or even thousands of applications. Rather than coding encryption instructions into each application, it is much easier and more effective to handle encryption in the database. As each of the applications calls the database, the data is encrypted, both at rest and in transit.

Another reason to “push security down” as low as possible in the stack is that it has less of an impact on performance. The ultimate goal is to push security down to the silicon layer, so that data is encrypted within the processor, or “chip.” This allows for safe, high-performance, in-memory processing.

Oracle’s in-memory processing technology allows you to process many gigabytes of data in memory, at the silicon layer, rather than retrieving data from disk drives. Thanks to Intel AES-NI technology, data in memory as well as in transit can be encrypted with minimal impact on performance.

Preventive Controls

Preventive controls stop intruders from gaining unauthorized access to systems and data. They also help to govern administrative access by putting realms around the database. Administrators can only access data from the realms for which they are authorized.

Along with encryption, Oracle Advanced Security controls include data redaction, which redacts sensitive data out of the application layer. Users looking at an application may see asterisks instead of actual information. For example, social security numbers might reveal only the last four digits for reference purposes. The data in the database is encrypted, but redacted when viewed.
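The effect of redaction on what a user sees can be illustrated with a minimal Python sketch. This is not how Oracle Advanced Security implements redaction (which is applied by the database according to policy); the `redact_ssn` function and its output format are assumptions chosen to mirror the last-four-digits example above.

```python
import re


def redact_ssn(ssn: str) -> str:
    """Return a display form that reveals only the last four digits."""
    digits = re.sub(r"\D", "", ssn)  # drop dashes, spaces, etc.
    if len(digits) != 9:
        raise ValueError("expected a 9-digit social security number")
    return f"***-**-{digits[-4:]}"


# The stored value is unchanged; only the displayed value is redacted.
print(redact_ssn("123-45-6789"))  # ***-**-6789
```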

Data masking is similar to redaction, but for nonproduction environments. With Oracle Data Masking and Subsetting, sensitive information such as credit card numbers and social security numbers can be replaced with non-factual values, allowing production data to be safely used for development, testing, or sharing with partners. This comes into play in situations such as when a third party is testing an organization’s code. During testing, information such as credit card numbers is substituted with appropriate data, rather than actual numbers.
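Masking differs from redaction in that the substitute value must still behave like real data in tests. As a rough sketch of the idea, and not of Oracle Data Masking and Subsetting itself, the Python below replaces a card number with random digits plus a valid Luhn check digit, so the masked value still passes typical format validation. The function names are hypothetical.

```python
import random


def luhn_check_digit(partial: str) -> str:
    """Compute the Luhn check digit for a number missing its last digit."""
    total = 0
    # Walk right to left; the rightmost digit of `partial` is doubled first.
    for i, ch in enumerate(reversed(partial)):
        d = int(ch)
        if i % 2 == 0:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return str((10 - total % 10) % 10)


def luhn_valid(number: str) -> bool:
    return luhn_check_digit(number[:-1]) == number[-1]


def mask_card(real_number: str, rng: random.Random) -> str:
    """Replace a card number with a non-factual one of the same length."""
    body = "".join(str(rng.randrange(10)) for _ in range(len(real_number) - 1))
    return body + luhn_check_digit(body)


masked = mask_card("4111111111111111", random.Random())
print(len(masked), luhn_valid(masked))  # 16 True
```

Because the masked value carries no relationship to the original, it can be handed to a third-party tester without exposing the real number.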

Oracle Database Vault increases the security of the Oracle database by preventing unlimited, ad-hoc access to application data from administrative accounts as well as by governing legitimate administrative activity.

Oracle Label Security protects sensitive data by assigning a data label or data classification to each row in an application table. It mediates access by comparing the data label against the label of the user requesting access.
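The label comparison can be pictured as a per-row filter. The Python sketch below assumes a simple ordered set of sensitivity levels; the level names, the `LEVELS` mapping, and the `visible_rows` function are invented for illustration and do not reflect Oracle Label Security's actual label model, which is richer than a single ordered level.

```python
# Hypothetical sensitivity levels, ordered from least to most sensitive.
LEVELS = {"PUBLIC": 0, "CONFIDENTIAL": 1, "SECRET": 2}


def visible_rows(rows, user_label):
    """Return only the rows whose label does not exceed the user's label."""
    clearance = LEVELS[user_label]
    return [row for row in rows if LEVELS[row["label"]] <= clearance]


rows = [
    {"id": 1, "label": "PUBLIC"},
    {"id": 2, "label": "CONFIDENTIAL"},
    {"id": 3, "label": "SECRET"},
]

print([r["id"] for r in visible_rows(rows, "CONFIDENTIAL")])  # [1, 2]
```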

The Encryption Paradox

Data encryption is at the heart of a good prevention strategy. It helps address privacy and regulatory requirements by encrypting personally identifiable information such as social security and credit card numbers. Unfortunately, many businesses don't use encryption because of a perceived performance hit associated with encrypting and decrypting the data. Traditional encryption software requires compute-intensive processing that can slow down querying, reporting, and analytics, putting a thorn in the side of big data security.

Thanks to a close engineering relationship between Oracle and Intel, customers no longer need to choose between performance and data protection. Intel® Data Protection Technology with Advanced Encryption Standard New Instructions (AES-NI) reduces performance latency for encryption and decryption operations at the silicon level, for all big data operations.

Details on the Intel Encryption Solution

Intel tests have shown that Intel AES-NI can accelerate encryption and decryption performance in an Apache Hadoop cluster by up to 17x, as measured by in-memory data processing with AES in CTR mode. The process is transparent to users. It can be applied on a file-by-file basis, and it works in combination with a broad range of standards-based key management solutions. When an encrypted file enters the Apache Hadoop environment, it remains encrypted in HDFS. It is decrypted as needed for processing and re-encrypted before it is moved back into storage. The results of all analysis activities are also encrypted, including intermediate results. Data and results are never stored or transmitted in unencrypted form.
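The structure of counter (CTR) mode helps explain why it suits this workload: a keystream is generated block by block from a key, a nonce, and a counter, then XORed with the data, so encryption and decryption are the same operation. The Python sketch below shows that structure only. It deliberately substitutes HMAC-SHA256 from the standard library for the AES block cipher, so it is not AES-CTR, provides no integrity protection, and must not be used to protect real data. In an actual deployment, the per-block function would be AES, which is the step AES-NI accelerates in hardware.

```python
import hashlib
import hmac


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Generate `length` keystream bytes from key, nonce, and a counter."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        # Stand-in for the AES block function: PRF(key, nonce || counter).
        block = hmac.new(key, nonce + counter.to_bytes(8, "big"),
                         hashlib.sha256).digest()
        out.extend(block)
        counter += 1
    return bytes(out[:length])


def ctr_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt: XOR the data with the keystream."""
    ks = keystream(key, nonce, len(data))
    return bytes(a ^ b for a, b in zip(data, ks))


key = b"k" * 32      # illustrative only; real keys come from a key manager
nonce = b"n" * 8     # must never repeat for the same key
record = b"example record loaded for analysis"

ciphertext = ctr_xor(key, nonce, record)     # stored form
plaintext = ctr_xor(key, nonce, ciphertext)  # decrypted for processing
assert plaintext == record
```

Each keystream block depends only on the counter value, so blocks can be computed independently of one another.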

These advanced encryption capabilities allow you to take full advantage of Apache Hadoop while protecting sensitive data and complying with industry regulations including the Payment Card Industry (PCI) security standard and the Health Insurance Portability and Accountability Act (HIPAA). You can enable HDFS Transparent Encryption for an entire Cloudera Enterprise cluster with no significant performance penalty.

Detective Controls

Detective controls reveal enterprise-wide changes through auditing and reporting. While encryption and access control are key components of protecting data, a comprehensive monitoring system must also be in place. In the same way that video surveillance cameras supplement alarm systems inside and outside business buildings, monitoring inbound requests inside file servers, operating systems, and databases is core to data protection. Detective controls centralize auditing and reporting across your organization so you can detect whether a security breach has occurred or your system has been compromised.
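The idea of centralizing audit records from several sources and checking them against what is authorized can be sketched as follows. The `AuditEvent` fields, the allow-list format, and the sample events are all assumptions made for this Python illustration; a production audit system would also handle timestamps, tamper protection, and alerting.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuditEvent:
    source: str    # e.g. "database", "os", "file-server"
    user: str
    action: str    # e.g. "read", "update"
    resource: str


def find_violations(events, allowed):
    """Return events whose (user, action, resource) is not authorized."""
    return [e for e in events
            if (e.user, e.action, e.resource) not in allowed]


# Hypothetical allow-list of authorized activity.
allowed = {
    ("analyst", "read", "sales_table"),
    ("dba", "update", "db_config"),
}

# Events collected from different systems into one place.
events = [
    AuditEvent("database", "analyst", "read", "sales_table"),
    AuditEvent("database", "dba", "read", "sales_table"),
    AuditEvent("os", "dba", "update", "db_config"),
]

for e in find_violations(events, allowed):
    print(e.source, e.user, e.action, e.resource)
# database dba read sales_table
```

Here the administrator's legitimate configuration change passes, while the same account reading application data is flagged for review.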

No question about it: In the age of big data, organizations need to adopt a data-centric approach to security. Specifically, they need to employ three key types of security controls: Preventive, Detective, and Administrative. – MIT Technology Review

Administrative Controls

Oracle’s administrative controls include security processes and procedures that help you keep track of sensitive data. Knowing precisely where all your big data resides enables you to systematically administer the environment while ensuring that there are no unauthorized changes in the database environment.

One of the identity management challenges enterprises face is the lack of a single source for identity data and the proliferation of identity stores, including directories and databases. Oracle solves this problem with Oracle Internet Directory, a general-purpose, LDAPv3-compliant directory store that serves as a central user repository for defining access to Big Data applications, simplifying user administration and providing a standards-based application directory for the entire enterprise.

Oracle Internet Directory works in conjunction with Oracle Access Manager, a comprehensive solution for web access management and user identity administration that includes an Access System and an Identity System. The Access System secures Big Data applications by providing centralized authentication, authorization and auditing to enable single sign-on and secure access control across enterprise resources. The Identity System manages information about individuals, groups and organizations. It also enables delegated administration of users, as well as self-registration interfaces with approval workflows.

Oracle Big Data Management

Oracle offers a hybrid approach to big data processing that accommodates relational, NoSQL, and unstructured data. Hadoop, Oracle Database, and Oracle NoSQL Database become key components of the big data ecosystem, thanks to Oracle’s industry-leading big data technologies. Many technologies come into play, but big data management is anchored by two key products:

• Oracle Big Data Cloud Service running Cloudera Enterprise and Oracle NoSQL Database

• Oracle Big Data SQL for unifying queries across Oracle Database and Hadoop

Many organizations gravitate to the cloud or to pre-built clusters such as the Oracle Big Data Appliance so they won't have to spend the time and effort to create a commodity cluster, which requires specialized engineering skills to deploy, optimize, and tune for real-time data analysis. Powered by fast, efficient Intel® Xeon® processors, Oracle Big Data Cloud Service and Oracle Big Data Appliance are optimized for Apache Hadoop and other types of large-scale data analytics. This multi-purpose environment is ideal for Hadoop-only workloads such as MapReduce, Spark, and Hive as well as for interactive SQL workloads that use Oracle Big Data SQL. These capabilities are available for on-premises deployment using Oracle Big Data Appliance as well as in the cloud via Oracle Big Data Cloud Service.

Both cloud and on-premises offerings include a complete Hadoop security solution based on Apache Sentry and LDAP-based authorization, pre-configured Kerberos authentication, and centralized auditing with Cloudera Navigator. You can extend security and access policies from Oracle Database to data in Hadoop and NoSQL when querying through Oracle Big Data SQL. Intel Xeon processors ensure fast, secure, high performance encryption for big data analytics.

Both offerings support the latest innovations in encryption of data-at-rest by supporting HDFS Transparent Encryption with a key management facility, along with Intel AES-NI encryption. This implementation enables the tightest security on all data in HDFS. Combining Oracle Big Data Cloud Service or Oracle Big Data Appliance with Oracle Big Data SQL delivers the most comprehensive security of any big data system.

With in-memory processing for big data analytics, and chip-layer encryption, the Oracle/Intel solution is much faster and more secure than Hadoop solutions that are based on commodity hardware, which typically face bottlenecks of 100 MB/second.

Intel Secure Key with True Random Number Generation

In addition, Intel Secure Key offers true random number generation using secure key technologies that are extremely difficult to decipher or attack due to Intel’s unique Digital Random Number Generator (DRNG) hardware implementation. Cryptographic protocols rely on this technology for generating keys and refreshing session values to prevent replay attacks. This centralized key management platform accelerates the deployment of encryption across the enterprise.
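The role random values play in key generation and replay protection can be shown with Python's standard `secrets` module. Note the assumption: `secrets` draws from the operating system's cryptographic random source, and whether that source is fed by a hardware generator such as Intel's DRNG depends on the platform and is not something this code controls. The `ReplayGuard` class is a hypothetical illustration of rejecting a session value that has already been used.

```python
import secrets


def new_key() -> bytes:
    """Generate a 256-bit key from the OS cryptographic random source."""
    return secrets.token_bytes(32)


class ReplayGuard:
    """Accept each session nonce once; reject any repeat."""

    def __init__(self):
        self._seen = set()

    def issue_nonce(self) -> str:
        return secrets.token_hex(16)  # 128 random bits, hex-encoded

    def accept(self, nonce: str) -> bool:
        if nonce in self._seen:
            return False  # replayed message
        self._seen.add(nonce)
        return True


guard = ReplayGuard()
nonce = guard.issue_nonce()
print(guard.accept(nonce))  # True  (first use)
print(guard.accept(nonce))  # False (replay rejected)
```

The scheme only works if nonces are unpredictable and do not repeat, which is why the quality of the underlying random number generator matters.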

Securing Big Data in the Cloud

Oracle’s big data security strategy encompasses IaaS, PaaS, and SaaS environments. Thus the same data encryption technology that is built into Oracle Database and powered by Intel Xeon processors is transparently available when data and applications are deployed in Oracle Cloud. Multitenant capabilities ensure that customer data is sequestered and maintained separately from other customer data. Integrated security policies protect every aspect of on-premises, private cloud, and public cloud environments.

Conclusion – Mitigating the Risks, Reaping the Rewards of Big Data

With big data comes big responsibility. Given the prevalence of security breaches in nearly every industry, you can’t risk leaving big data unprotected. The traditional approach of securing the IT infrastructure is no longer enough. Today’s threats are multifaceted and often persistent, and traditional network perimeter security controls cannot effectively mitigate them. A holistic approach to big data security begins with protecting sensitive applications and data from both external and internal threats.

Oracle and Intel are applying technologies, policies, and procedures developed over several decades to secure the big data landscape. Their complementary portfolio of layered, defense-in-depth solutions ensures data privacy, protects against insider threats, and simplifies regulatory compliance. This comprehensive security architecture protects the entire big data environment, both on-premises and in the cloud. It includes preventive controls, detective controls, administrative controls, and physical access controls. Multiple security zones restrict access on a “need to know” basis for all IT staff. In addition, logical access controls include encryption of data on staff computers, personal firewalls, two-factor authentication, and role-based accounts.

From the chip level to the application level, Oracle has created a tightly interwoven set of layered defenses for big data initiatives. Built into the cloud, this big data security architecture provides multiple layers of protection, including IaaS, PaaS, and SaaS. Thanks to a tight engineering partnership with Intel, IT organizations no longer have to choose between performance and security when they wish to deploy industry-leading encryption technology. Oracle’s defense-in-depth security architecture includes security controls clear down to the chip level, with high performance data encryption from Intel embedded in the silicon. When run on Oracle Big Data Appliance, which is powered by Intel Xeon processors, big data analytics workloads are optimized for extreme performance, stability, manageability, and security. Thanks to these integrated capabilities, businesses can achieve the competitive advantages of big data analytics, with the confidence that their most sensitive data is protected. From the silicon to the firmware to the operating system to the applications to the middleware to the data—these layered defenses protect big data environments.

Oracle Corporation, World Headquarters 500 Oracle Parkway Redwood Shores, CA 94065, USA

Worldwide Inquiries Phone: +1.650.506.7000 Fax: +1.650.506.7200

Copyright © 2016, Oracle and/or its affiliates. All rights reserved. This document is provided for information purposes only, and the contents hereof are subject to change without notice. This document is not warranted to be error-free, nor subject to any other warranties or conditions, whether expressed orally or implied in law, including implied warranties and conditions of merchantability or fitness for a particular purpose. We specifically disclaim any liability with respect to this document, and no contractual obligations are formed either directly or indirectly by this document. This document may not be reproduced or transmitted in any form or by any means, electronic or mechanical, for any purpose, without our prior written permission. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Intel and Intel Xeon are trademarks or registered trademarks of Intel Corporation. Enterprise Security for Big Data Environments: a White Paper from Oracle and Intel, July 2016
