Looking at Big Data in a Whole New Light

Featuring research from

What is Big Data and what it could do for you

The Huge Potential of Big Data

Big Data Strategy Components: IT Essentials

About CtrlS, Asia’s largest Tier 4 Datacenter

Issue 1

The Huge Potential of Big Data

As small and medium enterprises witness the clear business advantage that Big Data gives to large corporations, their interest in harnessing the power of their own data is growing.

SMEs have never been known as early adopters of technology, but in 2013 the question most companies are asking themselves is not ‘Do we even have big data?’ but ‘What can we do with our big data?’

What is Big Data?

Big Data refers to huge volumes of data that are gathered from multiple disparate sources.

Data streams can be anything from RFID sensors, Twitter or other social media feeds, credit card transaction data, user information, GPS coordinates, local weather information and other sources. The data points are heterogeneous and arrive in volumes of hundreds of thousands or even millions each day.

What can Big Data do for your company?

If data is properly gathered, stored, sorted and analysed, it can yield business intelligence and knowledge that translate into real-world advantages for organisations that analyse data sets on a huge scale. Many trends and insights cannot be gained without Big Data.

An online store, for example, may not be able to gauge its brand value or properly identify its most valued customers, its most profitable SKUs or its most effective promotions by analysing just daily or monthly sales and usage data.

If it were to analyse every single transaction ever made, every single promotional coupon ever used and other data that it had recorded, it would be able to accurately pinpoint its most valued customers, analyse weekly and seasonal sales trends and judge the effectiveness of different promotions. By analysing social media feeds and mentions of its brand name, the organisation could get real-time trend analysis of its brand value.
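As a minimal sketch of the transaction-level analysis described above, the following Python snippet totals spend per customer and revenue per promotional coupon. The record layout (customer, amount, coupon) and all sample values are hypothetical, chosen only for illustration; a real store would read these from its own transaction history.

```python
from collections import defaultdict

# Hypothetical transaction records: (customer_id, amount, coupon_code or None).
transactions = [
    ("c1", 120.0, "SPRING10"),
    ("c2", 35.0, None),
    ("c1", 80.0, None),
    ("c3", 60.0, "SPRING10"),
    ("c2", 15.0, "FREESHIP"),
]

def spend_by_customer(records):
    """Total spend per customer across every recorded transaction."""
    totals = defaultdict(float)
    for customer, amount, _coupon in records:
        totals[customer] += amount
    return dict(totals)

def revenue_by_coupon(records):
    """Revenue attributed to each coupon, ignoring purchases made without one."""
    totals = defaultdict(float)
    for _customer, amount, coupon in records:
        if coupon is not None:
            totals[coupon] += amount
    return dict(totals)

def top_customers(records, n=1):
    """Customers ranked by total spend, highest first."""
    totals = spend_by_customer(records)
    return sorted(totals, key=totals.get, reverse=True)[:n]
```

The same aggregation over a few rows is trivial; the point of the article is that running it over every transaction ever recorded is what demands large-scale storage and processing.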

Huge data sets allow much more complex analysis and this makes many unique insights possible; big businesses have been using this business intelligence for many years now, to give them an advantage in the market.

The challenges of Big Data

Though Big Data can be a very powerful tool if used correctly, it poses many challenges to companies of all sizes. These difficulties are tougher for SMEs to deal with, which is why most of these organisations have failed to take advantage of Big Data until very recently.

Collecting data is not a problem these days; a multitude of sources generate huge streams of information. The problem is storing this data. How does an organisation record every data point and store it securely? If irrelevant data points are to be discarded and only important data points are to be stored, how is the data sorted?

Once the data is collected and stored, the most important task is the correct and speedy analysis of the data and the correlation of seemingly disparate data sources to yield insights.

All of this requires resources that most SMEs do not have access to, such as huge amounts of cheap, secure storage, massively parallel processing, large memory capacity and fast connectivity.

Big Data for Small businesses: Made possible in the Cloud

This is where cloud computing comes in; the economies of scale of huge data centres from companies like CtrlS make it possible for companies with limited resources to use Big Data, economically and effectively.

CtrlS offers Big Data Solutions in the Cloud that allow SMEs to start manipulating their own huge data sets with zero capital investment, low operating costs and the ability to scale up quickly with their business needs.

Source: CtrlS Datacenters

Research from Gartner

Big Data Strategy Components: IT Essentials

If they haven’t done so already, CIOs need to take the lead in developing their enterprise’s “big data” strategy. Here we look at several IT-oriented strategy components that complement business strategy components.

Key Challenges

• Most enterprises are either embarking on an initiative related to big data or intend to do so in the near future, yet almost no organizations have an articulated strategy for such a move.

• Big data initiatives stretch an organization’s IT setup in new ways and can strain relations with business units.

• Many big data initiatives originate within business units. This puts added pressure on the IT organization to get adequately prepared to support them.

Recommendations

• Evaluate infrastructure and architecture elements to ensure adequacy for the anticipated volume, velocity and variety of data.

• Enact more stringent data governance controls to deal with the severe reputational and business continuity risks that come with many big data sources and uses.

• Expand analytics capabilities beyond basic business intelligence (BI) to be able to leverage the full depth and breadth of big data sources for higher-order business value.

• Prepare to make organizational adjustments and be aggressive about obtaining the skills required for specific data management, preparation and analytic needs.

Strategic Planning Assumption(s)

Through 2015, business analytics needs will drive 70% of investments in the expansion and modernization of information infrastructure.

Introduction

Internet searches for the term “big data” have increased 14-fold since the beginning of 2011 (see Figure 1), and the number of Web pages mentioning the term has increased 30-fold during the same period (see Figure 2), according to data from Google Trends. It is also the top site search term among clients on Gartner.com thus far in 2012. Gartner finds that the juxtaposition of perplexity about, and popularity of, the term “big data” has resulted in a wide strategy gap for both IT departments and business units intending to capitalize on it.

FIGURE 1 Google Search Results for the Term “Big Data”

Data from Google Trends Source: Gartner (October 2012)

This kind of strategy gap, fueled by both a burgeoning interest and a lack of understanding, creates both tremendous risk and tremendous opportunity for enterprises. On one hand, companies are at risk of overinvesting in big data before they are prepared to execute on it. On the other, they are at risk of underinvesting and ceding competitive advantage. The opportunities of big data are only beginning to be realized. Regardless of the lack of an overarching strategy, many businesses have still achieved one-off, high-value successes, but scant few CIOs (Gartner estimates put the figure at less than 1%) have explained to business executives how big data can be transformative or disruptive on an enterprise or industry scale. Nor have they laid out or fully considered the many critical elements required to coalesce a big data IT and business strategy. Some of these strategy essentials are shown in Table 1.

FIGURE 2 Number of Internet Pages Mentioning Big Data

Data from Google Source: Gartner (October 2012)

Table 1. Business and IT Strategy Essentials

Business Strategy Essentials

• Acknowledge how big data initiatives are unique

• Generate big ideas for big data

• Identify potentially valuable data sources

• Build business leadership belief in data

• Become even more pragmatic about investments

IT Strategy Essentials

• Ensure infrastructure adequacy

• Consider alternate information architectures

• Anticipate and govern risks

• Expand analytic capabilities

• Assemble necessary skills

• Alter IT organization structures

Source: Gartner (October 2012)

Analysis

Ensure Infrastructure Adequacy

Many traditional and even state-of-the-art technologies were not designed for transaction processing or customary data warehousing – at least not for today’s or tomorrow’s level of data volume, velocity and variety. As discussed in “The Importance of ‘Big Data’: A Definition” big data demands “cost-effective, innovative forms of information processing for enhanced insight and decision making.” Even as data grows exponentially along these three dimensions, investments required for scaling technologies like processors, storage, database management systems (DBMSs) and analytics to perform sufficiently can grow even faster. To counter these intractable economics CIOs need to consider a variety of ways to upgrade their infrastructure in support or anticipation of big data requirements.

These include:

• DBMS advances in loading, indexing and parallelism

• Grid computing

• Columnar DBMSs

• NoSQL databases

• Distributed file systems

• Map-reduce processing

• In-memory databases

• Data warehouse appliances

• Usage-driven tiered storage

• Cloud-based data and processing

• Complex event processing

• Enterprise content management

• Search
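To illustrate one item on this list: map-reduce processing splits work into a map step that emits key-value pairs from each chunk of input independently, and a reduce step that combines the values sharing a key. The sketch below is a single-process Python illustration of that pattern on hypothetical log lines. It is not how any particular framework is implemented; in a real deployment the chunks would be spread across many machines.

```python
from collections import defaultdict
from functools import reduce

def map_phase(chunk):
    """Emit (word, 1) pairs for every word in one chunk of text lines."""
    return [(word.lower(), 1) for line in chunk for word in line.split()]

def shuffle(pairs):
    """Group emitted values by key, as a framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Combine each key's values into a single count."""
    return {key: reduce(lambda a, b: a + b, values) for key, values in groups.items()}

def word_count(chunks):
    # Each chunk is mapped independently; this is the step that parallelises.
    emitted = [pair for chunk in chunks for pair in map_phase(chunk)]
    return reduce_phase(shuffle(emitted))

# Hypothetical input, already split into two chunks.
chunks = [["error disk full", "warning disk slow"], ["error network down"]]
counts = word_count(chunks)
```

Because `map_phase` needs nothing but its own chunk, adding machines adds throughput, which is why the pattern suits the data volumes discussed here.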

Consider Alternate Information Architectures

Traditional information architectures fail to accommodate massive, high-speed and flexible data flows. Customary data warehouse architecture that often incorporates staging areas, operational data stores, atomic data structures and data marts can create both untenable duplication of voluminous data and the kind of availability lags that are unsuitable for increasing real-time analytic demands. Similarly, the rigor of established dimensional modeling techniques makes broad-based, unstructured and low-fidelity data challenging to design into well-articulated star and snowflake structures. In addition to some of the data management technologies above, supporting and alternate architectural considerations include:

• Data warehouse federation (logical data warehouse, for example)

• Sandboxes (for non-operational experimentation)

• Enterprise content stores

• Self-service (business unit managed) analytic environments

• Automated tagging and linking

• Data streaming

• Complex event processing

Anticipate and Govern Risks

Big data also raises the specter of significant risk to business brand and compliance. Data sources frequently include personal, sensitive or proprietary information that can be more prone to mishandling and misuse. Even when individual data sources themselves do not contain explicit information, the integration of multiple sources may enable triangulation that could expose corporate secrets or identify individuals. This risk can be especially perilous when information is shared outside the organization with business partners, suppliers, trade organizations or government; or when information assets are packaged for sale on the open market.
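The triangulation risk described above can be sketched in a few lines of Python. Both data sets below are hypothetical, as are the field names and the choice of postcode and birth year as linking fields. Neither list alone ties a name to a purchase, but joining them on the shared fields does.

```python
# Hypothetical data sets: neither one alone links a name to a purchase.
purchases = [
    {"postcode": "500081", "birth_year": 1975, "item": "insulin pump"},
    {"postcode": "400001", "birth_year": 1990, "item": "running shoes"},
]
loyalty_members = [
    {"name": "A. Rao", "postcode": "500081", "birth_year": 1975},
    {"name": "B. Shah", "postcode": "400001", "birth_year": 1988},
    {"name": "C. Iyer", "postcode": "400001", "birth_year": 1990},
]

def link_records(left, right, keys):
    """Join two lists of dicts on shared quasi-identifier fields."""
    index = {}
    for row in right:
        index.setdefault(tuple(row[k] for k in keys), []).append(row)
    linked = []
    for row in left:
        matches = index.get(tuple(row[k] for k in keys), [])
        if len(matches) == 1:  # a unique match re-identifies the individual
            linked.append({**matches[0], **row})
    return linked

reidentified = link_records(purchases, loyalty_members, ["postcode", "birth_year"])
```

A governance review could run a check of this kind before two sources are integrated or shared, to see how many records become uniquely attributable once combined.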

In addition, since much big data is in the form of unstructured information, data structures themselves cannot be used to ascertain the sensitivity of the content within (what is written in emails or evident in multimedia, for example).

Attention to data governance programs to define a lattice of principles, guidelines, standards, policies and procedures – each with degrees of rewards and penalties for compliance – is critical. Big data sources demand dedicated data stewards to ensure their proper acquisition, curation and usage. Other risk mitigation methods include compliance alerts, reporting and forensics; information security procedures and technologies; contingency scenario planning and even electronic data insurance now offered by a few major underwriters.

Expand Your Analytic Capabilities

Analytics is the No. 1 use of big data, yet common BI solutions are limited in their analytic capacity – particularly when it comes to unstructured data or much more than hindsight-oriented reporting and extrapolation. As discussed in “Ten Reasons to Reach Beyond Basic BI” leading organizations look beyond traditional query and reporting capabilities to consider predictive analytics, data, text and even multimedia mining, increasingly illustrative and layered forms of visualization, complex event processing, rule engines and natural language query. These types of applications represent the “systems of innovation” layer of Gartner’s pace layer model (see “Applying Gartner’s Pace Layer Model to Business Analytics”).

Gartner’s analytic ascendency model (see Figure 3) shows how the value of analytics increases as capabilities expand from hindsight-oriented descriptive analytics, to insight-oriented diagnostic analytics and ultimately to foresight-oriented predictive and prescriptive analytics. As business benefits increase, so do the challenges in developing and implementing these approaches. Part of the challenge is in sourcing, integrating and analytically processing data. For this reason, big data’s value for hindsight-only analytics is limited and economically indefensible. Conversely, big data’s utility and value proposition are much greater for higher-order analytics that provide deeper understanding, broader relevancy and farther visibility.
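The gap between hindsight-oriented and foresight-oriented analytics can be sketched in Python as follows. The weekly sales figures are hypothetical, and the least-squares trend line is only a simple stand-in for the far richer predictive techniques the research has in mind.

```python
def describe(series):
    """Hindsight: summarise what has already happened."""
    return {"total": sum(series), "mean": sum(series) / len(series)}

def fit_trend(series):
    """Least-squares line through (0, y0), (1, y1), ...; returns (slope, intercept)."""
    n = len(series)
    x_mean = (n - 1) / 2
    y_mean = sum(series) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(series))
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean

def forecast(series, steps_ahead=1):
    """Foresight: extrapolate the fitted trend beyond the observed data."""
    slope, intercept = fit_trend(series)
    return intercept + slope * (len(series) - 1 + steps_ahead)

weekly_sales = [100.0, 110.0, 120.0, 130.0]  # hypothetical figures
```

`describe(weekly_sales)` only reports on past weeks, while `forecast(weekly_sales)` estimates the next one; the second kind of question is where larger and more varied data sets tend to pay off.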

Additionally, many great information-borne innovations today are the result of business analysts exploring and experimenting with available big data sources and unleashing advanced analytic techniques and technologies such as data and text mining, machine learning and animated visualization.

In short, IT executives need to plan to select and implement a range of analytic capabilities that ensure the ability to generate positive economic benefits from big data information assets.

These capabilities include:

• Predictive analytics

• Scenario planning

• Data mining

• Text and multimedia mining

• Visualization and animation

• Complex event processing

• Rule engines

• Natural language processing

• Mobile

• Gamification

Assemble the Necessary Skills

Manipulating and understanding big data demands a range of necessary skills, but the new talent required to manage and leverage these information assets is in exceptionally short supply. These skills include data integration and preparation; business and analytic modeling; collaboration and communication; and creativity.

FIGURE 3 Gartner’s Analytics Ascendency Model

Source: Gartner (October 2012)

Indeed, the role of the data scientist is emerging as somewhat of a panacea, not only for generating new insights, but also for finding ways to use available data in automating and optimizing business processes. Gartner has identified that the companies hiring for these roles are looking for skills that go beyond your basic BI analyst and statistician (see “Emerging Role of the Data Scientist and the Art of Data Science”). We are also seeing a flood of new graduate and certificate programs in advanced analytics that are highlighted in this research.

Moreover, accumulating, managing and preparing big data sources to add to their intrinsic value demands technical skills beyond traditional data modeling, data architecture, data integration and database management.

As these skills are in extremely short supply, acquiring them warrants a three-pronged strategy comprised of:

• Aggressive recruiting from universities, competitors and other industries

• Internal skill development through incentives and training

• Leveraging available consulting talent to fill gaps and mentor

Alter IT Organization Structures

Big data initiatives have a strong tendency to stretch and test traditional IT organizations in unique ways. Most are badly equipped to deal with an individual business unit’s desires or attempts to manage and leverage big data on its own. CIOs must be prepared to effect the necessary changes because resisting them to maintain IT standards and the status quo will result in being shut out of enterprise strategy dialogs. Big data initiatives are especially demanding on the partnerships between IT and the core business.

Current IT projects and responsibilities for maintaining and implementing core systems generally take priority over speculative big data projects. This results in business units getting a budgetary, resource and technological jump on IT, leaving the IT organization behind the curve when it comes time to roll out the solution. To mitigate this, CIOs must ensure that, as a minimum, IT acts in a way that enables business units to take on an initial self-service role. This includes being a project advisor, a consultant and a standards bearer. In addition, IT should ensure that data governance and security precepts are enhanced as necessary and complied with. As big data solutions become ready for operation, IT should be in a position to take over the management of infrastructure, architectural and application components.

Big data also breeds the need for other emergent roles and organizational alignments. First, the aforementioned risk-related concerns that are frequently more pronounced with big data warrant formal data governance procedures and organization. Ideally the data governance council is chartered by the corporate governance function and includes both IT and line-of-business delegates as well as dedicated data stewards. Second, although the role of the data scientist is appearing in some organizations, organizationally we see team-based approaches to the role since individuals with the full skill set are so difficult to find (see “Emerging Role of the Data Scientist and the Art of Data Science”).

Much like the relationship between Peter Brand and Billy Beane biographied in the book and movie “Moneyball,” we are beginning to see data scientists appearing as dedicated executive team resources. This phenomenon is a testament to leaders who want instant answers and value information-fueled innovation as well as an indication of the rise of information as a true enterprise asset.

As enterprises mature from being information-intent to information-oriented to information-based, room for new senior roles in the IT organization emerges.

These include:

• Chief Data Officer: Responsible for sourcing, managing and deploying all information assets.

• Chief Analytics Officer: Responsible for BI solutions, advanced analytics (data mining and predictive analytics, for example) and data science groups, and serving as the steward for enterprise-standard algorithms and metrics.

Two other roles, which Gartner encourages and expects to be mainstream by the end of the decade, are just starting to emerge, fueled by the explosion of data (especially content):

• Information Asset Manager: Part of the information management and/or data governance organizations, these individuals are responsible for architecting the cradle-to-grave supply chain for information assets, borrowing from traditional asset management principles and practices.

• Information Product Manager: A business role responsible for conceiving, developing and implementing ideas for the direct or indirect monetization of information assets across business units, with industry partners, with existing customers and/or in nascent information markets.

Source: Gartner Research, G00 238944, Doug Laney, 15 October 2012

Looking at Big Data in a Whole New Light is published by CtrlS Datacenters Ltd. Editorial content supplied by CtrlS Datacenters Ltd. is independent of Gartner analysis. All Gartner research is used with Gartner’s permission, and was originally published as part of Gartner’s syndicated research service available to all entitled Gartner clients. © 2013 Gartner, Inc. and/or its affiliates. All rights reserved. The use of Gartner research in this publication does not indicate Gartner’s endorsement of CtrlS Datacenters Ltd.’s products and/or strategies. Reproduction or distribution of this publication in any form without Gartner’s prior written permission is forbidden. The information contained herein has been obtained from sources believed to be reliable. Gartner disclaims all warranties as to the accuracy, completeness or adequacy of such information. The opinions expressed herein are subject to change without notice. Although Gartner research may include a discussion of related legal issues, Gartner does not provide legal advice or services and its research should not be construed or used as such. Gartner is a public company, and its shareholders may include firms and funds that have financial interests in entities covered in Gartner research. Gartner’s Board of Directors may include senior managers of these firms or funds. Gartner research is produced independently by its research organization without input or influence from these firms, funds or their managers. For further information on the independence and integrity of Gartner research, see “Guiding Principles on Independence and Objectivity” on its website, http://www.gartner.com/technology/about/ombudsman/omb_guide2.jsp.

About CtrlS

Headquartered in Hyderabad, CtrlS Datacenters Ltd. was founded in 2007 by the INR 750 Crore Pioneer Group, which is primarily involved in IT services, consulting and infrastructure. The Group has been growing at a compound annual growth rate of more than 100 percent over the past 15 years.

CtrlS has datacenters in Hyderabad and Mumbai with an upcoming facility in Delhi. The company has developed the capabilities to provide platform level services like datacenter infrastructure, storage, backup, hardware, OS layers, network and security layers. It offers a host of outsourced business solutions and services such as Disaster Recovery on demand, Managed services, Private cloud-on-demand to enable clients to make the paradigm shift from the captive datacenter model to the outsourced one.

The CtrlS datacenter is Tier IV certified and provides a 99.995% uptime guarantee, less than 22 minutes of downtime in a year and N+N redundancy. With a PUE of 1.42, it is the most power-efficient datacenter in India. Dual power sources and an additional captive power plant ensure uninterrupted power and cooling systems. It also provides high bandwidth availability and a choice from India’s leading telcos.

It is also the only one of its kind in India to provide 8-zone security, scalability for up to 10 years, guaranteeing the highest availability and least energy consumption. Armed with top-of-the-line features and the very best of infrastructure and technology, it offers clients an array of benefits which can drive a saving of up to 40 percent on total cost of ownership. It has ISO-20000-1, ISO-27001 and BS 25999 certifications.

CtrlS datacenters are widely recognized as the best in their class, with awards such as the CII awards for the Most Energy Efficient and the Most Innovative Energy Efficient unit, the NASSCOM award for Top 50 emerging companies two years in a row and the CIO Choice award for Best Datacenter in Managed Services.

Delhi, Hyderabad, Mumbai