5
Microsoft SQL Server Customer Solution Case Study Yahoo! Improves Campaign Effectiveness, Boosts Ad Revenue with Big Data Solution Overview Country or Region: United States Industry: Media and entertainment— Internet broadcasting Customer Profile Yahoo! is a leading digital media company that provides a range of online services, including a group of popular consumer websites. Business Situation The company wanted to give its advertisers more useful analytic advertising data and to increase data processing performance. Solution Yahoo! partnered with Microsoft to implement a solution integrating the Yahoo! Hadoop data processing framework with Microsoft SQL Server 2008 R2. Benefits Improves ad campaign effectiveness and increases advertiser spending. Cube produces 24 terabytes of data quarterly, making it the world’s largest SQL Server Analysis Services cube. Handles more than 3.5 billion daily ad impressions, with hourly refresh rates. “Yahoo! can now provide more relevant advertising data which has increased advertising spending and campaign effectiveness. We have achieved this by combining Hadoop and Hive technologies that handle large data sets with the powerful analytic insight provided by the Microsoft BI platform.” Dianne Cantwell, Lead TAO Developer, Yahoo! California-based Yahoo! operates one of the most popular websites in the world, with more than 700 million unique visitors a month globally. The company owns and operates an online advertising exchange for the large number of customers who purchase advertising on various Yahoo! sites. They take advantage of this exchange to better target and manage their ad campaigns. Because it sought to give these customers more meaningful and useful analytical data faster, Yahoo! implemented a solution that takes data from its vast data stores within the Apache Hadoop open-source framework and ultimately moves it to Microsoft SQL Server 2008 R2. Using this solution, Yahoo! has improved campaign effectiveness, while advertisers have increased spending with Yahoo! The company also provides more relevant advertising data, and the solution’s partitioning design means faster loading of large data sets.

Yahoo! Improves Campaign Effectiveness, Boosts Ad Revenue ...download.microsoft.com/.../710000001707/Yahoo_SQLServer2012…  · Web viewCalifornia-based Yahoo! operates one of the

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Yahoo! Improves Campaign Effectiveness, Boosts Ad Revenue ...download.microsoft.com/.../710000001707/Yahoo_SQLServer2012…  · Web viewCalifornia-based Yahoo! operates one of the

Microsoft SQL ServerCustomer Solution Case Study

Yahoo! Improves Campaign Effectiveness, Boosts Ad Revenue with Big Data Solution

OverviewCountry or Region: United StatesIndustry: Media and entertainment—Internet broadcasting

Customer ProfileYahoo! is a leading digital media company that provides a range of online services, including a group of popular consumer websites.

Business SituationThe company wanted to give its advertisers more useful analytic advertising data and to increase data processing performance.

SolutionYahoo! partnered with Microsoft to implement a solution integrating the Yahoo! Hadoop data processing framework with Microsoft SQL Server 2008 R2.

Benefits Improves ad campaign effectiveness

and increases advertiser spending. Cube produces 24 terabytes of data

quarterly, making it the world’s largest SQL Server Analysis Services cube.

Handles more than 3.5 billion daily ad impressions, with hourly refresh rates.

“Yahoo! can now provide more relevant advertising data which has increased advertising spending and campaign effectiveness. We have achieved this by combining Hadoop and Hive technologies that handle large data sets with the powerful analytic insight provided by the Microsoft BI platform.”

Dianne Cantwell, Lead TAO Developer, Yahoo!

California-based Yahoo! operates one of the most popular websites in the world, with more than 700 million unique visitors a month globally. The company owns and operates an online advertising exchange for the large number of customers who purchase advertising on various Yahoo! sites. They take advantage of this exchange to better target and manage their ad campaigns. Because it sought to give these customers more meaningful and useful analytical data faster, Yahoo! implemented a solution that takes data from its vast data stores within the Apache Hadoop open-source framework and ultimately moves it to Microsoft SQL Server 2008 R2. Using this solution, Yahoo! has improved campaign effectiveness, while advertisers have increased spending with Yahoo! The company also provides more relevant advertising data, and the solution’s partitioning design means faster loading of large data sets.

Page 2: Yahoo! Improves Campaign Effectiveness, Boosts Ad Revenue ...download.microsoft.com/.../710000001707/Yahoo_SQLServer2012…  · Web viewCalifornia-based Yahoo! operates one of the

SituationYahoo! Inc., based in Sunnyvale, California, is an Internet company that operates several popular websites. These sites, which include a search engine, web portal, and news feed, are visited by more than 700 million unique visitors each month, reaching more than 47 percent of the global online population.

Advertisers flock to these sites, attracted by the massive online audience. To help those advertisers analyze data on consumer reach and engagement, Yahoo! built Targeting, Analytics and Optimization (TAO), a powerful, scalable advertising analytics tool. TAO reports on the thousands of advertisers who run campaigns on Yahoo! sites such as Autos, Finance, Health, Mail, News, Search, Sports, and Travel, via the Right Media Exchange (RMX).

One component of the TAO platform is Apache Hadoop, an open-source software framework for reliable, scalable, distributed computing. Created by Yahoo!, Hadoop is used to analyze vast amounts of unstructured data using commodity server computers and distribute that data to live applications.

For the past several years, Hadoop has been the primary tool for managing big data for the company’s advertising analytics efforts. Each day, Hadoop handles more than 3.5 billion ad impressions, with hourly refresh rates. TAO’s source cluster handles data at a rate of 464 billion rows of data each quarter.

Although Hadoop has been successful at handling large data sets for Yahoo!, the company still needed to extract more

meaningful analytical information from that big data, in order to conduct more ad-hoc and in-depth analysis. With this capability, the company would be able to quickly respond to customer needs.

Specifically, Yahoo! advertisers wanted to provide more relevant ads to consumers, so the ads would be perceived as recommendations. For instance, targeted ads that are more relevant are more likely to make an impression on Yahoo! users, driving them to take action by viewing the ad or clicking on it.

To provide this information, Yahoo! sought more visibility into how consumers are responding to ads based on different categories such as websites, time of day, gender, age, location and interests. By offering this level of analytics, Yahoo! would be able to help advertisers reach their targeted audiences efficiently to achieve the best return on their investment.

Additionally, Yahoo! wanted to improve the performance of its TAO database so it could deliver more data to customers faster. Lower latency would help users optimize campaigns more frequently, which is important for campaigns lasting only a few days.

In early 2010, Yahoo! decided to seek a new, high-performance business intelligence (BI) solution that could work with Hadoop.

SolutionYahoo!, a Microsoft customer for many years, chose to capitalize on its strong relationship with Microsoft when choosing technologies for its new solution. Yahoo!

25

Page 3: Yahoo! Improves Campaign Effectiveness, Boosts Ad Revenue ...download.microsoft.com/.../710000001707/Yahoo_SQLServer2012…  · Web viewCalifornia-based Yahoo! operates one of the

worked closely with Microsoft to create a new BI solution that takes advantage of Microsoft SQL Server 2008 R2 Enterprise data management software.

Using SQL Server 2008 R2, Yahoo! enhanced its TAO infrastructure, which now takes data from a Hadoop cluster into a third-party database, where it is loaded into a SQL Server 2008 R2 Analysis Services cube. The cube then connects to client applications such as Tableau Desktop business analytics software and in-house custom applications. Employees use this software to create interactive data dashboards and perform ad hoc analysis.

The new infrastructure, implemented on IBM x3560 server computers, also has a novel approach toward partitioning, optimally designed to improve the performance of queries made against very large data sets. In this model, the source data is loaded into a relational database. Within that database, the data is stored in a partitioned table. Each partition equates to roughly one hour of data, which is combined and then distributed to four partitions per day on the cube side.

Storing the data this way gives SQL Server 2008 R2 Analysis Services the ability to read and process the data at a much faster rate than if the data were not stored in partitioned tables. As a result, performance is improved for queries made against very large data sets.

The TAO infrastructure now includes a 2-petabyte Hadoop cluster that feeds 1.2 terabytes of raw data each day into 11g Real Application Clusters in the third-party database. From this point, this data is

ultimately compressed and 135 gigabytes of data per day is sent to a SQL Server 2008 R2 Analysis Services cube. The cube produces 24 terabytes of data each quarter, making it the largest known SQL Server Analysis Services cube in the world.

Microsoft has developed the SQL Server Connector for Apache Hadoop, which is designed to facilitate efficient data transfer between Hadoop and SQL Server 2008 R2.Using this solution, enterprise customers would be able to move large volumes of Hadoop data to SQL Server 2008 R2 and could gain deeper business insights from structured and unstructured data.

The SQL Server Connector for Hadoop could potentially give Yahoo! the ability to load data even faster. The company plans to analyze the result of big-data processing jobs on Hadoop in familiar analysis tools like SQL Server 2008 R2 Analysis Services.

Yahoo! is also partnering with Microsoft to determine the best way to get the data from Hadoop to the SQL Server 2008 R2 Analysis Services cube. To provide further integration between Hadoop and the Microsoft BI environment, Microsoft has been working on a prototype connector to Hive, the data warehouse infrastructure built on top of Hadoop. One area of research is the use of the Hadoop Hive Open Database Connectivity (ODBC) Driver, a software library that implements the ODBC API standard for Hive. By using this driver, currently in prototype, Yahoo! would theoretically be able to pull/push data directly from Hadoop to the SQL Server 2008 R2 Analysis Services cube.

35

Page 4: Yahoo! Improves Campaign Effectiveness, Boosts Ad Revenue ...download.microsoft.com/.../710000001707/Yahoo_SQLServer2012…  · Web viewCalifornia-based Yahoo! operates one of the

Microsoft is also using the same Hive ODBC driver to integrate Hadoop with xVelocity in-memory analytics engine (VertiPaq) in PowerPivot for Excel. The connector will also integrate with the xVelocity memory-optimized columnstore index feature that speeds data warehouse query processing in SQL Server 2012.

BenefitsThe new TAO solution has already helped Yahoo! improve campaign performance and boost customer spending on advertising. Additionally, Yahoo! can offer more relevant advertising data to its customers, and it can load and retrieve analytic data faster than before.

Increases Advertising Spending and Campaign EffectivenessWith the new TAO infrastructure, with SQL Server 2008 R2 as a key component, Yahoo! has seen major benefits in both advertising spending and campaign effectiveness. As advertisers see improving returns from their investment in campaigns running on Yahoo! properties, they are more willing to expand their ad purchases.

On the supply side, TAO also helps Yahoo! improve the monetization of its inventory by tracking metrics such as effective cost per thousand impressions (eCPM) across a variety of dimensions. In general, higher eCPMs means that Yahoo! and its publishing partners generate higher revenue for their ad inventory.

Yahoo! advertising executives attribute these gains in part to the use of the SQL Server 2008 R2 Analysis Services cube, which gives Yahoo! advertisers an easier

way to target segments of web users more precisely.

Provides More Relevant Advertising DataWith improved advertising analytics gained from the new Microsoft solution, Yahoo! can provide more relevant advertising data, which translates into better performance for advertisers and more revenue for Yahoo!

The company can provide more relevant data because the newly enhanced TAO infrastructure surfaces customer segment performance for Yahoo! campaign managers and advertisers. Before the new solution was implemented, Yahoo! campaign managers and advertisers were less effective in gauging the effectiveness of advertising campaigns. Now, the interactions between the SQL Server 2008 R2 Analysis Services cube, custom web applications, and Tableau present a much clearer picture of how well specific campaigns are working, and how well the Yahoo! sites are monetizing.

Overall, the new solution gives Yahoo! better insight into advertising data, and makes it available to a wider audience of business users.

Loads Data Faster and Produces Faster QueriesThe partitioning design of the new TAO infrastructure was crucial in speeding the time it takes to load data into the cube. Partitioning was fundamental to the success of the new Microsoft solution, because it helped facilitate faster processing throughput from the source staging database to the analytic cube.

45

Page 5: Yahoo! Improves Campaign Effectiveness, Boosts Ad Revenue ...download.microsoft.com/.../710000001707/Yahoo_SQLServer2012…  · Web viewCalifornia-based Yahoo! operates one of the

This partitioning strategy has also contributed to faster query times. For Yahoo! TAO users, the average query time from Tableau Desktop is six seconds, and the average query time for the company’s custom optimization application is two seconds.

Yahoo! plans to continue to expand the solution by adding more data and new features in the future.

Microsoft Server Product PortfolioFor more information about the Microsoft server product portfolio, go to:www.microsoft.com/servers

55

For More InformationFor more information about Microsoft products and services, call the Microsoft Sales Information Center at (800) 426-9400. In Canada, call the Microsoft Canada Information Centre at (877) 568-2495. Customers in the United States and Canada who are deaf or hard-of-hearing can reach Microsoft text telephone (TTY/TDD) services at (800) 892-5234. Outside the 50 United States and Canada, please contact your local Microsoft subsidiary. To access information using the World Wide Web, go to:www.microsoft.com

For more information about Yahoo! products and services, call (408) 349-3300 or visit the website at: www.yahoo.com

This case study is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS SUMMARY.

Document published February 2012

Software and Services Microsoft Server Product Portfolio− Microsoft SQL Server 2008 R2

Enterprise− Microsoft SQL Server 2012 Enterprise

Microsoft Office− Microsoft Excel 2010

Technologies− Microsoft SQL Server 2008 R2 Analysis

Services− Microsoft SQL Server PowerPivot for

Microsoft Excel

Hardware IBM x3560 server computers